LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title: Development of Acoustic and Linguistic Resources for Research and Evaluation in Interactive Vocal Information Servers
Authors: Bernardis Giulia (Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland, Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP), Martigny, Switzerland, giulia@idiap.ch)
Bourlard Herve (Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland, Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP), Martigny, Switzerland)
Rajman Martin (Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland)
Chappelier Jean-Cedric (Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland)
Keywords: Knowledge Extraction, Named Entity Tagging, Orthographic Labeling, Speech Data Annotation, Unconstrained Speech Recognition
Session: SP3 - Spoken Language Resources' Projects
Abstract: This paper describes the creation of a resource database for research and evaluation in the domain of interactive vocal information servers. The work was carried out within a research project aiming to develop an advanced speech recognition system for the automatic processing of telephone directory requests, and was based on the Swiss-French Polyphone database (collected in the framework of the European SpeechDat project). Because no properly orthographically transcribed, consistently labeled and tagged database of unconstrained speech (together with its associated lexicon) was available for the targeted area, we first concentrated on annotating and structuring the spoken request data so that it could be used for lexical and linguistic modeling and for the evaluation of recognition results. A baseline speech recognition system was then trained and tested on the newly developed resources. Preliminary recognition experiments showed a relative improvement of 46% in Word Error Rate (WER) over the results previously obtained with a very similar baseline system that used the inconsistent natural speech database originally available.
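
Note on the metric: the abstract reports a relative improvement in Word Error Rate but does not spell out how either quantity is computed. The sketch below shows the conventional definitions, assuming WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and relative improvement is the WER reduction divided by the baseline WER. The function names and the sample request are hypothetical illustrations and are not taken from the paper or its data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumes whitespace tokenization; real evaluations usually also
    normalize case, punctuation and spelling variants first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical directory request, invented for illustration only.
    reference = "the number of martin dupont in lausanne"
    hypothesis = "the number of martin dupond lausanne"
    print(word_error_rate(reference, hypothesis))
    # Made-up WER values, not results from the paper.
    print(relative_improvement(0.40, 0.30))
```

Under these definitions, a relative improvement expresses the WER reduction as a fraction of the baseline WER, which is different from an absolute drop in percentage points.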

 
