| Title | 
  Partial Parsing of Spontaneous Spoken French | 
  
  
  | Authors | 
  Olivier Blanc, Matthieu Constant, Anne Dister and Patrick Watrin | 
  
  
  | Abstract | 
  This paper describes the process and the resources used to automatically annotate a French corpus of spontaneous speech transcriptions with super-chunks. Super-chunks are enhanced chunks that can contain lexical multiword units. This partial parsing relies on a preprocessing stage that reformats and tags the utterances that break the syntactic structure of the text, such as disfluencies. These specificities of spoken language were formalized through a systematic linguistic study of a 40-hour speech transcription corpus. The chunker uses large-coverage, fine-grained language resources for general written language, augmented with resources specific to spoken French. It iteratively applies finite-state lexical and syntactic resources and outputs a finite automaton representing all possible chunk analyses. The best path is then selected by a hybrid disambiguation stage. We show that our system reaches scores comparable with state-of-the-art results in the field. | 
  
  
  | Topics | 
  Parsing, Speech resource/database, MultiWord Expressions & Collocations   | 
  
  
  | Bibtex | 
  @InProceedings{BLANC10.554, 
   author =  {Olivier Blanc and Matthieu Constant and Anne Dister and Patrick Watrin},
   title =  {Partial Parsing of Spontaneous Spoken French},
   booktitle =  {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
   year =  {2010},
   month =  {may},
   date =  {19-21},
   address =  {Valletta, Malta},
   editor =  {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
   publisher =  {European Language Resources Association (ELRA)},
   isbn =  {2-9517408-6-7},
   language =  {english}
  }   |