| Title | Analyzing and Aligning German compound nouns | 
  
  | Authors | Marion Weller and Ulrich Heid | 
  
  | Abstract | In this paper, we present and evaluate an approach for the compositional alignment of compound nouns using comparable corpora from technical domains. The task of term alignment consists in relating a source language term to its translation in a list of target language terms with the help of a bilingual dictionary. Compound splitting makes it possible to transform a compound into a sequence of components which can be translated separately and then related to multi-word target language terms. We present and evaluate a method for compound splitting, and compare two strategies for term alignment (bag-of-words vs. pattern-based). The simple word-based approach leads to a considerable number of erroneous alignments, whereas the pattern-based approach achieves decent precision. We also assess the reasons for alignment failures: in the comparable corpora used for our experiments, a substantial number of terms have no translation in the target language data; furthermore, the non-isomorphic structures of source and target language terms cause alignment failures in many cases. | 
  
  | Topics | MultiWord Expressions & Collocations, Multilinguality, Morphology | 
  
  | Full paper  | Analyzing and Aligning German compound nouns | 
  
  | Bibtex | @InProceedings{WELLER12.817, author =  {Marion Weller and Ulrich Heid},
 title =  {Analyzing and Aligning German compound nouns},
 booktitle =  {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
 year =  {2012},
 month =  {may},
 date =  {23-25},
 address =  {Istanbul, Turkey},
 editor =  {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
 publisher =  {European Language Resources Association (ELRA)},
 isbn =  {978-2-9517408-7-7},
 language =  {english}
 }
 |