Summary of the paper

Title Syntactic Analysis of Phrasal Compounds in Corpora: a Challenge for NLP Tools
Authors Carola Trips
Abstract The paper introduces a “train once, use many” approach for the syntactic analysis of phrasal compounds (PC) of the type XP+N like “Would you like to sit on my knee?” nonsense. PCs are a challenge for NLP tools since they require the identification of a syntactic phrase within a morphological complex. We propose a method which uses a state-of-the-art dependency parser not only to analyse sentences (the environment of PCs) but also to compound the non-head of PCs in a well-defined particular condition which is the analysis of the non-head spanning from the left boundary (mostly marked by a determiner) to the nominal head of the PC. This method contains the following steps: (a) the use an English state-of-the-art dependency parser with data comprising sentences with PCs from the British National Corpus (BNC), (b) the detection of parsing errors of PCs, (c) the separate treatment of the non-head structure using the same model, and (d) the attachment of the non-head to the compound head. The evaluation of the method showed that the accuracy of 76% could be improved by adding a step in the PC compounder module which specified user-defined contexts being sensitive to the part of speech of the non-head parts and by using TreeTagger, in line with our approach.
Topics Morphology, Parsing, Grammar and Syntax
Full paper Syntactic Analysis of Phrasal Compounds in Corpora: a Challenge for NLP Tools
Bibtex @InProceedings{TRIPS16.535,
  author = {Carola Trips},
  title = {Syntactic Analysis of Phrasal Compounds in Corpora: a Challenge for NLP Tools},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portoro┼ż, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }
Powered by ELDA © 2016 ELDA/ELRA