Summary of the paper

Title Bootstrapping Language Neutral Term Extraction
Authors Wauter Bosma and Piek Vossen
Abstract A variety of methods exist for extracting terms and relations between terms from a corpus, each of them having strengths and weaknesses. Rather than just using the joint results, we apply different extraction methods in a way that the results of one method are input to another. This gives us the leverage to find terms and relations that otherwise would not be found. Our goal is to create a semantic model of a domain. To that end, we aim to find the complete terminology of the domain, consisting of terms and relations such as hyponymy and meronymy, and connected to generic wordnets and ontologies. Terms are ranked by domain-relevance only as a final step, after terminology extraction is completed. Because term relations are a large part of the semantics of a term, we estimate the relevance from its relation to other terms, in addition to occurrence and document frequencies. In the KYOTO project, we apply language-neutral terminology extraction from a parsed corpus for seven languages.
Topics Lexicon, lexical database, MultiWord Expressions & Collocations, Multilinguality
Full paper Bootstrapping Language Neutral Term Extraction
Slides Bootstrapping Language Neutral Term Extraction
Bibtex @InProceedings{BOSMA10.902,
  author = {Wauter Bosma and Piek Vossen},
  title = {Bootstrapping Language Neutral Term Extraction},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
Powered by ELDA © 2010 ELDA/ELRA