FreeLing: An Open-Source Suite of Language Analyzers


Xavier Carreras, Isaac Chao, Lluís Padró, Muntsa Padró

TALP Research Center, Universitat Politècnica de Catalunya, C/ Jordi Girona 1-3, 08034 Barcelona, Spain - {carreras,ichao,padro,mpadro}@lsi.upc.es




Basic language processing tasks such as tokenization, morphological analysis, lemmatization, PoS tagging, and chunking are needed by most NL applications, such as Machine Translation, Summarization, and Dialogue systems. A large part of the effort required to develop such applications is devoted to adapting existing software resources to the platform, programming language, format, or API of the final system. At LREC'02, we presented the object architecture that we are currently using (Carreras & Padró 02), which enables quick and easy integration of basic language analyzers into any NLP application. We now present a suite of analysis tools based on that architecture, distributed under the Lesser General Public License (LGPL) (Free Software Foundation 99). The first release of the suite will include a morphological analyzer and a Part-of-Speech tagger for English, Spanish, and Catalan.


Morphological analysis, number/date/amount recognition, multiword detection, PoS tagging, open-source, free software, LGPL.

Language(s): Spanish, Catalan, English.
Full Paper