Integrating Spanish Linguistic Resources in a Web Site Assistant
Paloma Martínez (Universidad Carlos III de Madrid, Avd. Universidad 30, 28911 Leganés, Madrid, Spain)
Ana García-Serrano (Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Madrid, Spain)
Alberto Ruiz-Cristina (Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Madrid, Spain)
WP3: Tools & Components
This work describes a proposal to improve web document retrieval by addressing two main problems in document searching. First, traditional web search engines miss documents that are relevant to the user query and retrieve many that are not. Second, query formulation is not as accessible as it could be, and some users have difficulty expressing boolean queries. Two main approaches have typically been adopted to improve the quality of Internet search engines. One is the creation of a metasearch engine that draws on multiple search engines by unifying both the query language and the type of results they return. The other applies NLP techniques to expand queries so that they cover morphological, lexical, semantic and syntactic variations. Focusing on the second approach, we present the research project MESIA (project CAM 07T/0017/1998), developed for the Madrid Local Government web site (www.comadrid.es). Its main goal is to exploit general-purpose linguistic resources to expand user queries and thereby improve the answers provided by the AltaVista search engine.
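As a rough illustration of the second approach, the sketch below expands a free-text Spanish query into a boolean expression in which each content word is replaced by a group of alternatives. It is only a minimal sketch under stated assumptions: the toy lexicon, the stopword list, the function names and the AND/OR output syntax are all hypothetical, and none of them is taken from the MESIA resources or from the AltaVista query language.

```python
# Hypothetical toy resources for illustration only; these are NOT the
# linguistic resources used in MESIA.
VARIANTS = {  # morphological variants of a word
    "ayuda": ["ayuda", "ayudas"],
    "vivienda": ["vivienda", "viviendas"],
}
SYNONYMS = {  # lexical alternatives of a word
    "ayuda": ["subvención"],
    "vivienda": ["casa"],
}
STOPWORDS = {"de", "para", "la", "el", "en", "y"}


def expand_term(term):
    """Return the term's variants and synonyms, without duplicates."""
    alternatives = []
    for candidate in VARIANTS.get(term, [term]) + SYNONYMS.get(term, []):
        if candidate not in alternatives:
            alternatives.append(candidate)
    return alternatives


def expand_query(query):
    """Build a boolean query: OR within a term's group, AND across terms."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    groups = []
    for term in terms:
        alternatives = expand_term(term)
        if len(alternatives) == 1:
            groups.append(alternatives[0])
        else:
            groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)


print(expand_query("ayuda para vivienda"))
```

With this toy lexicon, the user never has to write the boolean operators: the plain query "ayuda para vivienda" is turned into an expression that also matches plural forms and the listed synonyms. Words missing from the lexicon are passed through unchanged.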
Spanish linguistic resources