The Web as a Resource for Question Answering: Perspectives and Challenges


Jimmy Lin (MIT Artificial Intelligence Laboratory)


WO24: Applications Based On Written LRs


The vast amounts of information readily available on the World Wide Web can be effectively used for question answering in two fundamentally different ways. In the federated approach, techniques for handling semistructured data are applied to access Web sources as if they were databases, allowing large classes of common questions to be answered uniformly. In the distributed approach, large-scale text-processing techniques are used to extract answers directly from unstructured Web documents. Because the Web is orders of magnitude larger than any human-collected corpus, question answering systems can capitalize on its unparalleled levels of data redundancy. Analysis of real-world user questions reveals that the federated and distributed approaches complement each other nicely, suggesting a hybrid approach in future question answering systems.
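
To make the data-redundancy idea concrete, here is a minimal sketch in Python of one way a distributed system might exploit it: candidate answers that recur across many independently retrieved text snippets are ranked higher. This is an illustration only and not the method described in the paper; the function names, the stopword list, and the example snippets are all hypothetical, and snippet retrieval from the Web is assumed to have happened elsewhere.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "by", "to", "and", "on", "at"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_ngrams(tokens, max_n=2):
    """Yield every n-gram (as a tuple) of length 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def rank_answers(question, snippets, top_k=3, max_n=2):
    """Rank candidate answers by the number of snippets that contain them.

    N-grams containing a stopword or a word from the question are skipped,
    since they are unlikely to be the answer itself.
    """
    excluded = STOPWORDS | set(tokenize(question))
    votes = Counter()
    for snippet in snippets:
        # Use a set so each snippet votes at most once per candidate.
        seen = {
            gram
            for gram in candidate_ngrams(tokenize(snippet), max_n)
            if not any(word in excluded for word in gram)
        }
        votes.update(seen)
    return [(" ".join(gram), count) for gram, count in votes.most_common(top_k)]


if __name__ == "__main__":
    # Made-up snippets standing in for text retrieved from Web documents.
    example_snippets = [
        "Hamlet was written by William Shakespeare around 1600.",
        "William Shakespeare wrote Hamlet.",
        "The tragedy Hamlet, by Shakespeare, is widely performed.",
    ]
    print(rank_answers("Who wrote Hamlet?", example_snippets))
```

The design choice here is simple voting: no single snippet needs to be parsed deeply, because a correct answer stated in many different ways across documents tends to accumulate more votes than any incidental phrase.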


Question answering, Database federation, Information retrieval, TREC
