LREC 2000 2nd International Conference on Language Resources & Evaluation
 


Title The TREC-8 Question Answering Track
Authors Voorhees Ellen M. (National Institute of Standards and Technology, Gaithersburg, MD 20899, ellen.voorhees@nist.gov)
Tice Dawn M. (National Institute of Standards and Technology, Gaithersburg, MD 20899, dawn.tice@nist.gov)
Keywords Human Assessors, Question Answering, Validation
Session Session EO5 - Information Retrieval and Question Answering Evaluation
Full Paper 26.ps, 26.pdf
Abstract The TREC-8 Question Answering track was the first large-scale evaluation of domain-independent question answering systems. This paper summarizes the results of the track, including both an overview of the approaches taken to the problem and an analysis of the evaluation methodology. Retrieval results for the more stringent condition in which system responses were limited to 50 bytes showed that explicit linguistic processing was more effective than the bag-of-words approaches that are effective for document retrieval. The use of multiple human assessors to judge the correctness of the systems' responses demonstrated that assessors have legitimate differences of opinion as to correctness even for fact-based, short-answer questions. Evaluations of question answering technology will need to accommodate these differences since eventual end-users of the technology will have similar differences.