Cooperation between black box and glass box approaches for the evaluation of a question answering system


Martine Hurault-Plantet (LIMSI-CNRS Bât 508, Université de Paris-Sud 91403 ORSAY (France))

Laura Monceaux (LIMSI-CNRS Bât 508, Université de Paris-Sud 91403 ORSAY (France))


EP1: Evaluation


For the past three years, the question answering system QALC, under development in our team, has taken part in the Question Answering (QA) track of the TREC (Text REtrieval Conference) evaluation campaigns. In the QA track, each system is evaluated according to a black box approach: the input is a set of questions and the output is, for each question, five answers ranked by decreasing relevance. A score is then computed from the correctness of the answers. Such an evaluation is well suited to comparing systems with each other, as well as to comparing a system with itself after a modification. However, knowing how to improve the system requires another approach: the glass box approach. Indeed, in complex modular systems such as question answering systems, we have to "enter" the system and evaluate each module in order to assess whether it reaches the goal that has been set for it. Nevertheless, after modifying a module, we have to apply the black box approach again to the whole system in order to judge the effect of the modification on the overall result. In this paper, we thus present an evaluation of our system based on both black box and glass box approaches. We describe the methods used as well as the results obtained.


Quantitative evaluation, Glass box approach, Black box approach, Cooperation, Question answering
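
The black box scoring described in the abstract (five ranked answers per question, one score over the question set) can be illustrated with a short sketch. The abstract does not name the scoring formula, so this example assumes mean reciprocal rank, the measure used by the early TREC QA tracks for five ranked answers. The function names and the judgement data are hypothetical and serve only as illustration; they are not taken from QALC.

```python
from typing import Sequence


def reciprocal_rank(judgements: Sequence[bool], max_rank: int = 5) -> float:
    """Return 1/rank of the first correct answer within the top max_rank, else 0.0."""
    for rank, is_correct in enumerate(judgements[:max_rank], start=1):
        if is_correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]], max_rank: int = 5) -> float:
    """Average the reciprocal rank over all questions (assumed scoring measure)."""
    if not runs:
        raise ValueError("at least one question is required")
    return sum(reciprocal_rank(j, max_rank) for j in runs) / len(runs)


# Hypothetical correctness judgements: one list of five ranked answers per question.
judged = [
    [False, True, False, False, False],   # first correct answer at rank 2
    [True, False, False, False, False],   # first correct answer at rank 1
    [False, False, False, False, False],  # no correct answer returned
]

score = mean_reciprocal_rank(judged)
print(f"MRR over {len(judged)} questions: {score:.3f}")
```

Under this assumption, running the same function before and after a module is modified gives the whole-system comparison that the black box approach is meant to provide.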

