Multi-Document Summarization with GISTEXTER
Sanda M. Harabagiu (Language Computer Corporation, Dallas, TX 75206, USA)
Finley Lacatusu (Department of Computer Sciences, University of Texas, Dallas, Mail Station EC31, Box 830688, Richardson, Texas 75083-0688)
Paul Morarescu (University of Texas, Dallas)
Steven J. Maiorano (University of Sheffield, Sheffield S1 4DP, UK)
EO3: Written Systems Evaluation
This paper presents the architecture and the multi-document summarization techniques implemented in the GISTEXTER system. It describes an algorithm that produces incremental multi-document summaries when good-quality extraction templates are available. It also presents an empirical method for generating ad hoc templates, which are populated with information extracted from texts by automatically acquired extraction patterns. GISTEXTER's results in the DUC-2001 evaluations illustrate the advantages of these techniques.