Summary of the paper

Title Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions
Authors Atsushi Fujii, Yuya Fujii and Takenobu Tokunaga
Abstract Reflecting the rapid growth of science, technology, and culture, it has become common practice to consult tools on the World Wide Web for various terms. Existing search engines provide an enormous volume of information, but retrieved information is not organized. Hand-compiled encyclopedias provide organized information, but the quantity of information is limited. In this paper, aiming to integrate the advantages of both tools, we propose a method to organize a search result based on multiple viewpoints as in Wikipedia. Because viewpoints required for explanation are different depending on the type of a term, such as animal and disease, we model articles in Wikipedia to extract a viewpoint structure for each term type. To identify a set of term types, we independently use manual annotation and automatic document clustering for Wikipedia articles. We also propose an effective feature for clustering of Wikipedia articles. We experimentally show that the document clustering reduces the cost for the manual annotation while maintaining the accuracy for modeling Wikipedia articles.
Topics Summarisation, Information Extraction, Information Retrieval, Document Classification, Text categorisation
Full paper Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions
Bibtex @InProceedings{FUJII12.714,
  author = {Atsushi Fujii and Yuya Fujii and Takenobu Tokunaga},
  title = {Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions},
  booktitle = {Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
Powered by ELDA © 2012 ELDA/ELRA