LREC 2016 Proceedings

Summary of the paper

Title	Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization
Authors	Muhammad Humayoun and Hwanjo Yu
Abstract	Preprocessing is a preliminary step in many fields, including IR and NLP. The effect of basic preprocessing settings on English text summarization is well studied. However, to the best of our knowledge, no such study exists for the Urdu language. In this study, we analyze the effect of basic preprocessing settings on single-document text summarization for Urdu through various experiments on a benchmark corpus. The analysis uses state-of-the-art algorithms for extractive summarization and examines the effect of stopword removal, lemmatization, and stemming. The results show that these preprocessing settings improve the summarization results.
Topics	Summarisation, Information Extraction, Information Retrieval, Evaluation Methodologies
Full paper	Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization
Bibtex	@InProceedings{HUMAYOUN16.82, author = {Muhammad Humayoun and Hwanjo Yu}, title = {Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization}, booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)}, year = {2016}, month = {may}, date = {23-28}, location = {Portorož, Slovenia}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis}, publisher = {European Language Resources Association (ELRA)}, address = {Paris, France}, isbn = {978-2-9517408-9-1}, language = {english} }