Summary of the paper

Title Clause-based Discourse Segmentation of Arabic Texts
Authors Iskandar Keskes, Farah Benamara and Lamia Hadrich Belguith
Abstract This paper describes a rule-based approach to segmenting Arabic texts into clauses. Our method relies on an extensive analysis of a large set of lexical cues as well as punctuation marks. Our analysis was carried out on two different corpus genres: news articles and elementary school textbooks. We propose a three-step segmentation algorithm: first using only punctuation marks, then relying only on lexical cues, and finally using both punctuation marks and lexical cues. The results were compared with manual segmentations produced by experts.
Topics Discourse annotation, representation and processing, Corpus (creation, annotation, etc.), Tools, systems, applications
Bibtex @InProceedings{KESKES12.939,
  author = {Iskandar Keskes and Farah Benamara and Lamia Hadrich Belguith},
  title = {Clause-based Discourse Segmentation of Arabic Texts},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
}