Summary of the paper

Title SuperCAT: The (New and Improved) Corpus Analysis Toolkit
Authors K. Bretonnel Cohen, William A. Baumgartner Jr. and Irina Temnikova
Abstract This paper presents SuperCAT, a corpus analysis toolkit. It is a radical extension of SubCAT, the Sublanguage Corpus Analysis Toolkit, from sublanguage analysis to corpus analysis in general. The idea behind SuperCAT is that representative corpora have no tendency towards closure; that is, they tend towards infinity. In contrast, non-representative corpora have a tendency towards closure (roughly, finiteness). SuperCAT focuses on general techniques for the quantitative description of the characteristics of any corpus (or other language sample), particularly the characteristics of its lexical distributions. Additionally, SuperCAT features a complete re-engineering of the previous SubCAT architecture.
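One common way to make the notion of lexical closure concrete is a type growth curve: count how many distinct word types have been seen as more tokens are read. The sketch below is only an illustration of that general idea, not SuperCAT's actual implementation; the function name `type_growth`, the regex tokenizer, and the toy samples are all assumptions made for this example. A curve that flattens suggests a tendency towards closure, while a curve that keeps rising suggests none.

```python
import re


def type_growth(text, step=5):
    """Return (tokens_seen, distinct_types_seen) pairs, sampled every `step` tokens.

    Hypothetical helper for illustration; tokenization is a naive lowercase
    word split, which is an assumption and not taken from the paper.
    """
    tokens = re.findall(r"\w+", text.lower())
    seen = set()
    curve = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        # Record a point at every `step` tokens and at the final token.
        if i % step == 0 or i == len(tokens):
            curve.append((i, len(seen)))
    return curve


# Toy "closed" sample: the same six-token sentence repeated five times.
closed_sample = "the cat sat on the mat " * 5

# Toy "open" sample: thirty tokens, every one a new type.
open_sample = " ".join(f"w{i}" for i in range(30))

closed_curve = type_growth(closed_sample, step=6)
open_curve = type_growth(open_sample, step=6)

print(closed_curve)  # type count stops growing after the first sentence
print(open_curve)    # type count keeps pace with the token count
```

On real corpora the contrast is far less clean than in these toy samples, so the curve's shape is best compared across samples of similar size rather than read in isolation.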
Topics Corpus (Creation, Annotation, etc.), Tools, Systems, Applications, Other
Full paper SuperCAT: The (New and Improved) Corpus Analysis Toolkit
Bibtex
@InProceedings{COHEN16.742,
  author = {K. Bretonnel Cohen and William A. Baumgartner Jr. and Irina Temnikova},
  title = {SuperCAT: The (New and Improved) Corpus Analysis Toolkit},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }