LREC 2014 Proceedings

Summary of the paper

Title	An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
Authors	Eshrag Refaee and Verena Rieser
Abstract	We present a newly collected data set of 8,868 gold-standard annotated Arabic feeds. The corpus is manually labelled for subjectivity and sentiment analysis (SSA) (κ = 0.816). In addition, the corpus is annotated with a variety of motivated feature-sets that have previously shown positive impact on performance. The paper highlights issues posed by Twitter as a genre, such as mixture of language varieties and topic-shifts. Our next step is to extend the current corpus using online semi-supervised learning. A first sub-corpus will be released via the ELRA repository as part of this submission.
Topics	Social Media Processing, Opinion Mining / Sentiment Analysis
Full paper	An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
Bibtex	@InProceedings{REFAEE14.317, author = {Eshrag Refaee and Verena Rieser}, title = {An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis}, booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)}, year = {2014}, month = {may}, date = {26-31}, address = {Reykjavik, Iceland}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis}, publisher = {European Language Resources Association (ELRA)}, isbn = {978-2-9517408-8-4}, language = {english} }