LREC 2018 Proceedings

Summary of the paper

Title	Utilizing Large Twitter Corpora to Create Sentiment Lexica
Authors	Valerij Fredriksen, Brage Jahren and Björn Gambäck
Abstract	The paper describes an automatic Twitter sentiment lexicon creator and a lexicon-based sentiment analysis system. The lexicon creator is based on a Pointwise Mutual Information approach, utilizing 6.25 million automatically labeled tweets and 103 million unlabeled tweets, with the created lexicon consisting of about 3 000 entries. In a comparison experiment, this lexicon outperformed a manually annotated lexicon. A sentiment analysis system utilizing the created lexicon, and handling both negation and intensification, produces results almost on par with sophisticated machine learning-based systems, while significantly outperforming those systems in terms of run-time.
Topics	Social Media Processing, Opinion Mining / Sentiment Analysis, Statistical And Machine Learning Methods
Full paper	Utilizing Large Twitter Corpora to Create Sentiment Lexica
Bibtex	@InProceedings{FREDRIKSEN18.1028,
  author = {Valerij Fredriksen and Brage Jahren and Björn Gambäck},
  title = "{Utilizing Large Twitter Corpora to Create Sentiment Lexica}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
}
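
The abstract mentions a Pointwise Mutual Information approach to lexicon creation and a lexicon-based scorer that handles negation and intensification. The sketch below illustrates the general idea only and is not the authors' implementation: it assumes whitespace tokenization, add-one smoothing, and the common score PMI(w, positive) - PMI(w, negative). The names `build_pmi_lexicon`, `score_tweet`, `NEGATORS`, and `INTENSIFIERS`, along with the word lists and multipliers, are hypothetical and chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical word lists; the paper's actual lists are not given in the abstract.
NEGATORS = {"not", "no", "never"}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}


def build_pmi_lexicon(labeled_tweets, min_count=2, smoothing=1.0):
    """Return {word: PMI(word, positive) - PMI(word, negative)}.

    labeled_tweets is an iterable of (text, label) pairs where label is
    "positive" or "negative". Smoothing is added to the per-word counts
    so that a word seen in only one class still gets a finite score.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for text, label in labeled_tweets:
        tokens = text.lower().split()
        if label == "positive":
            pos_counts.update(tokens)
        elif label == "negative":
            neg_counts.update(tokens)

    n_pos = sum(pos_counts.values())
    n_neg = sum(neg_counts.values())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one token in each class")

    lexicon = {}
    for word in set(pos_counts) | set(neg_counts):
        if pos_counts[word] + neg_counts[word] < min_count:
            continue  # skip words too rare to score reliably
        p = pos_counts[word] + smoothing
        n = neg_counts[word] + smoothing
        # The difference of the two PMI values simplifies to this ratio.
        lexicon[word] = math.log2((p * n_neg) / (n * n_pos))
    return lexicon


def score_tweet(text, lexicon):
    """Sum lexicon scores, flipping after a negator and scaling after an intensifier."""
    total, negate, factor = 0.0, False, 1.0
    for token in text.lower().split():
        if token in NEGATORS:
            negate = True
        elif token in INTENSIFIERS:
            factor *= INTENSIFIERS[token]
        elif token in lexicon:
            value = lexicon[token] * factor
            total += -value if negate else value
            negate, factor = False, 1.0  # modifiers apply to the next scored word only
    return total


tweets = [("good good day", "positive"), ("bad bad day", "negative")]
lexicon = build_pmi_lexicon(tweets)
print(score_tweet("not very good", lexicon))
```

A positive total suggests positive sentiment and a negative total suggests negative sentiment. A scorer of this kind needs only dictionary lookups per token, which fits the run-time advantage the abstract reports over machine learning-based systems.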