LREC 2020 Proceedings

Proceedings of the 12th Web as Corpus Workshop

ISBN: 979-10-95546-68-9
EAN: 9791095546689

List of Papers





Current Challenges in Web Corpus Building
Miloš Jakubíček, Vojtěch Kovář, Pavel Rychlý and Vít Suchomel
pp. 1‑4
Out-of-the-Box and into the Ditch? Multilingual Evaluation of Generic Text Extraction Tools
Adrien Barbaresi and Gaël Lejeune
pp. 5‑13
From Web Crawl to Clean Register-Annotated Corpora
Veronika Laippala, Samuel Rönnqvist, Saara Hellström, Juhani Luotolahti, Liina Repo, Anna Salmela, Valtteri Skantsi and Sampo Pyysalo
pp. 14‑22
Building Web Corpora for Minority Languages
Heidi Jauhiainen, Tommi Jauhiainen and Krister Lindén
pp. 23‑32
The ELTE.DH Pilot Corpus – Creating a Handcrafted Gigaword Web Corpus with Metadata
Balázs Indig, Árpád Knap, Zsófia Sárközi-Lindner, Mária Timári and Gábor Palkó
pp. 33‑41
Hypernym-LIBre: A Free Web-based Corpus for Hypernym Detection
Shaurya Rawat, Mariano Rico and Oscar Corcho
pp. 42‑49
A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging
Shabnam Behzad and Amir Zeldes
pp. 50‑56
Streaming Language-Specific Twitter Data with Optimal Keywords
Tim Kreutz and Walter Daelemans
pp. 57‑64

© 2020 ELDA/ELRA