LREC 2020 Proceedings


The 12th Web as Corpus Workshop



PROGRAM

 Current Challenges in Web Corpus Building
Miloš Jakubíček, Vojtěch Kovář, Pavel Rychlý and Vít Suchomel
 Out-of-the-Box and into the Ditch? Multilingual Evaluation of Generic Text Extraction Tools
Adrien Barbaresi and Gaël Lejeune
 From Web Crawl to Clean Register-Annotated Corpora
Veronika Laippala, Samuel Rönnqvist, Saara Hellström, Juhani Luotolahti, Liina Repo, Anna Salmela, Valtteri Skantsi and Sampo Pyysalo
 Building Web Corpora for Minority Languages
Heidi Jauhiainen, Tommi Jauhiainen and Krister Lindén
 The ELTE.DH Pilot Corpus – Creating a Handcrafted Gigaword Web Corpus with Metadata
Balázs Indig, Árpád Knap, Zsófia Sárközi-Lindner, Mária Timári and Gábor Palkó
 Hypernym-LIBre: A Free Web-based Corpus for Hypernym Detection
Shaurya Rawat, Mariano Rico and Oscar Corcho
 A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging
Shabnam Behzad and Amir Zeldes
 Streaming Language-Specific Twitter Data with Optimal Keywords
Tim Kreutz and Walter Daelemans