

LREC 2022

SIGUL Workshop



PROGRAM

Friday, June 24, 2022

 14:00–15:00 Opening
 Keynote talk
 15:00–16:00 Speech
15:00–15:15 Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings
Marcely Zanon Boito, Bolaji Yusuf, Lucas Ondel, Aline Villavicencio and Laurent Besacier
15:15–15:30 An Open Source Web Reader for Under-Resourced Languages
Judy Fong, Þorsteinn Daði Gunnarsson, Sunneva Þorsteinsdóttir, Gunnar Thor Örnólfsson and Jon Gudnason
15:30–15:45 Text-to-Speech for Under-Resourced Languages: Phoneme Mapping and Source Language Selection in Transfer Learning
Phat Do, Matt Coler, Jelske Dijkstra and Esther Klabbers
15:45–16:00 ReadAlong Studio: Practical Zero-Shot Text-Speech Alignment for Indigenous Language Audiobooks
Patrick Littell, Eric Joanis, Aidan Pine, Marc Tessier, David Huggins Daines and Delasie Torkornoo
 16:00–16:30 Coffee break
 16:30–17:45 Data
16:30–16:45 Corpus Creation for Sentiment Analysis in Code-Mixed Tulu Text
Asha Hegde, Mudoor Devadas Anusha, Sharal Coelho, Hosahalli Lakshmaiah Shashirekha and Bharathi Raja Chakravarthi
16:45–17:00 Crowd-sourcing for Less-resourced Languages: Lingua Libre for Polish
Mathilde Hutin and Marc Allassonnière-Tang
17:00–17:15 Tupían Language Ressources: Data, Tools, Analyses
Lorena Martín Rodríguez, Tatiana Merzhevich, Wellington Silva, Tiago Tresoldi, Carolina Aragon and Fabrício F. Gerardi
17:15–17:30 Quality versus Quantity: Building Catalan-English MT Resources
Ona de Gibert Bonet, Ksenia Kharitonova, Blanca Calvo Figueras, Jordi Armengol-Estapé and Maite Melero
17:30–17:45 A Sentiment Corpus for South African Under-Resourced Languages in a Multilingual Context
Ronny Mabokela and Tim Schlippe

Saturday, June 25, 2022

 9:00–10:00 MT4All
 CUNI Submission to MT4All Shared Task
Ivana Kvapilíková and Ondrej Bojar
 10:00–10:30 General
10:00–10:15 Resource: Indicators on the Presence of Languages in Internet
Daniel Pimienta
10:15–10:30 Language Technologies for Low Resource Languages: Sociolinguistic and Multilingual Insights
A. Seza Doğruöz and Sunayana Sitaram
 10:30–11:00 Coffee break
 11:00–12:45 NLP
11:00–11:15 Sentiment Analysis for Hausa: Classifying Students’ Comments
Ochilbek Rakhmanov and Tim Schlippe
11:15–11:30 Nepali Encoder Transformers: An Analysis of Auto Encoding Transformer Language Models for Nepali Text Classification
Utsav Maskey, Manish Bhatta, Shiva Bhatt, Sanket Dhungel and Bal Krishna Bal
11:30–11:45 CoSwID, a Code Switching Identification Method Suitable for Under-Resourced Languages
Laurent Kevers
11:45–12:00 A Neural Network Approach to Create Minangkabau-Indonesia Bilingual Dictionary
Kartika Resiandi, Yohei Murakami and Arbi Haza Nasution
12:00–12:15 Machine Translation from Standard German to Alemannic Dialects
Louisa Lambrecht, Felix Schneider and Alexander Waibel
12:15–12:30 Question Answering Classification for Amharic Social Media Community Based Questions
Tadesse Destaw, Seid Muhie Yimam, Abinew Ayele and Chris Biemann
12:30–12:45 Automatic Detection of Morphological Processes in the Yorùbá Language
Tunde Adegbola
 12:45–14:00 Lunch break
 14:00–14:50 Joint SIGUL2022-MWE Poster session
 Evaluating Unsupervised Approaches to Morphological Segmentation for Wolastoqey
Diego Bear and Paul Cook
 Baseline English and Maltese-English Classification Models for Subjectivity Detection, Sentiment Analysis, Emotion Analysis, Sarcasm Detection, and Irony Detection
Keith Cortis and Brian Davis
 Building Open-source Speech Technology for Low-resource Minority Languages with SáMi as an Example – Tools, Methods and Experiments
Katri Hiovain-Asikainen and Sjur Moshagen
 Investigating the Quality of Static Anchor Embeddings from Transformers for Under-Resourced Languages
Pranaydeep Singh, Orphee De Clercq and Els Lefever
 Introducing YakuToolkit. Yakut Treebank and Morphological Analyzer.
Tatiana Merzhevich and Fabrício Ferraz Gerardi
 A Language Model for Spell Checking of Educational Texts in Kurdish (Sorani)
Roshna Abdulrahman and Hossein Hassani
 SimRelUz: Similarity and Relatedness Scores as a Semantic Evaluation Dataset for Uzbek Language
Ulugbek Salaev, Elmurod Kuriyozov and Carlos Gómez-Rodríguez
 ENRICH4ALL: A First Luxembourgish BERT Model for a Multilingual Chatbot
Dimitra Anastasiou
 14:50–15:40 Joint SIGUL2022-MWE Keynote speech
 15:40–16:00 Joint SIGUL2022-MWE Common Discussion
 16:00–16:30 Coffee break
 16:30–17:30 Panel discussion
 17:30–17:50 General discussion
 17:50–18:00 Closing