Summary of the paper

Title YouDACC: the Youtube Dialectal Arabic Comment Corpus
Authors Ahmed Salama, Houda Bouamor, Behrang Mohit and Kemal Oflazer
Abstract This paper presents YOUDACC, an automatically annotated large-scale multi-dialectal Arabic corpus collected from user comments on YouTube videos. Our corpus covers different groups of dialects: Egyptian (EG), Gulf (GU), Iraqi (IQ), Maghrebi (MG) and Levantine (LV). We perform an empirical analysis on the crawled corpus and demonstrate that our proposed location-based method is effective for the task of dialect labeling.
Topics Acquisition, Multilinguality
Full paper YouDACC: the Youtube Dialectal Arabic Comment Corpus
