AUTHOR: Browse articles of the conference sorted by author

A
Abdelali, Ahmed Multi-Dialect Arabic POS Tagging: A CRF Approach
Part-of-Speech Tagging for Arabic Gulf Dialect Using Bi-LSTM
The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic
Abdennadher, Slim Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus
Abdou, Sherif Improving Dialogue Act Classification for Spontaneous Arabic Speech and Instant Messages at Utterance Level
Abdou, Mostafa MGAD: Multilingual Generation of Analogy Datasets
Abdul-Mageed, Muhammad You Tweet What You Speak: A City-Level Dataset of Arabic Dialects
Abdulkareem, Basma Unified Guidelines and Resources for Arabic Dialect Orthography
Abdulrahim, Dana The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
A Morphologically Annotated Corpus of Emirati Arabic
Abedi Firouzjaee, Hossein MirasText: An Automatically Generated Text Corpus for Persian
Abercrombie, Gavin 'Aye' or 'No'? Speech-level Sentiment Analysis of Hansard UK Parliamentary Debate Transcripts
Abner, Natasha Sign Languages and the Online World Online Dictionaries & Lexicostatistics
Aboelezz, Mariam Arabic Dialect Identification in the Context of Bivalency and Code-Switching
Abrami, Giuseppe TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations
A UIMA Database Interface for Managing NLP-related Text Annotations
Abromeit, Frank Interoperability of Language-related Information: Mapping the BLL Thesaurus to Lexvo and Glottolog
Universal Morphologies for the Caucasus region
Abudukelimu, Halidanmu Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Abulizi, Adudoukelimu Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Abzianidze, Lasha Evaluating Scoped Meaning Representations
Adams, Oliver Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Adda, Gilles A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Adda-Decker, Martine A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
The French-Algerian Code-Switching Triggered audio corpus (FACST)
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Aduriz, Itziar Konbitzul: an MWE-specific database for Spanish-Basque
Afli, Haithem FooTweets: A Bilingual Parallel Corpus of World Cup Tweets
Agarwal, Sumeet SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
Agerri, Rodrigo Building Named Entity Recognition Taggers via Parallel Corpora
Annotating Abstract Meaning Representations for Spanish
Developing New Linguistic Resources and Tools for the Galician Language
Agić, Željko Baselines and Test Data for Cross-Lingual Inference
Agrawal, Ruchit No more beating about the bush : A Step towards Idiom Handling for Indian Language NLP
Aharodnik, Katsiaryna Designing a Russian Idiom-Annotated Corpus
Ahmad, Wasi A Corpus to Learn Refer-to-as Relations for Nominals
Ahmia, Oussama Two Multilingual Corpora Extracted from the Tenders Electronic Daily for Machine Learning and Machine Translation Applications.
Ahrens, Kathleen Using a Corpus of English and Chinese Political Speeches for Metaphor Analysis
Ai, Renlong TQ-AutoTest – An Automated Test Suite for (Machine) Translation Quality
Aizawa, Akiko Universal Dependencies for Ainu
Akbik, Alan ZAP: An Open-Source Multilingual Annotation Projection Framework
FEIDEGGER: A Multi-modal Corpus of Fashion Images and Descriptions in German
Aker, Ahmet Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Akhtar, Syed Sarfaraz Humor Detection in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System
Al Kaabi, Meera A Morphologically Annotated Corpus of Emirati Arabic
Al Khalil, Muhamed A Leveled Reading Corpus of Modern Standard Arabic
Al shargi, Faisal Unified Guidelines and Resources for Arabic Dialect Orthography
AlGhamdi, Fahad WASA: A Web Application for Sequence Annotation
Alacam, Özge Incorporating Contextual Information for Language-Independent, Dynamic Disambiguation Tasks
Albert, Pierre The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Aldabe, Itziar Building Named Entity Recognition Taggers via Parallel Corpora
Alexandersson, Simon A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
Alexandersson, Jan The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Alfalasi, Latifa A Leveled Reading Corpus of Modern Standard Arabic
Algahtani, Abeer Annotating Attribution Relations in Arabic
Alharbi, Randah Multi-Dialect Arabic POS Tagging: A CRF Approach
Part-of-Speech Tagging for Arabic Gulf Dialect Using Bi-LSTM
Alhuzali, Hassan You Tweet What You Speak: A City-Level Dataset of Arabic Dialects
Ali Raza, Agha PronouncUR: An Urdu Pronunciation Lexicon Generator
Alkhereyf, Sakhar Unified Guidelines and Resources for Arabic Dialect Orthography
Almuzaini, Huda Annotating Attribution Relations in Arabic
Alonso Alemany, Laura Increasing Argument Annotation Reproducibility by Using Inter-annotator Agreement to Improve Guidelines
Alonso-Ramos, Margarita A Lexical Tool for Academic Writing in Spanish based on Expert and Novice Corpora
Alosaimy, Abdulrahman Web-based Annotation Tool for Inflectional Language Resources
Alotaibi, Madawi Annotating Attribution Relations in Arabic
Alsaif, Amal Annotating Attribution Relations in Arabic
Alsarsour, Israa DART: A Large Dataset of Dialectal Arabic Tweets
Alsuhaibani, Mohammed Joint Learning of Sense and Word Embeddings
Alvez, Javier Cross-checking WordNet and SUMO Using Meronymy
Alyahya, Tasniem Annotating Attribution Relations in Arabic
Alzetta, Chiara Universal Dependencies and Quantitative Typological Trends. A Case Study on Word Order
Amsili, Pascal A Gold Anaphora Annotation Layer on an Eye Movement Corpus
Ananiadou, Sophia A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
Anchiêta, Rafael Towards AMR-BR: A SemBank for Brazilian Portuguese Language
Andersson, Michael Medical Entity Corpus with PICO elements and Sentiment Analysis
Andersson, Linda Medical Entity Corpus with PICO elements and Sentiment Analysis
Androulakaki, Theofronia KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue
Androutsopoulos, Ion BioRead: A New Dataset for Biomedical Reading Comprehension
Andruszkiewicz, Piotr Annotated Corpus of Scientific Conference's Homepages for Information Extraction
Andryushechkin, Vladimir A Comparison Of Emotion Annotation Schemes And A New Annotated Data Set
Ansari, Ebrahim Extracting an English-Persian Parallel Corpus from Comparable Corpora
Antonelli, Oronzo PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies
Antonsen, Lene Modeling Northern Haida Verb Morphology
Building a Constraint Grammar Parser for Plains Cree Verbs and Arguments
Aoyama, Hiroyuki A Parallel Corpus of Arabic-Japanese News Articles
Araki, Kenji Comparison of Pun Detection Methods Using Japanese Pun Corpus
Araki, Masahiro Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Aramaki, Eiji J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Aranberri, Nora Building Named Entity Recognition Taggers via Parallel Corpora
Arase, Yuki SPADE: Evaluation Dataset for Monolingual Phrase Alignment
CEFR-based Lexical Simplification Dataset
Arcan, Mihael Automatic Enrichment of Terminological Resources: the IATE RDF Example
Arivazhagan, Naveen CogCompNLP: Your Swiss Army Knife for NLP
Arnold, Thomas Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
Arnold, Alexandre A Real-life, French-accented Corpus of Air Traffic Control Communications
Aroyo, Lora Resource Interoperability for Sustainable Benchmarking: The Case of Events
Arppe, Antti A Computational Architecture for the Morphology of Upper Tanana
Modeling Northern Haida Verb Morphology
Building a Constraint Grammar Parser for Plains Cree Verbs and Arguments
Arps, David A Parser for LTAG and Frame Semantics
Arranz, Victoria New directions in ELRA activities
Artstein, Ron Dialogue Structure Annotation for Multi-Floor Interaction
Chahta Anumpa: A multimodal corpus of the Choctaw Language
Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Asahara, Masayuki Universal Dependencies Version 2 for Japanese
All-words Word Sense Disambiguation Using Concept Embeddings
Asai, Akari HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Asao, Yoshihiko Annotating Zero Anaphora for Question Answering
Asghari, Habibollah Parsivar: A Language Processing Toolkit for Persian
Assylbekov, Zhenisbek Manual vs Automatic Bitext Extraction
Astésano, Corine Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Athar, Awais PronouncUR: An Urdu Pronunciation Lexicon Generator
Atkinson, Katie A Dataset for Inter-Sentence Relation Extraction using Distant Supervision
Attia, Mohammed Multi-Dialect Arabic POS Tagging: A CRF Approach
Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks
The Morpho-syntactic Annotation of Animacy for a Dependency Parser
Atwell, Eric Web-based Annotation Tool for Inflectional Language Resources
Auguste, Jeremy Semantic Frame Parsing for Information Extraction : the CALOR corpus
Auguste, Jérémy Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
Auziņa, Ilze The Use of Text Alignment in Semi-Automatic Error Analysis: Use Case in the Development of the Corpus of the Latvian Language Learners
Avanzi, Mathieu Crowdsourcing Regional Variation Data and Automatic Geolocalisation of Speakers of European French
Strategies and Challenges for Crowdsourcing Regional Dialect Perception Data for Swiss German and Swiss French
Avramova, Vanya A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
Azpeitia, Andoni Evaluating Domain Adaptation for Machine Translation Across Scenarios

B
Bechet, Frederic Adding Syntactic Annotations to Flickr30k Entities Corpus for Multimodal Ambiguous Prepositional-Phrase Attachment Resolution
Bessagnet, Marie-Noelle Automatic Identification of Research Fields in Scientific Papers
Badarau, Bianca Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Badia, Toni MultiBooked: A Corpus of Basque and Catalan Hotel Reviews Annotated for Aspect-level Sentiment Classification
Baeriswyl, Michael Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Baird, Austin Classifying Sluice Occurrences in Dialogue
Balaguer, Mathieu Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Bali, Kalika An Integrated Representation of Linguistic and Social Functions of Code-Switching
Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach
Banski, Piotr Lightweight Grammatical Annotation in the TEI: New Perspectives
Bar-Haim, Roy SLIDE - a Sentiment Lexicon of Common Idioms
Barbaresi, Adrien A corpus of German political speeches from the 21st century
A database of German definitory contexts from selected web sources
Barbu Mititelu, Verginica Ensemble Romanian Dependency Parsing with Neural Networks
The Reference Corpus of the Contemporary Romanian Language (CoRoLa)
Barkarson, Starkaður Risamálheild: A Very Large Icelandic Text Corpus
Barnes, Jeremy MultiBooked: A Corpus of Basque and Catalan Hotel Reviews Annotated for Aspect-level Sentiment Classification
Barteld, Fabian HiNTS: A Tagset for Middle Low German
Bartolini, Roberto The LREC Workshops Map
Bartosiak, Tomasz A New Version of the Składnica Treebank of Polish Harmonised with the Walenty Valency Dictionary
Barzegar, Siamak A Multilingual Test Collection for the Semantic Search of Entity Categories
SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages
Indra: A Word Embedding and Semantic Relatedness Server
Batanović, Vuk Fine-grained Semantic Textual Similarity for Serbian
Batista-Navarro, Riza 'Aye' or 'No'? Speech-level Sentiment Analysis of Hansard UK Parliamentary Debate Transcripts
Batouche, Brahim PMKI: an European Commission action for the interoperability, maintainability and sustainability of Language Resources
Batra, Vishwash Neural Caption Generation for News Images
Baumann, Martin Effects of Gender Stereotypes on Trust and Likability in Spoken Human-Robot Interaction
Baumartz, Daniel FastSense: An Efficient Word Sense Disambiguation Classifier
Bayatli, Sevilay Finite-state morphological analysis for Gagauz
Bayomi, Mostafa C-HTS: A Concept-based Hierarchical Text Segmentation approach
Bechet, Frederic Semantic Frame Parsing for Information Extraction : the CALOR corpus
Becker, Karin A Large Parallel Corpus of Full-Text Scientific Articles
Behnke, Maximiliana Improving Machine Translation of Educational Content via Crowdsourcing
Bejček, Eduard ForFun 1.0: Prague Database of Forms and Functions -- An Invaluable Resource for Linguistic Research
Bel, Núria Can Domain Adaptation be Handled as Analogies?
Beliga, Slobodan Evaluation of Croatian Word Embeddings
Belik, Patrizia Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Bell, Dane Grounding Gradable Adjectives through Crowdsourcing
Bella, Gábor Using Crowd Agreement for Wordnet Localization
Bellandi, Andrea One Language to rule them all: modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL
Benjumea, Juan A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Benzitoun, Christophe Crowdsourcing Regional Variation Data and Automatic Geolocalisation of Speakers of European French
Bergem, Eivind Alexander NoReC: The Norwegian Review Corpus
Berkling, Kay A 2nd Longitudinal Corpus for Children's Writing with Enhanced Output for Specific Spelling Patterns
Berlingerio, Michele Towards a music-language mapping
Bermeitinger, Bernhard A Multilingual Test Collection for the Semantic Search of Entity Categories
Bernard, Guillaume Matics Software Suite: New Tools for Evaluation and Data Exploration
Bernhard, Delphine Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Pronunciation Dictionaries for the Alsatian Dialects to Analyze Spelling and Phonetic Variation
Bernstam, Elmer A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation
Bertoldi, Nicola ESCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing
Besacier, Laurent Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation
A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Beser, Deniz Low-resource Post Processing of Noisy OCR Output for Historical Corpus Digitisation
Beskow, Jonas A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
Crowdsourced Multimodal Corpora Collection Tool
Beukeboom, Camiel Studying Muslim Stereotyping through Microportrait Extraction
Bhandari, Sujeet Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
Bharadwaj, Varun Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach
Bhardwaj, Shubham SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
Bhatia, Akshit Aggression-annotated Corpus of Hindi-English Code-mixed Data
Bhattacharyya, Pushpak Sentence Level Temporality Detection using an Implicit Time-sensed Resource
TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection
A Deep Neural Network based Approach for Entity Extraction in Code-Mixed Indian Social Media Text
The IIT Bombay English-Hindi Parallel Corpus
Morphology Injection for English-Malayalam Statistical Machine Translation
ASAP++: Enriching the ASAP Automated Essay Grading Dataset with Essay Attribute Scores
Medical Sentiment Analysis using Social Media: Towards building a Patient Assisted System
Sarcasm Target Identification: Dataset and An Introductory Approach
Indian Language Wordnets and their Linkages with Princeton WordNet
Towards a Standardized Dataset for Noun Compound Interpretation
MMQA: A Multi-domain Multi-lingual Question-Answering Framework for English and Hindi
Biemann, Chris Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl
Enriching Frame Representations with Distributionally Induced Senses
An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
Retrofitting Word Representations for Unsupervised Sense Aware Word Similarities
Improving Hypernymy Extraction with Distributional Semantic Classes
Bies, Ann Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers
Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI
Bin Zia, Haris PronouncUR: An Urdu Pronunciation Lexicon Generator
Bird, Steven Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Bisazza, Arianna Examining the Tip of the Iceberg: A Data Set for Idiom Translation
Evaluation of Machine Translation Performance Across Multiple Genres and Languages
Blache, Philippe A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Bladier, Tatiana AET: Web-based Adjective Exploration Tool for German
Blanco, Eduardo Annotating Temporally-Anchored Spatial Knowledge by Leveraging Syntactic Dependencies
Annotating If the Authors of a Tweet are Located at the Locations They Tweet About
Blank, Idan The Natural Stories Corpus
Bleier, Arnim ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Blessing, Andre The GermaParl Corpus of Parliamentary Protocols
Blodgett, Austin Semantic Supersenses for English Possessives
Blätte, Andreas The GermaParl Corpus of Parliamentary Protocols
Boberg, Jill The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Bock, Roger When ACE met KBP: End-to-End Evaluation of Knowledge Base Population with Component-level Annotation
Bojanowski, Piotr Learning Word Vectors for 157 Languages
Advances in Pre-Training Distributed Word Representations
Bollegala, Danushka Joint Learning of Sense and Word Embeddings
A Dataset for Inter-Sentence Relation Extraction using Distant Supervision
Sentiment-Stance-Specificity (SSS) Dataset: Identifying Support-based Entailment among Opinions.
Bompolas, Stavros Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Bond, Francis Toward An Epic Epigraph Graph
Bonial, Claire Dialogue Structure Annotation for Multi-Floor Interaction
Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Bonin, Francesca SLIDE - a Sentiment Lexicon of Common Idioms
Towards a music-language mapping
Bonn, Julia The New Propbank: Aligning Propbank with AMR through POS Unification
Bono, Mayumi Preliminary Analysis of Embodied Interactions between Science Communicators and Visitors Based on a Multimodal Corpus of Japanese Conversations in a Science Museum
Borad, Niravkumar Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Borg, Claudia Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Borin, Lars SenSALDO: Creating a Sentiment Lexicon for Swedish
Generating a Gold Standard for a Swedish Sentiment Lexicon
Bos, Johan Evaluating Scoped Meaning Representations
Bosch, Sonja Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Bosco, Cristina PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies
An Italian Twitter Corpus of Hate Speech against Immigrants
Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ
Bothe, Chandrakant A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks
Bouamor, Houda The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
Bouchekif, Abdessalam FrNewsLink : a corpus linking TV Broadcast News Segments and Press Articles
Boula de Mareüil, Philippe Crowdsourcing Regional Variation Data and Automatic Geolocalisation of Speakers of European French
A Speaking Atlas of the Regional Languages of France
Boumber, Dainis Experiments with Convolutional Neural Networks for Multi-Label Authorship Attribution
Bourgonje, Peter Automatic and Manual Web Annotations in an Infrastructure to handle Fake News and other Online Media Phenomena
Bouscarrat, Leo Towards Language Technology for Mi'kmaq
Boutz, Jennifer Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
Bowden, Kevin SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Bowden, Richard SMILE Swiss German Sign Language Dataset
Boyes Braem, Penny SMILE Swiss German Sign Language Dataset
Braffort, Annelies Modeling French Sign Language: a proposal for a semantically compositional system
Branco, Ruben Browsing and Supporting Pluricentric Global Wordnet, or just your Wordnet of Interest
Branco, António We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed
Semantic Equivalence Detection: Are Interrogatives Harder than Declaratives?
Finely Tuned, 2 Billion Token Based Word Embeddings for Portuguese
Browsing and Supporting Pluricentric Global Wordnet, or just your Wordnet of Interest
Bras, Myriam Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Braschler, Martin Overcoming the Long Tail Problem: A Case Study on CO2-Footprint Estimation of Recipes using Information Retrieval
Brasoveanu, Adrian Framing Named Entity Linking Error Types
Brassey, Jon Medical Entity Corpus with PICO elements and Sentiment Analysis
Braun, Bettina The Distribution and Prosodic Realization of Verb Forms in German Infant-Directed Speech
Braunger, Patricia Towards an Automatic Assessment of Crowdsourced Data for NLU
Bravo, Àlex PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Britz, Denny JESC: Japanese-English Subtitle Corpus
Brixey, Jacqueline Chahta Anumpa: A multimodal corpus of the Choctaw Language
Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
Brizan, David Guy Candidate Ranking for Maintenance of an Online Dictionary
Interpersonal Relationship Labels for the CALLHOME Corpus
Broad, Claire Candidate Ranking for Maintenance of an Online Dictionary
Brock, Heike Deep JSLC: A Multimodal Corpus Collection for Data-driven Generation of Japanese Sign Language Expressions
Broux, Pierre-Alexandre Computer-assisted Speaker Diarization: How to Evaluate Human Corrections
Brown, Susan Windisch Integrating Generative Lexicon Event Structures into VerbNet
Bruijnes, Merijn An Information-Providing Closed-Domain Human-Agent Interaction Corpus
Brum, Henrico Building a Sentiment Corpus of Tweets in Brazilian Portuguese
Buechel, Sven Representation Mapping: A Novel Approach to Generate High-Quality Multi-Lingual Emotion Lexicons
Sharing Copies of Synthetic Clinical Corpora without Physical Distribution — A Case Study to Get Around IPRs and Privacy Constraints Featuring the German JSYNCC Corpus
Bui, Trung Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Buitelaar, Paul A Comparison Of Emotion Annotation Schemes And A New Annotated Data Set
Automatic Enrichment of Terminological Resources: the IATE RDF Example
A supervised approach to taxonomy extraction using word embeddings
Teanga: A Linked Data based platform for Natural Language Processing
Bulling, Andreas A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks
Bunt, Harry Towards an ISO Standard for the Annotation of Quantification
Towards Continuous Dialogue Corpus Creation: writing to corpus and generating from it
Burchardt, Aljoscha TQ-AutoTest – An Automated Test Suite for (Machine) Translation Quality
Bures, Lukas Towards Processing of the Oral History Interviews and Related Printed Documents
Burga, Alicia Compilation of Corpora for the Study of the Information Structure–Prosody Interface
Butt, Miriam A Multilingual Approach to Question Classification
Bystedt, Mattias FARMI: A FrAmework for Recording Multi-Modal Interactions
Béchet, Nicolas Two Multilingual Corpora Extracted from the Tenders Electronic Daily for Machine Learning and Machine Translation Applications.
Béchet, Frédéric Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
Bērziņš, Aivars Collecting Language Resources from Public Administrations in the Nordic and Baltic Countries
Bělohlávek, Petr Using Adversarial Examples in Natural Language Processing

C
Choi, Key-Sun Incorporating Global Contexts into Sentence Embedding for Relational Extraction at the Paragraph Level with Distant Supervision
Semi-automatic Korean FrameNet Annotation over KAIST Treebank
Automatic Wordnet Mapping: from CoreNet to Princeton WordNet
Unsupervised Korean Word Sense Disambiguation using CoreNet
Cabezas-García, Melania Towards the Inference of Semantic Relations in Complex Nominals: a Pilot Study
Cai, Zhongxi Statistical Analysis of Missing Translation in Simultaneous Interpretation Using A Large-scale Bilingual Speech Corpus
Callahan, Tiffany Three Dimensions of Reproducibility in Natural Language Processing
Callison-Burch, Chris Introducing NIEUW: Novel Incentives and Workflows for Eliciting Linguistic Data
Calvo, Hiram Distribution of Emotional Reactions to News Articles in Twitter
Calvo, Arturo The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Calzolari, Nicoletta LREMap, a Song of Resources and Evaluation
Camelin, Nathalie FrNewsLink : a corpus linking TV Broadcast News Segments and Press Articles
Simulating ASR errors for training SLU systems
Camgöz, Necati Cihan SMILE Swiss German Sign Language Dataset
Camilleri, Kenneth Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Campbell, Nick The Metalogue Debate Trainee Corpus: Data Collection and Annotations
The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Chats and Chunks: Annotation and Analysis of Multiparty Long Casual Conversations
Development of an Annotated Multimodal Dataset for the Investigation of Classification and Summarisation of Presentations using High-Level Paralinguistic Features
Speech Rate Calculations with Short Utterances: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task
Candito, Marie Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer
Cao, Shuyuan Using Discourse Information for Education with a Spanish-Chinese Parallel Corpus
Cao, Xuan-Nga BabyCloud, a Technological Platform for Parents and Researchers
Cao, Yan Analyzing Vocabulary Commonality Index Using Large-scaled Database of Child Language Development
Cao, Kai Sound Signal Processing with Seq2Tree Network
Cardellino, Fernando Increasing Argument Annotation Reproducibility by Using Inter-annotator Agreement to Improve Guidelines
Cardellino, Cristian Increasing Argument Annotation Reproducibility by Using Inter-annotator Agreement to Improve Guidelines
Cardie, Claire A Corpus of eRulemaking User Comments for Measuring Evaluability of Arguments
Cardier, Beth Annotating High-Level Structures of Short Stories and Personal Anecdotes
Carl, Michael Literality and cognitive effort: Japanese and Spanish
Carlini, Roberto Generation of a Spanish Artificial Collocation Error Corpus
Carman, Mark Sarcasm Target Identification: Dataset and An Introductory Approach
Carrive, Jean Computer-assisted Speaker Diarization: How to Evaluate Human Corrections
Caruso, Christopher Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers
Caseli, Helena de Medeiros The Effects of Unimodal Representation Choices on Multimodal Learning
Caselli, Tommaso Systems’ Agreements and Disagreements in Temporal Processing: An Extensive Error Analysis of the TempEval-3 Task
The Circumstantial Event Ontology (CEO) and ECB+/CEO: an Ontology and Corpus for Implicit Causal Relations between Events
Cassidy, Steve Signbank: Software to Support Web Based Dictionaries of Sign Language
Castilho, Sheila Improving Machine Translation of Educational Content via Crowdsourcing
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Cavalcanti, Maria Cláudia RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Cerrato, Loredana The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Chagnaa, Altangerel Using Crowd Agreement for Wordnet Localization
Chamberlain, Jon Scalable Visualisation of Sentiment and Stance
Chang, Kai-Wei A Corpus to Learn Refer-to-as Relations for Nominals
A Corpus of Drug Usage Guidelines Annotated with Type of Advice
Chang, Baobao EventWiki: A Knowledge Base of Major Events
Chang, Liping Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis
Chang, Walter Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
A Repository of Corpora for Summarization
Charfi, Anis Arap-Tweet: A Large Multi-Dialect Twitter Corpus for Gender, Age and Language Variety Identification
Charlet, Delphine Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
FrNewsLink : a corpus linking TV Broadcast News Segments and Press Articles
Charnoz, Audrey Recognizing Behavioral Factors while Driving: A Real-World Multimodal Corpus to Monitor the Driver’s Affective State
Chathuranga, Janaka Annotating Opinions and Opinion Targets in Student Course Feedback
Chatterjee, Rajen ESCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing
Chaturvedi, Nikhil SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
Chatzikyriakidis, Stergios Shami: A Corpus of Levantine Arabic Dialects
Chaudiron, Stéphane Automatic Identification of Research Fields in Scientific Papers
Chen, Hsin-Hsi Learning to Map Natural Language Statements into Knowledge Base Representations for Knowledge Base Construction
Transfer of Frames from English FrameNet to Construct Chinese FrameNet: A Bilingual Corpus-Based Approach
Chen, Wenliang M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains
Chen, Chi-Yen Word Embedding Evaluation Datasets and Wikipedia Title Embedding for Chinese
Chen, Zhipeng Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Chen, Le The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Chen, Sheng-Yeh EmotionLines: An Emotion Corpus of Multi-Party Conversations
Chen, Jia-Jun Dynamic Oracle for Neural Machine Translation in Decoding Phase
Chen, Emily A Morphological Analyzer for St. Lawrence Island / Central Siberian Yupik
Chen, Lei CONDUCT: An Expressive Conducting Gesture Dataset for Sound Control
Chenthil Kumar, Vighnesh No more beating about the bush : A Step towards Idiom Handling for Indian Language NLP
Chernov, Alexandr Handling Big Data and Sensitive Data Using EUDAT's Generic Execution Framework and the WebLicht Workflow Engine.
Chernoskutov, Mikhail An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
Cheung, Jackie Chi Kit Constructing a Lexicon of Relational Nouns
Chiang, Chiung-Yu Fluid Annotation: A Granularity-aware Annotation Tool for Chinese Word Fluidity
Chiarcos, Christian Interoperability of Language-related Information: Mapping the BLL Thesaurus to Lexvo and Glottolog
Universal Morphologies for the Caucasus region
Towards a Linked Open Data Edition of Sumerian Corpora
The ACoLi CoNLL Libraries: Beyond Tab-Separated Values
Analyzing Middle High German Syntax with RDF and SPARQL
Chin, Peter Sound Signal Processing with Seq2Tree Network
Chiruzzo, Luis Spanish HPSG Treebank based on the AnCora Corpus
Cho, Hyunsouk Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Cho, Eunjoon Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
Choi, Gyu Hyeon Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
Choi, Ho-Jin Korean TimeBank Including Relative Temporal Information
Choi, Jinho D. Building Universal Dependency Treebanks in Korean
Choi, Seungtaek Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Chordia, Sushil Translating Web Search Queries into Natural Language Questions
Choudhury, Monojit An Integrated Representation of Linguistic and Social Functions of Code-Switching
Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach
Choukri, Khalid Data Management Plan (DMP) for Language Data under the New General Data Protection Regulation (GDPR)
Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Christodoulopoulos, Christos Simple Large-scale Relation Extraction from Unstructured Text
CogCompNLP: Your Swiss Army Knife for NLP
Chu, Xiaomin Building a Macro Chinese Discourse Treebank
Chua, Mason Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
Chun, Jayeol Building Universal Dependency Treebanks in Korean
Chung, Yiling Building Named Entity Recognition Taggers via Parallel Corpora
Chung, Youngjoo JESC: Japanese-English Subtitle Corpus
Cianflone, Andre Attention for Implicit Discourse Relation Recognition
Cieliebak, Mark SB-CH: A Swiss German Corpus with Sentiment Annotations
Cieri, Christopher From ‘Solved Problems’ to New Challenges: A Report on LDC Activities
Introducing NIEUW: Novel Incentives and Workflows for Eliciting Linguistic Data
Cignarella, Alessandra Teresa Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ
Clematide, Simon Strategies and Challenges for Crowdsourcing Regional Dialect Perception Data for Swiss German and Swiss French
Coccaro, Noah Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
Coenen, Frans A Dataset for Inter-Sentence Relation Extraction using Distant Supervision
Cohen, K. Bretonnel Three Dimensions of Reproducibility in Natural Language Processing
Cohn, Trevor Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Cominetti, Federica The ICoN Corpus of Academic Written Italian (L1 and L2)
Conger, Kathryn The New Propbank: Aligning Propbank with AMR through POS Unification
Conneau, Alexis SentEval: An Evaluation Toolkit for Universal Sentence Representations
Cook, Paul Towards Language Technology for Mi'kmaq
Cooper-Leavitt, Jamison A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Cotterell, Ryan UniMorph 2.0: Universal Morphology
Cox, Christopher A Computational Architecture for the Morphology of Upper Tanana
Coyne, Bob Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences
Crasborn, Onno Signbank: Software to Support Web Based Dictionaries of Sign Language
Creutz, Mathias Open Subtitles Paraphrase Corpus for Six Languages
Croijmans, Ilja Discovering the Language of Wine Reviews: A Text Mining Account
Cruz, Hilaria Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Cuadros, Montse Biomedical term normalization of EHRs with UMLS
Cuba Gyllensten, Amaru Distributional Term Set Expansion
Cuconato, Bruno Text Mining for History: first steps on building a large dataset
Cudré-Mauroux, Philippe Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Cui, Yiming Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Cui, Lei EventWiki: A Knowledge Base of Major Events
Cunha, Tiago A Multilingual Test Collection for the Semantic Search of Entity Categories
Curtis, Keith Development of an Annotated Multimodal Dataset for the Investigation of Classification and Summarisation of Presentations using High-Level Paralinguistic Features
Cvetanović, Miloš Fine-grained Semantic Textual Similarity for Serbian

 

D
Daelemans, Walter WordKit: a Python Package for Orthographic and Phonological Featurization
Dagan, Ido Automatic Thesaurus Construction for Modern Hebrew
Dai, Xin-Yu Dynamic Oracle for Neural Machine Translation in Decoding Phase
Daille, Béatrice Word Embedding Approach for Synonym Extraction of Multi-Word Terms
Towards a Diagnosis of Textual Difficulties for Children with Dyslexia
Dakhlia, Cyrille BabyCloud, a Technological Platform for Parents and Researchers
Dalmia, Siddharth Epitran: Precision G2P for Many Languages
Damnati, Géraldine Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
Semantic Frame Parsing for Information Extraction : the CALOR corpus
FrNewsLink : a corpus linking TV Broadcast News Segments and Press Articles
Dan, Cristea A Bird’s-eye View of Language Processing Projects at the Romanian Academy
Dandapat, Sandipan Translating Web Search Queries into Natural Language Questions
Darwish, Kareem Multi-Dialect Arabic POS Tagging: A CRF Approach
Part-of-Speech Tagging for Arabic Gulf Dialect Using Bi-LSTM
Darģis, Roberts The Use of Text Alignment in Semi-Automatic Error Analysis: Use Case in the Development of the Corpus of the Latvian Language Learners
Das, Debopam Developing the Bangla RST Discourse Treebank
Davis, Brian A Multilingual Test Collection for the Semantic Search of Entity Categories
The SSIX Corpora: Three Gold Standard Corpora for Sentiment Analysis in English, Spanish and German Financial Microblogs
SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages
Indra: A Word Embedding and Semantic Relatedness Server
De Hertog, Dirk Contextualized Usage-Based Material Selection
De Jong, Franciska CLARIN: Towards FAIR and Responsible Data Science Using Language Resources
De Kuthy, Kordula QUD-Based Annotation of Discourse Structure and Information Structure: Tool and Evaluation
De La Clergerie, Eric ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer
De Melo, Gerard FontLex: A Typographical Lexicon based on Affective Associations
Metaphor Suggestions based on a Semantic Metaphor Repository
De Montcheuil, Grégoire A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
De Silva, Pasindu Voice Builder: A Tool for Building Text-To-Speech Voices
De Smedt, Koenraad CLARIN: Towards FAIR and Responsible Data Science Using Language Resources
De Vos, Hugo A Multilingual Wikified Data Set of Educational Material
Declerck, Thierry Comparing Pretrained Multilingual Word Embeddings on an Ontology Alignment Task
An Integrated Formal Representation for Terminological and Lexical Data included in Classification Schemes
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Del Carmen, Patricia BabyCloud, a Technological Platform for Parents and Researchers
Del Gratta, Riccardo LREMap, a Song of Resources and Evaluation
Del Pozo, Arantza ES-Port: a Spontaneous Spoken Human-Human Technical Support Corpus for Dialogue Research in Spanish
Del Río Gayo, Iria A Lexicon of Discourse Markers for Portuguese – LDM-PT
Error annotation in a Learner Corpus of Portuguese
Delaborde, Agnes Matics Software Suite: New Tools for Evaluation and Data Exploration
Delais-Roussarie, Élisabeth SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis.
Delecraz, Sebastien Adding Syntactic Annotations to Flickr30k Entities Corpus for Multimodal Ambiguous Prepositional-Phrase Attachment Resolution
Delhay, Arnaud EMO&LY (EMOtion and AnomaLY) : A new corpus for anomaly detection in an audiovisual stream with emotional context.
Deligiannis, Miltos Managing Public Sector Data for Multilingual Applications Development
Dell'Orletta, Felice Universal Dependencies and Quantitative Typological Trends. A Case Study on Word Order
Delpech, Estelle A Real-life, French-accented Corpus of Air Traffic Control Communications
Delvaux, Veronique The MonPaGe_HA Database for the Documentation of Spoken French Throughout Adulthood
Deléger, Louise Combining rule-based and embedding-based approaches to normalize textual entities with an ontology
Demberg, Vera A vision-grounded dataset for predicting typical locations for verbs
Rollenwechsel-English: a large-scale semantic role corpus
Den, Yasuharu Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Dennison, Mark Unfolding the External Behavior and Inner Affective State of Teammates through Ensemble Learning: Experimental Evidence from a Dyadic Team Corpus
Deriu, Jan SB-CH: A Swiss German Corpus with Sentiment Annotations
Dermouche, Soumia From analysis to modeling of engagement as sequences of multimodal behaviors
Dernoncourt, Franck Transfer Learning for Named-Entity Recognition with Neural Networks
A Repository of Corpora for Summarization
Derungs, Curdin Towards faithfully visualizing global linguistic diversity
Desmet, Piet Contextualized Usage-Based Material Selection
Di Nunzio, Giorgio Maria TriMED: A Multilingual Terminological Database
Di Tommaso, Giorgia A Large Multilingual and Multi-domain Dataset for Recommender Systems
DiPersio, Denise From ‘Solved Problems’ to New Challenges: A Report on LDC Activities
Diab, Mona WASA: A Web Application for Sequence Annotation
Sentence and Clause Level Emotion Annotation, Detection, and Classification in a Multi-Genre Corpus
Dias, Rafael Building a Corpus for Personality-dependent Natural Language Understanding and Generation
Author Profiling from Facebook Corpora
Dias, Gihan Improving domain-specific SMT for low-resourced languages using data from different domains
Diaz de Ilarraza, Arantza Konbitzul: an MWE-specific database for Spanish-Basque
Annotating Abstract Meaning Representations for Spanish
Dietz, Feike Linguistic and Sociolinguistic Annotation of 17th Century Dutch Letters
Difallah, Djellel Eddine Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Dilsizian, Mark Linguistically-driven Framework for Computationally Efficient and Scalable Sign Recognition
Dima, Emanuel Handling Big Data and Sensitive Data Using EUDAT's Generic Execution Framework and the WebLicht Workflow Engine.
Dimitriadis, Alexis The AnnCor CHILDES Treebank
Dimitrova, Vanya Interoperability of Language-related Information: Mapping the BLL Thesaurus to Lexvo and Glottolog
Dinarelli, Marco ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
Djegdjiga, Amazouz The French-Algerian Code-Switching Triggered audio corpus (FACST)
Dmitriev, Ivan Recognizing Behavioral Factors while Driving: A Real-World Multimodal Corpus to Monitor the Driver’s Affective State
Do, Quoc Truong Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas
Do, Quang CogCompNLP: Your Swiss Army Knife for NLP
Dobnik, Simon Shami: A Corpus of Levantine Arabic Dialects
Dombek, Felix A Lexicon of Discourse Markers for Portuguese – LDM-PT
Dominguez, Monica Compilation of Corpora for the Study of the Information Structure–Prosody Interface
Donandt, Kathrin Universal Morphologies for the Caucasus region
Donato, Giulia Classifying the Informative Behaviour of Emoji in Microblogs
Donnelly, Kevin Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh
Dore, Giulia A Legal Perspective on Training Models for Natural Language Processing
Dou, Zi-Yi Dynamic Oracle for Neural Machine Translation in Decoding Phase
Doudagiri, Vivek Reddy Annotating If the Authors of a Tweet are Located at the Locations They Tweet About
Doukhan, David Computer-assisted Speaker Diarization: How to Evaluate Human Corrections
Dras, Mark A Fast and Accurate Vietnamese Word Segmenter
Dreessen, Katharina HiNTS: A Tagset for Middle Low German
Drenth, Eduard The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study
Droganova, Kira Parse Me if You Can: Artificial Treebanks for Parsing Experiments on Elliptical Constructions
Drouin, Patrick Lexical Profiling of Environmental Corpora
Duan, Manjuan Test Sets for Chinese Nonlocal Dependency Parsing
Dubinskaite, Ieva GenDR: A Generic Deep Realizer with Complex Lexicalization
Dubuisson Duplessis, Guillaume An Information-Providing Closed-Domain Human-Agent Interaction Corpus
Duc-Anh, Phan EMTC: Multilabel Corpus in Movie Domain for Emotion Analysis in Conversational Text
Dulceanu, Andrei PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Dupoux, Emmanuel BabyCloud, a Technological Platform for Parents and Researchers
Duthie, Rory Intertextual Correspondence for Integrating Corpora
Dürlich, Luise EFLLex: A Graded Lexical Resource for Learners of English as a Foreign Language

 

E
Ebling, Sarah SMILE Swiss German Sign Language Dataset
Eckart, Kerstin Moving TIGER beyond Sentence-Level
German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Eckart, Thomas Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Eckart de Castilho, Richard A Legal Perspective on Training Models for Natural Language Processing
Ediriweera, Shanika Annotating Opinions and Opinion Targets in Student Course Feedback
Edlund, Jens Bringing Order to Chaos: A Non-Sequential Approach for Browsing Large Sets of Found Audio Data
Egg, Markus A Large Automatically-Acquired All-Words List of Multiword Expressions Scored for Compositionality
Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Egorova, Kseniya An Integrated Formal Representation for Terminological and Lexical Data included in Classification Schemes
Ehara, Yo Building an English Vocabulary Knowledge Dataset of Japanese English-as-a-Second-Language Learners Using Crowdsourcing
Eibl, Maximilian CoLoSS: Cognitive Load Corpus with Speech and Performance Data from a Symbol-Digit Dual-Task
Ein Dor, Liat Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Eisner, Jason UniMorph 2.0: Universal Morphology
Ek, Adam Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Ekbal, Asif Sentence Level Temporality Detection using an Implicit Time-sensed Resource
TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection
A Deep Neural Network based Approach for Entity Extraction in Code-Mixed Indian Social Media Text
Medical Sentiment Analysis using Social Media: Towards building a Patient Assisted System
MMQA: A Multi-domain Multi-lingual Question-Answering Framework for English and Hindi
El Amel Boussaha, Basma A Multi-Domain Framework for Textual Similarity. A Case Study on Question-to-Question and Question-Answering Similarity Tasks
El-Haj, Mahmoud Arabic Dialect Identification in the Context of Bivalency and Code-Switching
Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger
Elahi, Mohammad Fazleh Bridging the LAPPS Grid and CLARIN
Elaraby, Mohamed You Tweet What You Speak: A City-Level Dataset of Arabic Dialects
Eldesouki, Mohamed Multi-Dialect Arabic POS Tagging: A CRF Approach
Elia, Francesco Huge Automatically Extracted Training-Sets for Multilingual Word Sense Disambiguation
Elkahky, Ali Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks
The Morpho-syntactic Annotation of Animacy for a Dependency Parser
Ellis, Joe Laying the Groundwork for Knowledge Base Population: Nine Years of Linguistic Resources for TAC KBP
Elmadany, AbdelRahim Improving Dialogue Act Classification for Spontaneous Arabic Speech and Instant Messages at Utterance Level
Elmahdy, Mohamed Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus
Elsahar, Hady T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Elsayed, Tamer DART: A Large Dataset of Dialectal Arabic Tweets
Elßmann, Benedikt LiDo RDF: From a Relational Database to a Linked Data Graph of Linguistic Terms and Bibliographic Data
Engelmann, Jonas BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Erdmann, Alexander The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
Erdogan, Kenan ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Erhart, Pascale Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Erjavec, Tomaž CLARIN’s Key Resource Families
Ernst, Michael D. NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System
Eryani, Fadhl The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
A Morphologically Annotated Corpus of Emirati Arabic
Eskander, Ramy Unified Guidelines and Resources for Arabic Dialect Orthography
SentiArabic: A Sentiment Analyzer for Standard Arabic
Esteve, Yannick FrNewsLink : a corpus linking TV Broadcast News Segments and Press Articles
Esteves, Diego LIdioms: A Multilingual Linked Idioms Data Set
Estève, Yannick Simulating ASR errors for training SLU systems
Evaluation of Feature-Space Speaker Adaptation for End-to-End Acoustic Models
Etchegoyhen, Thierry Evaluating Domain Adaptation for Machine Translation Across Scenarios
Even, Susan Signbank: Software to Support Web Based Dictionaries of Sign Language
Evensen, Sara HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Evert, Stefan Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods

 

F
Fabre, Cécile Extending the gold standard for a lexical substitution task: is it worth it?
Fadaee, Marzieh Examining the Tip of the Iceberg: A Data Set for Idiom Translation
Fairon, Cédrick SW4ALL: a CEFR Classified and Aligned Corpus for Language Learning
An SLA Corpus Annotated with Pedagogically Relevant Grammatical Structures
Falenska, Agnieszka Moving TIGER beyond Sentence-Level
German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Fallgren, Per FARMI: A FrAmework for Recording Multi-Modal Interactions
Bringing Order to Chaos: A Non-Sequential Approach for Browsing Large Sets of Found Audio Data
Fam, Rashel Tools for The Production of Analogical Grids and a Resource of N-gram Analogical Grids in 11 Languages
Fancellu, Federico Evaluating Machine Translation Performance on Chinese Idioms with a Blacklist Method
NegPar: A parallel corpus annotated for negation
Faraj, Reem Unified Guidelines and Resources for Arabic Dialect Orthography
Faralli, Stefano Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl
MIsA: Multilingual "IsA" Extraction from Corpora
Enriching Frame Representations with Distributionally Induced Senses
A Large Multilingual and Multi-domain Dataset for Recommender Systems
Improving Hypernymy Extraction with Distributional Semantic Classes
Farhath, Fathima Improving domain-specific SMT for low-resourced languages using data from different domains
Farinas, Jérôme Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Farkas, Richárd SzegedKoref: A Hungarian Coreference Corpus
E-magyar -- A Digital Language Processing System
Farrugia, Reuben A. Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Farrús, Mireia Compilation of Corpora for the Study of the Information Structure–Prosody Interface
Faruqui, Manaal UniMorph 2.0: Universal Morphology
Farvardin, Amin Automatic Identification of Research Fields in Scientific Papers
Favre, Benoit Adding Syntactic Annotations to Flickr30k Entities Corpus for Multimodal Ambiguous Prepositional-Phrase Attachment Resolution
Fayet, Cédric EMO&LY (EMOtion and AnomaLY) : A new corpus for anomaly detection in an audiovisual stream with emotional context.
Fedorenko, Evelina The Natural Stories Corpus
Fedorova, Olga A «Portrait» Approach to Multichannel Discourse
Fedotov, Dmitrii Contextual Dependencies in Time-Continuous Multidimensional Affect Recognition
Feldman, Anna Designing a Russian Idiom-Annotated Corpus
Feltracco, Anna Enriching a Lexicon of Discourse Connectives with Corpus-based Data
Feng, Song World Knowledge for Abstract Meaning Representation Parsing
Feng, Zhili CogCompNLP: Your Swiss Army Knife for NLP
Ferenczi, Zsanett Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
Fernández Gallardo, Laura The Nautilus Speaker Characterization Corpus: Speech Recordings and Labels of Speaker Characteristics and Voice Descriptions
Fernández Torné, Anna Evaluating Domain Adaptation for Machine Translation Across Scenarios
Ferreira, Thiago RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Ferro, Marcello Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Ferré, Arnaud Combining rule-based and embedding-based approaches to normalize textual entities with an ontology
Ferrés, Daniel PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Feußner, Hubertus Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room
Filhol, Michael Elicitation protocol and material for a corpus of long prepared monologues in Sign Language
Modeling French Sign Language: a proposal for a semantically compositional system
Fillwock, Sarah Identification of Personal Information Shared in Chat-Oriented Dialogue
Fiumara, James Introducing NIEUW: Novel Incentives and Workflows for Eliciting Linguistic Data
Fišer, Darja CLARIN: Towards FAIR and Responsible Data Science Using Language Resources
CLARIN’s Key Resource Families
Flavier, Sebastien BDPROTO: A Database of Phonological Inventories from Ancient and Reconstructed Languages
Fluhr, Christian Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
Fokkens, Antske Neural Models of Selectional Preferences for Implicit Semantic Role Labeling
Studying Muslim Stereotyping through Microportrait Extraction
Fonseca, Alexsandro Retrieving Information from the French Lexical Network in RDF/OWL Format
Forbes, Angus Text Annotation Graphs: Annotating Complex Natural Language Phenomena
Fort, Karën Toward a Lightweight Solution for Less-resourced Languages: Creating a POS Tagger for Alsatian Using Voluntary Crowdsourcing
Fougeron, Cécile The MonPaGe_HA Database for the Documentation of Spoken French Throughout Adulthood
Fox, Chris Improving Hate Speech Detection with Deep Learning Ensembles
Francesconi, Enrico PMKI: an European Commission action for the interoperability, maintainability and sustainability of Language Resources
Franco, Wellington A Multilingual Test Collection for the Semantic Search of Entity Categories
Franco-Salvador, Marc CATS: A Tool for Customized Alignment of Text Simplification Corpora
Francois, Thomas EFLLex: A Graded Lexical Resource for Learners of English as a Foreign Language
Francon, Daniel A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Francopoulo, Gil Measuring Innovation in Speech and Language Processing Publications.
Frank, Anette DeModify: A Dataset for Analyzing Contextual Constraints on Modifier Deletion
Frank, Andrew Building Literary Corpora for Computational Literary Analysis - A Prototype to Bridge the Gap between CL and DH
Fraser, Kathleen A Swedish Cookie-Theft Corpus
Fredouille, Corinne Dysarthric speech evaluation: automatic and perceptual approaches
Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Fredriksen, Valerij Utilizing Large Twitter Corpora to Create Sentiment Lexica
Freedman, Marjorie When ACE met KBP: End-to-End Evaluation of Knowledge Base Population with Component-level Annotation
Freitas, André Building a Knowledge Graph from Natural Language Definitions for Interpretable Text Entailment Recognition
A Multilingual Test Collection for the Semantic Search of Entity Categories
The SSIX Corpora: Three Gold Standard Corpora for Sentiment Analysis in English, Spanish and German Financial Microblogs
SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages
Indra: A Word Embedding and Semantic Relatedness Server
Freitas, Cláudia Text Mining for History: first steps on building a large dataset
Fritz, Devon DeModify: A Dataset for Analyzing Contextual Constraints on Modifier Deletion
Frontini, Francesca One Language to rule them all: modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL
Fucikova, Eva Creating a Verb Synonym Lexicon Based on a Parallel Corpus
Tools for Building an Interlinked Synonym Lexicon Network
Fujie, Shinya Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Fukunaga, Shun-ya Analysis of Implicit Conditions in Database Search Dialogues
Fukuoka, Tomotaka JAIST Annotated Corpus of Free Conversation
Funk, Christina Community-Driven Crowdsourcing: Data Collection with Local Developers
Futrell, Richard The Natural Stories Corpus
Färber, Michael A High-Quality Gold Standard for Citation-based Tasks
Fäth, Christian Interoperability of Language-related Information: Mapping the BLL Thesaurus to Lexvo and Glottolog
Universal Morphologies for the Caucasus region
Analyzing Middle High German Syntax with RDF and SPARQL
Fürbacher, Monica GeCoTagger: Annotation of German Verb Complements with Conditional Random Fields

 

G
Gabryszak, Aleksandra A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events
A Corpus Study and Annotation Schema for Named Entity Recognition and Relation Extraction of Business Products
Gaillard, Pascal Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Gaillat, Thomas The SSIX Corpora: Three Gold Standard Corpora for Sentiment Analysis in English, Spanish and German Financial Microblogs
Gainer, Alesia The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Galarreta-Piquette, Daniel GenDR: A Generic Deep Realizer with Complex Lexicalization
Galibert, Olivier Matics Software Suite: New Tools for Evaluation and Data Exploration
Galuscakova, Petra Low Resource Methods for Medieval Document Sections Analysis
Gambino, Omar Juárez Distribution of Emotional Reactions to News Articles in Twitter
Gambäck, Björn Utilizing Large Twitter Corpora to Create Sentiment Lexica
Ganbold, Amarsanaa Using Crowd Agreement for Wordnet Localization
Gandhi, Anshul A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation
Gangashetty, Suryakanth V. Phonetically Balanced Code-Mixed Speech Corpus for Hindi-English Automatic Speech Recognition
Gangula, Rama Rohit Reddy Resource Creation Towards Automated Sentiment Analysis in Telugu (a low resource language) and Integrating Multiple Domain Sources to Enhance Sentiment Prediction
Gantayat, Neelamadhav SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
Gao, Yuze Cross-lingual Terminology Extraction for Translation Quality Estimation
Gao, Yanjun PyrEval: An Automated Method for Summary Content Analysis
Garcia, Marcos A Lexical Tool for Academic Writing in Spanish based on Expert and Novice Corpora
García Salido, Marcos A Lexical Tool for Academic Writing in Spanish based on Expert and Novice Corpora
García-Mendoza, Consuelo-Varinia Distribution of Emotional Reactions to News Articles in Twitter
García-Sardiña, Laura ES-Port: a Spontaneous Spoken Human-Human Technical Support Corpus for Dialogue Research in Spanish
Garg, Rahul SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
Garmendia Arratibel, Lierni English-Basque Statistical and Neural Machine Translation
Gaspari, Federico Improving Machine Translation of Educational Content via Crowdsourcing
Gatt, Albert Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Gatti, Lorenzo An Information-Providing Closed-Domain Human-Agent Interaction Corpus
Gauthier, Elodie Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Gawlik, Ireneusz An Application for Building a Polish Telephone Speech Corpus
Ge, Tao EventWiki: A Knowledge Base of Major Events
Geiger, Melanie Overcoming the Long Tail Problem: A Case Study on CO2-Footprint Estimation of Recipes using Information Retrieval
Georgakopoulou, Panayota Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Georgi, Ryan PDF-to-Text Reanalysis for Linguistic Data Mining
Georgila, Kallirroi Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
Geraci, Carlo Sign Languages and the Online World Online Dictionaries & Lexicostatistics
Gerstenlauer, Nadine Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room
Gervits, Felix Dialogue Structure Annotation for Multi-Floor Interaction
Towards a Conversation-Analytic Taxonomy of Speech Overlap
Gete, Harritxu Using Discourse Information for Education with a Spanish-Chinese Parallel Corpus
Getman, Jeremy Laying the Groundwork for Knowledge Base Population: Nine Years of Linguistic Resources for TAC KBP
Geyken, Alexander A database of German definitory contexts from selected web sources
Gezmu, Andargachew Mekonnen Portable Spelling Corrector for a Less-Resourced Language: Amharic
Ghaddar, Abbas Transforming Wikipedia into a Large-Scale Fine-Grained Entity Type Corpus
Ghannay, Sahar Simulating ASR errors for training SLU systems
Ghassemi, Mohammad A Repository of Corpora for Summarization
Gheith, Mervat Improving Dialogue Act Classification for Spontaneous Arabic Speech and Instant Messages at Utterance Level
Ghio, Alain Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Ghobadi, Mina Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Ghosal, Tirthankar TAP-DLND 1.0: A Corpus for Document Level Novelty Detection
Giagkou, Maria Managing Public Sector Data for Multilingual Applications Development
Gibet, Sylvie CONDUCT: An Expressive Conducting Gesture Dataset for Sound Control
Gibson, Edward The Natural Stories Corpus
Gilmartin, Emer The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Chats and Chunks: Annotation and Analysis of Multiparty Long Casual Conversations
Ginter, Filip Parse Me if You Can: Artificial Treebanks for Parsing Experiments on Elliptical Constructions
Gleim, Rüdiger WikiDragon: A Java Framework For Diachronic Content And Network Analysis Of MediaWikis
Glikman, Julie Crowdsourcing Regional Variation Data and Automatic Geolocalisation of Speakers of European French
Globo, Achille Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources
Godard, Pierre A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Godea, Andreea Annotating Educational Questions for Student Response Analysis
Goel, Pranav Sarcasm Target Identification: Dataset and An Introductory Approach
Goeuriot, Lorraine Building Evaluation Datasets for Cultural Microblog Retrieval
Goggi, Sara LREMap, a Song of Resources and Evaluation
The LREC Workshops Map
Gokirmak, Memduh Finite-state morphological analysis for Gagauz
Goldhahn, Dirk Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Goldman, Jean-Philippe Crowdsourcing Regional Variation Data and Automatic Geolocalisation of Speakers of European French
Strategies and Challenges for Crowdsourcing Regional Dialect Perception Data for Swiss German and Swiss French
MIAPARLE: Online training for the discrimination of stress contrasts
Golshan, Behzad HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Gomez, Héctor Corpus Building and Evaluation of Aspect-based Opinion Summaries from Tweets in Spanish
Gonzalez-Dios, Itziar Cross-checking WordNet and SUMO Using Meronymy
Gonçalves, Teresa A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language
Goodman, Michael Wayne PDF-to-Text Reanalysis for Linguistic Data Mining
Gorman, Kyle Improving homograph disambiguation with supervised machine learning
Goss, Foster Three Dimensions of Reproducibility in Natural Language Processing
Goutte, Cyril EuroGames16: Evaluating Change Detection in Online Conversation
Goyal, Pawan Network Features Based Co-hyponymy Detection
Building a Word Segmenter for Sanskrit Overnight
Granet, Adeline Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Gratch, Jonathan The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Grave, Edouard Learning Word Vectors for 157 Languages
Advances in Pre-Training Distributed Word Representations
Gravier, Christophe T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Green, Nathan The First 100 Days: A Corpus Of Political Agendas on Twitter
Gref, Michael Improved Transcription and Indexing of Oral History Interviews for Digital Humanities Research
Gregori, Lorenzo One event, many representations. Mapping action concepts through visual features.
Griffitt, Kira Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI
Grigonyte, Gintare Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Grinberg, Yuri EuroGames16: Evaluating Change Detection in Online Conversation
Grobol, Loïc ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
Gromann, Dagmar Comparing Pretrained Multilingual Word Embeddings on an Ontology Alignment Task
Gross, Stephanie Action Verb Corpus
Grouin, Cyril Three Dimensions of Reproducibility in Natural Language Processing
Grover, Claire Up-cycling Data for Natural Language Generation
Grubenmann, Ralf SB-CH: A Swiss German Corpus with Sentiment Annotations
Gruzitis, Normunds Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Gudnason, Jon Risamálheild: A Very Large Icelandic Text Corpus
Gung, James The New Propbank: Aligning Propbank with AMR through POS Unification
Guntakandla, Nishitha Annotating Reflections for Health Behavior Change Therapy
Gupta, Deepak A Deep Neural Network based Approach for Entity Extraction in Code-Mixed Indian Social Media Text
MMQA: A Multi-domain Multi-lingual Question-Answering Framework for English and Hindi
Gupta, Prateek Building a Word Segmenter for Sanskrit Overnight
Gupta, Shashank CogCompNLP: Your Swiss Army Knife for NLP
Gupta, Prakhar Learning Word Vectors for 157 Languages
Gupta, Manish A Workbench for Rapid Generation of Cross-Lingual Summaries
Gurevych, Iryna Adapting Serious Game for Fallacious Argumentation to German: Pitfalls, Insights, and Best Practices
A Legal Perspective on Training Models for Natural Language Processing
Gustafson, Joakim A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
Crowdsourced Multimodal Corpora Collection Tool
Gustafson Capková, Sofia Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Gutkin, Alexander FonBund: A Library for Combining Cross-lingual Phonological Segment Data
Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Guðnason, Jón Open ASR for Icelandic: Resources and a Baseline System
Gärtner, Markus Preserving Workflow Reproducibility: The RePlay-DH Client as a Tool for Process Documentation
German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
A Lightweight Modeling Middleware for Corpus Processing
Gómez Guinovart, Xavier Developing New Linguistic Resources and Tools for the Galician Language

 

H
Hsieh, Shu-Kai Fluid Annotation: A Granularity-aware Annotation Tool for Chinese Word Fluidity
Ha, Thanh-Le KIT-Multi: A Translation-Oriented Multilingual Embedding Corpus
Ha, Linne Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Community-Driven Crowdsourcing: Data Collection with Local Developers
Haaf, Susanne Lightweight Grammatical Annotation in the TEI: New Perspectives
Haagsma, Hessel Evaluating Scoped Meaning Representations
Habash, Nizar The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
A Parallel Corpus of Arabic-Japanese News Articles
A Morphologically Annotated Corpus of Emirati Arabic
A Leveled Reading Corpus of Modern Standard Arabic
CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
Habernal, Ivan Adapting Serious Game for Fallacious Argumentation to German: Pitfalls, Insights, and Best Practices
Hachicha, Marouane Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Hadfield, Simon SMILE Swiss German Sign Language Dataset
Hadiwinoto, Christian Upping the Ante: Towards a Better Benchmark for Chinese-to-English Machine Translation
Hadjadj, Mohamed Nassime Modeling French Sign Language: a proposal for a semantically compositional system
Hadjadj, Mohamed Nassime Elicitation protocol and material for a corpus of long prepared monologues in Sign Language
Hagen, Kristin The LIA Treebank of Spoken Norwegian Dialects
Hahm, Younggyun Semi-automatic Korean FrameNet Annotation over KAIST Treebank
Automatic Wordnet Mapping: from CoreNet to Princeton WordNet
Unsupervised Korean Word Sense Disambiguation using CoreNet
Hahn, Uli Preserving Workflow Reproducibility: The RePlay-DH Client as a Tool for Process Documentation
Hahn, Udo Representation Mapping: A Novel Approach to Generate High-Quality Multi-Lingual Emotion Lexicons
Sharing Copies of Synthetic Clinical Corpora without Physical Distribution — A Case Study to Get Around IPRs and Privacy Constraints Featuring the German JSYNCC Corpus
Hahn-Powell, Gus Text Annotation Graphs: Annotating Complex Natural Language Phenomena
Haider, Fasih The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Haider, Samar Urdu Word Embeddings
Hajic, Jan Creating a Verb Synonym Lexicon Based on a Parallel Corpus
Diacritics Restoration Using Neural Networks
SumeCzech: Large Czech News-Based Summarization Dataset
Tools for Building an Interlinked Synonym Lexicon Network
Bridging the LAPPS Grid and CLARIN
Hajicova, Eva Discourse Coherence Through the Lens of an Annotated Text Corpus: A Case Study
Creating a Verb Synonym Lexicon Based on a Parallel Corpus
Tools for Building an Interlinked Synonym Lexicon Network
Hajlaoui, Najeh PMKI: an European Commission action for the interoperability, maintainability and sustainability of Language Resources
Hajnicz, Elżbieta A New Version of the Składnica Treebank of Polish Harmonised with the Walenty Valency Dictionary
Halevy, Alon HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Halfon, Alon Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Halpern, Jack Very Large-Scale Lexical Resources to Enhance Chinese and Japanese Machine Translation
Hamed, Injy Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus
Hamlaoui, Fatima BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Hamza, Anissa Classifying Sluice Occurrences in Dialogue
Han, Ting A Corpus of Natural Multimodal Spatial Scene Descriptions
Han, Kijong Unsupervised Korean Word Sense Disambiguation using CoreNet
Han, Xu The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Han, Na-Rae Building Universal Dependency Treebanks in Korean
Parser combinators for Tigrinya and Oromo morphology
Hanbury, Allan Medical Entity Corpus with PICO elements and Sentiment Analysis
Handschuh, Siegfried Building a Knowledge Graph from Natural Language Definitions for Interpretable Text Entailment Recognition
A Multilingual Test Collection for the Semantic Search of Entity Categories
SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages
Indra: A Word Embedding and Semantic Relatedness Server
Hanselowski, Andreas Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
Hao, Tang Voice Builder: A Tool for Building Text-To-Speech Voices
Hao, Zehui ScholarGraph: a Chinese Knowledge Graph of Chinese Scholars
Hardmeier, Christian ParCorFull: a Parallel Corpus Annotated with Full Coreference
Hardt, Daniel Classifying Sluice Occurrences in Dialogue
Hare, Jonathon T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Hargraves, Orin Three Dimensions of Reproducibility in Natural Language Processing
Harrison, Andre Unfolding the External Behavior and Inner Affective State of Teammates through Ensemble Learning: Experimental Evidence from a Dyadic Team Corpus
Harrison, Vrindavan Exploring Conversational Language Generation for Rich Content about Hotels
Hartmann, Silvana An Integrated Representation of Linguistic and Social Functions of Code-Switching
Hartmann, Mareike A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier
Hasantha, Ravindu Annotating Opinions and Opinion Targets in Student Course Feedback
Hasegawa, Mika Social Image Tags as a Source of Word Embeddings: A Task-oriented Evaluation
Hassan, Sara Unified Guidelines and Resources for Arabic Dialect Orthography
Hathout, Nabil Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Haug, Tobias SMILE Swiss German Sign Language Dataset
Hausendorf, Heiko Building a Corpus from Handwritten Picture Postcards: Transcription, Annotation and Part-of-Speech Tagging
Hautli-Janisz, Annette A Multilingual Approach to Question Classification
Hayakawa, Akira Speech Rate Calculations with Short Utterances: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task
Hayashi, Yoshihiko Social Image Tags as a Source of Word Embeddings: A Task-oriented Evaluation
Hayes, Cory Dialogue Structure Annotation for Multi-Floor Interaction
Hazan, Rafal Annotated Corpus of Scientific Conference's Homepages for Information Extraction
Hazem, Amir Word Embedding Approach for Synonym Extraction of Multi-Word Terms
A Multi-Domain Framework for Textual Similarity. A Case Study on Question-to-Question and Question-Answering Similarity Tasks
PyRATA, Python Rule-based feAture sTructure Analysis
He, Yulan Content-Based Conflict of Interest Detection on Wikipedia
Neural Caption Generation for News Images
He, Junqing Discriminating between Similar Languages on Imbalanced Conversational Texts
Hedaya, Samy The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic
Hedeland, Hanna Introducing the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation
Heeringa, Wilbert The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study
Heffernan, Kevin Creating dialect sub-corpora by clustering: a case in Japanese for an adaptive method
Hegedűs, Klára SzegedKoref: A Hungarian Coreference Corpus
Hegele, Stefanie Language Technology for Multilingual Europe: An Analysis of a Large-Scale Survey regarding Challenges, Demands, Gaps and Needs
Heinecke, Johannes Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
Heinzerling, Benjamin BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
Helfrich, Philipp TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations
Helgadóttir, Inga Rún Open ASR for Icelandic: Resources and a Baseline System
Helgadóttir, Sigrún Risamálheild: A Very Large Icelandic Text Corpus
Hellwig, Oliver Multi-layer Annotation of the Rigveda
AET: Web-based Adjective Exploration Tool for German
Hemati, Wahed FastSense: An Efficient Word Sense Disambiguation Classifier
Hemmingsson, Nils The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions
Hendrickx, Iris Discovering the Language of Wine Reviews: A Text Mining Account
A Multilingual Wikified Data Set of Educational Material
A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language
Henlein, Alexander FastSense: An Efficient Word Sense Disambiguation Classifier
Hennig, Leonhard A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events
A Corpus Study and Annotation Schema for Named Entity Recognition and Relation Extraction of Business Products
Henrot, Geneviève TriMED: A Multilingual Terminological Database
Henry, Cassidy Dialogue Structure Annotation for Multi-Floor Interaction
Herath, Achini Handling Rare Word Problem using Synthetic Training Data for Sinhala and Tamil Neural Machine Translation
Hermann, Sibylle Preserving Workflow Reproducibility: The RePlay-DH Client as a Tool for Process Documentation
Hermjakob, Ulf Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Herms, Robert CoLoSS: Cognitive Load Corpus with Speech and Performance Data from a Symbol-Digit Dual-Task
Hernandez, Nicolas A Multi-Domain Framework for Textual Similarity. A Case Study on Question-to-Question and Question-Answering Similarity Tasks
PyRATA, Python Rule-based feAture sTructure Analysis
Herrmannova, Drahomira Analyzing Citation-Distance Networks for Evaluating Publication Impact
Hervy, Benjamin Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Hettrich, Heinrich Multi-layer Annotation of the Rigveda
Heyer, Gerhard Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features
ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Heylen, Dirk An Information-Providing Closed-Domain Human-Agent Interaction Corpus
Hideaki, Takeda A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard
Higashinaka, Ryuichiro Predicting Nods by using Dialogue Acts in Dialogue
Creating Large-Scale Argumentation Structures for Dialogue Systems
Higuchi, Suemi Text Mining for History: first steps on building a large dataset
Hiippala, Tuomo Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory
Hill, Susan Dialogue Structure Annotation for Multi-Floor Interaction
Hinrichs, Erhard Bridging the LAPPS Grid and CLARIN
Hinrichs, Marie Handling Big Data and Sensitive Data Using EUDAT's Generic Execution Framework and the WebLicht Workflow Engine
Bridging the LAPPS Grid and CLARIN
Hirschberg, Julia Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences
Collecting Code-Switched Data from Social Media
Hirschmanner, Matthias Action Verb Corpus
Hisamoto, Sorami Sudachi: a Japanese Tokenizer for Business
Hitschler, Julian Correction of OCR Word Segmentation Errors in Articles from the ACL Collection through Neural Machine Translation Methods
Hladka, Barbora Czech Legal Text Treebank 2.0
Hoeber, Orland Scalable Visualisation of Sentiment and Stance
Hoenen, Armin Multi Modal Distance - An Approach to Stemma Generation With Weighting
From Manuscripts to Archetypes through Iterative Clustering
Knowing the Author by the Company His Words Keep
Honnet, Pierre-Edouard Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Horbach, Andrea Semi-Supervised Clustering for Short Answer Scoring
ESCRITO - An NLP-Enhanced Educational Scoring Toolkit
Horsmann, Tobias DeepTC – An Extension of DKPro Text Classification for Fostering Reproducibility of Deep Learning Experiments
Hoste, Veronique A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents
Hruz, Marek Towards Processing of the Oral History Interviews and Related Printed Documents
Hsieh, Fernando Author Profiling from Facebook Corpora
Hsu, Chao-Chun EmotionLines: An Emotion Corpus of Multi-Party Conversations
Hu, Junfeng Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense
Hu, Guoping Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Huang, Chu-Ren Annotating Chinese Light Verb Constructions according to PARSEME guidelines
Huang, Hen-Hsen Learning to Map Natural Language Statements into Knowledge Base Representations for Knowledge Base Construction
Transfer of Frames from English FrameNet to Construct Chinese FrameNet: A Bilingual Corpus-Based Approach
Huang, Xian Discriminating between Similar Languages on Imbalanced Conversational Texts
Huang, Shilei A Pragmatic Approach for Classical Chinese Word Segmentation
Huang, Ting-Hao EmotionLines: An Emotion Corpus of Multi-Party Conversations
Huang, Shu-Jian Dynamic Oracle for Neural Machine Translation in Decoding Phase
Huangfu, Luwen Bootstrapping Polar-Opposite Emotion Dimensions from Online Reviews
Huber, Patrick Automated Evaluation of Out-of-Context Errors
Huck, Dominique Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Hudeček, Vojtěch SumeCzech: Large Czech News-Based Summarization Dataset
Huenerfauth, Matt A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts
Huet, Stéphane A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task
Hulden, Mans UniMorph 2.0: Universal Morphology
A Computational Architecture for the Morphology of Upper Tanana
Hulsbosch, Micha Signbank: Software to Support Web Based Dictionaries of Sign Language
Hunter, Lawrence E. Three Dimensions of Reproducibility in Natural Language Processing
Hwang, Jena D. Building Universal Dependency Treebanks in Korean
Hwang, Seung-won Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Semi-supervised Training Data Generation for Multilingual Question Answering
Hämäläinen, Mika Combining Concepts and Their Translations from Structured Dictionaries of Uralic Minority Languages

 

I
Ivanovic, Christine Building Literary Corpora for Computational Literary Analysis - A Prototype to Bridge the Gap between CL and DH
Ide, Nancy Three Dimensions of Reproducibility in Natural Language Processing
Mining Biomedical Publications With The LAPPS Grid
Bridging the LAPPS Grid and CLARIN
Idiart, Marco The brWaC Corpus: A New Open Resource for Brazilian Portuguese
Ihden, Sarah HiNTS: A Tagset for Middle Low German
Ihme, Klas Recognizing Behavioral Factors while Driving: A Real-World Multimodal Corpus to Monitor the Driver’s Affective State
Iida, Ryu Annotating Zero Anaphora for Question Answering
Ikeda, Noriko Chemical Compounds Knowledge Visualization with Natural Language Processing and Linked Data
Ilievski, Filip Don't Annotate, but Validate: a Data-to-Text Method for Capturing Event Data
Imamura, Kenji Multilingual Parallel Corpus for Global Communication Plan
Inago, Akari Creating Large-Scale Argumentation Structures for Dialogue Systems
Indig, Balázs E-magyar -- A Digital Language Processing System
What's Wrong, Python? -- A Visual Differ and Graph Library for NLP in Python
Inel, Oana Resource Interoperability for Sustainable Benchmarking: The Case of Events
Inoue, Go A Parallel Corpus of Arabic-Japanese News Articles
Ion, Radu Ensemble Romanian Dependency Parsing with Neural Networks
Ionov, Maxim Universal Morphologies for the Caucasus region
Ircing, Pavel Design and Development of Speech Corpora for Air Traffic Control Training
Towards Processing of the Oral History Interviews and Related Printed Documents
Irimia, Elena Ensemble Romanian Dependency Parsing with Neural Networks
The Reference Corpus of the Contemporary Romanian Language (CoRoLa)
Isahara, Hitoshi Building a List of Synonymous Words and Phrases of Japanese Compound Verbs
Isard, Amy Up-cycling Data for Natural Language Generation
Iseki, Yuriko Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Ishida, Toru Designing a Collaborative Process to Create Bilingual Dictionaries of Indonesian Ethnic Languages
A Framework for Multi-Language Service Design with the Language Grid
Ishiguro, Hiroshi Creating Large-Scale Argumentation Structures for Dialogue Systems
Ishii, Ryo Predicting Nods by using Dialogue Acts in Dialogue
Ishikawa, Yoko Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing
Ishimoto, Yuichi Extending Search System based on Interactive Visualization for Speech Corpora
Itahashi, Shuichi Extending Search System based on Interactive Visualization for Speech Corpora
Ito, Fernando T. The Effects of Unimodal Representation Choices on Multimodal Learning
Ito, Kaoru J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Ivanko, Denis Contextual Dependencies in Time-Continuous Multidimensional Affect Recognition
Iwakura, Tomoya Chemical Compounds Knowledge Visualization with Natural Language Processing and Linked Data
Iwao, Tomohide J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Iñurrieta, Uxoa Konbitzul: an MWE-specific database for Spanish-Basque

 

J
Jacovi, Michal A Recorded Debating Dataset
Jacquemin, Bernard Automatic Identification of Research Fields in Scientific Papers
Jadczyk, Tomasz An Application for Building a Polish Telephone Speech Corpus
Jaech, Aaron Collecting Code-Switched Data from Social Media
Jahren, Brage Utilizing Large Twitter Corpora to Create Sentiment Lexica
Jana, Abhik Network Features Based Co-hyponymy Detection
Janalizadeh Choobbasti, Ali MirasVoice: A bilingual (English-Persian) speech corpus
MirasText: An Automatically Generated Text Corpus for Persian
Jannidis, Fotis Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Jansche, Martin FonBund: A Library for Combining Cross-lingual Phonological Segment Data
Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Jansen, Peter WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-hop Inference
Janz, Arkadiusz Classifier-based Polarity Propagation in a WordNet
Jaouani, Mohamed-Amine BabyCloud, a Technological Platform for Parents and Researchers
Jatowt, Adam A High-Quality Gold Standard for Citation-based Tasks
Jauregi Unanue, Inigo English-Basque Statistical and Neural Machine Translation
Javed, Talha Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages
Jayasena, Sanath Improving domain-specific SMT for low-resourced languages using data from different domains
Jeong, Young-Seob Korean TimeBank Including Relative Temporal Information
Jezek, Elisabetta Enriching a Lexicon of Discourse Connectives with Corpus-based Data
Jhaveri, Nisarg A Workbench for Rapid Generation of Cross-Lingual Summaries
Ji, Heng Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Jia, Libin Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
Jiang, Feng Building a Macro Chinese Discourse Treebank
Jiang, Menghan Annotating Chinese Light Verb Constructions according to PARSEME guidelines
Jiang, Tonghai A Neural Network Based Model for Loanword Identification in Uyghur
Jiang, Tingsong Revisiting Distant Supervision for Relation Extraction
Jimeno Yepes, Antonio Parallel Corpora for the Biomedical Domain
Jimerson, Robert ASR for Documenting Acutely Under-Resourced Indigenous Languages
Jin, Zhi Towards Neural Speaker Modeling in Multi-Party Conversation: The Task, Dataset, and Models
Jochim, Charles SLIDE - a Sentiment Lexicon of Common Idioms
Johannessen, Janne Bondi The LIA Treebank of Spoken Norwegian Dialects
Johnson, Trevor Signbank: Software to Support Web Based Dictionaries of Sign Language
Johnson, Mark A Fast and Accurate Vietnamese Word Segmenter
Johnson, Emmanuel The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Jokinen, Kristiina Researching Less-Resourced Languages – the DigiSami Corpus
Jonell, Patrik A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
FARMI: A FrAmework for Recording Multi-Modal Interactions
Crowdsourced Multimodal Corpora Collection Tool
Jones, Gareth Development of an Annotated Multimodal Dataset for the Investigation of Classification and Summarisation of Presentations using High-Level Paralinguistic Features
Joshi, Aditya Sarcasm Target Identification: Dataset and An Introductory Approach
Joulin, Armand Learning Word Vectors for 157 Languages
Advances in Pre-Training Distributed Word Representations
Jurafsky, Dan JESC: Japanese-English Subtitle Corpus
RtGender: A Corpus for Studying Differential Responses to Gender
Jurgens, David RtGender: A Corpus for Studying Differential Responses to Gender
Jørgensen, Fredrik NoReC: The Norwegian Review Corpus

 

K
Kabashi, Besim Albanian Part-of-Speech Tagging: Gold Standard and Evaluation
Kafle, Sushant A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts
Kahmann, Christian ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Kahn, Juliette Matics Software Suite: New Tools for Evaluation and Data Exploration
Kaiser, Georg A. A Multilingual Approach to Question Classification
Kaiser, Katharina A Multilingual Approach to Question Classification
Kajiyama, Tomoko Extending Search System based on Interactive Visualization for Speech Corpora
Kallmeyer, Laura Multi-Dialect Arabic POS Tagging: A CRF Approach
Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks
Kalniņš, Rihards Collecting Language Resources from Public Administrations in the Nordic and Baltic Countries
Tilde MT Platform for Developing Client Specific MT Solutions
Kalouli, Aikaterini-Lida A Multilingual Approach to Question Classification
Kameko, Hirotaka Annotating Modality Expressions and Event Factuality for a Japanese Chess Commentary Corpus
Kamila, Sabyasachi Sentence Level Temporality Detection using an Implicit Time-sensed Resource
Kamocki, Pawel Data Management Plan (DMP) for Language Data under the New General Data Protection Regulation (GDPR)
New directions in ELRA activities
The German Reference Corpus DeReKo: New Developments – New Opportunities
Kanayama, Hiroshi Universal Dependencies Version 2 for Japanese
Kanerva, Jenna Parse Me if You Can: Artificial Treebanks for Parsing Experiments on Elliptical Constructions
Kang, Juyeon Data Anonymization for Requirements Quality Analysis: a Reproducible Automatic Error Detection Task
Kanojia, Diptesh Indian Language Wordnets and their Linkages with Princeton WordNet
Kantor, Yoav Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Kanzaki, Kyoko Building a List of Synonymous Words and Phrases of Japanese Compound Verbs
Kar, Sudipta MPST: A Corpus of Movie Plot Synopses with Tags
Karanfil, Güllü Finite-state morphological analysis for Gagauz
Karima, ABIDI An Automatic Learning of an Algerian Dialect Lexicon by using Multilingual Word Embeddings
Karimi, Akbar Extracting an English-Persian Parallel Corpus from Comparable Corpora
Kashino, Wakako Annotation and Quantitative Analysis of Speaker Information in Novel Conversation Sentences in Japanese
Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Katerenchuk, Denys Interpersonal Relationship Labels for the CALLHOME Corpus
Katinskaia, Anisia Revita: a Language-learning Platform at the Intersection of ITS and CALL
Kato, Tsuneaki Undersampling Improves Hypernymy Prototypicality Learning
Kato, Akihiko Construction of Large-scale English Verbal Multiword Expression Annotated Corpus
Katsuta, Akihiro Crowdsourced Corpus of Sentence Simplification with Core Vocabulary
Kawabata, Yoshiko Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Kawahara, Daisuke JFCKB: Japanese Feature Change Knowledge Base
Comprehensive Annotation of Various Types of Temporal Information on the Time Axis
JDCFC: A Japanese Dialogue Corpus with Feature Changes
Improving Crowdsourcing-Based Annotation of Japanese Discourse Relations
Kawahara, Noriko Sudachi: a Japanese Tokenizer for Business
Kelleher, John D. Is it worth it? Budget-related evaluation metrics for model selection
Kergosien, Eric Automatic Identification of Research Fields in Scientific Papers
Kermanidis, Katia Lida Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Khac Linh, Pham A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard
Khait, Ilya Towards a Linked Open Data Edition of Sumerian Corpora
Khalifa, Salam The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
A Morphologically Annotated Corpus of Emirati Arabic
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
Khan, Arif A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks
Khan, Fahad One Language to rule them all: modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL
Khandelwal, Ankush Humor Detection in English-Hindi Code-Mixed Social Media Content: Corpus and Baseline System
Khashabi, Daniel CogCompNLP: Your Swiss Army Knife for NLP
Khodak, Mikhail A Large Self-Annotated Corpus for Sarcasm
Khooshabeh, Peter Unfolding the External Behavior and Inner Affective State of Teammates through Ensemble Learning: Experimental Evidence from a Dyadic Team Corpus
Khorasani, Elahe QUEST: A Natural Language Interface to Relational Databases
Kibrik, Andrej A «Portrait» Approach to Multichannel Discourse
Kiela, Douwe SentEval: An Evaluation Toolkit for Universal Sentence Representations
Kieraś, Witold Manually Annotated Corpus of Polish Texts Published between 1830 and 1918
Kim, Jiseong Semi-automatic Korean FrameNet Annotation over KAIST Treebank
Automatic Wordnet Mapping: from CoreNet to Princeton WordNet
Unsupervised Korean Word Sense Disambiguation using CoreNet
Kim, Seokhwan PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Kim, Young Kil Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
Kim, Eun-kyung Incorporating Global Contexts into Sentence Embedding for Relational Extraction at the Paragraph Level with Distant Supervision
Kim, Inyoung CBFC: a parallel L2 speech corpus for Korean and French learners
Kim, Doo Soon Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Kim, Jin-Dong Mining Biomedical Publications With The LAPPS Grid
Kim Amplayo, Reinald Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Kimmelman, Vadim IPSL: A Database of Iconicity Patterns in Sign Languages. Creation and Use
Kiritchenko, Svetlana Quantifying Qualitative Data for Understanding Controversial Issues
Understanding Emotions: A Dataset of Tweets to Study Interactions between Affect Categories
WikiArt Emotions: An Annotated Dataset of Emotions Evoked by Art
Kirov, Christo UniMorph 2.0: Universal Morphology
Kishimoto, Yudai Improving Crowdsourcing-Based Annotation of Japanese Discourse Relations
Kisler, Thomas MOCCA: Measure of Confidence for Corpus Analysis - Automatic Reliability Check of Transcript and Automatic Segmentation
Kita, Kenji Visualization of the occurrence trend of infectious diseases using Twitter
Kitamura, Masanori Development of a Mobile Observation Support System for Students: FishWatchr Mini
Kjartansson, Oddur Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Klakow, Dietrich The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Klang, Marcus Linking, Searching, and Visualizing Entities in Wikipedia
Klaussner, Carmen A Diachronic Corpus for Literary Style Analysis
Klezovich, Anna IPSL: A Database of Iconicity Patterns in Sign Languages. Creation and Use
Klimek, Bettina Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
LiDo RDF: From a Relational Database to a Linked Data Graph of Linguistic Terms and Bibliographic Data
Klimešová, Petra Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Klubička, Filip Is it worth it? Budget-related evaluation metrics for model selection
Klyueva, Natalia Annotating Chinese Light Verb Constructions according to PARSEME guidelines
Improving a Neural-based Tagger for Multiword Expressions Identification
Knese, Edwin LiDo RDF: From a Relational Database to a Linked Data Graph of Linguistic Terms and Bibliographic Data
Knight, Jo Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger
Knight, Dawn Towards a Welsh Semantic Annotation System
Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh
Knight, Kevin Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Knoth, Petr Analyzing Citation-Distance Networks for Evaluating Publication Impact
Kobayashi, Tetsunori Social Image Tags as a Source of Word Embeddings: A Task-oriented Evaluation
Kobayashi, Tessei Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database
Analyzing Vocabulary Commonality Index Using Large-scaled Database of Child Language Development
Kocabiyikoglu, Ali Can Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation
Kocmi, Tom SumeCzech: Large Czech News-Based Summarization Dataset
Kocoń, Jan Classifier-based Polarity Propagation in a WordNet
Koiso, Hanae Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Kokkinakis, Dimitrios A Swedish Cookie-Theft Corpus
Komachi, Mamoru Construction of a Japanese Word Similarity Dataset
Komatani, Kazunori Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Komen, Erwin A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Metadata Collection Records for Language Resources
Signbank: Software to Support Web Based Dictionaries of Sign Language
Komiya, Kanako All-words Word Sense Disambiguation Using Concept Embeddings
Komrsková, Zuzana Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Kondo, Makoto Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags
Konle, Leonard Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Kontogiorgos, Dimosthenis A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
FARMI: A FrAmework for Recording Multi-Modal Interactions
Crowdsourced Multimodal Corpora Collection Tool
Kopřivová, Marie Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Kordoni, Valia Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Korhonen, Anna Acquiring Verb Classes Through Bottom-Up Semantic Verb Clustering
Koroleva, Anna Annotating Spin in Biomedical Scientific Publications: the case of Random Controlled Trials (RCTs)
Koryzis, Dimitris The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Kosmehl, Benjamin Analyzing Middle High German Syntax with RDF and SPARQL
Kosseim, Leila Attention for Implicit Discourse Relation Recognition
Kotlerman, Lili A Recorded Debating Dataset
Kouarata, Guy-Noël Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Kouarata, Guy-Noël A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Koutsombogera, Maria Modeling Collaborative Multimodal Behavior in Group Dialogues: The MULTISIMO Corpus
Kouylekov, Milen OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora
Kovatchev, Venelin ETPC - A Paraphrase Identification Corpus Annotated with Extended Paraphrase Typology and Negation
Koyanagi, Yusuke Chemical Compounds Knowledge Visualization with Natural Language Processing and Linked Data
Kozawa, Shunsuke Extending Search System based on Interactive Visualization for Speech Corpora
Kozhevnikov, Mikhail Automatic Prediction of Discourse Connectives
Kraif, Olivier Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation
Kral, Pavel Czech Text Document Corpus v 2.0
Kraus, Johannes Effects of Gender Stereotypes on Trust and Likability in Spoken Human-Robot Interaction
Kraus, Matthias Effects of Gender Stereotypes on Trust and Likability in Spoken Human-Robot Interaction
Krause, Sebastian Automatic Prediction of Discourse Connectives
Krenn, Brigitte Action Verb Corpus
Krielke, Pauline ParCorFull: a Parallel Corpus Annotated with Full Coreference
Kriese, Leonard A Multi-layer Annotated Corpus of Argumentative Text: From Argument Schemes to Discourse Relations
Krishna, Amrith Building a Word Segmenter for Sanskrit Overnight
Krishnaswamy, Nikhil An Evaluation Framework for Multimodal Interaction
Krišlauks, Rihards Training and Adapting Multilingual NMT for Less-resourced and Morphologically Rich Languages
Krstev, Cvetana Using English Baits to Catch Serbian Multi-Word Terminology
Kruschwitz, Udo Improving Hate Speech Detection with Deep Learning Ensembles
Scalable Visualisation of Sentiment and Stance
Kríž, Vincent Czech Legal Text Treebank 2.0
Kröger, Dustin LiDo RDF: From a Relational Database to a Linked Data Graph of Linguistic Terms and Bibliographic Data
Ku, Lun-Wei EmotionLines: An Emotion Corpus of Multi-Party Conversations
Kuhn, Jonas Moving TIGER beyond Sentence-Level
German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
A Lightweight Modeling Middleware for Corpus Processing
Kulahcioglu, Tugba FontLex: A Typographical Lexicon based on Affective Associations
Kulmizev, Artur MGAD: Multilingual Generation of Analogy Datasets
Kumar, Ritesh Aggression-annotated Corpus of Hindi-English Code-mixed Data
Kumar, Adarsh Translating Web Search Queries into Natural Language Questions
Kumar, Rohit Phonetically Balanced Code-Mixed Speech Corpus for Hindi-English Automatic Speech Recognition
Kumari, Surabhi MMQA: A Multi-domain Multi-lingual Question-Answering Framework for English and Hindi
Kummerfeld, Jonathan K. World Knowledge for Abstract Meaning Representation Parsing
Kunchukuttan, Anoop The IIT Bombay English-Hindi Parallel Corpus
Kuntschick, Philipp Framing Named Entity Linking Error Types
Kuo, Chuan-Chun EmotionLines: An Emotion Corpus of Multi-Party Conversations
Kuo, Hong-Kwang A Recorded Debating Dataset
Kupietz, Marc The German Reference Corpus DeReKo: New Developments – New Opportunities
Kurfalı, Murathan Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank
An Assessment of Explicit Inter- and Intra-sentential Discourse Connectives in Turkish Discourse Bank
Kurohashi, Sadao Comprehensive Annotation of Various Types of Temporal Information on the Time Axis
Improving Crowdsourcing-Based Annotation of Japanese Discourse Relations
Kwon, Sunggoo Semi-automatic Korean FrameNet Annotation over KAIST Treebank
Automatic Wordnet Mapping: from CoreNet to Princeton WordNet
Kåsen, Andre The LIA Treebank of Spoken Norwegian Dialects
Köhler, Joachim Improved Transcription and Indexing of Oral History Interviews for Digital Humanities Research
Köser, Stephanie Introducing a Lexicon of Verbal Polarity Shifters for English
Kübler, Sandra UniMorph 2.0: Universal Morphology

 

L
L'Homme, Marie-Claude Browsing the Terminological Structure of a Specialized Domain: A Method Based on Lexical Functions and their Classification
L'Homme, Marie-Claude Lexical Profiling of Environmental Corpora
Laaridh, Imed Dysarthric speech evaluation: automatic and perceptual approaches
Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Labaka, Gorka Building Named Entity Recognition Taggers via Parallel Corpora
Konbitzul: an MWE-specific database for Spanish-Basque
Labropoulou, Penny Managing Public Sector Data for Multilingual Applications Development
A Legal Perspective on Training Models for Natural Language Processing
Lacayrelle, Annig Automatic Identification of Research Fields in Scientific Papers
Lachler, Jordan Modeling Northern Haida Verb Morphology
Lacruz, Isabel Literality and cognitive effort: Japanese and Spanish
Laforest, Frederique T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Laganaro, Marina The MonPaGe_HA Database for the Documentation of Spoken French Throughout Adulthood
Lai, Mirko Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ
Lai, Dac Viet TSix: A Human-involved-creation Dataset for Tweet Summarization
Laignelet, Marion A Real-life, French-accented Corpus of Air Traffic Control Communications
Lala, Chiraag Multimodal Lexical Translation
Lalain, Muriel Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Lambert, Patrik MultiBooked: A Corpus of Basque and Catalan Hotel Reviews Annotated for Aspect-level Sentiment Classification
Lambrey, Florie GenDR: A Generic Deep Realizer with Complex Lexicalization
Lamel, Lori A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
The French-Algerian Code-Switching Triggered audio corpus (FACST)
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Lan, Alex Definite Description Lexical Choice: taking Speaker's Personality into account
Landeau, Anais FrNewsLink: a corpus linking TV Broadcast News Segments and Press Articles
Lando, Tatiana Dialog Intent Structure: A Hierarchical Schema of Linked Dialog Acts
Landragin, Frédéric ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
Lange, Lukas KRAUTS: A German Temporally Annotated News Corpus
Langlais, Philippe Transforming Wikipedia into a Large-Scale Fine-Grained Entity Type Corpus
Revisiting the Task of Scoring Open IE Relations
Lango, Mateusz Semi-Automatic Construction of Word-Formation Networks (for Polish and Spanish)
Langone, Helen Candidate Ranking for Maintenance of an Online Dictionary
Laokulrat, Natsuda Incorporating Semantic Attention in Video Description Generation
Lapshinova-Koltunski, Ekaterina ParCorFull: a Parallel Corpus Annotated with Full Coreference
Larasati, Septina The First 100 Days: A Corpus Of Political Agendas on Twitter
Lareau, François GenDR: A Generic Deep Realizer with Complex Lexicalization
Retrieving Information from the French Lexical Network in RDF/OWL Format
Larkin, Samuel EuroGames16: Evaluating Change Detection in Online Conversation
Lavee, Tamar A Recorded Debating Dataset
Lavelli, Alberto PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies
Lavergne, Thomas Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Lawless, Seamus C-HTS: A Concept-based Hierarchical Text Segmentation approach
Lawrence, John Intertextual Correspondence for Integrating Corpora
Le, Minh Neural Models of Selectional Preferences for Implicit Semantic Role Labeling
Le Dinh, Thang PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Le Maguer, Sébastien Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform
Leach, Andrew A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
Lecadre, Sabrina Matics Software Suite: New Tools for Evaluation and Data Exploration
Lechelle, William Revisiting the Task of Scoring Open IE Relations
Lecouteux, Benjamin UFSAC: Unification of Sense Annotated Corpora and Tools
Lee, Gyeongbok Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Lee, Chi-Yao Fluid Annotation: A Granularity-aware Annotation Tool for Chinese Word Fluidity
Lee, John L1-L2 Parallel Treebank of Learner Chinese: Overused and Underused Syntactic Structures
Lee, Kiyong Towards an ISO Standard for the Annotation of Quantification
Lee, Kristine Text Annotation Graphs: Annotating Complex Natural Language Phenomena
Lee, Ji Young Transfer Learning for Named-Entity Recognition with Neural Networks
Lee, Kyungjae Semi-supervised Training Data Generation for Multilingual Question Answering
Lee, Lung-Hao Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis
Leeflang, Mariska Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat
Lefakis, Leonidas FEIDEGGER: A Multi-modal Corpus of Fashion Images and Descriptions in German
Lefever, Els Discovering the Language of Wine Reviews: A Text Mining Account
A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents
MIsA: Multilingual "IsA" Extraction from Corpora
Leh, Almut Improved Transcription and Indexing of Oral History Interviews for Digital Humanities Research
Lehmberg, Timm Introducing the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation
Lei, Su Dialogue Structure Annotation for Multi-Floor Interaction
Lemnitzer, Lothar A database of German definitory contexts from selected web sources
Lenardič, Jakob CLARIN’s Key Resource Families
Lenc, Ladislav Czech Text Document Corpus v 2.0
Lepage, Yves Korean L2 Vocabulary Prediction: Can a Large Annotated Corpus be Used to Train Better Models for Predicting Unknown Words?
Tools for The Production of Analogical Grids and a Resource of N-gram Analogical Grids in 11 Languages
Lepage, Benoît Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Lestari, Dessi Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas
Leuski, Anton The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Levacher, Killian The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Levin, Lori Parser combinators for Tigrinya and Oromo morphology
Levy, Francois An Annotation Language for Semantic Search of Legal Sources
Levy, Ran Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Levāne-Petrova, Kristīne The Use of Text Alignment in Semi-Automatic Error Analysis: Use Case in the Development of the Corpus of the Latvian Language Learners
León-Araúz, Pilar Towards the Inference of Semantic Relations in Complex Nominals: a Pilot Study
Manzanilla: An Image Annotation Tool for TKB Building
Evaluating EcoLexiCAT: a Terminology-Enhanced CAT Tool
Li, Maoxi Building Parallel Monolingual Gan Chinese Dialects Corpus
Li, Boyang Annotating High-Level Structures of Short Stories and Personal Anecdotes
Li, Xiang Sound Signal Processing with Seq2Tree Network
Li, Xiaoqing One Sentence One Model for Neural Machine Translation
Li, Binyang The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Li, Chenfang Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Li, Christy Analyzing the Quality of Counseling Conversations: the Tell-Tale Signs of High-quality Counseling
Li, Keying L1-L2 Parallel Treebank of Learner Chinese: Overused and Underused Syntactic Structures
Li, Vivian HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Li, Xian A Corpus for Multilingual Document Classification in Eight Languages
Li, Zhenghua M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains
Li, Xuansong Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers
Liao, FangMing EuroGames16: Evaluating Change Detection in Online Conversation
Liberman, Mark From ‘Solved Problems’ to New Challenges: A Report on LDC Activities
Introducing NIEUW: Novel Incentives and Workflows for Eliciting Linguistic Data
Liebeskind, Chaya Automatic Thesaurus Construction for Modern Hebrew
Liesenfeld, Andreas MYCanCor: A Video Corpus of spoken Malaysian Cantonese
Ligeti-Nagy, Noémi What's Wrong, Python? -- A Visual Differ and Graph Library for NLP in Python
Ligozat, Anne-Laure Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Lim, Chae-Gyun Korean TimeBank Including Relative Temporal Information
Lim, KyungTae Multilingual Dependency Parsing for Low-Resource Languages: Case Studies on North Saami and Komi-Zyrian
Lin, Xi Victoria NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System
Lin, Donghui A Framework for Multi-Language Service Design with the Language Grid
Lin, Chin-Ho Learning to Map Natural Language Statements into Knowledge Base Representations for Knowledge Base Construction
Lin, Chin-Yew Revisiting Distant Supervision for Relation Extraction
Ling, Shaoshi CogCompNLP: Your Swiss Army Knife for NLP
Linhares, Andréa Carneiro A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task
Linhares Pontes, Elvys A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task
Linz, Nicklas The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Lison, Pierre OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora
Littell, Patrick Parser combinators for Tigrinya and Oromo morphology
Epitran: Precision G2P for Many Languages
Liu, Ruishen Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Liu, Chao-Hong Chinese-Portuguese Machine Translation: A Study on Building Parallel Corpora from Comparable Texts
Liu, Siyou Chinese-Portuguese Machine Translation: A Study on Building Parallel Corpora from Comparable Texts
Liu, Qianchu NegPar: A parallel corpus annotated for negation
Liu, Ting Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Liu, Yang Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Liu, Jing Revisiting Distant Supervision for Relation Extraction
Loda, Sylvette Disambiguation of Verbal Shifters
Lohar, Pintu FooTweets: A Bilingual Parallel Corpus of World Cup Tweets
Lohr, Christina Sharing Copies of Synthetic Clinical Corpora without Physical Distribution — A Case Study to Get Around IPRs and Privacy Constraints Featuring the German JSYNCC Corpus
Lolive, Damien EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context
SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis
Lopatenko, Andrei HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Lopes, José The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions
FARMI: A FrAmework for Recording Multi-Modal Interactions
Lotz, Alicia Recognizing Behavioral Factors while Driving: A Real-World Multimodal Corpus to Monitor the Driver’s Affective State
Lovick, Olga A Computational Architecture for the Morphology of Upper Tanana
Lu, Qi M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains
Lu, Di Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Lucas, Gale The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Lukeš, David Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Lukin, Stephanie Dialogue Structure Annotation for Multi-Floor Interaction
Lundholm Fors, Kristina A Swedish Cookie-Theft Corpus
Luo, Guanheng CogCompNLP: Your Swiss Army Knife for NLP
Luz, Saturnino The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Speech Rate Calculations with Short Utterances: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task
Lyding, Verena Transc&Anno: A Graphical Tool for the Transcription and On-the-Fly Annotation of Handwritten Documents
López, Rodrigo Corpus Building and Evaluation of Aspect-based Opinion Summaries from Tweets in Spanish
López Monroy, Adrian Pastor MPST: A Corpus of Movie Plot Synopses with Tags
Lösch, Andrea European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Lønning, Jan Tore Evaluation of Domain-specific Word Embeddings using Knowledge Resources
Lücking, Andy TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations
Lüngen, Harald The German Reference Corpus DeReKo: New Developments – New Opportunities

 

M
M R, Vineeth Building a Word Segmenter for Sanskrit Overnight
Michaud, Alexis Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Ma, Weicheng Sound Signal Processing with Seq2Tree Network
Ma, Wentao Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Ma, Wei-Yun Word Embedding Evaluation Datasets and Wikipedia Title Embedding for Chinese
Extended HowNet 2.0 – An Entity-Relation Common-Sense Representation Model
Macdonald, Ross A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks
Macken, Lieve A fine-grained error analysis of NMT, SMT and RBMT output for English-to-Dutch
Macketanz, Vivien TQ-AutoTest – An Automated Test Suite for (Machine) Translation Quality
Maegaard, Bente CLARIN: Towards FAIR and Responsible Data Science Using Language Resources
Magdy, Walid Multi-Dialect Arabic POS Tagging: A CRF Approach
Part-of-Speech Tagging for Arabic Gulf Dialect Using Bi-LSTM
Magg, Sven A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks
Magimai-Doss, Mathew SMILE Swiss German Sign Language Dataset
Magistry, Pierre Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Magnini, Bernardo KRAUTS: A German Temporally Annotated News Corpus
Enriching a Lexicon of Discourse Connectives with Corpus-based Data
Maguiño Valencia, Diego WordNet-Shp: Towards the Building of a Lexical Database for a Peruvian Minority Language
Maharjan, Suraj MPST: A Corpus of Movie Plot Synopses with Tags
Maheshwari, Anant Towards Language Technology for Mi'kmaq
Maheshwari, Tushar Aggression-annotated Corpus of Hindi-English Code-mixed Data
Maier, Wolfgang Towards an Automatic Assessment of Crowdsourced Data for NLU
Majewska, Olga Acquiring Verb Classes Through Bottom-Up Semantic Verb Clustering
Majid, Asifa Discovering the Language of Wine Reviews: A Text Mining Account
Makasso, Emmanuel-Moselly BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Makazhanov, Aibek Manual vs Automatic Bitext Extraction
Makino, Ryosaku Preliminary Analysis of Embodied Interactions between Science Communicators and Visitors Based on a Multimodal Corpus of Japanese Conversations in a Science Museum
Malchanau, Andrei The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Towards Continuous Dialogue Corpus Creation: writing to corpus and generating from it
Malisz, Zofia FARMI: A FrAmework for Recording Multi-Modal Interactions
Bringing Order to Chaos: A Non-Sequential Approach for Browsing Large Sets of Found Audio Data
Malmi, Eric Automatic Prediction of Discourse Connectives
Mamidi, Radhika Resource Creation Towards Automated Sentiment Analysis in Telugu (a low resource language) and Integrating Multiple Domain Sources to Enhance Sentiment Prediction
Man, Yuan Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Mandya, Angrosh A Dataset for Inter-Sentence Relation Extraction using Distant Supervision
Manjunath, Varun Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach
Manuvinakurike, Ramesh Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
Mapelli, Valérie Data Management Plan (DMP) for Language Data under the New General Data Protection Regulation (GDPR)
New directions in ELRA activities
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Marciniak, Malgorzata SimLex-999 for Polish
Marcus, Mitchell Low-resource Post Processing of Noisy OCR Output for Historical Corpus Digitisation
Marge, Matthew Dialogue Structure Annotation for Multi-Floor Interaction
Margolin, Drew An Attribution Relations Corpus for Political News
Margoni, Thomas A Legal Perspective on Training Models for Natural Language Processing
Mariani, Joseph Measuring Innovation in Speech and Language Processing Publications
Marimon, Montserrat Coreference Resolution in FreeLing 4.0
Mariotti, André Referring Expression Generation in time-constrained communication
Marmorstein, Steven WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-hop Inference
Maroudis, Pantelis Recognizing Behavioral Factors while Driving: A Real-World Multimodal Corpus to Monitor the Driver’s Affective State
Marsico, Egidio BDPROTO: A Database of Phonological Inventories from Ancient and Reconstructed Languages
Marteau, Pierre-François EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context
Marteau, Pierre-François Two Multilingual Corpora Extracted from the Tenders Electronic Daily for Machine Learning and Machine Translation Applications
Marteau, Camille CONDUCT: An Expressive Conducting Gesture Dataset for Sound Control
Marti, Toni ETPC - A Paraphrase Identification Corpus Annotated with Extended Paraphrase Typology and Negation
Martin, Fanny Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Martinc, Matej Reusable workflows for gender prediction
Martínez Alonso, Héctor Automatic Annotation of Semantic Term Types in the Complete ACL Anthology Reference Corpus
Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer
Martínez Garcia, Eva Evaluating Domain Adaptation for Machine Translation Across Scenarios
Maruyama, Takumi Simplified Corpus with Core Vocabulary
Marzi, Claudia Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Marzinotto, Gabriel Semantic Frame Parsing for Information Extraction: the CALOR corpus
Mascarenhas, Samuel FARMI: A FrAmework for Recording Multi-Modal Interactions
Mass, Yosi Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Matamala, Anna Evaluating Domain Adaptation for Machine Translation Across Scenarios
Mathias, Sandeep ASAP++: Enriching the ASAP Automated Essay Grading Dataset with Essay Attribute Scores
Matousek, Jindrich Design and Development of Speech Corpora for Air Traffic Control Training
Matsubara, Shigeki Statistical Analysis of Missing Translation in Simultaneous Interpretation Using A Large-scale Bilingual Speech Corpus
Matsuda, Hironobu Visualization of the occurrence trend of infectious diseases using Twitter
Matsumoto, Kazuyuki Visualization of the occurrence trend of infectious diseases using Twitter
Matsumoto, Yuji Universal Dependencies Version 2 for Japanese
Construction of Large-scale English Verbal Multiword Expression Annotated Corpus
A Parallel Corpus of Arabic-Japanese News Articles
EMTC: Multilabel Corpus in Movie Domain for Emotion Analysis in Conversational Text
PDFAnno: a Web-based Linguistic Annotation Tool for PDF Documents
Chemical Compounds Knowledge Visualization with Natural Language Processing and Linked Data
Sudachi: a Japanese Tokenizer for Business
Matsumoto, Ryusei Visualization of the occurrence trend of infectious diseases using Twitter
Matsuyoshi, Suguru Annotating Modality Expressions and Event Factuality for a Japanese Chess Commentary Corpus
Matthews, Graham Toward An Epic Epigraph Graph
Mauclair, Julie Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Mayhew, Stephen CogCompNLP: Your Swiss Army Knife for NLP
Maynard, Hélène A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Mazo, Hélène New directions in ELRA activities
Mazovetskiy, Gleb Improving homograph disambiguation with supervised machine learning
Mazzei, Alessandro PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies
Mazzucchi, Andrea From ‘Solved Problems’ to New Challenges: A Report on LDC Activities
McCarthy, Diana Acquiring Verb Classes Through Bottom-Up Semantic Verb Clustering
McCarthy, Arya D. UniMorph 2.0: Universal Morphology
McCoy, Tom Parser combinators for Tigrinya and Oromo morphology
McCrae, John Philip A Comparison Of Emotion Annotation Schemes And A New Annotated Data Set
Automatic Enrichment of Terminological Resources: the IATE RDF Example
A supervised approach to taxonomy extraction using word embeddings
Teanga: A Linked Data based platform for Natural Language Processing
McNaught, John A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
McNew, Garland Towards faithfully visualizing global linguistic diversity
Mediankin, Nikita SumeCzech: Large Czech News-Based Summarization Dataset
Meftah, Sara A Neural Network Model for Part-Of-Speech Tagging of Social Media Texts
Mehler, Alexander FastSense: An Efficient Word Sense Disambiguation Classifier
WikiDragon: A Java Framework For Diachronic Content And Network Analysis Of MediaWikis
TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations
A UIMA Database Interface for Managing NLP-related Text Annotations
Mehta, Pratik The IIT Bombay English-Hindi Parallel Corpus
Meignier, Sylvain Computer-assisted Speaker Diarization: How to Evaluate Human Corrections
Mekonnen, Baye Yimam Universal Dependencies for Amharic
Melacci, Stefano Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources
Mendels, Gideon Collecting Code-Switched Data from Social Media
Mendes, Amália Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank
A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language
A Lexicon of Discourse Markers for Portuguese – LDM-PT
Error annotation in a Learner Corpus of Portuguese
Meng, Zhao Towards Neural Speaker Modeling in Multi-Party Conversation: The Task, Dataset, and Models
Meng, Xiaofeng ScholarGraph: a Chinese Knowledge Graph of Chinese Scholars
Meng, Yuanliang Automatic Labeling of Problem-Solving Dialogues for Computational Microgenetic Learning Analytics
Menzel, Wolfgang Incorporating Contextual Information for Language-Independent, Dynamic Disambiguation Tasks
Mercado, Rodolfo ChAnot: An Intelligent Annotation Tool for Indigenous and Highly Agglutinative Languages in Peru
Merkulova, Tatiana FonBund: A Library for Combining Cross-lingual Phonological Segment Data
Mestre, Daniel A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Metaxas, Dimitri Linguistically-driven Framework for Computationally Efficient and Scalable Sign Recognition
Metze, Florian Annotating High-Level Structures of Short Stories and Personal Anecdotes
Meunier, Christine Dysarthric speech evaluation: automatic and perceptual approaches
Meurer, Paul The Abkhaz National Corpus
Meyer, Christian M. Live Blog Corpus for Summarization
Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
Mi, Chenggang A Neural Network Based Model for Loanword Identification in Uyghur
Miceli Barone, Antonio Valerio Improving Machine Translation of Educational Content via Crowdsourcing
Miehle, Juliana What Causes the Differences in Communication Styles? A Multicultural Study on Directness and Elaborateness
Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room
Mielke, Sebastian UniMorph 2.0: Universal Morphology
Mieskes, Margot Preparing Data from Psychotherapy for Natural Language Processing
Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
Migueles-Abraira, Noelia Annotating Abstract Meaning Representations for Spanish
Mihalcea, Rada Analyzing the Quality of Counseling Conversations: the Tell-Tale Signs of High-quality Counseling
World Knowledge for Abstract Meaning Representation Parsing
Mikolov, Tomas Learning Word Vectors for 157 Languages
Advances in Pre-Training Distributed Word Representations
Mikulová, Marie ForFun 1.0: Prague Database of Forms and Functions -- An Invaluable Resource for Linguistic Research
Millour, Alice Toward a Lightweight Solution for Less-resourced Languages: Creating a POS Tagger for Alsatian Using Voluntary Crowdsourcing
Min, Bonan When ACE met KBP: End-to-End Evaluation of Knowledge Base Population with Component-level Annotation
Minami, Yasuhiro Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database
Analyzing Vocabulary Commonality Index Using Large-scaled Database of Child Language Development
Minard, Anne-Lyse KRAUTS: A German Temporally Annotated News Corpus
Minker, Wolfgang Effects of Gender Stereotypes on Trust and Likability in Spoken Human-Robot Interaction
What Causes the Differences in Communication Styles? A Multicultural Study on Directness and Elaborateness
Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room
On the Vector Representation of Utterances in Dialogue Context
Contextual Dependencies in Time-Continuous Multidimensional Affect Recognition
Mirkin, Shachar A Recorded Debating Dataset
Mironova, Veselina A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events
A Corpus Study and Annotation Schema for Named Entity Recognition and Relation Extraction of Business Products
Mirzaei, Azadeh Persian Discourse Treebank and coreference corpus
Misra, Amita SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Misutka, Jozef Bridging the LAPPS Grid and CLARIN
Mitamura, Teruko Parser combinators for Tigrinya and Oromo morphology
Mitrofan, Maria BioRo: The Biomedical Corpus for the Romanian Language
Mittal, Arpit Simple Large-scale Relation Extraction from Unstructured Text
Mittelholcz, Iván Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
E-magyar -- A Digital Language Processing System
Miyao, Yusuke Universal Dependencies Version 2 for Japanese
Universal Dependencies for Amharic
Miyazaki, Yumi Annotation and Quantitative Analysis of Speaker Information in Novel Conversation Sentences in Japanese
Mizukami, Masahiro Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing
Mladenović, Miljana Using English Baits to Catch Serbian Multi-Word Terminology
Modi, Ashutosh MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
Multi-layer Annotation of the Rigveda
Mohamed, Esraa DART: A Large Dataset of Dialectal Arabic Tweets
Mohammad, Saif Quantifying Qualitative Data for Understanding Controversial Issues
Word Affect Intensities
Understanding Emotions: A Dataset of Tweets to Study Interactions between Affect Categories
WikiArt Emotions: An Annotated Dataset of Emotions Evoked by Art
Mohtaj, Salar Parsivar: A Language Processing Toolkit for Persian
Mojica de la Vega, Luis Gerardo Modeling Trolling in Social Media Conversations
Moldovan, Dan Chinese Relation Classification using Long Short Term Memory Networks
Monachini, Monica One Language to rule them all: modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL
The LREC Workshops Map
Monteiro, Danielle Building a Corpus for Personality-dependent Natural Language Understanding and Generation
Montemagni, Simonetta Universal Dependencies and Quantitative Typological Trends. A Case Study on Word Order
Montiel-Ponsoda, Elena Automatic Enrichment of Terminological Resources: the IATE RDF Example
Monz, Christof Examining the Tip of the Iceberg: A Data Set for Idiom Translation
Evaluation of Machine Translation Performance Across Multiple Genres and Languages
Moran, Steven BDPROTO: A Database of Phonological Inventories from Ancient and Reconstructed Languages
Towards faithfully visualizing global linguistic diversity
Cross-linguistically Small World Networks are Ubiquitous in Child-directed Speech
Morante, Roser Systems’ Agreements and Disagreements in Temporal Processing: An Extensive Error Analysis of the TempEval-3 Task
Resource Interoperability for Sustainable Benchmarking: The Case of Events
Morawiecki, Paweł Deep Neural Networks for Coreference Resolution for Polish
More, Amir CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Moreau, Erwan Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus
Moreira, Viviane A Large Parallel Corpus of Full-Text Scientific Articles
Moreira, Jander The Effects of Unimodal Representation Choices on Multimodal Learning
Moreno-Ortiz, Antonio Lingmotif-lex: a Wide-coverage, State-of-the-art Lexicon for Sentiment Analysis
Moreno-Schneider, Julian Automatic and Manual Web Annotations in an Infrastructure to handle Fake News and other Online Media Phenomena
Mori, Shinsuke Universal Dependencies Version 2 for Japanese
Annotating Modality Expressions and Event Factuality for a Japanese Chess Commentary Corpus
Mori, Wakaha Building A Handwritten Cuneiform Character Imageset
Morin, Emmanuel Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Moritz, Maria Lexical and Semantic Features for Cross-lingual Text Reuse Classification: an Experiment in English and Latin Paraphrases
Moroz, George IPSL: A Database of Iconicity Patterns in Sign Languages. Creation and Use
Morrison, Clayton WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-hop Inference
Mortazavi, Mahdi MirasVoice: A bilingual (English-Persian) speech corpus
Mortazavi Najafabadi, Seyed hani elamahdi MirasText: An Automatically Generated Text Corpus for Persian
Mortensen, David R. Parser combinators for Tigrinya and Oromo morphology
Epitran: Precision G2P for Many Languages
Moshagen, Sjur Modeling Northern Haida Verb Morphology
Mothe, Josiane Building Evaluation Datasets for Cultural Microblog Retrieval
Mott, Justin Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers
Mou, Lili Towards Neural Speaker Modeling in Multi-Party Conversation: The Task, Dataset, and Models
Mouchère, Harold Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Moussallem, Diego LIdioms: A Multilingual Linked Idioms Data Set
RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Mubarak, Hamdy Multi-Dialect Arabic POS Tagging: A CRF Approach
Part-of-Speech Tagging for Arabic Gulf Dialect Using Bi-LSTM
Build Fast and Accurate Lemmatization for Arabic
Mueller, Martin Lightweight Grammatical Annotation in the TEI: New Perspectives
Mueller, Markus A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Mukherjee, Arjun Experiments with Convolutional Neural Networks for Multi-Label Authorship Attribution
Mukuze, Nelson A vision-grounded dataset for predicting typical locations for verbs
Mulhem, Philippe Building Evaluation Datasets for Cultural Microblog Retrieval
Muller, Ludek Towards Processing of the Oral History Interviews and Related Printed Documents
Munasinghe, Pranidhith Annotating Opinions and Opinion Targets in Student Course Feedback
Munesada, Yohei PDFAnno: a Web-based Linguistic Annotation Tool for PDF Documents
Murakami, Yohei Designing a Collaborative Process to Create Bilingual Dictionaries of Indonesian Ethnic Languages
A Framework for Multi-Language Service Design with the Language Grid
Muralidaran, Vigneshwaran No more beating about the bush: A Step towards Idiom Handling for Indian Language NLP
Murawaki, Yugo Universal Dependencies Version 2 for Japanese
Annotating Modality Expressions and Event Factuality for a Japanese Chess Commentary Corpus
Improving Crowdsourcing-Based Annotation of Japanese Discourse Relations
Muresan, Smaranda A Multi-layer Annotated Corpus of Argumentative Text: From Argument Schemes to Discourse Relations
Musat, Claudiu Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Muscat, Adrian Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Musi, Elena A Multi-layer Annotated Corpus of Argumentative Text: From Argument Schemes to Discourse Relations
Mutschke, Peter Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications
Mykowiecka, Agnieszka SimLex-999 for Polish
Myrzakhmetov, Bagdat Manual vs Automatic Bitext Extraction
Ménard, Lucie The MonPaGe_HA Database for the Documentation of Spoken French Throughout Adulthood
Mírovský, Jiří Discourse Coherence Through the Lens of an Annotated Text Corpus: A Case Study
Müller, Lydia Corpora of Typical Sentences
Müller, Markus BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools

 

N
Nagai, Hiroyuki J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Nagesh, Ajay Grounding Gradable Adjectives through Crowdsourcing
Nahli, Ouafae Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Nakadai, Kazuhiro Deep JSLC: A Multimodal Corpus Collection for Data-driven Generation of Japanese Sign Language Expressions
Nakamura, Tetsuaki JFCKB: Japanese Feature Change Knowledge Base
JDCFC: A Japanese Dialogue Corpus with Feature Changes
Nakamura, Satoshi Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing
Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags
Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas
Nakano, Mikio Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Nakayama, Hideki Incorporating Semantic Attention in Video Description Generation
Augmenting Image Question Answering Dataset by Exploiting Image Captions
Nam, Sangha Unsupervised Korean Word Sense Disambiguation using CoreNet
Naskos, Thanasis Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Nasr, Alexis Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
Semantic Frame Parsing for Information Extraction: the CALOR corpus
Adding Syntactic Annotations to Flickr30k Entities Corpus for Multimodal Ambiguous Prepositional-Phrase Attachment Resolution
Nastase, Vivi DeModify: A Dataset for Analyzing Contextual Constraints on Modifier Deletion
Correction of OCR Word Segmentation Errors in Articles from the ACL Collection through Neural Machine Translation Methods
Nasution, Arbi Haza Designing a Collaborative Process to Create Bilingual Dictionaries of Indonesian Ethnic Languages
Navarretta, Costanza The Automatic Annotation of the Semiotic Type of Hand Gestures in Obama's Humorous Speeches
Navigli, Roberto Huge Automatically Extracted Training-Sets for Multilingual Word Sense Disambiguation
Nazarenko, Adeline An Annotation Language for Semantic Search of Legal Sources
Neale, Steven Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh
A Survey on Automatically-Constructed WordNets and their Evaluation: Lexical and Word Embedding-based Approaches
Neduchal, Petr Towards Processing of the Oral History Interviews and Related Printed Documents
Negri, Matteo ESCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing
Nehring, Jan A Framework for the Needs of Different Types of Users in Multilingual Semantic Enrichment
Neidle, Carol Linguistically-driven Framework for Computationally Efficient and Scalable Sign Recognition
Nejat, Maryam GenDR: A Generic Deep Realizer with Complex Lexicalization
Nellore, Bhanu Teja Phonetically Balanced Code-Mixed Speech Corpus for Hindi-English Automatic Speech Recognition
Nespore-Berzkalne, Gunta Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Neto, Georges Building a Corpus for Personality-dependent Natural Language Understanding and Generation
Neubarth, Friedrich Action Verb Corpus
Neubauer, Catherine Unfolding the External Behavior and Inner Affective State of Teammates through Ensemble Learning: Experimental Evidence from a Dyadic Team Corpus
Neubig, Graham Evaluating Phonemic Transcription of Low-Resource Tonal Languages for Language Documentation
Neuzilova, Lucie Low Resource Methods for Medieval Document Sections Analysis
Neves, Mariana Parallel Corpora for the Biomedical Domain
RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Newell, Edward Constructing a Lexicon of Relational Nouns
An Attribution Relations Corpus for Political News
Ng, Vincent Improving Unsupervised Keyphrase Extraction using Background Knowledge
Modeling Trolling in Social Media Conversations
Ng, Hwee Tou Upping the Ante: Towards a Better Benchmark for Chinese-to-English Machine Translation
Ngo, Thi Lan A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard
Ngonga Ngomo, Axel-Cyrille LIdioms: A Multilingual Linked Idioms Data Set
RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Nguyen, Minh-Le TSix: A Human-involved-creation Dataset for Tweet Summarization
Nguyen, Dat Quoc A Fast and Accurate Vietnamese Word Segmenter
Nguyen, Kiem-Hieu BKTreebank: Building a Vietnamese Dependency Treebank
Nguyen, Dai Quoc A Fast and Accurate Vietnamese Word Segmenter
Nguyen, Minh-Tien TSix: A Human-involved-creation Dataset for Tweet Summarization
Nguyen, Huy-Tien TSix: A Human-involved-creation Dataset for Tweet Summarization
Nguyen, Nhung A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
Ni, Zhaoheng Sound Signal Processing with Seq2Tree Network
Nicolas, Lionel Transc&Anno: A Graphical Tool for the Transcription and On-the-Fly Annotation of Handwritten Documents
Niehues, Jan KIT-Multi: A Translation-Oriented Multilingual Embedding Corpus
Automated Evaluation of Out-of-Context Errors
Niekler, Andreas ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Nielsen, Rodney A Corpus of Metaphor Novelty Scores for Syntactically-Related Word Pairs
Annotating Educational Questions for Student Response Analysis
Nieminen, Henri Signbank: Software to Support Web Based Dictionaries of Sign Language
Niesler, Thomas A First South African Corpus of Multilingual Code-switched Soap Opera Speech
Nikolaev, Vitaly The Morpho-syntactic Annotation of Animacy for a Dependency Parser
Improving homograph disambiguation with supervised machine learning
Nikolić, Boško Fine-grained Semantic Textual Similarity for Serbian
Nikulásdóttir, Anna Björk Open ASR for Icelandic: Resources and a Baseline System
Nilsson Björkenstam, Kristina Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Nimb, Sanni A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier
Ning, Qiang CogCompNLP: Your Swiss Army Knife for NLP
Nishikawa, Hitoshi Analysis of Implicit Conditions in Database Search Dialogues
Nishikawa, Ken'ya Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Nisioi, Sergiu A Detailed Evaluation of Neural Sequence-to-Sequence Models for In-domain and Cross-domain Text Simplification
Nitoń, Bartłomiej Deep Neural Networks for Coreference Resolution for Polish
Nixon, Lyndon J.B. Framing Named Entity Linking Error Types
Nocaudie, Olivier Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Nooralahzadeh, Farhad Evaluation of Domain-specific Word Embeddings using Knowledge Resources
Nordlund, Arto A Swedish Cookie-Theft Corpus
Norman, Christopher Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat
Nouri, Javad Revita: a Language-learning Platform at the Intersection of ITS and CALL
Novak, Valerie Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
Novitasari, Sashi Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas
Novák, Attila Cross-Lingual Generation and Evaluation of a Wide-Coverage Lexical Semantic Resource
E-magyar -- A Digital Language Processing System
Novák, Borbála Cross-Lingual Generation and Evaluation of a Wide-Coverage Lexical Semantic Resource
Nugues, Pierre Linking, Searching, and Visualizing Entities in Wikipedia
Náplava, Jakub Diacritics Restoration Using Neural Networks
Nédellec, Claire Combining rule-based and embedding-based approaches to normalize textual entities with an ontology
Névéol, Aurélie Parallel Corpora for the Biomedical Domain
Three Dimensions of Reproducibility in Natural Language Processing
Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat
Nøklestad, Anders The LIA Treebank of Spoken Norwegian Dialects
Nürnberger, Andreas Portable Spelling Corrector for a Less-Resourced Language: Amharic

 

O
O'Donovan, Claire A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
O'Gorman, Tim Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
The New Propbank: Aligning Propbank with AMR through POS Unification
O'Reilly, Maria The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Oard, Douglas W. An Initial Test Collection for Ranked Retrieval of SMS Conversations
Obeid, Ossama The MADAR Arabic Dialect Corpus and Lexicon
A Morphologically Annotated Corpus of Emirati Arabic
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
Oberlander, Jon Up-cycling Data for Natural Language Generation
Oberle, Bruno SACR: A Drag-and-Drop Based Tool for Coreference Annotation
Ochs, Magalie A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Odijk, Jan The AnnCor CHILDES Treebank
Oertel, Catharine A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
FARMI: A FrAmework for Recording Multi-Modal Interactions
Crowdsourced Multimodal Corpora Collection Tool
Oflazer, Kemal The MADAR Arabic Dialect Corpus and Lexicon
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
Ogrodniczuk, Maciej Deep Neural Networks for Coreference Resolution for Polish
Ohsuga, Tomoko Extending Search System based on Interactive Visualization for Speech Corpora
Okada, Shogo Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Okahisa, Taro J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Okazaki, Naoaki Incorporating Semantic Attention in Video Description Generation
Okinina, Nadezda Transc&Anno: A Graphical Tool for the Transcription and On-the-Fly Annotation of Handwritten Documents
Okumura, Yuko Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database
Analyzing Vocabulary Commonality Index Using Large-scaled Database of Child Language Development
Oliveira, Elias Portuguese Named Entity Recognition using Conditional Random Fields and Local Grammars
Olsen, Sussi A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier
Omura, Mai Universal Dependencies Version 2 for Japanese
Oncevay, Arturo Corpus Building and Evaluation of Aspect-based Opinion Summaries from Tweets in Spanish
WordNet-Shp: Towards the Building of a Lexical Database for a Peruvian Minority Language
ChAnot: An Intelligent Annotation Tool for Indigenous and Highly Agglutinative Languages in Peru
Oostdijk, Nelleke Metadata Collection Records for Language Resources
Oraby, Shereen Exploring Conversational Language Generation for Rich Content about Hotels
SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Orekhova, Serafina Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory
Orizu, Udochukwu Content-Based Conflict of Interest Detection on Wikipedia
Ostermann, Simon MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
Mapping Texts to Scripts: An Entailment Study
Ostler, Daniel Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room
Otten, Meie The AnnCor CHILDES Treebank
Oualil, Youssef The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Ould-Arbi, Malik BabyCloud, a Technological Platform for Parents and Researchers
Owen, Gareth A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database

 

P
PVS, Avinesh Live Blog Corpus for Summarization
Padró, Lluís Coreference Resolution in FreeLing 4.0
Paetzold, Gustavo Text Simplification from Professionally Produced Corpora
SimPA: A Sentence-Level Simplification Corpus for the Public Administration Domain
Paggio, Patrizia Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Classifying the Informative Behaviour of Emoji in Microblogs
Pagé-Perron, Émilie Towards a Linked Open Data Edition of Sumerian Corpora
Paikens, Peteris Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Pajović, Danica Cross-linguistically Small World Networks are Ubiquitous in Child-directed Speech
Palmer, Martha Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Integrating Generative Lexicon Event Structures into VerbNet
The New Propbank: Aligning Propbank with AMR through POS Unification
Palshikar, Girish K. Towards a Standardized Dataset for Noun Compound Interpretation
Pan, Xiaoman Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Panchenko, Alexander Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl
Enriching Frame Representations with Distributionally Induced Senses
An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
Improving Hypernymy Extraction with Distributional Semantic Classes
Pandey, Ayushi Phonetically Balanced Code-Mixed Speech Corpus for Hindi-English Automatic Speech Recognition
Panunzi, Alessandro One event, many representations. Mapping action concepts through visual features.
Papageorgiou, Haris BioRead: A New Dataset for Biomedical Reading Comprehension
Papavassiliou, Vassilis Discovering Parallel Language Resources for Training MT Engines
Pappas, Dimitris BioRead: A New Dataset for Biomedical Reading Comprehension
Paraboni, Ivandré Building a Corpus for Personality-dependent Natural Language Understanding and Generation
Definite Description Lexical Choice: taking Speaker's Personality into account
Referring Expression Generation in time-constrained communication
Author Profiling from Facebook Corpora
Reference production in human-computer interaction: Issues for Corpus-based Referring Expression Generation
Parde, Natalie A Corpus of Metaphor Novelty Scores for Syntactically-Related Word Pairs
Pardelli, Gabriella LREMap, a Song of Resources and Evaluation
The LREC Workshops Map
Pardo, Thiago Towards AMR-BR: A SemBank for Brazilian Portuguese Language
Pareti, Silvia Dialog Intent Structure: A Hierarchical Schema of Linked Dialog Acts
Park, Joonsuk A Corpus of eRulemaking User Comments for Measuring Evaluability of Arguments
Park, Sunghyun Semi-supervised Training Data Generation for Multilingual Question Answering
Park, Jungyeul Data Anonymization for Requirements Quality Analysis: a Reproducible Automatic Error Detection Task
Park, Suzi Grapheme-level Awareness in Word Embeddings for Morphologically Rich Languages
Paroubek, Patrick Annotating Spin in Biomedical Scientific Publications: the case of Random Controlled Trials (RCTs)
Measuring Innovation in Speech and Language Processing Publications.
Parsons, Simon Sentiment-Stance-Specificity (SSS) Dataset: Identifying Support-based Entailment among Opinions.
Partanen, Niko Multilingual Dependency Parsing for Low-Resource Languages: Case Studies on North Saami and Komi-Zyrian
Parvez, Md. Rizwan A Corpus of Drug Usage Guidelines Annotated with Type of Advice
Pasini, Tommaso Huge Automatically Extracted Training-Sets for Multilingual Word Sense Disambiguation
Passonneau, Rebecca PyrEval: An Automated Method for Summary Content Analysis
Patel, Kevin Indian Language Wordnets and their Linkages with Princeton WordNet
Towards a Standardized Dataset for Noun Compound Interpretation
Patti, Viviana An Italian Twitter Corpus of Hate Speech against Immigrants
Application and Analysis of a Multi-layered Scheme for Irony on the Italian Twitter Corpus TWITTIRÒ
Patton, Robert Analyzing Citation-Distance Networks for Evaluating Publication Impact
Paul, Mithun Grounding Gradable Adjectives through Crowdsourcing
Pauli, Patrick Adapting Serious Game for Fallacious Argumentation to German: Pitfalls, Insights, and Best Practices
Pecore, Stefania Complex and Precise Movie and Book Annotations in French Language for Aspect Based Sentiment Analysis
Pedersen, Bolette A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier
Pelachaud, Catherine From analysis to modeling of engagement as sequences of multimodal behaviors
Peng, Jing Designing a Russian Idiom-Annotated Corpus
Pereira, José ChAnot: An Intelligent Annotation Tool for Indigenous and Highly Agglutinative Languages in Peru
Perez, Naiara Biomedical term normalization of EHRs with UMLS
Pergandi, Jean-Marie A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Petersen, Wiebke AET: Web-based Adjective Exploration Tool for German
Petitjean, Simon A Parser for LTAG and Frame Semantics
Petitrenaud, Simon Computer-assisted Speaker Diarization: How to Evaluate Human Corrections
Petukhova, Volha The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Towards Continuous Dialogue Corpus Creation: writing to corpus and generating from it
Peyrard, Maxime Live Blog Corpus for Summarization
Peñaloza, Daniel Corpus Building and Evaluation of Aspect-based Opinion Summaries from Tweets in Spanish
Pham, Ngoc Quan KIT-Multi: A Translation-Oriented Multilingual Embedding Corpus
Phan, Nhien A Web-based System for Crowd-in-the-Loop Dependency Treebanking
Piantadosi, Steven The Natural Stories Corpus
Piao, Scott Towards a Welsh Semantic Annotation System
Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger
Piasecki, Maciej Classifier-based Polarity Propagation in a WordNet
Piccardi, Massimo BiLSTM-CRF for Persian Named-Entity Recognition ArmanPersoNERCorpus: the First Entity-Annotated Persian Dataset
English-Basque Statistical and Neural Machine Translation
Pielström, Steffen Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Pighin, Daniele Automatic Prediction of Discourse Connectives
Pimm, Christophe A Real-life, French-accented Corpus of Air Traffic Control Communications
Pincus, Eli Chahta Anumpa: A multimodal corpus of the Choctaw Language
Pinkal, Manfred MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
Multi-layer Annotation of the Rigveda
Mapping Texts to Scripts: An Entailment Study
Semi-Supervised Clustering for Short Answer Scoring
Pinnis, Mārcis Training and Adapting Multilingual NMT for Less-resourced and Morphologically Rich Languages
Tilde MT Platform for Developing Client Specific MT Solutions
Pinquier, Julien Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Pipatsrisawat, Knot Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Voice Builder: A Tool for Building Text-To-Speech Voices
Piperidis, Stelios Discovering Parallel Language Resources for Training MT Engines
Managing Public Sector Data for Multilingual Applications Development
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Pirovani, Juliana Portuguese Named Entity Recognition using Conditional Random Fields and Local Grammars
Pirrelli, Vito Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Plu, Julien Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Plátek, Ondřej Using Adversarial Examples in Natural Language Processing
Pocostales, Joel Can Domain Adaptation be Handled as Analogies?
Poerner, Nina A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings
Poibeau, Thierry Multilingual Dependency Parsing for Low-Resource Languages: Case Studies on North Saami and Komi-Zyrian
Poletto, Fabio An Italian Twitter Corpus of Hate Speech against Immigrants
Pollak, Senja Reusable workflows for gender prediction
Pollard, Kimberly Dialogue Structure Annotation for Multi-Floor Interaction
Ponkiya, Girishkumar Towards a Standardized Dataset for Noun Compound Interpretation
Pont, Oriol Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Ponzetto, Simone Paolo Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl
MIsA: Multilingual "IsA" Extraction from Corpora
Enriching Frame Representations with Distributionally Induced Senses
An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
CATS: A Tool for Customized Alignment of Text Simplification Corpora
Improving Hypernymy Extraction with Distributional Semantic Classes
Poostchi, Hanieh BiLSTM-CRF for Persian Named-Entity Recognition ArmanPersoNERCorpus: the First Entity-Annotated Persian Dataset
Popescu, Vladimir New directions in ELRA activities
Popescu, Octavian A Large Resource of Patterns for Verbal Paraphrases
QUEST: A Natural Language Interface to Relational Databases
Popescu-Belis, Andrei Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Popovic, Maja A Multilingual Wikified Data Set of Educational Material
Posch, Lisa ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Postma, Marten Don't Annotate, but Validate: a Data-to-Text Method for Capturing Event Data
Pouchoulin, Gilles Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Prabhakaran, Vinodkumar RtGender: A Corpus for Studying Differential Responses to Gender
Pradhan, Sameer The New Propbank: Aligning Propbank with AMR through POS Unification
Pragst, Louisa On the Vector Representation of Utterances in Dialogue Context
Prazak, Ales Towards Processing of the Oral History Interviews and Related Printed Documents
Pretkalnina, Lauma Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Preum, Sarah Masud A Corpus of Drug Usage Guidelines Annotated with Type of Advice
Proisl, Thomas Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Albanian Part-of-Speech Tagging: Gold Standard and Evaluation
SoMeWeTa: A Part-of-Speech Tagger for German Social Media and Web Texts
Prokofyev, Roman Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Prokopidis, Prokopis Discovering Parallel Language Resources for Training MT Engines
Pronto, Dominique A Real-life, French-accented Corpus of Air Traffic Control Communications
Prud'hommeaux, Emily ASR for Documenting Acutely Under-Resourced Indigenous Languages
Pryzant, Reid JESC: Japanese-English Subtitle Corpus
Prévil, Nathalie Browsing the Terminological Structure of a Specialized Domain: A Method Based on Lexical Functions and their Classification
Psutka, Josef V. Towards Processing of the Oral History Interviews and Related Printed Documents
Puech, Michèle Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Puhrsch, Christian Advances in Pre-Training Distributed Word Representations
Pustejovsky, James Towards an ISO Standard for the Annotation of Quantification
Integrating Generative Lexicon Event Structures into VerbNet
An Evaluation Framework for Multimodal Interaction
Bridging the LAPPS Grid and CLARIN
Pérez-Hernández, Chantal Lingmotif-lex: a Wide-coverage, State-of-the-art Lexicon for Sentiment Analysis
Pérez-Rosas, Verónica Analyzing the Quality of Counseling Conversations: the Tell-Tale Signs of High-quality Counseling
Pétursson, Matthías Open ASR for Icelandic: Resources and a Baseline System
Pędzimąż, Tomasz An Application for Building a Polish Telephone Speech Corpus
Pęzik, Piotr Increasing the Accessibility of Time-Aligned Speech Corpora with Spokes Mix

 

Q
Quaresma, Paulo A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language
Quasnik, Vanessa Medical Entity Corpus with PICO elements and Sentiment Analysis
Quasthoff, Uwe Corpora of Typical Sentences
Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Qui, Wei Handling Big Data and Sensitive Data Using EUDAT's Generic Execution Framework and the WebLicht Workflow Engine.
Quiniou, Solen Towards a Diagnosis of Textual Difficulties for Children with Dyslexia
Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Quochi, Valeria The DLDP Survey on Digital Use and Usability of EU Regional and Minority Languages
Qwaider, Chatrine Shami: A Corpus of Levantine Arabic Dialects

 

R
Rach, Niklas On the Vector Representation of Utterances in Dialogue Context
Rademaker, Alexandre Text Mining for History: first steps on building a large dataset
Rajakumar, Ravindran Community-Driven Crowdsourcing: Data Collection with Local Developers
Rajendran, Pavithra Sentiment-Stance-Specificity (SSS) Dataset: Identifying Support-based Entailment among Opinions.
Rambow, Owen The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
Ramos, Ricelli Building a Corpus for Personality-dependent Natural Language Understanding and Generation
Ranathunga, Surangika Handling Rare Word Problem using Synthetic Training Data for Sinhala and Tamil Neural Machine Translation
Improving domain-specific SMT for low-resourced languages using data from different domains
Annotating Opinions and Opinion Targets in Student Course Feedback
Graph Based Semi-Supervised Learning Approach for Tamil POS tagging
Rapp, Reinhard A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora
Raschia, Guillaume Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Ratinov, Lev CogCompNLP: Your Swiss Army Knife for NLP
Rau, Felix Introducing the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation
Raveh, Eran FARMI: A FrAmework for Recording Multi-Modal Interactions
Ravelli, Andrea Amelio One event, many representations. Mapping action concepts through visual features.
Ravishankar, Vinit MGAD: Multilingual Generation of Analogy Datasets
Raynal, Céline A Real-life, French-accented Corpus of Air Traffic Control Communications
Rayson, Paul Arabic Dialect Identification in the Context of Bivalency and Code-Switching
Towards a Welsh Semantic Annotation System
Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger
Razavi, Marzieh SMILE Swiss German Sign Language Dataset
Reckling, Lucas Towards a Linked Open Data Edition of Sumerian Corpora
Reddy, Vikas Building a Word Segmenter for Sanskrit Overnight
Redman, Tom CogCompNLP: Your Swiss Army Knife for NLP
Reed, Chris Intertextual Correspondence for Integrating Corpora
Reganti, Aishwarya N. Aggression-annotated Corpus of Hindi-English Code-mixed Data
Rehm, Georg Language Technology for Multilingual Europe: An Analysis of a Large-Scale Survey regarding Challenges, Demands, Gaps and Needs
Automatic and Manual Web Annotations in an Infrastructure to handle Fake News and other Online Media Phenomena
Reimerink, Arianne Manzanilla: An Image Annotation Tool for TKB Building
Evaluating EcoLexiCAT: a Terminology-Enhanced CAT Tool
Reiter, Nils QUD-Based Annotation of Discourse Structure and Information Structure: Tool and Evaluation
Remaci, Arslen T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Remus, Steffen Retrofitting Word Representations for Unsupervised Sense Aware Word Similarities
Ren, Xuancheng Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Renner-Westermann, Heike Interoperability of Language-related Information: Mapping the BLL Thesaurus to Lexvo and Glottolog
Resnicow, Kenneth Analyzing the Quality of Counseling Conversations: the Tell-Tale Signs of High-quality Counseling
Rey, Günter Daniel CoLoSS: Cognitive Load Corpus with Speech and Performance Data from a Symbol-Digit Dual-Task
Rey, Christophe Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Reynés, Philippe Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Rialland, Annie A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
Richter, Caitlin Low-resource Post Processing of Noisy OCR Output for Historical Corpus Digitisation
Rieb, Elias TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations
Riester, Arndt QUD-Based Annotation of Discourse Structure and Information Structure: Tool and Evaluation
German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Rigau, German Building Named Entity Recognition Taggers via Parallel Corpora
Cross-checking WordNet and SUMO Using Meronymy
Biomedical term normalization of EHRs with UMLS
Developing New Linguistic Resources and Tools for the Galician Language
Rigouts Terryn, Ayla A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents
Rigutini, Leonardo Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources
Rijhwani, Shruti Parser combinators for Tigrinya and Oromo morphology
Rikters, Matīss Training and Adapting Multilingual NMT for Less-resourced and Morphologically Rich Languages
Rilliard, Albert A Speaking Atlas of the Regional Languages of France
Rim, Kyeongmin Bridging the LAPPS Grid and CLARIN
Rind-Pawlowski, Monika Universal Morphologies for the Caucasus region
Rinott, Ruty Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Rituma, Laura Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Rivière, Laura Extending the gold standard for a lexical substitution task: is it worth it?
Rizzo, Giuseppe Framing Named Entity Linking Error Types
Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Rizzolo, Nickolas CogCompNLP: Your Swiss Army Knife for NLP
Robert, Danièle Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Roberts, Will A Large Automatically-Acquired All-Words List of Multiword Expressions Scored for Compositionality
Roberts, Kirk A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation
Robichaud, Benoît Lexical Profiling of Environmental Corpora
Browsing the Terminological Structure of a Specialized Domain: A Method Based on Lexical Functions and their Classification
Rocci, Andrea A Multi-layer Annotated Corpus of Argumentative Text: From Argument Schemes to Discourse Relations
Rocha, Danillo Reference production in human-computer interaction: Issues for Corpus-based Referring Expression Generation
Roche, Mathieu Automatic Identification of Research Fields in Scientific Papers
Rodney, Nielsen Annotating Reflections for Health Behavior Change Therapy
Rodrigues, Paul Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
Rodrigues, João Semantic Equivalence Detection: Are Interrogatives Harder than Declaratives?
Finely Tuned, 2 Billion Token Based Word Embeddings for Portuguese
Rodríguez-Fernández, Sara Generation of a Spanish Artificial Collocation Error Corpus
Roesiger, Ina BASHI: A Corpus of Wall Street Journal Articles Annotated with Bridging Links
German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Rohrbach, Anna A vision-grounded dataset for predicting typical locations for verbs
Roman-Jimenez, Geoffrey Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Romportl, Jan Design and Development of Speech Corpora for Air Traffic Control Training
Ronzano, Francesco PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Rosenberg, Andrew Interpersonal Relationship Labels for the CALLHOME Corpus
Roshanfekr, Behnam Parsivar: A Language Processing Toolkit for Persian
Rosner, Mike Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Rosset, Sophie Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Rosso, Paolo CATS: A Tool for Customized Alignment of Text Simplification Corpora
Roth, Michael MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
Roth, Dan CogCompNLP: Your Swiss Army Knife for NLP
Rouces, Jacobo SenSALDO: Creating a Sentiment Lexicon for Swedish
Generating a Gold Standard for a Swedish Sentiment Lexicon
Roy, Subhro CogCompNLP: Your Swiss Army Knife for NLP
Rozis, Roberts Collecting Language Resources from Public Administrations in the Nordic and Baltic Countries
Tilde MT Platform for Developing Client Specific MT Solutions
Rubellin, Françoise Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Rueter, Jack Combining Concepts and Their Translations from Structured Dictionaries of Uralic Minority Languages
Ruigrok, Nel Studying Muslim Stereotyping through Microportrait Extraction
Rumshisky, Anna Automatic Labeling of Problem-Solving Dialogues for Computational Microgenetic Learning Analytics
Ruppenhofer, Josef Introducing a Lexicon of Verbal Polarity Shifters for English
Disambiguation of Verbal Shifters
Building a Morphological Treebank for German from a Linguistic Database
Ruppert, Eugen Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl
Russo, Irene The DLDP Survey on Digital Use and Usability of EU Regional and Minority Languages
Ruths, Derek An Attribution Relations Corpus for Political News
Rychlik, Piotr SimLex-999 for Polish
Rytting, C. Anton Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
Ryu, Koichiro Statistical Analysis of Missing Translation in Simultaneous Interpretation Using A Large-scale Bilingual Speech Corpus
Rögnvaldsson, Eiríkur Risamálheild: A Very Large Icelandic Text Corpus
Rødven Eide, Stian SenSALDO: Creating a Sentiment Lexicon for Swedish
Generating a Gold Standard for a Swedish Sentiment Lexicon

 

S
S, Sreelekha Morphology Injection for English-Malayalam Statistical Machine Translation
SAADANE, Houda Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
SEMMAR, Nasredine Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
SINI, Aghilas SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis.
SUN, Xu Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Saad, Motaz Shami: A Corpus of Levantine Arabic Dialects
Saam, Christian The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Sabbah, Firas Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Sabeti, Behnam MirasVoice: A bilingual (English-Persian) speech corpus
MirasText: An Automatically Generated Text Corpus for Persian
Sadat, Fatiha Retrieving Information from the French Lexical Network in RDF/OWL Format
Saddiki, Hind Unified Guidelines and Resources for Arabic Dialect Orthography
A Leveled Reading Corpus of Modern Standard Arabic
Sadeghi Bigham, Bahram Extracting an English-Persian Parallel Corpus from Comparable Corpora
Saedi, Chakaveh Semantic Equivalence Detection: Are Interrogatives Harder than Declaratives?
Browsing and Supporting Pluricentric Global Wordnet, or just your Wordnet of Interest
Safari, Pegah Persian Discourse Treebank and coreference corpus
Safavi, Saeid MirasVoice: A bilingual (English-Persian) speech corpus
Sager, Leslie A Recorded Debating Dataset
Saggion, Horacio PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Sagot, Benoît A multilingual collection of CoNLL-U-compatible morphological lexicons
CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer
Saha, Sriparna Medical Sentiment Analysis using Social Media: Towards building a Patient Assisted System
Sahlgren, Magnus Distributional Term Set Expansion
Sakaguchi, Tomohiro Comprehensive Annotation of Various Types of Temporal Information on the Time Axis
Sakai, Kazuki Creating Large-Scale Argumentation Structures for Dialogue Systems
Sakaida, Rui Preliminary Analysis of Embodied Interactions between Science Communicators and Visitors Based on a Multimodal Corpus of Japanese Conversations in a Science Museum
Sakaizawa, Yuya Construction of a Japanese Word Similarity Dataset
Sakamoto, Miho Sudachi: a Japanese Tokenizer for Business
Sakti, Sakriani Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing
Construction of English-French Multimodal Affective Conversational Corpus from TV Dramas
Salam, Amitra TAP-DLND 1.0: A Corpus for Document Level Novelty Detection
Salameh, Mohammad The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
Salamo, Maria ETPC - A Paraphrase Identification Corpus Annotated with Extended Paraphrase Typology and Negation
Sales, Juliano Efson A Multilingual Test Collection for the Semantic Search of Entity Categories
Indra: A Word Embedding and Semantic Relatedness Server
Salffner, Sophie Introducing the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation
Salimbajevs, Askars Creating Lithuanian and Latvian Speech Corpora from Inaccurately Annotated Web Data
Sallaberry, Christian Automatic Identification of Research Fields in Scientific Papers
Salton, Giancarlo D. Is it worth it? Budget-related evaluation metrics for model selection
Samih, Younes Multi-Dialect Arabic POS Tagging: A CRF Approach
Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks
Sammons, Mark CogCompNLP: Your Swiss Army Knife for NLP
SanJuan, Eric Building Evaluation Datasets for Cultural Microblog Retrieval
Sandaruwan, Prabath Handling Rare Word Problem using Synthetic Training Data for Sinhala and Tamil Neural Machine Translation
Sandra, Dominiek WordKit: a Python Package for Orthographic and Phonological Featurization
Sanguinetti, Manuela PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies
An Italian Twitter Corpus of Hate Speech against Immigrants
Sankepally, Rashmi An Initial Test Collection for Ranked Retrieval of SMS Conversations
Santos, Henrique BlogSet-BR: A Brazilian Portuguese Blog Corpus
Sarah, Gagestein Studying Muslim Stereotyping through Microportrait Extraction
Sarasola, Kepa Konbitzul: an MWE-specific database for Spanish-Basque
Sargsian, Hasmik Universal Morphologies for the Caucasus region
Sarin, Supheakmungkol Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Sarkar, Rajdeep A supervised approach to taxonomy extraction using word embeddings
Saruwatari, Hiroshi CPJD Corpus: Crowdsourced Parallel Speech Corpus of Japanese Dialects
Sarzyńska, Justyna The Linguistic Category Model in Polish (LCM-PL)
Sasaki, Felix A Framework for the Needs of Different Types of Users in Multilingual Semantic Enrichment
Sasaki, Minoru All-words Word Sense Disambiguation Using Concept Embeddings
Sass, Bálint E-magyar -- A Digital Language Processing System
Sateli, Bahar The LODeXporter: Flexible Generation of Linked Open Data Triples from NLP Frameworks for Automatic Knowledge Base Construction
Sato, Yo Creating dialect sub-corpora by clustering: a case in Japanese for an adaptive method
Saubesty, Jorane A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News
Saulite, Baiba Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Saunshi, Nikunj A Large Self-Annotated Corpus for Sarcasm
Sawada, Shinnosuke Improving Crowdsourcing-Based Annotation of Japanese Discourse Relations
Sayeed, Asad Rollenwechsel-English: a large-scale semantic role corpus
Scarton, Carolina Text Simplification from Professionally Produced Corpora
SimPA: A Sentence-Level Simplification Corpus for the Public Administration Domain
Schabus, Dietmar Academic-Industrial Perspective on the Development and Deployment of a Moderation System for a Newspaper Website
Schenk, Niko Knowing the Author by the Company His Words Keep
Towards a Linked Open Data Edition of Sumerian Corpora
The ACoLi CoNLL Libraries: Beyond Tab-Separated Values
Scherer, Stefan Unfolding the External Behavior and Inner Affective State of Teammates through Ensemble Learning: Experimental Evidence from a Dyadic Team Corpus
Scherrer, Yves Crowdsourcing Regional Variation Data and Automatic Geolocalisation of Speakers of European French
Scheutz, Matthias Towards a Conversation-Analytic Taxonomy of Speech Overlap
Schiel, Florian MOCCA: Measure of Confidence for Corpus Analysis - Automatic Reliability Check of Transcript and Automatic Segmentation
Evaluation of Automatic Formant Trackers
A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings
Schiele, Bernt A vision-grounded dataset for predicting typical locations for verbs
Schiersch, Martin A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events
Schlangen, David A Corpus of Natural Multimodal Spatial Scene Descriptions
Schler, Jonathan Automatic Thesaurus Construction for Modern Hebrew
Schluter, Natalie Baselines and Test Data for Cross-Lingual Inference
Schmidt, Maria Towards an Automatic Assessment of Crowdsourced Data for NLU
Schmidt, Christoph Data-Driven Pronunciation Modeling of Swiss German Dialectal Speech for Automatic Speech Recognition
Schmirler, Katherine Building a Constraint Grammar Parser for Plains Cree Verbs and Arguments
Schmitt, Maximilian A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events
Schmitz, Peter PMKI: an European Commission action for the interoperability, maintainability and sustainability of Language Resources
Schneider, Roman GeCoTagger: Annotation of German Verb Complements with Conditional Random Fields
Schneider, Nathan Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Semantic Supersenses for English Possessives
Schnur, Eileen An Integrated Formal Representation for Terminological and Lexical Data included in Classification Schemes
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Schraagen, Marijn Linguistic and Sociolinguistic Annotation of 17th Century Dutch Letters
Schröder, Ingrid HiNTS: A Tagset for Middle Low German
Schulder, Marc Introducing a Lexicon of Verbal Polarity Shifters for English
Schuler, William Test Sets for Chinese Nonlocal Dependency Parsing
Schumann, Anne-Kathrin Automatic Annotation of Semantic Term Types in the Complete ACL Anthology Reference Corpus
Schwab, Didier UFSAC: Unification of Sense Annotated Corpora and Tools
Schwab, Sandra MIAPARLE: Online training for the discrimination of stress contrasts
Schwartz, Lane A Morphological Analyzer for St. Lawrence Island / Central Siberian Yupik
Schweitzer, Katrin German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Schweitzer, Antje German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Schwenk, Holger A Corpus for Multilingual Document Classification in Eight Languages
Schädlich, Robert LiDo RDF: From a Relational Database to a Linked Data Graph of Linguistic Terms and Bibliographic Data
Schöch, Christof Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Schön, Saskia A Corpus Study and Annotation Schema for Named Entity Recognition and Relation Extraction of Business Products
Schöpfel, Joachim Automatic Identification of Research Fields in Scientific Papers
Seddah, Djamé CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer
Seffih, Hosni Automatic Identification of Maghreb Dialects Using a Dictionary-Based Approach
Segers, Roxane The Circumstantial Event Ontology (CEO) and ECB+/CEO: an Ontology and Corpus for Implicit Causal Relations between Events
Don't Annotate, but Validate: a Data-to-Text Method for Capturing Event Data
Seitz, Hannah Mapping Texts to Scripts: An Entailment Study
Seminck, Olga A Gold Anaphora Annotation Layer on an Eye Movement Corpus
Semmar, Nasredine A Hybrid Approach for Automatic Extraction of Bilingual Multiword Expressions from Parallel Corpora
A Neural Network Model for Part-Of-Speech Tagging of Social Media Texts
Sennrich, Rico Evaluating Machine Translation Performance on Chinese Idioms with a Blacklist Method
Improving Machine Translation of Educational Content via Crowdsourcing
Senuma, Hajime Universal Dependencies for Ainu
Sequeira, João A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language
Serras, Manex ES-Port: a Spontaneous Spoken Human-Human Technical Support Corpus for Dialogue Research in Spanish
Sevcikova, Magda Semi-Automatic Construction of Word-Formation Networks (for Polish and Spanish)
Seyfeddinipur, Mandana Introducing the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation
Seyffarth, Esther AET: Web-based Adjective Exploration Tool for German
Seyoum, Binyam Ephrem Portable Spelling Corrector for a Less-Resourced Language: Amharic
Universal Dependencies for Amharic
Shao, Yutong Evaluating Machine Translation Performance on Chinese Idioms with a Blacklist Method
Shardlow, Matthew A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
Sharma, Dipti No more beating about the bush: A Step towards Idiom Handling for Indian Language NLP
Sharma, Vishnu Building a Word Segmenter for Sanskrit Overnight
Sharma, Arun Gaining and Losing Influence in Online Conversation
Sharoff, Serge Language adaptation experiments via cross-lingual embeddings for related languages
Investigating the Influence of Bilingual MWU on Trainee Translation Quality
Cross-lingual Terminology Extraction for Translation Quality Estimation
A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora
Sharp, Rebecca Grounding Gradable Adjectives through Crowdsourcing
Sheikh, Zaid Parser combinators for Tigrinya and Oromo morphology
Sheinin, Vadim A Large Resource of Patterns for Verbal Paraphrases
QUEST: A Natural Language Interface to Relational Databases
Shemtov, Hadar Exploring Conversational Language Generation for Rich Content about Hotels
Sherif, Mohamed Ahmed LIdioms: A Multilingual Linked Idioms Data Set
Shi, Haoyue Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense
Shih, Yueh-Yin Extended HowNet 2.0 – An Entity-Relation Common-Sense Representation Model
Shimada, Kazutaka Annotation and Analysis of Extractive Summaries for the Kyutech Corpus
Shin, Jong Hun Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
Shin, Hyopil Grapheme-level Awareness in Word Embeddings for Morphologically Rich Languages
Shindo, Hiroyuki Construction of Large-scale English Verbal Multiword Expression Annotated Corpus
PDFAnno: a Web-based Linguistic Annotation Tool for PDF Documents
Chemical Compounds Knowledge Visualization with Natural Language Processing and Linked Data
Shinnou, Hiroyuki All-words Word Sense Disambiguation Using Concept Embeddings
Shirai, Kiyoaki JAIST Annotated Corpus of Free Conversation
Shkadzko, Pavel Rollenwechsel-English: a large-scale semantic role corpus
Shnarch, Eyal Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
Shore, Todd KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue
FARMI: A FrAmework for Recording Multi-Modal Interactions
Shrivastava, Manish Humor Detection in English-Hindi Code-Mixed Social Media Content: Corpus and Baseline System
Si, Yuqi A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation
Sibille, Jean Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Sicard, Etienne Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Sidler-Miserez, Sandra SMILE Swiss German Sign Language Dataset
Sidorov, Maxim Contextual Dependencies in Time-Continuous Multidimensional Affect Recognition
Silfverberg, Miikka A Computational Architecture for the Morphology of Upper Tanana
Silva, João Semantic Equivalence Detection: Are Interrogatives Harder than Declaratives?
Browsing and Supporting Pluricentric Global Wordnet, or just your Wordnet of Interest
Silva, Barbara Building a Corpus for Personality-dependent Natural Language Understanding and Generation
Silva, Vivian Building a Knowledge Graph from Natural Language Definitions for Interpretable Text Entailment Recognition
Simon, Eszter Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
E-magyar -- A Digital Language Processing System
Simonnet, Edwin Simulating ASR errors for training SLU systems
Simonyi, András What's Wrong, Python? -- A Visual Differ and Graph Library for NLP in Python
Simperl, Elena T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Sitaram, Sunayana Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach
Skadiņš, Raivis Tilde MT Platform for Developing Client Specific MT Solutions
Skantze, Gabriel A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue
Skorkovska, Lucie Towards Processing of the Oral History Interviews and Related Printed Documents
Skowron, Marcin Academic-Industrial Perspective on the Development and Deployment of a Moderation System for a Newspaper Website
Sliwa, Alfred Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Sliz-Nagy, Alex SzegedKoref: A Hungarian Coreference Corpus
Slonim, Noam A Recorded Debating Dataset
Semantic Relatedness of Wikipedia Concepts -- Benchmark Data and a Working Solution
SLIDE - a Sentiment Lexicon of Common Idioms
Sloos, Marjoleine The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study
Sluyter-Gäthje, Henny FooTweets: A Bilingual Parallel Corpus of World Cup Tweets
Smaili, Kamel An Automatic Learning of an Algerian Dialect Lexicon by using Multilingual Word Embeddings
Smal, Lilli European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Smither, Albry Exploring Conversational Language Generation for Rich Content about Hotels
Soares, Felipe A Large Parallel Corpus of Full-Text Scientific Articles
Sobrevilla Cabezudo, Marco Antonio Corpus Building and Evaluation of Aspect-based Opinion Summaries from Tweets in Spanish
WordNet-Shp: Towards the Building of a Lexical Database for a Peruvian Minority Language
ChAnot: An Intelligent Annotation Tool for Indigenous and Highly Agglutinative Languages in Peru
Sodimana, Keshan Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Solberg, Per Erik The LIA Treebank of Spoken Norwegian Dialects
Solla Portela, Miguel Anxo Developing New Linguistic Resources and Tools for the Galician Language
Solorio, Thamar MPST: A Corpus of Movie Plot Synopses with Tags
Song, Yan Constructing a Chinese Medical Conversation Corpus Annotated with Conversational Structures and Actions
Song, Yangqiu CogCompNLP: Your Swiss Army Knife for NLP
Song, Zhiyi Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers
Laying the Groundwork for Knowledge Base Population: Nine Years of Linguistic Resources for TAC KBP
Soria, Claudia The DLDP Survey on Digital Use and Usability of EU Regional and Minority Languages
Sosoni, Vilelmini Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Soto, Victor Collecting Code-Switched Data from Social Media
Soutner, Daniel Towards Processing of the Oral History Interviews and Related Printed Documents
Souza, Leonardo Indra: A Word Embedding and Semantic Relatedness Server
Specia, Lucia Text Simplification from Professionally Produced Corpora
SimPA: A Sentence-Level Simplification Corpus for the Public Administration Domain
Multimodal Lexical Translation
Speranza, Manuela KRAUTS: A German Temporally Annotated News Corpus
Sperber, Matthias KIT-Multi: A Translation-Oriented Multilingual Embedding Corpus
Spiliotopoulos, Dimitris The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Spillane, Brendan The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Srikumar, Vivek CogCompNLP: Your Swiss Army Knife for NLP
Srivastava, Brij Mohan Lal Phonetically Balanced Code-Mixed Speech Corpus for Hindi-English Automatic Speech Recognition
Stadsnes, Cathrine NoReC: The Norwegian Review Corpus
Stadtschnitzer, Michael Data-Driven Pronunciation Modeling of Swiss German Dialectal Speech for Automatic Speech Recognition
Stankovic, John A Corpus of Drug Usage Guidelines Annotated with Type of Advice
Stankovic, Ranka Using English Baits to Catch Serbian Multi-Word Terminology
Staron, Tobias Incorporating Contextual Information for Language-Independent, Dynamic Disambiguation Tasks
Stasimioti, Maria Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Stede, Manfred Developing the Bangla RST Discourse Treebank
A Multi-layer Annotated Corpus of Argumentative Text: From Argument Schemes to Discourse Relations
A Lexicon of Discourse Markers for Portuguese – LDM-PT
Steding, David Lexical and Semantic Features for Cross-lingual Text Reuse Classification: an Experiment in English and Latin Paraphrases
Stehwien, Sabrina German Radio Interviews: The GRAIN Release of the SFB732 Silver Standard Collection
Steiblé, Lucie Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Pronunciation Dictionaries for the Alsatian Dialects to Analyze Spelling and Phonetic Variation
Steiner, Petra Building a Morphological Treebank for German from a Linguistic Database
Steiner, Ingmar A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks
Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform
Steingrímsson, Steinþór Risamálheild: A Very Large Icelandic Text Corpus
Stepanov, Daniela HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Stiegelmayr, Andreas Preparing Data from Psychotherapy for Natural Language Processing
Stoll, Stephanie SMILE Swiss German Sign Language Dataset
Stoll, Sabine Cross-linguistically Small World Networks are Ubiquitous in Child-directed Speech
Stoop, Wessel Signbank: Software to Support Web Based Dictionaries of Sign Language
Straka, Milan Diacritics Restoration Using Neural Networks
SumeCzech: Large Czech News-Based Summarization Dataset
Using Adversarial Examples in Natural Language Processing
Stranak, Pavel Bridging the LAPPS Grid and CLARIN
Stranisci, Marco An Italian Twitter Corpus of Hate Speech against Immigrants
Strassel, Stephanie From ‘Solved Problems’ to New Challenges: A Report on LDC Activities
Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers
Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI
Laying the Groundwork for Knowledge Base Population: Nine Years of Linguistic Resources for TAC KBP
VAST: A Corpus of Video Annotation for Speech Technologies
Straňák, Pavel Diacritics Restoration Using Neural Networks
Strohmaier, Markus ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Strube, Michael BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
Strzalkowski, Tomek Gaining and Losing Influence in Online Conversation
Strötgen, Jannik KRAUTS: A German Temporally Annotated News Corpus
Stueker, Sebastian A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Stüker, Sebastian BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Su, Ketong The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Suderman, Keith Mining Biomedical Publications With The LAPPS Grid
Bridging the LAPPS Grid and CLARIN
Sugano, Yusuke A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks
Sugisaki, Kyoko Building a Corpus from Handwritten Picture Postcards: Transcription, Annotation and Part-of-Speech Tagging
Sugiyama, Hiroaki Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Sugiyama, Kyoshiro Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags
Suhara, Yoshihiko HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Sui, Zhifang Revisiting Distant Supervision for Relation Extraction
EventWiki: A Knowledge Base of Major Events
Sukhareva, Maria Analyzing Middle High German Syntax with RDF and SPARQL
Sullivan, Florence Automatic Labeling of Problem-Solving Dialogues for Computational Microgenetic Learning Analytics
Sumalvico, Maciej Corpora of Typical Sentences
Sumita, Eiichiro Multilingual Parallel Corpus for Global Communication Plan
Sun, Xuetong Analyzing the Quality of Counseling Conversations: the Tell-Tale Signs of High-quality Counseling
Sun, Yuqi Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense
Surdeanu, Mihai Grounding Gradable Adjectives through Crowdsourcing
Bootstrapping Polar-Opposite Emotion Dimensions from Online Reviews
Text Annotation Graphs: Annotating Complex Natural Language Phenomena
Suwaileh, Reem DART: A Large Dataset of Dialectal Arabic Tweets
Suzuki, Yu Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing
Suzuki, Rui All-words Word Sense Disambiguation Using Concept Embeddings
Svoboda, Lukas Evaluation of Croatian Word Embeddings
Swami, Sahil Humor Detection in English-Hindi Code-Mixed Social Media Content: Corpus and Baseline System
Sylak-Glassman, John UniMorph 2.0: Universal Morphology
Szolovits, Peter Transfer Learning for Named-Entity Recognition with Neural Networks
Søgaard, Anders A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier

 

T
Tafreshi, Shabnam Sentence and Clause Level Emotion Annotation, Detection, and Classification in a Multi-Genre Corpus
Tahara, Takuji A Japanese Corpus for Analyzing Customer Loyalty Information
Tahmasebi, Nina SenSALDO: Creating a Sentiment Lexicon for Swedish
Generating a Gold Standard for a Swedish Sentiment Lexicon
Tahon, Marie SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis.
Taji, Dima CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages
Takada, Shohei CEFR-based Lexical Simplification Dataset
Takahashi, Tetsuro Analysis of Implicit Conditions in Database Search Dialogues
Takamichi, Shinnosuke CPJD Corpus: Crowdsourced Parallel Speech Corpus of Japanese Dialects
Takaoka, Kazuma Sudachi: a Japanese Tokenizer for Business
Takoulidou, Eirini Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Tamburini, Fabio PoSTWITA-UD: an Italian Twitter Treebank in Universal Dependencies
Tan, Wang-Chiew HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Tanaka, Yayoi Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Tanaka, Takaaki Universal Dependencies Version 2 for Japanese
Tanaka, Hiroki Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags
Tanaka, Kazunari Chemical Compounds Knowledge Visualization with Natural Language Processing and Linked Data
Tandler, Raphaël Strategies and Challenges for Crowdsourcing Regional Dialect Perception Data for Swiss German and Swiss French
Tanguy, Ludovic Extending the gold standard for a lexical substitution task: is it worth it?
Tanti, Marc Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Tarvainen, Liisa Lotta Combining Concepts and Their Translations from Structured Dictionaries of Uralic Minority Languages
Tauchmann, Christopher Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data
Tavosanis, Mirko The ICoN Corpus of Academic Written Italian (L1 and L2)
Teisseire, Maguelonne Automatic Identification of Research Fields in Scientific Papers
Teja, Kasi Sai Phonetically Balanced Code-Mixed Speech Corpus for Hindi-English Automatic Speech Recognition
Tellier, Isabelle ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
Temnikova, Irina The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic
Tennage, Pasindu Handling Rare Word Problem using Synthetic Training Data for Sinhala and Tamil Neural Machine Translation
Tenorio, Juanjosé Corpus Building and Evaluation of Aspect-based Opinion Summaries from Tweets in Spanish
Teruel, Milagro Increasing Argument Annotation Reproducibility by Using Inter-annotator Agreement to Improve Guidelines
Teslenko, Denis An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
Tezcan, Arda A fine-grained error analysis of NMT, SMT and RBMT output for English-to-Dutch
Thater, Stefan MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
Mapping Texts to Scripts: An Entailment Study
Thayaparan, Mokanarangan Graph Based Semi-Supervised Learning Approach for Tamil POS tagging
Thayasivam, Uthayasanker Graph Based Semi-Supervised Learning Approach for Tamil POS tagging
Theivendiram, Pranavan Improving domain-specific SMT for low-resourced languages using data from different domains
Thiemann, Alexander A High-Quality Gold Standard for Citation-based Tasks
Thilakarathne, Malith Handling Rare Word Problem using Synthetic Training Data for Sinhala and Tamil Neural Machine Translation
Thomas, Samuel A Recorded Debating Dataset
Thomas, Philippe A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events
Tiedemann, Jörg OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora
Tihelka, Daniel Design and Development of Speech Corpora for Air Traffic Control Training
Tily, Harry J. The Natural Stories Corpus
Tissi, Katja SMILE Swiss German Sign Language Dataset
Tiwary, Swati TAP-DLND 1.0: A Corpus for Document Level Novelty Detection
Tjalve, Michael Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach
Tokunaga, Takenobu Analysis of Implicit Conditions in Database Search Dialogues
Tomashenko, Natalia Evaluation of Feature-Space Speaker Adaptation for End-to-End Acoustic Models
Tomimasu, Sayaka Collection of Multimodal Dialog Data and Analysis of the Result of Annotation of Users' Interest Level
Tomita, Junji Predicting Nods by using Dialogue Acts in Dialogue
Creating Large-Scale Argumentation Structures for Dialogue Systems
Tonneau, Jean-Philippe Automatic Identification of Research Fields in Scientific Papers
Tonon, Alberto Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Torisawa, Kentaro Annotating Zero Anaphora for Question Answering
Tornay, Sandrine SMILE Swiss German Sign Language Dataset
Torres-Moreno, Juan-Manuel A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task
Touileb, Samia NoReC: The Norwegian Review Corpus
Tracey, Jennifer Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI
Laying the Groundwork for Knowledge Base Population: Nine Years of Linguistic Resources for TAC KBP
VAST: A Corpus of Video Annotation for Speech Technologies
Tratz, Stephen A Web-based System for Crowd-in-the-Loop Dependency Treebanking
Traum, David Dialogue Structure Annotation for Multi-Floor Interaction
The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Identification of Personal Information Shared in Chat-Oriented Dialogue
Trippel, Thorsten Lessons Learned: On the Challenges of Migrating a Research Data Repository from a Research Institution to a University Library.
Troncy, Raphael Sanaphor++: Combining Deep Neural Networks with Semantics for Coreference Resolution
Trosterud, Trond Modeling Northern Haida Verb Morphology
Building a Constraint Grammar Parser for Plains Cree Verbs and Arguments
Trzos, Michal A Real-life, French-accented Corpus of Air Traffic Control Communications
Tsai, Chen-Tse CogCompNLP: Your Swiss Army Knife for NLP
Tsarfaty, Reut CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Tseng, Michael Community-Driven Crowdsourcing: Data Collection with Local Developers
Tseng, Yu-Hsiang Fluid Annotation: A Granularity-aware Annotation Tool for Chinese Word Fluidity
Tseng, Yuen-Hsien Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis
Tsuchiya, Masatoshi Performance Impact Caused by Hidden Bias of Training Data for Recognizing Textual Entailment
Tsujii, Jun'ichi SPADE: Evaluation Dataset for Monolingual Phrase Alignment
Tsvetkov, Yulia RtGender: A Corpus for Studying Differential Responses to Gender
Tufis, Dan BioRo: The Biomedical Corpus for the Romanian Language
Tufiș, Dan A Bird’s-eye View of Language Processing Projects at the Romanian Academy
The Reference Corpus of the Contemporary Romanian Language (CoRoLa)
Tuggener, Don SB-CH: A Swiss German Corpus with Sentiment Annotations
Tulkens, Stephan WordKit: a Python Package for Orthographic and Phonological Featurization
Turchi, Marco ESCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing
Turmo, Jordi Coreference Resolution in FreeLing 4.0
Turner, Steve A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
Tyers, Francis Finite-state morphological analysis for Gagauz

 

U
Uchida, Yoshitaka Sudachi: a Japanese Tokenizer for Business
Uchida, Satoru CEFR-based Lexical Simplification Dataset
Uchimoto, Kiyotaka Extending Search System based on Interactive Visualization for Speech Corpora
Uematsu, Sumire Universal Dependencies Version 2 for Japanese
Ulinski, Morgan Evaluating the WordsEye Text-to-Scene System: Imaginative and Realistic Sentences
Ultes, Stefan What Causes the Differences in Communication Styles? A Multicultural Study on Directness and Elaborateness
Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room
On the Vector Representation of Utterances in Dialogue Context
Upadhyay, Shyam CogCompNLP: Your Swiss Army Knife for NLP
Uresova, Zdenka Creating a Verb Synonym Lexicon Based on a Parallel Corpus
Tools for Building an Interlinked Synonym Lexicon Network
Uslu, Tolga FastSense: An Efficient Word Sense Disambiguation Classifier
Ustalov, Dmitry An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
Improving Hypernymy Extraction with Distributional Semantic Classes
Usuda, Yasuyuki Construction of the Corpus of Everyday Japanese Conversation: An Interim Report
Uszkoreit, Hans TQ-AutoTest – An Automated Test Suite for (Machine) Translation Quality

 

V
Vaheb, Amir MirasVoice: A bilingual (English-Persian) speech corpus
MirasText: An Automatically Generated Text Corpus for Persian
Valenzuela-Escarcega, Marco A. Text Annotation Graphs: Annotating Complex Natural Language Phenomena
Van Attveldt, Wouter Studying Muslim Stereotyping through Microportrait Extraction
Van Brussel, Laura A fine-grained error analysis of NMT, SMT and RBMT output for English-to-Dutch
Van Esch, Daan Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
Van Genabith, Josef European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Van Hout, Roeland A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Van Koppen, Marjo Linguistic and Sociolinguistic Annotation of 17th Century Dutch Letters
The AnnCor CHILDES Treebank
Van Noord, Gertjan A Taxonomy for In-depth Evaluation of Normalization for User Generated Content
Van Noord, Rik Evaluating Scoped Meaning Representations
A Taxonomy for In-depth Evaluation of Normalization for User Generated Content
Van Son, Chantal Resource Interoperability for Sustainable Benchmarking: The Case of Events
Van Uytvanck, Dieter CLARIN: Towards FAIR and Responsible Data Science Using Language Resources
Van Waterschoot, Jelte An Information-Providing Closed-Domain Human-Agent Interaction Corpus
Van Zaanen, Menno Improving Machine Translation of Educational Content via Crowdsourcing
A Multilingual Wikified Data Set of Educational Material
Translation Crowdsourcing: Creating a Multilingual Corpus of Online Educational Content
Van den Bosch, Antal Discovering the Language of Wine Reviews: A Text Mining Account
A Multilingual Wikified Data Set of Educational Material
Van den Heuvel, Henk A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Metadata Collection Records for Language Resources
Van der Goot, Rob A Taxonomy for In-depth Evaluation of Normalization for User Generated Content
Van der Klis, Martijn The AnnCor CHILDES Treebank
Van der Plas, Lonneke Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Van der Sijs, Nicoline A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Van der Veen, Remco The AnnCor CHILDES Treebank
Van der Wees, Marlies Evaluation of Machine Translation Performance Across Multiple Genres and Languages
Van der westhuizen, Ewald A First South African Corpus of Multilingual Code-switched Soap Opera Speech
Van hamme, Hugo TF-LM: TensorFlow-based Language Modeling Toolkit
Variš, Dušan Improving a Neural-based Tagger for Multiword Expressions Identification
Varma, Vasudeva A Workbench for Rapid Generation of Cross-Lingual Summaries
Vasiļjevs, Andrejs Collecting Language Resources from Public Administrations in the Nordic and Baltic Countries
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Tilde MT Platform for Developing Client Specific MT Solutions
Velardi, Paola A Large Multilingual and Multi-domain Dataset for Recommender Systems
Velldal, Erik NoReC: The Norwegian Review Corpus
Vempala, Alakananda Annotating Temporally-Anchored Spatial Knowledge by Leveraging Syntactic Dependencies
Annotating If the Authors of a Tweet are Located at the Locations They Tweet About
Venezian, Elad A Recorded Debating Dataset
Venturi, Giulia Universal Dependencies and Quantitative Typological Trends. A Case Study on Word Order
Vergez-Couret, Marianne Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard
Verhagen, Marc Bridging the LAPPS Grid and CLARIN
Verkerk, Annemarie BDPROTO: A Database of Phonological Inventories from Ancient and Reconstructed Languages
Vernier, Frédéric A Speaking Atlas of the Regional Languages of France
Verspoor, Karin Parallel Corpora for the Biomedical Domain
Verwimp, Lyan TF-LM: TensorFlow-based Language Modeling Toolkit
Vezzani, Federica TriMED: A Multilingual Terminological Database
Vial, Loïc UFSAC: Unification of Sense Annotated Corpora and Tools
Viard-Gaudin, Christian Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Vidal, Gaëlle SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis.
Vieira, Renata BlogSet-BR: A Brazilian Portuguese Blog Corpus
Villaneau, Jeanne Complex and Precise Movie and Book Annotations in French Language for Aspect Based Sentiment Analysis
Villata, Serena Increasing Argument Annotation Reproducibility by Using Inter-annotator Agreement to Improve Guidelines
Villavicencio, Aline The brWaC Corpus: A New Open Resource for Brazilian Portuguese
Villayandre-Llamazares, Milka A Lexical Tool for Academic Writing in Spanish based on Expert and Novice Corpora
Vincze, Veronika SzegedKoref: A Hungarian Coreference Corpus
E-magyar -- A Digital Language Processing System
Vishnevetsky, Anastasia The Natural Stories Corpus
Visser, Jacky Intertextual Correspondence for Integrating Corpora
Vlachostergiou, Aggeliki Unfolding the External Behavior and Inner Affective State of Teammates through Ensemble Learning: Experimental Evidence from a Dyadic Team Corpus
Vo, Ngoc Phuoc An A Large Resource of Patterns for Verbal Paraphrases
QUEST: A Natural Language Interface to Relational Databases
Vodrahalli, Kiran A Large Self-Annotated Corpus for Sarcasm
Vogel, Stephan The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic
Vogel, Carl A Diachronic Corpus for Literary Style Analysis
Modeling Collaborative Multimodal Behavior in Group Dialogues: The MULTISIMO Corpus
Chats and Chunks: Annotation and Analysis of Multiparty Long Casual Conversations
Speech Rate Calculations with Short Utterances: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task
Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus
Vogiatzis, George Neural Caption Generation for News Images
Voigt, Rob RtGender: A Corpus for Studying Differential Responses to Gender
Vollgraf, Roland ZAP: An Open-Source Multilingual Annotation Projection Framework
FEIDEGGER: A Multi-modal Corpus of Fashion Images and Descriptions in German
Volpe Nunes, Maria das Graças Building a Sentiment Corpus of Tweets in Brazilian Portuguese
Von Däniken, Pius SB-CH: A Swiss German Corpus with Sentiment Annotations
Voss, Clare Dialogue Structure Annotation for Multi-Floor Interaction
Vossen, Piek The Circumstantial Event Ontology (CEO) and ECB+/CEO: an Ontology and Corpus for Implicit Causal Relations between Events
Don't Annotate, but Validate: a Data-to-Text Method for Capturing Event Data
Resource Interoperability for Sustainable Benchmarking: The Case of Events
Vougiouklis, Pavlos T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Vu, Thanh A Fast and Accurate Vietnamese Word Segmenter
Vu, Manh Chien PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Vulić, Ivan Acquiring Verb Classes Through Bottom-Up Semantic Verb Clustering
Vyas, Nidhi Creating a Translation Matrix of the Bible’s Names Across 591 Languages
Vylomova, Ekaterina UniMorph 2.0: Universal Morphology
Váradi, Tamás Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
E-magyar -- A Digital Language Processing System

 

W
Wade, Vincent The ADELE Corpus of Dyadic Social Text Conversations: Dialog Act Annotation with ISO 24617-2
Wagner Filho, Jorge Alberto The brWaC Corpus: A New Open Resource for Brazilian Portuguese
Waibel, Alex BULBasaa: A Bilingual Basaa-French Speech Corpus for the Evaluation of Language Documentation Tools
Automated Evaluation of Out-of-Context Errors
Waibel, Alexander KIT-Multi: A Translation-Oriented Multilingual Embedding Corpus
Wainwright, Elizabeth WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-hop Inference
Wakamiya, Shoko J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Walker, Marilyn Exploring Conversational Language Generation for Rich Content about Hotels
SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Walther, Géraldine UniMorph 2.0: Universal Morphology
Wambacq, Patrick TF-LM: TensorFlow-based Language Modeling Toolkit
Wan, Ada Visualizing the "Dictionary of Regionalisms of France" (DRF)
Tel(s)-Telle(s)-Signs: Highly Accurate Automatic Crosslingual Hypernym Discovery
Wang, Longyue Chinese-Portuguese Machine Translation: A Study on Building Parallel Corpora from Comparable Texts
Wang, Lei A Neural Network Based Model for Loanword Identification in Uyghur
Wang, Mingwen Building Parallel Monolingual Gan Chinese Dialects Corpus
Wang, Yunli EuroGames16: Evaluating Change Detection in Online Conversation
Wang, Yuchen Analyzing the Quality of Counseling Conversations: the Tell-Tale Signs of High-quality Counseling
Wang, Yiou A Japanese Corpus for Analyzing Customer Loyalty Information
Wang, Tengjiao The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Wang, Gengyu Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Wang, Qiuyue ScholarGraph: a Chinese Knowledge Graph of Chinese Scholars
Wang, Tong Annotating High-Level Structures of Short Stories and Personal Anecdotes
Wang, Xihao Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense
Wang, Nan Constructing a Chinese Medical Conversation Corpus Annotated with Conversational Structures and Actions
Wang, Chenglong NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System
Wang, Shijin Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Wang, Shuo ScholarGraph: a Chinese Knowledge Graph of Chinese Scholars
Wanner, Leo Generation of a Spanish Artificial Collocation Error Corpus
Compilation of Corpora for the Study of the Information Structure–Prosody Interface
Warner, Andrew PyrEval: An Automated Method for Summary Content Analysis
Washio, Koki Undersampling Improves Hypernymy Prototypicality Learning
Watkins, Gareth Towards a Welsh Semantic Annotation System
Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh
Wattanavekin, Theeraphol Voice Builder: A Tool for Building Text-To-Speech Voices
Wawer, Aleksander The Linguistic Category Model in Polish (LCM-PL)
Way, Andy FooTweets: A Bilingual Parallel Corpus of World Cup Tweets
Webber, Bonnie Evaluating Machine Translation Performance on Chinese Idioms with a Blacklist Method
NegPar: A parallel corpus annotated for negation
Weber, Cornelius A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks
Wei, Bingzhen Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Wei, Furu EventWiki: A Knowledge Base of Major Events
Weichselbraun, Albert Framing Named Entity Linking Error Types
Weischedel, Ralph When ACE met KBP: End-to-End Evaluation of Knowledge Base Population with Component-level Annotation
Weiss, Benjamin The Nautilus Speaker Characterization Corpus: Speech Recordings and Labels of Speaker Characteristics and Voice Descriptions
Welch, Charles World Knowledge for Abstract Meaning Representation Parsing
Wen, Ji Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Wendemuth, Andreas Recognizing Behavioral Factors while Driving: A Real-World Multimodal Corpus to Monitor the Driver’s Affective State
Wermter, Stefan A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks
Wessling, Jan Towards an Automatic Assessment of Crowdsourced Data for NLU
Wibawa, Jaka Aris Eko Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Wichers Schreur, Jesse Universal Morphologies for the Caucasus region
Wickes, Matthew Low-resource Post Processing of Noisy OCR Output for Historical Corpus Digitisation
Wiedemann, Gregor Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features
ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Wiedmer, Nicolas Building a Corpus from Handwritten Picture Postcards: Transcription, Annotation and Part-of-Speech Tagging
Wiegand, Michael Introducing a Lexicon of Verbal Polarity Shifters for English
Disambiguation of Verbal Shifters
Wieting, John CogCompNLP: Your Swiss Army Knife for NLP
Wilkens, Rodrigo SW4ALL: a CEFR Classified and Aligned Corpus for Language Learning
The brWaC Corpus: A New Open Resource for Brazilian Portuguese
An SLA Corpus Annotated with Pedagogically Relevant Grammatical Structures
Wirzberger, Maria CoLoSS: Cognitive Load Corpus with Speech and Performance Data from a Symbol-Digit Dual-Task
Wirén, Mats Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Wisniewski, Guillaume Errator: a Tool to Help Detect Annotation Errors in the Universal Dependencies Project
Witt, Andreas The German Reference Corpus DeReKo: New Developments – New Opportunities
Introducing the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation
Witte, René The LODeXporter: Flexible Generation of Linked Open Data Triples from NLP Frameworks for Automatic Knowledge Base Construction
Woisard, Virginie Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer
Wojatzki, Michael Quantifying Qualitative Data for Understanding Controversial Issues
Woliński, Marcin A New Version of the Składnica Treebank of Polish Harmonised with the Walenty Valency Dictionary
Manually Annotated Corpus of Polish Texts Published between 1830 and 1918
Woloszyn, Vinicius BlogSet-BR: A Brazilian Portuguese Blog Corpus
Wong, Kam-Fai The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Wong, Shun-han Rebekah Using a Corpus of English and Chinese Political Speeches for Metaphor Analysis
Wonsever, Dina Spanish HPSG Treebank based on the AnCora Corpus
Wood, Ian A Comparison Of Emotion Annotation Schemes And A New Annotated Data Set
Wray, Samantha Classification of Closely Related Sub-dialects of Arabic Using Support-Vector Machines
Wright, Jonathan From ‘Solved Problems’ to New Challenges: A Report on LDC Activities
Introducing NIEUW: Novel Incentives and Workflows for Eliciting Linguistic Data
Wróblewska, Alina Polish Corpus of Annotated Descriptions of Images
Wu, Winston Creating a Translation Matrix of the Bible’s Names Across 591 Languages
Creating Large-Scale Multilingual Cognate Tables
A Comparative Study of Extremely Low-Resource Transliteration of the World’s Languages
Massively Translingual Compound Analysis and Translation Discovery
Wu, Jiangqin A Pragmatic Approach for Classical Chinese Word Segmentation
Wu, Jiaqi SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Wyner, Adam An Annotation Language for Semantic Search of Legal Sources

 

X
Xexéo, Geraldo RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Xia, Patrick UniMorph 2.0: Universal Morphology
Xia, Jingbo Three Dimensions of Reproducibility in Natural Language Processing
Xia, Fei Constructing a Chinese Medical Conversation Corpus Annotated with Conversational Structures and Actions
PDF-to-Text Reanalysis for Linguistic Data Mining
Xiang, Jun The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Xu, Fan Building Parallel Monolingual Gan Chinese Dialects Corpus
Xu, Sun A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction
Xu, Sheng Building a Macro Chinese Discourse Treebank
Xu, Kun QUEST: A Natural Language Interface to Relational Databases
Xu, Hongzhi Annotating Chinese Light Verb Constructions according to PARSEME guidelines
Xu, Ruifeng The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Xu, Yinzhan HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Xue, Nianwen Structured Interpretation of Temporal Relations

 

Y
Y. Song, Sung WikiDragon: A Java Framework For Diachronic Content And Network Analysis Of MediaWikis
Yadav, Shweta Medical Sentiment Analysis using Social Media: Towards building a Patient Assisted System
Yamada, Masaru Literality and cognitive effort: Japanese and Spanish
Yamaguchi, Masaya Development of a Mobile Observation Support System for Students: FishWatchr Mini
Yamamoto, Hajime Building A Handwritten Cuneiform Character Imageset
Yamamoto, Kazuhide Simplified Corpus with Core Vocabulary
Crowdsourced Corpus of Sentence Simplification with Core Vocabulary
Yamamura, Takashi Annotation and Analysis of Extractive Summaries for the Kyutech Corpus
Yamauchi, Kenji Building A Handwritten Cuneiform Character Imageset
Yamazaki, Makoto Annotation and Quantitative Analysis of Speaker Information in Novel Conversation Sentences in Japanese
Yan, Yonghong Discriminating between Similar Languages on Imbalanced Conversational Texts
Yanagida, Naomi Development of a Mobile Observation Support System for Students: FishWatchr Mini
Yancey, Kevin Korean L2 Vocabulary Prediction: Can a Large Annotated Corpus be Used to Train Better Models for Predicting Unknown Words?
Yang, Tsung-Han Transfer of Frames from English FrameNet to Construct Chinese FrameNet: A Bilingual Corpus-Based Approach
Yang, YaoSheng M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains
Yang, Yating A Neural Network Based Model for Loanword Identification in Uyghur
Yangarber, Roman Revita: a Language-learning Platform at the Intersection of ITS and CALL
Yarowsky, David Creating a Translation Matrix of the Bible’s Names Across 591 Languages
UniMorph 2.0: Universal Morphology
Creating Large-Scale Multilingual Cognate Tables
A Comparative Study of Extremely Low-Resource Transliteration of the World’s Languages
Massively Translingual Compound Analysis and Translation Discovery
Yatsu, Motoki Comparison of Pun Detection Methods Using Japanese Pun Corpus
Yelle, Julie Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
Yen, An-Zi Transfer of Frames from English FrameNet to Construct Chinese FrameNet: A Bilingual Corpus-Based Approach
Yeo, Hangu QUEST: A Natural Language Interface to Relational Databases
Yeo, Jinyoung Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Yokono, Hikaru Analysis of Implicit Conditions in Database Search Dialogues
Yokota, Masashi Augmenting Image Question Answering Dataset by Exploiting Image Captions
Yoo, Hiyon CBFC: a parallel L2 speech corpus for Korean and French learners
Yoon, Kyoungho Semi-supervised Training Data Generation for Multilingual Question Answering
Yoshida, Minoru Visualization of the occurrence trend of infectious diseases using Twitter
Yoshikawa, Yuichiro Creating Large-Scale Argumentation Structures for Dialogue Systems
Yoshino, Koichiro Dialogue Scenario Collection of Persuasive Dialogue with Emotional Expressions via Crowdsourcing
Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags
Yu, Xiaodong CogCompNLP: Your Swiss Army Knife for NLP
Yu, Xiaoyan The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Yu, Yang Improving Unsupervised Keyphrase Extraction using Background Knowledge
Yu, Shi Sign Languages and the Online World Online Dictionaries & Lexicostatistics
Yuan, Yu Investigating the Influence of Bilingual MWU on Trainee Translation Quality
Cross-lingual Terminology Extraction for Translation Quality Estimation
Yvon, François A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments

 

Z
Zaenen, Annie Integrating Generative Lexicon Event Structures into VerbNet
Zafarian, Atefeh Parsivar: A Language Processing Toolkit for Persian
Zaghouani, Wajdi The MADAR Arabic Dialect Corpus and Lexicon
Unified Guidelines and Resources for Arabic Dialect Orthography
Arap-Tweet: A Large Multi-Dialect Twitter Corpus for Gender, Age and Language Variety Identification
MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction
Zahner, Katharina The Distribution and Prosodic Realization of Verb Forms in German Infant-Directed Speech
Zajic, Zbynek Towards Processing of the Oral History Interviews and Related Printed Documents
Zalmout, Nasser Unified Guidelines and Resources for Arabic Dialect Orthography
Zampieri, Marcos LIdioms: A Multilingual Linked Idioms Data Set
RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Zanon Boito, Marcely A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Zanussi, Zachary EuroGames16: Evaluating Change Detection in Online Conversation
Zare Borzeshi, Ehsan BiLSTM-CRF for Persian Named-Entity Recognition ArmanPersoNERCorpus: the First Entity-Annotated Persian Dataset
English-Basque Statistical and Neural Machine Translation
Zarrouk, Manel The SSIX Corpora: Three Gold Standard Corpora for Sentiment Analysis in English, Spanish and German Financial Microblogs
SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages
Zeman, Daniel Parse Me if You Can: Artificial Treebanks for Parsing Experiments on Elliptical Constructions
Zeng, Huiheng Using a Corpus of English and Chinese Political Speeches for Metaphor Analysis
Zesch, Torsten Quantifying Qualitative Data for Understanding Controversial Issues
ESCRITO - An NLP-Enhanced Educational Scoring Toolkit
DeepTC – An Extension of DKPro Text Classification for Fostering Reproducibility of Deep Learning Experiments
Zettlemoyer, Luke NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System
Zeyrek, Deniz Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank
An Assessment of Explicit Inter- and Intra-sentential Discourse Connectives in Turkish Discourse Bank
Zhan, Weidong Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Zhang, Yue Cross-lingual Terminology Extraction for Translation Quality Estimation
Zhang, Yi A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction
Zhang, Jiajun Exploiting Pre-Ordering for Neural Machine Translation
One Sentence One Model for Neural Machine Translation
Zhang, Zhiyuan Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Zhang, Yuchen Structured Interpretation of Temporal Relations
Zhang, Yifan Experiments with Convolutional Neural Networks for Multi-Label Authorship Attribution
Zhang, Min M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains
Zhang, Yan Discriminating between Similar Languages on Imbalanced Conversational Texts
Zhang, Boliang Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Zhang, Linrui Chinese Relation Classification using Long Short Term Memory Networks
Zhao, Xuemin Discriminating between Similar Languages on Imbalanced Conversational Texts
Zhao, Yang Exploiting Pre-Ordering for Neural Machine Translation
Zhou, Xi A Neural Network Based Model for Loanword Identification in Uyghur
Zhou, Ming EventWiki: A Knowledge Base of Major Events
Zhou, Hao Dynamic Oracle for Neural Machine Translation in Decoding Phase
Zhou, Ben CogCompNLP: Your Swiss Army Knife for NLP
Zhu, Qiaoming Building a Macro Chinese Discourse Treebank
Ziad, Housam Teanga: A Linked Data based platform for Natural Language Processing
Zielinski, Andrea Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications
Zilio, Leonardo SW4ALL: a CEFR Classified and Aligned Corpus for Language Learning
An SLA Corpus Annotated with Pedagogically Relevant Grammatical Structures
Zillich, Michael Action Verb Corpus
Zimmerman, Steven Improving Hate Speech Detection with Deep Learning Ensembles
Zinn, Claus Handling Big Data and Sensitive Data Using EUDAT's Generic Execution Framework and the WebLicht Workflow Engine
Lessons Learned: On the Challenges of Migrating a Research Data Repository from a Research Institution to a University Library
Zitzelsberger, Thomas Evaluation of Automatic Formant Trackers
Ziyaei, Seyedeh Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic
Ziółko, Bartosz An Application for Building a Polish Telephone Speech Corpus
Zlabinger, Markus Medical Entity Corpus with PICO elements and Sentiment Analysis
Znotins, Arturs Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Zong, Chengqing Exploiting Pre-Ordering for Neural Machine Translation
One Sentence One Model for Neural Machine Translation
Zopf, Markus Auto-hMDS: Automatic Construction of a Large Heterogeneous Multilingual Multi-Document Summarization Corpus
Zweigenbaum, Pierre Combining rule-based and embedding-based approaches to normalize textual entities with an ontology
Three Dimensions of Reproducibility in Natural Language Processing
Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat
A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora

 

Å
Åstrand, Oliver The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions

 

Ç
Çetinoğlu, Özlem CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing
Çöltekin, Çağrı CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing

 

Ö
Östling, Robert Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction

 

Ø
Øvrelid, Lilja Evaluation of Domain-specific Word Embeddings using Knowledge Resources
The LIA Treebank of Spoken Norwegian Dialects
NoReC: The Norwegian Review Corpus

 

Š
Šandrih, Branislava Using English Baits to Catch Serbian Multi-Word Terminology
Šics, Valters Tilde MT Platform for Developing Client Specific MT Solutions
Šmídl, Luboš Design and Development of Speech Corpora for Air Traffic Control Training
Štajner, Sanja A Detailed Evaluation of Neural Sequence-to-Sequence Models for In-domain and Cross-domain Text Simplification
CATS: A Tool for Customized Alignment of Text Simplification Corpora
Švec, Jan Design and Development of Speech Corpora for Air Traffic Control Training
Towards Processing of the Oral History Interviews and Related Printed Documents