LREC 2016 Proceedings

TOPICS: Browse articles of the conference sorted by topic

A
Acquisition	VerbCROcean: A Repository of Fine-Grained Semantic Verb Relations for Croatian Trends in HLT Research: A Survey of LDC's Data Scholarship Program Using Word Embeddings to Translate Named Entities The ACQDIV Database: Min(d)ing the Ambient Language Building Tempo-HindiWordNet: A resource for effective temporal information access in Hindi Corpus for Children’s Writing with Enhanced Output for Specific Spelling Patterns (2nd and 3rd Grade) A Turkish Database for Psycholinguistic Studies Based on Frequency, Age of Acquisition, and Imageability Monitoring Disease Outbreak Events on the Web Using Text-mining Approach and Domain Expert Knowledge Domain-Specific Corpus Expansion with Focused Webcrawling Classifying Out-of-vocabulary Terms in a Domain-Specific Social Media Corpus A Database of Laryngeal High-Speed Videos with Simultaneous High-Quality Audio Recordings of Pathological and Non-Pathological Voices The COPLE2 corpus: a learner corpus for Portuguese A Language Independent Method for Generating Large Scale Polarity Lexicons LELIO: An Auto-Adaptative System to Acquire Domain Lexical Knowledge in Technical Texts CEPLEXicon ― A Lexicon of Child European Portuguese A Corpus of Read and Spontaneous Upper Saxon German Speech for ASR Evaluation European Union Language Resources in Sketch Engine Subtask Mining from Search Query Logs for How-Knowledge Acceleration Acquiring Opposition Relations among Italian Verb Senses using Crowdsourcing CirdoX: an on/off-line multisource speech and sound analysis software Creation of comparable corpora for English-{Urdu, Arabic, Persian} The SI TEDx-UM speech database: a new Slovenian Spoken Language Resource Multiword Expressions in Child Language An Extension of the Slovak Broadcast News Corpus based on Semi-Automatic Annotation A Corpus of Native, Non-native and Translated Texts First Steps Towards Coverage-Based Sentence Alignment
Anaphora, Coreference	Sieve-based Coreference Resolution in the Biomedical Domain ARRAU: Linguistically-Motivated Annotation of Anaphoric Descriptions Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel Cross-lingual RDF Thesauri Interlinking WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles Exploitation of Co-reference in Distributional Semantics Information structure in the Potsdam Commentary Corpus: Topics Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference. SpaceRef: A corpus of street-level geographic descriptions Summ-it++: an Enriched Version of the Summ-it Corpus Adapting an Entity Centric Model for Portuguese Coreference Resolution IMS HotCoref DE: A Data-driven Co-reference Resolver for German SciCorp: A Corpus of English Scientific Articles Annotated for Information Status Analysis Towards Multiple Antecedent Coreference Resolution in Specialized Discourse Coreference Annotation Scheme and Relation Types for Hindi Coreference in Prague Czech-English Dependency Treebank Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level
Authoring Tools	C-WEP―Rich Annotated Collection of Writing Errors by Professionals Applying Core Scientific Concepts to Context-Based Citation Recommendation ProphetMT: A Tree-based SMT-driven Controlled Language Authoring/Post-Editing Tool Error Typology and Remediation Strategies for Requirements Written in English by Non-Native Speakers

C
Cognitive Methods	VoxML: A Visualization Modeling Language Metonymy Analysis Using Associative Relations between Words A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults Cognitively Motivated Distributional Representations of Meaning English-to-Japanese Translation vs. Dictation vs. Post-editing: Comparing Translation Modes in a Multilingual Setting Multimodal Resources for Human-Robot Communication Modelling Finding Recurrent Features of Image Schema Gestures: the FIGURE corpus Coordinating Communication in the Wild: The Artwalk Dialogue Corpus of Pedestrian Navigation and Mobile Referential Communication Database of Mandarin Neighborhood Statistics Cohere: A Toolkit for Local Coherence
Collaborative Resource Construction	A Corpus of Wikipedia Discussions: Over the Years, with Topic, Power and Gender Labels Phonetic Inventory for an Arabic Speech Corpus A Multi-Layered Annotated Corpus of Scientific Papers Corpus Resources for Dispute Mediation Discourse New Inflectional Lexicons and Training Corpora for Improved Morphosyntactic Annotation of Croatian and Serbian A Tagged Corpus for Automatic Labeling of Disabilities in Medical Scientific Papers Introducing the Asian Language Treebank (ALT) Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015 Nederlab: Towards a Single Portal and Research Environment for Diachronic Dutch Text Corpora Building Language Resources for Exploring Autism Spectrum Disorders Staggered NLP-assisted refinement for Clinical Annotations of Chronic Disease Events Resources for building applications with Dependency Minimal Recursion Semantics Port4NooJ v3.0: Integrated Linguistic Resources for Portuguese NLP From Interoperable Annotations towards Interoperable Resources: A Multilingual Approach to the Analysis of Discourse UIMA-Based JCoRe 2.0 Goes GitHub and Maven Central ― State-of-the-Art Software Resource Engineering and Distribution of NLP Pipelines Building Evaluation Datasets for Consumer-Oriented Information Retrieval CLARIN-EL Web-based Annotation Tool EDISON: Feature Extraction for NLP, Simplified
Computer-Assisted Language Learning (CALL)	The Validation of MRCPD Cross-language Expansions on Imageability Ratings Unsupervised Ranked Cross-Lingual Lexical Substitution for Low-Resource Languages Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario SweLL on the rise: Swedish Learner Language corpus for European Reference Level studies SVALex: a CEFR-graded Lexical Resource for Swedish Foreign and Second Language Learners Detecting Word Usage Errors in Chinese Sentences for Learning Chinese as a Foreign Language Leveraging Native Data to Correct Preposition Errors in Learners' Dutch Chatbot Technology with Synthetic Voices in the Acquisition of an Endangered Language: Motivation, Development and Evaluation of a Platform for Irish A Shared Task for Spoken CALL? DALILA: The Dialectal Arabic Linguistic Learning Assistant Error Typology and Remediation Strategies for Requirements Written in English by Non-Native Speakers Joining-in-type Humanoid Robot Assisted Language Learning System
Controlled Languages	LELIO: An Auto-Adaptative System to Acquire Domain Lexical Knowledge in Technical Texts ProphetMT: A Tree-based SMT-driven Controlled Language Authoring/Post-Editing Tool Error Typology and Remediation Strategies for Requirements Written in English by Non-Native Speakers
Corpus (Creation, Annotation, etc.)	Endangered Language Documentation: Bootstrapping a Chatino Speech Corpus, Forced Aligner, ASR The PsyMine Corpus - A Corpus annotated with Psychiatric Disorders and their Etiological Factors Optimizing Computer-Assisted Transcription Quality with Iterative User Interfaces QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages An Interaction-Centric Dataset for Learning Automation Rules in Smart Homes C-WEP―Rich Annotated Collection of Writing Errors by Professionals The REAL Corpus: A Crowd-Sourced Corpus of Human Generated and Evaluated Spatial References to Real-World Urban Scenes Ecological Gestures for HRI: the GEE Corpus How to Address Smart Homes with a Social Robot? A Multi-modal Corpus of User Interactions with an Intelligent Environment “Who was Pietro Badoglio?” Towards a QA system for Italian History Croatian Error-Annotated Corpus of Non-Professional Written Language New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification An Annotated Corpus of Direct Speech Annotating Sentiment and Irony in the Online Italian Political Debate on #labuonascuola Axolotl: a Web Accessible Parallel Corpus for Spanish-Nahuatl A Corpus of Wikipedia Discussions: Over the Years, with Topic, Power and Gender Labels NLP Infrastructure for the Lithuanian Language Sense-annotating a Lexical Substitution Data Set with Ubyline Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus The OpenCourseWare Metadiscourse (OCWMD) Corpus An Open Corpus for Named Entity Recognition in Historic Newspapers Domain Adaptation for Named Entity Recognition Using CRFs Building a Dataset for Possessions Identification in Text Age and Gender Prediction on Health Forum Data Generating a Yiddish Speech Corpus, Forced Aligner and Basic ASR System for the AHEYM Project Manual and Automatic Paraphrases for MT Evaluation CodE Alltag: A 
German-Language E-Mail Corpus ARRAU: Linguistically-Motivated Annotation of Anaphoric Descriptions Internet Argument Corpus 2.0: An SQL schema for Dialogic Social Media and the Corpora to go with it Combining Semantic Annotation of Word Sense & Semantic Roles: A Novel Annotation Scheme for VerbNet Roles on German Language Data A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel MWEs in Treebanks: From Survey to Guidelines LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages Improving corpus search via parsing Ubuntu-fr: A Large and Open Corpus for Multi-modal Analysis of Online Written Conversations A Turkish-German Code-Switching Corpus Corpus Analysis based on Structural Phenomena in Texts: Exploiting TEI Encoding for Linguistic Research A Web Tool for Building Parallel Corpora of Spoken and Sign Languages Introducing the LCC Metaphor Datasets Passing a USA National Bar Exam: a First Corpus for Experimentation Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data Factuality Annotation and Learning in Spanish Texts Using Word Embeddings to Translate Named Entities Privacy Issues in Online Machine Translation Services - European Perspective The Alaskan Athabascan Grammar Database Corpora for Learning the Mutual Relationship between Semantic Relatedness and Textual Entailment DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter The OnForumS corpus from the Shared Task on Online Forum Summarisation at MultiLing 2015 Capturing Chat: Annotation and Tools for Multiparty Casual Conversation. 
DT-Neg: Tutorial Dialogues Annotated for Negation Scope and Focus in Context Enriching TimeBank: Towards a more precise annotation of temporal relations in a text Phrase Level Segmentation and Labelling of Machine Translation Errors Building the Macedonian-Croatian Parallel Corpus The ACQDIV Database: Min(d)ing the Ambient Language Towards Automatic Transcription of ILSE ― an Interdisciplinary Longitudinal Study of Adult Development and Aging A Tangled Web: The Faint Signals of Deception in Text - Boulder Lies and Truth Corpus (BLT-C) SatiricLR: a Language Resource of Satirical News Articles The Uppsala Corpus of Student Writings: Corpus Creation, Annotation, and Analysis The Query of Everything: Developing Open-Domain, Natural-Language Queries for BOLT Information Retrieval Spanish Word Vectors from Wikipedia Two Years of Aranea: Increasing Counts and Tuning the Pipeline Universal Dependencies for Japanese Annotating and Detecting Medical Events in Clinical Notes Collecting Language Resources for the Latvian e-Government Machine Translation Platform Multiword Expressions Dataset for Indian Languages Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations The Validation of MRCPD Cross-language Expansions on Imageability Ratings SemRelData ― Multilingual Contextual Annotation of Semantic Relations between Nominals: Dataset and Guidelines A Dependency Treebank of the Chinese Buddhist Canon Hidden Resources ― Strategies to Acquire and Exploit Potential Spoken Language Resources in National Archives Learning from Within? 
Comparing PoS Tagging Approaches for Historical Text Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification Question-Answering with Logic Specific to Video Games SubCo: A Learner Translation Corpus of Human and Machine Subtitles Multi-language Speech Collection for NIST LRE Selection Criteria for Low Resource Language Programs Assessing the Prosody of Non-Native Speakers of English: Measures and Feature Sets Japanese Word―Color Associations with and without Contexts Phonetic Inventory for an Arabic Speech Corpus A Language Resource of German Errors Written by Children with Dyslexia MarsaGram: an excursion in the forests of parsing trees The IPR-cleared Corpus of Contemporary Written and Spoken Romanian Language Compilation of an Arabic Children’s Corpus CoRuSS - a New Prosodically Annotated Corpus of Russian Spontaneous Speech Corpus for Children’s Writing with Enhanced Output for Specific Spelling Patterns (2nd and 3rd Grade) Annotating Logical Forms for EHR Questions Modelling Multi-issue Bargaining Dialogues: Data Collection, Annotation Design and Corpus Evaluating a Topic Modelling Approach to Measuring Corpus Similarity Benchmarking Lexical Simplification Systems AIMU: Actionable Items for Meeting Understanding Phoneme Alignment Using the Information on Phonological Processes in Continuous Speech Arabic to English Person Name Transliteration using Twitter Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario A Multi-Layered Annotated Corpus of Scientific Papers Korean TimeML and Korean TimeBank TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns SYN2015: Representative Corpus of Contemporary Written Czech Challenges of Evaluating Sentiment Analysis Tools on Social Media EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis A Corpus of Images and Text in Online News WikiCoref: An English Coreference-annotated Corpus of 
Wikipedia Articles POS-tagging of Historical Dutch Accuracy of Automatic Cross-Corpus Emotion Labeling for Conversational Speech Corpus Commonization The VU Sound Corpus: Adding More Fine-grained Annotations to the Freesound Database A Taxonomy of Specific Problem Classes in Text-to-Speech Synthesis: Comparing Commercial and Open Source Performance A Bilingual Discourse Corpus and Its Applications Quality Assessment of the Reuters Vol. 2 Multilingual Corpus Language Resource Addition Strategies for Raw Text Parsing Information structure in the Potsdam Commentary Corpus: Topics Compasses, Magnets, Water Microscopes: Annotation of Terminology in a Diachronic Corpus of Scientific Texts The SpeDial datasets: datasets for Spoken Dialogue Systems analytics A Corpus of Literal and Idiomatic Uses of German Infinitive-Verb Compounds The ILMT-s2s Corpus ― A Multimodal Interlingual Map Task Corpus The Negochat Corpus of Human-agent Negotiation Dialogues KorAP Architecture ― Diving in the Deep Sea of Corpus Data Name Translation based on Fine-grained Named Entity Recognition in a Single Language Wikification for Scriptio Continua Two Decades of Terminology: European Framework Programmes Titles The IFCASL Corpus of French and German Non-native and Native Read Speech Legal Text Interpretation: Identifying Hohfeldian Relations from Text Learning Tone and Attribution for Financial Text Mining Mirroring Facial Expressions and Emotions in Dyadic Conversations SweLL on the rise: Swedish Learner Language corpus for European Reference Level studies Uzbek-English and Turkish-English Morpheme Alignment Corpora Text Segmentation of Digitized Clinical Texts Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus Creating Annotated Dialogue Resources: Cross-domain Dialogue Act Classification Giving Lexical Resources a Second Life: Démonette, a Multi-sourced Morpho-semantic Network for French Solving the AL Chicken-and-Egg Corpus and Model Problem: Model-free Active Learning for 
Phenomena-driven Corpus Construction Lexical Resources to Enrich English Malayalam Machine Translation Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact Reliable Baselines for Sentiment Analysis in Resource-Limited Languages: The Serbian Movie Review Dataset TTS for Low Resource Languages: A Bangla Synthesizer A Semantically Compositional Annotation Scheme for Time Normalization PROMETHEUS: A Corpus of Proverbs Annotated with Metaphors Corpus Annotation within the French FrameNet: a Domain-by-domain Methodology Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference. Correcting Errors in a Treebank Based on Tree Mining Comparison of Emotional Understanding in Modality-Controlled Environments using Multimodal Online Emotional Communication Corpus A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection Corpus Resources for Dispute Mediation Discourse The SemDaX Corpus ― Sense Annotations with Scalable Sense Inventories A Corpus of Argument Networks: Using Graph Properties to Analyse Divisive Issues WIKIPARQ: A Tabulated Wikipedia Resource Using the Parquet Format Novel elicitation and annotation schemes for sentential and sub-sentential alignments of bitexts Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility A Turkish Database for Psycholinguistic Studies Based on Frequency, Age of Acquisition, and Imageability 4Couv: A New Treebank for French Domain-Specific Corpus Expansion with Focused Webcrawling Humor in Collective Discourse: Unsupervised Funniness Detection in the New Yorker Cartoon Caption Contest A Large-scale Recipe and Meal Data Collection as Infrastructure for Food Research CORILSE: a Spanish Sign Language Repository for Linguistic Analysis A Comparative Analysis of Crowdsourced Natural Language Corpora for Spoken Dialog Systems Discourse Structure and Dialogue Acts in Multiparty Dialogue: 
the STAC Corpus An Arabic-Moroccan Darija Code-Switched Corpus The OFAI Multi-Modal Task Description Corpus A Tagged Corpus for Automatic Labeling of Disabilities in Medical Scientific Papers A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults Universal Dependencies v1: A Multilingual Treebank Collection FABIOLE, a Speech Database for Forensic Speaker Comparison A Japanese Chess Commentary Corpus InScript: Narrative texts annotated with script information Finding Definitions in Large Corpora with Sketch Engine Towards a Multi-dimensional Taxonomy of Stories in Dialogue PersonaBank: A Corpus of Personal Narratives and Their Story Intention Graphs Corpus-Based Diacritic Restoration for South Slavic Languages AfriBooms: An Online Treebank for Afrikaans Parallel Sentence Extraction from Comparable Corpora with Neural Network Features UPPC - Urdu Paraphrase Plagiarism Corpus A Publicly Available Indonesian Corpora for Automatic Abstractive and Extractive Chat Summarization Differentia compositionem facit. 
A Slower-Paced and Reliable Parser for Latin How Diachronic Text Corpora Affect Context based Retrieval of OOV Proper Names for Audio News Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities AMISCO: The Austrian German Multi-Sensor Corpus Emotion Analysis on Twitter: The Hidden Challenge A Database of Laryngeal High-Speed Videos with Simultaneous High-Quality Audio Recordings of Pathological and Non-Pathological Voices Identifying Content Types of Messages Related to Open Source Software Projects WTF-LOD - A New Resource for Large-Scale NER Evaluation C4Corpus: Multilingual Web-size Corpus with Free License Training & Quality Assessment of an Optical Character Recognition Model for Northern Haida Improving Information Extraction from Wikipedia Texts using Basic English Exploiting a Large Strongly Comparable Corpus Purely Corpus-based Automatic Conversation Authoring FOLK-Gold ― A Gold Standard for Part-of-Speech-Tagging of Spoken German Automatic identification of Mild Cognitive Impairment through the analysis of Italian spontaneous speech productions CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese A General Framework for the Annotation of Causality Based on FrameNet PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits Estonian Dependency Treebank: from Constraint Grammar tagset to Universal Dependencies D(H)ante: A New Set of Tools for XIII Century Italian LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon QUEMDISSE? 
Reported speech in Portuguese Annotating Temporally-Anchored Spatial Knowledge on Top of OntoNotes Semantic Roles A Classification-based Approach to Economic Event Detection in Dutch News Text A Corpus of Gesture-Annotated Dialogues for Monologue-to-Dialogue Generation from Personal Narratives Construction of an English Dependency Corpus incorporating Compound Function Words Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons Design and Development of the MERLIN Learner Corpus Platform EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis The Universal Dependencies Treebank of Spoken Slovenian Introducing the Asian Language Treebank (ALT) The COPLE2 corpus: a learner corpus for Portuguese TGermaCorp -- A (Digital) Humanities Resource for (Computational) Linguistics 1 Million Captioned Dutch Newspaper Images ANTUSD: A Large Chinese Sentiment Dictionary Multimodal Resources for Human-Robot Communication Modelling Metrical Annotation of a Large Corpus of Spanish Sonnets: Representation, Scansion and Evaluation The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks Corpus for Customer Purchase Behavior Prediction in Social Media metaTED: a Corpus of Metadiscourse for Spoken Language Universal Dependencies for Norwegian TweetMT: A Parallel Microblog Corpus Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition GRaSP: A Multilayered Annotation Scheme for Perspectives Nederlab: Towards a Single Portal and Research Environment for Diachronic Dutch Text Corpora NLP and Public Engagement: The Case of the Italian School Reform Enhancing The RATP-DECODA Corpus With Linguistic Annotations For Performing A Large Range Of NLP Tasks Parallel Discourse Annotations on a Corpus of Short Texts BulPhonC: 
Bulgarian Speech Corpus for the Development of ASR Technology Designing a Speech Corpus for the Development and Evaluation of Dictation Systems in Latvian Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions Web Chat Conversations from Contact Centers: a Descriptive Study MEANTIME, the NewsReader Multilingual Event and Time Corpus LanguageCrawl: A Generic Tool for Building Language Models Upon Common-Crawl Crowdsourcing a Large Dataset of Domain-Specific Context-Sensitive Semantic Verb Relations The LetsRead Corpus of Portuguese Children Reading Aloud for Performance Evaluation Crowdsourced Corpus with Entity Salience Annotations ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain Features for Generic Corpus Querying Graded and Word-Sense-Disambiguation Decisions in Corpus Pattern Analysis: a Pilot Study Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis Cysill Ar-lein: A Corpus of Written Contemporary Welsh Compiled from an On-line Spelling and Grammar Checker Identification of Drug-Related Medical Conditions in Social Media Emotion Corpus Construction Based on Selection from Hashtags Mining the Spoken Wikipedia for Speech Data and Beyond On the Use of a Serious Game for Recording a Speech Corpus of People with Intellectual Disabilities A Corpus of Clinical Practice Guidelines Annotated with the Importance of Recommendations Construction and Analysis of a Large Vietnamese Text Corpus The dialogue breakdown detection challenge: Task description, datasets, and evaluation metrics The Methodius Corpus of Rhetorical Discourse Structures and Generated Texts SpaceRef: A corpus of street-level geographic descriptions That'll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models Constructing a Norwegian Academic Wordlist Tweeting and Being Ironic in the Debate about a Political Reform: the French Annotated Corpus TWitter-MariagePourTous CItA: an 
L1 Italian Learners Corpus to Study the Development of Writing Competence CEPLEXicon ― A Lexicon of Child European Portuguese Finding Recurrent Features of Image Schema Gestures: the FIGURE corpus Evaluating Lexical Simplification and Vocabulary Knowledge for Learners of French: Possibilities of Using the FLELex Resource A Corpus of Read and Spontaneous Upper Saxon German Speech for ASR Evaluation Parallel Speech Corpora of Japanese Dialects Automatic Recognition of Linguistic Replacements in Text Series Generated from Keystroke Logs Towards a Corpus of Violence Acts in Arabic Social Media Affective Lexicon Creation for the Greek Language The TYPALOC Corpus: A Collection of Various Dysarthric Speech Recordings in Read and Spontaneous Styles Multilevel Annotation of Agreement and Disagreement in Italian News Blogs PentoRef: A Corpus of Spoken References in Task-oriented Dialogues Building Language Resources for Exploring Autism Spectrum Disorders Comprehensive and Consistent PropBank Light Verb Annotation Summ-it++: an Enriched Version of the Summ-it Corpus Automatic Corpus Extension for Data-driven Natural Language Generation European Union Language Resources in Sketch Engine Extracting Structured Scholarly Information from the Machine Translation Literature Edit Categories and Editor Role Identification in Wikipedia Inconsistency Detection in Semantic Annotation Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers Staggered NLP-assisted refinement for Clinical Annotations of Chronic Disease Events SCARE ― The Sentiment Corpus of App Reviews with Fine-grained Annotations in German Developing a Dataset for Evaluating Approaches for Document Expansion with Images Coordinating Communication in the Wild: The Artwalk Dialogue Corpus of Pedestrian Navigation and Mobile Referential Communication A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety Fast and Robust POS tagger for Arabic Tweets Using Agreement-based Bootstrapping 
WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation on Rare Words Datasets for Aspect-Based Sentiment Analysis in French Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS DART: a Dataset of Arguments and their Relations on Twitter Hypergraph Modelization of a Syntactically Annotated English Wikipedia Dump MADAD: A Readability Annotation Tool for Arabic Text Finding Alternative Translations in a Large Corpus of Movie Subtitle ASPEC: Asian Scientific Paper Excerpt Corpus Discontinuous Verb Phrases in Parsing and Machine Translation of English and German A Large-Scale Multilingual Disambiguation of Glosses Domain Adaptation in MT Using Titles in Wikipedia as a Parallel Corpus: Resources and Evaluation Crowdsourcing Salient Information from News and Tweets Guidelines and Framework for a Large Scale Arabic Diacritized Corpus A Dutch Dysarthric Speech Database for Individualized Speech Therapy Research TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling TEITOK: Text-Faithful Annotated Corpora Extracting Interlinear Glossed Text from LaTeX Documents A Shared Task for Spoken CALL? 
From Interoperable Annotations towards Interoperable Resources: A Multilingual Approach to the Analysis of Discourse Laughter in French Spontaneous Conversational Dialogs A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System The ACL RD-TEC 2.0: A Language Resource for Evaluating Term Extraction and Entity Recognition Methods Persian Proposition Bank Dialogue System Characterisation by Back-channelling Patterns Extracted from Dialogue Corpus Creation of comparable corpora for English-{Urdu, Arabic, Persian} Detecting Annotation Scheme Variation in Out-of-Domain Treebanks SciCorp: A Corpus of English Scientific Articles Annotated for Information Status Analysis Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation Universal Dependencies for Persian Aspect based Sentiment Analysis in Hindi: Resource Creation and Evaluation BosphorusSign: A Turkish Sign Language Recognition Corpus in Health and Finance Domains A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research Gulf Arabic Linguistic Resource Building for Sentiment Analysis If You Even Don't Have a Bit of Bible: Learning Delexicalized POS Taggers The CIRDO Corpus: Comprehensive Audio/Video Database of Domestic Falls of Elderly People Annotating Named Entities in Consumer Health Questions VPS-GradeUp: Graded Decisions on Usage Patterns Interoperability of Annotation Schemes: Using the Pepper Framework to Display AWA Documents in the ANNIS Interface PARC 3.0: A Corpus of Attribution Relations Hard Time Parsing Questions: Building a QuestionBank for French SuperCAT: The (New and Improved) Corpus Analysis Toolkit Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic AppDialogue: Multi-App Dialogues for Intelligent Assistants A Multimodal Motion-Captured Corpus of Matched and Mismatched Extravert-Introvert Conversational Pairs Urdu Summary Corpus Towards 
Automatic Identification of Effective Clues for Team Word-Guessing Games A CUP of CoFee: A large Collection of feedback Utterances Provided with communicative function annotations OSMAN ― A Novel Arabic Readability Metric Parallel Global Voices: a Collection of Multilingual Corpora with Citizen Media Stories Typed Entity and Relation Annotation on Computer Science Papers Speech Corpus Spoken by Young-old, Old-old and Oldest-old Japanese Summarizing Behaviours: An Experiment on the Annotation of Call-Centre Conversations Automatic Construction of Discourse Corpora for Dialogue Translation TermITH-Eval: a French Standard-Based Resource for Keyphrase Extraction Evaluation The Royal Society Corpus: From Uncharted Data to Corpus The Scielo Corpus: a Parallel Corpus of Scientific Publications for Biomedicine ArchiMob - A Corpus of Spoken Swiss German Building Evaluation Datasets for Consumer-Oriented Information Retrieval Annotating Topic Development in Information Seeking Queries Detection of Reformulations in Spoken French Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of Slovene A Proposition Bank of Urdu A Hungarian Sentiment Corpus Manually Annotated at Aspect Level Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing A Large Scale Corpus of Gulf Arabic CHATR the Corpus; a 20-year-old archive of Concatenative Speech Synthesis A Regional News Corpora for Contextualized Entity Discovery and Linking Survey of Conversational Behavior: Towards the Design of a Balanced Corpus of Everyday Japanese Conversation A Dataset for Open Event Extraction in English Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages Coreference Annotation Scheme and Relation Types for Hindi A Study of Reuse and Plagiarism in LREC papers A Reading Comprehension Corpus for Machine Translation Evaluation Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it? 
Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor's Love Affair A Multi-party Multi-modal Dataset for Focus of Visual Attention in Human-human and Human-robot Interaction Semantic Annotation of the ACL Anthology Corpus for the Automatic Analysis of Scientific Literature Designing A Long Lasting Linguistic Project: The Case Study of ASIt Controlled Propagation of Concept Annotations in Textual Corpora Exploiting Arabic Diacritization for High Quality Automatic Annotation An Extension of the Slovak Broadcast News Corpus based on Semi-Automatic Annotation Coreference in Prague Czech-English Dependency Treebank Joining-in-type Humanoid Robot Assisted Language Learning System Searching in the Penn Discourse Treebank Using the PML-Tree Query Rapid Development of Morphological Analyzers for Typologically Diverse Languages DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus A Multi-domain Corpus of Swedish Word Sense Annotation A Corpus of Native, Non-native and Translated Texts “He Said She Said” ― a Male/Female Corpus of Polish Global Open Resources and Information for Language and Linguistic Analysis (GORILLA) Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus corpus-tools.org: An Interoperable Generic Software Tool Set for Multi-layer Linguistic Corpora On Developing Resources for Patient-level Information Retrieval Graphical Annotation for Syntax-Semantics Mapping Monolingual Social Media Datasets for Detecting Contradiction and Entailment Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus Improving the Annotation of Sentence Specificity Functions of Code-Switching in Tweets: An Annotation Framework and Some Initial Experiments Czech Legal Text Treebank 1.0 Building A Case-based Semantic English-Chinese Parallel Treebank NorGramBank: A 
‘Deep’ Treebank for Norwegian VerbLexPor: a lexical resource with semantic roles for Portuguese OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles Challenges and Solutions for Consistent Annotation of Vietnamese Treebank Crowdsourcing a Multi-lingual Speech Corpus: Recording, Transcription and Annotation of the CrowdIS Corpora First Steps Towards Coverage-Based Sentence Alignment Latin Vallex. A Treebank-based Semantic Valency Lexicon for Latin CommonCOW: Massively Huge Web Corpora from CommonCrawl Data and a Method to Distribute them Freely under Restrictive EU Copyright Laws Sentiframes: A Resource for Verb-centered German Sentiment Inference Temporal Information Annotation: Crowd vs. Experts PotTS: The Potsdam Twitter Sentiment Corpus Parallel Chinese-English Entities, Relations and Events Corpora Automatic Classification of Tweets for Analyzing Communication Behavior of Museums Adapting the TANL tool suite to Universal Dependencies
Crowdsourcing	A Gold Standard for Scalar Adjectives Optimizing Computer-Assisted Transcription Quality with Iterative User Interfaces Remote Elicitation of Inflectional Paradigms to Seed Morphological Analysis in Low-Resource Languages The REAL Corpus: A Crowd-Sourced Corpus of Human Generated and Evaluated Spatial References to Real-World Urban Scenes Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus Arabic Corpora for Credibility Analysis A Web Tool for Building Parallel Corpora of Spoken and Sign Languages The OnForumS corpus from the Shared Task on Online Forum Summarisation at MultiLing 2015 A Tangled Web: The Faint Signals of Deception in Text - Boulder Lies and Truth Corpus (BLT-C) Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification Japanese Word―Color Associations with and without Contexts Wikipedia Titles As Noun Tag Predictors The VU Sound Corpus: Adding More Fine-grained Annotations to the Freesound Database Crowdsourcing Ontology Lexicons The Negochat Corpus of Human-agent Negotiation Dialogues Analysis of English Spelling Errors in a Word-Typing Game Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference. 
Towards Using Social Media to Identify Individuals at Risk for Preventable Chronic Illness A Comparative Analysis of Crowdsourced Natural Language Corpora for Spoken Dialog Systems InScript: Narrative texts annotated with script information Enhancing Access to Online Education: Quality Machine Translation of MOOC Content Annotating Temporally-Anchored Spatial Knowledge on Top of OntoNotes Semantic Roles Palabras: Crowdsourcing Transcriptions of L2 Speech Crowdsourcing a Large Dataset of Domain-Specific Context-Sensitive Semantic Verb Relations Crowdsourced Corpus with Entity Salience Annotations Cysill Ar-lein: A Corpus of Written Contemporary Welsh Compiled from an On-line Spelling and Grammar Checker EasyTree: A Graphical Tool for Dependency Tree Annotation Towards a Corpus of Violence Acts in Arabic Social Media Crowdsourcing Salient Information from News and Tweets Acquiring Opposition Relations among Italian Verb Senses using Crowdsourcing Semantic Relation Extraction with Semantic Patterns Experiment on Radiology Reports Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus Crowdsourcing a Multi-lingual Speech Corpus: Recording, Transcription and Annotation of the CrowdIS Corpora Temporal Information Annotation: Crowd vs. Experts

D
Dialogue	An Annotated Corpus of Direct Speech Internet Argument Corpus 2.0: An SQL schema for Dialogic Social Media and the Corpora to go with it Ubuntu-fr: A Large and Open Corpus for Multi-modal Analysis of Online Written Conversations DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter Capturing Chat: Annotation and Tools for Multiparty Casual Conversation. DT-Neg: Tutorial Dialogues Annotated for Negation Scope and Focus in Context A Dependency Treebank of the Chinese Buddhist Canon Modelling Multi-issue Bargaining Dialogues: Data Collection, Annotation Design and Corpus AIMU: Actionable Items for Meeting Understanding The SpeDial datasets: datasets for Spoken Dialogue Systems analytics The Negochat Corpus of Human-agent Negotiation Dialogues Mirroring Facial Expressions and Emotions in Dyadic Conversations Creating Annotated Dialogue Resources: Cross-domain Dialogue Act Classification A Comparative Study of Text Preprocessing Approaches for Topic Detection of User Utterances Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus Towards a Multi-dimensional Taxonomy of Stories in Dialogue A Document Repository for Social Media and Speech Conversations Purely Corpus-based Automatic Conversation Authoring A Corpus of Gesture-Annotated Dialogues for Monologue-to-Dialogue Generation from Personal Narratives The dialogue breakdown detection challenge: Task description, datasets, and evaluation metrics PentoRef: A Corpus of Spoken References in Task-oriented Dialogues The DialogBank Coordinating Communication in the Wild: The Artwalk Dialogue Corpus of Pedestrian Navigation and Mobile Referential Communication Vocal Pathologies Detection and Mispronounced Phonemes Identification: Case of Arabic Continuous Speech Managing Linguistic and Terminological Variation in a Medical Dialogue System Laughter in French Spontaneous Conversational Dialogs A Corpus of Word-Aligned Asked and Anticipated Questions in a 
Virtual Patient Dialogue System Dialogue System Characterisation by Back-channelling Patterns Extracted from Dialogue Corpus AppDialogue: Multi-App Dialogues for Intelligent Assistants A Verbal and Gestural Corpus of Story Retellings to an Expressive Embodied Virtual Character A Multimodal Motion-Captured Corpus of Matched and Mismatched Extravert-Introvert Conversational Pairs Towards Automatic Identification of Effective Clues for Team Word-Guessing Games A CUP of CoFee: A large Collection of feedback Utterances Provided with communicative function annotations Summarizing Behaviours: An Experiment on the Annotation of Call-Centre Conversations ArchiMob - A Corpus of Spoken Swiss German Survey of Conversational Behavior: Towards the Design of a Balanced Corpus of Everyday Japanese Conversation A Multi-party Multi-modal Dataset for Focus of Visual Attention in Human-human and Human-robot Interaction Deep Learning of Audio and Language Features for Humor Prediction
Digital Libraries	A Computational Perspective on the Romanian Dialects Evaluating the Noisy Channel Model for the Normalization of Historical Texts: Basque, Spanish and Slovene Measuring Lexical Quality of a Historical Finnish Newspaper Collection ― Analysis of Garbled OCR Data with Basic Language Technology Tools and Means South African National Centre for Digital Language Resources Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited Data Management Plans and Data Centers Lin|gu|is|tik: Building the Linguist's Pathway to Bibliographies, Libraries, Language Resources and Linked Open Data Designing A Long Lasting Linguistic Project: The Case Study of ASIt
Discourse Annotation, Representation and Processing	Falling silent, lost for words ... Tracing personal involvement in interviews with Dutch war veterans Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus The OpenCourseWare Metadiscourse (OCWMD) Corpus ARRAU: Linguistically-Motivated Annotation of Anaphoric Descriptions Ubuntu-fr: A Large and Open Corpus for Multi-modal Analysis of Online Written Conversations DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations A Multi-Layered Annotated Corpus of Scientific Papers A Bilingual Discourse Corpus and Its Applications Information structure in the Potsdam Commentary Corpus: Topics The SpeDial datasets: datasets for Spoken Dialogue Systems analytics Learning Tone and Attribution for Financial Text Mining Adding Semantic Relations to a Large-Coverage Connective Lexicon of German Corpus Resources for Dispute Mediation Discourse A Corpus of Argument Networks: Using Graph Properties to Analyse Divisive Issues PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus A Tagged Corpus for Automatic Labeling of Disabilities in Medical Scientific Papers PersonaBank: A Corpus of Personal Narratives and Their Story Intention Graphs Fine-Grained Chinese Discourse Relation Labelling A Corpus of Gesture-Annotated Dialogues for Monologue-to-Dialogue Generation from Personal Narratives Argument Mining: the Bottleneck of Knowledge and Language Resources Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks metaTED: a Corpus of Metadiscourse for Spoken Language Enhancing The RATP-DECODA Corpus With Linguistic Annotations For Performing A Large Range Of NLP Tasks Parallel Discourse Annotations on a Corpus of Short Texts A Corpus 
of Clinical Practice Guidelines Annotated with the Importance of Recommendations The Methodius Corpus of Rhetorical Discourse Structures and Generated Texts The DialogBank From Interoperable Annotations towards Interoperable Resources: A Multilingual Approach to the Analysis of Discourse Applying Core Scientific Concepts to Context-Based Citation Recommendation SciCorp: A Corpus of English Scientific Articles Annotated for Information Status Analysis PARC 3.0: A Corpus of Attribution Relations Using lexical and Dependency Features to Disambiguate Discourse Connectives in Hindi A CUP of CoFee: A large Collection of feedback Utterances Provided with communicative function annotations Summarizing Behaviours: An Experiment on the Annotation of Call-Centre Conversations Automatic Construction of Discourse Corpora for Dialogue Translation Annotating Topic Development in Information Seeking Queries Coreference Annotation Scheme and Relation Types for Hindi Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it? Searching in the Penn Discourse Treebank Using the PML-Tree Query Cohere: A Toolkit for Local Coherence Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus Improving the Annotation of Sentence Specificity
Document Classification, Text categorisation	Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource An Empirical Exploration of Moral Foundations Theory in Partisan News Sources DRANZIERA: An Evaluation Protocol For Multi-Domain Opinion Mining Coh-Metrix-Esp: A Complexity Analysis Tool for Documents Written in Spanish Age and Gender Prediction on Health Forum Data Comparing Speech and Text Classification on ICNALE A Tangled Web: The Faint Signals of Deception in Text - Boulder Lies and Truth Corpus (BLT-C) SatiricLR: a Language Resource of Satirical News Articles Compilation of an Arabic Children’s Corpus Quality Assessment of the Reuters Vol. 2 Multilingual Corpus Learning Tone and Attribution for Financial Text Mining Reliable Baselines for Sentiment Analysis in Resource-Limited Languages: The Serbian Movie Review Dataset A Comparative Study of Text Preprocessing Approaches for Topic Detection of User Utterances A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings Towards a Multi-dimensional Taxonomy of Stories in Dialogue A Semi-Supervised Approach for Gender Identification Identifying Content Types of Messages Related to Open Source Software Projects Ensemble Classification of Grants using LDA-based Features Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project Emotion Corpus Construction Based on Selection from Hashtags A Corpus of Clinical Practice Guidelines Annotated with the Importance of Recommendations Towards a Corpus of Violence Acts in Arabic Social Media Edit Categories and Editor Role Identification in Wikipedia Exploring the Realization of Irony in Twitter Data Evaluation Set for Slovak News Information Retrieval Discriminating Similar Languages: Evaluations and Explorations Modeling Language Change in Historical Corpora: The Case of Portuguese Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages Specialising Paragraph Vectors 
for Text Polarity Detection A Corpus of Native, Non-native and Translated Texts “He Said She Said” ― a Male/Female Corpus of Polish Cohere: A Toolkit for Local Coherence Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus MoBiL: A Hybrid Feature Set for Automatic Human Translation Quality Assessment Detecting Expressions of Blame or Praise in Text Automatic Classification of Tweets for Analyzing Communication Behavior of Museums

E
Emotion Recognition/Generation	Falling silent, lost for words ... Tracing personal involvement in interviews with Dutch war veterans EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis Accuracy of Automatic Cross-Corpus Emotion Labeling for Conversational Speech Corpus Commonization Mirroring Facial Expressions and Emotions in Dyadic Conversations Detecting Implicit Expressions of Affect from Text using Semantic Knowledge on Common Concept Properties Comparison of Emotional Understanding in Modality-Controlled Environments using Multimodal Online Emotional Communication Corpus Humor in Collective Discourse: Unsupervised Funniness Detection in the New Yorker Cartoon Caption Contest A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings Emotion Analysis on Twitter: The Hidden Challenge AVAB-DBS: an Audio-Visual Affect Bursts Database for Synthesis Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition Could Speaker, Gender or Age Awareness be beneficial in Speech-based Emotion Recognition? Tweeting and Being Ironic in the Debate about a Political Reform: the French Annotated Corpus TWitter-MariagePourTous Affective Lexicon Creation for the Greek Language Datasets for Aspect-Based Sentiment Analysis in French Evaluating Context Selection Strategies to Build Emotive Vector Space Models Sentiment Analysis in Social Networks through Topic modeling A Multimodal Motion-Captured Corpus of Matched and Mismatched Extravert-Introvert Conversational Pairs Deep Learning of Audio and Language Features for Humor Prediction PotTS: The Potsdam Twitter Sentiment Corpus
Endangered Languages	Endangered Language Documentation: Bootstrapping a Chatino Speech Corpus, Forced Aligner, ASR A Finite-state Morphological Analyser for Tuvan Remote Elicitation of Inflectional Paradigms to Seed Morphological Analysis in Low-Resource Languages Generating a Yiddish Speech Corpus, Forced Aligner and Basic ASR System for the AHEYM Project The Alaskan Athabascan Grammar Database Using a Small Lexicon with CRFs Confidence Measure to Improve POS Tagging Accuracy Constraint-Based Bilingual Lexicon Induction for Closely Related Languages Selection Criteria for Low Resource Language Programs Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ― A Morphological Lexicon of Esperanto with Morpheme Frequencies Training & Quality Assessment of an Optical Character Recognition Model for Northern Haida Fostering digital representation of EU regional and minority languages: the Digital Language Diversity Project Cysill Ar-lein: A Corpus of Written Contemporary Welsh Compiled from an On-line Spelling and Grammar Checker Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik Chatbot Technology with Synthetic Voices in the Acquisition of an Endangered Language: Motivation, Development and Evaluation of a Platform for Irish If You Even Don't Have a Bit of Bible: Learning Delexicalized POS Taggers Legacy language atlas data mining: mapping Kru languages A Rule-based Shallow-transfer Machine Translation System for Scots and English
Evaluation Methodologies	Orthographic and Morphological Correspondences between Related Slavic Languages as a Base for Modeling of Mutual Intelligibility Ecological Gestures for HRI: the GEE Corpus Complementarity, F-score, and NLP Evaluation DRANZIERA: An Evaluation Protocol For Multi-Domain Opinion Mining Manual and Automatic Paraphrases for MT Evaluation LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages Using the TED Talks to Evaluate Spoken Post-editing of Machine Translation Revisiting Summarization Evaluation for Scientific Articles What’s the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems RankDCG: Rank-Ordering Evaluation Measure Spanish Word Vectors from Wikipedia The Language Application Grid and Galaxy Multi-language Speech Collection for NIST LRE An Empirical Study of Arabic Formulaic Sequence Extraction Methods Homing in on Twitter Users: Evaluating an Enhanced Geoparser for User Profile Locations Evaluating a Topic Modelling Approach to Measuring Corpus Similarity Measuring Lexical Quality of a Historical Finnish Newspaper Collection ― Analysis of Garbled OCR Data with Basic Language Technology Tools and Means Use of Domain-Specific Language Resources in Machine Translation Exploitation of Co-reference in Distributional Semantics A Taxonomy of Specific Problem Classes in Text-to-Speech Synthesis: Comparing Commercial and Open Source Performance Compasses, Magnets, Water Microscopes: Annotation of Terminology in a Diachronic Corpus of Scientific Texts A Novel Evaluation Method for Morphological Segmentation Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact Novel elicitation and annotation schemes for sentential and sub-sentential alignments of bitexts PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation Linguistically Inspired Language Model Augmentation for MT UPPC - Urdu Paraphrase Plagiarism Corpus Evaluating the 
Readability of Text Simplification Output for Readers with Cognitive Disabilities Word Embedding Evaluation and Combination PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits D(H)ante: A New Set of Tools for XIII Century Italian Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015 Polarity Lexicon Building: to what Extent Is the Manual Effort Worth? Using Contextual Information for Machine Translation Evaluation Evaluating the Impact of Light Post-Editing on Usability Standard Test Collection for English-Persian Cross-Lingual Word Sense Disambiguation Evaluating Machine Translation in a Usage Scenario Cross-validating Image Description Datasets and Evaluation Metrics OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation on Rare Words Guidelines and Framework for a Large Scale Arabic Diacritized Corpus Comparing the Level of Code-Switching in Corpora Evaluation Set for Slovak News Information Retrieval The ACL RD-TEC 2.0: A Language Resource for Evaluating Term Extraction and Entity Recognition Methods Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation Tools and Guidelines for Principled Machine Translation Development Generating Task-Pertinent sorted Error Lists for Speech Recognition Towards Automatic Identification of Effective Clues for Team Word-Guessing Games OSMAN ― A Novel Arabic Readability Metric EVALution-MAN: A Chinese Dataset for the Training and Evaluation of DSMs Analysing Constraint Grammars with a SAT-solver The Trials and Tribulations of Predicting Post-Editing Productivity Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization A Regional News Corpora for Contextualized Entity Discovery and Linking Evaluating Interactive System Adaptation Applying the Cognitive Machine Translation 
Evaluation Approach to Arabic A Reading Comprehension Corpus for Machine Translation Evaluation B2SG: a TOEFL-like Task for Portuguese Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests Distributional Thesauri for Information Retrieval and vice versa MoBiL: A Hybrid Feature Set for Automatic Human Translation Quality Assessment

G
Grammar and Syntax	Accurate Deep Syntactic Parsing of Graphs: The Case of French C-WEP―Rich Annotated Collection of Writing Errors by Professionals Improving corpus search via parsing Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic MarsaGram: an excursion in the forests of parsing trees Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing Syntax-based Multi-system Machine Translation A sense-based lexicon of count and mass expressions: The Bochum English Countability Lexicon Corpus Annotation within the French FrameNet: a Domain-by-domain Methodology Correcting Errors in a Treebank Based on Tree Mining 4Couv: A New Treebank for French Detecting Optional Arguments of Verbs Leveraging Native Data to Correct Preposition Errors in Learners' Dutch Universal Dependencies v1: A Multilingual Treebank Collection AfriBooms: An Online Treebank for Afrikaans Semantic Layer of the Valence Dictionary of Polish Walenty CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese Estonian Dependency Treebank: from Constraint Grammar tagset to Universal Dependencies D(H)ante: A New Set of Tools for XIII Century Italian The Universal Dependencies Treebank of Spoken Slovenian Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun Italian VerbNet: A Construction-based Approach to Italian Verb Classification Universal Dependencies for Norwegian Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions Using Contextual Information for Machine Translation Evaluation Syntactic Analysis of Phrasal Compounds in Corpora: a Challenge for NLP Tools Hypergraph Modelization of a Syntactically Annotated English Wikipedia Dump Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization Discontinuous Verb Phrases in Parsing and Machine Translation of English and German Lemmatization and Morphological Tagging in German and Latin: A 
Comparison and a Survey of the State-of-the-art Detecting Annotation Scheme Variation in Out-of-Domain Treebanks A Lexical Resource for the Identification of “Weak Words” in German Specification Documents Hard Time Parsing Questions: Building a QuestionBank for French Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks Analysing Constraint Grammars with a SAT-solver Recent Advances in Development of a Lexicon-Grammar of Polish: PolNet 3.0 Lin|gu|is|tik: Building the Linguist's Pathway to Bibliographies, Libraries, Language Resources and Linked Open Data Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View Adapting the TANL tool suite to Universal Dependencies

I
Information Extraction, Information Retrieval	Entity Linking with a Paraphrase Flavor The PsyMine Corpus - A Corpus annotated with Psychiatric Disorders and their Etiological Factors Sieve-based Coreference Resolution in the Biomedical Domain Rule-based Automatic Multi-word Term Extraction and Lemmatization Complementarity, F-score, and NLP Evaluation The OpenCourseWare Metadiscourse (OCWMD) Corpus Arabic Corpora for Credibility Analysis Domain Adaptation for Named Entity Recognition Using CRFs A Machine Learning based Music Retrieval and Recommendation System Corpus Query Lingua Franca (CQLF) Revisiting Summarization Evaluation for Scientific Articles Factuality Annotation and Learning in Spanish Texts The Query of Everything: Developing Open-Domain, Natural-Language Queries for BOLT Information Retrieval Neural Embedding Language Models in Semantic Clustering of Web Search Results The Validation of MRCPD Cross-language Expansions on Imageability Ratings Building Tempo-HindiWordNet: A resource for effective temporal information access in Hindi Operational Assessment of Keyword Search on Oral History Arabic to English Person Name Transliteration using Twitter Korean TimeML and Korean TimeBank TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns An Annotated Corpus and Method for Analysis of Ad-Hoc Structures Embedded in Text A Large DataBase of Hypernymy Relations Extracted from the Web. 
JATE 2.0: Java Automatic Term Extraction with Apache Solr Using a Cross-Language Information Retrieval System based on OHSUMED to Evaluate the Moses and KantanMT Statistical Machine Translation Systems Two Decades of Terminology: European Framework Programmes Titles E-TIPSY: Search Query Corpus Annotated with Entities, Term Importance, POS Tags, and Syntactic Parses A Semantically Compositional Annotation Scheme for Time Normalization Analysis of English Spelling Errors in a Word-Typing Game TermoPL - a Flexible Tool for Terminology Extraction Odin's Runes: A Rule Language for Information Extraction Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context Finding Definitions in Large Corpora with Sketch Engine Parallel Sentence Extraction from Comparable Corpora with Neural Network Features Encoding Adjective Scales for Fine-grained Resources Improving Information Extraction from Wikipedia Texts using Basic English Exploiting a Large Strongly Comparable Corpus QUEMDISSE? Reported speech in Portuguese A Classification-based Approach to Economic Event Detection in Dutch News Text Nine Features in a Random Forest to Learn Taxonomical Semantic Relations Evaluating Translation Quality and CLIR Performance of Query Sessions Relation- and Phrase-level Linking of FrameNet with Sar-graphs Identification of Drug-Related Medical Conditions in Social Media What a Nerd! 
Beating Students and Vector Cosine in the ESL and TOEFL Datasets Construction and Analysis of a Large Vietnamese Text Corpus Forecasting Emerging Trends from Scientific Literature Visualisation and Exploration of High-Dimensional Distributional Features in Lexical Semantic Classification Adapting an Entity Centric Model for Portuguese Coreference Resolution Extracting Structured Scholarly Information from the Machine Translation Literature Developing a Dataset for Evaluating Approaches for Document Expansion with Images Extracting Weighted Language Lexicons from Wikipedia More than Word Cooccurrence: Exploring Support and Opposition in International Climate Negotiations with Semantic Parsing A Sequence Model Approach to Relation Extraction in Portuguese Subtask Mining from Search Query Logs for How-Knowledge Acceleration Evaluation Set for Slovak News Information Retrieval Applying Core Scientific Concepts to Context-Based Citation Recommendation Segmenting Hashtags using Automatically Created Training Data Detection of Major ASL Sign Types in Continuous Signing For ASL Recognition Legacy language atlas data mining: mapping Kru languages What does this Emoji Mean? 
A Vector Space Skip-Gram Model for Twitter Emojis TermITH-Eval: a French Standard-Based Resource for Keyphrase Extraction Evaluation Building Evaluation Datasets for Consumer-Oriented Information Retrieval Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization A Dataset for Open Event Extraction in English A Study of Reuse and Plagiarism in LREC papers Context-enhanced Adaptive Entity Linking Predictive Modeling: Guessing the NLP Terms of Tomorrow GATE-Time: Extraction of Temporal Expressions and Events On Developing Resources for Patient-level Information Retrieval Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job Distributional Thesauri for Information Retrieval and vice versa Studying the Temporal Dynamics of Word Co-occurrences: An Application to Event Detection Parallel Chinese-English Entities, Relations and Events Corpora Learning Thesaurus Relations from Distributional Features Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level

K
Knowledge Discovery/Representation	An Interaction-Centric Dataset for Learning Automation Rules in Smart Homes A New Integrated Open-source Morphological Analyzer for Hungarian Embedding Open-domain Common-sense Knowledge from Text Passing a USA National Bar Exam: a First Corpus for Experimentation Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic Corpora for Learning the Mutual Relationship between Semantic Relatedness and Textual Entailment Discovering Fuzzy Synsets from the Redundancy in Different Lexical-Semantic Resources LVF-lemon ― Towards a Linked Data Representation of "Les Verbes français" Domain Ontology Learning Enhanced by Optimized Relation Instance in DBpedia Legal Text Interpretation: Identifying Hohfeldian Relations from Text E-TIPSY: Search Query Corpus Annotated with Entities, Term Importance, POS Tags, and Syntactic Parses Detecting Implicit Expressions of Affect from Text using Semantic Knowledge on Common Concept Properties A Corpus of Argument Networks: Using Graph Properties to Analyse Divisive Issues Odin's Runes: A Rule Language for Information Extraction Building Concept Graphs from Monolingual Dictionary Entries SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking InScript: Narrative texts annotated with script information Automatic Enrichment of WordNet with Common-Sense Knowledge Improving Information Extraction from Wikipedia Texts using Basic English Word Embedding Evaluation and Combination Argument Mining: the Bottleneck of Knowledge and Language Resources Nine Features in a Random Forest to Learn Taxonomical Semantic Relations What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets Best of Both Worlds: Making Word Sense Embeddings Interpretable Subtask Mining from Search Query Logs for How-Knowledge Acceleration The Event and Implied Situation Ontology (ESO): Application and Evaluation What does this Emoji Mean? 
A Vector Space Skip-Gram Model for Twitter Emojis EVALution-MAN: A Chinese Dataset for the Training and Evaluation of DSMs The Royal Society Corpus: From Uncharted Data to Corpus Towards Multiple Antecedent Coreference Resolution in Specialized Discourse Semantic Annotation of the ACL Anthology Corpus for the Automatic Analysis of Scientific Literature SlangNet: A WordNet like resource for English Slang Learning Thesaurus Relations from Distributional Features

L
Language Identification	Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource Multi-language Speech Collection for NIST LRE An Arabic-Moroccan Darija Code-Switched Corpus Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS Assessing the Potential of Metaphoricity of verbs using corpus data Discriminating Similar Languages: Evaluations and Explorations
Language Modelling	MARMOT: A Toolkit for Translation Quality Estimation at the Word Level Deriving Morphological Analyzers from Example Inflections Discriminative Analysis of Linguistic Features for Typological Study Morphological Analysis of Sahidic Coptic for Automatic Glossing Factuality Annotation and Learning in Spanish Texts Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory Using SMT for OCR Error Correction of Historical Texts Domain-Specific Corpus Expansion with Focused Webcrawling Linguistically Inspired Language Model Augmentation for MT Leveraging Native Data to Correct Preposition Errors in Learners' Dutch GRaSP: A Multilayered Annotation Scheme for Perspectives SCALE: A Scalable Language Engineering Toolkit Towards a Linguistic Ontology with an Emphasis on Reasoning and Knowledge Reuse Extracting Weighted Language Lexicons from Wikipedia Filtering Wiktionary Triangles by Linear Mapping between Distributed Word Models Discriminating Similar Languages: Evaluations and Explorations
Lexicon, Lexical Database	Semantic Links for Portuguese A Gold Standard for Scalar Adjectives A Finite-state Morphological Analyser for Tuvan The Gavagai Living Lexicon VerbCROcean: A Repository of Fine-Grained Semantic Verb Relations for Croatian Rule-based Automatic Multi-word Term Extraction and Lemmatization A New Integrated Open-source Morphological Analyzer for Hungarian Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality Enriching a Portuguese WordNet using Synonyms from a Monolingual Dictionary Very-large Scale Parsing and Normalization of Wiktionary Morphological Paradigms Tēzaurs.lv: the Largest Open Lexical Database for Latvian NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic VoxML: A Visualization Modeling Language Example-based Acquisition of Fine-grained Collocation Resources A Finite-State Morphological Analyser for Sindhi A Computational Perspective on the Romanian Dialects The on-line version of Grammatical Dictionary of Polish A Taxonomy of Spanish Nouns, a Statistical Algorithm to Generate it and its Implementation in Open Source Code Synset Ranking of Hindi WordNet Evaluating Lexical Similarity to build Sentiment Similarity Constraint-Based Bilingual Lexicon Induction for Closely Related Languages An Empirical Study of Arabic Formulaic Sequence Extraction Methods Japanese Word―Color Associations with and without Contexts A Language Resource of German Errors Written by Children with Dyslexia Discovering Fuzzy Synsets from the Redundancy in Different Lexical-Semantic Resources Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs LVF-lemon ― Towards a Linked Data Representation of "Les Verbes français" A Framework for Cross-lingual/Node-wise Alignment of Lexical-Semantic Resources Crowdsourcing Ontology Lexicons Curation of Dutch Regional Dictionaries A sense-based lexicon of count and mass expressions: The Bochum English Countability Lexicon A lexicon of perception for the identification of synaesthetic metaphors in corpora Happy Accident: A Sentiment Composition Lexicon for Opposing Polarity Phrases Wikification for Scriptio Continua Two Decades of Terminology: European Framework Programmes Titles Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages A Morphological Lexicon of Esperanto with Morpheme Frequencies How does Dictionary Size Influence Performance of Vietnamese Word Segmentation? Adding Semantic Relations to a Large-Coverage Connective Lexicon of German SVALex: a CEFR-graded Lexical Resource for Swedish Foreign and Second Language Learners Giving Lexical Resources a Second Life: Démonette, a Multi-sourced Morpho-semantic Network for French Lexical Resources to Enrich English Malayalam Machine Translation Creating a General Russian Sentiment Lexicon TTS for Low Resource Languages: A Bangla Synthesizer GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds A Turkish Database for Psycholinguistic Studies Based on Frequency, Age of Acquisition, and Imageability Building Concept Graphs from Monolingual Dictionary Entries Detecting Optional Arguments of Verbs New Inflectional Lexicons and Training Corpora for Improved Morphosyntactic Annotation of Croatian and Serbian Classifying Out-of-vocabulary Terms in a Domain-Specific Social Media Corpus DeQue: A Lexicon of Complex Prepositions and Conjunctions in French A Japanese Chess Commentary Corpus Paraphrasing Out-of-Vocabulary Words with Word Embeddings and Semantic Lexicons for Low Resource Statistical Machine Translation Encoding Adjective Scales for Fine-grained Resources How Diachronic Text Corpora Affect Context based Retrieval of OOV Proper Names for Audio News Automatic Enrichment of WordNet with Common-Sense Knowledge Semantic Layer of the Valence Dictionary of Polish Walenty Ambiguity Diagnosis for Terms in Digital Humanities A General Framework for the Annotation of Causality Based on FrameNet LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon QUEMDISSE? Reported speech in Portuguese Extending Monolingual Semantic Textual Similarity Task to Multiple Cross-lingual Settings Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons The Hebrew FrameNet Project Addressing the MFS Bias in WSD systems A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions Italian VerbNet: A Construction-based Approach to Italian Verb Classification TGermaCorp -- A (Digital) Humanities Resource for (Computational) Linguistics LELIO: An Auto-Adaptative System to Acquire Domain Lexical Knowledge in Technical Texts Polarity Lexicon Building: to what Extent Is the Manual Effort Worth? Challenges of Adjective Mapping between plWordNet and Princeton WordNet Graded and Word-Sense-Disambiguation Decisions in Corpus Pattern Analysis: a Pilot Study Accessing and Elaborating Walenty - a Valence Dictionary of Polish - via Internet Browser CEPLEXicon ― A Lexicon of Child European Portuguese Al Qamus al Muhit, a Medieval Arabic Lexicon in LMF Evaluating Lexical Simplification and Vocabulary Knowledge for Learners of French: Possibilities of Using the FLELex Resource Automatically Generated Affective Norms of Abstractness, Arousal, Imageability and Valence for 350 000 German Lemmas Affective Lexicon Creation for the Greek Language A Large Rated Lexicon with French Medical Words Mapping Ontologies Using Ontologies: Cross-lingual Semantic Role Information Transfer Multi-prototype Chinese Character Embedding Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries Extracting Weighted Language Lexicons from Wikipedia Best of Both Worlds: Making Word Sense Embeddings Interpretable Evaluating Context Selection Strategies to Build Emotive Vector Space Models Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects Port4NooJ v3.0: Integrated Linguistic Resources for Portuguese NLP Managing Linguistic and Terminological Variation in a Medical Dialogue System Assessing the Potential of Metaphoricity of verbs using corpus data Filtering Wiktionary Triangles by Linear Mapping between Distributed Word Models A comparison of Named-Entity Disambiguation and Word Sense Disambiguation BosphorusSign: A Turkish Sign Language Recognition Corpus in Health and Finance Domains Gulf Arabic Linguistic Resource Building for Sentiment Analysis A Lexical Resource for the Identification of “Weak Words” in German Specification Documents PARSEME Survey on MWE Resources Generating a Large-Scale Entity Linking Dictionary from Wikipedia Link Structure and Article Text Refurbishing a Morphological Database for German ANEW+: Automatic Expansion and Validation of Affective Norms of Words Lexicons in Multiple Languages Recent Advances in Development of a Lexicon-Grammar of Polish: PolNet 3.0 Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing A Rule-based Shallow-transfer Machine Translation System for Scots and English Effect Functors for Opinion Inference PreMOn: a Lemon Extension for Exposing Predicate Models as Linked Data Multiword Expressions in Child Language A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora Database of Mandarin Neighborhood Statistics Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet Graph-Based Induction of Word Senses in Croatian SlangNet: A WordNet like resource for English Slang B2SG: a TOEFL-like Task for Portuguese A Multi-domain Corpus of Swedish Word Sense Annotation Wiktionnaire's Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary Distributional Thesauri for Information Retrieval and vice versa ALT Explored: Integrating an Online Dialectometric Tool and an Online Dialect Atlas VerbLexPor: a lexical resource with semantic roles for Portuguese A Multilingual Predicate Matrix Latin Vallex. A Treebank-based Semantic Valency Lexicon for Latin Sentiframes: A Resource for Verb-centered German Sentiment Inference Named Entity Resources - Overview and Outlook Merging Data Resources for Inflectional and Derivational Morphology in Czech
Linked Data	Semantic Links for Portuguese Publishing the Trove Newspaper Corpus Cross-lingual RDF Thesauri Interlinking Concepticon: A Resource for the Linking of Concept Lists LVF-lemon ― Towards a Linked Data Representation of "Les Verbes français" A Corpus of Images and Text in Online News WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles WTF-LOD - A New Resource for Large-Scale NER Evaluation Riddle Generation using Word Associations Challenges of Adjective Mapping between plWordNet and Princeton WordNet Relation- and Phrase-level Linking of FrameNet with Sar-graphs Crosswalking from CMDI to Dublin Core and MARC 21 Mapping Ontologies Using Ontologies: Cross-lingual Semantic Role Information Transfer Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries Generating a Large-Scale Entity Linking Dictionary from Wikipedia Link Structure and Article Text Lin|gu|is|tik: Building the Linguist's Pathway to Bibliographies, Libraries, Language Resources and Linked Open Data The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud PreMOn: a Lemon Extension for Exposing Predicate Models as Linked Data Open Data Vocabularies for Assigning Usage Rights to Data Resources from Translation Projects
LR Infrastructures and Architectures	Two Architectures for Parallel Processing of Huge Amounts of Text Trends in HLT Research: A Survey of LDC's Data Scholarship Program How to Address Smart Homes with a Social Robot? A Multi-modal Corpus of User Interactions with an Intelligent Environment Internet Argument Corpus 2.0: An SQL schema for Dialogic Social Media and the Corpora to go with it Publishing the Trove Newspaper Corpus Corpus Query Lingua Franca (CQLF) Providing a Catalogue of Language Resources for Commercial Users Corpus Analysis based on Structural Phenomena in Texts: Exploiting TEI Encoding for Linguistic Research Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data Collecting Language Resources for the Latvian e-Government Machine Translation Platform The Language Application Grid and Galaxy Learning from Within? Comparing PoS Tagging Approaches for Historical Text ELRA Activities and Services New Developments in the LRE Map Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ― Korean TimeML and Korean TimeBank The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources Analysis of English Spelling Errors in a Word-Typing Game A Large-scale Recipe and Meal Data Collection as Infrastructure for Food Research EstNLTK - NLP Toolkit for Estonian South African National Centre for Digital Language Resources A Document Repository for Social Media and Speech Conversations C4Corpus: Multilingual Web-size Corpus with Free License Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data Design and Development of the MERLIN Learner Corpus Platform The Hebrew FrameNet Project FLAT: Constructing a CLARIN Compatible Home for Language Resources The BAS Speech Data Repository CLARIAH in the Netherlands 
Crosswalking from CMDI to Dublin Core and MARC 21 LREC as a Graph: People and Resources in a Network Hypergraph Modelization of a Syntactically Annotated English Wikipedia Dump MADAD: A Readability Annotation Tool for Arabic Text Data Management Plans and Data Centers Fostering the Next Generation of European Language Technology: Recent Developments ― Emerging Initiatives ― Challenges and Opportunities UIMA-Based JCoRe 2.0 Goes GitHub and Maven Central ― State-of-the-Art Software Resource Engineering and Distribution of NLP Pipelines Facilitating Metadata Interoperability in CLARIN-DK The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud Towards a Language Service Infrastructure for Mobile Environments Global Open Resources and Information for Language and Linguistic Analysis (GORILLA) GATE-Time: Extraction of Temporal Expressions and Events corpus-tools.org: An Interoperable Generic Software Tool Set for Multi-layer Linguistic Corpora Open Data Vocabularies for Assigning Usage Rights to Data Resources from Translation Projects NorGramBank: A ‘Deep’ Treebank for Norwegian CLARIN-EL Web-based Annotation Tool
LR National/International Projects, Infrastructural/Policy issues	NLP Infrastructure for the Lithuanian Language CodE Alltag: A German-Language E-Mail Corpus LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages Providing a Catalogue of Language Resources for Commercial Users Hidden Resources ― Strategies to Acquire and Exploit Potential Spoken Language Resources in National Archives ELRA Activities and Services Language Resource Citation: the ISLRN Dissemination and Further Developments The ELRA License Wizard Review on the Existing Language Resources for Languages of France Selection Criteria for Low Resource Language Programs New Developments in the LRE Map Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities The IPR-cleared Corpus of Contemporary Written and Spoken Romanian Language SYN2015: Representative Corpus of Contemporary Written Czech Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project South African Language Resources: Phrase Chunking A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions Nederlab: Towards a Single Portal and Research Environment for Diachronic Dutch Text Corpora Fostering digital representation of EU regional and minority languages: the Digital Language Diversity Project CLARIAH in the Netherlands LREC as a Graph: People and Resources in a Network Port4NooJ v3.0: Integrated Linguistic Resources for Portuguese NLP Persian Proposition Bank Data Management Plans and Data Centers Fostering the Next Generation of European Language Technology: Recent Developments ― Emerging Initiatives ― Challenges and Opportunities Evaluating Interactive System Adaptation The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud The Public License Selector: Making Open Licensing Easier Graphical Annotation for Syntax-Semantics Mapping Government Domain Named Entity Recognition for South African Languages

M
Machine Translation, SpeechToSpeech Translation	Word Sense-Aware Machine Translation: Including Senses as Contextual Features for Improved Translation Models Manual and Automatic Paraphrases for MT Evaluation Using the TED Talks to Evaluate Spoken Post-editing of Machine Translation Privacy Issues in Online Machine Translation Services - European Perspective Phrase Level Segmentation and Labelling of Machine Translation Errors The United Nations Parallel Corpus v1.0 Building the Macedonian-Croatian Parallel Corpus Collecting Language Resources for the Latvian e-Government Machine Translation Platform SubCo: A Learner Translation Corpus of Human and Machine Subtitles Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities Syntax-based Multi-system Machine Translation Use of Domain-Specific Language Resources in Machine Translation A Bilingual Discourse Corpus and Its Applications Using a Cross-Language Information Retrieval System based on OHSUMED to Evaluate the Moses and KantanMT Statistical Machine Translation Systems CATaLog Online: Porting a Post-editing Tool to the Web The ILMT-s2s Corpus ― A Multimodal Interlingual Map Task Corpus Name Translation based on Fine-grained Named Entity Recognition in a Single Language Uzbek-English and Turkish-English Morpheme Alignment Corpora Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus Using SMT for OCR Error Correction of Historical Texts Lexical Resources to Enrich English Malayalam Machine Translation Building a Corpus of Errors and Quality in Machine Translation: Experiments on Error Impact Novel elicitation and annotation schemes for sentential and sub-sentential alignments of bitexts PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation Linguistically Inspired Language Model Augmentation for MT Paraphrasing Out-of-Vocabulary Words with Word Embeddings and Semantic Lexicons for Low Resource Statistical Machine Translation Parallel Sentence Extraction from Comparable Corpora with Neural Network Features Enhancing Access to Online Education: Quality Machine Translation of MOOC Content Exploiting a Large Strongly Comparable Corpus Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project PE2rr Corpus: Manual Error Annotation of Automatically Pre-annotated MT Post-edits Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons English-to-Japanese Translation vs. Dictation vs. Post-editing: Comparing Translation Modes in a Multilingual Setting Introducing the Asian Language Treebank (ALT) TweetMT: A Parallel Microblog Corpus Evaluating Translation Quality and CLIR Performance of Query Sessions Using Contextual Information for Machine Translation Evaluation That'll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models Evaluating the Impact of Light Post-Editing on Usability Bootstrapping a Hybrid MT System to a New Language Pair Evaluating Machine Translation in a Usage Scenario Using BabelNet to Improve OOV Coverage in SMT WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation on Rare Words Finding Alternative Translations in a Large Corpus of Movie Subtitle ASPEC: Asian Scientific Paper Excerpt Corpus Discontinuous Verb Phrases in Parsing and Machine Translation of English and German Domain Adaptation in MT Using Titles in Wikipedia as a Parallel Corpus: Resources and Evaluation Evaluation of the KIT Lecture Translation System Filtering Wiktionary Triangles by Linear Mapping between Distributed Word Models Tools and Guidelines for Principled Machine Translation Development ProphetMT: A Tree-based SMT-driven Controlled Language Authoring/Post-Editing Tool The Scielo Corpus: a Parallel Corpus of Scientific Publications for Biomedicine The Trials and Tribulations of Predicting Post-Editing Productivity A Rule-based Shallow-transfer Machine Translation System for Scots and English Applying the Cognitive Machine Translation Evaluation Approach to Arabic A Reading Comprehension Corpus for Machine Translation Evaluation Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor's Love Affair IRIS: English-Irish Machine Translation System Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests Building A Case-based Semantic English-Chinese Parallel Treebank OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles Towards producing bilingual lexica from monolingual corpora First Steps Towards Coverage-Based Sentence Alignment
Metadata	The United Nations Parallel Corpus v1.0 Review on the Existing Language Resources for Languages of France New Developments in the LRE Map A Language Resource of German Errors Written by Children with Dyslexia The IPR-cleared Corpus of Contemporary Written and Spoken Romanian Language Compilation of an Arabic Children’s Corpus The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources FLAT: Constructing a CLARIN Compatible Home for Language Resources CLARIAH in the Netherlands Crosswalking from CMDI to Dublin Core and MARC 21 Automatically Generated Affective Norms of Abstractness, Arousal, Imageability and Valence for 350 000 German Lemmas LREC as a Graph: People and Resources in a Network A Lexical Resource for the Identification of “Weak Words” in German Specification Documents PARSEME Survey on MWE Resources Facilitating Metadata Interoperability in CLARIN-DK The Royal Society Corpus: From Uncharted Data to Corpus Open Data Vocabularies for Assigning Usage Rights to Data Resources from Translation Projects
Morphology	A Finite-state Morphological Analyser for Tuvan Orthographic and Morphological Correspondences between Related Slavic Languages as a Base for Modeling of Mutual Intelligibility Remote Elicitation of Inflectional Paradigms to Seed Morphological Analysis in Low-Resource Languages A New Integrated Open-source Morphological Analyzer for Hungarian A Proposal for a Part-of-Speech Tagset for the Albanian Language Very-large Scale Parsing and Normalization of Wiktionary Morphological Paradigms Tēzaurs.lv: the Largest Open Lexical Database for Latvian A Finite-State Morphological Analyser for Sindhi Deriving Morphological Analyzers from Example Inflections Morphological Analysis of Sahidic Coptic for Automatic Glossing The on-line version of Grammatical Dictionary of Polish Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory Using a Small Lexicon with CRFs Confidence Measure to Improve POS Tagging Accuracy Evaluating the Noisy Channel Model for the Normalization of Historical Texts: Basque, Spanish and Slovene Farasa: A New Fast and Accurate Arabic Word Segmenter A Novel Evaluation Method for Morphological Segmentation A Morphological Lexicon of Esperanto with Morpheme Frequencies How does Dictionary Size Influence Performance of Vietnamese Word Segmentation? 
Giving Lexical Resources a Second Life: Démonette, a Multi-sourced Morpho-semantic Network for French Universal Dependencies v1: A Multilingual Treebank Collection Syntactic Analysis of Phrasal Compounds in Corpora: a Challenge for NLP Tools Al Qamus al Muhit, a Medieval Arabic Lexicon in LMF Bilingual Lexicon Extraction at the Morpheme Level Using Distributional Analysis Lemmatization and Morphological Tagging in German and Latin: A Comparison and a Survey of the State-of-the-art Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic DALILA: The Dialectal Arabic Linguistic Learning Assistant Refurbishing a Morphological Database for German A Large Scale Corpus of Gulf Arabic A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora Exploiting Arabic Diacritization for High Quality Automatic Annotation Rapid Development of Morphological Analyzers for Typologically Diverse Languages A Neural Lemmatizer for Bengali Merging Data Resources for Inflectional and Derivational Morphology in Czech
Multilinguality	Orthographic and Morphological Correspondences between Related Slavic Languages as a Base for Modeling of Mutual Intelligibility Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality Axolotl: a Web Accessible Parallel Corpus for Spanish-Nahuatl Very-large Scale Parsing and Normalization of Wiktionary Morphological Paradigms A Computational Perspective on the Romanian Dialects A Turkish-German Code-Switching Corpus Introducing the LCC Metaphor Datasets Comparing Speech and Text Classification on ICNALE Modelling a Parallel Corpus of French and French Belgian Sign Language The United Nations Parallel Corpus v1.0 Building the Macedonian-Croatian Parallel Corpus Two Years of Aranea: Increasing Counts and Tuning the Pipeline Universal Dependencies for Japanese Cross-lingual RDF Thesauri Interlinking Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations SemRelData ― Multilingual Contextual Annotation of Semantic Relations between Nominals: Dataset and Guidelines Speech Synthesis of Code-Mixed Text Crowdsourcing Ontology Lexicons CATaLog Online: Porting a Post-editing Tool to the Web Sentiment Lexicons for Arabic Social Media The IFCASL Corpus of French and German Non-native and Native Read Speech Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages Uzbek-English and Turkish-English Morpheme Alignment Corpora Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus PROMETHEUS: A Corpus of Proverbs Annotated with Metaphors A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection WIKIPARQ: A Tabulated Wikipedia Resource Using the Parquet Format South African National Centre for Digital Language Resources C4Corpus: Multilingual Web-size Corpus with Free License Cognitively Motivated Distributional Representations of Meaning Extending Monolingual Semantic Textual Similarity Task to Multiple Cross-lingual Settings 
Cross-lingual Linking of Multi-word Entities and their corresponding Acronyms EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis English-to-Japanese Translation vs. Dictation vs. Post-editing: Comparing Translation Modes in a Multilingual Setting The COPLE2 corpus: a learner corpus for Portuguese Collecting Resources in Sub-Saharan African Languages for Automatic Speech Recognition: a Case Study of Wolof Challenges of Adjective Mapping between plWordNet and Princeton WordNet Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions MEANTIME, the NewsReader Multilingual Event and Time Corpus Evaluating Translation Quality and CLIR Performance of Query Sessions Standard Test Collection for English-Persian Cross-Lingual Word Sense Disambiguation European Union Language Resources in Sketch Engine FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies Evaluating Machine Translation in a Usage Scenario Finding Alternative Translations in a Large Corpus of Movie Subtitle ASPEC: Asian Scientific Paper Excerpt Corpus Bilingual Lexicon Extraction at the Morpheme Level Using Distributional Analysis Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models A Large-Scale Multilingual Disambiguation of Glosses MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP Comparing the Level of Code-Switching in Corpora Creation of comparable corpora for English-{Urdu, Arabic, Persian} Fostering the Next Generation of European Language Technology: Recent Developments ― Emerging Initiatives ― Challenges and Opportunities Parallel Global Voices: a Collection of Multilingual Corpora with Citizen Media Stories The Scielo Corpus: a Parallel Corpus of Scientific Publications for Biomedicine Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German Applying the Cognitive Machine Translation Evaluation Approach to Arabic Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor's Love Affair UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing Coreference in Prague Czech-English Dependency Treebank IRIS: English-Irish Machine Translation System Functions of Code-Switching in Tweets: An Annotation Framework and Some Initial Experiments OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles A Multilingual Predicate Matrix Towards producing bilingual lexica from monolingual corpora
Multimedia Document Processing	SubCo: A Learner Translation Corpus of Human and Machine Subtitles A Corpus of Images and Text in Online News Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context A Japanese Chess Commentary Corpus Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles 1 Million Captioned Dutch Newspaper Images The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents Developing a Dataset for Evaluating Approaches for Document Expansion with Images ArchiMob - A Corpus of Spoken Swiss German
MultiWord Expressions & Collocations	Rule-based Automatic Multi-word Term Extraction and Lemmatization Example-based Acquisition of Fine-grained Collocation Resources MWEs in Treebanks: From Survey to Guidelines Multiword Expressions Dataset for Indian Languages An Empirical Study of Arabic Formulaic Sequence Extraction Methods A lexicon of perception for the identification of synaesthetic metaphors in corpora Compasses, Magnets, Water Microscopes: Annotation of Terminology in a Diachronic Corpus of Scientific Texts Happy Accident: A Sentiment Composition Lexicon for Opposing Polarity Phrases mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing TermoPL - a Flexible Tool for Terminology Extraction GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds DeQue: A Lexicon of Complex Prepositions and Conjunctions in French Construction of an English Dependency Corpus incorporating Compound Function Words Cross-lingual Linking of Multi-word Entities and their corresponding Acronyms Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions Forecasting Emerging Trends from Scientific Literature Comprehensive and Consistent PropBank Light Verb Annotation Inconsistency Detection in Semantic Annotation Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects PARSEME Survey on MWE Resources Recent Advances in Development of a Lexicon-Grammar of Polish: PolNet 3.0 Multiword Expressions in Child Language

N
Named Entity Recognition	QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages Complementarity, F-score, and NLP Evaluation An Open Corpus for Named Entity Recognition in Historic Newspapers Domain Adaptation for Named Entity Recognition Using CRFs Using Word Embeddings to Translate Named Entities Annotating and Detecting Medical Events in Clinical Notes Name Translation based on Fine-grained Named Entity Recognition in a Single Language SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking WTF-LOD - A New Resource for Large-Scale NER Evaluation Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data Cross-lingual Linking of Multi-word Entities and their corresponding Acronyms Crowdsourced Corpus with Entity Salience Annotations ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain Identification of Drug-Related Medical Conditions in Social Media Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik Summ-it++: an Enriched Version of the Summ-it Corpus A Sequence Model Approach to Relation Extraction in Portuguese The ACL RD-TEC 2.0: A Language Resource for Evaluating Term Extraction and Entity Recognition Methods Annotating Named Entities in Consumer Health Questions The hunvec framework for NN-CRF-based sequential tagging A Regional News Corpora for Contextualized Entity Discovery and Linking Context-enhanced Adaptive Entity Linking DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus Named Entity Recognition on Twitter for Turkish using Semi-supervised Learning with Word Embeddings Parallel Chinese-English Entities, Relations and Events Corpora Government Domain Named Entity Recognition for South African Languages Named Entity Resources - Overview and Outlook Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level
Natural Language Generation	The REAL Corpus: A Crowd-Sourced Corpus of Human Generated and Evaluated Spatial References to Real-World Urban Scenes The Methodius Corpus of Rhetorical Discourse Structures and Generated Texts PentoRef: A Corpus of Spoken References in Task-oriented Dialogues Automatic Corpus Extension for Data-driven Natural Language Generation Cross-validating Image Description Datasets and Evaluation Metrics Towards producing bilingual lexica from monolingual corpora

O
Ontologies	Ecological Gestures for HRI: the GEE Corpus Semi-automatic Parsing for Web Knowledge Extraction through Semantic Annotation Metonymy Analysis Using Associative Relations between Words Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory A Taxonomy of Spanish Nouns, a Statistical Algorithm to Generate it and its Implementation in Open Source Code Annotating Logical Forms for EHR Questions Domain Ontology Learning Enhanced by Optimized Relation Instance in DBpedia A Framework for Cross-lingual/Node-wise Alignment of Lexical-Semantic Resources Issues and Challenges in Annotating Urdu Action Verbs on the IMAGACT4ALL Platform Towards a Linguistic Ontology with an Emphasis on Reasoning and Knowledge Reuse Constructing a Norwegian Academic Wordlist Mapping Ontologies Using Ontologies: Cross-lingual Semantic Role Information Transfer Extracting Structured Scholarly Information from the Machine Translation Literature Managing Linguistic and Terminological Variation in a Medical Dialogue System The Event and Implied Situation Ontology (ESO): Application and Evaluation Semantic Relation Extraction with Semantic Patterns Experiment on Radiology Reports Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German PreMOn: a Lemon Extension for Exposing Predicate Models as Linked Data Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet Automatic Biomedical Term Polysemy Detection
Opinion Mining / Sentiment Analysis	Annotating Sentiment and Irony in the Online Italian Political Debate on #labuonascuola NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic DRANZIERA: An Evaluation Protocol For Multi-Domain Opinion Mining OPFI: A Tool for Opinion Finding in Polish SatiricLR: a Language Resource of Satirical News Articles Evaluating Lexical Similarity to build Sentiment Similarity Using Data Mining Techniques for Sentiment Shifter Identification Challenges of Evaluating Sentiment Analysis Tools on Social Media EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis A Dataset for Detecting Stance in Tweets Sentiment Lexicons for Arabic Social Media Happy Accident: A Sentiment Composition Lexicon for Opposing Polarity Phrases Detecting Implicit Expressions of Affect from Text using Semantic Knowledge on Common Concept Properties Reliable Baselines for Sentiment Analysis in Resource-Limited Languages: The Serbian Movie Review Dataset Creating a General Russian Sentiment Lexicon A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings Encoding Adjective Scales for Fine-grained Resources Emotion Analysis on Twitter: The Hidden Challenge EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis A Language Independent Method for Generating Large Scale Polarity Lexicons ANTUSD: A Large Chinese Sentiment Dictionary Polarity Lexicon Building: to what Extent Is the Manual Effort Worth? GRaSP: A Multilayered Annotation Scheme for Perspectives Emotion Corpus Construction Based on Selection from Hashtags SCARE ― The Sentiment Corpus of App Reviews with Fine-grained Annotations in German Exploring the Realization of Irony in Twitter Data Integration of Lexical and Semantic Knowledge for Sentiment Analysis in SMS Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis Sentiment Analysis in Social Networks through Topic modeling Aspect based Sentiment Analysis in Hindi: Resource Creation and Evaluation Gulf Arabic Linguistic Resource Building for Sentiment Analysis PARC 3.0: A Corpus of Attribution Relations ANEW+: Automatic Expansion and Validation of Affective Norms of Words Lexicons in Multiple Languages A Hungarian Sentiment Corpus Manually Annotated at Aspect Level Effect Functors for Opinion Inference Specialising Paragraph Vectors for Text Polarity Detection Sentiframes: A Resource for Verb-centered German Sentiment Inference
Optical Character Recognition	An Open Corpus for Named Entity Recognition in Historic Newspapers Measuring Lexical Quality of a Historical Finnish Newspaper Collection ― Analysis of Garbled OCR Data with Basic Language Technology Tools and Means Using SMT for OCR Error Correction of Historical Texts Training & Quality Assessment of an Optical Character Recognition Model for Northern Haida OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus
Other	Two Architectures for Parallel Processing of Huge Amounts of Text Trends in HLT Research: A Survey of LDC's Data Scholarship Program “Who was Pietro Badoglio?” Towards a QA system for Italian History Coh-Metrix-Esp: A Complexity Analysis Tool for Documents Written in Spanish Metonymy Analysis Using Associative Relations between Words A Finite-State Morphological Analyser for Sindhi Discriminative Analysis of Linguistic Features for Typological Study Privacy Issues in Online Machine Translation Services - European Perspective The ACQDIV Database: Min(d)ing the Ambient Language Building Tempo-HindiWordNet: A resource for effective temporal information access in Hindi Review on the Existing Language Resources for Languages of France Corpus for Children’s Writing with Enhanced Output for Specific Spelling Patterns (2nd and 3rd Grade) Unsupervised Ranked Cross-Lingual Lexical Substitution for Low-Resource Languages Wikipedia Titles As Noun Tag Predictors SYN2015: Representative Corpus of Contemporary Written Czech Automatic Anomaly Detection for Dysarthria across Two Speech Styles: Read vs Spontaneous Speech User, who art thou? User Profiling for Oral Corpus Platforms Curation of Dutch Regional Dictionaries Semi-automatically Alignment of Predicates between Speech and OntoNotes data Wikification for Scriptio Continua Adding Semantic Relations to a Large-Coverage Connective Lexicon of German Crossmodal Network-Based Distributional Semantic Models Detecting Word Usage Errors in Chinese Sentences for Learning Chinese as a Foreign Language EstNLTK - NLP Toolkit for Estonian The OFAI Multi-Modal Task Description Corpus A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults Fine-Grained Chinese Discourse Relation Labelling Automatic identification of Mild Cognitive Impairment through the analysis of Italian spontaneous speech productions Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition Parallel Discourse Annotations on a Corpus of Short Texts Fostering digital representation of EU regional and minority languages: the Digital Language Diversity Project Features for Generic Corpus Querying The TYPALOC Corpus: A Collection of Various Dysarthric Speech Recordings in Read and Spontaneous Styles A Large Rated Lexicon with French Medical Words IMS HotCoref DE: A Data-driven Co-reference Resolver for German Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects Laughter in French Spontaneous Conversational Dialogs Acquiring Opposition Relations among Italian Verb Senses using Crowdsourcing A comparison of Named-Entity Disambiguation and Word Sense Disambiguation Universal Dependencies for Persian Modeling Language Change in Historical Corpora: The Case of Portuguese The CIRDO Corpus: Comprehensive Audio/Video Database of Domestic Falls of Elderly People Interoperability of Annotation Schemes: Using the Pepper Framework to Display AWA Documents in the ANNIS Interface SuperCAT: The (New and Improved) Corpus Analysis Toolkit SPLIT: Smart Preprocessing (Quasi) Language Independent Tool A Verbal and Gestural Corpus of Story Retellings to an Expressive Embodied Virtual Character Word Segmentation for Akkadian Cuneiform Survey of Conversational Behavior: Towards the Design of a Balanced Corpus of Everyday Japanese Conversation Yes, We Care! Results of the Ethics and Natural Language Processing Surveys NNBlocks: A Deep Learning Framework for Computational Linguistics Neural Network Models The Public License Selector: Making Open Licensing Easier Named Entity Recognition on Twitter for Turkish using Semi-supervised Learning with Word Embeddings Deep Learning of Audio and Language Features for Humor Prediction Improving the Annotation of Sentence Specificity ALT Explored: Integrating an Online Dialectometric Tool and an Online Dialect Atlas Detecting Expressions of Blame or Praise in Text CommonCOW: Massively Huge Web Corpora from CommonCrawl Data and a Method to Distribute them Freely under Restrictive EU Copyright Laws Temporal Information Annotation: Crowd vs. Experts EDISON: Feature Extraction for NLP, Simplified Entity Linking with a Paraphrase Flavor Accurate Deep Syntactic Parsing of Graphs: The Case of French Enriching a Portuguese WordNet using Synonyms from a Monolingual Dictionary An Empirical Exploration of Moral Foundations Theory in Partisan News Sources Embedding Open-domain Common-sense Knowledge from Text OPFI: A Tool for Opinion Finding in Polish Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation The Uppsala Corpus of Student Writings: Corpus Creation, Annotation, and Analysis Evaluating Lexical Similarity to build Sentiment Similarity Annotating and Detecting Medical Events in Clinical Notes Multiword Expressions Dataset for Indian Languages Constraint-Based Bilingual Lexicon Induction for Closely Related Languages The ELRA License Wizard CASSAurus: A Resource of Simpler Spanish Synonyms CoRuSS - a New Prosodically Annotated Corpus of Russian Spontaneous Speech Evaluating the Noisy Channel Model for the Normalization of Historical Texts: Basque, Spanish and Slovene Farasa: A New Fast and Accurate Arabic Word Segmenter Using a Cross-Language Information Retrieval System based on OHSUMED to Evaluate the Moses and KantanMT Statistical Machine Translation Systems LibN3L: A Lightweight Package for Neural NLP Extractive Summarization under Strict Length Constraints DeQue: A Lexicon of Complex Prepositions and Conjunctions in French A Singing Voice Database in Basque for Statistical Singing Synthesis of Bertsolaritza ANTUSD: A Large Chinese Sentiment Dictionary Universal Dependencies for Norwegian Can Tweets Predict TV Ratings? Web Chat Conversations from Contact Centers: a Descriptive Study MEANTIME, the NewsReader Multilingual Event and Time Corpus Could Speaker, Gender or Age Awareness be beneficial in Speech-based Emotion Recognition? CItA: an L1 Italian Learners Corpus to Study the Development of Writing Competence Automatic Recognition of Linguistic Replacements in Text Series Generated from Keystroke Logs SCARE ― The Sentiment Corpus of App Reviews with Fine-grained Annotations in German Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models Domain Adaptation in MT Using Titles in Wikipedia as a Parallel Corpus: Resources and Evaluation A Dutch Dysarthric Speech Database for Individualized Speech Therapy Research Neural Scoring Function for MST Parser TEITOK: Text-Faithful Annotated Corpora TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research Generating Task-Pertinent sorted Error Lists for Speech Recognition Using lexical and Dependency Features to Disambiguate Discourse Connectives in Hindi Parallel Global Voices: a Collection of Multilingual Corpora with Citizen Media Stories TermITH-Eval: a French Standard-Based Resource for Keyphrase Extraction Evaluation French Learners Audio Corpus of German Speech (FLACGS) Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it? Wiktionnaire's Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary A Neural Lemmatizer for Bengali

P
Parsing	Accurate Deep Syntactic Parsing of Graphs: The Case of French Punctuation Prediction for Unsegmented Transcript Based on Word Vector Semi-automatic Parsing for Web Knowledge Extraction through Semantic Annotation Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic Phrase Level Segmentation and Labelling of Machine Translation Errors Universal Dependencies for Japanese A Dependency Treebank of the Chinese Buddhist Canon Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing Language Resource Addition Strategies for Raw Text Parsing E-TIPSY: Search Query Corpus Annotated with Entities, Term Importance, POS Tags, and Syntactic Parses 4Couv: A New Treebank for French AfriBooms: An Online Treebank for Afrikaans Differentia compositionem facit. A Slower-Paced and Reliable Parser for Latin CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese Estonian Dependency Treebank: from Constraint Grammar tagset to Universal Dependencies Construction of an English Dependency Corpus incorporating Compound Function Words South African Language Resources: Phrase Chunking Syntactic Analysis of Phrasal Compounds in Corpora: a Challenge for NLP Tools EasyTree: A Graphical Tool for Dependency Tree Annotation Neural Scoring Function for MST Parser Extracting Interlinear Glossed Text from LaTeX Documents Cross-lingual and Supervised Models for Morphosyntactic Annotation: a Comparison on Romanian Hard Time Parsing Questions: Building a QuestionBank for French Using lexical and Dependency Features to Disambiguate Discourse Connectives in Hindi Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks Towards Building Semantic Role Labeler for Indian Languages Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing Towards Comparability of Linguistic Graph Banks for Semantic Parsing Czech Legal Text Treebank 1.0 NorGramBank: A ‘Deep’ Treebank for Norwegian Government Domain Named Entity Recognition for South African Languages
Part-of-Speech Tagging	A Proposal for a Part-of-Speech Tagset for the Albanian Language Morphological Analysis of Sahidic Coptic for Automatic Glossing Using a Small Lexicon with CRFs Confidence Measure to Improve POS Tagging Accuracy Two Years of Aranea: Increasing Counts and Tuning the Pipeline Learning from Within? Comparing PoS Tagging Approaches for Historical Text Improving POS Tagging of German Learner Language in a Reading Comprehension Scenario Wikipedia Titles As Noun Tag Predictors POS-tagging of Historical Dutch Language Resource Addition Strategies for Raw Text Parsing New Inflectional Lexicons and Training Corpora for Improved Morphosyntactic Annotation of Croatian and Serbian FOLK-Gold ― A Gold Standard for Part-of-Speech-Tagging of Spoken German TGermaCorp -- A (Digital) Humanities Resource for (Computational) Linguistics Features for Generic Corpus Querying Constructing a Norwegian Academic Wordlist Fast and Robust POS tagger for Arabic Tweets Using Agreement-based Bootstrapping Lemmatization and Morphological Tagging in German and Latin: A Comparison and a Survey of the State-of-the-art TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields Cross-lingual and Supervised Models for Morphosyntactic Annotation: a Comparison on Romanian If You Even Don't Have a Bit of Bible: Learning Delexicalized POS Taggers Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic The hunvec framework for NN-CRF-based sequential tagging Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of Slovene Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German A Large Scale Corpus of Gulf Arabic The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing Exploiting Arabic Diacritization for High Quality Automatic Annotation Rapid Development of Morphological Analyzers for Typologically Diverse Languages FlexTag: A Highly Flexible PoS Tagging Framework
Person Identification	Comparing Speech and Text Classification on ICNALE Arabic to English Person Name Transliteration using Twitter Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context FABIOLE, a Speech Database for Forensic Speaker Comparison Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015 Dialogue System Characterisation by Back-channelling Patterns Extracted from Dialogue Corpus “He Said She Said” ― a Male/Female Corpus of Polish Predicting Author Age from Weibo Microblog Posts
Phonetic Databases, Phonology	New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification Phonetic Inventory for an Arabic Speech Corpus Defining and Counting Phonological Classes in Cross-linguistic Segment Databases Phoneme Alignment Using the Information on Phonological Processes in Continuous Speech The IFCASL Corpus of French and German Non-native and Native Read Speech The BAS Speech Data Repository Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik Vocal Pathologies Detection and Mispronounced Phonemes Identification: Case of Arabic Continuous Speech Polish Rhythmic Database ― New Resources for Speech Timing and Rhythm Analysis
Profiling	Building a Dataset for Possessions Identification in Text Age and Gender Prediction on Health Forum Data SweLL on the rise: Swedish Learner Language corpus for European Reference Level studies A Semi-Supervised Approach for Gender Identification TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling Predicting Author Age from Weibo Microblog Posts
Prosody	Assessing the Prosody of Non-Native Speakers of English: Measures and Feature Sets AMISCO: The Austrian German Multi-Sensor Corpus Introducing the SEA_AP: an Enhanced Tool for Automatic Prosodic Analysis Metrical Annotation of a Large Corpus of Spanish Sonnets: Representation, Scansion and Evaluation Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis On the Use of a Serious Game for Recording a Speech Corpus of People with Intellectual Disabilities Polish Rhythmic Database ― New Resources for Speech Timing and Rhythm Analysis

Q
Question Answering	Event Coreference Resolution with Multi-Pass Sieves “Who was Pietro Badoglio?” Towards a QA system for Italian History The Query of Everything: Developing Open-Domain, Natural-Language Queries for BOLT Information Retrieval Question-Answering with Logic Specific to Video Games Annotating Named Entities in Consumer Health Questions Annotating Topic Development in Information Seeking Queries Markov Logic Networks for Text Mining: A Qualitative and Empirical Comparison with Integer Linear Programming

S
Semantics	A Gold Standard for Scalar Adjectives The Gavagai Living Lexicon VerbCROcean: A Repository of Fine-Grained Semantic Verb Relations for Croatian VoxML: A Visualization Modeling Language Example-based Acquisition of Fine-grained Collocation Resources Embedding Open-domain Common-sense Knowledge from Text Combining Semantic Annotation of Word Sense & Semantic Roles: A Novel Annotation Scheme for VerbNet Roles on German Language Data SemAligner: A Method and Tool for Aligning Chunks with Semantic Relation Types and Semantic Similarity Scores Introducing the LCC Metaphor Datasets DT-Neg: Tutorial Dialogues Annotated for Negation Scope and Focus in Context Medical Concept Embeddings via Labeled Background Corpora Enriching TimeBank: Towards a more precise annotation of temporal relations in a text Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation A Taxonomy of Spanish Nouns, a Statistical Algorithm to Generate it and its Implementation in Open Source Code Spanish Word Vectors from Wikipedia Synset Ranking of Hindi WordNet Neural Embedding Language Models in Semantic Clustering of Web Search Results SemRelData ― Multilingual Contextual Annotation of Semantic Relations between Nominals: Dataset and Guidelines Using Data Mining Techniques for Sentiment Shifter Identification Question-Answering with Logic Specific to Video Games Concepticon: A Resource for the Linking of Concept Lists Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs Annotating Logical Forms for EHR Questions Exploitation of Co-reference in Distributional Semantics A Framework for Cross-lingual/Node-wise Alignment of Lexical-Semantic Resources The VU Sound Corpus: Adding More Fine-grained Annotations to the Freesound Database A sense-based lexicon of count and mass expressions: The Bochum English Countability Lexicon A lexicon of perception for the identification of synaesthetic metaphors in corpora A Corpus of Literal and Idiomatic Uses of German Infinitive-Verb Compounds A Dataset for Detecting Stance in Tweets Semi-automatically Alignment of Predicates between Speech and OntoNotes data Legal Text Interpretation: Identifying Hohfeldian Relations from Text Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing Crossmodal Network-Based Distributional Semantic Models A Semantically Compositional Annotation Scheme for Time Normalization PROMETHEUS: A Corpus of Proverbs Annotated with Metaphors Corpus Annotation within the French FrameNet: a Domain-by-domain Methodology GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds The SemDaX Corpus ― Sense Annotations with Scalable Sense Inventories Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility Building Concept Graphs from Monolingual Dictionary Entries CORILSE: a Spanish Sign Language Repository for Linguistic Analysis PersonaBank: A Corpus of Personal Narratives and Their Story Intention Graphs Paraphrasing Out-of-Vocabulary Words with Word Embeddings and Semantic Lexicons for Low Resource Statistical Machine Translation Semantic Layer of the Valence Dictionary of Polish Walenty Riddle Generation using Word Associations A General Framework for the Annotation of Causality Based on FrameNet Cognitively Motivated Distributional Representations of Meaning Annotating Temporally-Anchored Spatial Knowledge on Top of OntoNotes Semantic Roles Extending Monolingual Semantic Textual Similarity Task to Multiple Cross-lingual Settings The Hebrew FrameNet Project Addressing the MFS Bias in WSD systems Argument Mining: the Bottleneck of Knowledge and Language Resources Italian VerbNet: A Construction-based Approach to Italian Verb Classification Nine Features in a Random Forest to Learn Taxonomical Semantic Relations metaTED: a Corpus of Metadiscourse for Spoken Language ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain Issues and Challenges in Annotating Urdu Action Verbs on the IMAGACT4ALL Platform SpaceRef: A corpus of street-level geographic descriptions Visualisation and Exploration of High-Dimensional Distributional Features in Lexical Semantic Classification Al Qamus al Muhit, a Medieval Arabic Lexicon in LMF Automatically Generated Affective Norms of Abstractness, Arousal, Imageability and Valence for 350 000 German Lemmas A Large Rated Lexicon with French Medical Words Comprehensive and Consistent PropBank Light Verb Annotation Inconsistency Detection in Semantic Annotation Datasets for Aspect-Based Sentiment Analysis in French DART: a Dataset of Arguments and their Relations on Twitter Multi-prototype Chinese Character Embedding Bilingual Lexicon Extraction at the Morpheme Level Using Distributional Analysis Best of Both Worlds: Making Word Sense Embeddings Interpretable Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis Can Topic Modelling benefit from Word Sense Information? Resources for building applications with Dependency Minimal Recursion Semantics Typology of Adjectives Benchmark for Compositional Distributional Models Assessing the Potential of Metaphoricity of verbs using corpus data Persian Proposition Bank Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks Semantic Relation Extraction with Semantic Patterns Experiment on Radiology Reports Typed Entity and Relation Annotation on Computer Science Papers EVALution-MAN: A Chinese Dataset for the Training and Evaluation of DSMs Towards Building Semantic Role Labeler for Indian Languages Effect Functors for Opinion Inference A Dataset for Open Event Extraction in English A Framework for Automatic Acquisition of Croatian and Serbian Verb Aspect from Corpora Semantic Annotation of the ACL Anthology Corpus for the Automatic Analysis of Scientific Literature Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet Graph-Based Induction of Word Senses in Croatian Towards Comparability of Linguistic Graph Banks for Semantic Parsing A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge GATE-Time: Extraction of Temporal Expressions and Events Building A Case-based Semantic English-Chinese Parallel Treebank VerbLexPor: a lexical resource with semantic roles for Portuguese A Multilingual Predicate Matrix Latin Vallex. A Treebank-based Semantic Valency Lexicon for Latin Merging Data Resources for Inflectional and Derivational Morphology in Czech
Semantic Web	Semi-automatic Parsing for Web Knowledge Extraction through Semantic Annotation Concepticon: A Resource for the Linking of Concept Lists Towards a Linguistic Ontology with an Emphasis on Reasoning and Knowledge Reuse Context-enhanced Adaptive Entity Linking DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job
Sign Language Recognition/Generation	A Web Tool for Building Parallel Corpora of Spoken and Sign Languages Modelling a Parallel Corpus of French and French Belgian Sign Language CORILSE: a Spanish Sign Language Repository for Linguistic Analysis Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data Finding Recurrent Features of Image Schema Gestures: the FIGURE corpus BosphorusSign: A Turkish Sign Language Recognition Corpus in Health and Finance Domains Detection of Major ASL Sign Types in Continuous Signing For ASL Recognition
Social Media Processing	Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource Annotating Sentiment and Irony in the Online Italian Political Debate on #labuonascuola A Corpus of Wikipedia Discussions: Over the Years, with Topic, Power and Gender Labels NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic Building a Dataset for Possessions Identification in Text CodE Alltag: A German-Language E-Mail Corpus A Turkish-German Code-Switching Corpus What’s the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities Homing in on Twitter Users: Evaluating an Enhanced Geoparser for User Profile Locations Speech Synthesis of Code-Mixed Text Challenges of Evaluating Sentiment Analysis Tools on Social Media A Dataset for Detecting Stance in Tweets Sentiment Lexicons for Arabic Social Media An Arabic-Moroccan Darija Code-Switched Corpus Classifying Out-of-vocabulary Terms in a Domain-Specific Social Media Corpus A Document Repository for Social Media and Speech Conversations A Language Independent Method for Generating Large Scale Polarity Lexicons Corpus for Customer Purchase Behavior Prediction in Social Media TweetMT: A Parallel Microblog Corpus Can Tweets Predict TV Ratings? Web Chat Conversations from Contact Centers: a Descriptive Study Multilevel Annotation of Agreement and Disagreement in Italian News Blogs Exploring the Realization of Irony in Twitter Data Fast and Robust POS tagger for Arabic Tweets Using Agreement-based Bootstrapping DART: a Dataset of Arguments and their Relations on Twitter Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling Sentiment Analysis in Social Networks through Topic modeling Analyzing Time Series Changes of Correlation between Market Share and Concerns on Companies measured through Search Engine Suggests Segmenting Hashtags using Automatically Created Training Data What does this Emoji Mean? A Vector Space Skip-Gram Model for Twitter Emojis A Hungarian Sentiment Corpus Manually Annotated at Aspect Level Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions Named Entity Recognition on Twitter for Turkish using Semi-supervised Learning with Word Embeddings Exploring Language Variation Across Europe - A Web-based Tool for Computational Sociolinguistics Monolingual Social Media Datasets for Detecting Contradiction and Entailment Functions of Code-Switching in Tweets: An Annotation Framework and Some Initial Experiments Predicting Author Age from Weibo Microblog Posts Effects of Sampling on Twitter Trend Detection PotTS: The Potsdam Twitter Sentiment Corpus FlexTag: A Highly Flexible PoS Tagging Framework Automatic Classification of Tweets for Analyzing Communication Behavior of Museums
Speech Recognition/Understanding	Optimizing Computer-Assisted Transcription Quality with Iterative User Interfaces Punctuation Prediction for Unsegmented Transcript Based on Word Vector The DIRHA Portuguese Corpus: A Comparison of Home Automation Command Detection and Recognition in Simulated and Real Data. Enhanced CORILGA: Introducing the Automatic Phonetic Alignment Tool for Continuous Speech Using the TED Talks to Evaluate Spoken Post-editing of Machine Translation Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification Assessing the Prosody of Non-Native Speakers of English: Measures and Feature Sets AIMU: Actionable Items for Meeting Understanding A Comparative Analysis of Crowdsourced Natural Language Corpora for Spoken Dialog Systems How Diachronic Text Corpora Affect Context based Retrieval of OOV Proper Names for Audio News Introducing the SEA_AP: an Enhanced Tool for Automatic Prosodic Analysis Syllable based DNN-HMM Cantonese Speech to Text System Palabras: Crowdsourcing Transcriptions of L2 Speech Collecting Resources in Sub-Saharan African Languages for Automatic Speech Recognition: a Case Study of Wolof BulPhonC: Bulgarian Speech Corpus for the Development of ASR Technology Designing a Speech Corpus for the Development and Evaluation of Dictation Systems in Latvian SCALE: A Scalable Language Engineering Toolkit The LetsRead Corpus of Portuguese Children Reading Aloud for Performance Evaluation Mining the Spoken Wikipedia for Speech Data and Beyond A Corpus of Read and Spontaneous Upper Saxon German Speech for ASR Evaluation Parallel Speech Corpora of Japanese Dialects Generating Task-Pertinent sorted Error Lists for Speech Recognition The SI TEDx-UM speech database: a new Slovenian Spoken Language Resource AppDialogue: Multi-App Dialogues for Intelligent Assistants Speech Corpus Spoken by Young-old, Old-old and Oldest-old Japanese Joining-in-type Humanoid Robot Assisted Language Learning System
Speech Resource/Database	Endangered Language Documentation: Bootstrapping a Chatino Speech Corpus, Forced Aligner, ASR Falling silent, lost for words ... Tracing personal involvement in interviews with Dutch war veterans New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification The DIRHA Portuguese Corpus: A Comparison of Home Automation Command Detection and Recognition in Simulated and Real Data. Enhanced CORILGA: Introducing the Automatic Phonetic Alignment Tool for Continuous Speech Generating a Yiddish Speech Corpus, Forced Aligner and Basic ASR System for the AHEYM Project A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus Capturing Chat: Annotation and Tools for Multiparty Casual Conversation. Towards Automatic Transcription of ILSE ― an Interdisciplinary Longitudinal Study of Adult Development and Aging Hidden Resources ― Strategies to Acquire and Exploit Potential Spoken Language Resources in National Archives CoRuSS - a New Prosodically Annotated Corpus of Russian Spontaneous Speech Operational Assessment of Keyword Search on Oral History Accuracy of Automatic Cross-Corpus Emotion Labeling for Conversational Speech Corpus Commonization User, who art thou? User Profiling for Oral Corpus Platforms Semi-automatically Alignment of Predicates between Speech and OntoNotes data Comparison of Emotional Understanding in Modality-Controlled Environments using Multimodal Online Emotional Communication Corpus FABIOLE, a Speech Database for Forensic Speaker Comparison A Singing Voice Database in Basque for Statistical Singing Synthesis of Bertsolaritza AMISCO: The Austrian German Multi-Sensor Corpus A Database of Laryngeal High-Speed Videos with Simultaneous High-Quality Audio Recordings of Pathological and Non-Pathological Voices FOLK-Gold ― A Gold Standard for Part-of-Speech-Tagging of Spoken German AVAB-DBS: an Audio-Visual Affect Bursts Database for Synthesis Introducing the SEA_AP: an Enhanced Tool for Automatic Prosodic Analysis Syllable based DNN-HMM Cantonese Speech to Text System Palabras: Crowdsourcing Transcriptions of L2 Speech Collecting Resources in Sub-Saharan African Languages for Automatic Speech Recognition: a Case Study of Wolof BulPhonC: Bulgarian Speech Corpus for the Development of ASR Technology The LetsRead Corpus of Portuguese Children Reading Aloud for Performance Evaluation The BAS Speech Data Repository Mining the Spoken Wikipedia for Speech Data and Beyond Parallel Speech Corpora of Japanese Dialects The TYPALOC Corpus: A Collection of Various Dysarthric Speech Recordings in Read and Spontaneous Styles A Dutch Dysarthric Speech Database for Individualized Speech Therapy Research A Shared Task for Spoken CALL? A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research The SI TEDx-UM speech database: a new Slovenian Spoken Language Resource A Verbal and Gestural Corpus of Story Retellings to an Expressive Embodied Virtual Character Speech Corpus Spoken by Young-old, Old-old and Oldest-old Japanese SPA: Web-based Platform for easy Access to Speech Processing Modules Polish Rhythmic Database ― New Resources for Speech Timing and Rhythm Analysis CHATR the Corpus; a 20-year-old archive of Concatenative Speech Synthesis Database of Mandarin Neighborhood Statistics An Extension of the Slovak Broadcast News Corpus based on Semi-Automatic Annotation Global Open Resources and Information for Language and Linguistic Analysis (GORILLA) Crowdsourcing a Multi-lingual Speech Corpus: Recording, Transcription and Annotation of the CrowdIS Corpora
Speech Synthesis	Speech Synthesis of Code-Mixed Text A Taxonomy of Specific Problem Classes in Text-to-Speech Synthesis: Comparing Commercial and Open Source Performance TTS for Low Resource Languages: A Bangla Synthesizer AVAB-DBS: an Audio-Visual Affect Bursts Database for Synthesis Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis Chatbot Technology with Synthetic Voices in the Acquisition of an Endangered Language: Motivation, Development and Evaluation of a Platform for Irish CHATR the Corpus; a 20-year-old archive of Concatenative Speech Synthesis
Standards for LRs	An Annotated Corpus of Direct Speech A Proposal for a Part-of-Speech Tagset for the Albanian Language MWEs in Treebanks: From Survey to Guidelines Corpus Query Lingua Franca (CQLF) Corpus Analysis based on Structural Phenomena in Texts: Exploiting TEI Encoding for Linguistic Research Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data RankDCG: Rank-Ordering Evaluation Measure Language Resource Citation: the ISLRN Dissemination and Further Developments Modelling Multi-issue Bargaining Dialogues: Data Collection, Annotation Design and Corpus Quality Assessment of the Reuters Vol. 2 Multilingual Corpus The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility A Large-scale Recipe and Meal Data Collection as Infrastructure for Food Research The Universal Dependencies Treebank of Spoken Slovenian Metrical Annotation of a Large Corpus of Spanish Sonnets: Representation, Scansion and Evaluation Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks The DialogBank Facilitating Metadata Interoperability in CLARIN-DK Towards Comparability of Linguistic Graph Banks for Semantic Parsing Graphical Annotation for Syntax-Semantics Mapping
Statistical and Machine Learning Methods	Punctuation Prediction for Unsegmented Transcript Based on Word Vector Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality MARMOT: A Toolkit for Translation Quality Estimation at the Word Level Word Sense-Aware Machine Translation: Including Senses as Contextual Features for Improved Translation Models A Machine Learning based Music Retrieval and Recommendation System Medical Concept Embeddings via Labeled Background Corpora Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing POS-tagging of Historical Dutch An Annotated Corpus and Method for Analysis of Ad-Hoc Structures Embedded in Text A Novel Evaluation Method for Morphological Segmentation Text Segmentation of Digitized Clinical Texts How does Dictionary Size Influence Performance of Vietnamese Word Segmentation? Creating Annotated Dialogue Resources: Cross-domain Dialogue Act Classification Solving the AL Chicken-and-Egg Corpus and Model Problem: Model-free Active Learning for Phenomena-driven Corpus Construction Towards Using Social Media to Identify Individuals at Risk for Preventable Chronic Illness A Comparative Study of Text Preprocessing Approaches for Topic Detection of User Utterances Detecting Optional Arguments of Verbs Corpus-Based Diacritic Restoration for South Slavic Languages Differentia compositionem facit. A Slower-Paced and Reliable Parser for Latin A Semi-Supervised Approach for Gender Identification Word Embedding Evaluation and Combination Automatic identification of Mild Cognitive Impairment through the analysis of Italian spontaneous speech productions South African Language Resources: Phrase Chunking Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles Syllable based DNN-HMM Cantonese Speech to Text System What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets Bootstrapping a Hybrid MT System to a New Language Pair Building Language Resources for Exploring Autism Spectrum Disorders A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety A Sequence Model Approach to Relation Extraction in Portuguese MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP Cross-lingual and Supervised Models for Morphosyntactic Annotation: a Comparison on Romanian Segmenting Hashtags using Automatically Created Training Data Detection of Major ASL Sign Types in Continuous Signing For ASL Recognition Word Segmentation for Akkadian Cuneiform A Multi-party Multi-modal Dataset for Focus of Visual Attention in Human-human and Human-robot Interaction Specialising Paragraph Vectors for Text Polarity Detection NNBlocks: A Deep Learning Framework for Computational Linguistics Neural Network Models MoBiL: A Hybrid Feature Set for Automatic Human Translation Quality Assessment Learning Thesaurus Relations from Distributional Features
Summarisation	Revisiting Summarization Evaluation for Scientific Articles What’s the Issue Here?: Task-based Evaluation of Reader Comment Summarization Systems The OnForumS corpus from the Shared Task on Online Forum Summarisation at MultiLing 2015 Extractive Summarization under Strict Length Constraints A Publicly Available Indonesian Corpora for Automatic Abstractive and Extractive Chat Summarization Enhancing The RATP-DECODA Corpus With Linguistic Annotations For Performing A Large Range Of NLP Tasks Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization Urdu Summary Corpus Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization

T
Text Mining	Event Coreference Resolution with Multi-Pass Sieves The PsyMine Corpus - A Corpus annotated with Psychiatric Disorders and their Etiological Factors An Empirical Exploration of Moral Foundations Theory in Partisan News Sources Arabic Corpora for Credibility Analysis Medical Concept Embeddings via Labeled Background Corpora Using Data Mining Techniques for Sentiment Shifter Identification Homing in on Twitter Users: Evaluating an Enhanced Geoparser for User Profile Locations Domain Ontology Learning Enhanced by Optimized Relation Instance in DBpedia An Annotated Corpus and Method for Analysis of Ad-Hoc Structures Embedded in Text A Large DataBase of Hypernymy Relations Extracted from the Web. JATE 2.0: Java Automatic Term Extraction with Apache Solr Text Segmentation of Digitized Clinical Texts Creating a General Russian Sentiment Lexicon A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection WIKIPARQ: A Tabulated Wikipedia Resource Using the Parquet Format Monitoring Disease Outbreak Events on the Web Using Text-mining Approach and Domain Expert Knowledge Odin's Runes: A Rule Language for Information Extraction A Publicly Available Indonesian Corpora for Automatic Abstractive and Extractive Chat Summarization Identifying Content Types of Messages Related to Open Source Software Projects Ensemble Classification of Grants using LDA-based Features Ambiguity Diagnosis for Terms in Digital Humanities A Classification-based Approach to Economic Event Detection in Dutch News Text Corpus for Customer Purchase Behavior Prediction in Social Media NLP and Public Engagement: The Case of the Italian School Reform LanguageCrawl: A Generic Tool for Building Language Models Upon Common-Crawl Tweeting and Being Ironic in the Debate about a Political Reform: the French Annotated Corpus TWitter-MariagePourTous Edit Categories and Editor Role Identification in Wikipedia Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization Crowdsourcing Salient Information from News and Tweets More than Word Cooccurrence: Exploring Support and Opposition in International Climate Negotiations with Semantic Parsing Analyzing Time Series Changes of Correlation between Market Share and Concerns on Companies measured through Search Engine Suggests The Event and Implied Situation Ontology (ESO): Application and Evaluation Typed Entity and Relation Annotation on Computer Science Papers Detection of Reformulations in Spoken French A Study of Reuse and Plagiarism in LREC papers Controlled Propagation of Concept Annotations in Textual Corpora Predictive Modeling: Guessing the NLP Terms of Tomorrow A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge Detecting Expressions of Blame or Praise in Text Effects of Sampling on Twitter Trend Detection Studying the Temporal Dynamics of Word Co-occurrences: An Application to Event Detection Automatic Biomedical Term Polysemy Detection Markov Logic Networks for Text Mining: A Qualitative and Empirical Comparison with Integer Linear Programming
Textual Entailment and Paraphrasing	SemAligner: A Method and Tool for Aligning Chunks with Semantic Relation Types and Semantic Similarity Scores Passing a USA National Bar Exam: a First Corpus for Experimentation Corpora for Learning the Mutual Relationship between Semantic Relatedness and Textual Entailment TEG-REP: A corpus of Textual Entailment Graphs based on Relation Extraction Patterns UPPC - Urdu Paraphrase Plagiarism Corpus Crowdsourcing a Large Dataset of Domain-Specific Context-Sensitive Semantic Verb Relations Relation- and Phrase-level Linking of FrameNet with Sar-graphs A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System Detection of Reformulations in Spoken French A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge Monolingual Social Media Datasets for Detecting Contradiction and Entailment
Tools, Systems, Applications	Event Coreference Resolution with Multi-Pass Sieves An Interaction-Centric Dataset for Learning Automation Rules in Smart Homes Two Architectures for Parallel Processing of Huge Amounts of Text Sieve-based Coreference Resolution in the Biomedical Domain How to Address Smart Homes with a Social Robot? A Multi-modal Corpus of User Interactions with an Intelligent Environment Croatian Error-Annotated Corpus of Non-Professional Written Language MARMOT: A Toolkit for Translation Quality Estimation at the Word Level NLP Infrastructure for the Lithuanian Language Enhanced CORILGA: Introducing the Automatic Phonetic Alignment Tool for Continuous Speech Sense-annotating a Lexical Substitution Data Set with Ubyline Coh-Metrix-Esp: A Complexity Analysis Tool for Documents Written in Spanish Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel A Machine Learning based Music Retrieval and Recommendation System Publishing the Trove Newspaper Corpus Deriving Morphological Analyzers from Example Inflections SemAligner: A Method and Tool for Aligning Chunks with Semantic Relation Types and Semantic Similarity Scores The on-line version of Grammatical Dictionary of Polish Enriching TimeBank: Towards a more precise annotation of temporal relations in a text The Uppsala Corpus of Student Writings: Corpus Creation, Annotation, and Analysis RankDCG: Rank-Ordering Evaluation Measure CASSAurus: A Resource of Simpler Spanish Synonyms MarsaGram: an excursion in the forests of parsing trees Operational Assessment of Keyword Search on Oral History Defining and Counting Phonological Classes in Cross-linguistic Segment Databases Benchmarking Lexical Simplification Systems Syntax-based Multi-system Machine Translation Phoneme Alignment Using the Information on Phonological Processes in Continuous Speech Farasa: A New Fast and Accurate Arabic Word Segmenter Use of Domain-Specific Language Resources in Machine Translation A Large DataBase of Hypernymy Relations Extracted from the Web. Automatic Anomaly Detection for Dysarthria across Two Speech Styles: Read vs Spontaneous Speech JATE 2.0: Java Automatic Term Extraction with Apache Solr CATaLog Online: Porting a Post-editing Tool to the Web The ILMT-s2s Corpus ― A Multimodal Interlingual Map Task Corpus KorAP Architecture ― Diving in the Deep Sea of Corpus Data mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing SVALex: a CEFR-graded Lexical Resource for Swedish Foreign and Second Language Learners Solving the AL Chicken-and-Egg Corpus and Model Problem: Model-free Active Learning for Phenomena-driven Corpus Construction Detecting Word Usage Errors in Chinese Sentences for Learning Chinese as a Foreign Language TermoPL - a Flexible Tool for Terminology Extraction Correcting Errors in a Treebank Based on Tree Mining Towards Using Social Media to Identify Individuals at Risk for Preventable Chronic Illness LibN3L:A Lightweight Package for Neural NLP Humor in Collective Discourse: Unsupervised Funniness Detection in the New Yorker Cartoon Caption Contest EstNLTK - NLP Toolkit for Estonian SemLinker, a Modular and Open Source Framework for Named Entity Discovery and Linking Finding Definitions in Large Corpora with Sketch Engine Fine-Grained Chinese Discourse Relation Labelling Corpus-Based Diacritic Restoration for South Slavic Languages Ensemble Classification of Grants using LDA-based Features Riddle Generation using Word Associations Purely Corpus-based Automatic Conversation Authoring Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun 1 Million Captioned Dutch Newspaper Images Multimodal Resources for Human-Robot Communication Modelling The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents NLP and Public Engagement: The Case of the Italian School Reform FLAT: Constructing a CLARIN Compatible Home for Language Resources SCALE: A Scalable Language Engineering Toolkit LanguageCrawl: A Generic Tool for Building Language Models Upon Common-Crawl Construction and Analysis of a Large Vietnamese Text Corpus Accessing and Elaborating Walenty - a Valence Dictionary of Polish - via Internet Browser Visualisation and Exploration of High-Dimensional Distributional Features in Lexical Semantic Classification Evaluating Lexical Simplification and Vocabulary Knowledge for Learners of French: Possibilities of Using the FLELex Resource EasyTree: A Graphical Tool for Dependency Tree Annotation Automatic Recognition of Linguistic Replacements in Text Series Generated from Keystroke Logs Bootstrapping a Hybrid MT System to a New Language Pair Multilevel Annotation of Agreement and Disagreement in Italian News Blogs Adapting an Entity Centric Model for Portuguese Coreference Resolution FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies Staggered NLP-assisted refinement for Clinical Annotations of Chronic Disease Events Cross-validating Image Description Datasets and Evaluation Metrics Using BabelNet to Improve OOV Coverage in SMT A Multimodal Corpus for the Assessment of Public Speaking Ability and Anxiety MADAD: A Readability Annotation Tool for Arabic Text IMS HotCoref DE: A Data-driven Co-reference Resolver for German Resources for building applications with Dependency Minimal Recursion Semantics More than Word Cooccurrence: Exploring Support and Opposition in International Climate Negotiations with Semantic Parsing Guidelines and Framework for a Large Scale Arabic Diacritized Corpus TEITOK: Text-Faithful Annotated Corpora Extracting Interlinear Glossed Text from LaTeX Documents MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP BAS Speech Science Web Services - an Update of Current Developments Evaluation of the KIT Lecture Translation System CirdoX: an on/off-line multisource speech and sound analysis software Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation Tools and Guidelines for Principled Machine Translation Development Interoperability of Annotation Schemes: Using the Pepper Framework to Display AWA Documents in the ANNIS Interface SuperCAT: The (New and Improved) Corpus Analysis Toolkit SPLIT: Smart Preprocessing (Quasi) Language Independent Tool Urdu Summary Corpus Refurbishing a Morphological Database for German OSMAN ― A Novel Arabic Readability Metric UIMA-Based JCoRe 2.0 Goes GitHub and Maven Central ― State-of-the-Art Software Resource Engineering and Distribution of NLP Pipelines The hunvec framework for NN-CRF-based sequential tagging SPA: Web-based Platform for easy Access to Speech Processing Modules Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of Slovene Towards Multiple Antecedent Coreference Resolution in Specialized Discourse Word Segmentation for Akkadian Cuneiform Towards a Language Service Infrastructure for Mobile Environments NNBlocks: A Deep Learning Framework for Computational Linguistics Neural Network Models Controlled Propagation of Concept Annotations in Textual Corpora The Public License Selector: Making Open Licensing Easier Searching in the Penn Discourse Treebank Using the PML-Tree Query IRIS: English-Irish Machine Translation System Exploring Language Variation Across Europe - A Web-based Tool for Computational Sociolinguistics corpus-tools.org: An Interoperable Generic Software Tool Set for Multi-layer Linguistic Corpora On Developing Resources for Patient-level Information Retrieval ALT Explored: Integrating an Online Dialectometric Tool and an Online Dialect Atlas Czech Legal Text Treebank 1.0 FlexTag: A Highly Flexible PoS Tagging Framework CLARIN-EL Web-based Annotation Tool Adapting the TANL tool suite to Universal Dependencies Markov Logic Networks for Text Mining: A Qualitative and Empirical Comparison with Integer Linear Programming EDISON: Feature Extraction for NLP, Simplified
Topic Detection & Tracking	Enhancing Access to Online Education: Quality Machine Translation of MOOC Content That'll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models Forecasting Emerging Trends from Scientific Literature Can Topic Modelling benefit from Word Sense Information? Analyzing Time Series Changes of Correlation between Market Share and Concerns on Companies measured through Search Engine Suggests Automatic Construction of Discourse Corpora for Dialogue Translation Predictive Modeling: Guessing the NLP Terms of Tomorrow Studying the Temporal Dynamics of Word Co-occurrences: An Application to Event Detection
Typological Databases	Discriminative Analysis of Linguistic Features for Typological Study The Alaskan Athabascan Grammar Database Defining and Counting Phonological Classes in Cross-linguistic Segment Databases Typology of Adjectives Benchmark for Compositional Distributional Models Legacy language atlas data mining: mapping Kru languages

U
Usability, User Satisfaction	Providing a Catalogue of Language Resources for Commercial Users Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ― User, who art thou? User Profiling for Oral Corpus Platforms Design and Development of the MERLIN Learner Corpus Platform On the Use of a Serious Game for Recording a Speech Corpus of People with Intellectual Disabilities The dialogue breakdown detection challenge: Task description, datasets, and evaluation metrics Evaluating the Impact of Light Post-Editing on Usability Automatic Corpus Extension for Data-driven Natural Language Generation BAS Speech Science Web Services - an Update of Current Developments Evaluation of the KIT Lecture Translation System The Trials and Tribulations of Predicting Post-Editing Productivity Evaluating Interactive System Adaptation Translation Errors and Incomprehensibility: a Case Study using Machine-Translated Second Language Proficiency Tests

V
Validation of LRs	Semantic Links for Portuguese Croatian Error-Annotated Corpus of Non-Professional Written Language Towards Automatic Transcription of ILSE ― an Interdisciplinary Longitudinal Study of Adult Development and Aging Evaluating a Topic Modelling Approach to Measuring Corpus Similarity Benchmarking Lexical Simplification Systems Monitoring Disease Outbreak Events on the Web Using Text-mining Approach and Domain Expert Knowledge Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon Using BabelNet to Improve OOV Coverage in SMT Evaluating Context Selection Strategies to Build Emotive Vector Space Models Typology of Adjectives Benchmark for Compositional Distributional Models Comparing the Level of Code-Switching in Corpora Detecting Annotation Scheme Variation in Out-of-Domain Treebanks Aspect based Sentiment Analysis in Hindi: Resource Creation and Evaluation Modeling Language Change in Historical Corpora: The Case of Portuguese VPS-GradeUp: Graded Decisions on Usage Patterns ANEW+: Automatic Expansion and Validation of Affective Norms of Words Lexicons in Multiple Languages Analysing Constraint Grammars with a SAT-solver Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View Designing A Long Lasting Linguistic Project: The Case Study of ASIt B2SG: a TOEFL-like Task for Portuguese Effects of Sampling on Twitter Trend Detection Named Entity Resources - Overview and Outlook
Voice Command and Control	The DIRHA Portuguese Corpus: A Comparison of Home Automation Command Detection and Recognition in Simulated and Real Data. A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus Designing a Speech Corpus for the Development and Evaluation of Dictation Systems in Latvian Vocal Pathologies Detection and Mispronounced Phonemes Identification: Case of Arabic Continuous Speech CirdoX: an on/off-line multisource speech and sound analysis software

W
Web Services	The Gavagai Living Lexicon Axolotl: a Web Accessible Parallel Corpus for Spanish-Nahuatl Tēzaurs.lv: the Largest Open Lexical Database for Latvian Improving corpus search via parsing The Language Application Grid and Galaxy Language Resource Citation: the ISLRN Dissemination and Further Developments The ELRA License Wizard KorAP Architecture ― Diving in the Deep Sea of Corpus Data Issues and Challenges in Annotating Urdu Action Verbs on the IMAGACT4ALL Platform Accessing and Elaborating Walenty - a Valence Dictionary of Polish - via Internet Browser FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies BAS Speech Science Web Services - an Update of Current Developments DALILA: The Dialectal Arabic Linguistic Learning Assistant SPA: Web-based Platform for easy Access to Speech Processing Modules Towards a Language Service Infrastructure for Mobile Environments Exploring Language Variation Across Europe - A Web-based Tool for Computational Sociolinguistics
Word Sense Disambiguation	QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages Word Sense-Aware Machine Translation: Including Senses as Contextual Features for Improved Translation Models Sense-annotating a Lexical Substitution Data Set with Ubyline Combining Semantic Annotation of Word Sense & Semantic Roles: A Novel Annotation Scheme for VerbNet Roles on German Language Data Modelling a Parallel Corpus of French and French Belgian Sign Language Cro36WSD: A Lexical Sample for Croatian Word Sense Disambiguation Synset Ranking of Hindi WordNet Neural Embedding Language Models in Semantic Clustering of Web Search Results CASSAurus: A Resource of Simpler Spanish Synonyms Discovering Fuzzy Synsets from the Redundancy in Different Lexical-Semantic Resources Unsupervised Ranked Cross-Lingual Lexical Substitution for Low-Resource Languages A Corpus of Literal and Idiomatic Uses of German Infinitive-Verb Compounds The SemDaX Corpus ― Sense Annotations with Scalable Sense Inventories Automatic Enrichment of WordNet with Common-Sense Knowledge Ambiguity Diagnosis for Terms in Digital Humanities Addressing the MFS Bias in WSD systems Graded and Word-Sense-Disambiguation Decisions in Corpus Pattern Analysis: a Pilot Study Standard Test Collection for English-Persian Cross-Lingual Word Sense Disambiguation Multi-prototype Chinese Character Embedding A Large-Scale Multilingual Disambiguation of Glosses Can Topic Modelling benefit from Word Sense Information? A comparison of Named-Entity Disambiguation and Word Sense Disambiguation VPS-GradeUp: Graded Decisions on Usage Patterns Generating a Large-Scale Entity Linking Dictionary from Wikipedia Link Structure and Article Text Graph-Based Induction of Word Senses in Croatian SlangNet: A WordNet like resource for English Slang A Multi-domain Corpus of Swedish Word Sense Annotation Automatic Biomedical Term Polysemy Detection

Powered by ELDA © 2016 ELDA/ELRA