Language |
Title |
60 languages |
The OPUS corpus - parallel and free
|
A |
|
Abbey |
WALA: a multilingual resource repository for West African
Languages |
Afrikaans |
A Chatbot as a Novel Corpus Visualization Tool
|
A Spoken Afrikaans Language Resource Designed for Research on
Pronunciation Variations
|
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
The African Speech Technology Project: An Assessment
|
Albanian |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
MED-TYP: A Typological Database for Mediterranean Languages |
All |
A Registry of Standard Data Categories for
Linguistic Annotation
|
Mapping Dependency Structures to Phrase Structures and the
Automatic Acquisition of Mapping Rules |
All text encodable
languages |
Migrating Language Resources from SGML to
XML: the Text Encoding Initiative Recommendations
|
All Unicode supported
languages |
Callisto: A Configurable Annotation Workbench |
American English |
The American English SALA-II Data Collection |
The American National Corpus First Release
|
Any |
WinPitch Corpus, a Text to Speech Alignment Tool for Multimodal
Corpora |
Collecting and Sharing Bilingual Spontaneous Speech Corpora: the
ChinFaDial Experiment |
eGram - a Grammar Development Environment and its Usage for
Language Generation |
ENABLER Thematic Network of National Projects: Technical,
Strategic and Political Issues of LRs |
Anyi |
CoGesT: A Formal Transcription System for Conversational Gesture |
Anyi |
WALA: a multilingual resource repository for
West African Languages |
Arabic |
A Chatbot as a Novel Corpus Visualization
Tool |
A Framework for Evaluating the Suitability of
Non-English Corpora for Language Engineering
|
A Multi-Modal Documentation System for Warao |
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
An Emerging Transcontinental Collaborative Research and
Education Agenda in Human Language Technologies |
Annotation Tools for Large-Scale Corpus Development: Using AGTK
at the Linguistic Data Consortium |
Automatic Language-Independent Induction of
Gazetteer Lists
|
Collection and Evaluation of Broadcast News Data for Arabic |
Construction of a Bilingual Arabic-Spanish
Lexicon of Verbs Based on a Parallel Corpus
|
Conversational Telephone Speech Corpus
Collection for the NIST Speaker Recognition Evaluation 2004
|
Generating an Arabic full-form lexicon for
bidirectional morphology lookup
|
Language Model Adaptation for Statistical
Machine Translation based on Information Retrieval
|
Linguistic Resources for Effective, Affordable, Reusable
Speech-to-Text |
NEMLAR - An Arabic Language Resources Project |
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction
|
The Automatic Content Extraction (ACE)
Program - Tasks, Data, and Evaluation |
The Fisher Corpus: A Resource for the Next Generations of
Speech-to-Text |
The Mixer Corpus of Multilingual, Multichannel Speaker
Recognition Data |
Towards basic categories for describing
properties of texts in a corpus
|
Arabic dialects |
MED-TYP: A Typological Database for Mediterranean Languages |
B |
|
Balkan languages |
The Integral Dictionary: An Ontological
Resource for the Semantic Web Integration of EuroWordNet,
Balkanet, TID and SUMO
|
Basque |
A XML-Based Term Extraction Tool for Basque
|
Abar-Hitz: An Annotation Tool for the Basque
Dependency Treebank
|
Cross-Language Acquisition of Semantic Models for Verbal
Predicates |
Development of Resources for a Bilingual Automatic Index System
of Broadcast News in Basque and Spanish |
Evaluation of a Spoken Phonetic Database in
Basque Language |
Exploring Portability of Syntactic Information from English to
Basque |
Towards the MEANING Top Ontology: Sources of Ontological Meaning |
Translation memories enrichment by
statistical bilingual segmentation
|
Basque (standard) |
Designing and Recording an Audiovisual
Database of Emotional Speech in Basque
|
Baule |
WALA: a multilingual resource repository for
West African Languages
|
Bengali |
A Framework for Evaluating the Suitability of
Non-English Corpora for Language Engineering
|
Berber |
An Emerging Transcontinental Collaborative Research and
Education Agenda in Human Language Technologies |
MED-TYP: A Typological Database for Mediterranean Languages |
Bulgarian |
A Hybrid Strategy for Regular Grammar Parsing |
A Language Resources Infrastructure for
Bulgarian
|
A Methodology and Associated Tools for
Building Interlingual Wordnets |
Cluster Analysis and Classification of Named Entities |
Exploring Balkanet Shared Ontology for
Multilingual Conceptual Indexing
|
Making Monolingual Corpora Comparable: a Case
Study of Bulgarian and Croatian
|
MULTEXT-East Version 3: Multilingual
Morphosyntactic Specifications, Lexicons and Corpora
|
Multilingual Pattern Libraries for Question
Answering: a Case Study for Definition Questions
|
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction
|
The CLaRK System: XML-based Corpora
Development System for Rapid Prototyping
|
Unexpected Productions May Well be Errors
|
Verb Valency Descriptors for a Syntactic
Treebank |
C |
|
Cantonese |
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction |
Catalan
|
ALLES: Integrating NLP in ICALL Applications
|
Bilingual Connections for Trilingual Corpora: An XML Approach |
Creation and Validation of Large Lexica for
Speech-to-Speech Translation Purposes |
FreeLing: An Open-Source Suite of Language
Analyzers
|
MED-TYP: A Typological Database for Mediterranean Languages |
Mercedes, A Term-In-Context Highlighter
|
NLP-enhanced error Checking for Catalan
unrestricted text |
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction
|
The GENOMA-KB Platform: Queries Over Integrated Linguistic
Resources |
The GENOMA-KB project: towards the
integration of concepts, terms, textual corpora and entities
|
Towards the MEANING Top Ontology: Sources of Ontological Meaning |
Towards the Use of Word Stems and Suffixes
for Statistical Machine Translation
|
Chinese |
A Model of Semantic Representations Analysis For Chinese
Sentences |
A Multi-Modal Documentation System for Warao |
An Information Repository Model for Advanced Question Answering
Systems |
Annotation Tools for Large-Scale Corpus Development: Using AGTK
at the Linguistic Data Consortium |
Augmenting Manual Dictionaries for
Statistical Machine Translation Systems |
Automatic Language-Independent Induction of
Gazetteer Lists
|
Collecting and Sharing Bilingual Spontaneous Speech Corpora: the
ChinFaDial Experiment |
Collocation Extraction Using Web Statistics
|
Distributional Consistency: As a General Method for Defining a
Core Lexicon |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Korean-Chinese-Japanese Multilingual Wordnet with Shared
Semantic Hierarchy |
Language Model Adaptation for Statistical
Machine Translation based on Information Retrieval
|
Linguistic Resources for Effective, Affordable, Reusable
Speech-to-Text |
MEAD - A Platform for Multidocument Multilingual Text
Summarization |
Pattern Discovery in Named Organization
Corpus |
Sinica BOW (Bilingual Ontological Wordnet):
Integration of Bilingual WordNet and SUMO
|
Speech & Expression - The Value of a Longitudinal Corpus |
Test Collections for Patent-to-Patent
Retrieval and Patent Map Generation in NTCIR-4 Workshop
|
The Automatic Content Extraction (ACE)
Program - Tasks, Data, and Evaluation
|
Chol |
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction |
Classical Arabic |
Creation and Validation of Large Lexica for
Speech-to-Speech Translation Purposes
|
Contemporary Italian |
Representing Italian Complex Nominals: a
Pilot Study |
Croatian |
Enlarging the Croatian Morphological Lexicon
by Automatic Lexical Acquisition from Raw Corpora
|
Making Monolingual Corpora Comparable: a Case
Study of Bulgarian and Croatian
|
MULTEXT-East Version 3: Multilingual
Morphosyntactic Specifications, Lexicons and Corpora
|
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction |
Cypriot Greek |
Cypriot Speech Database: Data Collection and
Greek to Cypriot Dialect Adaptation
|
Czech |
A Methodology and Associated Tools for
Building Interlingual Wordnets
|
Annotators' Agreement: The Case of
Topic-Focus Articulation
|
Derivational Relations in Flectional
Languages - Czech Case
|
Exploring Balkanet Shared Ontology for
Multilingual Conceptual Indexing
|
Issues in Annotation of the Czech Spontaneous Speech Corpus in
the MALACH Project |
MULTEXT-East Version 3: Multilingual
Morphosyntactic Specifications, Lexicons and Corpora
|
Orthographic and Phonetic Annotation of Very
Large Czech Corpora with Quality Assessment
|
Prague Czech-English Dependency Treebank, Syntactically
Annotated Resources for Machine Translation |
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction
|
The Core of the Czech Derivational Dictionary |
The COST278 pan-European Broadcast News
Database |
The Design of Czech Language Formal Listening
Tests for the Evaluation of TTS Systems
|
Tiered Tagging Revisited
|
Top Ontology as a Tool for Semantic Role Tagging |
Word Association Norms as a Unique Supplement of Traditional
Language Resources |
D |
|
DAML+OIL |
Ontology Evaluation Functionalities of RDF(S), DAML+OIL, and OWL
Parsers and Ontology Platforms |
Danish |
A Corpus-based Syntactic Lexicon for Adverbs
|
A Danish Lexicon Resource - Ready for
Applications
|
A Flexible Language Acquisition Tool Kit for Natural Language
Processing |
A Named Entity Recognizer for Danish
|
Evaluation of a Multimodal Dialogue System for Small-screen
Devices |
Human Language Technology Elements in a
Knowledge Organisation System -The VID project
|
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction
|
The Bilingual Web Dictionary on Demand
|
Dutch |
Automatic Phonemic Labeling and Segmentation
of Spoken Dutch
|
Automatic Sentence Simplification for Subtitling in Dutch and
English |
Discarding noise in an automatically acquired
lexicon of support verb constructions
|
Evaluating Multimodal NLG using Production
Experiments
|
Evaluation and Adaptation of the Celex Dutch
Morphological Database
|
Improving Automatic Phonetic Transcription of
Spontaneous Speech through Variant-Based Pronunciation Variation
Modelling |
Intelligent Building of Language Resources
for HLT Applications
|
Linguistic annotation of the Spoken Dutch
Corpus: If we had to do it all over again ...
|
On the Usefulness of Large Spoken Language
Corpora for Linguistic Research
|
Putting the Dutch PAROLE Corpus to Work
|
Reusable Lexical Representations for Idioms |
Talkbank: Building an Open Unified Multimodal
Database of Communicative Interaction
|
Term Translations in Parallel Corpora:
Discovery and Consistency Check |
The Centre for Dutch Language and Speech
Technology (TST Centre)
|
The COST278 pan-European Broadcast News
Database |
The Influence of the Labeller’s Regional Background on Phonetic
Transcriptions: Implications for the Evaluation of Spoken
Language Resources |
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO
|
The Integrated Language Database of 8th - 21st-Century Dutch
|
The new Dutch-Flemish HLT Programme: a concerted effort to
stimulate the HLT sector
|
Use and Evaluation of Prosodic Annotations in Dutch
|
Using a Parallel Transcript/Subtitle Corpus for Sentence
Compression |
Using large multi-purpose corpora for specific research
questions: discourse phenomena related to wh-questions in the
Spoken Dutch Corpus |
Dutch (historical) |
The Integrated Language Database of 8th - 21st-Century Dutch
|
E |
|
Ega |
Securing Interpretability: The Case of Ega Language
Documentation |
WALA: a multilingual resource repository for West African
Languages |
EL |
Multimodal Multilingual Resources in the Subtitling Process |
EN |
Multimodal Multilingual Resources in the Subtitling Process |
English |
A Chatbot as a Novel Corpus Visualization Tool
|
A Comparative Study on Human Communication Behaviors and
Linguistic Characteristics for Speech-to-Speech Translation
|
A comparison of summarisation methods based on term specificity
estimation |
A Comparison of Two Variant Corpora: The Same Content with
Different Sources |
A Critical Survey of the Methodology for IE Evaluation
|
A Domain-Independent Approach to IE Rule Development
|
A Fine-Grained Evaluation Method for Speech-to-Speech Machine
Translation Using Concept Annotations
|
A Flexible Language Acquisition Tool Kit for Natural Language
Processing |
A Framework for Evaluating the Suitability of Non-English
Corpora for Language Engineering
|
A Framework for Temporal Resolution |
A Freely Available Automatically Generated Thesaurus of Related
Words |
A General-Purpose off-the-shelf Anaphora Resolution Module:
Implementation and Preliminary Evaluation
|
A Grammar and Style Checker Based on Internet Searches
|
A Labelled Corpus for Prepositional Phrase Attachment |
A Large-Scale Resource for Storing and Recognizing Technical
Terminology |
A Lexicon Module for a Grammar Development Environment |
A Methodology and Associated Tools for Building Interlingual
Wordnets |
A Multilingual Database of Idioms |
A Multi-Modal Documentation System for Warao |
A natural language approach to information management: tracking
scientific advances through the structure of words
|
A New ITU-T Recommendation on the Evaluation of Telephone-Based
Spoken Dialogue Systems |
A pattern extraction workbench combining multiple linguistic
levels |
A powerful and versatile XML format for representing
role-semantic annotation
|
A practical competition of different filters used in automatic
term extraction |
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
A Public Reference Implementation of the RAP Anaphora Resolution
Algorithm |
A Similarity Measure for Unsupervised Semantic Disambiguation |
A Suite of Tools for Marking Up Textual Data for Temporal Text
Mining Scenarios |
A word alignment system based on a translation equivalence
extractor |
A2Q: an agent-based architecture for multilingual Q&A |
Abstracting a Dialogue Act Tagset for Meeting Processing
|
Acquiring Bayesian Networks from Text
|
Acquiring Reusable Multilingual Phonotactic Resources
|
Adding Syntactic Annotations to Transcripts of Parent-Child
Dialogs |
Agreement in Human Factoid Annotation for Summarization
Evaluation |
ALLES: Integrating NLP in ICALL Applications
|
An Analysis of the Relative Difficulty of Reuters-21578 Subsets |
An Annotation Scheme for Information Status in Dialogue |
An argumentative annotation schema for meeting discussions
|
An Automatic Method for Constructing Domain-Specific Ontology
Resources |
An Emerging Transcontinental Collaborative Research and
Education Agenda in Human Language Technologies |
An Information Repository Model for Advanced Question Answering
Systems |
Annotating a corpus for building a domain-specific knowledge
base |
Annotating Noun Argument Structure for NomBank
|
Annotation of anaphoric expressions in an aligned bilingual
corpus |
Annotation Of Coreference Relations Among Linguistic Expressions
And Images In Biological Articles
|
Annotation Tools for Large-Scale Corpus Development: Using AGTK
at the Linguistic Data Consortium |
Application of the BLEU Method for Evaluating Free-text Answers
in an E-learning Environment |
Augmenting Manual Dictionaries for Statistical Machine
Translation Systems |
Automatic Acquisition of Paradigmatic Relations using Iterated
Co-occurrences |
Automatic Acquisition of Sense Examples using ExRetriever
|
Automatic Bilingual Lexicon Acquisition Using Random Indexing of
Aligned Bilingual Data |
Automatic Building Gazetteers of Co-referring Named Entities
|
Automatic Classification of Geographical Named Entities
|
Automatic Generation of Glosses in the OntoLearn System
|
Automatic Keyword Extraction from Spoken Text. A Comparison of
two Lexical Resources: the EDR and WordNet |
Automatic Language-Independent Induction of Gazetteer Lists
|
Automatic Sentence Simplification for Subtitling in Dutch and
English |
Automatic transformation of phrase treebanks to dependency trees
|
Automatic Translation Memory Fuzzy Match Post-Editing: A Step
beyond Traditional TM/MT Integration
|
Bayesian Semantics Incorporation to Web Content for Natural
Language Information Retrieval
|
Beyond TREC's Filtering Track
|
BootCaT: Bootstrapping Corpora and Terms from the Web
|
Building a Maritime Domain Lexicon: a Few Considerations on the
Database Structure and the Semantic Coding
|
Building and Using a Corpus of Shallow Dialog Annotated Meetings
|
Building Part-of-speech Corpora through Histogram Hopping |
Calibrating Resource-light Automatic MT Evaluation: A Cheap
Approach to Ranking MT Systems by the Usability of their Output |
Can Anaphoric Definite Descriptions be Replaced by Pronouns? |
Categorizing Web Pages as a Preprocessing Step for Information
Extraction |
CHeM: A System for the Automatic Analysis of e-mails in the
Restoration and Conservation Domain |
Cluster Analysis and Classification of Named Entities |
Clustering Concept Hierarchies from Text |
CoGesT: A Formal Transcription System for Conversational Gesture |
Collection of SLR in the Asian-Pacific area
|
Collocation Extraction Using Web Statistics
|
Combining Heterogeneous Lexical Resources |
Comparative Evaluation Of A Stochastic Parser On Semantic And
Syntactic-Semantic Labels |
Computing Reliability for Coreference Annotation |
Concept Creation in Lexical Ontologies
|
Connector Usage in the English Essay Writing of Japanese EFL
Learners |
Consistent Storage of Metadata in Inference Lexica: The MetaLex
Approach |
Constructing Word-Sense Association Networks from Bilingual
Dictionary and Comparable Corpora
|
Conversational Telephone Speech Corpus Collection for the NIST
Speaker Recognition Evaluation 2004
|
Converting Treebank Annotations to Language Neutral Syntax
|
Creation of a Doctor-Patient Dialogue Corpus Using Standardized
Patients |
Creation of reusable components and language resources for Named
Entity Recognition in Russian
|
Cross-effective cross-lingual document classification
|
Cross-Language Acquisition of Semantic Models for Verbal
Predicates |
CST Bank: A Corpus for the Study of Cross-document Structural
Relationships |
Data Driven Ontology Evaluation |
Definition, dictionaries and tagger for Extended Named Entity
Hierarchy |
Designing a Realistic Evaluation of an End-to-end Interactive
Question Answering System |
Detecting Errors in English Article Usage with a Maximum Entropy
Classifier Trained on a Large, Diverse Corpus |
Detection of Domain Specific Terminology Using Corpora
Comparison |
Development of Bilingual Domain-Specific Ontology for Automatic
Conceptual Indexing |
Development of Ontologies with Minimal Set of Conceptual
Relations |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Enriching a Thai Lexical Database with Selectional Preferences |
Enriching WordNet Via Generative Metonymy and Creative Polysemy
|
EuroWordNet as a Resource for Cross-language Information
Retrieval |
Evaluating Conversation with Hans Christian Andersen
|
Evaluating Factors Impacting the Accuracy of Forced Alignments
in a Multimodal Corpus |
Evaluating Lexical Resources for A Semantic Tagger
|
Evaluating Name-Matching for Coreference Resolution
|
Evaluating Variants of the Lesk Approach for Disambiguating
Words |
Evaluation and Adaptation of a Specialised Language Checking
Tool for Non-specialised Machine Translation and Non-expert MT
Users for Multi-lingual Telecooperation
|
Evaluation of Cross-Language Information Retrieval Using the
Domain-Specific GIRT Data as Parallel German-English Corpus
|
Evaluation of Different Similarity Measures for the Extraction
of Multiword Units in a Reinforcement Learning Environment
|
Evaluation of Multi-party Virtual Reality Dialogue Interaction |
Evaluation of Transcription and Annotation Tools for a
Multi-modal, Multi-party Dialogue Corpus |
Evaluation Resources for Concept-based Cross-Lingual Information
Retrieval in the Medical Domain
|
Exploiting Anchor Text as a Lexical Resource |
Exploiting Language Resources for Semantic Web Annotations |
Exploiting Semantic Web Technologies for Intelligent Access to
Historical Documents |
Exploring Balkanet Shared Ontology for Multilingual Conceptual
Indexing |
Exploring Portability of Syntactic Information from English to
Basque |
Extending a verb-lexicon using a semantically annotated corpus |
Extending WordNets to Implicit Information |
FreeLing: An Open-Source Suite of Language Analyzers
|
French-English multi-word term alignment based on lexical
context analysis |
Frequent Term Distribution Measures for Dataset Profiling |
How Does Automatic Machine Translation Evaluation Correlate With
Human Scoring as the Number of Reference Translations Increases?
|
How to Disassemble Alphabetical Processions - Morphological
Treatment of Unknown Words |
Human dialogue modelling using annotated corpora
|
Identifying Definitions in Text Collections for Question
Answering |
Improving Collocation Extraction for High Frequency Words
|
Incremental Knowledge Acquisition from WordNet and EuroWordNet
|
Incremental Methods to Select Test Sentences for Evaluating
Translation Ability |
Information Retrieval System Using Latent Contextual Relevance
|
INSPIRE: Evaluation of a Smart-Home System for Infotainment
Management and Device Control
|
Integrated Language Technologies for Multilingual Information
Services in the MEMPHIS Project |
Issues in Corpus Development for Multi-party Multi-modal
Task-oriented Dialogue |
Language Model Adaptation for Statistical Machine Translation
based on Information Retrieval
|
Large Scale Experiments for Semantic Labeling of Noun Phrases in
Raw Text |
Linguistic Corpus Search
|
Linguistic Resources for Effective, Affordable, Reusable
Speech-to-Text |
MEAD - A Platform for Multidocument Multilingual Text
Summarization |
Meaningful Clusters
|
Mercedes, A Term-In-Context Highlighter |
Mining the Web for Discourse Markers |
Modelling Legitimate Translation Variation for Automatic
Evaluation of MT Quality |
MT Goes Farming: Comparing Two Machine Translation Approaches on
a New Domain |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora |
Multi-Document Summarization using Multiple-Sequence Alignment
|
Multilingual Corpus-based Approach to the Resolution of English
-ing |
Multi-lingual Evaluation of a Natural Language Generation System
|
Multilingual Pattern Libraries for Question Answering: a Case
Study for Definition Questions
|
Multimodal Meaning Representation for Generic Dialogue Systems
Architectures |
NameNet: A Self-Improving Resource for Name Classification |
N-Gram Language Modeling for Robust Multi-Lingual Document
Classification |
NLP-enhanced Content Filtering within the POESIA Project |
OntoTag's Linguistic Ontologies: Enhancing Higher Level and
Semantic Web Annotations |
Open Resources for Language Technology
|
Open-source Tools for Creation, Maintenance, and Storage of
Lexical Resources for Language Generation from Ontologies
|
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Parsing Ungrammatical Input: An Evaluation Procedure |
Part-of-Speech Annotation of Biology Research Abstracts
|
Polysemy and Category Structure in WordNet: An Evidential
Approach |
Prague Czech-English Dependency Treebank, Syntactically
Annotated Resources for Machine Translation |
Pronominal Anaphora Resolution for Unrestricted Text |
Proper Names and Polysemy: from a Lexicographic Experience
|
Publicly Available Topic Signatures for all WordNet Nominal
Senses |
Querying both time-aligned and hierarchical corpora with NXT
Search |
Raising the Bar: Stacked Conservative Error Correction Beyond
Boosting |
Resources and Techniques for Multilingual Information Extraction
|
Resources for Place Name Analysis |
Reusable Lexical Representations for Idioms
|
Re-using high-quality resources for continued evaluation of
automated summarization systems
|
RevisionBank: A Resource for Revision-based Multi-document
Summarization and Evaluation
|
Road-testing the English Resource Grammar over the British
National Corpus |
SALA II across the finish line: a large collection of mobile
telephone speech databases from North and Latin America
completed |
Selecting the Correct English Synset for a Spanish Sense
|
Semi-Automatic Construction of a Question Treebank |
Semi-automatic Syntactic and Semantic Corpus Annotation with a
Deep Parser |
Sinica BOW (Bilingual Ontological Wordnet): Integration of
Bilingual WordNet and SUMO
|
Some Meaning Procedures of Ontological Semantics
|
Spanish WordNet 1.6: Porting the Spanish WordNet Across
Princeton Versions |
Speech & Expression - The Value of a Longitudinal Corpus |
Steps towards Semantically Annotated Language Resources |
Summarization of Multimodal Information |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Term Translations in Parallel Corpora: Discovery and Consistency
Check |
Test Collections for Patent-to-Patent Retrieval and Patent Map
Generation in NTCIR-4 Workshop
|
Text Corpora, Local Grammars and Prediction |
Textual Distraction as a Basis for Evaluating Automatic
Summarisers |
The AAC [Austrian Academy Corpus] An Enterprise to Develop Large
Electronic Text Corpora
|
The Automatic Content Extraction (ACE) Program - Tasks, Data,
and Evaluation |
The Bilingual Web Dictionary on Demand
|
The Corpógrafo – a Web-based environment for corpora research
|
The Cross-Breeding of Dictionaries
|
The DeepThought Core Architecture Framework |
The Effect of Bias on an Automatically-built Word Sense Corpus |
The Fisher Corpus: A Resource for the Next Generations of
Speech-to-Text |
The GENOMA-KB Platform: Queries Over Integrated Linguistic
Resources |
The GENOMA-KB project: towards the integration of concepts,
terms, textual corpora and entities
|
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO
|
The Italian NESPOLE! Corpus: A Multilingual Database with
Interlingua Annotation in Tourism and Medical Domains |
The Mixer Corpus of Multilingual, Multichannel Speaker
Recognition Data |
The MULI Project: Annotation and Analysis of Information
Structure in German and English
|
The NIST Meeting Room Pilot Corpus
|
The OLISSIPO and LECTIO Projects |
The overview of the SST speech corpus of Japanese learner
English and evaluation through the experiment on automatic
detection of learners' errors
|
The Penn Discourse Treebank |
The Rationale for Building Resources Expressly for NLP
|
The Role of MultiWord Terminology in Knowledge Management
|
The Translation Correction Tool: English-Spanish User Studies |
Tiered Tagging Revisited |
Tone-of-Voice and Controlled Language Techniques
|
Top Ontology as a Tool for Semantic Role Tagging |
Towards basic categories for describing properties of texts in a
corpus |
Towards the MEANING Top Ontology: Sources of Ontological Meaning |
Training a Sentence-Level Machine Translation Confidence Measure
|
Unsupervised Text Mining for Ontology Extraction: An Evaluation
of Statistical Measures
|
Using Paradigm Tables to Generate New Utterances Similar to
those Existing in Linguistic Resources |
Using the NITE XML Toolkit on the Switchboard Corpus to Study
Syntactic Choice: A Case Study |
Using the Penn Treebank to Evaluate Non-Treebank Parsers
|
Using the Web as a Corpus for the Syntactic-Based Collocation
Identification |
Using Weighted Abduction to Align Term Variant Translations in
Bilingual Texts |
Using WordNet to Measure Semantic Orientations of Adjectives |
Utilization of Multiple Language Resources for Robust
Grammar-Based Tense and Aspect Classification |
Utilizing the One-Sense-per-Discourse Constraint for Fully
Unsupervised Word Sense Induction and Disambiguation
|
Why do you ignore me? - Proof that not all direct speech is bad
|
Word Association Norms as a Unique Supplement of Traditional
Language Resources |
Word Sense Disambiguation as a Wordnets' Validation Method in
Balkanet |
Word Sense Disambiguation Using Random Indexing
|
'You stupid tin box' - children interacting with the AIBO robot:
A cross-linguistic emotional speech corpus |
Semi-automatic Acquisition of Command Grammar |
English (in scientific
texts) |
An Annotation Scheme for a Rhetorical Analysis of Biology
Articles |
English (U.S.,
Belize) |
Developing Language Resources for a Transnational Digital
Government System |
Estonian |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora
|
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Tiered Tagging Revisited |
F |
|
Farsi |
Creation of a Doctor-Patient Dialogue Corpus Using Standardized
Patients |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Finnish |
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Infrastructure for Collaborative Annotation of Speech |
FR |
Multimodal Multilingual Resources in the Subtitling Process |
French |
A Chatbot as a Novel Corpus Visualization Tool
|
A complete understanding speech system based on semantic
concepts |
An Evaluation Protocol For Text Mining Tools : ALCESTE SAS TEXT
MINER SPAD-CRM AND TEMIS Text Mining Solutions Testing
|
Annotation of anaphoric expressions in an aligned bilingual
corpus |
Automatic audio and manual transcripts alignment, time-code
transfer and selection of exact transcripts
|
Automatisation Of The Activity Of Term Collection In Different
Languages |
Building Part-of-speech Corpora through Histogram Hopping |
Calibrating Resource-light Automatic MT Evaluation: A Cheap
Approach to Ranking MT Systems by the Usability of their Output |
Collecting and Sharing Bilingual Spontaneous Speech Corpora: the
ChinFaDial Experiment |
Development of New Telephone Speech Databases for French: The
NEOLOGOS Project |
Enriching a French Treebank |
Evaluating an Authentic Audio-Visual Expressive Speech Corpus |
Evaluation Of A Speech Cuer: From Motion Capture To A
Concatenative Text-To-Cued Speech System |
Evaluation of Consensus on the Annotation of Prosodic Breaks in
the Romance Corpus of Spontaneous Speech “C-ORAL-ROM”
|
Experiments on Building Language Resources for Multi-Modal
Dialogue Systems |
French-English multi-word term alignment based on lexical
context analysis |
Generating Coreferential Descriptions from a Structured Model of
the Context |
Intelligent Building of Language Resources for HLT Applications
|
Language Modeling using Dynamic Bayesian Networks |
Measurements of Spoken Language Variability in a Multilingual
Corpus. Predictable Aspects
|
MED-TYP: A Typological Database for Mediterranean Languages |
Metaphors in Wordnets: from Theory to Practice |
Methodology For Building Thematic Indexes In Medicine For French
|
Modelling Legitimate Translation Variation for Automatic
Evaluation of MT Quality |
Morphology Based Automatic Acquisition of Large-coverage Lexica |
Multilingual Corpus-based Approach to the Resolution of English
-ing |
NLP-enhanced Content Filtering within the POESIA Project |
OntoTag's Linguistic Ontologies: Enhancing Higher Level and
Semantic Web Annotations |
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Resources and Techniques for Multilingual Information Extraction
|
SALA II across the finish line: a large collection of mobile
telephone speech databases from North and Latin America
completed |
Semi-automatic Acquisition of Command Grammar |
Semi-Automatic Derivation of a French Lexicon from CLIPS
|
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Term Translations in Parallel Corpora: Discovery and Consistency
Check |
The Bilingual Web Dictionary on Demand |
The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous
Speech for Romance Languages
|
The ESTER Evaluation Campaign for the Rich Transcription of
French Broadcast News |
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO
|
The Italian NESPOLE! Corpus: A Multilingual Database with
Interlingua Annotation in Tourism and Medical Domains |
Using the Web as a Corpus for the Syntactic-Based Collocation
Identification |
Using Weighted Abduction to Align Term Variant Translations in
Bilingual Texts |
French Sign Language |
Toward an Annotation Software for Video of Sign Language,
Including Image Processing Tools and Signing Space Modelling |
Friulan |
MED-TYP: A Typological Database for Mediterranean Languages |
G |
|
Gaelic |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Galician |
A Galician Textual Corpus for Morphosyntactic Tagging with
Application to Text-to-Speech Synthesis
|
Parallel corpora for the Galician language: building and
processing of the CLUVI (Linguistic Corpus of the University of
Vigo) |
The COST278 pan-European Broadcast News Database
|
Transcrigal: A Bilingual System for Automatic Indexing of
Broadcast News |
General |
Distributional Consistency: As a General Method for Defining a
Core Lexicon |
German |
The BITS Speech Synthesis Corpus for German |
A High Quality Partial Parser for Annotating German Text Corpora
|
A powerful and versatile XML format for representing
role-semantic annotation
|
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
ALLES: Integrating NLP in ICALL Applications
|
An Annotated Corpus of Tutorial Dialogs on Mathematical Theorem
Proving |
An Annotated German-Language Medical Text Corpus as Language
Resource |
Annotating a corpus for building a domain-specific knowledge
base |
Automated Morphological Segmentation and Evaluation
|
Automatic Acquisition of Paradigmatic Relations using Iterated
Co-occurrences |
Automatic Bilingual Lexicon Acquisition Using Random Indexing of
Aligned Bilingual Data |
Automatic Methods to Supplement Broad-Coverage Subcategorization
Lexicons |
Automatic transformation of phrase treebanks to dependency trees
|
Automatic Translation Memory Fuzzy Match Post-Editing: A Step
beyond Traditional TM/MT Integration
|
Automatisation Of The Activity Of Term Collection In Different
Languages |
Bootstrapping a database of German multi-word expressions
|
CoGesT: A Formal Transcription System for Conversational Gesture |
Consistent Storage of Metadata in Inference Lexica: The MetaLex
Approach |
Corpus based Enrichment of GermaNet Verb Frames
|
Corpus-based Learning of Lexical Resources for German Named
Entity Recognition |
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Development and Integration of the LDA-Toolkit into the COST249
SpeechDat (II) SIG Reference Recognizer
|
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Evaluation and Adaptation of a Specialised Language Checking
Tool for Non-specialised Machine Translation and Non-expert MT
Users for Multi-lingual Telecooperation
|
Evaluation of Cross-Language Information Retrieval Using the
Domain-Specific GIRT Data as Parallel German-English Corpus
|
Evaluation of Microphone Array Front-Ends for ASR - an Extension
of the AURORA Framework
|
Exploiting Coreference Annotations for Text-to-Hypertext
Conversion |
How to Disassemble Alphabetical Processions - Morphological
Treatment of Unknown Words
|
Identifying Morphosyntactic Preferences in Collocations
|
Integrated Language Technologies for Multilingual Information
Services in the MEMPHIS Project |
Intelligent Building of Language Resources for HLT Applications
|
Linguistic Corpus Search
|
MAUS Goes Iterative
|
Metaphors in Wordnets: from Theory to Practice |
Multilingual Corpus-based Approach to the Resolution of English
-ing |
N-Gram Language Modeling for Robust Multi-Lingual Document
Classification |
OntoTag's Linguistic Ontologies: Enhancing Higher Level and
Semantic Web Annotations |
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Querying both time-aligned and hierarchical corpora with NXT
Search |
Resources and Techniques for Multilingual Information Extraction
|
Rethinking readability of digital editions - the case of the
AAC’s "Digital Brenner" |
SMOR: A German Computational Morphology Covering Derivation,
Composition, and Inflection
|
Speech recognition simulation and its application for Wizard of
Oz experiments |
Steps towards Semantically Annotated Language Resources |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction |
The AAC [Austrian Academy Corpus] An Enterprise to Develop Large
Electronic Text Corpora |
The COST 278 MASPER initiative - crosslingual speech recognition
with large telephone databases
|
The DeepThought Core Architecture Framework |
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO
|
The Italian NESPOLE! Corpus: A Multilingual Database with
Interlingua Annotation in Tourism and Medical Domains |
The MULI Project: Annotation and Analysis of Information
Structure in German and English
|
The Statistical Analysis of Morphosyntactic Distributions |
The TüBa-D/Z Treebank: Annotating German with a Context-Free
Backbone |
Tools for Upgrading Printed Dictionaries by Means of
Corpus-based Lexical Acquisition |
Towards a Dynamic Lexicon: Predicting the Syntactic Argument
Structure of Complex Verbs |
Unexpected Productions May Well be Errors
|
'You stupid tin box' - children interacting with the AIBO robot:
A cross-linguistic emotional speech corpus
|
Pumping Documents Through a Domain and Genre Classification
Pipeline |
German (Deutsch) |
Evaluation Resources for Concept-based Cross-Lingual Information
Retrieval in the Medical Domain
|
Towards Ontology Engineering Based on Linguistic Analysis
|
Greek |
A Bayesian Model for Shallow Syntactic Parsing of Natural
Language Texts |
A Methodology and Associated Tools for Building Interlingual
Wordnets |
Bayesian Semantics Incorporation to Web Content for Natural
Language Information Retrieval
|
Bypassing Greeklish!
|
Corpus Design, Recording and Phonetic Analysis of Greek
Emotional Database
|
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Cypriot Speech Database: Data Collection and Greek to Cypriot
Dialect Adaptation |
Exploring Balkanet Shared Ontology for Multilingual Conceptual
Indexing |
Handling Subtle Sense Distinctions through Wordnet Semantic
Types |
Learning to predict Pitch Accents using Bayesian Belief Networks
for Greek Language |
Multi-lingual Evaluation of a Natural Language Generation System
|
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Reusing Language Resources for Speech Applications involving
Emotion |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
The COST278 pan-European Broadcast News Database
|
H |
|
Hebrew |
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Hindi |
Automatic Generation of Compound Word Lexicon for Hindi Speech
Synthesis |
Automatic Language-Independent Induction of Gazetteer Lists |
Collection of SLR in the Asian-Pacific area
|
Information Extraction from Hindi Texts
|
Hungarian |
Combining symbolic and statistical methods in morphological
analysis and unknown word guessing
|
Creating open language resources for Hungarian |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction |
The COST 278 MASPER initiative - crosslingual speech recognition
with large telephone databases
|
Tiered Tagging Revisited
|
I |
|
Ibibio |
WALA: a multilingual resource repository for West African
Languages |
Iko |
WALA: a multilingual resource repository for West African
Languages |
Independent |
Towards A Language Infrastructure for the Semantic Web
|
Indic Scripts |
An XML Representation for Annotated Handwriting Datasets for
Online Handwriting Recognition
|
Experiences in Collection of Handwriting Data for Online
Handwriting Recognition in Indic Scripts
|
Irish |
Acquiring Reusable Multilingual Phonotactic Resources
|
Italian |
A2Q: an agent-based architecture for multilingual Q&A |
Automatisation Of The Activity Of Term Collection In Different
Languages |
BootCaT: Bootstrapping Corpora and Terms from the Web
|
Building a Large Grammar for Italian |
Building a Maritime Domain Lexicon: a Few Considerations on the
Database Structure and the Semantic Coding
|
Building Distributed Language Resources by Grid Computing |
CHeM: A System for the Automatic Analysis of e-mails in the
Restoration and Conservation Domain |
Computational Lexicography and Carlo Emilio Gadda, Principe
dell'Analisi e Duca della Buona Cognizione
|
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Cross-Language Acquisition of Semantic Models for Verbal
Predicates |
Discovery of (New) Knowledge and the Analysis of Text Corpora
|
Evaluation of Consensus on the Annotation of Prosodic Breaks in
the Romance Corpus of Spontaneous Speech “C-ORAL-ROM”
|
How to Disassemble Alphabetical Processions - Morphological
Treatment of Unknown Words
|
Hybrid Constraints for Robust Parsing: First Experiments and
Evaluation |
Integrated Language Technologies for Multilingual Information
Services in the MEMPHIS Project
|
Introducing the La Repubblica Corpus: A large Annotated
TEI(XML)-Compliant Corpus of Newspaper Italian
|
Measurements of Spoken Language Variability in a Multilingual
Corpus. Predictable Aspects
|
MED-TYP: A Typological Database for Mediterranean Languages |
Metaphors in Wordnets: from Theory to Practice |
Multilingual Pattern Libraries for Question Answering: a Case
Study for Definition Questions
|
NLP-enhanced Content Filtering within the POESIA Project |
OntoTag's Linguistic Ontologies: Enhancing Higher Level and
Semantic Web Annotations |
Proper Names and Polysemy: from a Lexicographic Experience
|
Semantic Mark-up of Italian Legal Texts Through NLP-based
Techniques |
Semi-Automatic Derivation of a French Lexicon from CLIPS
|
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Term Translations in Parallel Corpora: Discovery and Consistency
Check |
The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous
Speech for Romance Languages
|
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO |
The Italian NESPOLE! Corpus: A Multilingual Database with
Interlingua Annotation in Tourism and Medical Domains |
Towards the MEANING Top Ontology: Sources of Ontological Meaning |
Unifying Lexicons in View of a Phonological and Morphological
Lexical DB |
Using cooccurrence statistics and the web to discover synonyms
in a technical language
|
Using PiTagger for Lemmatization and PoS Tagging of a
Spontaneous Speech Corpus: C-ORAL-ROM Italian |
Using Semantic Language Resources to Support Textual Inference
for Question Answering |
J |
|
Japanese |
A Comparative Study on Human Communication Behaviors and
Linguistic Characteristics for Speech-to-Speech Translation
|
A Comparison of Two Variant Corpora: The Same Content with
Different Sources |
A Lexicon Module for a Grammar Development Environment |
An Information Repository Model for Advanced Question Answering
Systems |
Automatic Extraction of Hyponyms from Japanese Newspapers Using
Lexico-syntactic Patterns
|
Building a Paraphrase Corpus for Speech Translation
|
Classification of Japanese Spatial Nouns
|
Collecting Spontaneously Spoken Queries for Information
Retrieval |
Comparison of some automatic and manual methods for summary
evaluation based on the Text Summarization Challenge 2 |
Concept-based queries: Combining and Reusing Linguistic Corpus
Formats and Query Languages
|
Consistent Storage of Metadata in Inference Lexica: The MetaLex
Approach |
Constructing Word-Sense Association Networks from Bilingual
Dictionary and Comparable Corpora
|
Co-reference in Japanese Task-oriented Dialogues: A Contribution
to the Development of Language-specific and Language-general
Annotation Schemes and Resources
|
Definition, dictionaries and tagger for Extended Named Entity
Hierarchy |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Evaluating the FOKS Error Model
|
Extraction of Hyperonymy of Adjectives from Large Corpora by
Using the Neural Network Model |
How Does Automatic Machine Translation Evaluation Correlate With
Human Scoring as the Number of Reference Translations Increases?
|
Incremental Methods to Select Test Sentences for Evaluating
Translation Ability |
Korean-Chinese-Japanese Multilingual Wordnet with Shared
Semantic Hierarchy |
Making an XML-based Japanese-Slovene Learners' Dictionary
|
Multilingual Corpus-based Approach to the Resolution of English
-ing |
Perceptual Evaluation of Quality Deterioration Owing to Prosody
Modification |
Phrase-Based Dependency Evaluation of a Japanese Parser |
Related Word-pairs Extraction without Dictionaries
|
Semi-supervised learning by Fuzzy clustering and Ensemble
learning |
Speech & Expression - The Value of a Longitudinal Corpus |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction |
Terminal Device Oriented Comparable Corpora and its Alignment --
Towards Extracting Paraphrasing Patterns -- |
Test Collections for Patent-to-Patent Retrieval and Patent Map
Generation in NTCIR-4 Workshop
|
Toward Text Understanding: Integrating Relevance-tagged Corpus
and Automatically Constructed Case Frames
|
Collection of SLR in the Asian-Pacific area
|
K |
|
Korean |
A Comparison of Two Variant Corpora: The Same Content with
Different Sources |
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
Collection of SLR in the Asian-Pacific area
|
Creation and Assessment of Korean Speech and Noise DB in Car
Environment |
Korean-Chinese-Japanese Multilingual Wordnet with Shared
Semantic Hierarchy |
Lexical Analysis of Agglutinative Languages Using a Dictionary
of Lemmas and Lexical Transducers
|
Sejong Korean Corpora in the Making
|
Test Collections for Patent-to-Patent Retrieval and Patent Map
Generation in NTCIR-4 Workshop
|
L |
|
Language independent |
A Global Data Category Registry for Interoperable Language
Resources |
Data Driven Ontology Evaluation |
Evaluation of Microphone Array Front-Ends for ASR - an Extension
of the AURORA Framework |
Infrastructure for Collaborative Annotation of Speech
|
Online Evaluation of Coreference Resolution
|
Principles of a system for terminological concept modelling
|
A Graphical Tool for Handling Rule Grammars in Java Speech
Grammar Format |
A Search Tool for Corpora with Positional Tagsets and
Ambiguities |
Highlighting latent structure in documents
|
Linguistic Corpus Search
|
Pumping Documents Through a Domain and Genre Classification
Pipeline |
Standardization in Multimodal Content Representation: Some
Methodological Issues |
SVMTool: A general POS tagger generator based on Support Vector
Machines |
Towards an International Standard on Feature Structure
Representation |
Language-independent
(multilingual interface) |
An Environment for Dialogue Corpora Collection (ENDIACC) |
Latvian |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora
|
Lithuanian |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora
|
M |
|
Maltese |
MED-TYP: A Typological Database for Mediterranean Languages |
Mambila |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Mandarin |
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
Collection of SLR in the Asian-Pacific area
|
Conversational Telephone Speech Corpus Collection for the NIST
Speaker Recognition Evaluation 2004
|
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction |
The Fisher Corpus: A Resource for the Next Generations of
Speech-to-Text |
The Mixer Corpus of Multilingual, Multichannel Speaker
Recognition Data |
Many |
Current Projects in Languages of Military Interest at the
Defense Language Institute |
Maori |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Mapudungun |
Data Collection and Analysis of Mapudungun Morphology for
Spelling Correction |
Mexican Spanish |
VOXMEX Speech Database: Design of a Phonetically Balanced Corpus |
Modern Greek |
MED-TYP: A Typological Database for Mediterranean Languages |
Modern Greek in a multilingual context |
Creating multi-purpose linguistic resources for Modern Greek: a
deep Modern Greek Grammar |
Modern Hebrew |
MED-TYP: A Typological Database for Mediterranean Languages |
Modern Standard
Arabic |
MED-TYP: A Typological Database for Mediterranean Languages |
Moroccan Arabic |
An Emerging Transcontinental Collaborative Research and
Education Agenda in Human Language Technologies |
Multilingual |
OntoTag's Linguistic Ontologies: Enhancing Higher Level and
Semantic Web Annotations |
Rethinking Reusable Resources
|
Semi-automatic UNL Dictionary Generation using WordNet.PT
|
Multilingual approach |
Automatic Translation Memory Fuzzy Match Post-Editing: A Step
beyond Traditional TM/MT Integration |
Intelligent Building of Language Resources for HLT Applications |
Multiple |
NIST Language Technology Evaluation Cookbook
|
SLR Validation: Current Trends and Developments
|
N |
|
Nahuatl |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Norwegian |
A Lexicon Module for a Grammar Development Environment |
Memory-based Classification of Proper Names in Norwegian |
O |
|
Old-Church Slavonic |
Towards Intelligent Written Cultural Heritage Processing -
Lexical Processing |
OWL |
Ontology Evaluation Functionalities of RDF(S), DAML+OIL, and OWL
Parsers and Ontology Platforms |
P |
|
Persian |
Creation of a Doctor-Patient Dialogue Corpus Using Standardized
Patients |
Polish |
A Search Tool for Corpora with Positional Tagsets and
Ambiguities |
Extraction of Polish Named-Entities
|
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Portuguese |
A Multilingual Database of Idioms |
An Efficient Word Confidence Measure Using Likelihood Ratio
Scores |
Design and Implementation of a Semantic Search Engine for
Portuguese |
Evaluating Solutions for the Rapid Development of
State-of-the-Art POS taggers for Portuguese
|
Evaluation of Consensus on the Annotation of Prosodic Breaks in
the Romance Corpus of Spontaneous Speech “C-ORAL-ROM”
|
Extending WordNets to Implicit Information |
INQUER: A WordNet-based Question-Answering Application
|
Measurements of Spoken Language Variability in a Multilingual
Corpus. Predictable Aspects
|
Multifunctional Computational Lexicon of Contemporary
Portuguese: An Available Resource for Multitype Applications
|
On the problems of creating a golden standard of inflected forms
in Portuguese |
Portuguese Large-scale Language Resources for NLP Applications
|
Providing on-line access to Portuguese language resources:
corpora and lexicons |
SALA II across the finish line: a large collection of mobile
telephone speech databases from North and Latin America
completed |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous
Speech for Romance Languages |
The Corpógrafo – a Web-based environment for corpora research
|
The COST278 pan-European Broadcast News Database
|
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO |
The Lácio-Web: Corpora and Tools to advance Brazilian Portuguese
Language Investigations and Computational Linguistic Tools
|
The Verb in the Terminological Collocations. Contribution to the
Development of a Morphological Analyser. MorphoComp |
What is my Style? Using Stylistic Features of Portuguese Web
Texts to classify Web pages according to Users' Needs |
Portuguese (European) |
An Acoustic Corpus Contemplating Regional Variation for Studies
of European Portuguese Nasals |
Potentially all |
Using the Penn Treebank to Evaluate Non-Treebank Parsers
|
Provençal |
MED-TYP: A Typological Database for Mediterranean Languages |
Q |
|
Q’anjob’al (Mayan
Guatemala) |
Applying Computational Linguistic Techniques in a Documentary
Project for Q’anjob’al (Mayan Guatemala)
|
Quechua |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
R |
|
RDF(S) |
Ontology Evaluation Functionalities of RDF(S), DAML+OIL, and OWL
Parsers and Ontology Platforms |
Resian |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora
|
Romanian |
A Methodology and Associated Tools for Building Interlingual
Wordnets |
A word alignment system based on a translation equivalence
extractor |
Exploring Balkanet Shared Ontology for Multilingual Conceptual
Indexing |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Tiered Tagging Revisited
|
Word Sense Disambiguation as a Wordnets' Validation Method in
Balkanet |
Russian |
A Flexible Language Acquisition Tool Kit for Natural Language
Processing |
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
Conversational Telephone Speech Corpus Collection for the NIST
Speaker Recognition Evaluation 2004
|
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Creation of reusable components and language resources for Named
Entity Recognition in Russian |
Development of Bilingual Domain-Specific Ontology for Automatic
Conceptual Indexing |
Development of Ontologies with Minimal Set of Conceptual
Relations |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Integration of Russian Language Resources |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora |
Russian Information Retrieval Evaluation Seminar |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction |
The AAC [Austrian Academy Corpus] An Enterprise to Develop Large
Electronic Text Corpora
|
The Mixer Corpus of Multilingual, Multichannel Speaker
Recognition Data |
Towards basic categories for describing properties of texts in a
corpus |
Word Association Norms as a Unique Supplement of Traditional
Language Resources |
S |
|
Sardinian |
MED-TYP: A Typological Database for Mediterranean Languages |
Scottish |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Serbian |
A Methodology and Associated Tools for Building Interlingual
Wordnets |
Combining Heterogeneous Lexical Resources
|
Exploring Balkanet Shared Ontology for Multilingual Conceptual
Indexing |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Towards the Use of Word Stems and Suffixes for Statistical
Machine Translation |
Serbo-Croatian |
MED-TYP: A Typological Database for Mediterranean Languages |
Slovak |
The COST 278 MASPER initiative - crosslingual speech recognition
with large telephone databases
|
Slovakian |
The COST278 pan-European Broadcast News Database
|
Slovene |
Making an XML-based Japanese-Slovene Learners' Dictionary
|
MED-TYP: A Typological Database for Mediterranean Languages |
MULTEXT-East Version 3: Multilingual Morphosyntactic
Specifications, Lexicons and Corpora
|
Tiered Tagging Revisited
|
Slovenian |
A data-driven adaptation of prosody in a multilingual TTS |
Acquisition and Annotation of Slovenian Broadcast News Database
|
Creating Slovenian Language Resources for Development of
Speech-to-Speech Translation Components
|
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Development and Integration of the LDA-Toolkit into the COST249
SpeechDat (II) SIG Reference Recognizer
|
Development of Slovenian Broadcast News Speech Database
|
The COST 278 MASPER initiative - crosslingual speech recognition
with large telephone databases
|
The COST278 pan-European Broadcast News Database
|
Sotho |
The African Speech Technology Project: An Assessment
|
South African English |
The African Speech Technology Project: An Assessment |
Spanish |
A Progress Report from the Linguistic Data Consortium: Recent
Activities in Resource Creation and Distribution and the
Development of Tools and Standards |
ALLES: Integrating NLP in ICALL Applications |
Application of the BLEU Method for Evaluating Free-text Answers
in an E-learning Environment |
Automatically selecting domain markers for terminology
extraction |
AV@CAR: A Spanish Multichannel Multimodal Corpus for In-Vehicle
Automatic Audio-Visual Speech Recognition
|
Bilingual Connections for Trilingual Corpora: An XML Approach |
Construction of a Bilingual Arabic-Spanish Lexicon of Verbs
Based on a Parallel Corpus |
Conversational Telephone Speech Corpus Collection for the NIST
Speaker Recognition Evaluation 2004
|
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Cross-effective cross-lingual document classification
|
Cross-Language Acquisition of Semantic Models for Verbal
Predicates |
Development and Integration of the LDA-Toolkit into the COST249
SpeechDat (II) SIG Reference Recognizer
|
Development of Resources for a Bilingual Automatic Index System
of Broadcast News in Basque and Spanish |
Enriching EWN with Syntagmatic Information by means of WSD |
Enriching the Spanish EuroWordNet by Collocations
|
EuroWordNet as a Resource for Cross-language Information
Retrieval |
Evaluation of Consensus on the Annotation of Prosodic Breaks in
the Romance Corpus of Spontaneous Speech “C-ORAL-ROM”
|
FreeLing: An Open-Source Suite of Language Analyzers
|
Intelligent Building of Language Resources for HLT Applications
|
Lexical Entry Templates for Robust Deep Parsing
|
Measurements of Spoken Language Variability in a Multilingual
Corpus. Predictable Aspects
|
MED-TYP: A Typological Database for Mediterranean Languages |
Mercedes, A Term-In-Context Highlighter |
Methodology for Rapid Prototyping and Testing of ASR Based User
Interfaces |
MiniCors and Cast3LB: Two Semantically Tagged Spanish Corpora
|
Multilingual Corpus-based Approach to the Resolution of English
-ing |
Multiple Sequence Alignment for characterizing the linear
structure of revision |
NLP-enhanced Content Filtering within the POESIA Project |
NLP-enhanced error Checking for Catalan unrestricted text |
OntoTag's Linguistic Ontologies: Enhancing Higher Level and
Semantic Web Annotations |
SALA II across the finish line: a large collection of mobile
telephone speech databases from North and Latin America
completed |
Selecting the Correct English Synset for a Spanish Sense
|
Semantic categorization of Spanish se-constructions
|
Spanish WordNet 1.6: Porting the Spanish WordNet Across
Princeton Versions |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction |
The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous
Speech for Romance Languages
|
The COST 278 MASPER initiative - crosslingual speech recognition
with large telephone databases
|
The GENOMA-KB Platform: Queries Over Integrated Linguistic
Resources |
The GENOMA-KB project: towards the integration of concepts,
terms, textual corpora and entities
|
The Integral Dictionary: An Ontological Resource for the
Semantic Web Integration of EuroWordNet, Balkanet, TID and SUMO
|
The Mixer Corpus of Multilingual, Multichannel Speaker
Recognition Data |
The SPARTACUS-Database: a Spanish Sentence Database for Offline
Handwriting Recognition |
The Translation Correction Tool: English-Spanish User Studies |
Towards the MEANING Top Ontology: Sources of Ontological Meaning |
Towards the Use of Word Stems and Suffixes for Statistical
Machine Translation |
Training a Sentence-Level Machine Translation Confidence Measure
|
Transcrigal: A Bilingual System for Automatic Indexing of
Broadcast News |
Translation memories enrichment by statistical bilingual
segmentation |
Spanish (Latin
American) |
Developing Language Resources for a Transnational Digital
Government System |
Swahili |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Swedish |
A pattern extraction workbench combining multiple linguistic
levels |
Finding the Correct Interpretation of Swedish Compounds, a
Statistical Approach |
MT Goes Farming: Comparing Two Machine Translation Approaches on
a New Domain |
Open Resources for Language Technology |
Probabilistic Detection of Context-Sensitive Spelling Errors
|
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
T |
|
Tamil |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Thai |
Collection of SLR in the Asian-Pacific area
|
Enriching a Thai Lexical Database with Selectional Preferences |
Open Collaborative Development of the Thai Language Resources
for Natural Language Processing |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Tibetan |
A Syntactically Annotated Corpus of Tibetan
|
Turkish |
A Methodology and Associated Tools for Building Interlingual
Wordnets |
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
Development of a Corpus Workbench for the METU Turkish Corpus
|
Duration Modeling for Turkish Text-to-Speech Synthesis System |
Exploring Balkanet Shared Ontology for Multilingual Conceptual
Indexing |
MED-TYP: A Typological Database for Mediterranean Languages |
OrienTel - Telephony Databases Across Northern Africa and the
Middle East |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Tzeltal |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
Tzotzil |
Talkbank: Building an Open Unified Multimodal Database of
Communicative Interaction
|
U |
|
Universal |
A Large Metadata Domain of Language Resources
|
Architecture for Distributed Language Resource Management and
Archiving |
Cross-Disciplinary Integration of Metadata Descriptions
|
Design of an Interactive Web-based User Interface for Speech
Database Query Formation
|
US-English |
Bilingual Connections for Trilingual Corpora: An XML Approach |
Creation and Validation of Large Lexica for Speech-to-Speech
Translation Purposes |
V |
|
Various Native
American |
An Emerging Transcontinental Collaborative Research and
Education Agenda in Human Language Technologies |
Vietnamese |
Developing tools and building linguistic resources for
Vietnamese morpho-syntactic processing
|
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Spoken and Written Language Resources for Vietnamese
|
W |
|
Warao |
A Multi-Modal Documentation System for Warao |
Warlpiri |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
X |
|
Xhosa |
The African Speech Technology Project: An Assessment
|
Z |
|
Zeltal |
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary
Development Report |
Zulu |
Software Tools for Morphological Tagging of Zulu Corpora and
Lexicon Development |
The African Speech Technology Project: An Assessment |