In the Information Society, the pervasive character of Human Language Technologies (HLT) and their relevance to practically all fields of Information Society Technologies (IST) has been widely recognised.
Two issues are particularly relevant: the availability of Language Resources (LRs) and the methods for the evaluation of resources, technologies, products and applications. Substantial mutual benefits are achieved by addressing these issues through international collaboration.
The term language resources (LRs) refers to sets of language data and descriptions in machine-readable form, used in many types of areas, components, systems and applications:
creation and evaluation of natural language, speech and multimodal algorithms and systems,
software localisation and language services,
language enabled information and communication services,
e-commerce, e-publishing, e-learning, e-government,
This large range of uses makes the LRs infrastructure a strategic part of the e-society, where the creation of a basic set of LRs for all languages must be ensured in order to bring all languages
to the same level of usability and availability.
Examples of LRs are written or spoken corpora and lexica, which may be annotated or not, multimodal resources, grammars, terminology or domain-specific databases and dictionaries, ontologies,
multimedia databases, etc. LRs also cover basic software tools for the acquisition, preparation, collection, management, customisation and use of the above-mentioned examples.
The relevance of evaluation for language technologies development is increasingly recognised. This involves assessing the state-of-the-art for a given technology, measuring the progress achieved
within a programme, comparing different approaches to a given problem, assessing the availability of technologies for a given application, benchmarking, and assessing system usability and user
The aim of this conference is to provide an overview of the state-of-the-art, discuss problems and opportunities, exchange information regarding LRs, their applications, ongoing and planned
activities, industrial uses and needs, requirements coming from the new e-society, both with respect to policy issues and to technological and organisational ones. LREC will also elaborate on
evaluation methodologies and tools, explore the different trends and promote initiatives for international collaboration in the areas mentioned above.
Issues in the design, construction and use of Language Resources (LRs):
Guidelines, standards, specifications, models and best practices for LRs,
Methods, tools and procedures for the acquisition, creation, management, access, distribution and use of LRs,
Methods for the extraction and acquisition of knowledge (e.g. terms, lexical information, language modelling) from LRs,
Organisational and legal issues in the construction, distribution, access and use of LRs,
Availability and use of generic vs. task/domain-specific LRs,
Definition and requirements for a Basic and Extended LAnguage Resource Kit (BLARK, ELARK) for all languages,
Monolingual and multilingual LRs,
Multimedia and multimodal LRs,
Integration of various media and modalities in LRs (speech, vision, language),
Documentation and archiving of languages, including minority and endangered languages,
Ontologies and knowledge representation,
Terminology and NLP, tools and methodologies for terminology and ontology building, term extraction, specialised dictionaries,
LRs for linguistic research in human-machine communication,
Exploitation of LRs in different types of applications (information extraction, information retrieval, speech dictation, translation, summarisation, web services, semantic web, etc.),
Exploitation of LRs in different types of interfaces (dialog systems, natural language and multimodal/multisensorial interactions, etc.),
Industrial LRs requirements, user needs and community’s response,
Industrial production of LRs,
Industrial use of LRs,
Metadata descriptions of LRs.
Issues in Human Language Technologies (HLT) evaluation:
Evaluation, validation, quality assurance of LRs,
Evaluation methodologies, protocols and measures,
Benchmarking of systems and products, resources for benchmarking and evaluation, blackbox, glassbox and diagnostic evaluation of systems,
Usability and user experience evaluation, qualitative and perceptive evaluation,
Evaluation in written language processing (document production and management, text retrieval, terminology extraction, message understanding, text alignment, machine translation, morphosyntactic
tagging, parsing, semantic tagging, word sense disambiguation, text understanding, summarisation, question answering, localisation, etc.),
Evaluation in spoken language processing (speech recognition and understanding, voice dictation, oral dialog, speech synthesis, speech coding, speaker and language recognition, spoken translation, etc.),
Evaluation of multimedia document retrieval and search systems (including detection, indexing, filtering, alert, question answering, etc.),
Evaluation of multimodal systems,
From evaluation to standardisation.
General issues:
National and international activities and projects,
LRs and the needs/opportunities of the emerging industries,
LRs and contributions to societal needs (e.g. e-society),
Priorities, perspectives, strategies in national and international policies for LRs,
Needs, possibilities, forms, initiatives of/for international cooperation, and their organisational and technological implications,
Open architectures for LRs.
The Conference targets the integration of different types of LRs (spoken, written and other modalities) and of the respective communities. To this end, LREC encourages submissions covering issues
which are common to different types of Language Technologies, such as dialog strategy, written and spoken translation, domain-specific data, multimodal communication or multimedia document processing, and will organise, in addition to the usual tracks, common sessions encompassing the different areas of LRs.