The aim of this workshop is to bring together designers, developers and users of content encoding practices in order to promote de facto standards for content creation, management and delivery. More specifically, the workshop is a hands-on meeting meant to produce a manifesto of requirements and recommendations for best practices in content interoperability. The manifesto is intended to form the basis for a business-driven alliance of practitioners and users from industry and research organizations in this area, whose goal is to converge on a specific operational metadata scheme for virtual integration in the creation, management and delivery of content.
In the next five years, Human Language Technologies (HLT) related to content processing will find their steadiest and strongest growth within the market segments of Content Management and Syndication. In the US alone, Content Management software is expected to generate revenues of nearly $5.2 billion by 2005 (Merrill Lynch, 6/01) and up to $7.2 billion by 2006 (Butler Group). Jupiter (5/01) estimates that about 70% of the archived content market ($850 million in 2001, estimated to grow to $4.6 billion in 2006) will be realized through syndication. The ensuing opportunities for language-based solutions to content understanding, extraction and generation tasks are therefore substantial.
Realizing these prospective opportunities requires content interoperability standards capable of granting Human Language Technologies seamless integration into the content creation, management and delivery supply chain. Most current Content Management environments offer a relatively open software infrastructure in which a variety of HLT components can be integrated. However, the opportunity of operational integration into Content Management infrastructures, whether implemented in OEM or ASP mode, will have a fragmenting effect on the HLT industry as a whole if it continues to rely on proprietary metadata schemes. HLT solutions will remain tightly bound to the specific applications they were engineered to service and will be less adaptable to other systems, with a consequent lack of fair competition for HLT providers and a paucity of choice for prospective buyers. Open content interoperability standards are thus necessary to stimulate market growth in the HLT sector and provide a healthy competitive environment for the HLT industry as a whole.
Emerging Semantic Web standards such as DAML and OIL are now starting to provide a description framework in which complex meaning relationships can be actively encoded and effectively engaged in categorization, search, navigation, retrieval and extraction technologies. However, Semantic Web standards must be tailored to the specific needs of vertical industries in order to promote business viability for the technologies they are intended to facilitate. This is demonstrated by the creation and use of metadata standards such as PRISM, NewsML, NITF and ICE in the publishing industry. In addressing the need for syndicating, aggregating, post-processing and multi-purposing content, these initiatives have built consortia where content providers and content management software vendors work together to create the "right" vocabulary and support in the leading software tools, facilitating widespread adoption throughout the industry.
The workshop comprises two working sessions. The morning session will consist of three invited talks (35 minutes each including discussion), each followed by two short papers (20 minutes each including discussion). The afternoon session will be devoted to an in-depth discussion of issues raised during the morning session, with the aim of converging on a lockstep approach to the deliberation of content interoperability standards.
08:45-09:00 | Opening |
09:00-09:35 | Invited Talk I |
09:35-10:15 | Two Short Papers |
10:15-10:50 | Invited Talk II |
10:50-11:30 | Two Short Papers |
11:30-11:45 | Coffee Break |
11:45-12:20 | Invited Talk III |
12:20-13:00 | Two Short Papers |
13:00-14:00 | On-site Lunch |
14:00-15:00 | Three Breakout Working Groups, tasked to critique the morning sessions |
15:00-15:30 | WG1 presentation & discussion |
15:30-15:45 | Coffee Break |
15:45-16:15 | WG2 presentation & discussion |
16:15-16:45 | WG3 presentation & discussion |
16:45-17:00 | Concluding remarks |
Admission to the workshop is limited to 40 participants and will be decided on the basis of a one-page statement of interest including:
Those interested in giving a talk should also submit a position paper (~1500 words). Topics to be addressed in the statements of interest and position papers include, but are not limited to:
Statements of interest and position papers will be circulated among participants ahead of time to create a shared background and facilitate discussion before the event. A selection of statements of interest and position papers will be distributed in printed form as workshop proceedings. The results of the workshop will be compiled and published as a collection in a major trade journal.
Deadline for workshop abstract submission | 18th February 2002 |
Notification of acceptance | 8th March 2002 |
Final version of paper for proceedings | 5th April 2002 |
Workshop | 1st June 2002 |
David Allen | International Press Telecommunications Council (UK) |
Chris Porter | Factiva (UK) |
Chinatsu Aone | SRA International (USA) |
Anna Bjarnestam | Getty Images (USA) |
Nicoletta Calzolari | Istituto di Linguistica Computazionale del CNR (Italy) |
Ido Dagan | LingoMotors, Inc. |
Ron Daniel Jr. | Interwoven (USA) |
Sharon Flank | eMotion (USA) |
Chris Green | Time Warner, Inc. (USA) |
Nancy Ide | Vassar College (USA) |
Roger Medlin | Artesia Technologies (USA) |
Eric Miller | W3C World Wide Web Consortium (USA) |
Tony Rose | Reuters (UK) |
Piek Vossen | Irion (The Netherlands) |