ACL-2000 Tutorials Schedule
In this tutorial we
will review the state of the art in the development and application of
broad-coverage declarative grammars built on sound linguistic foundations
(the `deep' processing paradigm) and present several aspects of an international
research effort---a consortium involving Saarbruecken (Germany), Stanford
(USA) and the University of Tokyo (Japan)---to produce comprehensive,
re-usable grammars and efficient technology for parsing and generating
with such grammars. While statistical methods, often described as `shallow'
processing techniques, can bring real advantages in robustness and efficiency,
they do not provide the precise, reliable representations of meaning that
more conventional symbolic grammars can supply for natural language. We
will illustrate the benefits and viability of the declarative approach
both in multilingual grammar development (for English, German, and Japanese),
and in commercially relevant applications including machine translation,
speech prosthesis, and automated email response. The topics we will discuss
and demonstrate will include: descriptive formalism requirements, linguistic
framework and resources, grammar development tools, diagnostics and measurement,
processing efficiency, semantic engineering, re-usability and exchange
to support collaboration, and practical applications.
The
statistical approach to machine translation (MT) seeks to extract translation
knowledge automatically from online bilingual texts (e.g., publications
of the Canadian or Hong Kong governments). This idea can be traced back
to suggestions made by Warren Weaver in the 1940s. It was pioneered at
IBM in the 1990s and continues to be inspired by relative successes in
statistical speech recognition. We will present a technical, focused tutorial
that will cover the statistical MT literature to date. This tutorial will
not cover MT in the broad sense (transfer and interlingua approaches,
evaluation, commercial products, etc.)---we will instead concentrate on
statistical models proposed for the translation process, using accessible
graphical influence diagrams to explain models used in different research
projects around the world. We will also cover language models and "decoding"
algorithms that perform online translations.
The ACL meeting in Hong Kong presents a rare opportunity to bring together a number of well-known experts on NLP issues specific to Asian languages, especially word segmentation (morphology). Unlike in English, splitting a sequence of characters into a sequence of words is a non-trivial problem in many languages because words are not separated by white space. A lot of work on these problems is taking place in many countries, but relatively little of the literature crosses language boundaries. We felt that this ACL meeting would be an excellent chance to bring much of this work together. The tutorial will consist of a half-dozen invited talks covering three languages (Chinese, Japanese and Korean) from computational as well as linguistic perspectives. We intend this tutorial to be as inclusive as possible. It should be of interest to both engineers and linguists, and accessible to a diverse audience, including experts who have spent considerable time with Asian languages as well as novices like the chair, whose experience is, for the most part, limited to a single Western language, namely English.
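To make the segmentation problem concrete, here is a minimal sketch of greedy forward maximum matching, one simple dictionary-based baseline. It is an illustration only, not a method attributed to any of the invited talks. The function name `max_match`, the toy lexicon, and the use of English text with the spaces removed are all assumptions chosen so the example is readable without knowledge of Chinese, Japanese or Korean.

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching over an unsegmented string.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a lexicon hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon; a real system would use a large dictionary.
LEXICON = {"the", "theta", "table", "bled", "down", "own", "there"}

print(max_match("thetabledownthere", LEXICON, max_len=5))
# prints ['theta', 'bled', 'own', 'there']
```

The greedy choice commits to "theta" and never recovers the intended reading "the table down there", which shows why segmentation without white space is ambiguous and why a dictionary lookup alone is not enough.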
This tutorial will address
the application of techniques at the intersection of computational linguistics
and information retrieval to help users search multilingual collections.
The tutorial will draw from several perspectives, examining the contributions
of the computational linguistics and information retrieval communities
to cross-language information retrieval, and augmenting that with a discussion
of related issues from machine translation, text summarization and human-computer
interaction. Alternative techniques for each key component will be explained
and illustrated using working systems and reported experimental results.
Evaluation issues, current best practice, and open research questions
will be highlighted throughout the tutorial. The worldwide series of cross-language
information retrieval evaluation venues will also be introduced, with
particular attention to evaluations that focus on Asian languages. The
tutorial will conclude with an assessment of the prospects for adoption
of this technology for Internet searching, commercial information retrieval
systems, and special-purpose applications.