ACL-2000 Tutorials Schedule

Sunday 1 October, Monday 2 October 2000

Unification-based Processing Underway to Dot Com

October 13:30-17:00

Dan Flickinger, CSLI Stanford & YY Software Corporation

Stephan Oepen, Saarland University

In this tutorial we will review the state of the art in the development and application of broad-coverage declarative grammars built on sound linguistic foundations (the 'deep' processing paradigm) and present several aspects of an international research effort, a consortium involving Saarbruecken (Germany), Stanford (USA) and the University of Tokyo (Japan), to produce comprehensive, re-usable grammars and efficient technology for parsing and generating with such grammars. While statistical methods, often described as 'shallow' processing techniques, can bring real advantages in robustness and efficiency, they do not provide the precise, reliable representations of meaning that more conventional symbolic grammars can supply for natural language. We will illustrate the benefits and viability of the declarative approach both in multilingual grammar development (for English, German, and Japanese), and in commercially relevant applications including machine translation, speech prosthesis, and automated email response. The topics we will discuss and demonstrate will include: descriptive formalism requirements, linguistic framework and resources, grammar development tools, diagnostics and measurement, processing efficiency, semantic engineering, re-usability and exchange to support collaboration, and practical applications.

Statistical Machine Translation

1 October 13:30-17:00

Kevin Knight

USC/ISI

The statistical approach to machine translation (MT) seeks to extract translation knowledge automatically from online bilingual texts (e.g., publications of the Canadian or Hong Kong governments). This idea can be traced back to suggestions made by Warren Weaver in the 1940s. It was pioneered at IBM in the 1990s and continues to be inspired by relative successes in statistical speech recognition. We will present a technical, focused tutorial that will cover the statistical MT literature to date. This tutorial will not cover MT in the broad sense (transfer and interlingua approaches, evaluation, commercial products, etc.). Instead, we will concentrate on statistical models proposed for the translation process, using accessible graphical influence diagrams to explain models used in different research projects around the world. We will also cover language models and "decoding" algorithms that perform online translations.
- Introduction
  - History of statistical MT
  - Substitution ciphers, light probability, noisy channel framework
  - Transliteration: a case study of MT as codebreaking
  - Sketch of a complete statistical MT system (training/translation modules)

- Building Blocks
  - Acquisition and cleaning of training data
  - Language modeling and training
  - Translation modeling and training
  - Online translation ("decoding")

- Assessment
  - Empirical results: does it work?
  - Strengths and weaknesses of statistical MT
  - Related applications
  - Immediate and long-term prospects

Morphology for Asian Languages

2 October 08:30-12:00

Kenneth Church

AT&T Labs Research (chair)

The ACL meeting in Hong Kong presents a rare opportunity to bring together a number of well-known experts on NLP issues specific to Asian languages, especially word segmentation (morphology). Unlike in English, splitting a sequence of characters into a sequence of words is a non-trivial problem in many languages because there is no white space between words. There is a lot of work on these problems taking place in many countries, but relatively little of the literature crosses language boundaries. We felt that this ACL meeting would be an excellent chance to bring much of this work together. The tutorial will consist of a half dozen invited talks covering three languages (Chinese, Japanese and Korean) from computational as well as linguistic perspectives. We intend this tutorial to be as inclusive as possible. It should be of interest to both engineers and linguists, and accessible to a diverse audience including experts who have spent considerable time with Asian languages as well as novices like the chair whose experience is, for the most part, limited to a single Western language, namely English.

Speakers:

Keh-Jiann Chen, Academia Sinica, Taiwan
Key-Sun Choi, Korea Advanced Institute of Science and Technology
Kiyong Lee, Korea University
Yuji Matsumoto, Nara Institute of Science and Technology, Japan
Masaaki Nagata, NTT Information and Communication Systems Laboratories, Japan
Benjamin K Tsou, City University of Hong Kong
Tianshun Yao, Northeast University, China

Committee Members:

Kenneth Church (chair), Key-Sun Choi, Yuji Matsumoto, Sung Hyon Myaeng,
Masaaki Nagata, Keh-Yih Su, Lua Kim Teng

Multilingual Information Access

2 October 08:30-12:00

Douglas W. Oard

University of Maryland

This tutorial will address the application of techniques at the intersection of computational linguistics and information retrieval to help users search multilingual collections. The tutorial will draw from several perspectives, examining the contributions of the computational linguistics and information retrieval communities to cross-language information retrieval, and augmenting that with a discussion of related issues from machine translation, text summarization and human-computer interaction. Alternative techniques for each key component will be explained and illustrated using working systems and reported experimental results. Evaluation issues, best present practice, and open research questions will be highlighted throughout the tutorial. The worldwide series of cross-language information retrieval evaluation venues will also be introduced, with particular attention to evaluations that focus on Asian languages. The tutorial will conclude with an assessment of the prospects for adoption of this technology for Internet searching, commercial information retrieval systems, and special-purpose applications.
[This tutorial will be held jointly by ACL and IRAL'2000, the 5th International Workshop on Information Retrieval with Asian Languages]

ACL-2000 Tutorials Co-Chairs:

John Carroll
University of Sussex

Hemant Darbari
CDAC - Pune University

acl2k-tutorials@cogs.susx.ac.uk