Johns Hopkins University The Whiting School of Engineering

Statistical Language Learning
Prof. Jason Eisner
Course # 600.665 - Spring 2002


"When the going gets tough, the tough get empirical" -- Jon Carroll


Course Description

Catalog description: This course focuses on past and present research that has attempted, with mixed success, to induce the structure of language from raw data such as text. Lectures will be intermixed with reading and discussion of the primary literature. Students will critique the readings, answer open-ended homework questions, and undertake a final project. [Applications]
Prereq: 600.465 or perm req'd.

The main goals of the seminar are (a) to cover some techniques people have tried for inducing hidden structure from text, and (b) to get you thinking about how to do it better.

Since most of the techniques in (a) don't perform that well, (b) is more important.

The course should also help to increase your comfort with the building blocks of statistical NLP - weighted transducers, probabilistic grammars, graphical models, etc., and the supervised training procedures for these building blocks.



Vital Statistics

Lectures: MTW 2-3 pm, Shaffer 304 (but we'll move to the NEB 325a conference room if we're not too big)
Prof: Jason Eisner - jason@cs.jhu.edu
Office hrs: MW 3-4 pm, or by appt, in NEB 326
Web page: http://cs.jhu.edu/~jason/665
Mailing list: cs665@cs.jhu.edu (cs665 also works on NLP lab machines)
Textbook: none, but the textbooks for 465 may come in handy
Policies: Grading: 30% written responses (graded as check/check-plus, etc.), 30% class participation, 40% project.
Announcements: New readings announced by email and posted below.
Submission: Email me written responses to the whole week's readings by 11 am each Monday.
Academic honesty: dept. policy (but you can work in pairs on reading responses)

Readings and Responses

Generally we will discuss roughly 3 related papers each week. Since we may flit from paper to paper, comparing and contrasting, you should read all the papers by the start of the week.

A centerpiece of the course is the requirement to respond thoughtfully to each paper in writing. You should email me your responses to the upcoming week's papers, in separate plaintext or PostScript messages, by noon each Monday. (Include "665 response" and the paper's authors in the subject line.) I will print the responses out for everyone, and they will anchor our class discussion. They will also be a useful source of ideas for your final projects.

A typical response is 1-3 paragraphs; in a given week you might respond at greater length to some papers than others. It's okay to work with another person. What should you write about? Some possibilities:

Please be as concrete as possible - and write clearly, since your classmates will be reading your words of wisdom!


The Readings

Suggestions for readings are welcome, especially well in advance.