Johns Hopkins Computer Science Home
Johns Hopkins University The Whiting School of Engineering

Natural Language Processing
Prof. Jason Eisner
Course # 600.465 - Fall 2011

parse trees

Announcements


Vital Statistics [1]

Course catalog entry: This course is an in-depth overview of techniques for processing human language. How should linguistic structure and meaning be represented? What algorithms can recover them from text? And crucially, how can we build statistical models to choose among the many legal answers?

The course covers methods for trees (parsing and semantic interpretation), sequences (finite-state transduction such as tagging and morphology), and words (sense and phrase induction), with applications to practical engineering tasks such as information retrieval and extraction, text classification, part-of-speech tagging, speech recognition, and machine translation. There are a number of structured but challenging programming assignments. Prerequisite: 600.226 or equivalent. [Eisner, Applications, Fall] 3 credits

More information: Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to linguistics and statistics. You need to know how to program (e.g., 600.120) and use common data structures (600.226). It might also be nice to have some previous familiarity with automata (600.271) and probabilities (600.475, 550.420, or 550.310). At the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's formal and statistical techniques, and be able to understand research papers in the field.

Lectures:MWF 3-4 pm (sometimes 3-4:30), Hackerman B17
Prof:Jason Eisner - (image of email address) ((image of email address))
TA:Xuchen Yao - (image of email address) (xuchen at cs dot jhu dot edu)
CA:TBA
Office hrs: For Prof: MW 4pm after class (or by appt) in Hackerman 324C
For TA: M 12-1, Th 11-12, in Hackerman 226
Discussion
session:
TA-led session (optional) for activities/discussion/questions/review: Th 3:30-4:30, in Hackerman 306
Discussion site: http://piazza.com/class#fall2011/600465 ... public questions, discussion, announcements
Web page:http://cs.jhu.edu/~jason/465
Textbook: Jurafsky & Martin, 2nd ed. (semi-required - P98.J87 2009 in Science Ref section on C-Level)
Roark & Sproat (recommended - P98.R63 2007 in same section)
Manning & Schütze (recommended - free online PDF version here!)
Policies: Grading: homework 50%, participation 5%, midterm 15%, final 30%
Submission: TBA
Lateness: floating late days policy
Honesty: here's what it means
Intellectual engagement: much encouraged
Announcements: Read mailing list and this page!
Related
sites:
List of many courses - some may have useful material!

Schedule

This class is in the 3-4:30 time slot, a "flex slot" that is intended to permit either three 50-minute lectures or two 75-minute lectures per week. Please try to keep the whole slot free, especially on MW:

Warning: For future lectures and assignments, the links below take you to last year's versions, which are subject to change. The schedule may also change.

Warning: I sometimes turn off the PDF links when they are not up to date with the PPT links. If they don't work, just click on "ppt" instead.


Week Monday Wednesday Friday Suggested Reading
8/29 Introduction (ppt)
  • Why is NLP hard?
  • Levels of language
  • NLP applications
  • Random language via n-grams
  • Questionnaire
  • Assignment 1 given: Designing CFGs
    Chomsky hierarchy (ppt)
  • What's wrong with n-grams?
  • Regular expressions, CFGs, & more
  • Lists, trees, and vectors
  • Language models (ppt)
  • Language ID
  • Text categorization
  • Spelling correction
  • Segmentation
  • Speech recognition
  • Machine translation
  • Intro: J&M chapter 1
  • Chomsky hierarchy: J&M 16
  • Language models: M&S 6 (or R&S 6)
  • Text cat: Demo
  • Homework: J&M 12, M&S 3
  • 9/5 No class (Labor Day) Probability concepts (ppt; video lecture)
  • Joint & conditional prob
  • Chain rule and backoff
  • Modeling sequences
  • Cross-entropy and perplexity
  • Bayes' Theorem (ppt)
    Smoothing n-grams (ppt)
  • Maximum likelihood estimation
  • Bias and variance
  • Add-one or add-lambda smoothing
  • Conditional log-linear models
  • Regularization
  • Prob/Bayes: M&S 2; slides from Martin or Moore
  • Smoothing: M&S 6; J&M 4; Rosenfeld (2000)
  • 9/12
  • Assignment 1 due
        (& another sign meant 3 ... ?)
    Assignment 2 given: Probabilities
    Limitations of CFG
  • Discussion of Asst. 1
  • Improving CFG with features (ppt)
  • Morphology
  • Lexicalization
  • Tenses
  • Gaps (slashes)
  • Catch-up day
        (we'll be behind schedule by now)
  • Features: J&M 15
  • 9/19 Assignment 2 due
    Assignment 3 given: Language Models
    Context-free parsing (ppt)
  • What is parsing?
  • Why is it useful?
  • Brute-force algorithm
  • CKY and Earley algorithms
  • Context-free parsing
  • From recognition to parsing
  • Incremental strategy
  • Dotted rules
  • Sparse matrices
  • Earley's algorithm (ppt)
  • Top-down parsing
  • Earley's algorithm
  • Parsing: J&M 13
  • 9/26 (not covered this year)
    Extending CFG (summary (ppt))
  • CCG
  • TSG, TAG, TIG
  • No class (Rosh Hashanah)
    Probabilistic parsing (ppt)
  • PCFG parsing
  • Dependency grammar
  • Lexicalized PCFGs
  • CCG: Steedman & Baldridge; more
  • TAG/TSG: Van Noord, Guo, Zhang 1/2/3
  • Prob. parsing: M&S 12, J&M 14
  • 10/3 Assignment 3 due
    Assignment 4 given: Parsing
    Parsing tricks (ppt)
  • Pruning; best-first
  • Rules as regexps
  • Left-corner strategy
  • Smoothing
  • Evaluation
  • Catch-up day
        (we'll be behind schedule by now)
  • A song about parsing
  • (not covered this year)
    Human sentence processing (ppt)
  • Methodology
  • Frequency sensitivity
  • Incremental interpretation
  • Unscrambling text (ppt)
  • TBA
    10/10 (Monday 10/10 is fall break day; but class meets on Tuesday 10/11, which will follow a Monday schedule)
    Semantics (ppt)
  • What is understanding?
  • Lambda terms
  • Semantic phenomena and representations
  • Semantics continued
  • More semantic phenomena and representations
  • Assignment 5 given: Semantics
    Semantics continued
  • Adding semantics to CFG rules
  • Compositional semantics
  • J&M 17-18; also this web page, up to but not including "denotational semantics" section; and you could try the Penn Lambda Calculator; and how about lambda calculus for kids?
    10/17 Midterm exam
    (3-4:30 in classroom)
    Forward-backward algorithm (ppt) (Excel spreadsheet; Viterbi version; lesson plan; video lecture)
  • Ice cream, weather, words and tags
  • Forward and backward probabilities
  • Inferring hidden states
  • Controlling the smoothing effect
  • Forward-backward continued
  • Reestimation
  • Likelihood convergence
  • Symmetry breaking
  • Local maxima
  • Uses of states
  • J&M 6 or perhaps Allen pp. 195-208 (handout); M&S 11
    10/24 Assignment 4 due
    Assignment 6 given: Hidden Markov Models
    Expectation Maximization (ppt)
  • Generalizing the forward-backward strategy
  • Inside-outside algorithm
  • Posterior decoding
  • Finite-state algebra (ppt)
  • Regexp review
  • Properties
  • Functions, relations, composition
  • Simple applications
  • Finite-state machines
  • Acceptors
  • Expressive power
  • Weights and semirings
  • Lattice parsing
  • Transducers
  • John Lafferty's inside-outside notes; R&S 1
    10/31 Finite-state implementation (ppt)
  • Finite-state operators
  • Uses of composition
  • Implementing the operators
  • Finite-state tagging (ppt)
  • The task
  • Hidden Markov Models
  • Transformation-based
  • Constraint-based
  • Assignment 5 due
    Noisy channels and FSTs (ppt)
  • Regexps and segmentation
  • The noisy channel generalization
  • Applications of the noisy channel
  • Implementation using FSTs
  • chaps 2-3 of xfst book draft (only accessible from barley and other Solaris machines at JHU CS; don't distribute); R&S 2; perhaps also this paper
    11/7 More FST examples (ppt)
  • Baby talk
  • Edit distance
  • Back-transliteration
  • Machine translation
  • Programming with regexps (ppt)
  • Analogy to programming
  • Extended finite-state operators
  • Date parsing
  • FASTUS
  • (covered only partially this year)
  • Morphology and phonology (ppt)
  • English, Turkish, Arabic
  • Stemming
  • Compounds, segmentation
  • Two-level morphology
  • Punctuation
  • Rewrite rules
  • OT
  • J&M 5 or M&S 10
    11/14 Assignment 7 given: Finite-State Modeling
    Log-linear models (ppt)
  • Text classification
  • Maximum entropy perspective
  • Gradient ascent
  • Assignment 6 due
    Structured prediction
  • Global vs. local normalization
  • Computing the gradient of conditional likelihood
  • Latent variables
  • Structured perceptron and PA
  • Global models
  • Conditioning on less
  • Energy-based modeling / MRFs
  • Simulated annealing
  • Gibbs sampling
  • J&M 6
    11/21 (not covered this year)
    Topic models (video lecture part 1, part 2)
  • Bayesian multinomial
  • Bayesian text categorization
  • Latent Dirichlet allocation
  • Dynamic topic models
  • No class (Thanksgiving break) No class
    (Thanksgiving break)
    M&S 15.2, 15.4
    11/28 (not covered this year)
    Machine translation
  • Synchronous grammars
  • Language models and lattice parsing
  • Alignments
  • Phrase-based and HMM models
  • Training MT systems
  • Resource discovery
  • Alignment
  • Extraction (with or without parses)
  • Evaluation metrics and tuning
  • Current NLP tasks and competitions (ppt)
  • The NLP community
  • Text annotation tasks
  • Other tasks
  • J&M 25, M&S 13, statmt.org;
    tutorial (2003), workbook (1999), introductory essay (1997), technical paper (1993);
    tutorial (2006) focusing on more recent developments (slides, 3-hour video part 1, part 2)
    12/5 Assignment 7 due
    NLP tasks continued

    Sun 12/11 is absolute deadline for late assignments --->

    Final exam: Thu 12/15, 9am-noon --->


    Old Materials

    Lectures from past years, some still useful: