Natural Language Processing
Prof. Jason Eisner
Course # 601.465/665 — Fall 2024
Announcements
- 11/23/24 HW8 is out! It's 1 page long with no reading handout, but goes with a couple of Python notebooks and some code to inspect.
You may work in pairs. Due date is Sunday, December 8, at
11:59pm (as late as we could make it without illegally cutting into reading period).
- 11/16/24 HW7
is finally available as well. We provided its medium-length reading handout
several days ago. You may continue to work with your HW6 partner if you like.
HW7 is now due on Tuesday, November 26, at
noon -- but try to finish it by Friday, Nov 22 so it doesn't ruin your Thanksgiving break!
(There may not be office hours etc. during Thanksgiving break.)
- 11/3/24 The rest of HW6
is finally available. We provided a long reading handout (several days ago) to review the
ideas and fill in some details. This homework shouldn't be
too hard conceptually if you followed the HMM and CRF
lectures, but you'll still have to keep track of a lot of
ideas, code, and experiments. You may work in pairs. The
deadline is Tuesday 11/12.
Note that HW7 will build on HW6, so you'll continue working
with this codebase (and optionally with the same partner).
- 10/21/24 HW5 is available, with a short "reading handout" appended to it. It deals with
attaching semantic λ-expressions to grammar rules. It is
due on Monday, 10/28, at 11pm.
- 9/27/24 HW4
is available, with a separate "reading handout" appended to it. You may want to do HW3 first,
but we're making HW4 available now so that you can read the
handout while parsing is still fresh in your mind from lecture.
The reading might also help you study parsing for the midterm.
This is the conceptually hardest homework project in the course, with two
major challenges: probabilistic Earley parsing, and making parsing
efficient. It is due on Monday, 10/21, at 11pm. You
may work with a partner on this one.
- 9/22/24 HW3
is now available, with a separate "reading handout" appended
to it. The due date is Sun, 10/6, at 11pm. Start
early: This is a long and detailed homework that
requires you to write some smoothing code and experiment with
its parameters and design to see what happens. It should be
manageable because we've already covered the ideas in class
and on HW2, and because we've provided you with a good deal of
code. But it may take some time to understand that code and
the libraries that it uses (especially PyTorch).
I strongly suggest that you start reading the 27-page
reading handout now, then study the starter code and ask
questions on Piazza as needed. Spread the work out. You may
work in pairs.
- 9/6/24 HW2
(11 pages) is available. It's due in a little over 2 weeks: Mon 9/23 at
2pm. This homework is mostly a problem set about manipulating
probabilities. But it is a long homework! Most
significantly, question 6 asks you to read a separate handout
and to work through a series of online lessons, preferably
with a partner or two. Question 8 asks you to write a small
program. It is okay to work on questions 6 and 8 out of
order.
- 8/28/24 HW1
(12 pages) is available. It is due on Wed 9/11 at 2pm: please
get this one in on time so we can discuss it in class an hour
later.
- 8/26/24 First class is Mon 8/26, 3pm, Krieger 205. As explained on the syllabus, please keep MWF 3-4:30 pm open to accommodate a variable class schedule as well as office hours after class. Our weekly recitations are Tue 6-7:30 pm.
- 8/26/24 Please bookmark this page.
All enrolled students will soon be added to Piazza.
(If you are waitlisted, I will send you a code by email that you can use to join Piazza
if you are attending the class in hopes of getting a seat.)
Later, when Homework 1 is due, we will tell you how to join Gradescope.
Key Links
- Syllabus -- reference info about
the course's staff, meetings, office hours, textbooks, goals,
expectations, and policies. May be updated on occasion.
- Piazza
site for discussion and announcements. Sign up, follow, and participate!
- Gradescope
for submitting your homework.
- Office hours for the course staff.
- Video recordings (see policy on syllabus)
Schedule
Warning: The schedule below is adapted from last year's schedule and may still change! Links to future lecture slides, homeworks, and dates currently point to last year's versions. Watch Piazza for important updates, including when assignments are given and when they are due.
What's Important? What's Hard? What's Easy? [1 week]
Mon 8/26:
Wed 8/28:
Fri 8/30:
- Uses of language models
- Language ID
- Text categorization
- Spelling correction
- Segmentation
- Speech recognition
- Machine translation
- Optional reading about n-gram language models: J&M 3 (or M&S 6)
Probabilistic Modeling [1 week]
Mon 9/2 (Labor Day: no class)
Wed 9/4,
Fri 9/6:
- Probability concepts
- Joint & conditional prob
- Chain rule and backoff
- Modeling sequences
- Surprisal, cross-entropy, perplexity
- Optional reading about probability, Bayes' Theorem, information theory: M&S 2; slides by Andrew Moore
- Smoothing n-grams (video lessons, 52 min. total)
- Maximum likelihood estimation
- Bias and variance
- Add-one or add-λ smoothing
- Cross-validation
- Smoothing with backoff
- Good-Turing, Witten-Bell (bonus slides)
- Optional reading about smoothing: M&S 6; J&M 4; Rosenfeld (2000)
- HW2 given: Probabilities
Mon 9/9:
- Bayes' Theorem
- Log-linear models (self-guided interactive visualization with handout)
- Parametric modeling: Features and their weights
- Maximum likelihood and moment-matching
- Non-binary features
- Gradient ascent
- Regularization (L2 or L1) for smoothing and generalization
- Conditional log-linear models
- Application: Language modeling
- Application: Text categorization
- Optional readings about log-linear models: Collins (pp. 1-4), Smith (section 3.5), J&M 5
Grammars and Parsers [3- weeks]
Wed 9/11:
- HW1 due
- In-class discussion of HW1
- Improving CFG with attributes (video lessons, 62 min. total)
- Morphology
- Lexicalization
- Post-processing (CFG-FST composition)
- Tenses
- Gaps (slashes)
- Optional reading about syntactic attributes: J&M 15 (2nd ed.)
Wed 9/11 (continued),
Fri 9/13,
Mon 9/16:
Wed 9/18,
Fri 9/20:
Mon 9/23:
- HW2 due
- Quick in-class quiz: Log-linear models
- Probabilistic parsing
- PCFG parsing
- Dependency grammar
- Lexicalized PCFGs
- Optional reading on probabilistic parsing: M&S 12, J&M Appendix C
Wed 9/25:
Fri 9/27:
Representing Meaning [1 week]
Mon 9/30,
Wed 10/2,
Fri 10/4:
- HW3 due on Wed 10/2
- Semantics
- What is understanding?
- Lambda terms
- Semantic phenomena and representations
- More semantic phenomena and representations
- Adding semantics to CFG rules
- Compositional semantics
- Optional readings on semantics:
- HW5 given: Semantics
Midterm
Fri 10/11:
- Midterm exam (3-4:30, in classroom)
Representing Everything: Deep Learning for NLP [1+ week]
Wed 10/9,
Fri 10/11,
Mon 10/14,
Wed 10/16:
- Back-propagation (video lesson, 33 min.)
- Neural methods
- Vectors, matrices, tensors; PyTorch operations; linear and affine operations
- Log-linear models, temperatures, learned features, nonlinearities
- Vectors as an alternative semantic representation
- Training signals: Categorical labels, similarity, matching
- Encoders and decoders
- End-to-end training, multi-task training, pretraining + fine-tuning
- Self-supervised learning
- word2vec (skip-gram / CBOW)
- Recurrent neural nets (RNNs, BiRNNs, ELMo)
- Optional reading about neural nets and RNNs: J&M 7, 8
Fri 10/18 (fall break: no class)
Unsupervised Learning [1+ week]
Mon 10/21,
Wed 10/23:
Fri 10/25,
Mon 10/28:
Discriminative Modeling [1- week]
Wed 10/30,
Fri 11/1:
Deep Learning for Structured Prediction; Transformers [1- week]
Mon 11/4,
Wed 11/6:
- Neural methods (continued)
- seq2seq: Structure prediction via sequence prediction (or via tagging)
- Decoders: Exact, greedy, beam search, independent, dynamic programming, stochastic, Minimum Bayes Risk (MBR)
- Attention
- Transformers (encoder-decoder, encoder-only (BERT), decoder-only (LM))
- Positional embeddings
- Tokenization
- Parameter-efficient fine tuning, distillation, RLHF (REINFORCE, PPO, DPO)
- Optional reading on Transformers: The Illustrated Transformer; J&M 9, J&M 11; GPT-2 spreadsheet
Harnessing Large Language Models [1+ week]
Fri 11/8,
Mon 11/11,
Wed 11/13, Fri 11/15:
NLP Applications [2 weeks]
Mon 11/18,
Wed 11/20,
Fri 11/22,
Mon 11/25 (Thanksgiving break),
Wed 11/27 (more break),
Fri 11/29 (more break),
Mon 12/2,
Wed 12/4,
Fri 12/6:
- HW7 due on Fri 11/22
- Current NLP tasks and competitions
- The NLP research community
- Text annotation tasks
- Other types of tasks
- Optional reading: Explore links in the "NLP tasks" slides!
- HW8 due on 12/8
Final
Exam period (12/11 - 12/19):
- Final exam review session (date TBA)
- Final exam (Wed 12/18, 6pm-9pm, Krieger 205)
Unofficial Summary of Homework Schedule
These dates were copied from the schedule above, which is subject to change.
Homeworks are due approximately every two weeks, with longer homeworks getting more time. But the
homework periods are generally longer than two weeks -- they overlap. This gives you more flexibility
about when to do each assignment, which is useful if you have other classes and activities.
We assign homework n as soon as you've seen the lectures you need, rather than waiting
until after homework n-1 is due. So you can jump right in while the material is fresh.
- HW1 (grammar): given Wed 8/28, due Wed 9/11
- HW2 (probability): given Fri 9/6, due Mon 9/23
- HW3 (empiricism): given Fri 9/13, due Sun 10/6
- Midterm: Fri 10/11
- HW4 (algorithms): given Wed 9/25, due Mon 10/21
- HW5 (logic): given Fri 10/4, due Mon 10/28
- HW6 (unsupervised learning): given Wed 10/23, due Fri 11/8
- HW7 (discriminative learning): given Fri 11/1, due Fri 11/22
- HW8 (large language models): given Mon 11/11, due Fri 12/6 (last day of class)
Recitation Schedule
Recitations are normally held on Tuesdays (see the syllabus). Enrolled students are expected to attend the recitation and participate in solving practice problems. This will be more helpful than an hour of solo study. The following schedule is subject to change.
Old Materials
Lectures from past years, some still useful:
Old homeworks: