Professor
ACL Fellow
Department of Computer Science
Johns Hopkins University
3400 N. Charles Street, Hackerman 226
Baltimore, MD 21218-2680
U.S.A.
Email: jason@cs.jhu.edu
Web: http://cs.jhu.edu/~jason
Facebook: @jeisner
Twitter: @adveisner
Scholar: citations
G+: profile page
Office: Hackerman 324C
Phone: (410) 516-8438 (dial 516-THETA)
Skype: jasoneisner (email me to set up a time)
Fax: (410) 516-6134
Department of Computer Science (my primary appointment)
Center for Language and Speech Processing (my major multi-departmental center at JHU)
Data Science and AI Institute (large community of ML/AI researchers at JHU)
Mathematical Institute for Data Science (see also older Machine Learning Group page)
Department of Cognitive Science (my joint appointment)
Novel methods in NLP and ML, focusing on probabilistic modeling and inference in complex, structured, or ill-defined settings.
This often involves new machine learning; creative uses and modifications of large language models; probabilistic models of linguistic structure, human behavior, and machine behavior; combinatorial algorithms and approximate inference.
I'm also into designing declarative specification languages backed by general efficient algorithms (and adaptive execution). This produces a coherent view of all of the modeling and algorithmic options, and accelerates the research of others.
The questions: Large language models attempt to imitate typical human behavior. How can we combine this with disciplines for ensuring rational behavior, such as statistics, case analysis and planning, reinforcement learning, the scientific method, and probabilistic modeling of the world? How can we use this to support humans, including by integrating human preferences and expertise?
The engineering motivation: Computers must learn to understand human language. A huge portion of human communication, thought, and culture now passes through computers. Ultimately, we want our devices to help us by understanding text and speech as a human would—both at the small scale of intelligent user interfaces and at the large scale of the entire multilingual Internet.
The scientific motivation: Human language is fascinatingly complex and ambiguous. Yet babies are born with the incredible ability to discover the structure of the language around them. Soon they are able to rapidly comprehend and produce that language and relate it to events and concepts in the world. Figuring out how this is possible is a grand challenge for both cognitive science and machine learning.
The disciplines: My research program combines computer science with statistics and linguistics. The challenge is to fashion statistical models that are nuanced enough to capture good intuitions about linguistic structure, and especially, to develop efficient algorithms to apply these models to data (including training them with as little supervision as possible, or making use of large pre-trained models).
Models: I've developed significant modeling approaches for a wide variety of domains in natural language processing: syntax, phonology, morphology, and machine translation, as well as semantic preferences, name variation, and even database-backed websites. The goal is to capture not just the structure of sentences, but also deep regularities within the grammar and lexicon of a language (and across languages). My students and I are always thinking about new problems and better models. For example, latent variables and nonparametric Bayesian methods let us construct a linguistically plausible account of how the data arose. Our latest models continue to include linguistic ideas, but they also include deep neural networks, to fit unanticipated regularities, and large pre-trained language models, to exploit the knowledge implicit in large corpora.
Algorithms: A good mathematical model will define the best analysis of the data, but can we compute that analysis? My students and I are constantly developing new algorithms to cope with the tricky structured prediction and learning problems posed by increasingly sophisticated models. Unlike in many areas of machine learning, we have to deal with probability distributions over unboundedly large structured variables such as strings, trees, alignments, and grammars. My favorite tools include dynamic programming, Markov chain Monte Carlo (MCMC), belief propagation and other variational approximations, automatic differentiation, deterministic annealing, stochastic local search, coarse-to-fine search, integer linear programming, and relaxation methods. I especially enjoy connecting disparate techniques in fruitful new ways.
General paradigms: My students and I also work to pioneer general statistical and algorithmic paradigms that cut across problems (not limited to NLP). We are developing a high-level declarative programming language, Dyna, which allows startlingly short programs, backed up by many interesting general efficiency tricks so that these don't have to be reinvented and reimplemented in new settings all the time. We are also showing how to learn execution strategies that do fast and accurate approximate statistical inference, and how to properly train these essentially discriminative strategies in a Bayesian way. We have also developed other machine learning techniques and modeling frameworks of general interest, primarily for structured prediction and temporal sequence modeling.
Measuring success: We implement our new methods and evaluate them carefully on collections of naturally occurring language. We have repeatedly improved the state of the art. While our work can certainly be used within today's end-user applications, such as machine translation and information extraction, we ourselves are generally focused on building up the long-term fundamentals of the field.
In general, I have broad interests and have worked on a wide range of fundamental topics in NLP, drawing on varied areas of computer science. See my papers, CV, and research summary for more information; see also notes on my advising style.
See also other tutorial material.
Undergraduates are often curious about their teachers' secret lives. In the name of encouraging curiosity-driven research, here are a few photos:
And some non-photos:
If I had a geek code, it would be GCS/O/M/MU d-(+) s:- a+ C++$ ULS+(++) L++ P++ E++>+++ W++ N++ o+ K++ w@ !O V- PS++ PE- Y+ PGP b++>+++ !tv G e++++ h- r+++ y+++, but I disapprove of the feeping creaturism of these things.