Fall 1999

December 3, 1999

Information extraction systems usually require two dictionaries: a semantic lexicon and a dictionary of extraction patterns for the domain. We will present a multi-level bootstrapping algorithm that generates both the semantic lexicon and extraction patterns simultaneously. As input, our technique requires only unannotated training texts and a handful of seed words for a category. We use a “mutual bootstrapping” technique to alternately select the best extraction pattern for the category and bootstrap its extractions into the semantic lexicon, which then becomes the basis for selecting the next extraction pattern. To make this approach more robust, we add a second level of bootstrapping (meta-bootstrapping) that retains only the most reliable lexicon entries produced by mutual bootstrapping and restarts the process. We evaluated this multi-level bootstrapping technique on a collection of corporate web pages and a corpus of terrorism news articles. The algorithm produced high-quality dictionaries for several semantic categories.
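
The core of the mutual bootstrapping loop described above can be sketched in a few lines. The code below is a simplified reconstruction for illustration only, not the authors' implementation: each candidate extraction pattern is represented simply by the set of noun phrases it extracts from the corpus, and the scoring heuristic (overlap with the current lexicon, plus a small tie-breaker) is an assumption standing in for the paper's actual scoring function.

    # Minimal sketch of the mutual bootstrapping loop (a simplified
    # reconstruction, not the authors' implementation). Each candidate
    # extraction pattern is represented by the set of noun phrases it
    # extracts from the unannotated corpus; a real system would compute
    # these sets by matching patterns against the text.

    def score(extractions, lexicon):
        """Favor patterns whose extractions overlap the current lexicon."""
        hits = len(extractions & lexicon)
        return hits + 0.01 * len(extractions) if hits else 0.0

    def mutual_bootstrap(seed_words, pattern_extractions, max_iters=10):
        lexicon = set(seed_words)
        chosen = []
        for _ in range(max_iters):
            remaining = {p: e for p, e in pattern_extractions.items()
                         if p not in chosen}
            if not remaining:
                break
            best = max(remaining, key=lambda p: score(remaining[p], lexicon))
            if score(remaining[best], lexicon) == 0.0:
                break                          # no pattern touches the lexicon any more
            chosen.append(best)
            lexicon |= remaining[best]         # bootstrap extractions into the lexicon
        return lexicon, chosen

    # Toy usage with invented patterns for a "location" category.
    patterns = {
        "headquartered in <x>": {"boston", "tokyo", "dallas"},
        "offices in <x>":       {"tokyo", "london"},
        "ceo of <x>":           {"acme corp"},
    }
    print(mutual_bootstrap({"boston"}, patterns))

The second, meta-bootstrapping level described in the abstract would wrap this loop: keep only the most reliable of the newly added lexicon entries, discard the rest, and restart mutual bootstrapping from the enlarged seed set.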

December 7, 1999

Like many other PhD students in Computer Science faced with the task of writing a thesis, I was from time to time more occupied with the tools at hand than with the more important issue of the thesis's actual content. Should one use FrameMaker, LaTeX, or something else?

I decided to turn Haskell, my favorite programming language at the time, into a domain-specific language for formatting text documents. One goal was to have a document preparation system that was more orthogonal than LaTeX, so that different features would be easy to combine. As a bonus, I could write the thesis without giving up functional programming.

In my talk, I will present the result, HacWrite (developed together with Thomas Hallgren), which consists of a preprocessor, a library of markup combinators, and back-ends for producing HTML and LaTeX.
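
HacWrite itself is a Haskell library, and its actual combinators are not reproduced here. The sketch below (in Python, for illustration only, with invented element names) shows the general idea of markup combinators paired with interchangeable back-ends: a document is built once as a tree of combinator calls and then rendered to either HTML or LaTeX.

    # Illustration of markup combinators with two back-ends; the names and
    # rendering rules are invented for this sketch and are not HacWrite's API.

    def text(s):         return ("text", s)
    def emph(*children): return ("emph", children)
    def para(*children): return ("para", children)

    def to_html(node):
        tag, body = node
        if tag == "text":
            return body
        inner = "".join(to_html(c) for c in body)
        return {"emph": "<em>" + inner + "</em>",
                "para": "<p>" + inner + "</p>"}[tag]

    def to_latex(node):
        tag, body = node
        if tag == "text":
            return body
        inner = "".join(to_latex(c) for c in body)
        return {"emph": "\\emph{" + inner + "}",
                "para": inner + "\n\n"}[tag]

    doc = para(text("Written once, rendered as "),
               emph(text("HTML or LaTeX")), text("."))
    print(to_html(doc))
    print(to_latex(doc))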

December 10, 1999

The Naval Surface Warfare Center has developed a system (SHADOW) for monitoring network traffic and detecting various kinds of intrusion attempts. This talk will discuss the SHADOW architecture and some lessons learned in network monitoring. We will also discuss some statistical approaches to network monitoring and intrusion detection. Finally, we will discuss some of the new threats that we are seeing, and that we expect to see more of in the months to come.

December 16, 1999

Computer understanding of human actions from video has gained recognition as a challenging research area, with applications in human-computer interaction, video coding, animation, and surveillance.

In this talk I will discuss computational approaches for the modeling, estimation, and recognition of deformable and articulated motions in video sequences. I will demonstrate the application of these models to the analysis of human facial expressions and articulated movement. The motion models will explore the following dimensions:

  1. Instantaneous versus temporal formulations

  2. General models versus learned movement-specific models

  3. Stationary versus moving camera

I will initially propose models for instantaneous motion estimation based on spatial constraints. Then I will define the concept of spatio-temporal motion trajectories of brightness in image sequences, formalize the learning and estimation of movement trajectories, and demonstrate their use in tracking and interpreting human motion observed from stationary and moving cameras. Finally, I will discuss future research directions and challenges that lie ahead.
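
To make the first of these concrete, below is a generic sketch of instantaneous motion estimation under spatial constraints: the brightness-constancy equation solved by least squares over a patch, in the spirit of a Lucas-Kanade estimate. This is a textbook formulation added for illustration; it is not the specific models proposed in the talk.

    import numpy as np

    # Generic instantaneous motion estimate for an image patch: solve the
    # brightness-constancy constraint Ix*u + Iy*v + It = 0 in the
    # least-squares sense. Illustrative only; not the talk's models.

    def patch_flow(frame_a, frame_b):
        Iy, Ix = np.gradient(frame_a.astype(float))          # spatial derivatives
        It = frame_b.astype(float) - frame_a.astype(float)   # temporal derivative
        A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
        b = -It.ravel()
        (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
        return u, v

    # Toy usage: a bright square shifted one pixel to the right between frames.
    a = np.zeros((32, 32)); a[10:20, 10:20] = 1.0
    b = np.zeros((32, 32)); b[10:20, 11:21] = 1.0
    print(patch_flow(a, b))   # u should come out near +1, v near 0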

December 20, 1999

An important aim of robotics is to design and build machines that can recognize and exploit opportunities afforded by new situations. Traditionally in artificial intelligence, this task has fallen to abstract representations, but that has left open the problem of how to ground those representations in sensorimotor activity. In this talk, I propose a computational architecture whereby a mobile robot internalizes representations based on its experience. I first examine a fast on-line learning algorithm that allows the robot to build up a mapping of how its motor signals transform sensory data. Then I propose a way of categorizing object affordances according to their internal effects. Based on these effects, wavelet analysis is applied to sensory data to uncover invariances for developing a representation of goals. Finally, I will consider heuristics for projecting a learned sensorimotor mapping into the future to attain these goals.
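
As a rough illustration of what learning a sensorimotor mapping on-line can look like, the sketch below fits a linear model of how motor commands change sensory readings using an incremental least-mean-squares update, and then uses the same model to project a command forward. It is a generic stand-in added for illustration, not the architecture or algorithm proposed in the talk.

    import numpy as np

    class SensorimotorMap:
        """Generic incremental (LMS-style) model of how motor commands change
        sensory readings: predicts delta_sensors ~ W @ motor. A stand-in
        sketch only; not the talk's architecture."""

        def __init__(self, n_motor, n_sensor, learning_rate=0.05):
            self.W = np.zeros((n_sensor, n_motor))
            self.lr = learning_rate

        def update(self, motor, delta_sensors):
            """One on-line update from an observed (command, sensory change) pair."""
            error = delta_sensors - self.W @ motor
            self.W += self.lr * np.outer(error, motor)

        def predict(self, motor):
            """Project a candidate command forward: the expected sensory change."""
            return self.W @ motor

    # Toy usage: two motor channels shift a three-dimensional sensory reading
    # in fixed (hidden) directions, which the model recovers from experience.
    rng = np.random.default_rng(0)
    true_W = rng.normal(size=(3, 2))
    model = SensorimotorMap(n_motor=2, n_sensor=3)
    for _ in range(2000):
        m = rng.normal(size=2)
        model.update(m, true_W @ m + 0.01 * rng.normal(size=3))
    print(np.round(model.W - true_W, 2))      # entries should be near zero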