This tutorial page should help you get started with the three finite-state packages we have installed: FSA Utilities, Xerox xfst, and AT&T FSM with lextools.
If you have trouble getting the programs to run, please contact Radu Florian (rflorian@cs.jhu.edu). If you have corrections or suggestions for this page, email Jason Eisner (jason@cs.jhu.edu).
Note that you can find other finite-state packages on the web (e.g., this Mathematica library and Java applet), but most of them only deal with finite-state acceptors, not with weights or transducers.
As you probably know, some useful unambiguous transductions are straightforward to carry out in Perl or Python or with regular expression libraries (the GNU regex library for C, C++, Java; other Java packages; and the extended Regex++ library for C/C++). The idea is to simply match a generalized regular expression against different pieces of a text, recording the locations of subexpression matches so that they can then be manipulated by arbitrary code. This hybrid approach is useful if one really needs more power than a finite-state approach - thanks to the possibility of backreferences (which require backtracking) and arbitrary replacement code (most often used to swap substrings of unbounded length). But it sacrifices efficiency, economy, elegance, intermediate nondeterminism, and specification via best weighted path.
This section assumes you use csh or a csh derivative as your shell. If not, you may have to modify these instructions slightly.
.cshrc file:
setenv PATH /users/rtfm/rflorian/software/bin:$PATH
setenv MANPATH /users/rtfm/rflorian/software/man:/software/local/uman:$MANPATH
setenv LEFTYPATH /users/rtfm/rflorian/software/lib/graphviz/lefty
.cshrc file:
setenv PATH /home/0/rflorian/software/bin:$PATH
setenv MANPATH /home/0/rflorian/software/man:/software/local/uman:$MANPATH
setenv LEFTYPATH /home/0/rflorian/software/lib/graphviz/lefty
| FSA Utilities | Xerox FST | AT&T FSM + lextools |
interactive startup | fsa -tk | xfst | N/A |
manual | Holland, local (local is accessible from CS research network only) | book draft (only accessible from CS research and undergrad networks; don't distribute) | man fsmintro; man lextools |
complete regexp guide | Holland, local | sects 2.3-2.4 (p. 42ff), XRCE | man -s5 lextools |
automaton format guide | Holland, local | | man -s5 fsm; man -s3 fsmclass; man -s5 lextools |
interface guide | chaps 11, 12, 13; Prolog manual; Prolog-Emacs interface | overview + ref chap 3 (p. 73ff); help, apropos | N/A |
parentheses | (E) | [E] | (E) |
comments | % comment | # comment | # comment |
atomic expressions | a, a:b, a::3, a:b:3 | a, a:b | a, a:b, <3> (<3> = epsilon with weight 3) |
literal symbol | '*' (or escape(*)) | "*" (or %*) | \* |
symbol escape codes | as in Prolog | as in C | |
long symbol names | foo bar | foobar (greedy) | [foo][bar] |
complex symbol | predicates | | [noun num=pl gender=fem] |
symbol class | a..z (or predicates) | | superclass defn in .sym file (lexmakelab compiles to .scl file for -S option) |
any symbol | ? (surround with spaces) | ? | [<sigma>] (if defined as superclass) |
symbol complement (? - E) | `E | \E | |
edge of string | | .#. | [<bos>], [<eos>] |
concat | [E1,E2,E3] | E1 E2 | E1 E2 (fsmconcat) |
character concat | | {abcd} | abcd |
union | {E1,E2,E3} | E1 | E2 | E1 | E2 (fsmunion) |
empty string (epsilon) | [] | 0 [] | 0 [<epsilon>] (if defined) |
empty language | {} | \? | |
optionality | E^ | (E) | E? |
Kleene closure | E* | E* | E* (fsmclosure) |
Kleene plus | E+ | E+ | E+ |
repetition | | E^n, E^<n, E^>n, E^{n,m} | E^n |
contains | $E | $E | |
reverse | reverse(E) | E.r | fsmreverse |
intersect | E1 & E2 | E1 & E2 | & (fsmintersect) |
difference | E1 - E2 | E1 - E2 | E1 - E2 (fsmdifference) |
complement (?* - E) | ~E | ~E | !E (fsmcomplement) |
cross-product | E1 x E2 E1:E2 (high-precedence) | E1 .x. E2, also {abcd}:{aceg} | E1:E2 |
same-length cross-product | E1 xx E2 | | |
project | domain(E), range(E) | E.u, E.l | fsmproject |
epsilon-remove | efree(E), et al. | | fsmrmepsilon |
determinize | E!; {t,w,wt}_determinize(E) | | fsmdeterminize |
minimize | E#; {t,w,wt}_minimize(E) | | fsmminimize |
compose | E1 o E2 | E1 .o. E2 | E1 @ E2 (fsmcompose) |
invert | invert(E) | E.i | fsminvert |
ignore (insert freely) | | A / B; A ./. B (blocked at edges) | |
restriction | | [A => L1 _ R1, L2 _ R2] | |
replacement | Relevant macros in ~rflorian/software/lib/fsa | A -> B (exhaustive nondeterm.); A (->) B (optional); A @-> B (LR longest); A @> B (LR shortest); A ->@ B (RL longest); A >@ B (RL shortest). Reverse arrow swaps lower, upper. A may have form [. E .] (blocks repeat matches of epsilon in same place). B may have form L ... R (inserts L, R around A). Contextual restrictions: || L _ R (upper upper); // L _ R (lower upper); \\ L _ R (upper lower); \/ L _ R (lower lower). Parallel replacements joined by , or by ,, if restricted | |
A regular expression over an alphabet of part-of-speech tags. The regexp is basically (art | quant)? adj* noun+ although the notation varies from package to package. It is intended to match simple noun phrases: an optional article or quantifier, followed by 0 or more adjectives, followed by 1 or more nouns.
A transducer that matches exactly the same input as the previous regular expression, and outputs a transformed version where non-final noun tags are replaced by nmod ("nominal modifier") tags. For example, it would map the input adj noun noun noun deterministically to adj nmod nmod noun. It would map the input adj to no outputs at all, since that input is not a noun phrase and therefore does not allow even one accepting path.
A transducer that reads an arbitrary input string and nondeterministically brackets and transforms, as above, some arbitrary set of non-overlapping noun phrases.
A transducer that reads an arbitrary input string and outputs a single version where all the maximal noun phrases (chosen greedily from left to right) have been bracketed and transformed as above.
The output of this last transducer on the input string verb art adj noun noun noun prep art adj noun and on the class of input strings quant noun+ verb.
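The first two items in this list can be imitated without any finite-state package. The following is only a rough sketch, assuming space-separated tags; the names NOUN_PHRASE, is_noun_phrase, and noun_to_nmod are invented for this illustration, and Python's re module stands in for a real finite-state compiler.

```python
import re

# Mirrors (art | quant)? adj* noun+ over space-separated tags.
NOUN_PHRASE = re.compile(r"(?:(?:art|quant) )?(?:adj )*noun(?: noun)*")


def is_noun_phrase(tags: str) -> bool:
    """Acceptor: True if the whole tag string is a simple noun phrase."""
    return NOUN_PHRASE.fullmatch(tags) is not None


def noun_to_nmod(tags: str):
    """Transducer sketch: rename non-final nouns to nmod.

    Returns None for inputs that are not noun phrases, imitating a
    transducer that has no accepting path for them.
    """
    if not is_noun_phrase(tags):
        return None
    words = tags.split()
    out = []
    for i, word in enumerate(words):
        followed_by_noun = i + 1 < len(words) and words[i + 1] == "noun"
        out.append("nmod" if word == "noun" and followed_by_noun else word)
    return " ".join(out)


print(noun_to_nmod("adj noun noun noun"))
print(noun_to_nmod("adj"))
```

Unlike a transducer, this function returns at most one output, so it cannot express the nondeterministic third item.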
To start fsa with the graphical interface, type fsa -tk. This opens an interactive Prolog window (where any errors and outputs will appear) and an interactive Tk/Tcl window.
In the "Regex:" field, type [{art,quant}^, adj*, noun+]. Look at the table above to see what these symbols mean. (Note: Emacs editing keys work in the "Regex:" field.)
Notice that the automaton that appears has been automatically determinized and minimized (this often happens but not always). Convince yourself that it's correct. The initial state is green; the final state is red.
To get a better view of the automaton, drag the states around and point to the arcs. Clicking on an arc will print its description in the Prolog window.
Request a slower but excellent view made by the dot program for automatic directed graph layout. Select dot_ghostview or dot_xv from the Visualization menu. (This calls dot to lay out a postscript or gif file, and then invokes the appropriate viewer on that file.)
In your regexp, art, noun, etc. are elements of the formal alphabet. Try changing noun to Noun and hit return; you will get an error message in the Prolog window. The problem is that capitalized symbols have a special meaning to the Prolog back-end. So try changing Noun to the quoted 'Noun': this should work. Then change it back.
(You also have to quote symbols like '*' to block their special meaning. If you want numbers in your formal alphabet, be warned that the digit character '5' (which you can enter into the "String:" field) is distinct in Prolog from the number 5, but both can be in the alphabet, as can 23 and '23'.)
Type art adj adj noun into the "String:" field. The Prolog window will tell you "no" (i.e., it doesn't match). The problem is that the string was interpreted as a sequence of characters 'a','r','t',' ', ... To fix this, on the Settings menu, change symbol_separator to 32, which is the ASCII code for space. (You can also arrange this at startup by including symbol_separator=32 on the command line.) Now when you submit the string, the Prolog window should say "yes".
Another way to test a string is to intersect it with the automaton. If you get the empty language (easy to see after minimization), it didn't match. Thus, type ([art,adj,adj,noun] & [{art,quant}^, adj*, noun+])# into the "Regex:" field: what happens? What if you delete the first noun from the string? (Notice that we are not really using the string itself, but rather the regular language that consists of only that string.)
Now we'll compose our automaton with a transducer that replaces each noun that immediately precedes another noun with nmod. Such a transducer can be constructed by one of the directed replacement operators of Karttunen (1997). We need to load a macro file that defines this operator. On the File menu, choose LoadAux... and load the file ~rflorian/software/lib/fsa/Karttunen97/Karttunen97_2.pl. (You can also load a file at startup time by adding -aux file to the command line; you may use the -aux flag multiple times on the same line to load multiple files.)
Having loaded the file, we can now write replace(Old,New,Left,Right) to get a directed replacement transducer that scans its input from left to right. At every position i, it replaces the longest substring (if any) that starts at i, matches the regexp Old, and falls directly between substrings matching Left and Right. The match for Old is replaced nondeterministically by every string matching New. The transducer then resumes scanning immediately after the end of the matched text.
(This is actually an abbreviation for Old => New lu Left -- Right. => chooses left-to-right scanning, and lu says that the left context is matched against the "lower" (output) version of the string -- i.e., it can see the previous changes -- whereas the right context is matched against the "upper" (input) version that has not been changed yet. <=, ul, uu, and ll are also available.)
Specifically, type replace(noun,nmod,[],noun) into the "Regex:" field. (The resulting transducer will mention a few harmless odd characters that were introduced during its construction.) Try typing foo bar noun baz noun noun bing noun noun noun noun into the "String:" field. You should get the expected output, but you get 16 identical copies of it.
The multiple copies arise from multiple equivalent paths in the transducer. Change the regexp to t_determinize(replace(noun,nmod,[],noun)). (This does transducer (t_) determinization, which is not automatically requested by the graphical interface because it may not terminate.) Now try again; you should get only one copy.
Let's compose our original regexp with the transducer, using the composition operator o. If R and S are relations (i.e., sets of ordered pairs), then R o S = {(x,z): for some y, (x,y) in R and (y,z) in S}. This is a generalization of function composition.
The original regexp matches a particular language L. It will automatically be coerced to an "identity" transduction (namely [{art:art,quant:quant}^, adj:adj*, noun:noun+]) that maps every string in L to itself, and rejects all other strings. In other words, it describes the relation {(x,x): x in L}.
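The definition of composition and the identity coercion can be checked on small finite relations. This is a minimal sketch with made-up data: compose and identity are illustrative helper names, and the relation S is a hand-written toy, not the transducer that FSA Utilities builds.

```python
def compose(R, S):
    """R o S = {(x, z): for some y, (x, y) in R and (y, z) in S}."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}


def identity(L):
    """Coerce a language (a set of strings) to the relation {(x, x): x in L}."""
    return {(x, x) for x in L}


# A toy language and a toy relation, both invented for this example.
L = {"noun", "noun noun"}
S = {("noun", "noun"), ("noun noun", "nmod noun"), ("verb", "verb")}

result = compose(identity(L), S)
print(sorted(result))
```

Composing with identity(L) on the left simply restricts S to inputs in L, which is why the pair for verb disappears from the result.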
So try the composition: [{art,quant}^, adj*, noun+] o replace(noun,nmod,[],noun). Determinize it using t_determinize (again introducing some odd characters that you can ignore). How does this machine work? How does it manage to be deterministic, given that it must look ahead to the next symbol in order to decide whether to change noun to nmod?
Modify the transducer so that it also introduces angle brackets around the noun phrase in the output: [ []:'<', [{art,quant}^, adj*, noun+] o replace(noun,nmod,[],noun), []:'>' ]. Try it on the inputs art adj noun noun noun and art adj. Again, determinize it and try again.
The symbol ? matches any symbol, so ? * matches any string. (Warning: the space after ? is crucial!) If ? * is used as a transduction, the usual rules mean it will be coerced to the transduction that maps any string to itself - i.e., leaves the input unchanged in the output.
Therefore, given a transducer T, we can write [? *,[T, ? *]*] for a transducer that applies T to some arbitrary set of disjoint substrings of the input that it matches, and leaves the rest of the input unchanged.
Take T to be the deterministic transducer we have just constructed, and apply this transformation to it. Try it on the strings verb art adj noun noun noun and art adj. How would you characterize the output?
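To get a feel for how many outputs [? *,[T, ? *]*] allows, one can enumerate them by brute force. The sketch below is an assumption-laden imitation, not FSA Utilities code: transform plays the role of T (bracket a noun phrase and rename non-final nouns), all_outputs tries every choice of disjoint substrings, and both names are invented here.

```python
import re

# Same noun-phrase pattern as in the exercise, over space-separated tags.
NP = re.compile(r"(?:(?:art|quant) )?(?:adj )*noun(?: noun)*")


def transform(words):
    """Stand-in for T: bracketed, transformed phrase, or None if not an NP."""
    if not NP.fullmatch(" ".join(words)):
        return None
    last = len(words) - 1
    body = ["nmod" if w == "noun" and i < last else w
            for i, w in enumerate(words)]
    return ["<"] + body + [">"]


def all_outputs(words, i=0):
    """Set of outputs from applying T to any disjoint substrings of words[i:]."""
    if i == len(words):
        return {()}
    # Option 1: copy the current symbol unchanged.
    results = {(words[i],) + rest for rest in all_outputs(words, i + 1)}
    # Option 2: apply T to some substring that starts here.
    for j in range(i + 1, len(words) + 1):
        t = transform(words[i:j])
        if t is not None:
            results |= {tuple(t) + rest for rest in all_outputs(words, j)}
    return results


for out in sorted(all_outputs("verb noun noun".split())):
    print(" ".join(out))
```

Even this three-symbol input has several outputs, because each noun may be bracketed alone, together with its neighbor, or not at all.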
Use the LoadAux command as before to load ~rflorian/software/lib/fsa/GerdemannVannoord99/replace.pl. This defines 1-argument and 3-argument versions of the replace macro; in these versions the first argument is a transducer. If T is our noun phrase transducer as above, then replace(T) is a transducer that scans the input from left to right, applying T to each maximal substring that it matches. If T is deterministic, then so is replace(T).
Try applying replace(T) to strings like verb art adj noun noun noun prep art adj noun. Fun, eh? (Note: replace(T,Left,Right) will only replace matches that fall directly between substrings matching Left and Right.)
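The left-to-right, longest-match policy can be imitated procedurally. This is a hedged sketch of the policy only, not of how the Gerdemann and van Noord macro is compiled; transform again stands in for T, and replace_longest is a hypothetical name.

```python
import re

NP = re.compile(r"(?:(?:art|quant) )?(?:adj )*noun(?: noun)*")


def transform(words):
    """Stand-in for T: bracket a noun phrase and rename non-final nouns."""
    if not NP.fullmatch(" ".join(words)):
        return None
    last = len(words) - 1
    body = ["nmod" if w == "noun" and i < last else w
            for i, w in enumerate(words)]
    return ["<"] + body + [">"]


def replace_longest(words):
    """Scan left to right; at each position apply T to the longest match."""
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest candidate first
            t = transform(words[i:j])
            if t is not None:
                out.extend(t)
                i = j  # resume right after the matched text
                break
        else:
            out.append(words[i])  # nothing matches here: copy one symbol
            i += 1
    return out


text = "verb art adj noun noun noun prep art adj noun"
print(" ".join(replace_longest(text.split())))
```

Because the longest candidate is always taken, each input has exactly one output, in contrast to the arbitrary bracketing of the previous step.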
Try doing the above application by using composition. That is, build the automaton [verb, art, adj, ... ] o replace(T). This yields the automaton that maps a particular string to its output, and nothing else. Try applying the function range to this automaton, to get only the output.
Finally, try applying the transducer to a class of strings in parallel: in the "range" expression above, replace the specific string by the regexp [quant, noun+, verb]. The resulting automaton describes the set of outputs for these inputs. Add # to the end of the expression to determinize and minimize it. Have a look.
Quit FSA Utilities, either by typing halt. in the Prolog window, or by exiting through the File menu.
a..z: arc labels may be predicates that pick out subsets of a potentially infinite alphabet.
Note: The tutorial material in chapters 2-3 of the xfst book will be very helpful if you want to do real work with xfst.
Note: A somewhat nicer and more standalone version of the following instructions is found in problem 1 of this assignment (postscript version).
To start xfst, type xfst. This gives you a command line. Useful commands are help, help command, and apropos topic.
Because xfst doesn't have good command-line editing and recall facilities, you may want to start a shell in emacs and run xfst in that shell. (M-x shell starts the shell and C-h m tells you how it works.)
Define a regular expression over an alphabet of part-of-speech tags: define nounphrase (Art|Quant) Adj* Noun+ ;
Note: When we type in strings to match against this regexp, xfst will manage to interpret ArtAdjAdjNoun as a length-4 string. It tokenizes by greedy left-to-right longest match, but it is safest not to rely on that. Instead, the convention of starting all multicharacter symbols with a distinguished character or character class prevents any accidental ambiguities. Here we use capital letters for this. In practice, the authors of xfst recommend using + (typed in a regexp as %+ or "+" since + has a special meaning in regexps).
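Greedy left-to-right longest-match tokenization is easy to imitate, which also shows why relying on it can surprise you. The following sketch is an illustration under stated assumptions, not xfst's actual tokenizer: tokenize is a made-up function, and the second symbol set is a contrived example of an accidental ambiguity.

```python
def tokenize(text, symbols):
    """Split text into symbols by greedy left-to-right longest match.

    Characters that start no known multicharacter symbol are treated
    as single-character symbols (an assumption of this sketch).
    """
    by_length = sorted(symbols, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for sym in by_length:  # longest symbols are tried first
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("ArtAdjAdjNoun", {"Art", "Adj", "Noun", "Quant"}))
# With overlapping symbol names, greedy matching picks "abc" first,
# so the reading ab + cd is never found.
print(tokenize("abcd", {"ab", "abc", "cd"}))
```

Starting every multicharacter symbol with a distinguished character, as recommended above, rules out the second kind of situation.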
Get information about the nounphrase machine: print words nounphrase and print net nounphrase.
The former command appears to list all words that can be accepted along acyclic paths (a finite number).
The latter command lists the transitions from each state. States are named sn or fsn depending on whether they are final; s0 or fs0 is the start state. Notice that as in FSA Utilities, the machine has been automatically determinized and minimized for us, since it is an acceptor rather than a transducer.
Let's see whether Art Adj Adj Noun is a noun phrase. We'll test that by intersection:
push nounphrase puts a copy of our automaton on the stack.
read regex Art Adj Adj Noun ; creates an unnamed automaton reading just the test string, and pushes it on the stack too. (Note that the xfst prompt shows the current stack depth; use print stack to see the contents.)
intersect net replaces the two machines on top of the stack with their intersection.
test non-null confirms that the intersection is not null, so Art Adj Adj Noun did indeed match the automaton.
Type pop stack to pop the intersection automaton off the stack and discard it.
There is an easier way to do the test. Just as in FSA Utilities, an automaton is also regarded as an identity transducer on the language it accepts. xfst makes it easy to apply transducers to raw strings, as follows:
push nounphrase pushes the nounphrase machine onto the stack, ready to apply; and then apply down enters a mode where you can type in test strings and discover what nounphrase transduces them to, in the usual ("down") direction of transduction. If the input string is copied to the output, then it's part of the language; if there are no output strings, then the input was not part of the language. Try typing in ArtAdjAdjNoun and ArtAdjAdj. Press Ctrl-D to exit this mode, and do clear stack.
Type read regex Noun -> Nmod || _ Noun ; which pushes an unnamed transducer that replaces Noun with Nmod before any Noun. Test it out: apply down FooBarNounBazNounNounBingNounNounNounNoun.
Note: The "down" direction of application (and in the stack) is identified with "right" in regular expressions (and in strings). Applying down transforms along the direction of the rightward arrow in Noun -> Nmod, and from a to b in a:b or a .x. b.
We now want to compose nounphrase with this transducer. Type push nounphrase and then compose net. Did this compose the transducers in the right order? We want the lower language (the usual output) of nounphrase to feed into the upper language (input) of the new transducer. Since nounphrase is sitting on top of the stack, in the usual metaphor, it is above the new transducer, so its lower language was "touching" the new transducer's upper language, as desired.
Now that we have composed them, we can apply down, which is as if we send a string downward through nounphrase first and then through the noun-to-nmod transducer. Try some inputs like ArtAdjNounNounNoun and VerbAdjNounNounNoun.
We want to modify the composed machine so it inserts angle brackets around the noun phrase. First let's pop it into a named variable: define noun2nmod. (If the stack isn't clear now, type clear.) Now let's push the three things we have to concatenate:
read regex []:%< ; (replaces the empty string with < - recall that % says to treat the next character literally)
push noun2nmod
read regex []:%> ;
The first machine we pushed is now at the bottom of the stack, so reverse the stack order with turn stack. (Type print stack to see the stack.) Now we can finally concatenate together all three transducers on the stack: concatenate net. Try using apply down to test it on the same strings as before.
Give a name to our new transducer: define npbracket. We can use this name directly in a regexp: read regex [?* [npbracket ?*]*]; As we did with the FSA Utilities, try this on the strings VerbArtAdjNounNounNoun and ArtAdj.
For our last major trick, we again want to "correct" the previous transducer so that it behaves like replace(T) in the FSA Utilities. This means applying npbracket to maximal noun phrases from left to right. This will be a little tricky, since xfst has no operator that uses a transducer to do directed replacement. However, the solution is completely general: we compose two transducers.
The first one puts special markers around the strings that npbracket ought to transform: define r1 [npbracket.u @-> "{{{" ... "}}}"] ; (Here npbracket.u is the upper language of npbracket, i.e., the strings that npbracket will accept on the input side. @-> calls for left-to-right maximal replacement of this language, and the ... notation stands for an output copy of whatever string was actually matched on the input side.)
The second one applies the npbracket transducer within curly braces, deleting the braces and leaving everything else alone: define r2 [\"{{{"* [ "{{{":[] npbracket "}}}":[] \"{{{"* ]*] ;
So the full machine is obtained on the stack by read regex r1 .o. r2 ;
Check that the machine we just constructed does the right thing: apply down VerbArtAdjNounNounNounPrepArtAdjNoun.
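The mark-then-transform idea behind r1 .o. r2 can be mimicked with two string-rewriting passes. This is only an analogy under stated assumptions: it works on raw characters rather than xfst symbols, the function names r1, r2, and npbracket merely echo the definitions above, and Python's leftmost-greedy matching coincides with leftmost-longest only because this particular pattern has no competing alternatives.

```python
import re

# Character-level pattern for (Art|Quant)? Adj* Noun+ on unspaced tag strings.
NP = re.compile(r"(?:Art|Quant)?(?:Adj)*(?:Noun)+")
MARKED = re.compile(r"\{\{\{(.*?)\}\}\}")


def r1(s):
    """Pass 1: wrap each leftmost-longest noun phrase in {{{ ... }}}."""
    return NP.sub(lambda m: "{{{" + m.group(0) + "}}}", s)


def npbracket(np):
    """Rename every Noun that precedes another Noun, then add angle brackets."""
    return "<" + re.sub(r"Noun(?=Noun)", "Nmod", np) + ">"


def r2(s):
    """Pass 2: transform the marked regions and delete the markers."""
    return MARKED.sub(lambda m: npbracket(m.group(1)), s)


marked = r1("VerbArtAdjNounNounNounPrepArtAdjNoun")
print(marked)
print(r2(marked))
```

Separating the two passes keeps the decision of where to rewrite independent of what the rewrite does, which is why the construction works for any transducer in place of npbracket.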
And as before, try applying it to a class of strings:
read regex Quant Noun+ Verb ;
compose
print net
Questions for you to try yourself:
Given nounphrase, how would you construct a transducer that maps accepted strings to yes and rejected strings to no? (Hint: use the cross product .x. operator.)
Given npbracket, how would you construct a version that is willing to take input in the form Art Adj Adj Noun (instead of ArtAdjAdjNoun) and also produces output in this form?
Type print defined and print storage to see what you've done. Then quit xfst by typing exit or quit.
optimize net command that tries to reduce number of arcs.
recursive-define variable).
twolc for compiling morphological grammars in Koskenniemi's "Two-Level Rule" formalism, and the accompanying tool lexc for compiling morphological lexicons.
a:b as a single symbol. So the result is not necessarily deterministic with respect to the input (there may be a:b and a:c arcs from the same state); nor is it necessarily the minimal transducer implementing this relation, since xfst will not change the alignment between input and output (will not split up the two halves of a:b and put them on different arcs).
This software comes in the form of C libraries (type man -s3 fsm) and command-line tools (type man -s1 fsm and man lextools). There is no interactive interface other than the shell, makefiles, etc. So you will need to create some files representing the machines.
Make a directory to hold the files for this exercise.
The FSM package denotes states and symbols internally by numbers.
Let's start by giving them symbolic labels for our convenience.
Create a file called partofspeech.lab with these contents:
eps 0
Noun 1
Verb 2
Quant 3
Art 4
Adj 5
Nmod 6
Prep 7
Adv 8
0 is reserved for the empty string. (It should be assigned a name like epsilon or #.) Otherwise the numbers here are arbitrary; they need not be consecutive or even small.
Now let's create a textual description of an FSA (finite-state acceptor) for the language (Art | Quant)? Adj* Noun+. Put this in a file nounphrase.txt:
0 1 Art
0 1 Quant
0 1 eps
1 1 Adj
1 2 Noun
2 2 Noun
2
The first line means, for example, that there is an arc from state 0 to state 1 labeled with the symbol Art. The third line is an epsilon arc, which lets the machine skip the optional article or quantifier. The last line means that state 2 is a final state.
The first state mentioned in the file (state 0 in this case) is the start state. The state numbering is arbitrary (in fact, the FSM programs may renumber the states, so the -s option for assigning names to states usually won't do what you want when printing or drawing a machine, although it can be helpful when compiling one from input you create).
The order of lines in the file is irrelevant, and any amount of whitespace may be used to separate fields on a line. Any line can take an additional field at the end giving the cost of the arc or the final state.
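To check a hand-written machine before compiling it, the textual format described above can be read and simulated in a few lines. This is an independent sketch, not part of the FSM package: parse_fsa and accepts are invented names, the label eps is assumed to mark epsilon arcs, and the machine text includes an eps arc from state 0 to state 1 so that the article or quantifier is optional.

```python
NOUNPHRASE_TXT = """\
0 1 Art
0 1 Quant
0 1 eps
1 1 Adj
1 2 Noun
2 2 Noun
2
"""


def parse_fsa(text):
    """Read acceptor lines: 'src dst label [cost]' or 'final [cost]'."""
    arcs, finals, start = {}, set(), None
    for line in text.splitlines():
        fields = line.split()  # any amount of whitespace separates fields
        if not fields:
            continue
        if start is None:
            start = fields[0]  # first state mentioned is the start state
        if len(fields) <= 2:
            finals.add(fields[0])
        else:
            src, dst, label = fields[:3]  # an optional cost field is ignored
            arcs.setdefault((src, label), set()).add(dst)
    return start, arcs, finals


def closure(states, arcs):
    """Add every state reachable through eps arcs."""
    seen, stack = set(states), list(states)
    while stack:
        for dst in arcs.get((stack.pop(), "eps"), ()):
            if dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(fsa, symbols):
    start, arcs, finals = fsa
    states = closure({start}, arcs)
    for sym in symbols:
        states = closure({d for s in states for d in arcs.get((s, sym), ())}, arcs)
    return bool(states & finals)


fsa = parse_fsa(NOUNPHRASE_TXT)
print(accepts(fsa, "Art Adj Adj Noun".split()))
print(accepts(fsa, "Art Adj".split()))
```

Costs are parsed but discarded here, so the sketch only answers yes or no for unweighted acceptance.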
Now compile this textual representation into a binary representation:
fsmcompile -ipartofspeech.lab nounphrase.txt > nounphrase.fsa
The -i flag gives our file that turns input symbol labels into numbers. (Use -o for output symbols and -s for state labels.)
Compilation or other operations may renumber the states.
Let's look at our compiled FSA:
fsmprint nounphrase.fsa
fsmprint -ipartofspeech.lab nounphrase.fsa
fsmdraw -ipartofspeech.lab nounphrase.fsa | dot -Tps | ghostview -landscape -
lexfsmstrings -l partofspeech.lab nounphrase.fsa
(refuses to list strings because infinitely many)
fsminfo nounphrase.fsa
fsminfo -n nounphrase.fsa
fsminfo -v nounphrase.fsa
(verbose; other options get subsets of this info)
You may be interested to know that fsmdraw produces a directed graph specification in the dot language (documentation: man dot, user's guide). While there are several options to fsmdraw, it is sometimes convenient to edit the specification it produces, or filter it with a Perl script - for example, to get color. You can also use dotty nounphrase.dot to make minor changes to the layout, e.g., drag states around with the mouse or set colors and styles. dot -Tps then lays out the graph and produces postscript output, which can be saved in a file or piped directly to ghostview - as above.
There are some easier ways to produce these machines than by hand-coding them! The lextools package is designed for that. In particular, lexcompre compiles a regular expression specified with -s:
lexcompre -l partofspeech.lab -s '([Art]|[Quant])?[Adj]*[Noun]+' > nounphrase2.fsa
View the output and make sure it's right! Other options:
Create the .txt textual description automatically with a program.
Directly create an internal representation of the machine with a C program. (See man pages for fsmclass, fsmobject, fsmaccess, fsmcost).
Build up the machine from simpler machines by using the FSM tools. For example, regexp * is implemented by the fsmclosure utility or the FSMClosure() function. Note the possibility of defining higher-level operators (like directed replacement) by using scripts or C functions that call the FSM tools.
Write a regexp in FSA Utilities, and save it in FSM format! This allows access to the nice features of the FSA Utilities' regexp language.
Just for fun, try reversing the automaton:
fsmreverse nounphrase2.fsa | fsmdraw -ipartofspeech.lab | dot -Tps | ghostview -landscape -
The result contains an epsilon arc and other instances of nondeterminism. Thus, try
fsmreverse nounphrase2.fsa | fsmrmepsilon | fsmdeterminize | fsmminimize | fsmdraw -ipartofspeech.lab | dot -Tps | ghostview -landscape -
Notice that we needed a sequence of 3 commands: we must remove epsilons before determinization (which would otherwise treat them as ordinary symbols), and determinization is required for minimization to run.
Note also that some of the FSM commands are being used as pipes here. As for many Unix commands, the one-argument commands will conveniently read from standard input if their filename argument is omitted, and all the commands take the special filename - to refer to standard input.
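Why reversal introduces epsilon arcs and nondeterminism, and how determinization removes them, can be seen in a textbook construction. The sketch below is one standard way to do it and is not claimed to match what fsmreverse or fsmdeterminize do internally; all function names, the fresh state name r0, and the epsilon-free example machine are assumptions of this illustration, and minimization is left out.

```python
from collections import defaultdict

EPS = None  # label used for epsilon arcs in this sketch


def reverse(arcs, start, finals):
    """Flip every arc; a fresh start state reaches the old finals by epsilon."""
    new_start = "r0"  # hypothetical name for the fresh start state
    rev = [(dst, label, src) for (src, label, dst) in arcs]
    rev += [(new_start, EPS, f) for f in finals]
    return rev, new_start, {start}


def eps_closure(states, arcs):
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for (src, label, dst) in arcs:
            if src == s and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)


def determinize(arcs, start, finals):
    """Subset construction; epsilon arcs are folded into the closures."""
    d_start = eps_closure({start}, arcs)
    d_arcs, d_finals, todo, seen = {}, set(), [d_start], {d_start}
    while todo:
        subset = todo.pop()
        if subset & finals:
            d_finals.add(subset)
        by_label = defaultdict(set)
        for (src, label, dst) in arcs:
            if src in subset and label is not EPS:
                by_label[label].add(dst)
        for label, targets in by_label.items():
            target = eps_closure(targets, arcs)
            d_arcs[(subset, label)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return d_arcs, d_start, d_finals


def accepts(dfa, symbols):
    d_arcs, state, d_finals = dfa
    for sym in symbols:
        state = d_arcs.get((state, sym))
        if state is None:
            return False
    return state in d_finals


# Epsilon-free machine for (Art | Quant)? Adj* Noun+, written by hand.
ARCS = [(0, "Art", 1), (0, "Quant", 1), (0, "Adj", 1), (0, "Noun", 2),
        (1, "Adj", 1), (1, "Noun", 2), (2, "Noun", 2)]
reversed_dfa = determinize(*reverse(ARCS, 0, {2}))
print(accepts(reversed_dfa, "Noun Noun Adj Art".split()))
```

The reversed machine accepts Noun+ Adj* (Art | Quant)?, i.e., every noun phrase read backwards.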
Now let's create our transducer that maps Noun to Nmod before another noun. First by hand: create a file noun2nmod.txt with the following contents (but delete the comments).
### state 0: neither inside nor immediately after a noun sequence
0               # final
0 1 Noun Nmod   # nondeterministically start multi-word noun sequence
0 2 Noun Noun   # nondeterministically start single-word noun sequence
0 0 Verb Verb
0 0 Quant Quant
0 0 Art Art
0 0 Adj Adj
0 0 Nmod Nmod
0 0 Prep Prep
0 0 Adv Adv
### state 1: strictly inside a maximal input noun sequence
### must continue it
1 1 Noun Nmod
1 2 Noun Noun
### state 2: right at the end of a maximal input noun sequence
### no way to leave here on a Noun input
2               # final
2 0 Verb Verb
2 0 Quant Quant
2 0 Art Art
2 0 Adj Adj
2 0 Nmod Nmod
2 0 Prep Prep
2 0 Adv Adv
Compile this via fsmcompile -t -ipartofspeech.lab -opartofspeech.lab noun2nmod.txt > noun2nmod.fst. (The -t option indicates that the input describes a transducer.) View the result with fsmdraw.
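Before compiling the hand-written transducer, its behavior can be simulated by following all paths at once. This is a minimal sketch written for this page's machine only: the ARCS list re-encodes the three states described above as (source, input, output, destination) tuples, and transduce is a made-up helper, not an FSM tool.

```python
OTHER = ["Verb", "Quant", "Art", "Adj", "Nmod", "Prep", "Adv"]

# (source state, input symbol, output symbol, destination state)
ARCS = [
    (0, "Noun", "Nmod", 1),  # start a multi-word noun sequence
    (0, "Noun", "Noun", 2),  # start (and end) a single-word noun sequence
    (1, "Noun", "Nmod", 1),  # still strictly inside the sequence
    (1, "Noun", "Noun", 2),  # last noun of the sequence
]
# States 0 and 2 copy every non-Noun symbol and go (back) to state 0.
ARCS += [(s, tag, tag, 0) for s in (0, 2) for tag in OTHER]
FINALS = {0, 2}


def transduce(symbols):
    """Return the set of outputs over all accepting paths."""
    paths = {(0, ())}  # (current state, output so far)
    for sym in symbols:
        paths = {(dst, out + (o,))
                 for (state, out) in paths
                 for (src, i, o, dst) in ARCS
                 if src == state and i == sym}
    return {" ".join(out) for (state, out) in paths if state in FINALS}


print(transduce("Adj Noun Noun Noun Verb".split()))
print(transduce(["Noun"]))
```

Although each Noun has two outgoing arcs, wrong guesses die out: state 1 cannot stop or read a non-Noun, and state 2 cannot read another Noun, so only one output survives.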
(coming soon! But you do know enough now to use these tools in assignment 1. You could check out this tutorial if you want more info on FSM.)
jason@cs.jhu.edu
- $Date: 2001/11/19 16:08:31 $