Joshua
an open-source statistical hierarchical phrase-based machine translation system
Classes
class | SentenceFilteredTrie |
Public Member Functions
Trie | getTrieRoot () |
boolean | hasRuleForSpan (int startIndex, int endIndex, int pathLength) |
int | getNumRules () |
int | getNumRules (Trie node) |
Rule | constructManualRule (int lhs, int[] sourceWords, int[] targetWords, float[] scores, int arity)
boolean | isRegexpGrammar () |
Package Functions
SentenceFilteredGrammar (AbstractGrammar baseGrammar, Sentence sentence)
Private Member Functions
SentenceFilteredTrie | filter (Trie unfilteredTrieRoot) |
void | filter (int i, SentenceFilteredTrie trieNode, boolean lastWasNT) |
SentenceFilteredTrie | filter_regexp (Trie unfilteredTrie) |
boolean | matchesSentence (Trie childTrie) |
Private Attributes
AbstractGrammar | baseGrammar |
SentenceFilteredTrie | filteredTrie |
int[] | tokens |
Sentence | sentence |
This class implements dynamic sentence-level filtering. This is accomplished with a parallel trie, a subset of the original trie, that only contains trie paths that are reachable from traversals of the current sentence.
joshua.decoder.ff.tm.SentenceFilteredGrammar.SentenceFilteredGrammar (AbstractGrammar baseGrammar, Sentence sentence) [package]
Construct a new sentence-filtered grammar. The main work is done in the enclosed trie (obtained from the base grammar, which contains the complete grammar).
Parameters:
baseGrammar | the base grammar containing the complete (unfiltered) rule set
sentence | the sentence to filter against
Rule joshua.decoder.ff.tm.SentenceFilteredGrammar.constructManualRule (int lhs, int[] sourceWords, int[] targetWords, float[] scores, int arity)
This is used to construct a manual rule supplied from outside the grammar, but the owner should be the same as the grammar's. The rule ID will be the same as OOVRuleId, and there is no lattice cost.
Reimplemented from joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar.
SentenceFilteredTrie joshua.decoder.ff.tm.SentenceFilteredGrammar.filter (Trie unfilteredTrieRoot) [private]
What is the algorithm?
Take the first word of the sentence, and start at the root of the trie. There are two things to consider: (a) word matches and (b) nonterminal matches.
For a word match, simply follow that arc along the trie. We create a parallel arc in our filtered grammar to represent it. Each arc in the filtered trie knows about its corresponding/underlying node in the unfiltered grammar trie.
A nonterminal is always permitted to match. The question then is how much of the input sentence we imagine it consumed. The answer is that it could have been any amount. So the recursive call has to be a set of calls, one each to the next trie node with different lengths of the sentence remaining.
A problem occurs when we have multiple sequential nonterminals. For scope-3 grammars, there can be four sequential nonterminals (in the case when they are grounded by terminals on both ends of the nonterminal chain). We'd like to avoid looking at all possible ways to split up the subsequence, because with respect to filtering rules, they are all the same.
We accomplish this with the following restriction: for purposes of grammar filtering, only the first in a sequence of nonterminal traversals can consume more than one word. Each of the subsequent ones would have to consume just one word. We then just have to record in the recursive call whether the last traversal was a nonterminal or not.
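The recursion described above can be sketched on a toy trie. Everything here is illustrative, not Joshua's actual API: `ToyTrie`, the single `[X]` nonterminal label, and this `filter` signature are hypothetical stand-ins that show the word-match arc, the variable-length nonterminal match, and the `lastWasNT` restriction that caps every nonterminal after the first in a chain at one word.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class FilterSketch {
    static final String NT = "[X]"; // single generic nonterminal label

    // A toy trie: each node maps an arc label (a word or the NT) to a child.
    static class ToyTrie {
        final Map<String, ToyTrie> children = new HashMap<>();
        boolean hasRules = false; // true if some rule's source side ends here

        void add(String... path) {
            ToyTrie n = this;
            for (String label : path)
                n = n.children.computeIfAbsent(label, k -> new ToyTrie());
            n.hasRules = true;
        }
    }

    // (a) a word arc consumes exactly the matching word; (b) an NT arc may
    // consume any number of words, except that only the first NT in a chain
    // may consume more than one (tracked by lastWasNT).
    static void filter(String[] sentence, int i, ToyTrie node,
                       boolean lastWasNT, Deque<String> path, Set<String> reachable) {
        if (node.hasRules)
            reachable.add(String.join(" ", path)); // this source side is reachable
        if (i >= sentence.length)
            return;
        ToyTrie wordChild = node.children.get(sentence[i]); // (a) word match
        if (wordChild != null) {
            path.addLast(sentence[i]);
            filter(sentence, i + 1, wordChild, false, path, reachable);
            path.removeLast();
        }
        ToyTrie ntChild = node.children.get(NT); // (b) nonterminal match
        if (ntChild != null) {
            path.addLast(NT);
            int maxLen = lastWasNT ? 1 : sentence.length - i;
            for (int len = 1; len <= maxLen; len++)
                filter(sentence, i + len, ntChild, true, path, reachable);
            path.removeLast();
        }
    }

    public static void main(String[] args) {
        ToyTrie root = new ToyTrie();
        root.add("the", NT, "house"); // source side "the [X] house"
        root.add("green", NT);        // source side "green [X]" (unreachable)
        String[] sentence = {"the", "big", "red", "house"};
        Set<String> reachable = new TreeSet<>();
        for (int start = 0; start < sentence.length; start++)
            filter(sentence, start, root, false, new ArrayDeque<>(), reachable);
        System.out.println(reachable); // [the [X] house]
    }
}
```

Note how the `maxLen` cap implements the restriction: without it, a chain of k nonterminals would trigger one recursive call per way of splitting the remaining words, even though all splits filter identically.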
void joshua.decoder.ff.tm.SentenceFilteredGrammar.filter (int i, SentenceFilteredTrie trieNode, boolean lastWasNT) [private]
Matches rules against the sentence. Intelligently handles chains of sequential nonterminals. Marks arcs that are traversable for this sentence.
Parameters:
i | the position in the sentence to start matching
trieNode | the trie node to match against
lastWasNT | true if the match that brought us here was against a nonterminal
SentenceFilteredTrie joshua.decoder.ff.tm.SentenceFilteredGrammar.filter_regexp (Trie unfilteredTrie) [private]
Alternate filter that uses regular expressions, walking the grammar trie and matching the source side of each rule collection against the input sentence. Failed matches are discarded, and trie nodes extending from that position need not be explored.
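A minimal sketch of the idea, with one simplifying assumption: instead of walking the trie node by node, this collapses the check to matching a whole source side against the sentence. The `matchesSentence` helper here is a hypothetical stand-in (it is not the private method of the same name documented further down); nonterminal tokens such as `[X,1]` are replaced by a wildcard, and terminal tokens are treated as regular expressions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class RegexpFilterSketch {
    // Replace nonterminal tokens such as [X,1] with a wildcard, treat the
    // remaining tokens as regular expressions, and test whether the result
    // occurs anywhere in the sentence. Word-boundary handling is omitted
    // for brevity, so "b.g" can also match inside a longer word.
    static boolean matchesSentence(String sourceSide, String sentence) {
        List<String> parts = new ArrayList<>();
        for (String token : sourceSide.split(" "))
            parts.add(token.matches("\\[.*\\]") ? ".+" : token);
        String pattern = ".*" + String.join(" ", parts) + ".*";
        return Pattern.compile(pattern).matcher(sentence).matches();
    }

    public static void main(String[] args) {
        String sentence = "the big red house";
        System.out.println(matchesSentence("b.g [X,1]", sentence));   // true
        System.out.println(matchesSentence("green [X,1]", sentence)); // false
    }
}
```

A failed match here corresponds to discarding a rule collection; in the trie walk, it additionally means the nodes extending from that point need not be explored.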
int joshua.decoder.ff.tm.SentenceFilteredGrammar.getNumRules ()
Gets the number of rules stored in the grammar.
Reimplemented from joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar.
int joshua.decoder.ff.tm.SentenceFilteredGrammar.getNumRules (Trie node)
A convenience function that counts the number of rules in a grammar's trie.
Parameters:
node | the trie node whose subtree is counted
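Such a count is a straightforward depth-first traversal. A sketch on a toy node type (the `Node` class and its fields are hypothetical, not Joshua's `Trie` interface):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RuleCountSketch {
    static class Node {
        final Map<String, Node> children = new LinkedHashMap<>();
        final List<String> rules = new ArrayList<>(); // rules stored at this node
    }

    // Depth-first count of every rule stored at or beneath `node`.
    static int getNumRules(Node node) {
        int count = node.rules.size();
        for (Node child : node.children.values())
            count += getNumRules(child);
        return count;
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node the = new Node();
        Node house = new Node();
        root.children.put("the", the);
        the.children.put("house", house);
        the.rules.add("the [X] -> le [X]");
        house.rules.add("the house -> la maison");
        house.rules.add("the house -> la casa");
        System.out.println(getNumRules(root)); // 3
    }
}
```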
Trie joshua.decoder.ff.tm.SentenceFilteredGrammar.getTrieRoot ()
Gets the root of the Trie backing this grammar.
Note: This method should run as a small constant-time function.
Returns:
the Trie backing this grammar
Reimplemented from joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar.
boolean joshua.decoder.ff.tm.SentenceFilteredGrammar.hasRuleForSpan (int startIndex, int endIndex, int pathLength)
This function is poorly named: it does not return whether a rule exists in the grammar for the current span, but whether the grammar is permitted to apply rules to the current span (a grammar-level parameter). As such, we can simply chain to the underlying grammar.
Reimplemented from joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar.
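The chaining pattern can be sketched as follows. All names here (`Grammar`, `BaseGrammar`, `FilteredGrammar`, the span-width rule inside `BaseGrammar`) are illustrative stand-ins for the real classes, chosen only to show the delegation:

```java
public class SpanCheckSketch {
    interface Grammar {
        boolean hasRuleForSpan(int startIndex, int endIndex, int pathLength);
    }

    // Stand-in for the base grammar: assume spans are permitted up to a
    // fixed width limit (a grammar-level parameter).
    static class BaseGrammar implements Grammar {
        final int maxSpan;
        BaseGrammar(int maxSpan) { this.maxSpan = maxSpan; }
        public boolean hasRuleForSpan(int startIndex, int endIndex, int pathLength) {
            return endIndex - startIndex <= maxSpan;
        }
    }

    // The filtered grammar chains straight to the underlying grammar, since
    // the answer does not depend on the filtered trie at all.
    static class FilteredGrammar implements Grammar {
        final Grammar baseGrammar;
        FilteredGrammar(Grammar baseGrammar) { this.baseGrammar = baseGrammar; }
        public boolean hasRuleForSpan(int startIndex, int endIndex, int pathLength) {
            return baseGrammar.hasRuleForSpan(startIndex, endIndex, pathLength);
        }
    }

    public static void main(String[] args) {
        Grammar g = new FilteredGrammar(new BaseGrammar(10));
        System.out.println(g.hasRuleForSpan(0, 5, 0));  // true
        System.out.println(g.hasRuleForSpan(3, 20, 0)); // false
    }
}
```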
boolean joshua.decoder.ff.tm.SentenceFilteredGrammar.isRegexpGrammar ()
This returns true if the grammar contains rules that are regular expressions, possibly matching many different inputs.
Reimplemented from joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar.
boolean joshua.decoder.ff.tm.SentenceFilteredGrammar.matchesSentence (Trie childTrie) [private]
int[] joshua.decoder.ff.tm.SentenceFilteredGrammar.tokens [private]