edu.umn.cs.nlp.parser
Class PGrammar

java.lang.Object
  extended by edu.umn.cs.nlp.parser.PGrammar
All Implemented Interfaces:
GrammarInterface

public class PGrammar
extends Object
implements GrammarInterface

Probabilistic context-free grammar capable of parsing sentences using a variant of the CKY+ parsing algorithm.

Rules in this grammar are not required to be in Chomsky Normal Form (CNF).

This class uses a Berkeley DB Java Edition database to store the rules.

Version:
$LastChangedDate: 2007-08-06 11:31:05 -0500 (Mon, 06 Aug 2007) $
Author:
Lane Schwartz

Constructor Summary
PGrammar(String databaseName, String databaseDirectory)
          Opens a grammar stored in the specified database.
PGrammar(String databaseName, String databaseDirectory, boolean allowDuplicates, int maximumUnaryChainDepth, boolean treatTerminalRulesSpecially)
          Opens a grammar stored in the specified database.
 
Method Summary
 void addRule(BasicRuleLHS lhs, String rhs)
           
 void addRule(BasicRuleLHS lhs, String[] rhs)
          Add a new rule to the grammar.
 void close()
          Close the grammar.
static void main(String[] args)
          Example usage of this class with a small sample grammar and test sentences.
 ParseTree parse(String... token)
          Attempts to parse the given series of tokens using this grammar.
 ParseTree parse(String sentence)
          Attempts to parse the given sentence using this grammar.
 Collection<ParseTree> parseAll(String... sentences)
          Attempts to parse each of the given sentences using parse(String sentence).
 boolean parses(String... token)
          Determines if the given series of tokens can be successfully parsed by this grammar.
 boolean parses(String sentence)
          Determines if the given sentence can be successfully parsed by this grammar.
 void setLocale(Locale locale)
          Sets the locale.
 String toString()
          Returns a string representation of the grammar, with one line per rule.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PGrammar

public PGrammar(String databaseName,
                String databaseDirectory,
                boolean allowDuplicates,
                int maximumUnaryChainDepth,
                boolean treatTerminalRulesSpecially)
Opens a grammar stored in the specified database.

Parameters:
databaseName - the name of the rules database
databaseDirectory - the directory where the database files are stored; this directory must already exist
allowDuplicates - indicates whether duplicate rules are allowed in the specified database
maximumUnaryChainDepth - the maximum allowed depth of unary rule chains
treatTerminalRulesSpecially - if true, rules of the form (NT => terminal) do not count towards the unary chain depth
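
To illustrate what a bound on unary chain depth can mean, the following is a minimal sketch, not the implementation used by this class. It assumes that "depth" counts consecutive unary rule applications, and it does not model treatTerminalRulesSpecially. The class UnaryChainSketch, its methods, and the sample rules A => B, B => C, C => D are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UnaryChainSketch {
    /** Hypothetical unary rules A => B, B => C, C => D, indexed from child to parents. */
    static Map<String, List<String>> sampleRules() {
        Map<String, List<String>> parentsOf = new HashMap<>();
        parentsOf.put("D", Arrays.asList("C"));
        parentsOf.put("C", Arrays.asList("B"));
        parentsOf.put("B", Arrays.asList("A"));
        return parentsOf;
    }

    /**
     * Returns the start symbol plus every symbol that can be built on top of it
     * by applying at most maxDepth unary rules in a row (assumed meaning of depth).
     */
    static Set<String> closure(Map<String, List<String>> parentsOf, String start, int maxDepth) {
        Set<String> reached = new LinkedHashSet<>();
        reached.add(start);
        List<String> frontier = Collections.singletonList(start);
        // Each pass applies one more unary rule; the loop also stops once nothing new is found,
        // so a bound of Integer.MAX_VALUE terminates even when the rules contain cycles.
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String child : frontier) {
                for (String parent : parentsOf.getOrDefault(child, Collections.<String>emptyList())) {
                    if (reached.add(parent)) {
                        next.add(parent);
                    }
                }
            }
            frontier = next;
        }
        return reached;
    }

    public static void main(String[] args) {
        System.out.println(closure(sampleRules(), "D", 2));                 // [D, C, B]
        System.out.println(closure(sampleRules(), "D", Integer.MAX_VALUE)); // [D, C, B, A]
    }
}
```

With a bound of 2, the chain stops before reaching A; with Integer.MAX_VALUE (the value used by the two-argument constructor) the chain is effectively unbounded.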

PGrammar

public PGrammar(String databaseName,
                String databaseDirectory)
Opens a grammar stored in the specified database.

This constructor is equivalent to PGrammar(databaseName,databaseDirectory,true,Integer.MAX_VALUE,false)

Parameters:
databaseName - the name of the rules database
databaseDirectory - the directory where the database files are stored; this directory must already exist
Method Detail

close

public void close()
Close the grammar. This closes the underlying database that backs the grammar, and it must be called for the database to be closed properly. After this method has been called, no further calls should be made on this grammar object.
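
Because close() must always be called, one common pattern is to wrap all use of the grammar in try/finally. The sketch below shows that pattern with a hypothetical stand-in class, FakeGrammar, so that it runs without the real library. The exceptions thrown by the stand-in are assumptions, since this page does not say how PGrammar behaves after close().

```java
final class CloseInFinallyDemo {
    /** Hypothetical stand-in for a database-backed grammar; this is not the real PGrammar. */
    static final class FakeGrammar {
        private boolean open = true;

        boolean parses(String sentence) {
            if (!open) {
                // Assumed behaviour for the stand-in only.
                throw new IllegalStateException("grammar is closed");
            }
            if (sentence.isEmpty()) {
                throw new IllegalArgumentException("empty sentence");
            }
            return true;
        }

        void close() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    /** Runs one query and always closes the grammar, even if the query throws. */
    static boolean queryAndClose(FakeGrammar grammar, String sentence) {
        try {
            return grammar.parses(sentence);
        } finally {
            grammar.close();
        }
    }

    public static void main(String[] args) {
        FakeGrammar grammar = new FakeGrammar();
        System.out.println(queryAndClose(grammar, "the dog barks")); // true
        System.out.println(grammar.isOpen());                        // false
    }
}
```

The same shape applies to a real grammar object: construct it before the try block, use it inside, and call close() in the finally block.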


addRule

public void addRule(BasicRuleLHS lhs,
                    String[] rhs)
Add a new rule to the grammar.

Parameters:
lhs - The left-hand-side of the rule. Must be a non-terminal.
rhs - The right-hand-side children of the rule. Each rhs element may be a terminal or a non-terminal.

addRule

public void addRule(BasicRuleLHS lhs,
                    String rhs)

parse

public ParseTree parse(String sentence)
Attempts to parse the given sentence using this grammar. Uses a modified CKY+ parsing algorithm.

Returns the most likely complete parse of the sentence.

Specified by:
parse in interface GrammarInterface
Parameters:
sentence - The sentence to be parsed. Will be lowercased (using the locale set by setLocale(Locale), or the default locale if none has been set) and then tokenized prior to parsing.
Returns:
a valid parse tree representing the most likely parse if this grammar can parse the given sentence; ParseTree.NULL_PARSE otherwise.
See Also:
setLocale(Locale)
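
For readers unfamiliar with chart parsing, the following is a minimal sketch of plain probabilistic CKY, which finds the probability of the most likely parse. It is not the modified CKY+ algorithm used by this class: it is restricted to rules in Chomsky Normal Form, keeps the rules in memory rather than in a database, and returns only a probability rather than a ParseTree. The class CkySketch, its methods, and the sample grammar are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CkySketch {
    /** A binary rule lhs => left right, or a lexical rule lhs => word (word stored in left, right == null). */
    private static final class Rule {
        final String lhs;
        final String left;
        final String right;
        final double prob;

        Rule(String lhs, String left, String right, double prob) {
            this.lhs = lhs;
            this.left = left;
            this.right = right;
            this.prob = prob;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    void addBinary(String lhs, String left, String right, double prob) {
        rules.add(new Rule(lhs, left, right, prob));
    }

    void addLexical(String lhs, String word, double prob) {
        rules.add(new Rule(lhs, word, null, prob));
    }

    /** Returns the probability of the best parse rooted in start, or 0.0 if there is none. */
    double bestParseProbability(String start, String... tokens) {
        int n = tokens.length;
        if (n == 0) {
            return 0.0;
        }
        // chart[i][j] maps a non-terminal to the best probability of deriving tokens i..j-1.
        @SuppressWarnings("unchecked")
        Map<String, Double>[][] chart = new HashMap[n][n + 1];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j <= n; j++) {
                chart[i][j] = new HashMap<>();
            }
        }
        // Width-1 spans: apply lexical rules to each token.
        for (int i = 0; i < n; i++) {
            for (Rule r : rules) {
                if (r.right == null && r.left.equals(tokens[i])) {
                    chart[i][i + 1].merge(r.lhs, r.prob, Double::max);
                }
            }
        }
        // Wider spans: combine two adjacent sub-spans with a binary rule, keeping the best score.
        for (int width = 2; width <= n; width++) {
            for (int i = 0; i + width <= n; i++) {
                int j = i + width;
                for (int k = i + 1; k < j; k++) {
                    for (Rule r : rules) {
                        if (r.right == null) {
                            continue;
                        }
                        Double leftProb = chart[i][k].get(r.left);
                        Double rightProb = chart[k][j].get(r.right);
                        if (leftProb != null && rightProb != null) {
                            chart[i][j].merge(r.lhs, r.prob * leftProb * rightProb, Double::max);
                        }
                    }
                }
            }
        }
        return chart[0][n].getOrDefault(start, 0.0);
    }

    /** A tiny hypothetical grammar used only for illustration. */
    static CkySketch sampleGrammar() {
        CkySketch g = new CkySketch();
        g.addBinary("S", "NP", "VP", 1.0);
        g.addBinary("NP", "Det", "N", 1.0);
        g.addLexical("Det", "the", 1.0);
        g.addLexical("N", "dog", 0.5);
        g.addLexical("N", "cat", 0.5);
        g.addLexical("VP", "barks", 1.0);
        return g;
    }

    public static void main(String[] args) {
        CkySketch g = sampleGrammar();
        System.out.println(g.bestParseProbability("S", "the", "dog", "barks")); // 0.5
        System.out.println(g.bestParseProbability("S", "dog", "the", "barks")); // 0.0
    }
}
```

A return value of 0.0 in the sketch plays the role that ParseTree.NULL_PARSE plays here: it signals that the grammar cannot derive the token sequence.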

parse

public ParseTree parse(String... token)
Attempts to parse the given series of tokens using this grammar. Uses a modified CKY+ parsing algorithm.

Returns the most likely complete parse of the sentence.

Specified by:
parse in interface GrammarInterface
Parameters:
token - Array of tokens specifying the sentence to be parsed. No lowercasing will be performed by this method.
Returns:
a valid parse tree if this grammar can parse the given sentence; ParseTree.NULL_PARSE otherwise.
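
Since parse(String sentence) and parse(String... token) are overloads, it helps to know which one a given call selects. Under Java's overload resolution rules, a call with exactly one String argument binds to the single-String method, while two or more String arguments, or an explicit String[], bind to the varargs method. The sketch below demonstrates this with hypothetical stand-in methods that only report which overload ran; it does not call the real PGrammar.

```java
final class OverloadDemo {
    /** Hypothetical stand-in for parse(String sentence). */
    static String parse(String sentence) {
        return "sentence";
    }

    /** Hypothetical stand-in for parse(String... token). */
    static String parse(String... token) {
        return "tokens:" + token.length;
    }

    public static void main(String[] args) {
        System.out.println(parse("The dog barks"));          // sentence
        System.out.println(parse("the", "dog", "barks"));    // tokens:3
        System.out.println(parse(new String[] {"barks"}));   // tokens:1
    }
}
```

In practice this means a single pre-tokenized word must be passed as a String[] to reach the token overload and so avoid the lowercasing and tokenization applied to sentences.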

parseAll

public Collection<ParseTree> parseAll(String... sentences)
Attempts to parse each of the given sentences using parse(String sentence).

Specified by:
parseAll in interface GrammarInterface
Parameters:
sentences - the sentences to be parsed.
Returns:
one ParseTree per sentence

parses

public boolean parses(String sentence)
Determines if the given sentence can be successfully parsed by this grammar.

Specified by:
parses in interface GrammarInterface
Parameters:
sentence - The sentence to be parsed. Will be lowercased (using the locale set by setLocale(Locale), or the default locale if none has been set) and then tokenized prior to parsing.
Returns:
true if this grammar can parse the given sentence; false otherwise.
See Also:
setLocale(Locale)

parses

public boolean parses(String... token)
Determines if the given series of tokens can be successfully parsed by this grammar.

Specified by:
parses in interface GrammarInterface
Parameters:
token - Array of tokens specifying the sentence to be parsed. No lowercasing will be performed by this method.
Returns:
true if this grammar can parse the given sentence; false otherwise.

setLocale

public void setLocale(Locale locale)
Sets the locale. This will be used by parse(String sentence) and by parses(String sentence) when lowercasing the sentence.

Parameters:
locale - the locale to use when performing lowercasing
See Also:
String.toLowerCase(Locale)
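
The choice of locale can change the lowercased text, and therefore which terminal rules match. The sketch below uses only String.toLowerCase(Locale) from the standard library; the class LocaleLowercaseDemo and its helper method are hypothetical. In the Turkish locale, an uppercase I lowercases to the dotless ı (U+0131) rather than to i.

```java
import java.util.Locale;

final class LocaleLowercaseDemo {
    /** Lowercases a sentence with an explicit locale. */
    static String lower(String sentence, Locale locale) {
        return sentence.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String sentence = "ISTANBUL IS BIG";
        // Locale-neutral lowercasing: "istanbul is big"
        System.out.println(lower(sentence, Locale.ROOT));
        // Turkish lowercasing: every 'I' becomes U+0131 (dotless i)
        System.out.println(lower(sentence, Locale.forLanguageTag("tr")));
    }
}
```

A grammar whose terminals were written as "istanbul" would therefore not match the Turkish-lowercased token, which is one reason to set the locale explicitly rather than rely on the platform default.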

toString

public String toString()
Returns a string representation of the grammar, with one line per rule.

WARNING: This method calls PRuleDatabase.toString. That method involves creating a Collection that contains every rule in the database. If the rule database that backs the grammar is large, calling this method will use very large amounts of memory.

Overrides:
toString in class Object

main

public static void main(String[] args)
Example usage of this class with a small sample grammar and test sentences.

Parameters:
args - arguments are ignored