edu.umn.cs.nlp.mt.tools
Class ExtractPhrasePairs

java.lang.Object
  extended by edu.umn.cs.nlp.mt.tools.ExtractPhrasePairs

public class ExtractPhrasePairs
extends Object

Utility to extract aligned phrase pairs from an aligned corpus.
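
The following is a minimal sketch of the phrase-pair consistency criterion described in the papers cited under See Also. It is not taken from this class: the class name PhraseExtractionSketch, the method signatures, the alignment representation (an array of {sourceIndex, targetIndex} links), and the " ||| " separator are all assumptions made for illustration. For brevity it does not extend phrases over unaligned boundary words.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the ExtractPhrasePairs implementation. */
class PhraseExtractionSketch {

    /**
     * Returns every phrase pair (up to maxLen words per side) that is
     * consistent with the word alignment. Each alignment entry is a
     * hypothetical {sourceIndex, targetIndex} link.
     */
    static List<String> extract(String[] src, String[] tgt, int[][] alignment, int maxLen) {
        List<String> pairs = new ArrayList<>();
        for (int s1 = 0; s1 < src.length; s1++) {
            for (int s2 = s1; s2 < src.length && s2 - s1 < maxLen; s2++) {
                // Smallest target span covering every link that leaves the source span.
                int t1 = Integer.MAX_VALUE;
                int t2 = -1;
                for (int[] link : alignment) {
                    if (link[0] >= s1 && link[0] <= s2) {
                        t1 = Math.min(t1, link[1]);
                        t2 = Math.max(t2, link[1]);
                    }
                }
                if (t2 < 0 || t2 - t1 >= maxLen) {
                    continue; // source span is unaligned, or target span is too long
                }
                // Consistency: no target word inside the span may link outside the source span.
                boolean consistent = true;
                for (int[] link : alignment) {
                    if (link[1] >= t1 && link[1] <= t2 && (link[0] < s1 || link[0] > s2)) {
                        consistent = false;
                        break;
                    }
                }
                if (consistent) {
                    pairs.add(join(src, s1, s2) + " ||| " + join(tgt, t1, t2));
                }
            }
        }
        return pairs;
    }

    /** Joins words[from..to] (inclusive) with single spaces. */
    static String join(String[] words, int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i <= to; i++) {
            if (i > from) {
                sb.append(' ');
            }
            sb.append(words[i]);
        }
        return sb.toString();
    }
}
```

When two source words are both aligned to the same single target word, the sketch yields only the two-word source phrase paired with that target word, because either source word alone would leave a link crossing the phrase boundary.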

Input files are expected to be lowercased already.

This utility writes a file in which each line contains one aligned phrase pair together with its source-to-target lexical probability, its target-to-source lexical probability, and its relative frequency weight.
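
As a rough illustration of the scores named above, the sketch below computes a relative frequency and a lexical weight in the style of Koehn, Och, & Marcu (2003). It is an assumption-laden example, not this utility's code: the class PhraseScoringSketch, its method names, the "NULL" token for unaligned words, and the "f e" key format of the word translation table are all hypothetical. The direction of the relative frequency (conditioned on the target phrase) is likewise assumed, since this documentation does not state it.

```java
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the ExtractPhrasePairs implementation. */
class PhraseScoringSketch {

    /**
     * Relative frequency of src given tgt: count(src, tgt) / count(tgt),
     * where each element of extracted is a hypothetical {src, tgt} phrase pair.
     */
    static double relativeFrequency(List<String[]> extracted, String src, String tgt) {
        int joint = 0;
        int marginal = 0;
        for (String[] pair : extracted) {
            if (pair[1].equals(tgt)) {
                marginal++;
                if (pair[0].equals(src)) {
                    joint++;
                }
            }
        }
        return marginal == 0 ? 0.0 : (double) joint / marginal;
    }

    /**
     * Lexical weight of phrase f given phrase e: for each word of f, average
     * the word translation probabilities w(f_i | e_j) over its aligned words
     * (or use w(f_i | NULL) if it is unaligned), then multiply the averages.
     * Each alignment entry is a {fIndex, eIndex} link; w is keyed by "f e".
     */
    static double lexicalWeight(String[] f, String[] e, int[][] alignment, Map<String, Double> w) {
        double weight = 1.0;
        for (int i = 0; i < f.length; i++) {
            double sum = 0.0;
            int links = 0;
            for (int[] link : alignment) {
                if (link[0] == i) {
                    sum += w.getOrDefault(f[i] + " " + e[link[1]], 0.0);
                    links++;
                }
            }
            if (links == 0) {
                weight *= w.getOrDefault(f[i] + " NULL", 0.0);
            } else {
                weight *= sum / links;
            }
        }
        return weight;
    }
}
```

Calling lexicalWeight a second time with the phrases swapped, the alignment links reversed, and a word table estimated in the opposite direction would give the score for the other translation direction.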

Version:
$LastChangedDate: 2007-11-26 10:42:44 -0600 (Mon, 26 Nov 2007) $
Author:
Lane Schwartz
See Also:
Section 2 of "A Hierarchical Phrase-Based Model for Statistical Machine Translation" by David Chiang (ACL, 2005); Section 4.4 of "Statistical Phrase-Based Translation" by Philipp Koehn, Franz Josef Och, & Daniel Marcu (HLT-NAACL, 2003)

Constructor Summary
ExtractPhrasePairs()
           
 
Method Summary
static void main(String[] args)
          Extract aligned phrase pairs from an aligned corpus.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ExtractPhrasePairs

public ExtractPhrasePairs()
Method Detail

main

public static void main(String[] args)
Extract aligned phrase pairs from an aligned corpus.

Parameters:
args - command-line arguments
Throws:
FileNotFoundException