edu.umn.cs.nlp.mt.tools
Class ExtractPhrasePairs
java.lang.Object
edu.umn.cs.nlp.mt.tools.ExtractPhrasePairs
public class ExtractPhrasePairs
- extends Object
Utility that extracts aligned phrase pairs from an aligned corpus.
Input files are expected to be lowercased already.
Each line of the output file contains one aligned phrase pair, together with
the source-to-target lexical probability, the target-to-source lexical probability,
and the relative frequency weight for that phrase pair.
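The source of this class is not shown here, so the following is only a minimal sketch of the alignment-consistency criterion commonly used for phrase extraction in phrase-based translation. The class name `PhraseExtractionSketch`, the method `extract`, the toy sentence pair, and the `maxLen` limit are all hypothetical and are not part of `ExtractPhrasePairs`. For brevity the sketch keeps only the smallest target span for each source span and does not extend spans over unaligned target words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the implementation used by ExtractPhrasePairs.
class PhraseExtractionSketch {

    /**
     * Returns spans {srcStart, srcEnd, tgtStart, tgtEnd} (inclusive indices)
     * that are consistent with the word alignment. Each link is {srcIndex, tgtIndex}.
     */
    static List<int[]> extract(int srcLen, int[][] links, int maxLen) {
        List<int[]> spans = new ArrayList<>();
        for (int s1 = 0; s1 < srcLen; s1++) {
            for (int s2 = s1; s2 < srcLen && s2 - s1 < maxLen; s2++) {
                // Smallest target span covering every link that leaves the source span.
                int t1 = Integer.MAX_VALUE;
                int t2 = -1;
                for (int[] link : links) {
                    if (link[0] >= s1 && link[0] <= s2) {
                        t1 = Math.min(t1, link[1]);
                        t2 = Math.max(t2, link[1]);
                    }
                }
                if (t2 < 0 || t2 - t1 >= maxLen) {
                    continue; // source span is unaligned, or target span is too long
                }
                // Consistency check: no target word inside the span may be
                // linked to a source word outside the source span.
                boolean consistent = true;
                for (int[] link : links) {
                    if (link[1] >= t1 && link[1] <= t2 && (link[0] < s1 || link[0] > s2)) {
                        consistent = false;
                        break;
                    }
                }
                if (consistent) {
                    spans.add(new int[] {s1, s2, t1, t2});
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Toy, already lowercased sentence pair with a reordered alignment.
        String[] src = {"a", "b", "c"};
        String[] tgt = {"x", "y", "z"};
        int[][] links = {{0, 0}, {1, 2}, {2, 1}};
        for (int[] s : extract(src.length, links, 3)) {
            String f = String.join(" ", Arrays.copyOfRange(src, s[0], s[1] + 1));
            String e = String.join(" ", Arrays.copyOfRange(tgt, s[2], s[3] + 1));
            System.out.println(f + " ||| " + e);
        }
    }
}
```

The `|||` separator printed by `main` is a common convention for phrase tables and is used here only for readability; the actual output format of this utility may differ.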
- Version:
- $LastChangedDate: 2007-11-26 10:42:44 -0600 (Mon, 26 Nov 2007) $
- Author:
- Lane Schwartz
- See Also:
Section 2 of "A Hierarchical Phrase-Based Model for Statistical Machine Translation" by David Chiang (ACL, 2005),
Section 4.4 of "Statistical Phrase-Based Translation" by Philipp Koehn, Franz Josef Och, & Daniel Marcu (HLT-NAACL, 2003)
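The Koehn, Och, & Marcu reference describes lexical weighting: for each word of one phrase, the word translation probabilities of its aligned words in the other phrase are averaged, and the averages are multiplied together. The sketch below illustrates that computation under stated assumptions. The class `LexicalWeightSketch`, the method `lexicalWeight`, the `"f e"` string key layout, and the `NULL` token for unaligned words are hypothetical, and whether `ExtractPhrasePairs` computes its lexical probabilities in exactly this way is not confirmed by this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the implementation used by ExtractPhrasePairs.
class LexicalWeightSketch {

    /**
     * Lexical weight of phrase f given phrase e under a word alignment.
     * Each link is {fIndex, eIndex}; w maps the key "fWord eWord" to w(f|e).
     */
    static double lexicalWeight(String[] f, String[] e, int[][] links, Map<String, Double> w) {
        double product = 1.0;
        for (int i = 0; i < f.length; i++) {
            double sum = 0.0;
            int count = 0;
            for (int[] link : links) {
                if (link[0] == i) {
                    sum += w.getOrDefault(f[i] + " " + e[link[1]], 0.0);
                    count++;
                }
            }
            if (count == 0) {
                // Assumption: an unaligned word is scored against a NULL token.
                sum = w.getOrDefault(f[i] + " NULL", 0.0);
                count = 1;
            }
            product *= sum / count; // average over the aligned words
        }
        return product;
    }

    public static void main(String[] args) {
        // Made-up word translation probabilities for a toy phrase pair.
        Map<String, Double> w = new HashMap<>();
        w.put("a x", 0.5);
        w.put("b x", 0.2);
        w.put("b y", 0.6);
        String[] f = {"a", "b"};
        String[] e = {"x", "y"};
        int[][] links = {{0, 0}, {1, 0}, {1, 1}};
        System.out.println(lexicalWeight(f, e, links, w));
    }
}
```

Swapping the roles of the two phrases, with a table of probabilities in the opposite direction, gives the weight in the other direction, which is presumably how the source-to-target and target-to-source values in the output relate to each other.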
Method Summary
static void main(String[] args)
Extract aligned phrase pairs from an aligned corpus.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ExtractPhrasePairs
public ExtractPhrasePairs()
main
public static void main(String[] args)
- Extract aligned phrase pairs from an aligned corpus.
- Parameters:
args - command-line arguments
- Throws:
FileNotFoundException