Package edu.berkeley.nlp.lm
Class ArrayEncodedProbBackoffLm<W>
java.lang.Object
edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel<W>
edu.berkeley.nlp.lm.ArrayEncodedProbBackoffLm<W>
- Type Parameters:
W
-
- All Implemented Interfaces:
ArrayEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, Serializable
public class ArrayEncodedProbBackoffLm<W>
extends AbstractArrayEncodedNgramLanguageModel<W>
implements ArrayEncodedNgramLanguageModel<W>, Serializable
Language model implementation which uses Kneser-Ney-style backoff
computation.
Note that, unlike the description in Pauls and Klein (2011), this particular
implementation stores a trie in which the first word of an n-gram points to its
prefix. This is in contrast to ContextEncodedProbBackoffLm, which stores a trie
in which the last word points to its suffix. This was done because it
simplifies the code significantly without significantly changing speed or
memory usage.
- Author:
- adampauls
-
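The backoff computation mentioned in the class description can be illustrated with a minimal sketch. This is not the library's array-encoded trie: `ToyBackoffLm`, its `Entry` class (a hypothetical stand-in for `ProbBackoffPair`), and the hash-map storage are all assumptions made for illustration. The sketch shows only the usual recursion for backoff models: return the stored log probability if the n-gram is present, otherwise add the backoff weight of the context to the score of the n-gram with its first word dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy backoff model for illustration only; not the library's data structure.
final class ToyBackoffLm {
    // Hypothetical stand-in for a (log probability, backoff weight) pair.
    static final class Entry {
        final float logProb;
        final float backoff;
        Entry(float logProb, float backoff) {
            this.logProb = logProb;
            this.backoff = backoff;
        }
    }

    private final Map<List<Integer>, Entry> table = new HashMap<>();
    private final float oovLogProb;

    ToyBackoffLm(float oovLogProb) {
        this.oovLogProb = oovLogProb;
    }

    // Register an n-gram (given as word ids) with its log prob and backoff.
    void put(float logProb, float backoff, Integer... words) {
        table.put(Arrays.asList(words), new Entry(logProb, backoff));
    }

    // Score the words in ngram[startPos, endPos).
    float getLogProb(int[] ngram, int startPos, int endPos) {
        List<Integer> words = new ArrayList<>();
        for (int i = startPos; i < endPos; i++) {
            words.add(ngram[i]);
        }
        if (words.isEmpty()) {
            throw new IllegalArgumentException("empty n-gram");
        }
        return score(words);
    }

    private float score(List<Integer> words) {
        Entry hit = table.get(words);
        if (hit != null) {
            return hit.logProb;           // n-gram is stored: use it directly
        }
        if (words.size() == 1) {
            return oovLogProb;            // unseen unigram
        }
        // Backoff weight of the context (all but the last word); 0 if unseen.
        Entry context = table.get(words.subList(0, words.size() - 1));
        float backoff = (context == null) ? 0.0f : context.backoff;
        // Recurse on the n-gram with its first word dropped.
        return backoff + score(words.subList(1, words.size()));
    }

    public static void main(String[] args) {
        ToyBackoffLm lm = new ToyBackoffLm(-10.0f);
        lm.put(-2.0f, -0.5f, 1);          // unigram [1]
        lm.put(-1.0f, 0.0f, 2);           // unigram [2]
        lm.put(-0.25f, 0.0f, 1, 2);       // bigram [1, 2]
        System.out.println(lm.getLogProb(new int[] {1, 2}, 0, 2)); // stored bigram
        System.out.println(lm.getLogProb(new int[] {1, 1}, 0, 2)); // backs off to [1]
    }
}
```

The real class stores the same kind of pairs in a compact trie rather than a hash map; the sketch trades that efficiency for readability.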
Nested Class Summary
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ArrayEncodedNgramLanguageModel
ArrayEncodedNgramLanguageModel.DefaultImplementations
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
-
Field Summary
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
-
Constructor Summary
Constructors
ArrayEncodedProbBackoffLm(int lmOrder, WordIndexer<W> wordIndexer, NgramMap<ProbBackoffPair> map, ConfigOptions opts)
-
Method Summary
Modifier and Type | Method | Description
float | getLogProb(int[] ngram) | Equivalent to getLogProb(ngram, 0, ngram.length)
float | getLogProb(int[] ngram, int startPos, int endPos) | Calculate language model score of an n-gram.
float | getLogProb(List<W> ngram) | Scores an n-gram.
Methods inherited from class edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel
scoreSentence
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getWordIndexer, scoreSentence, setOovWordLogProb
-
Constructor Details
-
ArrayEncodedProbBackoffLm
public ArrayEncodedProbBackoffLm(int lmOrder, WordIndexer<W> wordIndexer, NgramMap<ProbBackoffPair> map, ConfigOptions opts)
-
-
Method Details
-
getLogProb
public float getLogProb(int[] ngram, int startPos, int endPos)
Description copied from interface: ArrayEncodedNgramLanguageModel
Calculate language model score of an n-gram. Warning: if you pass in an n-gram of length greater than getLmOrder(), this call will silently ignore the extra words of context. In other words, if you pass in a 5-gram (endPos - startPos == 5) to a 3-gram model, it will only score the words from startPos + 2 to endPos.
- Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Specified by:
getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
- Parameters:
ngram - array of words in integer representation
startPos - start of the portion of the array to be read
endPos - end of the portion of the array to be read
- Returns:
-
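The warning above about silently ignored context can be made concrete with a small sketch. `NgramWindow` and `effectiveStart` are hypothetical names, not part of the library; the helper only computes which index would be the first one scored if at most `lmOrder` words are used, which matches the 5-gram example in the description.

```java
// Illustrative helper; not part of edu.berkeley.nlp.lm.
final class NgramWindow {
    // First index that is actually scored when only the last lmOrder
    // words of ngram[startPos, endPos) are taken into account.
    static int effectiveStart(int startPos, int endPos, int lmOrder) {
        return Math.max(startPos, endPos - lmOrder);
    }

    // True if the requested span is longer than the model order, i.e. the
    // caller would lose context without being told.
    static boolean losesContext(int startPos, int endPos, int lmOrder) {
        return endPos - startPos > lmOrder;
    }

    public static void main(String[] args) {
        // A 5-gram passed to a 3-gram model: scoring starts at startPos + 2.
        System.out.println(effectiveStart(0, 5, 3)); // prints 2
        System.out.println(losesContext(0, 5, 3));   // prints true
    }
}
```

A caller that wants to detect the truncation rather than accept it silently could run a check like `losesContext` against `getLmOrder()` before scoring.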
getLogProb
public float getLogProb(int[] ngram)
Description copied from interface: ArrayEncodedNgramLanguageModel
Equivalent to getLogProb(ngram, 0, ngram.length)
- Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Overrides:
getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
-
getLogProb
public float getLogProb(List<W> ngram)
Description copied from interface: NgramLanguageModel
Scores an n-gram. This is a convenience method and will generally be relatively inefficient. More efficient versions are available in ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
getLogProb in interface NgramLanguageModel<W>
- Overrides:
getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
-
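One plausible reason the List-based overload is described as inefficient is that each call has to map words to integer ids and allocate an array before the array-based scorer can run. The sketch below illustrates that pattern under stated assumptions: `ListScoringSketch`, `ArrayScorer`, and the `Map`-based index are hypothetical stand-ins, not the library's `WordIndexer` or its actual default implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; the names here are not part of edu.berkeley.nlp.lm.
final class ListScoringSketch {
    // Hypothetical stand-in for the array-based scoring method.
    interface ArrayScorer {
        float getLogProb(int[] ngram, int startPos, int endPos);
    }

    // Convert words to integer ids; unknown words map to -1 in this sketch.
    static int[] toIndexArray(List<String> words, Map<String, Integer> index) {
        int[] ids = new int[words.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = index.getOrDefault(words.get(i), -1);
        }
        return ids;
    }

    // Per-call lookup and allocation, then delegation to the array overload.
    static float getLogProb(List<String> words, Map<String, Integer> index,
                            ArrayScorer scorer) {
        int[] ids = toIndexArray(words, index);
        return scorer.getLogProb(ids, 0, ids.length);
    }

    public static void main(String[] args) {
        Map<String, Integer> index = new HashMap<>();
        index.put("the", 0);
        index.put("cat", 1);
        // Dummy scorer for demonstration: -1 per word in the span.
        ArrayScorer dummy = (ngram, start, end) -> -(float) (end - start);
        System.out.println(getLogProb(Arrays.asList("the", "cat"), index, dummy));
    }
}
```

Code that scores many n-grams would typically index words once and call the `int[]` overload directly, as the description recommends.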
getNgramMap
-