Package org.apache.lucene.analysis.ngram
Class EdgeNGramTokenFilter
- java.lang.Object
-
- org.apache.lucene.util.AttributeSource
-
- org.apache.lucene.analysis.TokenStream
-
- org.apache.lucene.analysis.TokenFilter
-
- org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter
-
- All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>
public final class EdgeNGramTokenFilter extends TokenFilter
Tokenizes the given token into n-grams of given size(s). This TokenFilter creates n-grams from the beginning edge of an input token. As of Lucene 4.4, this filter correctly handles supplementary characters.
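To make the note about supplementary characters concrete, here is a minimal plain-Java sketch of taking prefixes from the beginning edge of a term while counting Unicode code points rather than `char` units. It is an illustration only and not the filter's implementation; the class `EdgePrefixSketch` and the method `edgePrefixes` are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
final class EdgePrefixSketch {

    // Returns the leading prefixes of term with 1..maxCodePoints code points.
    // Lengths are measured in code points so a surrogate pair is never split.
    static List<String> edgePrefixes(String term, int maxCodePoints) {
        List<String> prefixes = new ArrayList<>();
        int total = term.codePointCount(0, term.length());
        for (int n = 1; n <= Math.min(maxCodePoints, total); n++) {
            // offsetByCodePoints converts a code point count into a char index.
            prefixes.add(term.substring(0, term.offsetByCodePoints(0, n)));
        }
        return prefixes;
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 (encoded as a surrogate pair), then 'b'.
        String term = "a\uD83D\uDE00b";
        for (String prefix : edgePrefixes(term, 3)) {
            System.out.println(prefix + " (" + prefix.length() + " chars)");
        }
    }
}
```

In this sketch the second prefix of the sample term spans three `char` units but only two code points, which is the kind of case a `char`-based prefix would cut in half.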
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
-
-
Field Summary
Fields
Modifier and Type: static boolean
Field: DEFAULT_PRESERVE_ORIGINAL
-
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
-
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
-
-
Constructor Summary
Constructors
EdgeNGramTokenFilter(TokenStream input, int gramSize)
Creates an EdgeNGramTokenFilter that produces edge n-grams of the given size.
EdgeNGramTokenFilter(TokenStream input, int minGram, int maxGram, boolean preserveOriginal)
Creates an EdgeNGramTokenFilter that, for a given input term, produces all edge n-grams with lengths >= minGram and <= maxGram.
-
Method Summary
Modifier and Type, Method
void end()
boolean incrementToken()
void reset()
-
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, unwrap
-
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
-
-
-
-
Field Detail
-
DEFAULT_PRESERVE_ORIGINAL
public static final boolean DEFAULT_PRESERVE_ORIGINAL
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
EdgeNGramTokenFilter
public EdgeNGramTokenFilter(TokenStream input, int minGram, int maxGram, boolean preserveOriginal)
Creates an EdgeNGramTokenFilter that, for a given input term, produces all edge n-grams with lengths >= minGram and <= maxGram. Will optionally preserve the original term when its length is outside of the defined range.
Parameters:
input - TokenStream holding the input to be tokenized
minGram - the minimum length of the generated n-grams
maxGram - the maximum length of the generated n-grams
preserveOriginal - whether or not to keep the original term when it is outside the min/max size range
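The interaction of minGram, maxGram, and preserveOriginal described above can be sketched in plain Java without any Lucene dependency. This is a hedged illustration of the documented contract, not the filter's actual code: the class `EdgeNGramRangeSketch`, the method `edgeNGrams`, the argument check, and the choice to append the original term after the grams are all assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
final class EdgeNGramRangeSketch {

    // Produces the edge n-grams of term with lengths in [minGram, maxGram],
    // measured in code points. If preserveOriginal is true and the term's
    // length falls outside that range, the unchanged term is added as well.
    static List<String> edgeNGrams(String term, int minGram, int maxGram,
                                   boolean preserveOriginal) {
        // Assumption of this sketch: reject ranges that cannot produce grams.
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        int length = term.codePointCount(0, term.length());
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, length);
        for (int n = minGram; n <= longest; n++) {
            grams.add(term.substring(0, term.offsetByCodePoints(0, n)));
        }
        boolean outsideRange = length < minGram || length > maxGram;
        if (preserveOriginal && outsideRange) {
            grams.add(term); // assumption: original goes after the grams
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", 2, 4, false)); // [lu, luc, luce]
        System.out.println(edgeNGrams("lucene", 2, 4, true));  // [lu, luc, luce, lucene]
        System.out.println(edgeNGrams("a", 2, 4, true));       // [a]
    }
}
```

Note that a term whose length already lies inside the range (for example "abc" with minGram 2 and maxGram 4) yields the full term as its longest gram, so preserveOriginal adds nothing in that case.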
-
EdgeNGramTokenFilter
public EdgeNGramTokenFilter(TokenStream input, int gramSize)
Creates an EdgeNGramTokenFilter that produces edge n-grams of the given size.
Parameters:
input - TokenStream holding the input to be tokenized
gramSize - the n-gram size to generate
-
-
Method Detail
-
incrementToken
public final boolean incrementToken() throws IOException
Specified by:
incrementToken in class TokenStream
Throws:
IOException
-
reset
public void reset() throws IOException
Overrides:
reset in class TokenFilter
Throws:
IOException
-
end
public void end() throws IOException
Overrides:
end in class TokenFilter
Throws:
IOException
-
-