Class TrecParserByPath
java.lang.Object
  org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
    org.apache.lucene.benchmark.byTask.feeds.TrecParserByPath
public class TrecParserByPath extends TrecDocParser
Parser for TREC docs that selects the parser to apply according to the source file's path, defaulting to TrecGov2Parser.
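The path-based selection described above can be illustrated with a small, self-contained sketch. It does not use the Lucene classes and is not the library's implementation: PathDispatchSketch, its PathType enum, the Parser interface, and the string results are hypothetical stand-ins. Matching on directory names and falling back to the default when pathType is null are assumptions made for illustration only.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: a stand-alone model of path-based parser dispatch. */
final class PathDispatchSketch {

  /** Hypothetical stand-in for TrecDocParser.ParsePathType. */
  enum PathType { GOV2, FBIS, FT, FR94, LATIMES }

  /** Hypothetical stand-in for a collection-specific parser. */
  interface Parser {
    String parse(String name, StringBuilder docBuf);
  }

  /** Mirrors the documented default (the GOV2 parser). */
  static final PathType DEFAULT_PATH_TYPE = PathType.GOV2;

  /** One parser per path type; each dummy parser just tags its output. */
  private static final Map<PathType, Parser> PARSERS = new EnumMap<>(PathType.class);

  static {
    for (PathType t : PathType.values()) {
      PARSERS.put(t, (name, buf) -> t + ":" + name + ":" + buf.length());
    }
  }

  /**
   * Assumption: the type is inferred from a path segment whose upper-cased
   * name equals an enum constant; otherwise the default type is returned.
   */
  static PathType pathType(String path) {
    for (String part : path.split("[/\\\\]")) {
      String upper = part.toUpperCase(Locale.ROOT);
      for (PathType t : PathType.values()) {
        if (t.name().equals(upper)) {
          return t;
        }
      }
    }
    return DEFAULT_PATH_TYPE;
  }

  /** Delegates to the parser registered for the type (assumption: null means default). */
  static String parse(String name, StringBuilder docBuf, PathType pathType) {
    PathType effective = (pathType == null) ? DEFAULT_PATH_TYPE : pathType;
    return PARSERS.get(effective).parse(name, docBuf);
  }

  public static void main(String[] args) {
    StringBuilder buf = new StringBuilder("<DOC>example</DOC>");
    PathType type = pathType("trec/latimes/la010189.gz");
    System.out.println(parse("doc-1", buf, type));
    System.out.println(parse("doc-2", buf, pathType("data/unknown/doc.gz")));
  }
}
```

In this sketch the per-type parsers live in an EnumMap, so selecting one is a single lookup and the dispatching class itself holds no per-document state.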
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
Field Summary
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
Constructor Summary
Constructor
TrecParserByPath()
Method Summary
Modifier and Type: DocData
Method: parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
Description: Parse the text prepared in docBuf into a result DocData; no synchronization is required.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
Method Detail
parse
public DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType) throws IOException
Description copied from class: TrecDocParser
Parse the text prepared in docBuf into a result DocData; no synchronization is required.
Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name that should be set to the result
trecSrc - calling trec content source
docBuf - text to parse
pathType - type of parsed file, or null if unknown; may be used by parsers to alter their behavior according to the file path type.
Throws:
IOException