Package org.apache.lucene.util
Class OfflineSorter
java.lang.Object
    org.apache.lucene.util.OfflineSorter
public class OfflineSorter extends Object
On-disk sorting of byte arrays. Each byte array (entry) is composed of the following fields:
- (two bytes) length of the following byte array,
- exactly the above count of bytes for the sequence to be sorted.
See Also:
- sort(String)
WARNING: This API is experimental and might change in incompatible ways in the next release.
NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
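The two-field entry layout described above can be sketched with the JDK alone. This is an illustration, not Lucene's reader or writer: the class name LengthPrefixedEntries is hypothetical, and the sketch assumes the two-byte length is written big-endian and caps entries at Short.MAX_VALUE bytes, neither of which this page states.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the entry layout; not part of Lucene. */
final class LengthPrefixedEntries {

  /** Encodes each entry as a two-byte length followed by exactly that many bytes. */
  static byte[] encode(List<byte[]> entries) {
    int total = 0;
    for (byte[] entry : entries) {
      if (entry.length > Short.MAX_VALUE) {
        throw new IllegalArgumentException("entry too long for this sketch: " + entry.length);
      }
      total += Short.BYTES + entry.length;
    }
    ByteBuffer out = ByteBuffer.allocate(total); // ByteBuffer is big-endian by default (assumed layout)
    for (byte[] entry : entries) {
      out.putShort((short) entry.length); // field 1: two-byte length
      out.put(entry);                     // field 2: the bytes to be sorted
    }
    return out.array();
  }

  /** Decodes entries until the buffer is exhausted. */
  static List<byte[]> decode(byte[] encoded) {
    List<byte[]> entries = new ArrayList<>();
    ByteBuffer in = ByteBuffer.wrap(encoded);
    while (in.hasRemaining()) {
      int length = in.getShort() & 0xFFFF; // read the length as an unsigned value
      byte[] entry = new byte[length];
      in.get(entry);
      entries.add(entry);
    }
    return entries;
  }

  public static void main(String[] args) {
    byte[] encoded = encode(List.of("ab".getBytes(StandardCharsets.UTF_8), new byte[0]));
    System.out.println(encoded.length + " bytes encode " + decode(encoded).size() + " entries");
  }
}
```

Each entry costs two bytes of overhead on top of its payload, so an empty entry still occupies two bytes in the stream.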
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | OfflineSorter.BufferSize | A bit more descriptive unit for constructors.
static class | OfflineSorter.ByteSequencesReader | Utility class to read length-prefixed byte[] entries from an input.
static class | OfflineSorter.ByteSequencesWriter | Utility class to emit length-prefixed byte[] entries to an output stream for sorting.
class | OfflineSorter.SortInfo | Sort info (debugging mostly).
-
Field Summary
Fields
Modifier and Type | Field | Description
static long | ABSOLUTE_MIN_SORT_BUFFER_SIZE | Absolute minimum required buffer size for sorting.
static Comparator<BytesRef> | DEFAULT_COMPARATOR | Default comparator: sorts in binary (codepoint) order.
static long | GB | Convenience constant for gigabytes.
static int | MAX_TEMPFILES | Maximum number of temporary files before doing an intermediate merge.
static long | MB | Convenience constant for megabytes.
static long | MIN_BUFFER_SIZE_MB | Minimum recommended buffer size for sorting.
-
Constructor Summary
Constructors
Constructor | Description
OfflineSorter(Directory dir, String tempFileNamePrefix) | Defaults constructor.
OfflineSorter(Directory dir, String tempFileNamePrefix, Comparator<BytesRef> comparator) | Defaults constructor with a custom comparator.
OfflineSorter(Directory dir, String tempFileNamePrefix, Comparator<BytesRef> comparator, OfflineSorter.BufferSize ramBufferSize, int maxTempfiles, int valueLength, ExecutorService exec, int maxPartitionsInRAM) | All-details constructor.
-
Method Summary
All Methods
Modifier and Type | Method | Description
Comparator<BytesRef> | getComparator() | Returns the comparator in use to sort entries.
Directory | getDirectory() | Returns the Directory we use to create temp files.
protected OfflineSorter.ByteSequencesReader | getReader(ChecksumIndexInput in, String name) | Subclasses can override to change how byte sequences are read from disk.
String | getTempFileNamePrefix() | Returns the temp file name prefix passed to Directory.createTempOutput(java.lang.String, java.lang.String, org.apache.lucene.store.IOContext) to generate temporary files.
protected OfflineSorter.ByteSequencesWriter | getWriter(IndexOutput out, long itemCount) | Subclasses can override to change how byte sequences are written to disk.
String | sort(String inputFileName) | Sort input to a new temp file, returning its name.
-
-
-
Field Detail
-
MB
public static final long MB
Convenience constant for megabytes.
See Also:
- Constant Field Values
-
GB
public static final long GB
Convenience constant for gigabytes.
See Also:
- Constant Field Values
-
MIN_BUFFER_SIZE_MB
public static final long MIN_BUFFER_SIZE_MB
Minimum recommended buffer size for sorting.
See Also:
- Constant Field Values
-
ABSOLUTE_MIN_SORT_BUFFER_SIZE
public static final long ABSOLUTE_MIN_SORT_BUFFER_SIZE
Absolute minimum required buffer size for sorting.
See Also:
- Constant Field Values
-
MAX_TEMPFILES
public static final int MAX_TEMPFILES
Maximum number of temporary files before doing an intermediate merge.
See Also:
- Constant Field Values
-
DEFAULT_COMPARATOR
public static final Comparator<BytesRef> DEFAULT_COMPARATOR
Default comparator: sorts in binary (codepoint) order.
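To make "binary (codepoint) order" concrete, here is a minimal JDK-only sketch. It assumes binary order means lexicographic comparison of bytes treated as unsigned values, which for UTF-8 text coincides with Unicode code point order. The class BinaryOrderSketch and its UNSIGNED_BYTES comparator are hypothetical and operate on plain byte[] rather than BytesRef.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration of unsigned byte-wise ordering; not Lucene's comparator. */
final class BinaryOrderSketch {

  /** Lexicographic comparison treating each byte as a value in 0..255 (Java 9+). */
  static final Comparator<byte[]> UNSIGNED_BYTES = Arrays::compareUnsigned;

  private static byte[] utf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // "\u00e9" (e with acute accent) encodes to bytes >= 0x80, which are negative as signed Java bytes.
    byte[][] keys = {utf8("z"), utf8("\u00e9"), utf8("a")};
    Arrays.sort(keys, UNSIGNED_BYTES);
    for (byte[] key : keys) {
      System.out.println(new String(key, StandardCharsets.UTF_8));
    }
    // Unsigned order puts the accented letter last; a signed byte comparison would put it first.
  }
}
```

The distinction matters because Java's byte type is signed: comparing raw bytes with Byte.compare would sort every non-ASCII UTF-8 sequence before ASCII text.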
-
-
Constructor Detail
-
OfflineSorter
public OfflineSorter(Directory dir, String tempFileNamePrefix) throws IOException
Defaults constructor.
Throws:
- IOException
See Also:
- OfflineSorter.BufferSize.automatic()
-
OfflineSorter
public OfflineSorter(Directory dir, String tempFileNamePrefix, Comparator<BytesRef> comparator) throws IOException
Defaults constructor with a custom comparator.
Throws:
- IOException
See Also:
- OfflineSorter.BufferSize.automatic()
-
OfflineSorter
public OfflineSorter(Directory dir, String tempFileNamePrefix, Comparator<BytesRef> comparator, OfflineSorter.BufferSize ramBufferSize, int maxTempfiles, int valueLength, ExecutorService exec, int maxPartitionsInRAM)
All-details constructor. If valueLength is -1 (the default), values may have different lengths; otherwise, all values have the specified length. If you pass a non-null ExecutorService, it will be used to run sorting operations that can be run concurrently, and maxPartitionsInRAM is the maximum number of concurrent in-memory partitions. Thus the maximum possible RAM used by this class while sorting is maxPartitionsInRAM * ramBufferSize.
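The RAM bound stated above is a simple product, worked through here as a small sketch. The class SortRamBound, its method name, and the value used for MB (1024 * 1024 bytes) are assumptions for illustration; this page does not list the constant's value.

```java
/** Hypothetical helper for the bound maxPartitionsInRAM * ramBufferSize; not part of Lucene. */
final class SortRamBound {

  /** Assumed size of one megabyte in bytes for this sketch. */
  static final long MB = 1024L * 1024L;

  /** Returns the worst-case number of bytes held in sort buffers at once. */
  static long maxSortRamBytes(int maxPartitionsInRAM, long ramBufferSizeBytes) {
    if (maxPartitionsInRAM < 1 || ramBufferSizeBytes < 0) {
      throw new IllegalArgumentException("need at least one partition and a non-negative buffer size");
    }
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact((long) maxPartitionsInRAM, ramBufferSizeBytes);
  }

  public static void main(String[] args) {
    // Four concurrent in-memory partitions with a 64 MB buffer each.
    System.out.println(maxSortRamBytes(4, 64 * MB) / MB + " MB"); // prints "256 MB"
  }
}
```

When sizing a heap for concurrent sorting, this product is the figure to budget for, rather than the buffer size alone.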
-
-
Method Detail
-
getTempFileNamePrefix
public String getTempFileNamePrefix()
Returns the temp file name prefix passed to Directory.createTempOutput(java.lang.String, java.lang.String, org.apache.lucene.store.IOContext) to generate temporary files.
-
sort
public String sort(String inputFileName) throws IOException
Sort input to a new temp file, returning its name.
Throws:
- IOException
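For readers unfamiliar with on-disk sorting, the general external merge sort technique can be sketched in memory as follows. This is not the implementation behind sort(String): in-memory lists stand in for temporary files, strings stand in for byte sequences, and the class ExternalSortSketch with its runSize parameter is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical in-memory illustration of external merge sort; not Lucene's code. */
final class ExternalSortSketch {

  /** Splits the input into sorted runs of at most runSize items (stand-ins for temp files). */
  static List<List<String>> sortedRuns(List<String> input, int runSize) {
    if (runSize < 1) {
      throw new IllegalArgumentException("runSize must be >= 1");
    }
    List<List<String>> runs = new ArrayList<>();
    for (int start = 0; start < input.size(); start += runSize) {
      int end = Math.min(start + runSize, input.size());
      List<String> run = new ArrayList<>(input.subList(start, end));
      Collections.sort(run); // each run fits in the "buffer" and is sorted on its own
      runs.add(run);
    }
    return runs;
  }

  /** Read position within one sorted run. */
  private static final class Cursor {
    final List<String> run;
    int pos;

    Cursor(List<String> run) {
      this.run = run;
    }

    String head() {
      return run.get(pos);
    }
  }

  /** K-way merge: repeatedly takes the smallest head among all runs. */
  static List<String> merge(List<List<String>> runs) {
    PriorityQueue<Cursor> queue = new PriorityQueue<>((a, b) -> a.head().compareTo(b.head()));
    for (List<String> run : runs) {
      if (!run.isEmpty()) {
        queue.add(new Cursor(run));
      }
    }
    List<String> merged = new ArrayList<>();
    while (!queue.isEmpty()) {
      Cursor smallest = queue.poll();
      merged.add(smallest.head());
      smallest.pos++;
      if (smallest.pos < smallest.run.size()) {
        queue.add(smallest); // re-insert with its new head
      }
    }
    return merged;
  }

  static List<String> sort(List<String> input, int runSize) {
    return merge(sortedRuns(input, runSize));
  }

  public static void main(String[] args) {
    System.out.println(sort(List.of("d", "b", "a", "c", "e"), 2)); // prints [a, b, c, d, e]
  }
}
```

In a disk-backed sorter the run size is governed by the RAM buffer, and the number of runs merged at once is typically capped, which is the role a limit such as MAX_TEMPFILES plays.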
-
getWriter
protected OfflineSorter.ByteSequencesWriter getWriter(IndexOutput out, long itemCount) throws IOException
Subclasses can override to change how byte sequences are written to disk.
Throws:
- IOException
-
getReader
protected OfflineSorter.ByteSequencesReader getReader(ChecksumIndexInput in, String name) throws IOException
Subclasses can override to change how byte sequences are read from disk.
Throws:
- IOException
-
getComparator
public Comparator<BytesRef> getComparator()
Returns the comparator in use to sort entries.
-
-