Field that indexes int values for efficient range filtering and sorting. Here's an example usage:
document.add(new IntField(name, 6, Field.Store.NO));
For optimal performance, re-use the IntField and Document instances for more than one document:
IntField field = new IntField(name, 6, Field.Store.NO);
Document document = new Document();
document.add(field);
for (all documents) {
  ...
  field.setIntValue(value);
  writer.addDocument(document);
  ...
}
See also LongField, FloatField, DoubleField.
To perform range querying or filtering against an IntField, use NumericRangeQuery. To sort according to an IntField, use the normal numeric sort types, e.g. org.apache.lucene.search.SortField.Type#INT. IntField values can also be loaded directly from org.apache.lucene.index.LeafReader#getNumericDocValues.
You may add the same field name as an IntField to the same document more than once. Range querying and filtering will be the logical OR of all values, so a range query will hit all documents that have at least one value in the range. However, sort behavior is not defined. If you need to sort, you should separately index a single-valued IntField.
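The OR semantics for multi-valued fields can be modeled in plain Java. This is only a sketch of the matching rule, not Lucene code: the class and method names are hypothetical, and inclusive bounds are an assumption made for the illustration.

```java
/** Models the matching rule for a multi-valued numeric field; not part of Lucene. */
final class MultiValuedRangeSketch {

    /**
     * Returns true if at least one of the document's values lies in [min, max].
     * Both bounds are treated as inclusive (an assumption of this sketch).
     */
    static boolean matchesRange(int[] documentValues, int min, int max) {
        for (int value : documentValues) {
            if (value >= min && value <= max) {
                return true; // one hit is enough: logical OR over all values
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] values = {3, 42}; // same field name added twice to one document
        System.out.println(matchesRange(values, 40, 50));
        System.out.println(matchesRange(values, 10, 20));
    }
}
```

Because any single value can produce a match, there is no single obvious value to sort by, which is consistent with the sort behavior being undefined.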
An IntField will consume somewhat more disk space in the index than an ordinary single-valued field. However, for a typical index that includes substantial textual content per document, this increase will likely be in the noise.
Within Lucene, each numeric value is indexed as a trie structure, where each term is logically assigned to larger and larger pre-defined brackets (which are simply lower-precision representations of the value). The step size between each successive bracket is called the precisionStep, measured in bits. Smaller precisionStep values result in a larger number of brackets, which consumes more disk space in the index but may result in faster range search performance. The default value, 8, was selected for a reasonable tradeoff of disk space consumption versus performance. You can create a custom FieldType and invoke the FieldType#setNumericPrecisionStep method if you'd like to change the value. Note that you must also specify a congruent value when creating NumericRangeQuery.
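The bracket idea can be illustrated with a small, self-contained sketch. It is a simplification under stated assumptions, not Lucene's actual term encoding: the class and method names are hypothetical, and each bracket is represented simply as the sign-flipped value with its low bits shifted away.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration of trie brackets; not Lucene's real term format. */
final class TrieBracketSketch {

    /**
     * Returns the bracket that {@code value} falls into at each precision level.
     * Index 0 is full precision; each later entry drops {@code precisionStep}
     * more low-order bits, giving a coarser bracket.
     */
    static List<Long> brackets(int value, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        // Flip the sign bit so that unsigned bit order matches signed int order,
        // then widen to long so the result prints as an unsigned number.
        long sortable = (value ^ 0x80000000) & 0xFFFFFFFFL;
        List<Long> result = new ArrayList<>();
        // A long counter avoids overflow when precisionStep is Integer.MAX_VALUE.
        for (long shift = 0; shift < 32; shift += precisionStep) {
            result.add(sortable >>> (int) shift);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("6   -> " + brackets(6, 8));
        System.out.println("7   -> " + brackets(7, 8));
        System.out.println("300 -> " + brackets(300, 8));
    }
}
```

With a step of 8 bits, the values 6 and 7 differ only at full precision and share every coarser bracket, while 300 falls into a different bracket one level up. A range query can then cover many values by matching a few coarse brackets instead of enumerating every full-precision term.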
For low-cardinality fields, larger precision steps work well. If the cardinality is < 100, it is fair to use a precisionStep of Integer#MAX_VALUE, which produces one term per value.
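To see why a large step suits low-cardinality fields, the following sketch counts the distinct terms a set of values would produce under a given precisionStep. It uses a simplified model (one term per shift level and bracket), not Lucene's actual encoding, and all names in it are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Counts distinct (shift, bracket) terms under a simplified trie model. */
final class TermCountSketch {

    static int distinctTerms(int[] values, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        Set<String> terms = new HashSet<>();
        for (int value : values) {
            // Sign-flipped, unsigned view of the value so ordering is preserved.
            long sortable = (value ^ 0x80000000) & 0xFFFFFFFFL;
            // A long counter avoids overflow for very large steps.
            for (long shift = 0; shift < 32; shift += precisionStep) {
                terms.add(shift + ":" + (sortable >>> (int) shift));
            }
        }
        return terms.size();
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println("step 8:         " + distinctTerms(values, 8));
        System.out.println("step MAX_VALUE: " + distinctTerms(values, Integer.MAX_VALUE));
    }
}
```

In this model, a step of Integer.MAX_VALUE yields exactly one term per distinct value, whereas a step of 8 adds coarser bracket terms on top of the full-precision ones.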
For more information on the internals of numeric trie indexing, including the precisionStep configuration, see NumericRangeQuery. The format of indexed values is described in NumericUtils.
If you only need to sort by numeric value, and never run range querying/filtering, you can index using a precisionStep of Integer#MAX_VALUE. This will minimize disk space consumed.
More advanced users can instead use NumericTokenStream directly when indexing numbers. This class is a wrapper around that token stream type for easier, more intuitive usage.