private static Set<String> getFieldNamesToSample( LeafReaderContext readerContext ) throws IOException
{
    Fields fields = readerContext.reader().fields();
    Set<String> fieldNames = new HashSet<>();
    for ( String field : fields )
    {
        if ( !LuceneDocumentStructure.NODE_ID_KEY.equals( field ) )
        {
            fieldNames.add( field );
        }
    }
    return fieldNames;
}
private Set<String> allFields() throws IOException
{
    Set<String> allFields = new HashSet<>();
    for ( LeafReader leafReader : allLeafReaders() )
    {
        Iterables.addAll( allFields, leafReader.fields() );
    }
    return allFields;
}
private Terms termsForField( String fieldName ) throws IOException
{
    List<Terms> terms = new ArrayList<>();
    List<ReaderSlice> readerSlices = new ArrayList<>();
    for ( LeafReader leafReader : allLeafReaders() )
    {
        Fields fields = leafReader.fields();
        Terms leafTerms = fields.terms( fieldName );
        if ( leafTerms != null )
        {
            ReaderSlice readerSlice = new ReaderSlice( 0, Math.toIntExact( leafTerms.size() ), 0 );
            terms.add( leafTerms );
            readerSlices.add( readerSlice );
        }
    }
    Terms[] termsArray = terms.toArray( new Terms[terms.size()] );
    ReaderSlice[] readerSlicesArray = readerSlices.toArray( new ReaderSlice[readerSlices.size()] );
    return new MultiTerms( termsArray, readerSlicesArray );
}
for ( LeafReaderContext leafReaderContext : searcher.getIndexReader().leaves() )
{
    Fields fields = leafReaderContext.reader().fields();
    for ( String field : fields )
    {
        // ...
    }
}
@Override
public Fields fields() throws IOException {
    ensureOpen();
    return in.fields();
}
/** This may return null if the field does not exist. */
public final Terms terms(String field) throws IOException {
    return fields().terms(field);
}
public void fieldsList() throws IOException {
    // we'll just look at the first segment - generally, the fields
    // list will be the same for all segments
    LeafReader leafReader = reader.leaves().get(0).reader();
    for (String field : leafReader.fields()) {
        System.out.println(field);
    }
}
public OrdWrappedTermsEnum(LeafReader reader) throws IOException {
    assert indexedTermsArray != null;
    assert 0 != indexedTermsArray.length;
    termsEnum = reader.fields().terms(field).iterator();
}
@Override
public Fields fields() throws IOException {
    return new SortingFields(in.fields(), in.getFieldInfos(), docMap);
}
/**
 * Initialize lookup for the provided segment
 */
PerThreadIDAndVersionLookup(LeafReader reader, String uidField) throws IOException {
    this.uidField = uidField;
    Fields fields = reader.fields();
    Terms terms = fields.terms(uidField);
    if (terms == null) {
        throw new IllegalArgumentException("reader misses the [" + uidField + "] field");
    }
    termsEnum = terms.iterator();
    versions = reader.getNumericDocValues(VersionFieldMapper.NAME);
    if (versions == null) {
        throw new IllegalArgumentException("reader misses the [" + VersionFieldMapper.NAME + "] field");
    }
    Object readerKey = null;
    assert (readerKey = reader.getCoreCacheKey()) != null;
    this.readerKey = readerKey;
}
private Query buildFilterClause(LeafReader reader) throws IOException {
    Terms terms = reader.fields().terms(field);
    if (terms == null) {
        return null;
    }
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    int docsInBatch = reader.maxDoc();
    BytesRef term;
    TermsEnum te = terms.iterator();
    while ((term = te.next()) != null) {
        // we need to check that every document in the batch has the same field values, otherwise
        // this filtering will not work
        if (te.docFreq() != docsInBatch) {
            throw new IllegalArgumentException("Some documents in this batch do not have a term value of "
                    + field + ":" + Term.toString(term));
        }
        bq.add(new TermQuery(new Term(field, BytesRef.deepCopyOf(term))), BooleanClause.Occur.SHOULD);
    }
    BooleanQuery built = bq.build();
    if (built.clauses().size() == 0) {
        return null;
    }
    return built;
}
/**
 * Initialize lookup for the provided segment
 */
public PerThreadIDAndVersionLookup(LeafReader reader) throws IOException {
    TermsEnum termsEnum = null;
    NumericDocValues versions = null;
    boolean hasPayloads = false;
    Fields fields = reader.fields();
    if (fields != null) {
        Terms terms = fields.terms(UidFieldMapper.NAME);
        if (terms != null) {
            hasPayloads = terms.hasPayloads();
            termsEnum = terms.iterator();
            assert termsEnum != null;
            versions = reader.getNumericDocValues(VersionFieldMapper.NAME);
        }
    }
    this.versions = versions;
    this.termsEnum = termsEnum;
    this.hasPayloads = hasPayloads;
}
/**
 * @return the estimate for loading the entire term set into field data, or 0 if unavailable
 */
public long estimateStringFieldData() {
    try {
        LeafReader reader = context.reader();
        Terms terms = reader.terms(getFieldName());
        Fields fields = reader.fields();
        final Terms fieldTerms = fields.terms(getFieldName());
        if (fieldTerms instanceof FieldReader) {
            final Stats stats = ((FieldReader) fieldTerms).getStats();
            long totalTermBytes = stats.totalTermBytes;
            if (logger.isTraceEnabled()) {
                logger.trace("totalTermBytes: {}, terms.size(): {}, terms.getSumDocFreq(): {}",
                        totalTermBytes, terms.size(), terms.getSumDocFreq());
            }
            long totalBytes = totalTermBytes + (2 * terms.size()) + (4 * terms.getSumDocFreq());
            return totalBytes;
        }
    } catch (Exception e) {
        logger.warn("Unable to estimate memory overhead", e);
    }
    return 0;
}
/**
 * Returns total in-heap bytes used by all suggesters. This method has CPU cost <code>O(numIndexedFields)</code>.
 *
 * @param fieldNamePatterns if non-null, any completion field name matching any of these patterns will break out
 *                          its in-heap bytes separately in the returned {@link CompletionStats}
 */
public CompletionStats completionStats(IndexReader indexReader, String... fieldNamePatterns) {
    CompletionStats completionStats = new CompletionStats();
    for (LeafReaderContext atomicReaderContext : indexReader.leaves()) {
        LeafReader atomicReader = atomicReaderContext.reader();
        try {
            Fields fields = atomicReader.fields();
            for (String fieldName : fields) {
                Terms terms = fields.terms(fieldName);
                if (terms instanceof CompletionTerms) {
                    CompletionTerms completionTerms = (CompletionTerms) terms;
                    completionStats.add(completionTerms.stats(fieldNamePatterns));
                }
            }
        } catch (IOException ioe) {
            logger.error("Could not get completion stats", ioe);
        }
    }
    return completionStats;
}
@Override
public DocIdSet getDocIdSet(LeafReaderContext context, Bits acceptDocs) throws IOException {
    final LeafReader reader = context.reader();
    BitDocIdSet.Builder builder = new BitDocIdSet.Builder(reader.maxDoc());
    final Fields fields = reader.fields();
    final BytesRef spare = new BytesRef(this.termsBytes);
    Terms terms = null;
    TermsEnum termsEnum = null;
    PostingsEnum docs = null;
    for (TermsAndField termsAndField : this.termsAndFields) {
        if ((terms = fields.terms(termsAndField.field)) != null) {
            termsEnum = terms.iterator(); // this won't return null
            for (int i = termsAndField.start; i < termsAndField.end; i++) {
                spare.offset = offsets[i];
                spare.length = offsets[i + 1] - offsets[i];
                if (termsEnum.seekExact(spare)) {
                    docs = termsEnum.postings(docs, PostingsEnum.NONE); // no freq since we don't need them
                    builder.or(docs);
                }
            }
        }
    }
    return BitsFilteredDocIdSet.wrap(builder.build(), acceptDocs);
}
/**
 * @return the estimate for loading the entire term set into field data, or 0 if unavailable
 */
public long estimateStringFieldData() {
    try {
        LeafReader reader = context.reader();
        Terms terms = reader.terms(getFieldNames().indexName());
        Fields fields = reader.fields();
        final Terms fieldTerms = fields.terms(getFieldNames().indexName());
        if (fieldTerms instanceof FieldReader) {
            final Stats stats = ((FieldReader) fieldTerms).getStats();
            long totalTermBytes = stats.totalTermBytes;
            if (logger.isTraceEnabled()) {
                logger.trace("totalTermBytes: {}, terms.size(): {}, terms.getSumDocFreq(): {}",
                        totalTermBytes, terms.size(), terms.getSumDocFreq());
            }
            long totalBytes = totalTermBytes + (2 * terms.size()) + (4 * terms.getSumDocFreq());
            return totalBytes;
        }
    } catch (Exception e) {
        logger.warn("Unable to estimate memory overhead", e);
    }
    return 0;
}