How to use Zemberek3StemFilter in org.apache.lucene.analysis.tr

Best Java code snippets using org.apache.lucene.analysis.tr.Zemberek3StemFilter (Showing top 4 results out of 315)

@Override
public TokenStream create(TokenStream input) {
  return new Zemberek3StemFilter(input, morphology, strategy);
}

static String stem(WordAnalysis results, String aggregation) {
  List<SingleAnalysis> alternatives = selectMorphemes(results, "minMorpheme");
  List<String> candidates = morphToString(alternatives, "lemmas");
  switch (aggregation) {
    case "maxLength":
      return Collections.max(candidates, Comparator.comparing(String::length));
    case "minLength":
      return Collections.min(candidates, Comparator.comparing(String::length));
    default:
      throw new RuntimeException("unknown strategy " + aggregation);
  }
}
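The aggregation step in the snippet above can be tried without Zemberek or Lucene on the classpath. The following is a minimal sketch, not the filter's actual code: the class name `StemAggregationSketch`, the method `pick`, and the candidate strings are hypothetical stand-ins for the lemma list that `morphToString` would produce. It only illustrates how `"maxLength"` and `"minLength"` choose between candidates.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-alone sketch of the length-based aggregation shown above.
final class StemAggregationSketch {

  // Picks one candidate by length, mirroring the switch in the snippet above.
  // Note: Collections.max/min throw NoSuchElementException on an empty list.
  static String pick(List<String> candidates, String aggregation) {
    switch (aggregation) {
      case "maxLength":
        return Collections.max(candidates, Comparator.comparing(String::length));
      case "minLength":
        return Collections.min(candidates, Comparator.comparing(String::length));
      default:
        throw new IllegalArgumentException("unknown strategy " + aggregation);
    }
  }

  public static void main(String[] args) {
    // Assumed candidate lemmas for one ambiguous word; not real analyzer output.
    List<String> candidates = Arrays.asList("kale", "kalem");
    System.out.println("maxLength -> " + pick(candidates, "maxLength"));
    System.out.println("minLength -> " + pick(candidates, "minLength"));
  }
}
```

With these assumed candidates, `"maxLength"` keeps the longer string and `"minLength"` the shorter one, so the choice of strategy decides how aggressively an ambiguous word is reduced.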

private static void parse(String word, TurkishMorphology morphology) {
  WordAnalysis results = morphology.analyze(word);
  System.out.println("Word = " + word + " has " + results.analysisCount() + " solutions");
  if (results.analysisCount() == 0) return;
  System.out.println("Parses: ");
  for (SingleAnalysis result : results) {
    System.out.println("number of morphemes = " + result.getMorphemeDataList().size());
    System.out.println(result.formatLong());
    System.out.println("\tStems = " + result.getStems());
    System.out.println("\tLemmas = " + result.getLemmas());
    System.out.println("\tStemAndEnding = " + result.getStemAndEnding());
    System.out.println("-------------------");
  }
  System.out.println("final selected stem : " + Zemberek3StemFilter.stem(results, "maxLength"));
  System.out.println("==================================");
}

  @Override
  public boolean incrementToken() throws IOException {

    if (!input.incrementToken()) return false;
    if (keywordAttribute.isKeyword()) return true;

    // Copied from org.apache.lucene.analysis.br.BrazilianStemFilter#incrementToken
    final String word = termAttribute.toString();

    final WordAnalysis parses = morphology.analyze(word);
    if (parses.analysisCount() == 0) return true;

    final String s = stem(parses, aggregation);
    // If not stemmed, don't waste the time adjusting the token.
    if ((s != null) && !s.equals(word))
      termAttribute.setEmpty().append(s);

    return true;
  }
}

Javadoc

Stemmer based on Zemberek3
