How to use the classifyToCharacterOffsets method in edu.stanford.nlp.ie.AbstractSequenceClassifier

Best Java code snippets using edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyToCharacterOffsets (Showing top 2 results out of 315)

List<Triple<String, Integer, Integer>> list = classifier.classifyToCharacterOffsets(fileContents);
for (Triple<String, Integer, Integer> item : list) {
 // second() and third() are the begin and end character offsets into fileContents
 System.out.println(item.first() + ": " + fileContents.substring(item.second(), item.third()));
}

for (String str : example) {
 j++;
 List<Triple<String,Integer,Integer>> triples = classifier.classifyToCharacterOffsets(str);
 for (Triple<String,Integer,Integer> trip : triples) {
  System.out.printf("%s over character offsets [%d, %d) in sentence %d.%n",
    trip.first(), trip.second(), trip.third(), j);
 }
}

public List<Type> findTags(Sentence sentence) {
  List<Triple<String, Integer, Integer>> response = classifier.classifyToCharacterOffsets(sentence.originalText);
  List<Type> tags = new ArrayList<Type>();
  for (Triple<String, Integer, Integer> triple : response) {
    try {
      Range range = sentence.getTokenRange(triple.second, triple.third);
      StringBuilder nerStringBuilder = new StringBuilder(triple.first.toLowerCase());
      nerStringBuilder.setCharAt(0, Character.toUpperCase(nerStringBuilder.charAt(0)));
      String nerString = nerStringBuilder.toString();
      if (passesHack(nerString, sentence.tokenizedText(range))) {
        tags.add(new Type(sentence, this.descriptor + nerString, "Stanford", range));
      }
    }
    catch (IllegalArgumentException e) {
      // we might have trouble lining up the
      // character offsets to token offsets
      logger.error("offsets did not line up for: " + triple.second + ", " + triple.third, e);
    }
  }
  return tags;
}
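
The findTags snippet above relies on a getTokenRange call whose implementation is not shown. The following is a minimal sketch of how such an alignment could work, assuming each token carries half-open character offsets into the original text. The Token and Range records and this getTokenRange are hypothetical stand-ins written for illustration; they are not part of Stanford NLP or of the snippet's own code base.

```java
import java.util.List;

public class TokenRangeSketch {
    /** Hypothetical token with half-open character offsets [begin, end) into the source text. */
    public record Token(String text, int begin, int end) {}

    /** Hypothetical half-open range of token indices [first, last). */
    public record Range(int first, int last) {}

    /**
     * Maps a character span onto token indices. The span must start exactly
     * where one token starts and end exactly where a token ends; otherwise
     * the offsets "do not line up" and an IllegalArgumentException is thrown.
     */
    public static Range getTokenRange(List<Token> tokens, int charBegin, int charEnd) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < tokens.size(); i++) {
            Token t = tokens.get(i);
            if (t.begin() == charBegin) {
                first = i;
            }
            if (t.end() == charEnd) {
                last = i + 1; // exclusive upper bound
            }
        }
        if (first < 0 || last < 0 || first >= last) {
            throw new IllegalArgumentException(
                "offsets did not line up for: " + charBegin + ", " + charEnd);
        }
        return new Range(first, last);
    }

    public static void main(String[] args) {
        // Hand-built tokens for the text "Barack Obama spoke"
        List<Token> tokens = List.of(
            new Token("Barack", 0, 6),
            new Token("Obama", 7, 12),
            new Token("spoke", 13, 18));
        // The character span [0, 12) covers the first two tokens.
        System.out.println(getTokenRange(tokens, 0, 12));
    }
}
```

A span that begins or ends in the middle of a token (for example [1, 12) here) raises the exception, which is the case the catch block in findTags logs.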

Javadoc

Classify the contents of a String to classified character offset spans. Plain text or XML input text is expected and the PlainTextDocumentReaderAndWriter is used by default. Output is a (possibly empty, but not null) List of Triples. Each Triple is an entity name, followed by beginning and ending character offsets in the original String. Character offsets can be thought of as fenceposts between the characters, or, like certain methods in the Java String class, as character positions, numbered starting from 0, with the end index pointing to the position AFTER the entity ends. That is, end - start is the length of the entity in characters.
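
To make the offset convention concrete, here is a small sketch of the half-open [begin, end) behaviour described above. It does not call the classifier: the Span record is a stand-in for Triple<String, Integer, Integer>, and the spans are written by hand for illustration rather than produced by a real model.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetSpanSketch {
    /** Stand-in for Triple<String, Integer, Integer>: a label plus half-open offsets [begin, end). */
    public record Span(String label, int begin, int end) {
        /** Entity length in characters, i.e. end - begin. */
        public int length() {
            return end - begin;
        }
    }

    /** Recovers each entity string with String.substring, which uses the same convention. */
    public static List<String> describe(String text, List<Span> spans) {
        List<String> out = new ArrayList<>();
        for (Span s : spans) {
            String entity = text.substring(s.begin(), s.end());
            out.add(s.label() + ": " + entity + " (" + s.length() + " chars)");
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Jane Doe works at Stanford University.";
        // Hand-written spans, assumed for illustration only.
        List<Span> spans = List.of(
            new Span("PERSON", 0, 8),
            new Span("ORGANIZATION", 18, 37));
        describe(text, spans).forEach(System.out::println);
    }
}
```

Because the end offset points just past the entity, the spans can be passed straight to substring with no adjustment.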

Fine points: Token offsets are true with respect to the source text, even though the tokenizer may internally normalize certain tokens to String representations of different lengths (e.g., " becoming `` or ''). When a period counts as both part of an abbreviation and as an end of sentence marker, and that abbreviation is part of a named entity, the reported entity string excludes the period.
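
The first fine point can be illustrated with a toy tokenizer that rewrites each double quote as `` or '' while still recording where the token sat in the source. This is only a sketch under simple assumptions (whitespace splitting, alternating open and close quotes); it is not the Stanford tokenizer, and the Token record and tokenize method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class NormalizedOffsetSketch {
    /** Normalized token text plus half-open offsets [begin, end) into the ORIGINAL text. */
    public record Token(String normalized, int begin, int end) {}

    /** Toy tokenizer: splits on whitespace and turns each '"' into its own `` or '' token. */
    public static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        boolean opening = true;
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            if (c == '"') {
                // Two characters after normalization, but one character in the source.
                tokens.add(new Token(opening ? "``" : "''", i, i + 1));
                opening = !opening;
                i++;
                continue;
            }
            int start = i;
            while (i < text.length()
                    && !Character.isWhitespace(text.charAt(i))
                    && text.charAt(i) != '"') {
                i++;
            }
            tokens.add(new Token(text.substring(start, i), start, i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("He said \"hi\"")) {
            System.out.println(t);
        }
    }
}
```

Spans built from the begin and end fields still index correctly into the original string, even where the normalized text is longer than the source characters it replaced.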

Popular methods of AbstractSequenceClassifier

classifySentence
Classify a List of IN. This method returns a new list of tokens, not the list of tokens passed in, a
backgroundSymbol
Returns the background class for the classifier.
classify
classifyAndWriteAnswers
classifyKBest
Takes a list of tokens and provides the K best sequence labelings of these tokens with their scores.
classifyToString
Classify the contents of a String to one of several String representations that shows the classes. P
classifyWithGlobalInformation
Classify a List of something that extends CoreMap using as additional information whatever is stored
classifyWithInlineXML
Classify the contents of a String. Plain text or XML is expected and the PlainTextDocumentReaderAndW
countResults
Count the successes and failures of the model on the given document. Fills numbers in to counters fo
getSequenceModel
getViterbiSearchGraph
labels

