Code example for Multiset

Methods: count, elementSet, size

  @Override 
  protected double weight(byte[] originalForm) {
    // The counts are adjusted so that every observed value gets an extra 0.5 count,
    // as does one hypothetical unobserved value. This smooths the estimates a bit and
    // gives the first word seen a non-zero weight of -log(1.5 / 2).
    double thisWord = dictionary.count(new String(originalForm, Charsets.UTF_8)) + 0.5;
    double allWords = dictionary.size() + dictionary.elementSet().size() * 0.5 + 0.5;
    return -Math.log(thisWord / allWords);
  } 
 
  public Multiset<String> getDictionary() {
    return dictionary;
  } 
}
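
The fragment above depends on its enclosing class and on Guava, so it cannot be run on its own. The following is a minimal standalone sketch of the same smoothed weight, assuming a plain `HashMap` as a stand-in for the `Multiset<String>` dictionary. The class name `SmoothedWeightSketch` and its `add` method are hypothetical and are not part of the original code. Here `total` plays the role of `dictionary.size()` (all occurrences) and `counts.size()` plays the role of `dictionary.elementSet().size()` (distinct words).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the original class; a HashMap replaces the Guava Multiset.
class SmoothedWeightSketch {
    // Per-word occurrence counts (the analogue of Multiset.count).
    private final Map<String, Integer> counts = new HashMap<>();
    // Total number of occurrences added (the analogue of Multiset.size).
    private long total = 0;

    void add(String word) {
        counts.merge(word, 1, Integer::sum);
        total++;
    }

    double weight(String word) {
        // Each observed word gets an extra 0.5 count; an unseen word gets 0.5 in total.
        double thisWord = counts.getOrDefault(word, 0) + 0.5;
        // Total occurrences, plus 0.5 per distinct word, plus 0.5 for one unobserved word.
        double allWords = total + counts.size() * 0.5 + 0.5;
        return -Math.log(thisWord / allWords);
    }

    public static void main(String[] args) {
        SmoothedWeightSketch dictionary = new SmoothedWeightSketch();
        dictionary.add("hello");
        // After one word: thisWord = 1.5, allWords = 1 + 0.5 + 0.5 = 2.
        System.out.println(dictionary.weight("hello"));
        // A word never added still gets a finite weight: -log(0.5 / 2).
        System.out.println(dictionary.weight("unseen"));
    }
}
```

With this adjustment the smoothed counts of all observed words plus the single hypothetical unobserved word sum to `allWords`, so the corresponding probabilities sum to 1 and no word ever gets an infinite weight.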