How to use DetectorFactory in com.cybozu.labs.langdetect

Best Java code snippets using com.cybozu.labs.langdetect.DetectorFactory (Showing top 19 results out of 315)

origin: pl.edu.icm.synat/synat-process-common

public static YLanguage getLanguage(String text, Set<YLanguage> possibleLanguages) {
  try {
    Detector detector = DetectorFactory.create(0.5f);
    detector.append(text);
    return detectLanguage(possibleLanguages, detector);
  } catch (LangDetectException e) {
    log.debug("Couldn't determine content language", e);
    return YLanguage.Undetermined;
  }
}
origin: org.opencms/opencms-core

/**
 * Initializes the language detection.<p>
 */
private void initLanguageDetection() {
  try {
    // use a seed for initializing the language detection for making sure the
    // same probabilities are detected for the same document contents
    DetectorFactory.clear();
    DetectorFactory.setSeed(42L);
    DetectorFactory.loadProfile(loadProfiles(getAvailableLocales()));
  } catch (Exception e) {
    LOG.error(Messages.get().getBundle().key(Messages.INIT_I18N_LANG_DETECT_FAILED_0), e);
  }
}
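
The comment in the snippet above says that a fixed seed makes the detected probabilities repeatable for the same content. The library's internals are not shown on this page, so the following is only a minimal sketch of the general principle using java.util.Random; SeedDemo and sample are hypothetical names and are not part of langdetect.

```java
import java.util.Arrays;
import java.util.Random;

// Illustration only: SeedDemo and sample() are hypothetical, not langdetect API.
final class SeedDemo {

    // Draws n pseudo-random Gaussian values from a generator created with the given seed.
    static double[] sample(long seed, int n) {
        Random rand = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rand.nextGaussian();
        }
        return out;
    }

    public static void main(String[] args) {
        double[] a = sample(42L, 3);
        double[] b = sample(42L, 3);
        // Same seed, same sequence: the two arrays are element-wise identical.
        System.out.println(Arrays.equals(a, b)); // prints true
    }
}
```

A time-based seed, as in one of the later snippets, gives up this repeatability between runs.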
origin: apache/stanbol

public LanguageIdentifier() throws LangDetectException {
  DetectorFactory.clear();
  try {
    DetectorFactory.loadProfile(loadProfiles("profiles","profiles.cfg"));
  } catch (Exception e) {
    throw new LangDetectException(null, "Error in Initialization: "+e.getMessage());
  } 
}
origin: com.norconex.language/langdetect

/**
 * Loads the language profiles from the configured directory.
 * @return false if the profiles loaded successfully, true on error
 */
private boolean loadProfile() {
  String profileDirectory = get("directory") + "/"; 
  try {
    DetectorFactory.loadProfile(profileDirectory);
    Long seed = getLong("seed");
    if (seed != null) DetectorFactory.setSeed(seed);
    return false;
  } catch (LangDetectException e) {
    System.err.println("ERROR: " + e.getMessage());
    return true;
  }
}
 
origin: org.opencms/langdetect-opencms

DetectorFactory.clear();
int count = 0;
int langsize = profiles.size();
for (LangProfile profile : profiles) {
  DetectorFactory.addProfile(profile, count, langsize);
  count++;
}
origin: com.norconex.language/langdetect

/**
 * Load profiles from specified directory.
 * This method must be called once before language detection.
 *  
 * @param profileDirectory profile directory path
 * @throws LangDetectException  Can't open profiles(error code = {@link ErrorCode#FileLoadError})
 *                              or profile's format is wrong (error code = {@link ErrorCode#FormatError})
 */
public static void loadProfile(String profileDirectory) throws LangDetectException {
  loadProfile(new File(profileDirectory));
}
origin: com.norconex.language/langdetect

  is = new FileInputStream(file);
  LangProfile profile = JSON.decode(is, LangProfile.class);
  addProfile(profile, index, langsize);
  ++index;
} catch (JSONException e) {
origin: pl.edu.icm.synat/synat-process-common

public static synchronized void loadData() {
  if (loaded) {
    return;
  }
  loaded = true;
  List<String> profileData = new ArrayList<String>();
  try {
    Charset encoding = Charset.forName("UTF-8");
    for (YLanguage language : detectableLanguages) {
      try (InputStream stream = new ClassPathResource("langdetect-profiles/" + language.getShortCode()).getInputStream();
          BufferedReader reader = new BufferedReader(new InputStreamReader(stream, encoding));) {
        profileData.add(new String(IOUtils.toCharArray(reader)));
      }
    }
    DetectorFactory.loadProfile(profileData);
    DetectorFactory.setSeed(System.currentTimeMillis());
  } catch (IOException | LangDetectException e) {
    throw new GeneralBusinessException(e);
  }
}
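
The loadData() snippet above sets its loaded flag before the profiles are actually read, so a failed load is not retried on the next call. Whether that is intended is not stated. As a hedged sketch of an alternative, the guard below marks itself loaded only after the loader returns normally; OnceLoader is a hypothetical helper, and the Runnable stands in for whatever call loads the profiles.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: OnceLoader is a hypothetical helper, not part of langdetect.
final class OnceLoader {
    private final Runnable loader;
    private boolean loaded = false;

    OnceLoader(Runnable loader) {
        this.loader = loader;
    }

    // Runs the loader until it succeeds once; later calls return immediately.
    synchronized void ensureLoaded() {
        if (loaded) {
            return;
        }
        loader.run();   // may throw; in that case loaded stays false
        loaded = true;  // set only after the loader succeeded
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        OnceLoader once = new OnceLoader(calls::incrementAndGet);
        once.ensureLoaded();
        once.ensureLoaded();
        System.out.println(calls.get()); // prints 1
    }
}
```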
origin: com.norconex.language/langdetect

/**
 * Load profiles from a list of JSON strings, one per language profile.
 * This method must be called once before language detection.
 *  
 * @param json_profiles language profiles as JSON strings (at least two)
 * @throws LangDetectException  if fewer than two profiles are given (error code = {@link ErrorCode#NeedLoadProfileError})
 *                              or a profile's format is wrong (error code = {@link ErrorCode#FormatError})
 */
public static void loadProfile(List<String> json_profiles) throws LangDetectException {
  int index = 0;
  int langsize = json_profiles.size();
  if (langsize < 2)
    throw new LangDetectException(ErrorCode.NeedLoadProfileError, "Need at least 2 profiles");
    
  for (String json: json_profiles) {
    try {
      LangProfile profile = JSON.decode(json, LangProfile.class);
      addProfile(profile, index, langsize);
      ++index;
    } catch (JSONException e) {
      throw new LangDetectException(ErrorCode.FormatError, "profile format error");
    }
  }
}
origin: pl.edu.icm.synat/synat-process-common

public static YLanguage getLanguage(Reader text, Set<YLanguage> possibleLanguages) {
  try {
    Detector detector = DetectorFactory.create(0.5f);
    detector.append(text);
    return detectLanguage(possibleLanguages, detector);
  } catch (LangDetectException | IOException e) {
    log.debug("Couldn't determine content language", e);
    return YLanguage.Undetermined;
  }
}
origin: apache/stanbol

public List<Language> getLanguages(String text) throws LangDetectException {
  Detector detector = DetectorFactory.create();
  detector.append(text);
  return detector.getProbabilities();
}
origin: apache/stanbol

public String getLanguage(String text) throws LangDetectException {
  Detector detector = DetectorFactory.create();
  detector.append(text);
  return detector.detect();
}
 
origin: ch.epfl.bbp.nlp/bluima_commons

  public static String detect(String text) throws LangDetectException {

    Detector detector = DetectorFactory.create(0.5);
    detector.append(text);

    return detector.detect();
  }
}
origin: ViDA-NYU/ache

/**
 * Try to detect the language of the text in the String.
 * 
 * @param content the text to inspect
 * @return true if English is among the detected languages, false otherwise
 */
public Boolean isEnglish(String content) {
  try {
    if (content == null || content.isEmpty()) {
      return false;
    }
    Detector detector = DetectorFactory.create();
    detector.append(content);
    ArrayList<Language> langs = detector.getProbabilities();
    if (langs.size() == 0) {
      return false;
    }
    for (Language l : langs) {
      if (l.lang.equals("en")) {
        return true;
      }
    }
    return false;
  } catch (Exception ex) {
    logger.warn("Problem while detecting language in text: " + content, ex);
    return false;
  }
}
origin: com.norconex.language/langdetect

String text = line.substring(idx + 1);
Detector detector = DetectorFactory.create(getDouble("alpha", DEFAULT_ALPHA));
detector.append(text);
String lang = "";
origin: com.norconex.language/langdetect

/**
 * Language detection test for each file (--detectlang option)
 * 
 * <pre>
 * usage: --detectlang -d [profile directory] -a [alpha] -s [seed] [test file(s)]
 * </pre>
 * 
 */
public void detectLang() {
  if (loadProfile()) return;
  for (String filename: arglist) {
    BufferedReader is = null;
    try {
      is = new BufferedReader(new InputStreamReader(new FileInputStream(filename), "utf-8"));
      Detector detector = DetectorFactory.create(getDouble("alpha", DEFAULT_ALPHA));
      if (hasOpt("--debug")) detector.setVerbose();
      detector.append(is);
      System.out.println(filename + ":" + detector.getProbabilities());
    } catch (IOException e) {
      e.printStackTrace();
    } catch (LangDetectException e) {
      e.printStackTrace();
    } finally {
      try {
        if (is!=null) is.close();
      } catch (IOException e) {}
    }
  }
}
origin: org.opencms/opencms-core

/**
 * Returns the locale for the given text based on the language detection library.<p>
 *
 * The result will be <code>null</code> if the detection fails or the detected locale is not configured
 * in the 'opencms-system.xml' as available locale.<p>
 *
 * @param text the text to retrieve the locale for
 *
 * @return the detected locale for the given text
 */
public static Locale getLocaleForText(String text) {
  // try to detect locale by language detector
  if (isNotEmptyOrWhitespaceOnly(text)) {
    try {
      Detector detector = DetectorFactory.create();
      detector.append(text);
      String lang = detector.detect();
      Locale loc = new Locale(lang);
      if (OpenCms.getLocaleManager().getAvailableLocales().contains(loc)) {
        return loc;
      }
    } catch (LangDetectException e) {
      LOG.debug(e);
    }
  }
  return null;
}
origin: pl.edu.icm.synat/synat-process-common

public static YLanguage processLanguage(Collection<String> inputs, YLanguage currentLanguage) {
  loadData();
  boolean inputsEmpty = true;
  for (String input : inputs) {
    inputsEmpty = inputsEmpty && input.isEmpty();
  }
  if (currentLanguage.getShortCode().isEmpty() && unknownLanguages.contains(currentLanguage) && !inputsEmpty) {
    try {
      Detector detector = DetectorFactory.create(0.5);
      for (String input : inputs) {
        detector.append(input);
      }
      for (Language lang : detector.getProbabilities()) {
        YLanguage yLang = YLanguage.byCode(lang.lang);
        if (isSupported(yLang))
          return yLang;
      }
      return currentLanguage;
    } catch (LangDetectException e) {
      log.debug("Couldn't determine content language", e);
    }
  }
  return currentLanguage;
}
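
The processLanguage snippet above walks the detector's candidates in order and returns the first one the application supports. A minimal, library-free sketch of that selection step follows. Candidate is a hypothetical stand-in for the library's Language class (only its lang field appears in the snippets on this page; prob is an assumption), and the list is assumed to be ordered best-first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: FirstSupported and Candidate are hypothetical stand-ins.
final class FirstSupported {

    static final class Candidate {
        final String lang;
        final double prob; // assumed field, kept for illustration

        Candidate(String lang, double prob) {
            this.lang = lang;
            this.prob = prob;
        }
    }

    // Returns the first candidate language contained in supported,
    // or fallback when none matches. Assumes ranked is ordered best-first.
    static String pick(List<Candidate> ranked, Set<String> supported, String fallback) {
        for (Candidate c : ranked) {
            if (supported.contains(c.lang)) {
                return c.lang;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<Candidate> ranked = Arrays.asList(
                new Candidate("nl", 0.57), new Candidate("de", 0.43));
        Set<String> supported = new HashSet<>(Arrays.asList("de", "en"));
        System.out.println(pick(ranked, supported, "und")); // prints de
    }
}
```

Returning a fallback instead of null mirrors how the snippet above returns currentLanguage when nothing matches.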
origin: nu.validator/validator

Detector detector = DetectorFactory.create();
detector.append(textContent);
detector.getProbabilities();

com.cybozu.labs.langdetect.DetectorFactory

Javadoc

Language detector factory class. This class manages the initialization and construction of Detector instances. Before using the language detection library, load profiles with the DetectorFactory#loadProfile(String) method and set the initialization parameters. To detect a language, construct a Detector instance via DetectorFactory#create(). See also Detector's sample code.
  • 4x faster, thanks to an improvement based on Elmer Garduno's code. Thanks!

Most used methods

  • create
    Construct Detector instance with smoothing parameter
  • loadProfile
    Load profiles from specified directory. This method must be called once before language detection.
  • clear
    Clear the loaded language profiles so that the factory can be initialized again
  • setSeed
  • addProfile
  • createDetector
