CmsExtractorPdf

How to use CmsExtractorPdf in org.opencms.search.extractors

Best Java code snippets using org.opencms.search.extractors.CmsExtractorPdf (Showing top 7 results out of 315)

origin: org.opencms/opencms-solr

String result = removeControlChars(stripper.getText(pdfDocument));
StringBuffer content = new StringBuffer(result);
if (CmsStringUtil.isNotEmpty(result)) {
  combineContentItem(info.getTitle(), I_CmsExtractionResult.ITEM_TITLE, content, contentItems);
  combineContentItem(info.getKeywords(), I_CmsExtractionResult.ITEM_KEYWORDS, content, contentItems);
  combineContentItem(info.getSubject(), I_CmsExtractionResult.ITEM_SUBJECT, content, contentItems);
  combineContentItem(info.getAuthor(), I_CmsExtractionResult.ITEM_AUTHOR, content, contentItems);
  combineContentItem(info.getCreator(), I_CmsExtractionResult.ITEM_CREATOR, content, contentItems);
  combineContentItem(info.getProducer(), I_CmsExtractionResult.ITEM_PRODUCER, content, contentItems);
origin: org.opencms/opencms-core

  /**
   * @see org.opencms.search.extractors.I_CmsTextExtractor#extractText(java.io.InputStream, java.lang.String)
   */
  @Override
  public I_CmsExtractionResult extractText(InputStream in) throws Exception {

    return extractText(in, new PDFParser());
  }
}
origin: org.opencms/opencms-core

  textExtractor = CmsExtractorPdf.getExtractor();
} else if (path1.endsWith(".doc") && path2.endsWith(".doc")) {
  textExtractor = CmsExtractorMsOfficeOLE2.getExtractor();
origin: org.opencms/org.opencms.workplace

  textExtractor = CmsExtractorPdf.getExtractor();
} else if (path1.endsWith(".doc") && path2.endsWith(".doc")) {
  textExtractor = CmsExtractorMsOfficeOLE2.getExtractor();
origin: org.opencms/opencms-solr

  textExtractor = CmsExtractorPdf.getExtractor();
} else if (path1.endsWith(".doc") && path2.endsWith(".doc")) {
  textExtractor = CmsExtractorMsWord.getExtractor();
origin: org.opencms/opencms-solr

/**
 * Returns the raw text content of a given vfs resource containing Adobe PDF data.<p>
 * 
 * @see org.opencms.search.documents.I_CmsSearchExtractor#extractContent(CmsObject, CmsResource, CmsSearchIndex)
 */
public I_CmsExtractionResult extractContent(CmsObject cms, CmsResource resource, CmsSearchIndex index)
throws CmsIndexException, CmsException {
  CmsFile file = readFile(cms, resource);
  try {
    return CmsExtractorPdf.getExtractor().extractText(file.getContents());
  } catch (Exception e) {
    if (e instanceof CryptographyException) {
      throw new CmsIndexException(Messages.get().container(
        Messages.ERR_DECRYPTING_RESOURCE_1,
        resource.getRootPath()), e);
    }
    if (e instanceof InvalidPasswordException) {
      // default password "" was wrong.
      throw new CmsIndexException(Messages.get().container(
        Messages.ERR_PWD_PROTECTED_1,
        resource.getRootPath()), e);
    }
    throw new CmsIndexException(
      Messages.get().container(Messages.ERR_TEXT_EXTRACTION_1, resource.getRootPath()),
      e);
  }
}
origin: org.opencms/opencms-core

/**
 * Returns the raw text content of a given vfs resource containing Adobe PDF data.<p>
 *
 * @see org.opencms.search.documents.I_CmsSearchExtractor#extractContent(CmsObject, CmsResource, CmsSearchIndex)
 */
public I_CmsExtractionResult extractContent(CmsObject cms, CmsResource resource, CmsSearchIndex index)
throws CmsIndexException, CmsException {
  logContentExtraction(resource, index);
  CmsFile file = readFile(cms, resource);
  try {
    return CmsExtractorPdf.getExtractor().extractText(file.getContents());
  } catch (Exception e) {
    if (e instanceof CryptographyException) {
      throw new CmsIndexException(
        Messages.get().container(Messages.ERR_DECRYPTING_RESOURCE_1, resource.getRootPath()),
        e);
    }
    if (e instanceof InvalidPasswordException) {
      // default password "" was wrong.
      throw new CmsIndexException(
        Messages.get().container(Messages.ERR_PWD_PROTECTED_1, resource.getRootPath()),
        e);
    }
    throw new CmsIndexException(
      Messages.get().container(Messages.ERR_TEXT_EXTRACTION_1, resource.getRootPath()),
      e);
  }
}
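
The two extractContent snippets above map any failure from extractText onto an index exception, choosing the message by the cause's type. A minimal self-contained sketch of that mapping follows. The nested exception classes are stand-ins declared only so the example compiles without PDFBox or OpenCms on the classpath, and the class name, the wrap helper, the message strings, and the sample path are all hypothetical; the real code builds its messages through Messages.get().container(...).

```java
public class ExtractionErrorSketch {

    // Stand-ins for the exception types named in the snippets above (assumption:
    // declared here only to keep the sketch self-contained).
    static class CryptographyException extends Exception {}
    static class InvalidPasswordException extends Exception {}

    // Stand-in for CmsIndexException; keeps the original exception as the cause.
    static class IndexException extends Exception {
        IndexException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Picks a message by the type of the failure, most specific cases first.
    static IndexException wrap(Exception e, String rootPath) {
        if (e instanceof CryptographyException) {
            return new IndexException("decrypting failed: " + rootPath, e);
        }
        if (e instanceof InvalidPasswordException) {
            // Mirrors the "default password was wrong" case in the snippets.
            return new IndexException("password protected: " + rootPath, e);
        }
        return new IndexException("text extraction failed: " + rootPath, e);
    }

    public static void main(String[] args) {
        IndexException ex = wrap(new InvalidPasswordException(), "/sites/default/a.pdf");
        System.out.println(ex.getMessage()); // prints "password protected: /sites/default/a.pdf"
    }
}
```

Keeping the original exception as the cause, as the snippets do, preserves the underlying stack trace for whoever reads the index log.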
org.opencms.search.extractors.CmsExtractorPdf

Javadoc

Extracts the text from a PDF document.

Most used methods

  • getExtractor
    Returns an instance of this text extractor.
  • combineContentItem
  • extractText
  • removeControlChars
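
The getExtractor accessor and the extension-based dispatch seen in the snippets can be sketched in a self-contained form. Everything here is a stand-in, not OpenCms code: the TextExtractor interface, the two extractor classes, and the choose helper are hypothetical. Two points are assumptions, since the snippets are truncated: that getExtractor returns a shared instance, and that the PDF branch tests for a ".pdf" suffix on both paths.

```java
public class ExtractorDispatchSketch {

    // Hypothetical stand-in for a text extractor; not an OpenCms interface.
    interface TextExtractor {
        String name();
    }

    // Mirrors the static getExtractor() accessor; a shared instance is assumed.
    static final class PdfExtractor implements TextExtractor {
        private static final PdfExtractor INSTANCE = new PdfExtractor();
        private PdfExtractor() {}
        static PdfExtractor getExtractor() { return INSTANCE; }
        public String name() { return "pdf"; }
    }

    static final class WordExtractor implements TextExtractor {
        private static final WordExtractor INSTANCE = new WordExtractor();
        private WordExtractor() {}
        static WordExtractor getExtractor() { return INSTANCE; }
        public String name() { return "doc"; }
    }

    // Returns an extractor only when both paths share a supported extension,
    // otherwise null. The ".pdf" condition is assumed; the ".doc" one is from the snippets.
    static TextExtractor choose(String path1, String path2) {
        if (path1.endsWith(".pdf") && path2.endsWith(".pdf")) {
            return PdfExtractor.getExtractor();
        } else if (path1.endsWith(".doc") && path2.endsWith(".doc")) {
            return WordExtractor.getExtractor();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(choose("a.pdf", "b.pdf").name()); // prints "pdf"
        System.out.println(choose("a.pdf", "b.doc"));        // prints "null"
    }
}
```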
