- <init>
Create a new Word Extractor
- getText
Grab the text, based on the WordToTextConverter. Shouldn't include any crud, but
slower than getTextFromPieces.
- getParagraphText
- close
- getFootnoteText
- appendHeaderFooter
Add the header/footer text, if it's not empty
- getCommentsText
- getMainTextboxText
- getSummaryInformation
- getTextFromPieces
Grab the text out of the text pieces. Might also include various bits of crud,
but will work in case
- stripFields
Removes any fields (e.g. macros, page markers, etc.) from the string.
- getDocSummaryInformation
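
To illustrate what the stripFields entry above describes, here is a minimal, self-contained sketch of field stripping. It is not the library's implementation. It assumes the text uses the Word binary format's field marker characters (0x13 begins a field, 0x14 separates the field code from its displayed result, 0x15 ends the field), and it assumes the displayed result should be kept while the field code is dropped. The class name FieldStripSketch and its constants are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the extractor's own code.
final class FieldStripSketch {
    private static final char FIELD_BEGIN = 0x13;     // opens a field, code follows
    private static final char FIELD_SEPARATOR = 0x14; // code ends, displayed result follows
    private static final char FIELD_END = 0x15;       // closes the field

    private FieldStripSketch() {}

    /** Drops field markers and field codes; keeps displayed results (an assumption). */
    static String stripFields(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // One entry per open field: true while we are still in its code part.
        Deque<Boolean> inCode = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            if (c == FIELD_BEGIN) {
                inCode.push(Boolean.TRUE);
            } else if (c == FIELD_SEPARATOR && !inCode.isEmpty()) {
                // Switch the innermost open field from code to result.
                inCode.pop();
                inCode.push(Boolean.FALSE);
            } else if (c == FIELD_END && !inCode.isEmpty()) {
                inCode.pop();
            } else if (!inCode.contains(Boolean.TRUE)) {
                // Plain text, or result text not nested inside any field code.
                // Stray separator/end markers outside a field are left untouched.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "Page " + FIELD_BEGIN + " PAGE " + FIELD_SEPARATOR + "3"
                + FIELD_END + " of 10";
        System.out.println(stripFields(raw)); // prints "Page 3 of 10"
    }
}
```

A field with no separator (for example a macro button with no displayed result) disappears entirely under this sketch, and a field nested inside another field's code is hidden along with it.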