Base class for filters dealing with the body of text documents only.
Subclasses can safely be used as either pre-parse or post-parse handlers
restricted to text documents only (see
AbstractImporterHandler).
Since 2.5.0, when used as a pre-parse handler,
this class attempts to detect the content character
encoding, unless one was explicitly specified with
#setSourceCharset(String). Because document
parsing converts content to UTF-8, UTF-8 is always assumed when
this class is used as a post-parse handler.
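The encoding rules above can be sketched as a small decision function. This is an illustration only, not the class's actual implementation: the names CharsetResolutionSketch, resolveCharset and detectByBom are hypothetical, and the byte-order-mark check is a deliberately simplistic stand-in for whatever detection the real class performs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names come from the importer API.
final class CharsetResolutionSketch {

    // Simplistic stand-in for real encoding detection: it only looks at
    // byte-order marks and falls back to UTF-8 (an assumption of this sketch).
    static Charset detectByBom(byte[] content) {
        if (content.length >= 3
                && (content[0] & 0xFF) == 0xEF
                && (content[1] & 0xFF) == 0xBB
                && (content[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (content.length >= 2) {
            int b0 = content[0] & 0xFF;
            int b1 = content[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
        }
        return StandardCharsets.UTF_8;
    }

    // parsed == true models post-parse use, false models pre-parse use.
    static Charset resolveCharset(
            String sourceCharset, boolean parsed, byte[] content) {
        // Post-parse: parsing already converted the content to UTF-8.
        if (parsed) {
            return StandardCharsets.UTF_8;
        }
        // Pre-parse with an explicit charset: use it, skip detection.
        // Charset.forName throws an unchecked exception on unknown names.
        if (sourceCharset != null && !sourceCharset.trim().isEmpty()) {
            return Charset.forName(sourceCharset.trim());
        }
        // Pre-parse without an explicit charset: attempt detection.
        return detectByBom(content);
    }

    public static void main(String[] args) {
        byte[] utf16 = {(byte) 0xFE, (byte) 0xFF, 0, 65};
        System.out.println(resolveCharset("ISO-8859-1", false, utf16));
        System.out.println(resolveCharset(null, false, utf16));
        System.out.println(resolveCharset("ISO-8859-1", true, utf16));
    }
}
```

In this sketch the post-parse check comes first, so an explicit source charset only has an effect in pre-parse use, which matches the description above.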
Subclasses inherit this
IXMLConfigurable configuration:
<!-- parent tag has these attributes:
sourceCharset="(character encoding)"
onMatch="[include|exclude]"
-->
<restrictTo
caseSensitive="[false|true]"
field="(name of header/metadata field to match)">
(regular expression of value to match)
</restrictTo>
<!-- multiple "restrictTo" tags allowed (only one needs to match) -->
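The "only one needs to match" rule for multiple restrictTo tags can be pictured as an any-match over regular expressions. The following is a rough sketch under stated assumptions, not the library's code: RestrictToSketch, Restriction and isApplicable are hypothetical names, the regular expression is assumed to match the whole field value, and a handler with no restrictions is assumed to apply to every document.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the importer API.
final class RestrictToSketch {

    // Models one restrictTo tag: a field name plus a regular expression.
    static final class Restriction {
        final String field;
        final Pattern pattern;

        Restriction(String field, String regex, boolean caseSensitive) {
            this.field = field;
            this.pattern = Pattern.compile(
                    regex, caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
        }

        // True if any value of the field matches the expression entirely
        // (full-value matching is an assumption of this sketch).
        boolean matches(Map<String, List<String>> metadata) {
            List<String> values =
                    metadata.getOrDefault(field, Collections.emptyList());
            for (String value : values) {
                if (pattern.matcher(value).matches()) {
                    return true;
                }
            }
            return false;
        }
    }

    // A document qualifies when at least one restriction matches.
    // No restrictions at all is treated as "applies to everything"
    // (also an assumption of this sketch).
    static boolean isApplicable(
            List<Restriction> restrictions,
            Map<String, List<String>> metadata) {
        if (restrictions.isEmpty()) {
            return true;
        }
        for (Restriction restriction : restrictions) {
            if (restriction.matches(metadata)) {
                return true;
            }
        }
        return false;
    }
}
```

With caseSensitive set to false, a pattern such as text/html would also accept a field value of text/HTML; with caseSensitive set to true it would not.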