Javadoc
CAS serializer support for XMI and JSON formats.
There are multiple use cases.
1) normal - the consumer is independent of UIMA
- (maybe) support for delta serialization
2) service calls:
- support deserialization with out-of-type-system set-aside, and subsequent serialization with re-merging
- guarantee of using same xmi:id's as were deserialized when serializing
- support for delta serialization
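The xmi:id guarantee for service calls can be pictured with a small identity-keyed map. This is only a sketch under assumptions: the class name IdReuseSketch, its methods, and the policy of numbering new FSs above the highest id seen so far are all hypothetical and are not taken from the UIMA code.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch: remember the ids seen at deserialization and reuse them later.
final class IdReuseSketch {
    // Keyed by object identity, so two equal-looking FSs still get distinct ids.
    private final Map<Object, Integer> fsToId = new IdentityHashMap<>();
    private int maxId = 0;

    // Called while deserializing: record the id that arrived with this FS.
    void recordDeserialized(Object fs, int id) {
        fsToId.put(fs, id);
        maxId = Math.max(maxId, id);
    }

    // Called while serializing: reuse the recorded id, or allocate a fresh one
    // above every id seen so far (an assumed allocation policy).
    int idForSerialization(Object fs) {
        Integer id = fsToId.get(fs);
        if (id == null) {
            id = ++maxId;
            fsToId.put(fs, id);
        }
        return id;
    }

    public static void main(String[] args) {
        IdReuseSketch ids = new IdReuseSketch();
        Object incoming = new Object();
        ids.recordDeserialized(incoming, 7);
        System.out.println(ids.idForSerialization(incoming));
        System.out.println(ids.idForSerialization(new Object()));
    }
}
```

An FS that came in with an id keeps it on the way out; only FSs created after deserialization receive new ids.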
There is an outer class (one instance per "configuration", reusable after configuration) and
an inner class (one instance per serialize call).
These classes are the common parts of serialization between XMI and JSON, mainly having to do with
1) enqueuing the FSs to be serialized
2) serializing according to their types and features
Methods marked public are not intended for public use; they are public only so that
other users of this class in other packages can "see" them.
  XmiCasSerializer                                JsonCasSerializer
      Instance                                        Instance
     css ref ------->   CasSerializerSupport   <------ css ref

  XmiDocSerializer                                JsonDocSerializer
      Instance                                        Instance
 (1 per serialize action)                     (1 per serialize action)
     cds ref ------->     CasDocSerializer     <------- cds ref
                         csss points back
Construction:
  new Xmi/JsonCasSerializer
     initializes css with new CasSerializerSupport
  serialize method creates a new Xmi/JsonDocSerializer inner class
     constructor creates a new CasDocSerializer
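The construction sequence above can be sketched as follows. All class and field names other than css and cds are hypothetical stand-ins (ConfigSketch for CasSerializerSupport, DocSketch for CasDocSerializer, FormatSerializerSketch for Xmi/JsonCasSerializer), and the prettyPrint setting is an assumed example of configuration state, not a claim about the real fields.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for CasSerializerSupport: one instance per configuration, reusable.
final class ConfigSketch {
    boolean prettyPrint = false;   // assumed example of a reusable setting

    // Stand-in for CasDocSerializer: one instance per serialize call.
    final class DocSketch {
        final List<String> queue = new ArrayList<>();   // per-call state only

        // The inner instance can always reach back to its configuration.
        ConfigSketch config() { return ConfigSketch.this; }
    }
}

// Stand-in for Xmi/JsonCasSerializer.
final class FormatSerializerSketch {
    // css is created once, when the format-specific serializer is constructed.
    final ConfigSketch css = new ConfigSketch();

    String serialize(List<String> items) {
        // cds is created fresh for every serialize call.
        ConfigSketch.DocSketch cds = css.new DocSketch();
        cds.queue.addAll(items);
        String separator = cds.config().prettyPrint ? ",\n" : ",";
        return String.join(separator, cds.queue);
    }

    public static void main(String[] args) {
        FormatSerializerSketch serializer = new FormatSerializerSketch();
        System.out.println(serializer.serialize(List.of("a", "b")));
    }
}
```

Keeping per-call state in the inner instance is what lets one configured serializer be reused across calls without leftover state from an earlier call.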
Use Cases and Algorithms
Support set-aside for out-of-type-system FS on deserialization (record in shareData)
implies can't determine sharing status of things ref'd by features; need to depend on
the multiple-refs-allowed flag.
If multiple refs are found during serialization for a feature marked non-shared, unshare these
(for example, make 2 serializations, one or more in place).
Perhaps not considered an error.
implies need (for non-delta case) to send all FSs that were deserialized - some may be ref'd by
out-of-type-system (oots) elements
** Could ** skip this if there are no oots elements, but that could break some assumptions,
and it would apply only to non-delta - not worth doing
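The "unshare" behavior described above can be illustrated with a toy model. This is a sketch under assumptions: UnshareSketch, featureValue, and the "ref:N" output form are hypothetical, and an int[] stands in for a multi-valued FS (list or array).

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: how a multi-valued value is written depending on the flag.
final class UnshareSketch {
    // Ids handed out to values that are serialized once and referenced by id.
    private final Map<int[], Integer> ids = new IdentityHashMap<>();

    String featureValue(int[] array, boolean multipleRefsAllowed) {
        if (multipleRefsAllowed) {
            // Shared: every referrer points at the same separately serialized value.
            Integer id = ids.get(array);
            if (id == null) {
                id = ids.size() + 1;
                ids.put(array, id);
            }
            return "ref:" + id;
        }
        // Non-shared: the value is written in place. If a second referrer turns up,
        // it simply gets its own in-place copy, which "unshares" the value.
        return Arrays.stream(array)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        UnshareSketch sketch = new UnshareSketch();
        int[] shared = {1, 2};
        System.out.println(sketch.featureValue(shared, false));
        System.out.println(sketch.featureValue(shared, false));
        System.out.println(sketch.featureValue(shared, true));
    }
}
```

In this model the receiver of the non-shared form sees two independent copies, which matches the note that the situation is perhaps not treated as an error.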
Enqueuing:
There are two styles
- enqueueCommon: does **NOT** recursively enqueue features
- enqueue: calls enqueueCommon and then recursively enqueues features
enqueueCommon is called (bypassing enqueue) to defer scanning references
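The two styles can be sketched on a toy object graph. The method names enqueueCommon and enqueue come from the notes above; the Node type, the visited set, and the boolean return value are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class EnqueueSketch {
    // Hypothetical stand-in for a feature structure with reference-valued features.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    final Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
    final List<Node> queue = new ArrayList<>();

    // Adds the FS itself, at most once; does NOT look at its features.
    boolean enqueueCommon(Node fs) {
        if (fs == null || !visited.add(fs)) {
            return false;
        }
        queue.add(fs);
        return true;
    }

    // Adds the FS via enqueueCommon, then recursively enqueues what it references.
    void enqueue(Node fs) {
        if (!enqueueCommon(fs)) {
            return;   // null or already seen; this also terminates cycles
        }
        for (Node ref : fs.refs) {
            enqueue(ref);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refs.add(b);
        EnqueueSketch sketch = new EnqueueSketch();
        sketch.enqueue(a);
        for (Node n : sketch.queue) System.out.println(n.name);
    }
}
```

Calling enqueueCommon directly leaves the references of that FS unscanned, so a later pass has to walk them if they are to be serialized.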
Order and target of enqueuing:
- things in the index
-- put on "queue"
-- first, the sofas (which are the only things indexed in base view)
-- next, for each view, for each item, the FSs, but **NOT** following any feature/array refs
- things not in the index, but deserialized (incoming)
-- put on previouslySerializedFSs, no recursive descent for features
- (delta) enqueueNonsharedMultivaluedFS (lists and arrays)
-- put on modifiedEmbeddedValueFSs, no recursive descent for features
- recursive descent for
-- things in previouslySerializedFSs,
-- things in modifiedEmbeddedValueFSs
-- things in the index
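The ordering above can be sketched as three phases. This is a simplified model under assumptions: the delta-only modifiedEmbeddedValueFSs step is omitted, the Node type is hypothetical, and FSs found during descent are collected in a separate hypothetical list (reachedByDescent) purely to make the phases visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class EnqueueOrderSketch {
    // Hypothetical stand-in for a feature structure.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    final Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
    final List<Node> queue = new ArrayList<>();                    // indexed FSs
    final List<Node> previouslySerializedFSs = new ArrayList<>();  // incoming, not indexed
    final List<Node> reachedByDescent = new ArrayList<>();         // hypothetical holder

    private void addOnce(List<Node> target, Node fs) {
        if (fs != null && visited.add(fs)) target.add(fs);
    }

    // Follows references; only used in the last phase.
    private void descend(Node fs) {
        for (Node ref : fs.refs) {
            if (ref != null && visited.add(ref)) {
                reachedByDescent.add(ref);
                descend(ref);
            }
        }
    }

    void run(List<Node> sofas, List<List<Node>> views, List<Node> incoming) {
        // Phase 1: indexed things, sofas first, then each view; no descent yet.
        for (Node sofa : sofas) addOnce(queue, sofa);
        for (List<Node> view : views) {
            for (Node fs : view) addOnce(queue, fs);
        }
        // Phase 2: deserialized but not indexed; still no descent.
        for (Node fs : incoming) addOnce(previouslySerializedFSs, fs);
        // Phase 3: only now follow references, in the order listed above.
        for (Node fs : previouslySerializedFSs) descend(fs);
        for (Node fs : queue) descend(fs);
    }

    public static void main(String[] args) {
        Node sofa = new Node("sofa");
        Node a = new Node("a");
        a.refs.add(new Node("b"));
        EnqueueOrderSketch sketch = new EnqueueOrderSketch();
        sketch.run(List.of(sofa), List.of(List.of(a)), List.of());
        for (Node n : sketch.reachedByDescent) System.out.println(n.name);
    }
}
```

Deferring the descent means that an FS which is both indexed and referenced is recorded as indexed, because the index pass reaches it first.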
The descent is implemented recursively, so an arbitrarily long chain of references can cause a stack overflow error.
TODO Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
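One common way to avoid the stack overflow is to replace the call stack with an explicit work stack on the heap. The following is a minimal sketch of that general technique, not the fix tracked in the issue above; the Node type and the reachable method are hypothetical, and the visit order may differ from that of the recursive version.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class IterativeDescentSketch {
    // Hypothetical stand-in for a feature structure.
    static final class Node {
        final int id;
        final List<Node> refs = new ArrayList<>();
        Node(int id) { this.id = id; }
    }

    // Returns every node reachable from root, each exactly once.
    static List<Node> reachable(Node root) {
        List<Node> order = new ArrayList<>();
        if (root == null) {
            return order;
        }
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>();   // grows on the heap, not the call stack
        visited.add(root);
        pending.push(root);
        while (!pending.isEmpty()) {
            Node fs = pending.pop();
            order.add(fs);
            // Push in reverse so the first reference is popped first.
            for (int i = fs.refs.size() - 1; i >= 0; i--) {
                Node ref = fs.refs.get(i);
                // Mark when pushing, so a node is never pushed twice.
                if (ref != null && visited.add(ref)) {
                    pending.push(ref);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // A long linked chain, the case that is risky for recursive descent.
        Node head = new Node(0);
        Node tail = head;
        for (int i = 1; i < 200_000; i++) {
            Node next = new Node(i);
            tail.refs.add(next);
            tail = next;
        }
        System.out.println(reachable(head).size());
    }
}
```

The depth of the reference chain now costs heap memory for the pending stack instead of thread stack frames, so chain length is bounded by available heap rather than by the thread stack size.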