Javadoc
Add label uniquely and disjointly; an intersection with
another set or an int/char label forces breaking up the set(s).
For example, if the reachable list of labels is [a..z, {k,9}, 0..9],
the disjoint list will be [{a..j,l..z}, k, 9, 0..8].
As we add NFA configurations to a DFA state, we might as well track
the set of all possible transition labels to make the DFA conversion
more efficient. Without the reachable labels, we'd need to check the
whole vocabulary space (could be 0..\uFFFF)! The problem is that
labels can be sets, which may overlap with int labels or other sets.
As we need a deterministic set of transitions from any
state in the DFA, we must make the reachable labels set disjoint.
This operation amounts to finding the character classes for this
DFA state, whereas tools like flex, which need to generate a
homogeneous DFA, must compute char classes across all states.
We are going to generate DFAs with heterogeneous states, so we
only care that the transitions out of a single state are
unique. :)
The idea for adding a new set, t, is to look for overlap with the
elements of the existing list s. Upon overlap, replace the
existing set s[i] with two new disjoint sets, s[i]-t and s[i]&t
(if s[i]-t is nil, don't add it). The remainder is t-s[i], which is
what you want to add minus what was already there. The
remainder must then be compared against elements i+1..n of s,
looking for another collision. Each collision results in a smaller
and smaller remainder. Stop when you run out of s elements or the
remainder goes to nil. If the remainder is non-nil when you run out
of s elements, add the remainder to the end.
Single element labels are treated as sets to make the code uniform.
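The splitting procedure described above can be sketched as follows. This
is a minimal illustration, not the actual implementation: the class
DisjointLabels, the helpers range and render, and the use of
TreeSet<Integer> for label sets are all assumptions made for the example.
It also assumes that s[i]-t stays at position i and s[i]&t is placed
right after it, which reproduces the ordering shown in the example list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DisjointLabels {
    /** Hypothetical helper: the set of code points lo..hi, inclusive. */
    static Set<Integer> range(int lo, int hi) {
        Set<Integer> r = new TreeSet<>();
        for (int c = lo; c <= hi; c++) r.add(c);
        return r;
    }

    /** Add label set t to list s so the elements of s stay pairwise disjoint. */
    static void addReachableLabel(List<Set<Integer>> s, Set<Integer> t) {
        Set<Integer> remainder = new TreeSet<>(t);
        for (int i = 0; i < s.size() && !remainder.isEmpty(); i++) {
            Set<Integer> common = new TreeSet<>(s.get(i));
            common.retainAll(remainder);              // s[i] & t
            if (common.isEmpty()) continue;           // no collision with s[i]
            Set<Integer> existingOnly = new TreeSet<>(s.get(i));
            existingOnly.removeAll(common);           // s[i] - t
            remainder.removeAll(common);              // t - s[i]
            if (existingOnly.isEmpty()) {
                s.set(i, common);                     // s[i] lies entirely inside t
            } else {
                s.set(i, existingOnly);               // keep s[i] - t in place
                s.add(i + 1, common);                 // assumed placement of s[i] & t
                i++;                                  // skip the piece just inserted
            }
        }
        if (!remainder.isEmpty()) s.add(remainder);   // leftover goes at the end
    }

    /** Hypothetical helper: render a set of code points as a string. */
    static String render(Set<Integer> set) {
        StringBuilder sb = new StringBuilder();
        for (int c : set) sb.append((char) c);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Set<Integer>> s = new ArrayList<>();
        addReachableLabel(s, range('a', 'z'));
        addReachableLabel(s, Set.of((int) 'k', (int) '9'));
        addReachableLabel(s, range('0', '9'));
        List<String> out = new ArrayList<>();
        for (Set<Integer> set : s) out.add(render(set));
        System.out.println(out); // prints [abcdefghijlmnopqrstuvwxyz, k, 9, 012345678]
    }
}
```

Single-element labels such as k and 9 appear here as one-element sets,
so the same loop handles them without a special case.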