This class may be used to produce a detailed report of the
segment-by-segment performance of a given classifier on given labeled
testing data. Segment-by-segment performance is computed by using a
specified Token classifier to induce the predicted segments, and then
computing precision, recall, and F1 measures on those segments.
A predicted segment is judged as different from a labeled segment if the
two segments start or end at different Tokens, or if they have
different types.
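The matching rule above can be illustrated with a minimal sketch. The class SegmentScorer and its nested types are hypothetical and are not part of LBJava. The sketch assumes a segment is identified by its first token index, its last token index, and its type, and that a prediction is correct only when all three agree with a labeled segment.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical helper (not an LBJava class) that scores segments by exact match. */
final class SegmentScorer {
    /** A segment: first and last token index (inclusive) plus its type. */
    static final class Segment {
        final int start;
        final int end;
        final String type;

        Segment(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Segment)) return false;
            Segment s = (Segment) o;
            // Two segments match only if boundaries and type all agree.
            return start == s.start && end == s.end && type.equals(s.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    /** Precision, recall, and F1 over exactly matching segments. */
    static final class Scores {
        final double precision;
        final double recall;
        final double f1;

        Scores(double precision, double recall, double f1) {
            this.precision = precision;
            this.recall = recall;
            this.f1 = f1;
        }
    }

    static Scores score(List<Segment> labeled, List<Segment> predicted) {
        Set<Segment> gold = new HashSet<>(labeled);
        Set<Segment> guess = new HashSet<>(predicted);
        long correct = guess.stream().filter(gold::contains).count();
        // Assumption: an empty denominator yields a score of 0.
        double p = guess.isEmpty() ? 0.0 : (double) correct / guess.size();
        double r = gold.isEmpty() ? 0.0 : (double) correct / gold.size();
        double f1 = (p + r == 0.0) ? 0.0 : 2 * p * r / (p + r);
        return new Scores(p, r, f1);
    }
}
```

Under this rule, a predicted segment that overlaps a labeled one but ends one token later earns no partial credit: it counts against both precision and recall.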
It is assumed that both of the specified Token classifiers
(one giving labels and the other giving predictions) produce discrete
predictions of the form B-type, I-type, and O
to represent the beginning of a segment of type type, a token inside
a segment of type type, and a token outside of any segment, respectively.
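As a rough illustration of the tagging scheme, the sketch below turns a sequence of B-type, I-type, and O tags into segments. BioDecoder is a hypothetical name, not an LBJava class, and the handling of an I-type tag that does not continue an open segment of the same type (it is ignored here) is an assumption rather than documented BIOTester behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an LBJava class) that decodes BIO tags into segments. */
final class BioDecoder {
    /** Returns segments formatted as "type[start,end]" with inclusive token indices. */
    static List<String> segments(List<String> tags) {
        List<String> result = new ArrayList<>();
        int start = -1;
        String type = null; // type of the currently open segment, if any
        // Iterate one step past the end so a trailing segment is closed.
        for (int i = 0; i <= tags.size(); i++) {
            String tag = i < tags.size() ? tags.get(i) : "O";
            boolean continues = type != null && tag.equals("I-" + type);
            if (type != null && !continues) {
                result.add(type + "[" + start + "," + (i - 1) + "]");
                type = null;
            }
            if (tag.startsWith("B-")) {
                start = i;
                type = tag.substring(2);
            }
            // Assumption: an I- tag with no matching open segment starts nothing.
        }
        return result;
    }
}
```

Note that two adjacent segments of the same type stay distinct because the second one begins with its own B-type tag.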
It is also assumed that the specified
edu.illinois.cs.cogcomp.lbjava.parse.Parser produces
Tokens linked to each other via the previous and next
fields inherited from
edu.illinois.cs.cogcomp.lbjava.parse.LinkedChild. In order to invoke this class as a
program on the command line, it must also be the case that the parser
provides a constructor with a single String argument.
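The constructor requirement makes sense if the parser is created from its class name at run time. The sketch below shows one way that could work using standard Java reflection. ParserLoader and DemoParser are hypothetical names, and whether BIOTester instantiates the parser exactly this way is an assumption.

```java
import java.lang.reflect.Constructor;

/** Hypothetical helper (not an LBJava class) that builds a parser from its class name. */
final class ParserLoader {
    /** Stand-in for a real parser; it only records the file name it was given. */
    public static final class DemoParser {
        final String file;

        public DemoParser(String file) {
            this.file = file;
        }
    }

    /** Instantiates className through its public single-String constructor. */
    static Object instantiate(String className, String testFile) {
        try {
            Class<?> parserClass = Class.forName(className);
            // Throws NoSuchMethodException if no public (String) constructor exists.
            Constructor<?> ctor = parserClass.getConstructor(String.class);
            return ctor.newInstance(testFile);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot construct " + className + " from a single String", e);
        }
    }
}
```

A parser class without such a constructor would fail at the getConstructor call, which is why the requirement applies only to command-line use and not to code that constructs the parser directly.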
Command Line Usage
java edu.illinois.cs.cogcomp.lbjava.nlp.seg.BIOTester <classifier> <labeler> <parser> <test file>
Input
The first three arguments must be fully qualified class names. The fourth
is the name of a file containing labeled testing data to be parsed by the
parser.
Output
The output is generated by the
edu.illinois.cs.cogcomp.lbjava.classify.TestDiscrete class.