A rudimentary XLSX -> CSV processor modeled on the
POI sample program XLS2CSVmra from the package
org.apache.poi.hssf.eventusermodel.examples.
As with the HSSF version, this tries to spot missing
rows and cells, and output empty entries for them.
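The gap handling described above can be sketched roughly as follows. This is a minimal illustration, not the processor's actual code: the class and method names (GapFillSketch, columnIndex, toCsvRow, fillMissingRows) are hypothetical, cell references are assumed to be in A1 style, row numbers are assumed to be zero-based, and CSV quoting is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class GapFillSketch {

    /** Converts the column letters of an A1-style reference to a zero-based index (A=0, Z=25, AA=26). */
    static int columnIndex(String cellRef) {
        int col = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = cellRef.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                break; // reached the row digits
            }
            col = col * 26 + (ch - 'A' + 1);
        }
        return col - 1;
    }

    /** Builds one CSV line, emitting an empty field for every column that has no cell. */
    static String toCsvRow(String[] cellRefs, String[] values, int minColumns) {
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < cellRefs.length; i++) {
            int col = columnIndex(cellRefs[i]);
            while (fields.size() < col) {
                fields.add(""); // cell missing from the sheet data
            }
            fields.add(values[i]);
        }
        while (fields.size() < minColumns) {
            fields.add(""); // pad short rows to the requested width
        }
        return String.join(",", fields);
    }

    /** Inserts an empty line for every row number that is skipped in the input. */
    static List<String> fillMissingRows(int[] rowNumbers, String[] csvRows) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < rowNumbers.length; i++) {
            while (out.size() < rowNumbers[i]) {
                out.add(""); // row missing from the sheet data
            }
            out.add(csvRows[i]);
        }
        return out;
    }
}
```

For example, cells A1 and C1 with a minimum width of four columns produce a line with an empty second and fourth field.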
Data sheets are read using a SAX parser to keep the
memory footprint relatively small, so this should be
able to read enormous workbooks. The styles table and
the shared-string table must be kept in memory. The
standard POI styles table class is used, but a custom
(read-only) class is used for the shared-string table
because the standard POI SharedStringsTable grows very
quickly with the number of unique strings.
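The streaming approach can be illustrated with the JDK's own SAX parser. The sketch below is an assumption-laden simplification rather than this processor's implementation: SaxSheetSketch and convert are hypothetical names, the shared strings are modeled as a plain read-only array, the sheet XML is passed as a string without namespaces, and only the "c", "v" and "row" elements are handled (no styles, inline strings, gap filling or CSV quoting).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, for illustration only.
class SaxSheetSketch {

    /** Streams simplified sheet XML and returns one CSV line per row element. */
    static List<String> convert(String sheetXml, String[] sharedStrings) {
        List<String> csvLines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final List<String> fields = new ArrayList<>();
            private final StringBuilder text = new StringBuilder();
            private boolean sharedString; // true when the current cell has t="s"
            private boolean inValue;      // true while inside a <v> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("c".equals(qName)) {
                    sharedString = "s".equals(attrs.getValue("t"));
                } else if ("v".equals(qName)) {
                    text.setLength(0);
                    inValue = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inValue) {
                    text.append(ch, start, length); // may be called several times per value
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("v".equals(qName)) {
                    inValue = false;
                    String raw = text.toString();
                    // Shared-string cells store an index into the string table.
                    fields.add(sharedString ? sharedStrings[Integer.parseInt(raw)] : raw);
                } else if ("row".equals(qName)) {
                    csvLines.add(String.join(",", fields));
                    fields.clear(); // only the current row is held in memory
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(sheetXml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
        return csvLines;
    }
}
```

Because the handler keeps only the current row's fields, memory use in a sketch like this depends on the size of the string table rather than on the number of rows.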
For a more advanced implementation of SAX event parsing
of XLSX files, see
XSSFEventBasedExcelExtractor and
XSSFSheetXMLHandler. In many cases it may be
possible to use those classes directly with a custom
SheetContentsHandler, with no SAX code of your own
needed.