Enable reading offline tables. By default, this feature is disabled and only online tables are
scanned. When enabled, the map reduce job reads the table's files directly. If the table is not
offline, then the job will fail. If the table comes online during the map reduce job, it is
likely that the job will fail.
To use this option, the map reduce user will need access to read the Accumulo directory in
HDFS.
Reading the offline table creates the scan-time iterator stack in the map process, so any
iterators that are configured for the table must be on the mapper's classpath. The
accumulo.properties file may also need to be on the mapper's classpath if HDFS or the Accumulo
directory in HDFS is non-standard.
One way to use this feature is to clone a table, take the clone offline, and use the clone as
the input table for a map reduce job. If you plan to map reduce over the data many times, it
may be better to compact the table, clone it, take it offline, and use the clone for all
map reduce jobs. The reason to do this is that compaction will reduce each tablet in the table
to one file, and it is faster to read from one file.
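The order of operations in that workflow can be sketched as follows. This is a minimal illustration, not Accumulo's own code: `TableOps`, `RecordingOps`, `OfflineCloneSketch`, and the table names are hypothetical stand-ins so that the example compiles without the Accumulo jars. In a real job the three steps would presumably map to the `compact`, `clone`, and `offline` methods of Accumulo's `TableOperations`, whose exact signatures and checked exceptions should be taken from the version in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the three table operations the workflow needs. */
interface TableOps {
  void compact(String table, boolean wait);

  void cloneTable(String source, String clone);

  void offline(String table, boolean wait);
}

/** Test double that records each call instead of talking to a cluster. */
final class RecordingOps implements TableOps {
  final List<String> calls = new ArrayList<>();

  @Override
  public void compact(String table, boolean wait) {
    calls.add("compact " + table);
  }

  @Override
  public void cloneTable(String source, String clone) {
    calls.add("clone " + source + " -> " + clone);
  }

  @Override
  public void offline(String table, boolean wait) {
    calls.add("offline " + table);
  }
}

final class OfflineCloneSketch {
  private OfflineCloneSketch() {}

  /**
   * Prepares an offline clone of {@code source} and returns the name of the
   * table that the map reduce job should read.
   */
  static String prepareOfflineClone(
      TableOps ops, String source, String clone, boolean compactFirst) {
    if (compactFirst) {
      // Compact before cloning so the clone starts with one file per tablet.
      // Wait for completion; otherwise the clone may still see many files.
      ops.compact(source, true);
    }
    // The source table stays online; only the clone is taken offline.
    ops.cloneTable(source, clone);
    // Wait until the clone is fully offline before submitting the job.
    ops.offline(clone, true);
    return clone;
  }

  public static void main(String[] args) {
    RecordingOps ops = new RecordingOps();
    String input = prepareOfflineClone(ops, "events", "events_mr", true);
    System.out.println("input table: " + input);
    System.out.println("calls: " + ops.calls);
  }
}
```

The returned clone name is what the job would then use as its input table, with offline scanning enabled. Passing `false` for `compactFirst` corresponds to the simpler variant that skips the compaction for a one-off job.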
There are two possible advantages to reading a table's files directly out of HDFS. First, you may
see better read performance. Second, it supports speculative execution better: when reading
an online table, speculative execution can put more load on an already slow tablet server.