A Base for
TableInputFormats. Receives a
HTable, a
byte[] of input columns and optionally a
Filter.
Subclasses may use other TableRecordReader implementations.
Subclasses MUST ensure initializeTable(Connection, TableName) is called for an instance to
function properly. Each of the entry points to this class used by the MapReduce framework,
#getRecordReader(InputSplit,JobConf,Reporter) and
#getSplits(JobConf,int),
will call
#initialize(JobConf) as a convenient centralized location to handle
retrieving the necessary configuration information. If your subclass overrides either of these
methods, either call the parent version or call initialize yourself.
An example of a subclass:
class ExampleTIF extends TableInputFormatBase {
@Override
protected void initialize(JobConf context) throws IOException {
// We are responsible for the lifecycle of this connection until we hand it over in
// initializeTable.
Connection connection =
ConnectionFactory.createConnection(HBaseConfiguration.create(job));
TableName tableName = TableName.valueOf("exampleTable");
// mandatory. once passed here, TableInputFormatBase will handle closing the connection.
initializeTable(connection, tableName);
byte[][] inputColumns = new byte [][] { Bytes.toBytes("columnA"),
Bytes.toBytes("columnB") };
// mandatory
setInputColumns(inputColumns);
// optional, by default we'll get everything for the given columns.
Filter exampleFilter = new RowFilter(CompareOp.EQUAL, new RegexStringComparator("aa.*"));
setRowFilter(exampleFilter);
}
}