A fully asynchronous, thread-safe, modern HBase client.
Unlike the traditional HBase client (HTable), this client should be
instantiated only once. You can use it with any number of tables at the
same time. The only case where you should have multiple instances is when
you want to use multiple different clusters at the same time.
If you play by the rules, this client is (in theory :D) completely
thread-safe. Read the documentation carefully to know what the requirements
are for this guarantee to apply.
This client is fully non-blocking: any operation that would otherwise block
returns a Deferred instance to which you can attach a Callback chain
that will execute when the asynchronous operation completes.
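As a rough illustration of the callback-chain idea, the sketch below uses the
JDK's CompletableFuture as a stand-in for Deferred so that it runs without any
HBase dependency. It is an analogy only: the class CallbackChainSketch and the
methods fetchCellAsync and valueLengthAsync are hypothetical, and Deferred's
own methods differ from the ones shown here.

```java
import java.util.concurrent.CompletableFuture;

public class CallbackChainSketch {

  // Hypothetical stand-in for a non-blocking RPC: it returns immediately with
  // a future that another thread completes later.
  static CompletableFuture<String> fetchCellAsync(String rowKey) {
    return CompletableFuture.supplyAsync(() -> "value-of-" + rowKey);
  }

  // Attaches a two-step callback chain. Each step receives the result of the
  // previous one once the asynchronous operation has completed.
  static CompletableFuture<Integer> valueLengthAsync(String rowKey) {
    return fetchCellAsync(rowKey)
        .thenApply(String::toUpperCase)  // first callback
        .thenApply(String::length);      // second callback
  }

  public static void main(String[] args) {
    // join() blocks; it is used here only so the demo can print the result.
    System.out.println(valueLengthAsync("row1").join());
  }
}
```

The caller never waits for the RPC itself; it only describes what should
happen to the result, which is the pattern the Callback chain follows.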
Note regarding HBaseRpc instances passed to this class
Every HBaseRpc passed to a method of this class should not be changed or
re-used until the Deferred returned by that method calls you back.
Changing or re-using any HBaseRpc for an RPC in flight will lead to
unpredictable results and voids your warranty.
Data Durability
Some methods or RPC types take a durable argument. When an edit is
requested to be durable, the success of the RPC guarantees that the edit is
safely and durably stored by HBase and won't be lost. In case of server
failures, the edit won't be lost, although it may become momentarily
unavailable. Setting the durable argument to false makes
the operation complete faster (and puts a lot less strain on HBase), but
removes this durability guarantee. In case of a server failure, the edit
may (or may not) be lost forever. When in doubt, leave it set to true (or
use the corresponding method that doesn't accept a durable argument, as it
will default to true). Setting it to false is useful in cases where data
loss is acceptable, e.g. during batch imports
(where you can re-run the whole import in case of a failure), or when you
intend to do statistical analysis on the data (in which case some missing
data won't affect the results as long as the data loss caused by machine
failures preserves the distribution of your data, which depends on how
you're building your row keys and how you're using HBase, so be careful).
Bear in mind that this durability guarantee holds only once the RPC has
completed successfully. Any edit temporarily buffered on the client side
or in-flight will be lost if the client itself crashes. You can control
how much buffering is done by the client by using #setFlushInterval, and
you can force-flush the buffered edits by calling #flush. When you're done
using HBase, you must not just give up your reference to your HBaseClient;
you must shut it down gracefully by calling #shutdown. If you fail to do
this, then all edits still buffered by the client will be lost.
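To make the buffering behaviour concrete, here is a toy model of a client that
holds edits locally until they are flushed. It is a sketch under stated
assumptions, not this client's implementation: ToyClient, its methods, and
storedAfter are all hypothetical names, and a second in-memory list stands in
for the server.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferedEditsSketch {

  // Toy model only: a client that buffers edits locally before sending them.
  static final class ToyClient {
    private final List<String> buffered = new ArrayList<>();  // still on the client
    private final List<String> stored = new ArrayList<>();    // stands in for the server

    void put(String edit) {
      buffered.add(edit);
    }

    // Sends every buffered edit to the (simulated) server.
    void flush() {
      stored.addAll(buffered);
      buffered.clear();
    }

    // A graceful shutdown flushes whatever is still buffered.
    void shutdown() {
      flush();
    }

    int storedCount() {
      return stored.size();
    }
  }

  // Returns how many of two edits reach the server, depending on whether the
  // client is shut down gracefully or its reference is simply dropped.
  static int storedAfter(boolean gracefulShutdown) {
    ToyClient client = new ToyClient();
    client.put("edit-1");
    client.put("edit-2");
    if (gracefulShutdown) {
      client.shutdown();
    }
    return client.storedCount();
  }

  public static void main(String[] args) {
    System.out.println("graceful shutdown: " + storedAfter(true) + " edits stored");
    System.out.println("reference dropped: " + storedAfter(false) + " edits stored");
  }
}
```

In this model, dropping the reference without calling shutdown leaves both
edits in the local buffer, which mirrors the data loss described above.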
NOTE: This entire section assumes that you use a distributed file
system that provides HBase with the required durability semantics. If
you use HDFS, make sure you have a version of HDFS that provides HBase
the necessary API and semantics to durably store its data.
throws clauses
None of the asynchronous methods in this API are expected to throw an
exception. But the Deferred object they return to you can carry an
exception that you should handle (using "errbacks", see the javadoc of
Deferred). In order to be able to do proper asynchronous error
handling, you need to know what types of exceptions you're expected to face
in your errbacks. In order to document that, the methods of this API use
javadoc's @throws to spell out the exception types you should
handle in your errback. Asynchronous exceptions will be indicated as such
in the javadoc with "(deferred)".
For instance, if a method foo is documented as throwing an
UnknownScannerException and returns a Deferred<Whatever>,
then you should use the method like so:
HBaseClient client = ...;
Deferred<Whatever> d = client.foo();
d.addCallbacks(new Callback<SomethingElse, Whatever>() {
  // Callback: runs when the RPC completes successfully.
  public SomethingElse call(Whatever arg) {
    LOG.info("Yay, RPC completed successfully!");
    return new SomethingElse(arg.getWhateverResult());
  }
  public String toString() {
    return "handle foo response";
  }
},
new Callback<Object, Exception>() {
  // Errback: runs when the RPC fails.
  public Object call(Exception arg) {
    if (arg instanceof UnknownScannerException) {
      LOG.error("Oops, we used the wrong scanner?", arg);
      return otherAsyncOperation();  // returns a Deferred
    }
    LOG.error("Sigh, the RPC failed and we don't know what to do", arg);
    return arg;  // Pass on the error to the next errback (if any).
  }
  public String toString() {
    return "foo errback";
  }
});
This code calls foo, and upon successful completion transforms the
result from a Whatever to a SomethingElse (which will then
be given to the next callback in the chain, if any). When there's a
failure, the errback is called instead and it attempts to handle a
particular type of exception by retrying the operation differently.