How to use the setOfflineScan method in org.apache.accumulo.core.client.mapreduce.InputTableConfig

Java code snippets using org.apache.accumulo.core.client.mapreduce.InputTableConfig.setOfflineScan

  .setUseIsolatedScanners(isIsolated(implementingClass, conf))
  .setUseLocalIterators(usesLocalIterators(implementingClass, conf))
  .setOfflineScan(isOfflineScan(implementingClass, conf));
return Maps.immutableEntry(tableName, queryConfig);

  .setUseIsolatedScanners(isIsolated(implementingClass, conf))
  .setUseLocalIterators(usesLocalIterators(implementingClass, conf))
  .setOfflineScan(isOfflineScan(implementingClass, conf))
  .setExecutionHints(getExecutionHints(implementingClass, conf));
return Maps.immutableEntry(tableName, queryConfig);
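
The fragments above start mid-chain, so the following is a minimal sketch of how such a chain might look as a complete unit. The class name `TableConfigEntryExample`, the helper `entryFor`, and its boolean parameters are hypothetical stand-ins for the `isIsolated(...)`, `usesLocalIterators(...)`, and `isOfflineScan(...)` lookups in the snippets, and `AbstractMap.SimpleImmutableEntry` is used in place of Guava's `Maps.immutableEntry` only to avoid an extra dependency.

```java
import java.util.AbstractMap;
import java.util.Map;

import org.apache.accumulo.core.client.mapreduce.InputTableConfig;

class TableConfigEntryExample {

  // Hypothetical helper: builds a (table name, config) pair from plain flags
  // instead of reading them from a Hadoop Configuration as the snippets do.
  static Map.Entry<String, InputTableConfig> entryFor(String tableName, boolean isolated,
      boolean localIterators, boolean offline) {
    // Each setter returns the same InputTableConfig, which is what allows chaining.
    InputTableConfig queryConfig = new InputTableConfig()
        .setUseIsolatedScanners(isolated)
        .setUseLocalIterators(localIterators)
        .setOfflineScan(offline);
    return new AbstractMap.SimpleImmutableEntry<>(tableName, queryConfig);
  }

  public static void main(String[] args) {
    // "mytable_clone" is an illustrative table name, not one from the source.
    Map.Entry<String, InputTableConfig> entry = entryFor("mytable_clone", true, false, true);
    System.out.println(entry.getKey() + " offline scan: " + entry.getValue().isOfflineScan());
  }
}
```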

Javadoc

Enable reading offline tables. By default, this feature is disabled and only online tables are scanned. This will make the map reduce job directly read the table's files. If the table is not offline, then the job will fail. If the table comes online during the map reduce job, it is likely that the job will fail.
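
As a small illustration of the default described above, the sketch below checks the flag on a fresh config and then enables it. The class and method names (`OfflineScanToggleExample`, `offlineConfig`) are hypothetical; only `InputTableConfig`, `setOfflineScan`, and `isOfflineScan` come from the API documented on this page.

```java
import org.apache.accumulo.core.client.mapreduce.InputTableConfig;

class OfflineScanToggleExample {

  // Hypothetical helper: returns a config with the offline scan feature enabled.
  static InputTableConfig offlineConfig() {
    InputTableConfig config = new InputTableConfig();
    config.setOfflineScan(true);
    return config;
  }

  public static void main(String[] args) {
    // A freshly constructed config has the feature disabled.
    System.out.println("default: " + new InputTableConfig().isOfflineScan());
    // After the setter is called, the flag reads back as enabled.
    System.out.println("enabled: " + offlineConfig().isOfflineScan());
  }
}
```

Setting the flag only records the intent in the config. The requirement that the table actually be offline is checked when the job runs, not here.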

To use this option, the map reduce user will need access to read the Accumulo directory in HDFS.

Reading the offline table will create the scan time iterator stack in the map process. So any iterators that are configured for the table will need to be on the mapper's classpath. The accumulo.properties may need to be on the mapper's classpath if HDFS or the Accumulo directory in HDFS are non-standard.

One way to use this feature is to clone a table, take the clone offline, and use the clone as the input table for a map reduce job. If you plan to map reduce over the data many times, it may be better to compact the table, clone it, take it offline, and use the clone for all map reduce jobs. The reason to do this is that compaction will reduce each tablet in the table to one file, and it is faster to read from one file.
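
The compact, clone, and offline sequence described above could be sketched roughly as follows. This assumes a `TableOperations` handle obtained elsewhere (for example from a connector's `tableOperations()`), and the class name, method name, and table names are hypothetical. Passing `true` for the flush and wait arguments is a choice made for this sketch, not something the Javadoc prescribes.

```java
import java.util.Collections;

import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.TableExistsException;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.admin.TableOperations;

class OfflineCloneExample {

  // Hypothetical helper: prepares an offline clone of `source` named `clone`.
  static void prepareOfflineClone(TableOperations ops, String source, String clone)
      throws AccumuloException, AccumuloSecurityException, TableNotFoundException,
      TableExistsException {
    // 1. Compact the whole table (null start/end rows), flushing first and
    //    waiting for completion, so each tablet ends up with one file.
    ops.compact(source, null, null, true, true);

    // 2. Clone the compacted table; no properties are overridden or excluded.
    ops.clone(source, clone, true, Collections.<String, String>emptyMap(),
        Collections.<String>emptySet());

    // 3. Take only the clone offline and wait until it is, so the source
    //    table stays online for normal use.
    ops.offline(clone, true);
  }
}
```

The clone's name would then be used as the input table, with `setOfflineScan(true)` set on its `InputTableConfig`.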

There are two possible advantages to reading a table's files directly out of HDFS. First, you may see better read performance. Second, it will support speculative execution better. When reading an online table, speculative execution can put more load on an already slow tablet server.

By default, this feature is disabled.

Popular methods of InputTableConfig

<init>: Creates a batch scan config object out of a previously serialized batch scan config object.
getIterators: Returns the iterators to be set on this configuration.
setIterators: Sets the iterators to be used in the query.
setRanges: Sets the input ranges to scan for all tables associated with this job. This will be added to any per...
fetchColumns: Restricts the columns that will be mapped over for this job for the default input table.
getFetchedColumns: Returns the columns to be fetched for this configuration.
getRanges: Returns the ranges to be queried in the configuration.
getSamplerConfiguration
isOfflineScan: Determines whether a configuration has the offline table scan feature enabled.
readFields
setAutoAdjustRanges: Controls the automatic adjustment of ranges for this job. This feature merges overlapping ranges, th...
setSamplerConfiguration: Sets the sampler configuration to use when reading from the data.
