/**
 * Executes the fork stage: applies the fork function to every input record,
 * persists the forked RDD, logs record-count and size statistics, and stores
 * the result for the downstream grouping step.
 */
public final void execute() {
    // Accumulators must be registered with the context before executor tasks run.
    this.forkFunction.registerAccumulators(this.inputRDD.rdd().sparkContext());

    final JavaRDD<ForkData<DI>> forked =
        this.inputRDD.flatMap(this.forkFunction).persist(this.persistLevel);

    // Tag the RDD with the job name so it is identifiable in the Spark UI.
    final String jobName = SparkJobTracker.getJobName(this.inputRDD.rdd().sparkContext());
    forked.setName(String.format("%s-%s", jobName, forked.id()));

    // count() is an action, so it also materializes the persisted RDD here.
    final long recordCount = forked.count();
    log.info("#processed records :{} name:{}", recordCount, forked.name());

    final Optional<RDDInfo> maybeInfo = SparkUtil.getRddInfo(forked.context(), forked.id());
    if (maybeInfo.isPresent()) {
        final RDDInfo info = maybeInfo.get();
        log.info("rddInfo -> name:{} partitions:{} size:{}", forked.name(), info.numPartitions(),
            info.diskSize() + info.memSize());
    }

    this.groupRDD = Optional.of(forked);
}