How to use BypassMergeSortShuffleWriter in org.apache.spark.shuffle.sort

Best Java code snippets using org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter (Showing top 3 results out of 315)

File tmp = Utils.tempFileWith(output);
try {
 partitionLengths = writePartitionedFile(tmp);
 shuffleBlockResolver.writeIndexFileAndCommit(shuffleId, mapId, partitionLengths, tmp);
} finally {
 // temp-file cleanup is truncated in the source snippet
}

Javadoc

This class implements sort-based shuffle's hash-style shuffle fallback path. This write path writes incoming records to separate files, one file per reduce partition, then concatenates these per-partition files to form a single output file, regions of which are served to reducers. Records are not buffered in memory. It writes output in a format that can be served / consumed via org.apache.spark.shuffle.IndexShuffleBlockResolver.
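The write path described above can be illustrated with a minimal, standalone sketch. This is not Spark's implementation: the class name `BypassWriteSketch`, the use of string records, and the hash-based partitioning are all assumptions made for illustration. It only shows the shape of the approach: one open file per reduce partition, no in-memory buffering of records, then concatenation into a single output file while recording each partition's length.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; names and record format are assumptions, not Spark APIs.
public class BypassWriteSketch {

    /**
     * Writes each record to a per-partition temp file, then concatenates those
     * files into `output`. Returns the byte length of each partition's region.
     */
    public static long[] write(List<String> records, int numPartitions, Path output)
            throws IOException {
        Path[] partFiles = new Path[numPartitions];
        OutputStream[] streams = new OutputStream[numPartitions];
        try {
            // One file and one stream per reduce partition, all open at once.
            for (int p = 0; p < numPartitions; p++) {
                partFiles[p] = Files.createTempFile("part-" + p + "-", ".tmp");
                streams[p] = Files.newOutputStream(partFiles[p]);
            }
            // Records go straight to their partition's file; nothing is sorted.
            for (String record : records) {
                int p = Math.floorMod(record.hashCode(), numPartitions);
                streams[p].write((record + "\n").getBytes(StandardCharsets.UTF_8));
            }
        } finally {
            for (OutputStream s : streams) {
                if (s != null) {
                    s.close();
                }
            }
        }

        // Concatenate the per-partition files in partition order.
        long[] lengths = new long[numPartitions];
        try (OutputStream out = Files.newOutputStream(output)) {
            for (int p = 0; p < numPartitions; p++) {
                lengths[p] = Files.copy(partFiles[p], out);
                Files.delete(partFiles[p]);
            }
        }
        return lengths;
    }
}
```

Because every partition holds an open stream at the same time, the cost of this approach grows with the number of reduce partitions.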

This write path is inefficient for shuffles with large numbers of reduce partitions because it simultaneously opens separate serializers and file streams for all partitions. As a result, SortShuffleManager only selects this write path when:

no Ordering is specified,
no Aggregator is specified, and
the number of partitions is less than spark.shuffle.sort.bypassMergeThreshold.
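The three conditions above amount to a simple predicate. The sketch below is a hypothetical helper, not a Spark method: the class name, method name, and parameters are assumptions, and it uses the strict "less than" comparison exactly as the Javadoc text states it, which may differ from the comparison in a given Spark version.

```java
// Hypothetical helper mirroring the conditions listed in the Javadoc.
public class BypassSelectionSketch {

    /**
     * Returns true when the bypass write path would be eligible:
     * no ordering, no aggregator, and few enough reduce partitions.
     */
    public static boolean shouldBypassMergeSort(
            boolean hasOrdering,
            boolean hasAggregator,
            int numPartitions,
            int bypassMergeThreshold) {
        return !hasOrdering
                && !hasAggregator
                && numPartitions < bypassMergeThreshold;
    }
}
```

For example, `shouldBypassMergeSort(false, false, 50, 200)` returns true, while adding an ordering or raising the partition count to the threshold makes it return false. The value 200 is only an example threshold here.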

This code used to be part of org.apache.spark.util.collection.ExternalSorter but was refactored into its own class in order to reduce code complexity; see SPARK-7855 for details.

There have been proposals to completely remove this code path; see SPARK-6026 for details.

Most used methods

writePartitionedFile
Concatenate all of the per-partition files into a single combined file.
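The partition lengths produced by this method are what let a reader find each reducer's region inside the combined file. As a rough sketch, assuming an index is just the running sum of those lengths (the class `PartitionOffsetsSketch` is hypothetical and is not part of IndexShuffleBlockResolver), the offsets could be derived like this:

```java
// Hypothetical sketch: turns per-partition byte lengths into cumulative offsets.
public class PartitionOffsetsSketch {

    /**
     * Returns an array of length n + 1 where partition i occupies the byte
     * range [offsets[i], offsets[i + 1]) of the combined file.
     */
    public static long[] offsets(long[] partitionLengths) {
        long[] offsets = new long[partitionLengths.length + 1];
        for (int i = 0; i < partitionLengths.length; i++) {
            offsets[i + 1] = offsets[i] + partitionLengths[i];
        }
        return offsets;
    }
}
```

With lengths `{4, 0, 6}`, the offsets are `{0, 4, 4, 10}`: the second partition is empty and the third occupies bytes 4 up to (but not including) 10.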
