eu.stratosphere.api.java.record.io.CsvOutputFormat$ConfigBuilder Java code examples

.recordDelimiter('\n')
.fieldDelimiter('|')
.lenient(true)
.field(IntValue.class, 0)
.field(IntValue.class, 1)
.field(IntValue.class, 2);

result.setDegreeOfParallelism(numSubTasks);
CsvOutputFormat.configureRecordFormat(result)
  .recordDelimiter('\n')
  .fieldDelimiter('|')
  .lenient(true)
  .field(IntValue.class, 1)
  .field(StringValue.class, 0)
  .field(IntValue.class, 2);

.recordDelimiter('\n')
.fieldDelimiter('|')
.lenient(true)
.field(LongValue.class, 0)
.field(IntValue.class, 1)
.field(DoubleValue.class, 2);

result.setDegreeOfParallelism(numSubtasks);
CsvOutputFormat.configureRecordFormat(result)
  .recordDelimiter('\n')
  .fieldDelimiter('|')
  .field(IntValue.class, 0)
  .field(StringValue.class, 1);

.recordDelimiter('\n')
.fieldDelimiter(' ')
.field(LongValue.class, 0)
.field(LongValue.class, 1);

.recordDelimiter('\n')
.fieldDelimiter(' ')
.field(LongValue.class, 0)
.field(DoubleValue.class, 1);

FileDataSink out = new FileDataSink(new CsvOutputFormat(), OUT_FILE, reduceNode, "Word Counts");
CsvOutputFormat.configureRecordFormat(out)
  .recordDelimiter('\n')
  .fieldDelimiter(' ')
  .lenient(true)
  .field(StringValue.class, 0)
  .field(IntValue.class, 1);

.recordDelimiter('\n')
.fieldDelimiter(' ')
.field(StringValue.class, 0)
.field(IntValue.class, 1);

.recordDelimiter('\n')
.fieldDelimiter(' ')
.field(StringValue.class, 0)
.field(StringValue.class, 1)
.field(StringValue.class, 2);

.recordDelimiter('\n')
.fieldDelimiter(',')
.lenient(true)
.field(IntValue.class, 0)
.field(IntValue.class, 1)
.field(IntValue.class, 2);

FileDataSink out = new FileDataSink(new CsvOutputFormat(), output, reducer, "Word Counts");
CsvOutputFormat.configureRecordFormat(out)
  .recordDelimiter('\n')
  .fieldDelimiter(' ')
  .field(StringValue.class, 0)
  .field(IntValue.class, 1);

@Override
public Plan getPlan(String... args) {
  
  // parse job parameters
  int numSubTasks = (args.length > 0 ? Integer.parseInt(args[0]) : 1);
  String dataInput = (args.length > 1 ? args[1] : "");
  String output = (args.length > 2 ? args[2] : "");
  @SuppressWarnings("unchecked")
  CsvInputFormat format = new CsvInputFormat(' ', IntValue.class, IntValue.class);
  FileDataSource input = new FileDataSource(format, dataInput, "Input");
  
  // create the reduce operator and set the key to the first field
  ReduceOperator sorter = ReduceOperator.builder(new IdentityReducer(), IntValue.class, 0)
    .input(input)
    .name("Reducer")
    .build();
  // sort the records within each group by the second field
  sorter.setGroupOrder(new Ordering(1, IntValue.class, Order.ASCENDING));
  // create and configure the output format
  FileDataSink out = new FileDataSink(new CsvOutputFormat(), output, sorter, "Sorted Output");
  CsvOutputFormat.configureRecordFormat(out)
    .recordDelimiter('\n')
    .fieldDelimiter(' ')
    .field(IntValue.class, 0)
    .field(IntValue.class, 1);
  
  Plan plan = new Plan(out, "SecondarySort Example");
  plan.setDefaultParallelism(numSubTasks);
  return plan;
}
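
The plan above keys records on field 0 and orders each group by field 1 before writing them out as space-separated lines. The following is a minimal plain-Java sketch of that grouping and ordering, with no Stratosphere classes involved. The class name `SecondarySortSketch`, the method `groupAndSort`, and the sample input are hypothetical; the `TreeMap` is used only to make the printed order deterministic, and the order across groups is an artifact of this sketch rather than something the plan is known to guarantee.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Stratosphere API.
final class SecondarySortSketch {

    // Groups {key, value} pairs by key, then sorts the values of each group ascending.
    static Map<Integer, List<Integer>> groupAndSort(int[][] records) {
        Map<Integer, List<Integer>> groups = new TreeMap<>();
        for (int[] record : records) {
            groups.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        for (List<Integer> values : groups.values()) {
            Collections.sort(values);
        }
        return groups;
    }

    public static void main(String[] args) {
        int[][] input = { {2, 9}, {1, 5}, {2, 3}, {1, 4} };
        // Print one "key value" line per record, using ' ' and '\n' as delimiters.
        groupAndSort(input).forEach((key, values) ->
            values.forEach(value -> System.out.println(key + " " + value)));
    }
}
```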

.recordDelimiter('\n')
.fieldDelimiter(' ')
.field(StringValue.class, 0);

public Plan getPlan(int numSubTasks, String output) {
  List<Object> tmp = new ArrayList<Object>();
  int pos = 0;
  for (String s : WordCountData.COUNTS.split("\n")) {
    List<Object> tmpInner = new ArrayList<Object>();
    tmpInner.add(pos++);
    tmpInner.add(Integer.parseInt(s.split(" ")[1]));
    tmp.add(tmpInner);
  }
  // test serializable iterator input, the input record is {id, word}
  CollectionDataSource source = new CollectionDataSource(new SerializableIteratorTest(), "test_iterator");
  // test collection input, the input record is {id, count}
  CollectionDataSource source2 = new CollectionDataSource(tmp, "test_collection");
  JoinOperator join = JoinOperator.builder(Join.class, IntValue.class, 0, 0)
    .input1(source).input2(source2).build();
  FileDataSink out = new FileDataSink(new CsvOutputFormat(), output, join, "Collection Join");
  CsvOutputFormat.configureRecordFormat(out)
    .recordDelimiter('\n')
    .fieldDelimiter(' ')
    .field(StringValue.class, 0)
    .field(IntValue.class, 1);
  Plan plan = new Plan(out, "CollectionDataSource");
  plan.setDefaultParallelism(numSubTasks);
  return plan;
}

Javadoc

A builder used to set parameters to the output format's configuration in a fluent way.

Most used methods

field
fieldDelimiter
recordDelimiter
<init>
Creates a new builder for the given configuration.
lenient
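
To show what "fluent" means for the methods listed above, here is a minimal self-contained sketch. `CsvConfigSketch` is a hypothetical stand-in, not the Stratosphere class: it borrows only the method names from the list. Several points are assumptions of this sketch rather than documented behavior: the default delimiters, the reading of `field(type, position)` as "write the record field at `position` as the next output column", and the reading of `lenient(true)` as "write an empty column for a missing value instead of failing".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; NOT eu.stratosphere...CsvOutputFormat.ConfigBuilder.
final class CsvConfigSketch {
    private char recordDelimiter = '\n';   // assumed default
    private char fieldDelimiter = ',';     // assumed default
    private boolean lenient = false;       // assumed default
    private final List<Class<?>> types = new ArrayList<>();
    private final List<Integer> positions = new ArrayList<>();

    // Every setter returns this, which is what allows the calls to be chained.
    CsvConfigSketch recordDelimiter(char delimiter) { this.recordDelimiter = delimiter; return this; }
    CsvConfigSketch fieldDelimiter(char delimiter) { this.fieldDelimiter = delimiter; return this; }
    CsvConfigSketch lenient(boolean lenient) { this.lenient = lenient; return this; }

    // Call order defines the output column order; recordPosition selects the source field.
    CsvConfigSketch field(Class<?> type, int recordPosition) {
        types.add(type);
        positions.add(recordPosition);
        return this;
    }

    // Renders one record as a delimited line according to the collected settings.
    String format(Object[] record) {
        StringBuilder line = new StringBuilder();
        for (int column = 0; column < positions.size(); column++) {
            if (column > 0) {
                line.append(fieldDelimiter);
            }
            int position = positions.get(column);
            Object value = position < record.length ? record[position] : null;
            if (value == null) {
                if (!lenient) {
                    throw new IllegalStateException("No value at record position " + position);
                }
                // Lenient mode (assumed semantics): leave the column empty.
            } else {
                // cast() throws ClassCastException if the value has an unexpected type.
                line.append(types.get(column).cast(value));
            }
        }
        return line.append(recordDelimiter).toString();
    }

    public static void main(String[] args) {
        String line = new CsvConfigSketch()
            .recordDelimiter('\n')
            .fieldDelimiter('|')
            .lenient(true)
            .field(Integer.class, 1)   // first output column: record field 1
            .field(String.class, 0)    // second output column: record field 0
            .format(new Object[] {"a", 7});
        System.out.print(line);
    }
}
```

The chain in `main` mirrors the shape of the snippets on this page: the order of the `field` calls sets the column order, while the integer argument picks which record field fills each column.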

How to use CsvOutputFormat$ConfigBuilder in eu.stratosphere.api.java.record.io

Best Java code snippets using eu.stratosphere.api.java.record.io.CsvOutputFormat$ConfigBuilder (Showing top 20 results out of 315)
