int blockSize = 10;
Configuration cfg = new Configuration();
RowContainer result = new RowContainer(blockSize, cfg, null);
LazyBinarySerDe serde = new LazyBinarySerDe();
Properties props = new Properties();
props.put(serdeConstants.LIST_COLUMN_TYPES, "array<string>");
SerDeUtils.initializeSerDe(serde, null, props, null);
result.setSerDe(serde,
    ObjectInspectorUtils.getStandardObjectInspector(serde.getObjectInspector()));
result.setTableDesc(
    PTFRowContainer.createTableDesc((StructObjectInspector) serde.getObjectInspector()));
TimestampWritableV2 key = new TimestampWritableV2(Timestamp.ofEpochMilli(10));
result.setKeyObject(Lists.newArrayList(key));
List<Writable> row;
// add just over two blocks' worth of rows so that exactly two blocks are flushed
for (int i = 0; i <= blockSize * 2; i++) {
  row = new ArrayList<Writable>();
  row.add(new Text("" + i));
  result.addRow(row);
}
assertEquals(2, result.getNumFlushedBlocks());
result.setKeyObject(null);
assertEquals(Lists.newArrayList(0).toString(), result.first().get(0).toString());
for (int i = 1; i < result.rowCount() - 1; i++) {
  assertEquals(Lists.newArrayList(i).toString(), result.next().get(0).toString());
}
result.close();
public ROW first() throws HiveException {
  try {
    // reset the read cursor and release any open writer/reader before re-reading
    this.itrCursor = 0;
    closeWriter();
    closeReader();

    if (numFlushedBlocks == 0) {
      // nothing was spilled, so read straight from the in-memory write block
      this.currentReadBlock = this.currentWriteBlock;
      this.readBlockSize = this.addCursor;
    } else {
      // blocks were spilled to the local FS; set up an input format over the spill file
      JobConf localJc = getLocalFSJobConfClone(jc);
      if (inputSplits == null) {
        if (this.inputFormat == null) {
          inputFormat = (InputFormat<WritableComparable, Writable>) ReflectionUtils
              .newInstance(tblDesc.getInputFileFormatClass(), localJc);
        }
        inputSplits = inputFormat.getSplits(localJc, 1);
      }
      currentSplitPointer = 0;
      nextBlock(0);
    }

    ROW ret = currentReadBlock[itrCursor++];
    removeKeys(ret);
    return ret;
  } catch (Exception e) {
    throw new HiveException(e);
  }
}
void endGroup() throws IOException, HiveException {
  if (skewKeyInCurrentGroup) {
    // copy the accumulated rows of the skewed (big) key to its DFS spill directory
    Path specPath = conf.getBigKeysDirMap().get((byte) currBigKeyTag);
    RowContainer<ArrayList<Object>> bigKey = (RowContainer) joinOp.storage[currBigKeyTag];
    Path outputPath = getOperatorOutputPath(specPath);
    FileSystem destFs = outputPath.getFileSystem(hconf);
    bigKey.copyToDFSDirecory(destFs, outputPath);

    // copy the matching rows of every other alias to the corresponding small-keys directory
    for (int i = 0; i < numAliases; i++) {
      if (((byte) i) == currBigKeyTag) {
        continue;
      }
      RowContainer<ArrayList<Object>> values = (RowContainer) joinOp.storage[i];
      if (values != null) {
        specPath = conf.getSmallKeysDirMap().get((byte) currBigKeyTag).get((byte) i);
        values.copyToDFSDirecory(destFs, getOperatorOutputPath(specPath));
      }
    }
  }
  skewKeyInCurrentGroup = false;
}
ROW ret;
if (itrCursor < this.readBlockSize) {
  // rows remain in the current read block
  ret = this.currentReadBlock[itrCursor++];
  removeKeys(ret);
  return ret;
} else {
  // current block exhausted: load the next spilled block, or fall back to the write block
  nextBlock(0);
  if (this.readBlockSize == 0) {
    if (currentWriteBlock != null && currentReadBlock != currentWriteBlock) {
      setWriteBlockAsReadBlock();
    } else {
      return null;
    }
  }
  return next();
}
public void copyToDFSDirecory(FileSystem destFs, Path destPath) throws IOException, HiveException {
  if (addCursor > 0) {
    this.spillBlock(this.currentWriteBlock, addCursor);
  }
  if (tempOutPath == null || tempOutPath.toString().trim().equals("")) {
    return;
  }
  this.closeWriter();
  LOG.info("RowContainer copied temp file " + tmpFile.getAbsolutePath() + " to dfs directory "
      + destPath.toString());
  destFs.copyFromLocalFile(true, tempOutPath,
      new Path(destPath, new Path(tempOutPath.getName())));
  clearRows();
}
public static RowContainer<List<Object>> getRowContainer(Configuration hconf,
    List<ObjectInspector> structFieldObjectInspectors, Byte alias, int containerSize,
    TableDesc[] spillTableDesc, JoinDesc conf, boolean noFilter, Reporter reporter)
    throws HiveException {

  TableDesc tblDesc = JoinUtil.getSpillTableDesc(alias, spillTableDesc, conf, noFilter);
  AbstractSerDe serde = JoinUtil.getSpillSerDe(alias, spillTableDesc, conf, noFilter);

  if (serde == null) {
    containerSize = -1;
  }

  RowContainer<List<Object>> rc = new RowContainer<List<Object>>(containerSize, hconf, reporter);
  StructObjectInspector rcOI = null;
  if (tblDesc != null) {
    // arbitrary column names used internally for serializing to spill table
    List<String> colNames = Utilities.getColumnNames(tblDesc.getProperties());
    // object inspector for serializing input tuples
    rcOI = ObjectInspectorFactory.getStandardStructObjectInspector(colNames,
        structFieldObjectInspectors);
  }

  rc.setSerDe(serde, rcOI);
  rc.setTableDesc(tblDesc);
  return rc;
}
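A minimal usage sketch (not from the Hive source) of how a caller might drive the spill-backed container returned by this factory; someRows and process are hypothetical, and the other variables are assumed to be the factory parameters already in scope.

// Illustration only: append rows, iterate with first()/next(), then release the container.
RowContainer<List<Object>> rows = JoinUtil.getRowContainer(hconf,
    structFieldObjectInspectors, alias, containerSize, spillTableDesc, conf,
    noFilter, reporter);

// Append rows; once the in-memory block fills up, full blocks are spilled to the local FS.
for (List<Object> r : someRows) {        // someRows: hypothetical Iterable<List<Object>>
  rows.addRow(r);
}

// first() resets the cursor (reading spilled blocks back if needed); next() walks the rest.
for (List<Object> r = rows.first(); r != null; r = rows.next()) {
  process(r);                            // process: hypothetical consumer
}

// Release the in-memory blocks and any local spill file.
rows.clearRows();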
RowContainer<ArrayList<Object>> rc = (RowContainer) joinOp.storage[i];
if (rc != null) {
  rc.setSerDe(tblSerializers.get((byte) i), skewKeysTableObjectInspector.get((byte) i));
  rc.setTableDesc(tblDesc.get(alias));
}
if (nextKeyGroup) {
  this.nextGroupStorage[alias].addRow(value);
  foundNextKeyGroup[tag] = true;
  if (tag != posBigTable) {
    return;
  }
}

// Early emit: once the big table's candidate storage reaches the join emit interval and
// every other alias has already moved on to its next key group, flush the current group
// (the full operator also compares the pending keys before emitting).
if ((tag == posBigTable) && (candidateStorage[tag].rowCount() == joinEmitInterval)) {
  boolean canEmit = true;
  for (byte i = 0; i < foundNextKeyGroup.length; i++) {
    if (i != tag && !foundNextKeyGroup[i]) {
      canEmit = false;
      break;
    }
  }
  if (canEmit) {
    LOG.info("Emitting rows early after reaching the join emit interval of " + joinEmitInterval);
    joinOneGroup(false);
    candidateStorage[tag].clearRows();
    storage[tag].clearRows();
  }
}

candidateStorage[tag].addRow(value);
@Override
public void addRow(Row t) throws HiveException {
  if (willSpill()) {
    // about to flush a block to disk: remember the file offset at which it will start
    setupWriter();
    PTFRecordWriter rw = (PTFRecordWriter) getRecordWriter();
    BlockInfo blkInfo = new BlockInfo();
    try {
      blkInfo.startOffset = rw.outStream.getLength();
      blockInfos.add(blkInfo);
    } catch (IOException e) {
      clearRows();
      LOG.error(e.toString(), e);
      throw new HiveException(e);
    }
  }
  super.addRow(t);
}
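The BlockInfo records above remember where each spilled block starts. The sketch below only illustrates the kind of lookup such bookkeeping enables; it uses a hypothetical list of per-block start row indexes rather than PTFRowContainer's real fields, so it is not the container's actual random-access code.

import java.util.List;

// Illustration only: given the first row index of each spilled block (ascending, starting at 0),
// find which block holds a global row index. PTFRowContainer keeps analogous bookkeeping
// (BlockInfo offsets plus currentReadBlockStartRow) to locate rows that were spilled to disk.
final class BlockLookupSketch {
  // e.g. blockFor([0, 25, 50], 37) == 1
  static int blockFor(List<Integer> startRowOffsets, int rowIdx) {
    int lo = 0, hi = startRowOffsets.size() - 1, ans = 0;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (startRowOffsets.get(mid) <= rowIdx) {
        ans = mid;      // this block starts at or before rowIdx; remember it
        lo = mid + 1;   // a later block might still start at or before rowIdx
      } else {
        hi = mid - 1;
      }
    }
    return ans;
  }
}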
@Override
@SuppressWarnings("unchecked")
protected void initializeOp(Configuration hconf) throws HiveException {
  if (conf.getGenJoinKeys()) {
    int tagLen = conf.getTagLength();
    joinKeys = new List[tagLen];
    JoinUtil.populateJoinKeyValue(joinKeys, conf.getKeys(), NOTSKIPBIGTABLE, hconf);
    joinKeysObjectInspectors = JoinUtil.getObjectInspectorsFromEvaluators(joinKeys,
        inputObjInspectors, NOTSKIPBIGTABLE, tagLen);
  }

  super.initializeOp(hconf);

  numMapRowsRead = 0;

  // all other tables are small, and are cached in the hash table
  posBigTable = (byte) conf.getPosBigTable();

  emptyList = new RowContainer<List<Object>>(1, hconf, reporter);

  RowContainer<List<Object>> bigPosRC = JoinUtil.getRowContainer(hconf,
      rowContainerStandardObjectInspectors[posBigTable], posBigTable,
      joinCacheSize, spillTableDesc, conf, !hasFilter(posBigTable), reporter);
  storage[posBigTable] = bigPosRC;
}
RowContainer<ArrayList<Object>> rc = (RowContainer) joinOp.storage[i];
if (rc != null) {
  rc.setKeyObject(dummyKey);
}
@Override
public Row next() throws HiveException {
  boolean endOfCurrBlock = endOfCurrentReadBlock();
  if (endOfCurrBlock) {
    // record the global index of the first row in the block about to be read
    currentReadBlockStartRow += getCurrentReadBlockSize();
  }
  return super.next();
}
@Override
public void close() throws HiveException {
  super.close();
  blockInfos = null;
}
public RowContainer(int blockSize, SerDe sd, ObjectInspector oi, Configuration jc)
    throws HiveException {
  this(blockSize, jc);
  setSerDe(sd, oi);
}