fragment.getPartitioning())); if (stageInfo.isPresent()) { StageStats stageStats = stageInfo.get().getStageStats(); double avgPositionsPerTask = stageInfo.get().getTasks().stream().mapToLong(task -> task.getStats().getProcessedInputPositions()).average().orElse(Double.NaN); double squaredDifferences = stageInfo.get().getTasks().stream().mapToDouble(task -> Math.pow(task.getStats().getProcessedInputPositions() - avgPositionsPerTask, 2)).sum(); double sdAmongTasks = Math.sqrt(squaredDifferences / stageInfo.get().getTasks().size()); .map(argument -> { if (argument.isConstant()) { NullableValue constant = argument.getConstant(); String printableValue = castToVarchar(constant.getType(), constant.getValue(), functionRegistry, session); return constant.getType().getDisplayName() + "(" + printableValue + ")"; .collect(toImmutableList()); builder.append(indentString(1)); if (replicateNullsAndAny) { .flatMap(f -> f.getSymbols().entrySet().stream()) .distinct() .collect(toImmutableMap(Map.Entry::getKey, Map.Entry::getValue))); builder.append(textLogicalPlan(fragment.getRoot(), typeProvider, Optional.of(fragment.getStageExecutionDescriptor()), functionRegistry, fragment.getStatsAndCosts(), session, planNodeStats, 1, verbose)) .append("\n");
@Override
public ConnectorSplitSource getSplits(ConnectorTransactionHandle transaction, ConnectorSession session, ConnectorTableLayoutHandle layout, SplitSchedulingStrategy splitSchedulingStrategy)
{
    JmxTableLayoutHandle jmxLayout = (JmxTableLayoutHandle) layout;
    JmxTableHandle tableHandle = jmxLayout.getTable();
    TupleDomain<ColumnHandle> predicate = jmxLayout.getConstraint();

    // TODO is there a better way to get the node column?
    Optional<JmxColumnHandle> nodeColumnHandle = tableHandle.getColumnHandles().stream()
            .filter(jmxColumnHandle -> jmxColumnHandle.getColumnName().equals(NODE_COLUMN_NAME))
            .findFirst();
    checkState(nodeColumnHandle.isPresent(), "Failed to find %s column", NODE_COLUMN_NAME);

    List<ConnectorSplit> splits = nodeManager.getAllNodes().stream()
            .filter(node -> {
                NullableValue value = NullableValue.of(createUnboundedVarcharType(), utf8Slice(node.getNodeIdentifier()));
                return predicate.overlaps(fromFixedValues(ImmutableMap.of(nodeColumnHandle.get(), value)));
            })
            .map(node -> new JmxSplit(tableHandle, ImmutableList.of(node.getHostAndPort())))
            .collect(toList());

    return new FixedSplitSource(splits);
}
}
@Test
public void testCreateOnlyNullsPredicate()
{
    ImmutableList.Builder<HivePartition> partitions = ImmutableList.builder();
    for (int i = 0; i < 5; i++) {
        partitions.add(new HivePartition(
                new SchemaTableName("test", "test"),
                Integer.toString(i),
                ImmutableMap.of(TEST_COLUMN_HANDLE, NullableValue.asNull(VarcharType.VARCHAR))));
    }

    createPredicate(ImmutableList.of(TEST_COLUMN_HANDLE), partitions.build());
}
/**
 * Convert a map of columns to values into the TupleDomain which requires
 * those columns to be fixed to those values. Null is allowed as a fixed value.
 */
public static <T> TupleDomain<T> fromFixedValues(Map<T, NullableValue> fixedValues)
{
    return TupleDomain.withColumnDomains(fixedValues.entrySet().stream()
            .collect(toMap(
                    Map.Entry::getKey,
                    entry -> {
                        Type type = entry.getValue().getType();
                        Object value = entry.getValue().getValue();
                        return value == null ? Domain.onlyNull(type) : Domain.singleValue(type, value);
                    })));
}
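The null-versus-value branch in fromFixedValues can be tried out without the Presto classes. The following is only a minimal sketch under stated assumptions: FixedValuesSketch and SimpleDomain are hypothetical stand-ins introduced for illustration, not Presto types, and Optional.empty() models a column that is fixed to null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-alone sketch; none of these types are Presto classes.
class FixedValuesSketch
{
    // Simplified stand-in for a per-column domain: "only null" or "exactly one value".
    static final class SimpleDomain
    {
        final boolean onlyNull;
        final Object value; // null when onlyNull is true

        private SimpleDomain(boolean onlyNull, Object value)
        {
            this.onlyNull = onlyNull;
            this.value = value;
        }

        static SimpleDomain onlyNull()
        {
            return new SimpleDomain(true, null);
        }

        static SimpleDomain singleValue(Object value)
        {
            // A single-value domain is assumed to reject null, so null needs its own branch.
            return new SimpleDomain(false, Objects.requireNonNull(value, "value is null"));
        }

        String describe()
        {
            return onlyNull ? "IS NULL" : "= " + value;
        }
    }

    // Each fixed column becomes a single-value domain, or an only-null domain
    // when the fixed value is absent (Optional.empty() stands for a fixed null).
    static <T> Map<T, SimpleDomain> fromFixedValues(Map<T, Optional<Object>> fixedValues)
    {
        Map<T, SimpleDomain> domains = new LinkedHashMap<>();
        for (Map.Entry<T, Optional<Object>> entry : fixedValues.entrySet()) {
            domains.put(entry.getKey(), entry.getValue()
                    .map(SimpleDomain::singleValue)
                    .orElseGet(SimpleDomain::onlyNull));
        }
        return domains;
    }

    public static void main(String[] args)
    {
        Map<String, Optional<Object>> fixed = new LinkedHashMap<>();
        fixed.put("ds", Optional.of("2015-07-03"));
        fixed.put("totalprice", Optional.empty());

        // Print one line per column describing the derived domain.
        fromFixedValues(fixed).forEach((column, domain) ->
                System.out.println(column + " " + domain.describe()));
    }
}
```

The sketch wraps values in Optional because a plain Map with null values cannot go through Collectors.toMap; the original avoids the same problem by wrapping nulls in NullableValue.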
private Optional<Expression> coerceComparisonWithRounding(
        Type symbolExpressionType,
        Expression symbolExpression,
        NullableValue nullableValue,
        ComparisonExpression.Operator comparisonOperator)
{
    requireNonNull(nullableValue, "nullableValue is null");
    if (nullableValue.isNull()) {
        return Optional.empty();
    }
    Type valueType = nullableValue.getType();
    Object value = nullableValue.getValue();
    return floorValue(valueType, symbolExpressionType, value)
            .map((floorValue) -> rewriteComparisonExpression(symbolExpressionType, symbolExpression, valueType, value, floorValue, comparisonOperator));
}
@Override
public ActualProperties visitTableScan(TableScanNode node, List<ActualProperties> inputProperties)
{
    checkArgument(node.getLayout().isPresent(), "table layout has not yet been chosen");

    TableLayout layout = metadata.getLayout(session, node.getLayout().get());
    Map<ColumnHandle, Symbol> assignments = ImmutableBiMap.copyOf(node.getAssignments()).inverse();

    ActualProperties.Builder properties = ActualProperties.builder();

    // Globally constant assignments
    Map<ColumnHandle, NullableValue> globalConstants = new HashMap<>();
    extractFixedValues(node.getCurrentConstraint()).orElse(ImmutableMap.of())
            .entrySet().stream()
            .filter(entry -> !entry.getValue().isNull())
            .forEach(entry -> globalConstants.put(entry.getKey(), entry.getValue()));

    Map<Symbol, NullableValue> symbolConstants = globalConstants.entrySet().stream()
            .filter(entry -> assignments.containsKey(entry.getKey()))
            .collect(toMap(entry -> assignments.get(entry.getKey()), Map.Entry::getValue));

    properties.constants(symbolConstants);

    // Partitioning properties
    properties.global(deriveGlobalProperties(layout, assignments, globalConstants));

    // Append the global constants onto the local properties to maximize their translation potential
    List<LocalProperty<ColumnHandle>> constantAppendedLocalProperties = ImmutableList.<LocalProperty<ColumnHandle>>builder()
            .addAll(globalConstants.keySet().stream().map(ConstantProperty::new).iterator())
            .addAll(layout.getLocalProperties())
            .build();
    properties.local(LocalProperties.translate(constantAppendedLocalProperties, column -> Optional.ofNullable(assignments.get(column))));

    return properties.build();
}
.orElseThrow(() -> new AssertionError("Table does not exist: " + tableName)); assertEqualsIgnoreOrder(partitionNames, CREATE_TABLE_PARTITIONED_DATA.getMaterializedRows().stream() .map(row -> "ds=" + row.getField(CREATE_TABLE_PARTITIONED_DATA.getTypes().size() - 1)) .collect(toList())); MaterializedResult result = readTable(transaction, tableHandle, columnHandles, session, TupleDomain.all(), OptionalInt.empty(), Optional.of(storageFormat)); assertEqualsIgnoreOrder(result.getMaterializedRows(), expectedResultBuilder.build().getMaterializedRows()); TupleDomain<ColumnHandle> tupleDomain = TupleDomain.fromFixedValues(ImmutableMap.of(dsColumnHandle, NullableValue.of(createUnboundedVarcharType(), utf8Slice("2015-07-03")))); Constraint<ColumnHandle> constraint = new Constraint<>(tupleDomain, convertToPredicate(tupleDomain)); List<ConnectorTableLayoutResult> tableLayoutResults = metadata.getTableLayouts(session, tableHandle, constraint, Optional.empty()); ConnectorTableLayoutHandle tableLayoutHandle = getOnlyElement(tableLayoutResults).getTableLayout().getHandle(); metadata.metadataDelete(session, tableHandle, tableLayoutHandle); .filter(row -> !"2015-07-03".equals(row.getField(dsColumnOrdinalPosition))) .collect(toImmutableList()); MaterializedResult actualAfterDelete = readTable(transaction, tableHandle, columnHandles, session, TupleDomain.all(), OptionalInt.empty(), Optional.of(storageFormat)); assertEqualsIgnoreOrder(actualAfterDelete.getMaterializedRows(), expectedRows); ConnectorMetadata metadata = transaction.getMetadata(); ConnectorTableHandle tableHandle = getTableHandle(metadata, tableName); List<ColumnHandle> columnHandles = ImmutableList.copyOf(metadata.getColumnHandles(session, tableHandle).values()); MaterializedResult actualAfterDelete2 = readTable(transaction, tableHandle, columnHandles, session, TupleDomain.all(), OptionalInt.empty(), Optional.of(storageFormat)); assertEqualsIgnoreOrder(actualAfterDelete2.getMaterializedRows(), ImmutableList.of());
Map<ColumnHandle, NullableValue> fixedValues = TupleDomain.extractFixedValues(tupleDomain).orElse(ImmutableMap.of()) .entrySet().stream() .filter(entry -> !indexableColumns.contains(entry.getKey())) .filter(entry -> !entry.getValue().isNull()) // strip nulls since meaningless in index join lookups .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); .addAll(handleToNames(ImmutableList.copyOf(indexableColumns))) .addAll(handleToNames(ImmutableList.copyOf(fixedValues.keySet()))) .build(); if (!indexedData.getIndexedTable(tpchTableHandle.getTableName(), tpchTableHandle.getScaleFactor(), lookupColumnNames).isPresent()) { return Optional.empty(); if (!tupleDomain.isNone()) { filteredTupleDomain = TupleDomain.withColumnDomains(Maps.filterKeys(tupleDomain.getDomains().get(), not(in(fixedValues.keySet()))));
List<Optional<NullableValue>> partitionConstants; List<Type> partitionChannelTypes; if (partitioningScheme.getHashColumn().isPresent()) { partitionChannels = ImmutableList.of(outputLayout.indexOf(partitioningScheme.getHashColumn().get())); partitionConstants = ImmutableList.of(Optional.empty()); partitionChannelTypes = ImmutableList.of(BIGINT); .map(argument -> { if (argument.isConstant()) { return -1; .collect(toImmutableList()); partitionConstants = partitioningScheme.getPartitioning().getArguments().stream() .map(argument -> { .map(argument -> { if (argument.isConstant()) { return argument.getConstant().getType();
/**
 * Extract all column constraints that require exactly one value or only null in their respective Domains.
 * Returns an empty Optional if the TupleDomain is none.
 */
public static <T> Optional<Map<T, NullableValue>> extractFixedValues(TupleDomain<T> tupleDomain)
{
    if (!tupleDomain.getDomains().isPresent()) {
        return Optional.empty();
    }

    return Optional.of(tupleDomain.getDomains().get()
            .entrySet().stream()
            .filter(entry -> entry.getValue().isNullableSingleValue())
            .collect(toMap(Map.Entry::getKey, entry -> new NullableValue(entry.getValue().getType(), entry.getValue().getNullableSingleValue()))));
}
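The filtering that extractFixedValues performs can be illustrated without the Presto classes. This is a sketch under assumptions, not the library's implementation: ExtractFixedValuesSketch and SimpleDomain are hypothetical stand-ins, a domain is modelled as a set of allowed values plus a null-allowed flag, and "nullable single value" is assumed to mean either only null or exactly one non-null value. An absent outer Optional models a TupleDomain that is none.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-alone sketch; none of these types are Presto classes.
class ExtractFixedValuesSketch
{
    // Simplified stand-in for a column domain: allowed non-null values plus a null flag.
    static final class SimpleDomain
    {
        final Set<Object> values;
        final boolean nullAllowed;

        SimpleDomain(Set<Object> values, boolean nullAllowed)
        {
            this.values = values;
            this.nullAllowed = nullAllowed;
        }

        // Assumed definition: only null is allowed, or exactly one non-null value is allowed.
        boolean isNullableSingleValue()
        {
            return nullAllowed ? values.isEmpty() : values.size() == 1;
        }

        // Optional.empty() stands for a fixed null; call only when isNullableSingleValue() holds.
        Optional<Object> nullableSingleValue()
        {
            return nullAllowed ? Optional.empty() : Optional.of(values.iterator().next());
        }
    }

    // Keeps only the columns pinned to one value (or to null); other columns are dropped.
    static <T> Optional<Map<T, Optional<Object>>> extractFixedValues(Optional<Map<T, SimpleDomain>> domains)
    {
        if (!domains.isPresent()) {
            return Optional.empty();
        }
        Map<T, Optional<Object>> fixed = new LinkedHashMap<>();
        domains.get().forEach((column, domain) -> {
            if (domain.isNullableSingleValue()) {
                fixed.put(column, domain.nullableSingleValue());
            }
        });
        return Optional.of(fixed);
    }

    public static void main(String[] args)
    {
        Map<String, SimpleDomain> domains = new LinkedHashMap<>();
        domains.put("ds", new SimpleDomain(Collections.singleton("2015-07-03"), false));
        domains.put("totalprice", new SimpleDomain(Collections.emptySet(), true));
        domains.put("orderkey", new SimpleDomain(new HashSet<>(Arrays.asList(1L, 2L)), false));

        // "orderkey" allows two values, so it is not reported as fixed.
        System.out.println(extractFixedValues(Optional.of(domains)).get().keySet());
    }
}
```

Callers in this collection often follow the extraction with a filter on isNull(), which in this model would correspond to dropping entries whose value is Optional.empty().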
columnHandles, session, TupleDomain.all(), OptionalInt.empty(), Optional.empty()); columnHandles, session, TupleDomain.fromFixedValues(ImmutableMap.of(bucketColumnHandle(), NullableValue.of(INTEGER, 6L))), OptionalInt.empty(), Optional.empty()); columnHandles = ImmutableList.<ColumnHandle>builder() .addAll(metadata.getColumnHandles(session, tableHandle).values().stream() .filter(columnHandle -> !"id".equals(((HiveColumnHandle) columnHandle).getName())) .collect(toImmutableList())) .build(); result = readTable( columnHandles, session, TupleDomain.fromFixedValues(ImmutableMap.of(bucketColumnHandle(), NullableValue.of(INTEGER, 6L))), OptionalInt.empty(), Optional.empty());
@VisibleForTesting
static Optional<DoubleRange> calculateRangeForPartitioningKey(HiveColumnHandle column, Type type, List<HivePartition> partitions)
{
    if (!isRangeSupported(type)) {
        return Optional.empty();
    }

    List<Double> values = partitions.stream()
            .map(HivePartition::getKeys)
            .map(keys -> keys.get(column))
            .filter(value -> !value.isNull())
            .map(NullableValue::getValue)
            .map(value -> convertPartitionValueToDouble(type, value))
            .collect(toImmutableList());

    if (values.isEmpty()) {
        return Optional.empty();
    }

    double min = values.get(0);
    double max = values.get(0);

    for (Double value : values) {
        if (value > max) {
            max = value;
        }
        if (value < min) {
            min = value;
        }
    }

    return Optional.of(new DoubleRange(min, max));
}
@Override
public StreamProperties visitTableScan(TableScanNode node, List<StreamProperties> inputProperties)
{
    checkArgument(node.getLayout().isPresent(), "table layout has not yet been chosen");

    TableLayout layout = metadata.getLayout(session, node.getLayout().get());
    Map<ColumnHandle, Symbol> assignments = ImmutableBiMap.copyOf(node.getAssignments()).inverse();

    // Globally constant assignments
    Set<ColumnHandle> constants = new HashSet<>();
    extractFixedValues(node.getCurrentConstraint()).orElse(ImmutableMap.of())
            .entrySet().stream()
            .filter(entry -> !entry.getValue().isNull()) // TODO consider allowing nulls
            .forEach(entry -> constants.add(entry.getKey()));

    Optional<Set<Symbol>> streamPartitionSymbols = layout.getStreamPartitioningColumns()
            .flatMap(columns -> getNonConstantSymbols(columns, assignments, constants));

    // if we are partitioned on empty set, we must say multiple of unknown partitioning, because
    // the connector does not guarantee a single split in this case (since it might not understand
    // that the value is a constant).
    if (streamPartitionSymbols.isPresent() && streamPartitionSymbols.get().isEmpty()) {
        return new StreamProperties(MULTIPLE, Optional.empty(), false);
    }
    return new StreamProperties(MULTIPLE, streamPartitionSymbols, false);
}
List<Optional<NullableValue>> partitionConstants; List<Type> partitionChannelTypes; if (functionBinding.getHashColumn().isPresent()) { partitionChannels = ImmutableList.of(outputLayout.indexOf(functionBinding.getHashColumn().get())); partitionConstants = ImmutableList.of(Optional.empty()); partitionChannelTypes = ImmutableList.of(BIGINT); .map(PartitionFunctionArgumentBinding::getColumn) .map(outputLayout::indexOf) .collect(toImmutableList()); partitionConstants = functionBinding.getPartitionFunctionArguments().stream() .map(argument -> { .map(argument -> { if (argument.isConstant()) { return argument.getConstant().getType();
private TupleDomain<ColumnHandle> toTupleDomain(Map<TpchColumnHandle, Set<NullableValue>> predicate)
{
    return TupleDomain.withColumnDomains(predicate.entrySet().stream()
            .collect(Collectors.toMap(Map.Entry::getKey, entry -> {
                Type type = entry.getKey().getType();
                return entry.getValue().stream()
                        .map(nullableValue -> Domain.singleValue(type, nullableValue.getValue()))
                        .reduce(Domain::union)
                        .orElse(Domain.none(type));
            })));
}
private Optional<HivePartition> parseValuesAndFilterPartition(
        SchemaTableName tableName,
        String partitionId,
        List<HiveColumnHandle> partitionColumns,
        List<Type> partitionColumnTypes,
        Constraint<ColumnHandle> constraint)
{
    HivePartition partition = parsePartition(tableName, partitionId, partitionColumns, partitionColumnTypes, timeZone);

    Map<ColumnHandle, Domain> domains = constraint.getSummary().getDomains().get();
    for (HiveColumnHandle column : partitionColumns) {
        NullableValue value = partition.getKeys().get(column);
        Domain allowedDomain = domains.get(column);
        if (allowedDomain != null && !allowedDomain.includesNullableValue(value.getValue())) {
            return Optional.empty();
        }
    }

    if (constraint.predicate().isPresent() && !constraint.predicate().get().test(partition.getKeys())) {
        return Optional.empty();
    }

    return Optional.of(partition);
}
"ds=2012-12-29/file_format=textfile/dummy=1", ImmutableMap.<ColumnHandle, NullableValue>builder() .put(dsColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("2012-12-29"))) .put(fileFormatColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("textfile"))) .put(dummyColumn, NullableValue.of(INTEGER, 1L)) .build())) .add(new HivePartition(tablePartitionFormat, "ds=2012-12-29/file_format=sequencefile/dummy=2", ImmutableMap.<ColumnHandle, NullableValue>builder() .put(dsColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("2012-12-29"))) .put(fileFormatColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("sequencefile"))) .put(dummyColumn, NullableValue.of(INTEGER, 2L)) .build())) .add(new HivePartition(tablePartitionFormat, .put(dsColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("2012-12-29"))) .put(fileFormatColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("rctext"))) .put(dummyColumn, NullableValue.of(INTEGER, 3L)) .build())) .add(new HivePartition(tablePartitionFormat, .put(dsColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("2012-12-29"))) .put(fileFormatColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("rcbinary"))) .put(dummyColumn, NullableValue.of(INTEGER, 4L)) .build())) .build(); partitionCount = partitions.size(); tupleDomain = TupleDomain.fromFixedValues(ImmutableMap.of(dsColumn, NullableValue.of(createUnboundedVarcharType(), utf8Slice("2012-12-29"))));
if (!result.isPresent()) { return context.defaultRewrite(node); TableScanNode tableScan = result.get(); ImmutableMap.Builder<Symbol, Type> typesBuilder = ImmutableMap.builder(); ImmutableMap.Builder<Symbol, ColumnHandle> columnBuilder = ImmutableMap.builder(); if (!tableScan.getLayout().isPresent()) { List<TableLayoutResult> layouts = metadata.getLayouts(session, tableScan.getTable(), Constraint.alwaysTrue(), Optional.empty()); if (layouts.size() == 1) { ImmutableList.Builder<List<Expression>> rowsBuilder = ImmutableList.builder(); for (TupleDomain<ColumnHandle> domain : predicates.getPredicates()) { if (!domain.isNone()) { Map<ColumnHandle, NullableValue> entries = TupleDomain.extractFixedValues(domain).get(); ImmutableList.Builder<Expression> rowBuilder = ImmutableList.builder(); rowBuilder.add(literalEncoder.toExpression(value.getValue(), type));
private static PlanNode buildProjectedIndexSource(PlanBuilder p, Predicate<Symbol> projectionFilter)
{
    Symbol orderkey = p.symbol("orderkey", INTEGER);
    Symbol custkey = p.symbol("custkey", INTEGER);
    Symbol totalprice = p.symbol("totalprice", DOUBLE);
    ColumnHandle orderkeyHandle = new TpchColumnHandle(orderkey.getName(), INTEGER);
    ColumnHandle custkeyHandle = new TpchColumnHandle(custkey.getName(), INTEGER);
    ColumnHandle totalpriceHandle = new TpchColumnHandle(totalprice.getName(), DOUBLE);

    return p.project(
            Assignments.identity(
                    ImmutableList.of(orderkey, custkey, totalprice).stream()
                            .filter(projectionFilter)
                            .collect(toImmutableList())),
            p.indexSource(
                    new TableHandle(
                            new ConnectorId("local"),
                            new TpchTableHandle("orders", TINY_SCALE_FACTOR)),
                    ImmutableSet.of(orderkey, custkey),
                    ImmutableList.of(orderkey, custkey, totalprice),
                    ImmutableMap.of(
                            orderkey, orderkeyHandle,
                            custkey, custkeyHandle,
                            totalprice, totalpriceHandle),
                    TupleDomain.fromFixedValues(ImmutableMap.of(totalpriceHandle, asNull(DOUBLE)))));
}
}