Type Parameters:: T - type of item the partitioner accepts

All Superinterfaces:: Serializable

All Known Implementing Classes:: Partitioner.Default

Functional Interface:: This is a functional interface and can therefore be used as the assignment target for a lambda expression or method reference.

@FunctionalInterface public interface Partitioner<T> extends Serializable

Encapsulates the logic associated with a DAG edge that decides on the partition ID of an item traveling over it. The partition ID determines which cluster member and which instance of Processor on that member an item will be forwarded to.

Jet's partitioning piggybacks on Hazelcast partitioning. Standard Hazelcast protocols are used to distribute partition ownership over the members of the cluster. However, if a DAG edge is configured as non-distributed, then on each member there will be some destination processor responsible for any given partition.

Since:: Jet 3.0

Nested Class Summary

Nested Classes

Modifier and Type

Interface

Description

static final class

Partitioner.Default

Partitioner which applies the default Hazelcast partitioning strategy.
Field Summary

Fields

Modifier and Type

Field

Description

static final Partitioner<Object>

HASH_CODE

Partitioner which calls Object.hashCode() and coerces it with the modulo operation into the allowed range of partition IDs.
Method Summary

Modifier and Type

Method

Description

static Partitioner<Object>

defaultPartitioner()

Returns a partitioner which applies the default Hazelcast partitioning.

default T

getConstantPartitioningKey()

int

getPartition(T item, int partitionCount)

Returns the partition ID of the given item.

default void

init(DefaultPartitionStrategy strat)

Callback that injects the Hazelcast's default partitioning strategy into this partitioner, so it can be consulted by the getPartition(Object, int) method.

Field Details
- HASH_CODE
  
  static final Partitioner<Object> HASH_CODE
  Partitioner which calls Object.hashCode() and coerces it with the modulo operation into the allowed range of partition IDs. The primary reason to prefer this over the default is performance, and it's a safe choice on local edges.
  WARNING: this is a dangerous strategy to use on distributed edges. Care must be taken to ensure that the produced hashcode remains stable across serialization-deserialization cycles as well as across all JVM processes. Consider a hashCode() method that is correct with respect to its contract, but not with respect to the stricter contract given above. Take the following scenario:
  there are two Jet cluster members;
  there is a DAG vertex;
  on each member there is a processor for this vertex;
  each processor emits an item;
  these two items have equal partitioning keys;
  nevertheless, on each member they get a different hashcode;
  they are routed to different processors, thus failing on the promise that all items with the same partition key go to the same processor.
Method Details
- init
  
  default void init(@Nonnull DefaultPartitionStrategy strat)
  
  Callback that injects the Hazelcast's default partitioning strategy into this partitioner, so it can be consulted by the getPartition(Object, int) method.
  The creation of instances of the Partitioner type is done in user's code, but the Hazelcast partitioning strategy only becomes available after the partitioner is deserialized on each target member. This method solves the lifecycle mismatch.
- getPartition
  
  int getPartition(@Nonnull T item, int partitionCount)
  
  Returns the partition ID of the given item.
  
  Parameters:
  
  partitionCount - the total number of partitions in use by the underlying Hazelcast instance
- getConstantPartitioningKey
  
  default T getConstantPartitioningKey()
  
  Returns:
  
  constant partition key that is always used by getPartition(Object, int) or null if different partitions may be returned
  
  Since:
  
  5.4
- defaultPartitioner
  
  static Partitioner<Object> defaultPartitioner()
  
  Returns a partitioner which applies the default Hazelcast partitioning. It will serialize the item using Hazelcast Serialization, then apply Hazelcast's MurmurHash-based algorithm to retrieve the partition ID. This is quite a bit of work, but has stable results across all JVM processes, making it a safe default.

Interface Partitioner<T>

Nested Class Summary

Field Summary

Method Summary

Field Details

HASH_CODE

Method Details

init

getPartition

getConstantPartitioningKey

defaultPartitioner