This manual is for an old version of Hazelcast Jet, use the latest stable version.

Map

Mapping is one of the most common operations and maps an incoming item to one output element.

For example, a vertex which converts incoming String elements to Integer values can simply be written as:

Vertex mapper = dag.newVertex("toInteger", Processors.map((String s) -> Integer.valueOf(s)));

Filter

Filtering is another common operation, which filters incoming items based on a predicate. Only items for which the predicate evaluates to true are allowed.

An vertex which filters out odd numbers could be written as follows:

Vertex filter = dag.newVertex("filterOddNumbers", Processors.filter((Integer s) -> s % 2 == 0));

FlatMap

FlatMap is a generalization of Map and maps each incoming element to zero or more elements. The flatMap processor expects a Traverser for each incoming item. The processor then emits all mapped values for an item before moving on to the next.

A vertex which will split incoming lines into words can be expressed as follows:

Vertex splitWords = dag.newVertex("splitWords",
        Processors.flatMap((String line) -> Traversers.traverseArray(line.split("\\W+"))));

Aggregation

The focal point of distributed computation is solving the problem of grouping by a time window and/or grouping key and aggregating the data of each group. As we explained in the Hazelcast Jet 101 section, aggregation can take place in a single stage or in two stages, and there are separate variants for batch and stream jobs.

The complete matrix of factories for aggregator vertices is presented in the following table:

single-stage stage 1/2 stage 2/2
batch,
no grouping
aggregate() accumulate() combine()
batch, group by key aggregateByKey() accumulateByKey() combineByKey()
stream, group by key
and aligned window
aggregateToSlidingWindow() accumulateByFrame() combineToSlidingWindow()
stream, group by key
and session window
aggregateToSessionWindow() N/A N/A