Welcome to the Hazelcast Jet Release Notes. This document includes the new features, enhancements and fixed issues for Hazelcast Jet releases. Select a release number in the table of contents.

The linked numbers refer to the issue numbers in the Hazelcast Jet GitHub repository. The labels in square brackets refer to the respective Hazelcast Jet module.

0.6.1

New Features

  • [core] Add support for configuring a custom classloader per job. #770

Enhancements

  • [pipeline-api] Add a noop sink. #827

  • [core] Utility methods to load JetConfig from XML file or resource directly. #856

  • [core] Update Hazelcast IMDG references to "3.10". #833

  • [core] Optimize in-memory layout of snapshot data: reduced memory usage by up to 75%. #822

  • [core] Idle CPU usage improvements when there’s a large number of non-cooperative processors. #816

Fixes

  • [pipeline-api] SinkBuilder shouldn’t call destroyFn if createFn wasn’t called. #826

  • [pipeline-api] Fix generic output type for aggregate operations with a custom mapToOutputFn. #823

  • [core] Don’t shrink the receive window during lulls, because doing so can reduce throughput. #798

  • [kafka] Kafka source should handle InterruptException in all cases. #824

  • [hadoop] When matching HDFS hostnames against member hostnames, use additional methods besides reverse DNS. #858

0.6

New Features

  • [pipeline-api] Windowing support and overhaul of Pipeline API to work with streaming sources. #731

  • [pipeline-api] New updating and merging map sinks which update the value in place. #621

  • [pipeline-api] Custom sink builder support for Pipeline API. #762

  • [core] Improved job lifecycle features such as naming jobs and getting jobs by name. #666

  • [core] Ability to restart jobs manually to make use of newly joined members. #693

  • [spring] Add support for annotation-based and XML-based Spring configuration. #757

Enhancements

  • [pipeline-api] Improve the method AggregateOperations.allOf() so that it is type-safe.

  • [pipeline-api] Predictable vertex names when converting Pipelines to DAG. #616

  • [pipeline-api] Add support for stateful transform operations in the pipeline API. #744

  • [pipeline-api] Add ability to set parallelism and name of each stage separately. #731

  • [pipeline-api] Add generic Predicate factory methods. #695

  • [core] Support for idle streaming sources through a configurable idle timeout where idle processors are excluded from Watermark coalescing. #660

  • [core] Add support for initializing processors with Hazelcast Managed Context. #645

  • [core] Improve watermark coalescing and processing. #649

  • [core] Make parallelism information available to Processor.Context. #754

  • [core] Include Hazelcast Javadocs in Jet distribution. #652

  • [core] Add Java 9 build support and automatic module names. #720

  • [core] Add ability to emit file name from file sources. #729

  • [core] Several Javadoc improvements.

  • [java.util.stream] New DistributedStream instances can now work with Pipeline API sources.

  • [kafka] Kafka version updated to 1.0.0.

  • [hadoop] Hadoop version updated to 2.8.3.
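
The idle-source enhancement above (#660) can be pictured with a small sketch. A coalesced watermark is the minimum of the watermarks seen on all inputs, so one silent input would otherwise hold it back; excluding idle inputs lets it advance. The code below is plain Java written for illustration only. It is not Jet's implementation, and the class and method names are hypothetical.

```java
import java.util.OptionalLong;

/** Illustrative sketch only; names are hypothetical and not part of the Jet API. */
final class WatermarkCoalescingSketch {

    /**
     * Returns the minimum watermark over all inputs that are not marked idle,
     * or an empty result when every input is idle.
     */
    static OptionalLong coalesce(long[] watermarks, boolean[] idle) {
        if (watermarks.length != idle.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (int i = 0; i < watermarks.length; i++) {
            if (idle[i]) {
                continue; // idle inputs do not hold the watermark back
            }
            anyActive = true;
            min = Math.min(min, watermarks[i]);
        }
        return anyActive ? OptionalLong.of(min) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        long[] watermarks = {100, 40, 90};
        // With all inputs active, the slowest input (40) decides.
        System.out.println(coalesce(watermarks, new boolean[]{false, false, false}));
        // Once the second input is marked idle, the result advances to 90.
        System.out.println(coalesce(watermarks, new boolean[]{false, true, false}));
    }
}
```

In this sketch the caller decides which inputs are idle; in Jet that decision comes from the configurable idle timeout mentioned above.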

Fixes

  • [core] Avoid race between job cancellation and completion. #604

  • [core] Do not send a join request for a job unless Job.join() is explicitly called. #628

  • [core] When the master node left the cluster, the job would restart even if auto-restart was disabled. #683

  • [core] When using a map source, returning null in a projection would cause the rest of the items to be dropped. #669

  • [core] When DAG fails to deserialize on job submission from client, the exception was not returned to user. #794

  • [core] Fix for ClassNotFoundException when restoring from snapshot when the key class is not on Jet classpath. #778

  • [core] Assert that the same item is always offered to the Outbox after a rejection. #808

  • [kafka] Kafka sink should ensure that all records are sent before snapshot. #611

  • [kafka/hadoop] Package names reorganized to have better support for module systems.

Technical Changes

  • [pipeline-api] Grouping and aggregation in the Pipeline API are now done through the methods .groupingKey() and .aggregate().

  • [pipeline-api] Package names for Pipeline API have been reorganized.

  • [core] Job cancellation can now only be done through Job.cancel() rather than Job.getFuture().cancel().

  • [core] IStreamMap, IStreamList and IStreamCache have been renamed to IMapJet, IListJet and ICacheJet, respectively.

  • [core] Dedicated tryProcessWatermark() method is now responsible for processing Watermarks in a processor and watermarks will be forwarded automatically once it returns true.

  • [core] start-jet.sh, submit-jet.sh and other scripts have been renamed to use "jet" as a prefix instead.

  • [core] CloseableProcessorSupplier has been replaced with a close() method on the Processor instead.

  • [core] The complete() methods on ProcessorMetaSupplier and ProcessorSupplier have been renamed to close().

  • [java.util.stream] DistributedStream creation has been moved to the DistributedStream class from Map and List instances. We strongly suggest switching to the Pipeline API instead of the java.util.stream API.

0.5.1

Fixes

  • [pipeline-api] Generated vertex names are now deterministic instead of random. #623

  • [pipeline-api] Vertex names for sources and sinks should now be more explicit. #623

  • [core] JetInstance.getJobs() now only sends a join job request when getFuture() is called. #633

  • [core] When configuring internal Jet maps, do not inherit default map configuration from Hazelcast. #629

  • [core] Update schema version in default Hazelcast configuration XML file. #627

  • [core] When peeking input and output of a processor, the logger name of the peeked processor should be used. #617

  • [core] Fix race condition when calling ProcessorSupplier.complete() during job cancellation. #604

  • [kafka] Kafka writer should flush when taking a snapshot. #624

  • [javadoc] Replace Javadoc for com.hazelcast.jet.function and com.hazelcast.jet.stream with links to JDK docs. #636

  • [javadoc] AggregateOperations Javadoc updates. #635

0.5

New Features

  • [core] Support for in-memory snapshots and at-least-once and exactly-once processing. #500

  • [core] New Pipeline API with support for group by, hash join and co group. #497

  • [core] Improved Jet job lifecycle management including auto-restart. #492

  • [core] Event journal streaming source for Map and Cache. #487

Enhancements

  • [core] Add ProcessorMetaSupplier.complete() method to support global cleanup. #571

  • [core] Socket source now uses non-blocking IO. #554

  • [core] Add preferred local parallelism to all the sources and sinks. #552

  • [core] InsertWatermarksProcessor will now log late events. #551

  • [core] Add projection and predicate capability to map reader. #490

  • [core] Hazelcast version is updated to 3.9. #474

  • [java.util.stream] Support for creating a stream from any source. #360

  • [kafka] Add ability to create own ProducerRecord instances for Kafka writer. #588

  • [kafka] When a job that includes a Kafka source is cancelled, StreamKafkaP throws InterruptedException. This exception should be prevented or accompanied by a descriptive log message. #570

  • [kafka] Add optional projection function to Kafka reader. #534

  • [kafka] Add snapshotting support for Kafka reader. #500

Code Samples Improvements

  • Overall module reorganization, roughly divided into code samples using the Pipeline API and code samples using the Core API.

  • Several examples migrated to use the new Pipeline API.

  • New examples using hash join enrichment and co group.

  • New code samples illustrating the new event journal and fault tolerance features.

  • New code samples showing how to use HDFS with java.util.stream.

Fixes

  • [core] The method Job.getJobStatus() does not reflect actual status during cancellation. #580

  • [core] Add missing serializer for Session type. #562

  • [core] The method Traversers.traverseStream() should close the stream when exhausted. #560

  • [core] Kafka source should throw when partition count is less than global parallelism. #559

  • [core] Suppress exception for flow control packet when the member has already left the cluster. #556

  • [core] Fix the ignored charset in StreamSocketP. #553

  • [core] JetClassLoader should not throw an exception on the method findResource(). #549

  • [core] Add cache manager in order not to require a dependency on javax.cache. #485

  • [core] Fix NPE when releasing uninitialized ResourceStore. #478

  • [core] Replace ReadFileStreamP with batch file processor. #373

  • [core] ReadFileStreamP should respect Outbox limit. #337

  • [kafka] When init phase for the job fails, StreamKafkaP throws NPE. There should be a null check. #568

  • [hdfs] Make sure a proper global cleanup is performed on job cancellation. #572

0.4

New Features

  • [core] Infinite stream processing with event-time semantics using watermarks.

  • [core] Out of the box processors for tumbling, sliding and session window aggregations.

  • [core] New AggregateOperation abstraction with several out of the box aggregators.

  • [core] Bootstrapper script for submitting a JAR and running a DAG from it. #421

  • [core] Sources and Sinks reorganized into Sources and Sinks static classes.

  • [core] Added Sources.streamSocket() and Sinks.writeSocket() for reading from and writing to sockets. #358, #363

  • [core] Added Sources.readFile(), Sources.streamFile() and Sinks.writeFile() for reading and writing files. #373, #376

  • [core] Added Sources.readCache() and Sinks.writeCache() for reading from and writing to Hazelcast ICache. #357

  • [core] Added diagnostic processors for debugging. #426

  • [java.util.stream] java.util.stream support for ICache. #360

  • [pcf] Hazelcast Jet tile for Pivotal Cloud Foundry was released.
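
As a rough illustration of the tumbling and sliding windows listed above, the sketch below shows how an event timestamp can be mapped to the windows that contain it. It is plain Java for illustration only, not the code of Jet's window processors. The class and method names are hypothetical, windows are assumed to be half-open intervals [start, start + size), and the window size is assumed to be a multiple of the slide step. Session windows are not covered.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names are hypothetical and not part of the Jet API. */
final class WindowAssignmentSketch {

    /** Start of the single tumbling window [start, start + size) that contains ts. */
    static long tumblingWindowStart(long ts, long size) {
        // floorDiv keeps the result correct for negative timestamps as well
        return Math.floorDiv(ts, size) * size;
    }

    /**
     * Starts of all sliding windows [start, start + size) that contain ts,
     * in ascending order. Window starts are multiples of the slide step.
     */
    static List<Long> slidingWindowStarts(long ts, long size, long slide) {
        if (slide <= 0 || size % slide != 0) {
            throw new IllegalArgumentException("size must be a positive multiple of slide");
        }
        List<Long> starts = new ArrayList<>();
        long lastStart = Math.floorDiv(ts, slide) * slide; // latest window start <= ts
        for (long start = lastStart - size + slide; start <= lastStart; start += slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Tumbling window of size 10: timestamp 12 falls into [10, 20).
        System.out.println(tumblingWindowStart(12, 10));
        // Sliding window of size 10 sliding by 5: timestamp 12 falls into [5, 15) and [10, 20).
        System.out.println(slidingWindowStarts(12, 10, 5));
    }
}
```

A tumbling window is the special case where the slide step equals the window size, so each timestamp belongs to exactly one window.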

Enhancements

  • [core] Hazelcast IMDG updated to 3.8.2 and is now shaded in the Jet JAR. #367

  • [core] Blocking outbox support for non-cooperative processors. #372, #375

  • [core] JobFuture added to Processor.Context. #381

  • [core] Processor.Context.index() renamed to globalProcessorIndex() and now returns a global index. #405

  • [core] Fairer distribution of tasklets when number of cores is larger than number of tasklets in a job. #443

  • [java.util.stream] Added configuration option to DistributedStream. #359

Code Samples Improvements

  • Sliding window Stock exchange streaming sample. #19, #21

  • Hazelcast IMap/IList/ICache sample. #16

  • PCF-integration with Spring Boot sample. #17

  • Socket reader and writer examples with Netty. #15

  • Read and write files example. #24

  • Top N streaming example. #27

  • HashJoin and CoGroup samples added. #28, #29

Fixes

  • [core] Missing return value for JobConfig.addJar(). #393

  • [core] Improve reliability of ExecutionService on shutdown. #409

  • [core] Metaspace memory was not recovered fully when a custom class loader was used. #466

0.3.1

New Features

  • [core] The Jet cluster can be scaled up without affecting running jobs. Please refer to the Elasticity section in the Hazelcast Jet Reference Manual for details.

Enhancements

  • [core] Improved DAG.toString. #310

  • [core] Convenience to make any processor non-cooperative. #321

  • [core] Updated Hazelcast version to 3.8. #323

  • [core] Added missing functional interfaces to the Distributed class. #324

  • [java.util.stream] Refactoring of Jet collectors into the new type Reducer. #313

  • [java.util.stream] Allow branching of the j.u.s pipeline. #318

  • [hadoop] Added support for reading and writing non-text data from or to HDFS. Please refer to the hazelcast-jet-hadoop documentation for details.

  • [hadoop] Added key/value mappers for ReadHdfsP and WriteHdfsP. #328

  • [kafka] Kafka connector now makes use of consumer groups. Please refer to the hazelcast-jet-kafka documentation.

Code Samples Improvements

  • [code-samples] Added Jet vs. java.util.stream benchmark to the code samples.

  • [code-samples] Added new Hadoop word count example.

  • [code-samples] Added Kafka consumer example.

Fixes

  • [core] Remove dead and potentially racy code from async operations. #320

  • [core] ReadSocketTextStreamP should emit items immediately after reading. #335

  • [core] Wrong configuration file is used for IMDG when a Jet config file is present in the classpath. #345

  • [java.util.stream] Do not require forEach action to be serializable. #340, #341