Welcome to the Hazelcast Jet Release Notes. This document includes the new features, enhancements and fixed issues for Hazelcast Jet releases. Select a release number in the table of contents.

The linked numbers refer to the issue numbers in Hazelcast Jet GitHub repository. The labels in the square brackets refer to the respective Hazelcast Jet module.

0.7

New Features

  • [pipeline-api] Added a source builder to build custom sources. #989

  • [pipeline-api] Added JDBC connector to enable Jet reading from/writing to JDBC source. #931

  • [pipeline-api] Added Avro connector to work with Avro files. #890, #925

  • [pipeline-api] Added convenient API to enrich a stream from IMap and Replicated Map. #924

  • [pipeline-api] Added map/flatMap/filterUsingContext for keyed streams. #892

  • [pipeline-api] Introduced new merge and distinct operations and a simpler co-aggregation. #885

  • [pipeline-api] Added JMS connector to enable Jet reading from/writing to JMS provider. #874

  • [core] Added full job elasticity with graceful shutdown, and suspend and resume. #946

  • [core] Added support for exposing metrics through JMX. #926

Enhancements

  • [pipeline-api] Added Tuple4 and Tuple5. #964

  • [pipeline-api] Added the method AggregateOperation.andThen. #959

  • [pipeline-api] Added exception throwing methods to distributed lambdas. #951

  • [pipeline-api] The method mapToOutputFn in aggregations now supports returning nulls. #938

  • [pipeline-api] File sources and sinks have been converted to use builders for more convenience. #913

  • [pipeline-api] Added a new String parameter name to Sinks.builder. #894

  • [pipeline-api] Added support for DOT file visualization for DAG and Pipeline API. #891

  • [core] Added JetClientConfig which uses the default Jet group name and password. #950

  • [core] Logs now provide job names. #939

  • [core] Improved Processor`s so that they are now closed as soon as the processor is completed. Also, `jobId(), executionId() and jobConfig() are now available on Processor.Context. #934

  • [core] Improved Watermark diagnostics logging. #872

Fixes

  • [pipeline-api] Fixed the method mapWithMerging which should have not used a stateful flush function. #968

  • [pipeline-api] Fixed the issue of causing a window shift when using the results of a window aggregation in another window. #898

  • [kafka] StreamKafkaP.close() now handles InterruptException. #870

  • [java.util.stream] Removed java.util.stream implementation. It has been superseded by the Pipeline API. #983

  • [spring] Spring context is now injected to the wrapped processors as well to enable the @SpringAware annotation to be processed for sinks. #999

  • [spring] An IllegalArgumentException was thrown because the hazelcast-jet-spring module was adding a dependency to Hazelcast when used with Spring boot. This has been fixed so that hazelcast-jet-spring works out of the box. #977

Breaking Changes

  • [pipeline-api] Simplified the API for co-aggregation.

  • [pipeline-api] Replaced file sources and sinks with a builder API through Sources.filesBuilder().

  • [pipeline-api] SinkBuilder now requires a name and takes Processor.Context as input in createFn.

  • [core] Removed AbstractProcessor.setCooperative.

  • [core] Removed tempDir from config as it was unused.

  • [core] Processor.close() no longer receives the exception for the job.

0.6.1

New Features

  • [core] Add support for configuring a custom classloader per job. #770

Enhancements

  • [pipeline-api] Add a noop sink. #827

  • [core] Utility methods to load JetConfig from XML file or resource directly. #856

  • [core] Update Hazelcast IMDG references to "3.10". #833

  • [core] Optimize in-memory layout of snapshot data: reduced memory usage by up to 75%. #822

  • [core] Idle CPU usage improvements when there’s a large number of non-cooperative processors. #816

Fixes

  • [pipeline-api] SinkBuilder shouldn’t call destroyFn if createFn wasn’t called #826

  • [pipeline-api] Fix generic output type for aggregate operations with a custom mapToOutputFn. #823

  • [core] Don’t shrink receive window during lulls which can lead to reduced throughput. #798

  • [kafka] Kafka source should handle InterruptException in all cases. #824

  • [hadoop] When matching HDFS hostnames against member hostnames, use alternative ways than just reverse DNS. #858

0.6

New Features

  • [pipeline-api] Windowing support and overhaul of Pipeline API to work with streaming sources. #731

  • [pipeline-api] New updating and merging map sinks which update the value in place. #621

  • [pipeline-api] Custom sink builder support for Pipeline API. #762

  • [core] Improved job lifecycle features such as naming jobs and getting jobs by name. #666

  • [core] Ability to restart jobs manually to make use of newly joined members. #693

  • [spring] Add support for annotation and XML based spring configuration. #757

Enhancements

  • [pipeline-api] Improve the method AggregateOperations.allOf() so that is type-safe.

  • [pipeline-api] Predictable vertex names when converting Pipelines to DAG. #616

  • [pipeline-api] Add support for stateful transform operations in the pipeline API. #744

  • [pipeline-api] Add ability to set parallelism and name of each stage separately. #731

  • [pipeline-api] Add generic Predicate factory methods. #695

  • [core] Support for idle streaming sources through a configurable idle timeout where idle processors are excluded from Watermark coalescing. #660

  • [core] Add support for initializing processors with Hazelcast Managed Context. #645

  • [core] Improve watermark coalescing and processing. #649

  • [core] Make parallelism information available to Processor.Context. #754

  • [core] Include Hazelcast Javadocs in Jet distribution. #652

  • [core] Add Java 9 build support and automatic module names. #720

  • [core] Add ability to emit file name from file sources. #729

  • [core] Several Javadoc improvements.

  • [java.util.stream] New DistributedStream instances now can work with Pipeline API sources.

  • [kafka] Kafka version updated to 1.0.0.

  • [hadoop] Hadoop version updated to 2.8.3.

Fixes

  • [core] Avoid race between job cancellation and completion. #604

  • [core] Do not send a join request for a job unless Job.join() is explicitly called. #628

  • [core] When master node leaves cluster, job would restart even if auto-restart was disabled. #683

  • [core] When using map source, returning null in a projection would cause rest of the items to be dropped. #669

  • [core] When DAG fails to deserialize on job submission from client, the exception was not returned to user. #794

  • [core] Fix for ClassNotFoundException when restoring from snapshot when the key class is not on Jet classpath. #778

  • [core] Assert that always same item offered to Outbox after a rejection. #808

  • [kafka] Kafka sink should ensure that all records are sent before snapshot. #611

  • [kafka/hadoop] Package names reorganized to have better support for module systems.

Technical Changes

  • [pipeline-api] Grouping and aggregation in Pipeline API now are done through the methods .groupingKey() and .aggregate().

  • [pipeline-api] Package names for Pipeline API have been reorganized.

  • [core] Job cancellation can now only be done through Job.cancel() rather than Job.getFuture().cancel().

  • [core] IStreamMap, IStreamList and IStreamCache have been renamed to IMapJet, IListJet and ICacheJet, respectively.

  • [core] Dedicated tryProcessWatermark() method is now responsible for processing Watermarks in a processor and watermarks will be forwarded automatically once it returns true.

  • [core] start-jet.sh, submit-jet.sh and other scripts have been renamed to have prefix "jet" instead.

  • [core] CloseableProcessorSupplier has been replaced with a close() method on the Processor instead.

  • [core] complete methods on ProcessorMetaSupplier and ProcessorSupplier have been renamed to close().

  • [java.util.stream] DistributedStream creation have been moved to DistributedStream class from Map and List instances. We strongly suggest to switch to Pipeline API instead of java.util.stream API.

0.5.1

Fixes

  • [pipeline-api] Generated vertex names are now deterministic instead of random. #623

  • [pipeline-api] Vertex names for sources and sinks should now be more explicit. #623

  • [core] JetInstance.getJobs() now only sends a join job request when getFuture() is called. #633

  • [core] When configuring internal Jet maps, do not inherit default map configuration from Hazelcast. #629

  • [core] Update schema version in default Hazelcast configuration XML file. #627

  • [core] When peeking input and output of a processor, the logger name of the peeked processor should be used. #617

  • [core] Fix race condition when calling ProcessorSupplier.complete() during job cancellation. #604

  • [kafka] Kafka writer should flush when taking a snapshot. #624

  • [javadoc] Replace Javadoc for com.hazelcast.jet.function and com.hazelcast.jet.stream with links to JDK docs. #636

  • [javadoc] AggregateOperations Javadoc updates. #635

0.5

New Features

  • [core] Support for in-memory snapshots and at-least and exactly-once processing. #500

  • [core] New Pipeline API with support for group by, hash join and co group. #497

  • [core] Improved Jet job lifecycle management including auto-restart. #492

  • [core] Event journal streaming source for Map and Cache. #487

Enhancements

  • [core] Add ProcessorMetaSupplier.complete() method to support global cleanup. #571

  • [core] Socket source now uses non-blocking IO. #554

  • [core] Add preferred local parallelism to all the sources and sinks. #552

  • [core] InsertWatermarksProcessor will now log late events. #551

  • [core] Add projection and predicate capability to map reader. #490

  • [core] Hazelcast version is updated to 3.9. #474

  • [java.util.stream] Support for creating a stream from any source. #360

  • [kafka] Add ability to create own ProducerRecord instances for Kafka writer. #588

  • [kafka] When a job, which includes Kafka source, is cancelled, StreamKafkaP throws InterruptedException. This exception should be prevented or it should give a descriptive log. #570

  • [kafka] Add optional projection function to Kafka reader. #534

  • [kafka] Add snapshotting support for Kafka reader. #500

Code Samples Improvements

  • Overall module reorganization roughly divided to code samples using Pipeline API and Core API.

  • Several examples migrated to use the new Pipeline API.

  • New examples using hash join enrichment and co group.

  • New code samples illustrating the new event journal and fault tolerance features.

  • New code samples showing how to use HDFS with java.util.stream.

Fixes

  • [core] The method Job.getJobStatus() does not reflect actual status during cancellation. #580

  • [core] Add missing serializer for Session type. #562

  • [core] The method Traversers.traverseStream() should close the stream when exhausted. #560

  • [core] Kafka source should throw when partition count is less than global parallelism. #559

  • [core] Suppress exception for flow control packet when the member has already left the cluster. #556

  • [core] Fix the ignored charset in StreamSocketP. #553

  • [core] JetClassLoader should not throw an exception on the method findResource(). #549

  • [core] Add cache manager in order not to require a dependency on javax.cache. #485

  • [core] Fix NPE when releasing uninitialized ResourceStore. #478

  • [core] Replace ReadFileStreamP with batch file processor. #373

  • [core] ReadFileStreamP should respect Outbox limit. #337

  • [kafka] When init phase for the job fails, StreamKafkaP throws NPE. There should be a null check. #568

  • [hdfs] Making sure a proper global cleanup is performed on job cancellation. #572

0.4

New Features

  • [core] Infinite stream processing with event-time semantics using watermarks.

  • [core] Out of the box processors for tumbling, sliding and session window aggregations.

  • [core] New AggregateOperation abstraction with several out of the box aggregators.

  • [core] Bootstrapper script for submitting a JAR and running a DAG from it. #421

  • [core] Sources and Sinks reorganized into Sources and Sinks static classes.

  • [core] Added Sources.streamSocket() and Sinks.writeSocket() for reading from and writing to sockets. #358, #363

  • [core] Added Sources.readFile(), Sources.streamFile() and Sources.writeFile() for reading and writing files. #373, #376

  • [core] Added Sources.readCache() and Sinks.writeCache() for reading from and writing to Hazelcast ICache. #357

  • [core] Added diagnostic processors for debugging. #426

  • [java.util.stream] java.util.stream support for ICache. #360

  • [pcf] Hazelcast Jet tile for Pivotal Cloud Foundry was released.

Enhancements

  • [core] Hazelcast IMDG updated to 3.8.2 and is now shaded in the Jet JAR. #367

  • [core] Blocking outbox support for non-cooperative processors. #372, #375

  • [core] JobFuture added to Processor.Context. #381

  • [core] Processor.Context.index() renamed to globalProcessorIndex() and now returns a global index. #405

  • [core] Fairer distribution of tasklets when number of cores is larger than number of tasklets in a job. #443

  • [java.util.stream] Added configuration option to DistributedStream. #359

Code Samples Improvements

  • Sliding window Stock exchange streaming sample. #19, #21

  • Hazelcast IMap/IList/ICache sample. #16

  • PCF-integration with Spring Boot sample. #17

  • Socket reader and writer examples with Netty. #15

  • Read and write files example. #24

  • Top N streaming example. #27

  • HashJoin and CoGroup samples added. #28, #29

Fixes

  • [core] Missing return value for JobConfig.addJar(). #393

  • [core] Improve reliability of ExecutionService on shutdown. #409

  • [core] Metaspace memory was not recovered fully when custom class loader is used. #466

0.3.1

New Features

  • [core] The Jet cluster can be scaled up without affecting running jobs. Please refer to the Elasticity section section in Hazelcast Jet Reference Manual for details.

Enhancements

  • [core] Improved DAG.toString. #310

  • [core] Convenience to make any processor non-cooperative. #321

  • [core] Updated Hazelcast version to 3.8. #323

  • [core] Added missing functional interfaces to the Distributed class. #324

  • [java.util.stream] Refactoring of Jet collectors into the new type Reducer. #313

  • [java.util.stream] Allow branching of the j.u.s pipeline. #318

  • [hadoop] Added support for reading and writing non-text data from or to HDFS. Please refer to the hazelcast-jet-hadoop documentation for details.

  • [hadoop] Added key/value mappers for ReadHdfsP and WriteHdfsP. #328

  • [kafka] Kafka connector now makes use of consumer groups. Please refer to the hazelcast-jet-kafka documentation.

Code Samples Improvements

  • [code-samples] Added Jet vs. java.util.stream benchmark to the code samples.

  • [code-samples] Added new Hadoop word count example.

  • [code-samples] Added Kafka consumer example.

Fixes

  • [core] Remove dead and potentially racy code from async operations. #320

  • [core] ReadSocketTextStreamP should emit items immediately after reading. #335

  • [core] Wrong configuration file is used for IMDG when Jet config file present in classpath. #345

  • [java.util.stream] Do not require forEach action to be serializable. #340, #341