Hazelcast IMDG Reference Manual

Version 3.11-SNAPSHOT

Preface

Welcome to the Hazelcast IMDG (In-Memory Data Grid) Reference Manual. This manual includes concepts, instructions and samples to guide you on how to use Hazelcast and build Hazelcast IMDG applications.

As the reader of this manual, you must be familiar with the Java programming language and you should have installed your preferred Integrated Development Environment (IDE).

Hazelcast IMDG Editions

This Reference Manual covers all editions of Hazelcast IMDG. Throughout this manual:

  • Hazelcast or Hazelcast IMDG refers to the open source edition of Hazelcast in-memory data grid middleware. Hazelcast is also the name of the company (Hazelcast, Inc.) providing the Hazelcast product.

  • Hazelcast IMDG Enterprise is a commercially licensed edition of Hazelcast IMDG which provides high-value enterprise features in addition to Hazelcast IMDG.

  • Hazelcast IMDG Enterprise HD is a commercially licensed edition of Hazelcast IMDG which provides High-Density (HD) Memory Store and Hot Restart Persistence features in addition to Hazelcast IMDG Enterprise.

Hazelcast IMDG Architecture

You can see the features for all Hazelcast IMDG editions in the following architecture diagram.

[Figure: Hazelcast Architecture]

The small "HD" boxes in the above diagram mark the features that can use the High-Density (HD) Memory Store when it is available. That is, if you have Hazelcast IMDG Enterprise HD, you can use those features with the HD Memory Store.

For more information on Hazelcast IMDG’s Architecture, please see the white paper An Architect’s View of Hazelcast.

Hazelcast IMDG Plugins

You can extend Hazelcast IMDG’s functionality by using its plugins. These plugins have their own lifecycles. Please see the Plugins page to learn about the Hazelcast plugins you can use. Hazelcast plugins are marked with a Plugin label throughout this manual. See also the Hazelcast Plugins chapter for more information.

Licensing

Hazelcast IMDG and the Hazelcast Reference Manual are free and provided under the Apache License, Version 2.0. Hazelcast IMDG Enterprise and Hazelcast IMDG Enterprise HD are commercially licensed by Hazelcast, Inc.

For more detailed information on licensing, please see the License Questions appendix.

Trademarks

Hazelcast is a registered trademark of Hazelcast, Inc. All other trademarks in this manual are held by their respective owners.

Customer Support

Support for Hazelcast is provided via GitHub, Mail Group and StackOverflow.

For information on the commercial support for Hazelcast IMDG and Hazelcast IMDG Enterprise, please see hazelcast.com.

Release Notes

Please refer to the Release Notes document for the new features, enhancements and fixes performed for each Hazelcast IMDG release.

Contributing to Hazelcast IMDG

You can contribute to the Hazelcast IMDG code, report a bug, or request an enhancement. Please see the following resources.

Partners

Hazelcast partners with leading hardware and software technologies, system integrators, resellers and OEMs, including Amazon Web Services, Vert.x, Azul Systems and C2B2. Please see the Partners page for the full list of our partners and information on them.

Phone Home

Hazelcast uses phone home data to learn about usage of Hazelcast IMDG.

Hazelcast IMDG member instances call our phone home server initially when they are started and then every 24 hours. This applies to all the instances joined to the cluster.

What is sent in?

The following information is sent in a phone home:

  • Hazelcast IMDG version

  • Local Hazelcast IMDG member UUID

  • Download ID

  • A hash value of the cluster ID

  • Cluster size bands for 5, 10, 20, 40, 60, 100, 150, 300, 600 and > 600

  • Number of connected clients bands of 5, 10, 20, 40, 60, 100, 150, 300, 600 and > 600

  • Cluster uptime

  • Member uptime

  • Environment Information:

    • Name of operating system

    • Kernel architecture (32-bit or 64-bit)

    • Version of operating system

    • Version of installed Java

    • Name of Java Virtual Machine

  • Hazelcast IMDG Enterprise specific:

    • Number of clients by language (Java, C++, C#)

    • Flag for Hazelcast Enterprise

    • Hash value of license key

    • Native memory usage

  • Hazelcast Management Center specific:

    • Hazelcast Management Center version

    • Hash value of Hazelcast Management Center license key
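The cluster size and client count in the list above are reported as bands rather than exact numbers. The following is a minimal sketch of such banding. The class `PhoneHomeBands`, the label format, and the choice to treat each limit as inclusive are assumptions made for illustration; the manual does not describe how the phone home code assigns a count to a band.

```java
class PhoneHomeBands {
    // Band limits as listed in this section.
    private static final int[] LIMITS = {5, 10, 20, 40, 60, 100, 150, 300, 600};

    // Returns a label such as "<=5" or ">600" for a cluster size or client count.
    // Assumption: each limit is inclusive, so only counts above 600 reach the last band.
    static String band(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative");
        }
        for (int limit : LIMITS) {
            if (count <= limit) {
                return "<=" + limit;
            }
        }
        return ">600";
    }

    public static void main(String[] args) {
        System.out.println(band(3));    // a three-member cluster
        System.out.println(band(42));   // 42 connected clients
        System.out.println(band(1000)); // a very large cluster
    }
}
```

With this kind of banding, only the band is shared, not the exact size of the cluster or the exact number of clients.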

Phone Home Code

The phone home code itself is open source. Please see here.

Disabling Phone Homes

Set the hazelcast.phone.home.enabled system property to false either in the config or on the Java command line. Please see the System Properties section for information on how to set a property.

You can also disable the phone home using the environment variable HZ_PHONE_HOME_ENABLED. Simply add the following line to your .bash_profile:

export HZ_PHONE_HOME_ENABLED=false
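As a sketch of the system property route for an embedded member, the property can also be set from application code, which has the same effect as passing `-Dhazelcast.phone.home.enabled=false` on the Java command line. The class name `DisablePhoneHome` is hypothetical, and the sketch assumes the property is set before any Hazelcast member is created in the same JVM.

```java
class DisablePhoneHome {
    static final String PROPERTY = "hazelcast.phone.home.enabled";

    // Sets the JVM-wide system property named in this section.
    // Assumption: this runs before the first Hazelcast member is started.
    static void apply() {
        System.setProperty(PROPERTY, "false");
    }

    public static void main(String[] args) {
        apply();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
        // The member would be started after this point.
    }
}
```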

Phone Home URLs

For versions 1.x and 2.x: http://www.hazelcast.com/version.jsp.

For versions 3.x up to 3.6: http://versioncheck.hazelcast.com/version.jsp.

For versions after 3.6: http://phonehome.hazelcast.com/ping.

1. Document Revision History

This chapter lists the changes made to this document from the previous release.

Please refer to the Release Notes for the new features, enhancements and fixes performed for each Hazelcast release.

Table 1. Revision History

| Chapter | Section | Description |
|---|---|---|
| Getting Started | | Added Running in Modular Java as a new section. |
| Setting Up Clusters | Partition Group Configuration | Added a note to the ZONE_AWARE type content suggesting to have equal number of members in each Availability Zone. |
| Distributed Data Structures | Reliable Topic | Enhanced the content to clarify the relation between Reliable Topic and Ringbuffer data structures. |
| Distributed Data Structures | Map | Added content for the newly introduced maxIdle parameter for the put operation. See the Evicting Specific Entries section. |
| Distributed Query | | Added Using "this" as an Attribute as a new section to explain how you can use the keyword "this" in your queries. |
| Hazelcast JCache | Defining a Custom ExpiryPolicy | Updated the content to mention the method setExpiryPolicy which associates certain keys with custom expiry policies. |
| Hazelcast JCache | JCache API | Added content related to eager expiration. |
| Hazelcast Clients | Java Client | Added Configuring Client Connection Retry as a new section. |
| Serialization | | Added Untrusted Deserialization Protection as a new section. |
| Management | | Added Map Index Statistics as a new section describing how to access map index statistics. |
| Management | Diagnostics | Added content for the new OperationThreadSamples plugin. |
| Security | TLS/SSL | Performed a full review and updated the content to include the new properties keyStoreType and trustStoreType. |
| Security | | Added Using BoringSSL as a new section. |
| WAN Replication | | Added URL information to synchronize all the maps in source and target clusters. |
| WAN Replication | Configuring Consumer and Event Filtering API | Added content related to the new event type LOADED and configuration element persist-wan-replicated-data. |
| WAN Replication | | Added the REST URL for clearing the event queues. See the Queue Capacity section. |
| Hazelcast Plugins | | Added as a new chapter to describe the plugins with which you can extend Hazelcast IMDG’s functionality. |
| System Properties | | Added definitions for the following system properties: hazelcast.socket.buffer.direct |


2. Getting Started

This chapter explains how to install Hazelcast and start a Hazelcast member and client. It describes the executable files in the download package and also provides the fundamentals for configuring Hazelcast and its deployment options.

2.1. Installation

The following sections explain the installation of Hazelcast IMDG and Hazelcast IMDG Enterprise. They also include notes and changes to consider when upgrading Hazelcast.

2.1.1. Installing Hazelcast IMDG

You can find Hazelcast in standard Maven repositories. If your project uses Maven, you do not need to add additional repositories to your pom.xml or add the hazelcast-<version>.jar file to your classpath (Maven does that for you). Just add the following lines to your pom.xml:

<dependencies>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast</artifactId>
      <version>Hazelcast IMDG Version To Be Installed</version>
    </dependency>
</dependencies>

As an alternative, you can download and install Hazelcast IMDG yourself. You only need to:

  • Download the package hazelcast-<version>.zip or hazelcast-<version>.tar.gz from hazelcast.org.

  • Extract the downloaded hazelcast-<version>.zip or hazelcast-<version>.tar.gz.

  • Add the file hazelcast-<version>.jar to your classpath.

2.1.2. Installing Hazelcast IMDG Enterprise

There are two Maven repositories defined for Hazelcast IMDG Enterprise:

<repository>
   <id>Hazelcast Private Snapshot Repository</id>
   <url>https://repository-hazelcast-l337.forge.cloudbees.com/snapshot/</url>
</repository>
<repository>
   <id>Hazelcast Private Release Repository</id>
   <url>https://repository-hazelcast-l337.forge.cloudbees.com/release/</url>
</repository>

Hazelcast IMDG Enterprise customers may also define dependencies, a sample of which is shown below.

<dependency>
   <groupId>com.hazelcast</groupId>
   <artifactId>hazelcast-enterprise</artifactId>
   <version>Hazelcast IMDG Enterprise Version To Be Installed</version>
</dependency>
<dependency>
   <groupId>com.hazelcast</groupId>
   <artifactId>hazelcast-enterprise-all</artifactId>
   <version>Hazelcast IMDG Enterprise Version To Be Installed</version>
</dependency>

2.2. Setting the License Key

Hazelcast IMDG Enterprise offers you two types of licenses: Enterprise and Enterprise HD. The supported features differ in your Hazelcast setup according to the license type you own.

  • Enterprise license: In addition to the open source edition of Hazelcast, Enterprise features are the following:

    • Security

    • WAN Replication

    • Clustered REST

    • Clustered JMX

    • Striim Hot Cache

    • Rolling Upgrades

  • Enterprise HD license: In addition to the Enterprise features, Enterprise HD features are the following:

    • High-Density Memory Store

    • Hot Restart Persistence

To use Hazelcast IMDG Enterprise, you need to set the provided license key using one of the configuration methods shown below.

Hazelcast IMDG Enterprise license keys are required only for members. You do not need to set a license key for your Java clients for which you want to use IMDG Enterprise features.

Declarative Configuration:

Add the line below anywhere inside the <hazelcast> element of the file hazelcast.xml. This XML file offers you a declarative way to configure Hazelcast. It is included in the Hazelcast download package. When you extract the downloaded package, you will see the file hazelcast.xml under the /bin directory.

<hazelcast>
  ...
  <license-key>Your Enterprise License Key</license-key>
  ...
</hazelcast>

Programmatic Configuration:

Alternatively, you can set your license key programmatically as shown below.

Config config = new Config();
config.setLicenseKey( "Your Enterprise License Key" );

Spring XML Configuration:

If you are using Spring with Hazelcast, then you can set the license key using the Spring XML schema, as shown below.

<hz:config>
  ...
  <hz:license-key>Your Enterprise License Key</hz:license-key>
  ...
</hz:config>

JVM System Property:

As another option, you can set your license key using the below command (the "-D" command line option).

-Dhazelcast.enterprise.license.key=Your Enterprise License Key

2.2.1. License Key Format

License keys have the following format:

<Name of the Hazelcast edition>#<Count of the Members>#<License key>

The strings before the <License key> are the human readable part. You can use your license key with or without this human readable part. So, both of the following example license keys are valid:

HazelcastEnterpriseHD#2Nodes#1q2w3e4r5t
1q2w3e4r5t
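To make the format concrete, the following sketch separates the key from the optional human readable prefix. The class `LicenseKeyFormat` and its method are hypothetical helpers, not part of the Hazelcast API, and the sketch assumes the key part itself never contains the `#` character.

```java
class LicenseKeyFormat {
    // Returns the text after the last '#', or the whole string when there is no '#'.
    // Assumption: the key part itself contains no '#'.
    static String keyPart(String licenseKey) {
        String trimmed = licenseKey.trim();
        int lastSeparator = trimmed.lastIndexOf('#');
        return lastSeparator < 0 ? trimmed : trimmed.substring(lastSeparator + 1);
    }

    public static void main(String[] args) {
        // Both example forms from this section yield the same key part.
        System.out.println(keyPart("HazelcastEnterpriseHD#2Nodes#1q2w3e4r5t"));
        System.out.println(keyPart("1q2w3e4r5t"));
    }
}
```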

2.3. Running in Modular Java

Project Jigsaw introduced the Java Module System in Java 9. Hazelcast supports running in the modular environment. If you want to run your application with Hazelcast libraries on the modulepath, use the following module names:

  • com.hazelcast.core for hazelcast-<version>.jar and hazelcast-enterprise-<version>.jar

  • com.hazelcast.client for hazelcast-client-<version>.jar and hazelcast-enterprise-client-<version>.jar

Don’t use hazelcast-all-<version>.jar or hazelcast-enterprise-all-<version>.jar on the modulepath as it could lead to problems in module dependencies for your application. You can still use them on the classpath.

The Java Module System comes with stricter visibility rules. This affects Hazelcast, which uses internal Java APIs to reach the best performance results.

To work properly, Hazelcast needs the java.se module and access to the following Java packages:

  • java.base/jdk.internal.ref

  • java.base/java.nio (reflective access)

  • java.base/sun.nio.ch (reflective access)

  • java.base/java.lang (reflective access)

  • jdk.management/com.sun.management.internal (reflective access)

  • java.management/sun.management (reflective access)

You can provide access to the packages listed above by using the --add-exports and --add-opens (for reflective access) Java arguments.

Example: Running a member on the classpath

java --add-exports java.base/jdk.internal.ref=ALL-UNNAMED \
  --add-opens java.base/java.lang=ALL-UNNAMED \
  --add-opens java.base/java.nio=ALL-UNNAMED \
  --add-opens java.base/sun.nio.ch=ALL-UNNAMED \
  --add-opens java.management/sun.management=ALL-UNNAMED \
  --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED \
  -jar hazelcast-<version>.jar

Example: Running a member on the modulepath

java --add-exports java.base/jdk.internal.ref=com.hazelcast.core \
  --add-opens java.base/java.lang=com.hazelcast.core \
  --add-opens java.base/java.nio=com.hazelcast.core \
  --add-opens java.base/sun.nio.ch=com.hazelcast.core \
  --add-opens java.management/sun.management=com.hazelcast.core \
  --add-opens jdk.management/com.sun.management.internal=com.hazelcast.core \
  --module-path lib \
  --add-modules java.se \
  --module com.hazelcast.core/com.hazelcast.core.server.StartServer

This example expects hazelcast-<version>.jar to be placed in the lib directory.
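To verify that the arguments above took effect, a small diagnostic can query the standard `java.lang.Module` API at runtime (Java 9 or newer). This is a sketch only: the class `ModuleAccessCheck` is hypothetical, and it checks access for whatever module is passed in. When run from the classpath that is the unnamed module; to check the modulepath setup you would pass the `com.hazelcast.core` module instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class ModuleAccessCheck {
    // {module, package, kind} for each package listed in this section.
    private static final String[][] REQUIRED = {
        {"java.base", "jdk.internal.ref", "exports"},
        {"java.base", "java.lang", "opens"},
        {"java.base", "java.nio", "opens"},
        {"java.base", "sun.nio.ch", "opens"},
        {"java.management", "sun.management", "opens"},
        {"jdk.management", "com.sun.management.internal", "opens"},
    };

    // Reports, per package, whether it is exported or opened to the target module.
    static Map<String, Boolean> check(Module target) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String[] entry : REQUIRED) {
            Optional<Module> source = ModuleLayer.boot().findModule(entry[0]);
            boolean accessible = false;
            if (source.isPresent()) {
                accessible = "opens".equals(entry[2])
                        ? source.get().isOpen(entry[1], target)
                        : source.get().isExported(entry[1], target);
            }
            result.put(entry[0] + "/" + entry[1], accessible);
        }
        return result;
    }

    public static void main(String[] args) {
        // Checks access for the module this class was loaded into.
        check(ModuleAccessCheck.class.getModule())
                .forEach((pkg, ok) -> System.out.println(pkg + " accessible: " + ok));
    }
}
```

Running it once with and once without the `--add-exports` and `--add-opens` arguments shows which packages each argument makes accessible.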

2.4. Upgrading from 3.x

  • Upgrading from 3.6.x to 3.7.x when using JCache: Hazelcast 3.7 introduced changes in JCache implementation which broke compatibility of 3.6.x clients to 3.7-3.7.2 cluster members and vice versa, so 3.7-3.7.2 clients are also incompatible with 3.6.x cluster members. This issue only affects Java clients which use JCache functionality.

    Starting with Hazelcast 3.7.3, a compatibility option is provided which can be used to ensure backwards compatibility with 3.6.x clients.

    In order to upgrade a 3.6.x cluster and clients to 3.7.3 (or later), you will need to use this compatibility option on either the member or the client side, depending on which one is upgraded first:

    • first upgrade your cluster members to 3.7.3, adding property hazelcast.compatibility.3.6.client=true to your configuration; when started with this property, cluster members are compatible with 3.6.x and 3.7.3+ clients but not with 3.7-3.7.2 clients. Once your cluster is upgraded, you may upgrade your applications to use client version 3.7.3+.

    • upgrade your clients from 3.6.x to 3.7.3, adding property hazelcast.compatibility.3.6.server=true to your Hazelcast client configuration. A 3.7.3 client started with this compatibility option is compatible with 3.6.x and 3.7.3+ cluster members but incompatible with 3.7-3.7.2 cluster members. Once your clients are upgraded, you may then proceed to upgrade your cluster members to version 3.7.3 or later.

      You may use any of the supported ways as described in the System Properties section to configure the compatibility option. When done upgrading your cluster and clients, you may remove the compatibility property from your Hazelcast member configuration.

  • Upgrading from 3.6.x to 3.8.x EE when using JCache: Due to a compatibility problem, CacheConfig serialization may not work if your member is 3.8.x where x < 5. Hence, you need to use version 3.8.5 or higher, in which the problem is fixed.

  • Introducing the spring-aware element: Before the release 3.5, Hazelcast used SpringManagedContext to scan SpringAware annotations by default. This could cause some performance overhead for users who do not use SpringAware. This behavior changed with the release of Hazelcast 3.5: SpringAware annotations are disabled by default, and you can enable them by adding the <hz:spring-aware /> tag to the configuration. Please see the Spring Integration section.

  • Introducing new configuration options for WAN replication: Starting with Hazelcast 3.6, WAN replication related system properties, which are configured on a per member basis, can now be configured per target cluster. The 4 system properties below are no longer valid.

  • Removal of deprecated getId() method: The method getId() in the interface DistributedObject has been removed. Please use the method getName() instead.

  • Change in the Custom Serialization in the C++ Client Distribution: Before, the method getTypeId() was used to retrieve the ID of the object to be serialized. Now, the method getHazelcastTypeId() is used, and you pass your object as a parameter to this new method. The getTypeId() method in your custom serializer class has also been renamed to getHazelcastTypeId(). Note that these changes also apply when you switch from Hazelcast 3.6.1 to 3.6.2.

  • Important note about Hazelcast System Properties: Although Hazelcast has not recommended using the GroupProperties.java class to work with system properties, users of this class should note the following change. Starting with Hazelcast 3.7, the class GroupProperties.java has been replaced by GroupProperty.java. In this new class, system properties are instances of the newly introduced HazelcastProperty object. You can access the names of these properties by calling the getName() method of HazelcastProperty.

  • Removal of WanNoDelayReplication: WanNoDelayReplication implementation of Hazelcast’s WAN Replication has been removed starting with Hazelcast 3.7. You can still achieve this behavior by setting the batch size to 1 while configuring the WanBatchReplication. Please refer to the Defining WAN Replication section for more information.

  • Introducing <wan-publisher> element: Starting with Hazelcast 3.8, the configuration element <target-cluster> is replaced with the element <wan-publisher> in WAN replication configuration.

  • WaitNotifyService interface has been renamed as OperationParker.

  • Synchronizing WAN Target Cluster: Starting with Hazelcast 3.8 release, the URL for the REST call has been changed from http://member_ip:port/hazelcast/rest/wan/sync/map to http://member_ip:port/hazelcast/rest/mancenter/wan/sync/map.
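The REST URL change in the last item above can be sketched as a small helper that builds either form of the URL. The class `WanSyncUrl`, its parameters, and the example address are hypothetical; only the two URL paths come from this section, and the sketch does not cover the request itself.

```java
class WanSyncUrl {
    // Builds the WAN map synchronization URL for the given member address.
    // useMancenterPath: true for Hazelcast 3.8 and later, false for earlier releases.
    static String forMember(String memberIp, int port, boolean useMancenterPath) {
        String prefix = useMancenterPath ? "/hazelcast/rest/mancenter" : "/hazelcast/rest";
        return "http://" + memberIp + ":" + port + prefix + "/wan/sync/map";
    }

    public static void main(String[] args) {
        // 10.0.0.1:5701 is a placeholder member address.
        System.out.println(forMember("10.0.0.1", 5701, false));
        System.out.println(forMember("10.0.0.1", 5701, true));
    }
}
```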

2.5. Upgrading from 2.x

  • Removal of deprecated static methods: The static methods of Hazelcast class reaching Hazelcast data components have been removed. The functionality of these methods can be reached from the HazelcastInstance interface. You should replace the following:

    Map<Integer, String> customers = Hazelcast.getMap( "customers" );

    with

    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    // or if you already started an instance named "instance1"
    // HazelcastInstance hazelcastInstance = Hazelcast.getHazelcastInstanceByName( "instance1" );
    Map<Integer, String> customers = hazelcastInstance.getMap( "customers" );
  • Renaming "instance" to "distributed object": Before 3.0 there was confusion about the term "instance": it was used for both the cluster members and the distributed objects (map, queue, topic, etc. instances). Starting with Hazelcast 3.0, the term instance will be only used for Hazelcast instances, namely cluster members. We will use the term "distributed object" for map, queue, etc. instances. You should replace the related methods with the new renamed ones. 3.0 clients are smart clients in that they know in which cluster member the data is located, so you can replace your lite members with native clients.

    public static void main( String[] args ) throws InterruptedException {
      HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
      IMap map = hazelcastInstance.getMap( "test" );
      Collection<Instance> instances = hazelcastInstance.getInstances();
      for ( Instance instance : instances ) {
        if ( instance.getInstanceType() == Instance.InstanceType.MAP ) {
          System.out.println( "There is a map with name: " + instance.getId() );
        }
      }
    }

    with

    public static void main( String[] args ) throws InterruptedException {
      HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
      IMap map = hazelcastInstance.getMap( "test" );
      Collection<DistributedObject> objects = hazelcastInstance.getDistributedObjects();
      for ( DistributedObject distributedObject : objects ) {
        if ( distributedObject instanceof IMap ) {
          System.out.println( "There is a map with name: " + distributedObject.getName() );
        }
      }
    }
  • Package structure change: PartitionService has been moved to package com.hazelcast.core from com.hazelcast.partition.

  • Listener API change: Before 3.0, removeListener methods took the listener object as a parameter. This caused confusion because the same listener object may be used as a parameter for different listener registrations. So we have changed the listener API: addListener methods return a unique ID and you can remove a listener by using this ID. So you should do the following replacement if needed:

    IMap map = hazelcastInstance.getMap( "map" );
    map.addEntryListener( listener, true );
    map.removeEntryListener( listener );

    with

    IMap map = hazelcastInstance.getMap( "map" );
    String listenerId = map.addEntryListener( listener, true );
    map.removeEntryListener( listenerId );
  • IMap changes:

    • tryRemove(K key, long timeout, TimeUnit timeunit) returns boolean indicating whether operation is successful.

    • tryLockAndGet(K key, long time, TimeUnit timeunit) is removed.

    • putAndUnlock(K key, V value) is removed.

    • lockMap(long time, TimeUnit timeunit) and unlockMap() are removed.

    • getMapEntry(K key) is renamed as getEntryView(K key). The returned object’s type, MapEntry class is renamed as EntryView.

    • There are no predefined names for merge policies. You just give the full class name of the merge policy implementation:

      <merge-policy>com.hazelcast.map.merge.PassThroughMergePolicy</merge-policy>

      Also, the MergePolicy interface has been renamed to MapMergePolicy, and returning null from the implemented merge() method causes the existing entry to be removed.

  • IQueue changes: There is no change on IQueue API but there are changes on how IQueue is configured. With Hazelcast 3.0 there will be no backing map configuration for queue. Settings like backup count will be directly configured on queue config. For queue configuration details, please see the Queue section.

  • Transaction API change: In Hazelcast 3.0, transaction API is completely different. Please see the Transactions chapter.

  • ExecutorService API change: Classes MultiTask and DistributedTask have been removed. All the functionality is supported by the newly presented interface IExecutorService. Please see the Executor Service section.

  • LifeCycleService API: The lifecycle has been simplified. pause(), resume(), restart() methods have been removed.

  • AtomicNumber: AtomicNumber class has been renamed to IAtomicLong.

  • ICountDownLatch: await() operation has been removed. We expect users to use await() method with timeout parameters.

  • ISemaphore API: The ISemaphore has been substantially changed. attach(), detach() methods have been removed.

  • In 2.x releases, the default value for the max-size eviction policy was cluster_wide_map_size. In 3.x releases, the default is PER_NODE. After upgrading, set max-size according to this new default if you have not changed the policy. Otherwise, an OutOfMemoryError is likely to be thrown.
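For the max-size change in the last item above, one way to arrive at a per-member value is to divide the former cluster-wide limit by the number of members. The following is a rough arithmetic sketch under stated assumptions: entries are spread evenly across members, backups are ignored, and the class `MaxSizeMigration` is a hypothetical helper rather than a Hazelcast API. Treat the result as a starting point to validate against your own memory budget.

```java
class MaxSizeMigration {
    // Splits a former cluster-wide entry limit evenly across members, rounding down
    // so that the sum over all members does not exceed the old limit.
    // Assumption: the result is kept at 1 or more, to avoid producing a 0 limit.
    static int perNodeMaxSize(int clusterWideMaxSize, int memberCount) {
        if (clusterWideMaxSize <= 0 || memberCount <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return Math.max(1, clusterWideMaxSize / memberCount);
    }

    public static void main(String[] args) {
        // A 2.x limit of 100000 entries for a four-member cluster.
        System.out.println(perNodeMaxSize(100000, 4));
    }
}
```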

2.6. Starting the Member and Client

Having installed Hazelcast, you can get started.

In this short tutorial, you perform the following activities.

  1. Create a simple Java application using the Hazelcast distributed map and queue.

  2. Run the application twice to have a cluster with two members (JVMs).

  3. Connect to the cluster from another Java application by using the Hazelcast Native Java Client API.

Let’s begin.

  • The following code starts the first Hazelcast member and creates and uses the customers map and queue.

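    import com.hazelcast.config.Config;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import java.util.Map;
    import java.util.Queue;

    public class GettingStarted {
        public static void main(String[] args) {
            // Start a member with the default configuration.
            Config cfg = new Config();
            HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg);

            // Distributed map named "customers".
            Map<Integer, String> mapCustomers = instance.getMap("customers");
            mapCustomers.put(1, "Joe");
            mapCustomers.put(2, "Ali");
            mapCustomers.put(3, "Avi");
            System.out.println("Customer with key 1: " + mapCustomers.get(1));
            System.out.println("Map Size:" + mapCustomers.size());

            // Distributed queue, also named "customers".
            Queue<String> queueCustomers = instance.getQueue("customers");
            queueCustomers.offer("Tom");
            queueCustomers.offer("Mary");
            queueCustomers.offer("Jane");
            System.out.println("First customer: " + queueCustomers.poll());
            System.out.println("Second customer: " + queueCustomers.peek());
            System.out.println("Queue size: " + queueCustomers.size());
        }
    }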
  • Run this GettingStarted class a second time to get the second member started. The members form a cluster and the output is similar to the following.

    Members {size:2, ver:2} [
        Member [127.0.0.1]:5701 - e40081de-056a-4ae5-8ffe-632caf8a6cf1 this
        Member [127.0.0.1]:5702 - 93e82109-16bf-4b16-9c87-f4a6d0873080
    ]

    Here, you can see the size of your cluster (size) and member list version (ver). The member list version will be incremented when changes happen to the cluster, e.g., a member leaving from or joining to the cluster.

    The above member list format is introduced with Hazelcast 3.9. You can enable the legacy member list format, which was used for the releases before Hazelcast 3.9, using the system property hazelcast.legacy.memberlist.format.enabled. Please see the System Properties appendix. The following is an example for the legacy member list format:

    Members [2] {
        Member [127.0.0.1]:5701 - c1ccc8d4-a549-4bff-bf46-9213e14a9fd2 this
        Member [127.0.0.1]:5702 - 33a82dbf-85d6-4780-b9cf-e47d42fb89d4
    }
  • Now, add the hazelcast-client-<version>.jar library to your classpath. This is required to use a Hazelcast client.

  • The following code starts a Hazelcast Client, connects to our cluster, and prints the size of the customers map.

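    import com.hazelcast.client.HazelcastClient;
    import com.hazelcast.client.config.ClientConfig;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class GettingStartedClient {

        public static void main( String[] args ) {
            // Connect to the cluster using the default client configuration.
            ClientConfig clientConfig = new ClientConfig();
            HazelcastInstance client = HazelcastClient.newHazelcastClient( clientConfig );
            IMap<Integer, String> map = client.getMap( "customers" );
            System.out.println( "Map Size:" + map.size() );
        }
    }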
  • When you run it, you see the client properly connecting to the cluster and printing the map size as 3.

Hazelcast also offers a tool, Management Center, that enables you to monitor your cluster. You can download it from Hazelcast website’s download page. You can use it to monitor your maps, queues and other distributed data structures and members. Please see the Hazelcast Management Center Reference Manual for usage explanations.

By default, Hazelcast uses multicast to discover other members that can form a cluster. If you are working with other Hazelcast developers on the same network, you may find yourself joining their clusters under the default settings. Hazelcast provides a way to segregate clusters within the same network when using multicast. Please see the Creating Cluster Groups section for more information. Alternatively, if you do not wish to use the default multicast mechanism, you can provide a fixed list of IP addresses that are allowed to join. Please see the Join configuration section for more information.

The multicast mechanism is not recommended for production, since UDP is often blocked in production environments and other discovery mechanisms are more definite. Please see the Discovery Mechanisms section.

You can also check the video tutorials here.

2.7. Using the Scripts In The Package

When you download and extract the Hazelcast ZIP or TAR.GZ package, you will see three scripts under the /bin folder that provide basic functionalities for member and cluster management.

The following are the names and descriptions of each script:

  • start.sh / start.bat: Starts a Hazelcast member with default configuration in the working directory.

  • stop.sh / stop.bat: Stops the Hazelcast member that was started in the current working directory.

  • cluster.sh: Provides basic functionalities for cluster management, such as getting and changing the cluster state, shutting down the cluster or forcing the cluster to clean its persisted data and make a fresh start. Please refer to the Using the Script cluster.sh section to learn the usage of this script.

The start.sh / start.bat scripts let you start one Hazelcast instance per folder. To start a new instance, please unzip the Hazelcast ZIP or TAR.GZ package in a new folder.

2.8. Deploying On Amazon EC2

You can deploy your Hazelcast project onto an Amazon EC2 environment using third-party tools such as Vagrant and Chef.

You can find a sample deployment project (amazon-ec2-vagrant-chef) with step-by-step instructions in the hazelcast-integration folder of the hazelcast-code-samples package, which you can download at hazelcast.org. Please refer to this sample project for more information.

2.9. Deploying On Microsoft Azure

Azure Plugin

You can deploy your Hazelcast cluster onto a Microsoft Azure environment. For this, your cluster should make use of Hazelcast Discovery Plugin for Microsoft Azure. You can find information about this plugin on its GitHub repository at Hazelcast Azure.

For information on how to automatically deploy your cluster onto Azure, please see the Deployment section of Hazelcast Azure plugin repository.

2.10. Deploying On Pivotal Cloud Foundry

CloudFoundry

Starting with Hazelcast 3.7, you can deploy your Hazelcast cluster onto Pivotal Cloud Foundry. It is available as a Pivotal Cloud Foundry Tile which you can download here. You can find the installation and usage instructions and the release notes documents at https://docs.pivotal.io/partners/hazelcast/index.html.

2.11. Deploying using Docker

Docker Plugin

You can deploy your Hazelcast projects using Docker containers. Hazelcast has the following images on Docker:

  • Hazelcast IMDG

  • Hazelcast IMDG Enterprise

  • Hazelcast Management Center

  • Hazelcast OpenShift

After you pull an image from the Docker registry, you can run your image to start the Management Center or a Hazelcast instance with Hazelcast’s default configuration. All repositories provide the latest stable releases but you can pull a specific release too. You can also specify environment variables when running the image.

If you want to start a customized Hazelcast instance, you can extend the Hazelcast image by providing your own configuration file.

This feature is provided as a Hazelcast plugin. Please see its own GitHub repo at Hazelcast Docker for details on configurations and usages.

3. Hazelcast Overview

Hazelcast is an open source In-Memory Data Grid (IMDG). It provides elastically scalable distributed In-Memory computing, widely recognized as the fastest and most scalable approach to application performance. Hazelcast does this in open source. More importantly, Hazelcast makes distributed computing simple by offering distributed implementations of many developer-friendly interfaces from Java such as Map, Queue, ExecutorService, Lock and JCache. For example, the Map interface provides an In-Memory Key Value store which confers many of the advantages of NoSQL in terms of developer friendliness and developer productivity.

In addition to distributing data In-Memory, Hazelcast provides a convenient set of APIs to access the CPUs in your cluster for maximum processing speed. Hazelcast is designed to be lightweight and easy to use. Since Hazelcast is delivered as a compact library (JAR) and since it has no external dependencies other than Java, it easily plugs into your software solution and provides distributed data structures and distributed computing utilities.

Hazelcast is highly scalable and available. Distributed applications can use Hazelcast for distributed caching, synchronization, clustering, processing, pub/sub messaging, etc. Hazelcast is implemented in Java and has clients for Java, C/C++, .NET, REST, Python, Go and Node.js. Hazelcast also speaks Memcached protocol. It plugs into Hibernate and can easily be used with any existing database system.

If you are looking for in-memory speed, elastic scalability and the developer friendliness of NoSQL, Hazelcast is a great choice.

Hazelcast is Simple

Hazelcast is written in Java with no other dependencies. It exposes the same interfaces as the familiar java.util package. Just add hazelcast.jar to your classpath and you can quickly cluster your JVMs and start building scalable applications.

Hazelcast is Peer-to-Peer

Unlike many NoSQL solutions, Hazelcast is peer-to-peer. There is no master and slave; there is no single point of failure. All members store equal amounts of data and do equal amounts of processing. You can embed Hazelcast in your existing application or use it in client and server mode where your application is a client to Hazelcast members.

Hazelcast is Scalable

Hazelcast is designed to scale up to hundreds and thousands of members. Simply add new members and they will automatically discover the cluster and will linearly increase both memory and processing capacity. The members maintain a TCP connection between each other and all communication is performed through this layer.

Hazelcast is Fast

Hazelcast stores everything in-memory. It is designed to perform very fast reads and updates.

Hazelcast is Redundant

Hazelcast keeps the backup of each data entry on multiple members. On a member failure, the data is restored from the backup and the cluster will continue to operate without downtime.

3.1. Sharding in Hazelcast

Hazelcast shards are called Partitions. By default, Hazelcast has 271 partitions. Given a key, we serialize, hash and mod it with the number of partitions to find the partition which the key belongs to. The partitions themselves are distributed equally among the members of the cluster. Hazelcast also creates the backups of partitions and distributes them among members for redundancy.

Please refer to the Data Partitioning section for more information on how Hazelcast partitions your data.

3.2. Hazelcast Topology

You can deploy a Hazelcast cluster in two ways: Embedded or Client/Server.

If you have an application whose main focal point is asynchronous or high performance computing and lots of task executions, then Embedded deployment is the preferred way. In Embedded deployment, members include both the application and Hazelcast data and services. The advantage of the Embedded deployment is low-latency data access.

See the below illustration.

Embedded Deployment

In the Client/Server deployment, Hazelcast data and services are centralized in one or more server members and they are accessed by the application through clients. You can have a cluster of server members that can be independently created and scaled. Your clients communicate with these members to reach the Hazelcast data and services on them.

See the below illustration.

Client/Server Deployment

Hazelcast provides native clients (Java, .NET and C++), Memcache and REST clients, and Scala, Python and Node.js client implementations.

Client/Server deployment has advantages including more predictable and reliable Hazelcast performance, easier identification of problem causes and, most importantly, better scalability. When you need to scale in this deployment type, just add more Hazelcast server members. You can address client and server scalability concerns separately.

If you want low-latency data access, as in the Embedded deployment, and you also want the scalability advantages of the Client/Server deployment, you can consider defining Near Caches for your clients. This enables the frequently used data to be kept in the client’s local memory. Please refer to the Configuring Client Near Cache section.

3.3. Why Hazelcast?

A Glance at Traditional Data Persistence

Data is at the core of software systems. In conventional architectures, a relational database persists and provides access to data. Applications talk directly to a database, which is backed up on another machine. To increase performance, tuning or a faster machine is required. This can cost a large amount of money or effort.

There is also the idea of keeping copies of data next to the database, which is performed using technologies like external key-value stores or second level caching that help offload the database. However, when the database is saturated or the applications perform mostly "put" operations (writes), this approach is of no use because it insulates the database only from the "get" loads (reads). Even if the applications are read-intensive there can be consistency problems - when data changes, what happens to the cache and how are the changes handled? This is when concepts like time-to-live (TTL) or write-through come in.

In the case of TTL, if the access is less frequent than the TTL, the result will always be a cache miss. On the other hand, in the case of write-through caches, if there is more than one of these caches in a cluster, we will again have consistency issues. This can be avoided by having the nodes communicate with each other so that entry invalidations can be propagated.

We can conclude that an ideal cache would combine TTL and write-through features. There are several cache servers and in-memory database solutions in this field. However, these are stand-alone single instances with a distribution mechanism that is provided by other technologies to an extent. So, we are back to square one; we experience saturation or capacity issues if the product is a single instance or if consistency is not provided by the distribution.

And, there is Hazelcast

Hazelcast, a brand new approach to data, is designed around the concept of distribution. Hazelcast shares data around the cluster for flexibility and performance. It is an in-memory data grid for clustering and highly scalable data distribution.

One of the main features of Hazelcast is that it does not have a master member. Each cluster member is configured to be the same in terms of functionality. The oldest member (the first member created in the cluster) automatically performs the data assignment to cluster members. If the oldest member dies, the second oldest member takes over.

You may come across the term "master" or "master member" in some sections of this manual. They are used for contextual clarification purposes; please remember that they refer to the "oldest member" which is explained in the above paragraph.

Another main feature of Hazelcast is that the data is held entirely in-memory. This is fast. In the case of a failure, such as a member crash, no data will be lost since Hazelcast distributes copies of the data across all the cluster members.

As shown in the feature list in the Distributed Data Structures chapter, Hazelcast supports a number of distributed data structures and distributed computing utilities. These provide powerful ways of accessing distributed clustered memory and accessing CPUs for true distributed computing.

Hazelcast’s Distinctive Strengths

  • Hazelcast is open source.

  • Hazelcast is only a JAR file. You do not need to install software.

  • Hazelcast is a library; it does not impose an architecture on Hazelcast users.

  • Hazelcast provides out-of-the-box distributed data structures, such as Map, Queue, MultiMap, Topic, Lock and Executor.

  • There is no "master," meaning no single point of failure in a Hazelcast cluster; each member in the cluster is configured to be functionally the same.

  • When the size of your memory and compute requirements increase, new members can be dynamically joined to the Hazelcast cluster to scale elastically.

  • Data is resilient to member failure. Data backups are distributed across the cluster. This is a big benefit when a member in the cluster crashes as data will not be lost.

  • Members are always aware of each other unlike in traditional key-value caching solutions.

  • You can build your own custom-distributed data structures using the Service Programming Interface (SPI) if you are not happy with the data structures provided.

Finally, Hazelcast has a vibrant open source community enabling it to be continuously developed.

Hazelcast is a fit when you need:

  • analytic applications requiring big data processing by partitioning the data.

  • to retain frequently accessed data in the grid.

  • a cache, particularly an open source JCache provider with elastic distributed scalability.

  • a primary data store for applications with utmost performance, scalability and low-latency requirements.

  • an In-Memory NoSQL Key Value Store.

  • publish/subscribe communication at highest speed and scalability between applications.

  • applications that need to scale elastically in distributed and cloud environments.

  • a highly available distributed cache for applications.

  • an alternative to Coherence and Terracotta.

3.4. Data Partitioning

As you read in the Sharding in Hazelcast section, Hazelcast shards are called Partitions. Partitions are memory segments that can contain hundreds or thousands of data entries each, depending on the memory capacity of your system. Each Hazelcast partition can have multiple replicas, which are distributed among the cluster members. One of the replicas becomes the primary and the other replicas are called backups. The cluster member that owns the primary replica of a partition is called the partition owner. When you read or write a particular data entry, you transparently talk to the owner of the partition that contains the data entry.

By default, Hazelcast offers 271 partitions. When you start a cluster with a single member, it owns all 271 partitions (i.e., it keeps the primary replicas of all 271 partitions). The following illustration shows the partitions in a Hazelcast cluster with a single member.

Single Member with Partitions

When you start a second member on that cluster (creating a Hazelcast cluster with two members), the partition replicas are distributed as shown in the illustration here.

Partition distributions in the below illustrations are shown for the sake of simplicity and for descriptive purposes. Normally, the partitions are not distributed in any order, as they are shown in these illustrations, but are distributed randomly (they do not have to be sequentially distributed to each member). The important point here is that Hazelcast equally distributes the partition primaries and their backup replicas among the members.

Cluster with Two Members - Backups are Created

In the illustration, the partition replicas with black text are primaries and the partition replicas with blue text are backups. The first member has primary replicas of 135 partitions (black) and each of these partitions is backed up in the second member (blue), i.e., the second member owns the backup replicas. At the same time, the first member also has the backup replicas of the second member’s primary partition replicas.

As you add more members, Hazelcast moves some of the primary and backup partition replicas to the new members one by one, making all members equal and redundant. Thanks to the consistent hashing algorithm, only the minimum number of partitions will be moved to scale out Hazelcast. The following is an illustration of the partition replica distributions in a Hazelcast cluster with four members.

Cluster with Four Members

Hazelcast distributes partitions' primary and backup replicas equally among the members of the cluster. Backup replicas of the partitions are maintained for redundancy.
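The equal distribution described above can be pictured with a small, self-contained sketch. It is plain Java with no Hazelcast dependency, and the class name, method name and round-robin assignment are assumptions made for illustration only. As noted earlier, Hazelcast does not assign partitions in sequential order. The sketch only shows that 271 primaries can be spread so that member counts differ by at most one (for two members, 136 and 135).

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Hazelcast's partition assignment logic.
final class PartitionDistributionSketch {

    // Counts how many primary replicas each member would own under a
    // simple round-robin assignment of partition IDs to members.
    static int[] primariesPerMember(int partitionCount, int memberCount) {
        int[] counts = new int[memberCount];
        for (int partitionId = 0; partitionId < partitionCount; partitionId++) {
            counts[partitionId % memberCount]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int partitionCount = 271; // default partition count
        for (int members : new int[] {1, 2, 4}) {
            System.out.println(members + " member(s): "
                    + Arrays.toString(primariesPerMember(partitionCount, members)));
        }
    }
}
```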

Your data can have multiple copies on partition primaries and backups, depending on your backup count. Please see the Backing Up Maps section.

Lite members were introduced in Hazelcast 3.6. They are members that do not own any partitions. Lite members are intended for use in computationally-heavy task executions and listener registrations. Although they do not own any partitions, they can access partitions that are owned by other members in the cluster.

Please refer to the Enabling Lite Members section.

3.4.1. How the Data is Partitioned

Hazelcast distributes data entries into the partitions using a hashing algorithm. Given an object key (for example, for a map) or an object name (for example, for a topic or list):

  • the key or name is serialized (converted into a byte array)

  • this byte array is hashed

  • the hash result is taken modulo the number of partitions

The result of this modulo - MOD(hash result, partition count) - is the partition in which the data will be stored, that is the partition ID. For ALL members you have in your cluster, the partition ID for a given key will always be the same.
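The three steps above can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not Hazelcast's implementation: UTF-8 encoding stands in for Hazelcast serialization, Arrays.hashCode stands in for the real hash function, and the class and method names are hypothetical. The actual partition IDs computed by Hazelcast will therefore differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; the serializer and hash function are stand-ins.
final class PartitionIdSketch {

    // Step 1: "serialize" the key into a byte array (UTF-8 used as a stand-in).
    static byte[] serialize(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Steps 2 and 3: hash the bytes, then take MOD(hash result, partition count).
    static int partitionId(String key, int partitionCount) {
        int hash = Arrays.hashCode(serialize(key)); // stand-in hash function
        // floorMod keeps the result in [0, partitionCount) even for negative hashes.
        return Math.floorMod(hash, partitionCount);
    }

    public static void main(String[] args) {
        int partitionCount = 271; // default partition count
        for (String key : new String[] {"alice", "bob", "carol"}) {
            System.out.println(key + " -> partition " + partitionId(key, partitionCount));
        }
    }
}
```

Because the computation depends only on the key and the partition count, every member that runs it arrives at the same partition ID for a given key.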

3.4.2. Partition Table

When you start a member, a partition table is created within it. This table stores the partition IDs and the cluster members to which they belong. The purpose of this table is to make all members (including lite members) in the cluster aware of this information, making sure that each member knows where the data is.

The oldest member in the cluster (the one that started first) periodically sends the partition table to all members. In this way each member in the cluster is informed about any changes to partition ownership. The ownerships may be changed when, for example, a new member joins the cluster, or when a member leaves the cluster.

If the oldest member of the cluster goes down, the next oldest member sends the partition table information to the other ones.

You can configure how often the member sends the partition table information by using the hazelcast.partition.table.send.interval system property. By default, the partition table is sent every 15 seconds.

3.4.3. Repartitioning

Repartitioning is the process of redistribution of partition ownerships. Hazelcast performs the repartitioning in the following cases:

  • When a member joins the cluster.

  • When a member leaves the cluster.

In these cases, the partition table in the oldest member is updated with the new partition ownerships. Note that if a lite member joins or leaves a cluster, repartitioning is not triggered since lite members do not own any partitions.

3.5. Use Cases

Hazelcast can be used:

  • to share server configuration/information to see how a cluster performs.

  • to cluster highly changing data with event notifications, e.g., user based events, and to queue and distribute background tasks.

  • as a simple Memcache with Near Cache.

  • as a cloud-wide scheduler of certain processes that need to be performed on some members.

  • to share information (user information, queues, maps, etc.) on the fly with multiple members in different installations under OSGI environments.

  • to share thousands of keys in a cluster where there is a web service interface on an application server and some validation.

  • as a distributed topic (publish/subscribe server) to build scalable chat servers for smartphones.

  • as a front layer for a Cassandra back-end.

  • to distribute user object states across the cluster, to pass messages between objects and to share system data structures (static initialization state, mirrored objects, object identity generators).

  • as a multi-tenancy cache where each tenant has its own map.

  • to share datasets, e.g., table-like data structure, to be used by applications.

  • to distribute the load and collect status from Amazon EC2 servers where the front-end is developed using, for example, Spring framework.

  • as a real-time streamer for performance detection.

  • as storage for session data in web applications (enables horizontal scalability of the web application).

3.6. Resources

4. Understanding Configuration

This chapter describes the options to configure your Hazelcast applications and explains the utilities which you can make use of while configuring. You can configure Hazelcast using one or a mix of the following options:

  • Declarative way

  • Programmatic way

  • Using Hazelcast system properties

  • Within the Spring context

  • Dynamically adding configuration on a running cluster (starting with Hazelcast 3.9)

4.1. Configuring Declaratively

This is the configuration option where you use an XML configuration file. When you download and unzip hazelcast-<version>.zip, you will see the following files in the /bin folder, which are standard XML-formatted configuration files:

  • hazelcast.xml: Default declarative configuration file for Hazelcast. The configuration in this XML file should be fine for most of the Hazelcast users. If not, you can tailor this XML file according to your needs by adding/removing/modifying properties.

  • hazelcast-full-example.xml: Configuration file which includes all Hazelcast configuration elements and attributes with their descriptions. It is the "superset" of hazelcast.xml. You can use hazelcast-full-example.xml as a reference document to learn about any element or attribute, or you can change its name to hazelcast.xml and start to use it as your Hazelcast configuration file.

A part of hazelcast.xml is shown as an example below.

<group>
    <name>dev</name>
</group>
<management-center enabled="false">http://localhost:8080/mancenter</management-center>
<network>
     <port auto-increment="true" port-count="100">5701</port>
     <outbound-ports>
     <!--
     Allowed port range when connecting to other members.
     0 or * means the port provided by the system.
     -->
         <ports>0</ports>
     </outbound-ports>
     <join>
         <multicast enabled="true">
             <multicast-group>224.2.2.3</multicast-group>
             <multicast-port>54327</multicast-port>
         </multicast>
         <tcp-ip enabled="false">
             <interface>127.0.0.1</interface>
             <member-list>
                 <member>127.0.0.1</member>
             </member-list>
         </tcp-ip>
    </join>
</network>
<map name="default">
    <time-to-live-seconds>0</time-to-live-seconds>
</map>

4.1.1. Composing Declarative Configuration

You can compose the declarative configuration of your Hazelcast member or Hazelcast client from multiple declarative configuration snippets. In order to compose a declarative configuration, you can use the <import/> element to load different declarative configuration files.

Let’s say you want to compose the declarative configuration for Hazelcast out of two configurations: development-group-config.xml and development-network-config.xml. These two configurations are shown below.

development-group-config.xml:

<hazelcast>
  <group>
      <name>dev</name>
  </group>
</hazelcast>

development-network-config.xml:

<hazelcast>
  <network>
    <port auto-increment="true" port-count="100">5701</port>
    <join>
        <multicast enabled="true">
            <multicast-group>224.2.2.3</multicast-group>
            <multicast-port>54327</multicast-port>
        </multicast>
    </join>
  </network>
</hazelcast>

To get your example Hazelcast declarative configuration out of the above two, use the <import/> element as shown below.

<hazelcast>
  <import resource="development-group-config.xml"/>
  <import resource="development-network-config.xml"/>
</hazelcast>

This feature also applies to the declarative configuration of Hazelcast client. Please see the following examples.

client-group-config.xml:

<hazelcast-client>
  <group>
      <name>dev</name>
  </group>
</hazelcast-client>

client-network-config.xml:

<hazelcast-client>
    <network>
        <cluster-members>
            <address>127.0.0.1:7000</address>
        </cluster-members>
    </network>
</hazelcast-client>

To get a Hazelcast client declarative configuration from the above two examples, use the <import/> element as shown below.

<hazelcast-client>
  <import resource="client-group-config.xml"/>
  <import resource="client-network-config.xml"/>
</hazelcast-client>

Use the <import/> element at the top level of the XML hierarchy.

Using the element <import>, you can also load XML resources from classpath and file system:

<hazelcast>
  <import resource="file:///etc/hazelcast/development-group-config.xml"/> <!-- loaded from filesystem -->
  <import resource="classpath:development-network-config.xml"/>  <!-- loaded from classpath -->
</hazelcast>

The element <import> supports variables too. Please see the following example snippet:

<hazelcast>
  <import resource="${environment}-group-config.xml"/>
  <import resource="${environment}-network-config.xml"/>
</hazelcast>

You can refer to the Using Variables section to learn how you can set the configuration elements with variables.

4.2. Configuring Programmatically

Besides declarative configuration, you can configure your cluster programmatically. To do this, create a Config object, set or change its properties and attributes, and use this Config object to create a new Hazelcast member. The following example code configures some network and Hazelcast Map properties.

Config config = new Config();
config.getNetworkConfig().setPort( 5900 )
        .setPortAutoIncrement( false );

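MapConfig mapConfig = new MapConfig();
mapConfig.setName( "testMap" )
        .setBackupCount( 2 )
        .setTimeToLiveSeconds( 300 );

// register the map configuration on the Config object so that it takes effect
config.addMapConfig( mapConfig );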

To create a Hazelcast member with the above example configuration, pass the configuration object as shown below:

HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance( config );

The Config must not be modified after the Hazelcast instance is started. In other words, all configuration must be completed before creating the HazelcastInstance. Certain additional configuration elements can be added at runtime as described in the Dynamically Adding Data Structure Configuration on a Cluster section.

You can also create a named Hazelcast member. In this case, you should set instanceName of Config object as shown below:

Config config = new Config();
config.setInstanceName( "my-instance" );
Hazelcast.newHazelcastInstance( config );

To retrieve an existing Hazelcast member by its name, use the following:

Hazelcast.getHazelcastInstanceByName( "my-instance" );

To retrieve all existing Hazelcast members, use the following:

Hazelcast.getAllHazelcastInstances();

Hazelcast performs schema validation through the file hazelcast-config-<version>.xsd which comes with your Hazelcast libraries. Hazelcast throws a meaningful exception if there is an error in the declarative or programmatic configuration.

If you want to specify your own configuration file to create Config, Hazelcast supports several ways including filesystem, classpath, InputStream and URL:

  • Config cfg = new XmlConfigBuilder(xmlFileName).build();

  • Config cfg = new XmlConfigBuilder(inputStream).build();

  • Config cfg = new ClasspathXmlConfig(xmlFileName);

  • Config cfg = new FileSystemXmlConfig(configFilename);

  • Config cfg = new UrlXmlConfig(url);

  • Config cfg = new InMemoryXmlConfig(xml);

4.3. Configuring with System Properties

You can use system properties to configure some aspects of Hazelcast. You set these properties as name and value pairs through declarative configuration, programmatic configuration or JVM system property. Following are examples for each option.

Declaratively:

<hazelcast>
  ....
  <properties>
    <property name="hazelcast.property.foo">value</property>
    ....
  </properties>
</hazelcast>

Programmatically:

Config config = new Config();
config.setProperty( "hazelcast.property.foo", "value" );

Using JVM’s System class or -D argument:

System.setProperty( "hazelcast.property.foo", "value" );

or

java -Dhazelcast.property.foo=value

You will see Hazelcast system properties mentioned throughout this Reference Manual as required in some of the chapters and sections. All Hazelcast system properties are listed in the System Properties appendix with their descriptions, default values and property types as a reference for you.

4.4. Configuring within Spring Context

If you use Hazelcast with Spring, you can declare beans using the hazelcast namespace. When you add the namespace declaration to the beans element in the Spring context file, you can use the namespace shortcut hz in bean declarations. The following is an example Hazelcast configuration when integrated with Spring:

<hz:hazelcast id="instance">
  <hz:config>
    <hz:group name="dev"/>
    <hz:network port="5701" port-auto-increment="false">
      <hz:join>
        <hz:multicast enabled="false"/>
        <hz:tcp-ip enabled="true">
          <hz:members>10.10.1.2, 10.10.1.3</hz:members>
        </hz:tcp-ip>
      </hz:join>
    </hz:network>
  </hz:config>
</hz:hazelcast>

Please see the Spring Integration section for more information on Hazelcast-Spring integration.

4.5. Dynamically Adding Data Structure Configuration on a Cluster

As described above, Hazelcast can be configured in a declarative or programmatic way; configuration must be completed before starting a Hazelcast member and this configuration cannot be altered at runtime, thus we refer to this as static configuration.

Starting with Hazelcast 3.9, it is possible to dynamically add configuration for certain data structures at runtime; these can be added by invoking one of the Config.add*Config methods on the Config object obtained from a running member’s HazelcastInstance.getConfig() method. For example:

Config config = new Config();
MapConfig mapConfig = new MapConfig("sessions");
config.addMapConfig(mapConfig);
HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
MapConfig noBackupsMap = new MapConfig("dont-backup").setBackupCount(0);
instance.getConfig().addMapConfig(noBackupsMap);

Dynamic configuration elements must be fully configured before the invocation of add*Config method: at that point, the configuration object will be delivered to every member of the cluster and added to each member’s dynamic configuration, so mutating the configuration object after the add*Config invocation will have no effect.

As dynamically added data structure configuration is propagated across all cluster members, failures may occur due to conditions such as timeout and network partition. The configuration propagation mechanism internally retries adding the configuration whenever a membership change is detected. However, if an exception is thrown from the add*Config method, the configuration may have been partially propagated to some cluster members, and the user should retry adding the configuration.

Adding new dynamic configuration is supported for all add*Config methods except:

  • JobTracker which has been deprecated since Hazelcast 3.8

  • QuorumConfig: new quorum configuration cannot be dynamically added but other configuration can reference quorums configured in the existing static configuration

  • WanReplicationConfig: new WAN replication configuration cannot be dynamically added, however existing static ones can be referenced from other configurations, e.g., a new dynamic MapConfig may include a WanReplicationRef to a statically configured WAN replication config.

  • ListenerConfig: listeners can be instead added at runtime via other API such as HazelcastInstance.getCluster().addMembershipListener and HazelcastInstance.getPartitionService().addMigrationListener.

4.5.1. Handling Configuration Conflicts

Attempting to add a dynamic configuration, when a static configuration for the same element already exists, will throw ConfigurationException. For example, assuming we start a member with the following fragment in hazelcast.xml configuration:

<map name="sessions">
   ...
</map>

Then adding a dynamic configuration for a map with the name sessions will throw a ConfigurationException:

HazelcastInstance instance = Hazelcast.newHazelcastInstance();

MapConfig sessionsMapConfig = new MapConfig("sessions");

// this will throw ConfigurationException:
instance.getConfig().addMapConfig(sessionsMapConfig);

Attempting to add a dynamic configuration for an element that already has a dynamic configuration throws a ConfigurationException if the two configurations conflict. If the new configuration is equal to the existing one, no exception is thrown. For example:

HazelcastInstance instance = Hazelcast.newHazelcastInstance();

MapConfig sessionsMapConfig = new MapConfig("sessions").setBackupCount(0);
instance.getConfig().addMapConfig(sessionsMapConfig);

MapConfig sessionsWithBackup = new MapConfig("sessions").setBackupCount(1);
// throws ConfigurationException because the new MapConfig conflicts with existing one
instance.getConfig().addMapConfig(sessionsWithBackup);

MapConfig sessionsWithoutBackup = new MapConfig("sessions").setBackupCount(0);
// does not throw exception: new dynamic config is equal to existing dynamic config of same name
instance.getConfig().addMapConfig(sessionsWithoutBackup);

4.6. Checking Configuration

When you start a Hazelcast member without passing a Config object, as explained in the Configuring Programmatically section, Hazelcast checks the member’s configuration as follows:

  • First, it looks for the hazelcast.config system property. If it is set, its value is used as the path. This is useful if you want to be able to change your Hazelcast configuration; you can do this because it is not embedded within the application. You can set the config option with the following command:

    -Dhazelcast.config=<path to the hazelcast.xml>

    The path can be a regular one or a classpath reference with the prefix classpath:.

  • If the above system property is not set, Hazelcast then checks whether there is a hazelcast.xml file in the working directory.

  • If not, it then checks whether hazelcast.xml exists on the classpath.

  • If none of the above works, Hazelcast loads the default configuration (hazelcast.xml) that comes with your Hazelcast package.
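The lookup order above can be summarized with the following sketch. It is a plain Java illustration with no Hazelcast dependency. The class name, method name and returned description strings are hypothetical, and the sketch only reports where a configuration would be found instead of loading it.

```java
import java.io.File;

// Hypothetical sketch of the lookup order; it does not load or parse any configuration.
final class ConfigLookupSketch {

    // Returns a description of the first location where a configuration is found.
    static String locateConfig(ClassLoader classLoader) {
        // 1. The hazelcast.config system property wins if it is set.
        String fromProperty = System.getProperty("hazelcast.config");
        if (fromProperty != null) {
            return "system property: " + fromProperty;
        }
        // 2. Otherwise, look for hazelcast.xml in the working directory.
        File inWorkingDir = new File("hazelcast.xml");
        if (inWorkingDir.isFile()) {
            return "working directory: " + inWorkingDir.getAbsolutePath();
        }
        // 3. Otherwise, look for hazelcast.xml on the classpath.
        if (classLoader.getResource("hazelcast.xml") != null) {
            return "classpath: hazelcast.xml";
        }
        // 4. Fall back to the default configuration shipped with Hazelcast.
        return "built-in default";
    }

    public static void main(String[] args) {
        System.out.println(locateConfig(ConfigLookupSketch.class.getClassLoader()));
    }
}
```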

Before configuring Hazelcast, please try to work with the default configuration to see if it works for you. This default configuration should be fine for most users. If not, you can consider modifying the configuration to make it more suitable for your environment.

4.7. Configuration Pattern Matcher

You can provide a custom strategy to match an item name to a configuration pattern. By default, Hazelcast uses a simplified wildcard matching. See the Using Wildcards section for this. A custom configuration pattern matcher can be given by using either member or client config objects, as shown below:

// Setting a custom config pattern matcher via member config object
Config config = new Config();
config.setConfigPatternMatcher(new ExampleConfigPatternMatcher());

And the following is an example pattern matcher:

class ExampleConfigPatternMatcher extends MatchingPointConfigPatternMatcher {

    @Override
    public String matches(Iterable<String> configPatterns, String itemName) throws ConfigurationException {
        String matches = super.matches(configPatterns, itemName);
        if (matches == null) throw new ConfigurationException("No config found for " + itemName);
        return matches;
    }
}

4.8. Using Wildcards

Hazelcast supports wildcard configuration for all distributed data structures that can be configured using Config, that is, for all except IAtomicLong and IAtomicReference. Using an asterisk (*) character in the name, different instances of maps, queues, topics, semaphores, etc. can be configured by a single configuration.

A single asterisk (*) can be placed anywhere inside the configuration name.

For instance, a map named com.hazelcast.test.mymap can be configured using one of the following configurations.

<map name="com.hazelcast.test.*">
...
</map>
<map name="com.hazel*">
...
</map>
<map name="*.test.mymap">
...
</map>
<map name="com.*test.mymap">
...
</map>

Or a queue com.hazelcast.test.myqueue:

<queue name="*hazelcast.test.myqueue">
...
</queue>
<queue name="com.hazelcast.*.myqueue">
...
</queue>
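To make the single-asterisk rule concrete, here is a minimal sketch of such a match in plain Java. It is an assumption-laden simplification: the class WildcardSketch is hypothetical, and it does not reproduce how Hazelcast chooses between several patterns that match the same name.

```java
// Simplified single-asterisk matching; not Hazelcast's pattern matcher implementation.
final class WildcardSketch {

    /** Returns true if itemName matches a pattern that contains at most one asterisk. */
    static boolean matches(String pattern, String itemName) {
        int star = pattern.indexOf('*');
        if (star < 0) {
            // No wildcard: the names must be identical
            return pattern.equals(itemName);
        }
        String prefix = pattern.substring(0, star);
        String suffix = pattern.substring(star + 1);
        // The asterisk may stand for any (possibly empty) run of characters
        return itemName.length() >= prefix.length() + suffix.length()
                && itemName.startsWith(prefix)
                && itemName.endsWith(suffix);
    }

    public static void main(String[] args) {
        String[] patterns = {"com.hazelcast.test.*", "com.hazel*", "*.test.mymap", "com.*test.mymap"};
        for (String pattern : patterns) {
            System.out.println(pattern + " -> " + matches(pattern, "com.hazelcast.test.mymap"));
        }
    }
}
```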

4.9. Using Variables

In your Hazelcast and/or Hazelcast Client declarative configuration, you can use variables to set the values of the elements. This is valid when you set a system property programmatically or you use the command line interface. You can use a variable in the declarative configuration to access the values of the system properties you set.

For example, see the following command that sets a system property.

-Dgroup.name=dev

Let’s get the value of this system property in the declarative configuration of Hazelcast, as shown below.

<hazelcast>
  <group>
    <name>${group.name}</name>
  </group>
</hazelcast>

This also applies to the declarative configuration of Hazelcast Client, as shown below.

<hazelcast-client>
  <group>
    <name>${group.name}</name>
  </group>
</hazelcast-client>

If you do not want to rely on the system properties, you can use the XmlConfigBuilder and explicitly set a Properties instance, as shown below.

Properties properties = new Properties();

// fill the properties, e.g., from database/LDAP, etc.

XmlConfigBuilder builder = new XmlConfigBuilder();
builder.setProperties(properties);
Config config = builder.build();
HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
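Conceptually, variable substitution replaces each ${name} placeholder with the value of the property called name. The following sketch shows that idea with the JDK only; the class VariableSketch is hypothetical, and leaving unknown variables untouched is an assumption of this example rather than documented Hazelcast behavior.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative ${name} substitution; not the code Hazelcast uses internally.
final class VariableSketch {

    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${name} in text with the matching property value. */
    static String substitute(String text, Properties properties) {
        Matcher matcher = VARIABLE.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = properties.getProperty(matcher.group(1));
            // Assumption of this sketch: unknown variables are left as they are
            String replacement = value != null ? value : matcher.group();
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty("group.name", "dev");
        System.out.println(substitute("<name>${group.name}</name>", properties));
    }
}
```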

4.10. Variable Replacers

Variable replacers are used to replace custom strings while the configuration is being loaded, e.g., they can be used to mask sensitive information such as usernames and passwords. Their usage is not limited to security-related information.

Variable replacers implement the interface com.hazelcast.config.replacer.spi.ConfigReplacer and they can be configured only declaratively, i.e., in the Hazelcast declarative configuration files hazelcast.xml and hazelcast-client.xml. You can refer to the ConfigReplacer Javadoc for basic information on how a replacer works.

Variable replacers are configured within the element <config-replacers> under <hazelcast>, as shown below.

<hazelcast>
    ...
    <config-replacers fail-if-value-missing="false">
        <replacer class-name="com.acme.MyReplacer">
            <properties>
                <property name="propName">value</property>
                ...
            </properties>
        </replacer>
        <replacer class-name="example.AnotherReplacer"/>
    </config-replacers>
    ...
</hazelcast>

As you can see, <config-replacers> is the parent element for your replacers, which are declared using the <replacer> sub-elements. You can define multiple replacers under the <config-replacers>. Here are the descriptions of elements and attributes used for the replacer configuration:

  • fail-if-value-missing: Specifies whether the loading configuration process stops when a replacement value is missing. It is an optional attribute and its default value is true.

  • class-name: Full class name of the replacer.

  • <properties>: Contains names and values of the properties used to configure a replacer. Each property is defined using the <property> sub-element. All of the properties are explained in the upcoming sections.

The following replacer classes are provided by Hazelcast as example implementations of the ConfigReplacer interface. Note that you can also implement your own replacers.

  • EncryptionReplacer

  • PropertyReplacer

There is also an ExecReplacer which runs an external command and uses its standard output as the value for the variable. Please refer to its code sample.

Each example replacer is explained in the below sections.

4.10.1. EncryptionReplacer

This example EncryptionReplacer replaces encrypted variables with their plain forms. The secret key for encryption/decryption is generated from a password, which can be built from the content of a file and/or environment-specific values, such as a MAC address and the current user's data.

Its full class name is com.hazelcast.config.replacer.EncryptionReplacer and the replacer prefix is ENC. Here are the properties used to configure this example replacer:

  • cipherAlgorithm: Cipher algorithm used for the encryption/decryption. Its default value is AES.

  • keyLengthBits: Length of the secret key to be generated in bits. Its default value is 128 bits.

  • passwordFile: Path to a file whose content should be used as a part of the encryption password. When the property is not provided no file is used as a part of the password. Its default value is null.

  • passwordNetworkInterface: Name of network interface whose MAC address should be used as a part of the encryption password. When the property is not provided no network interface property is used as a part of the password. Its default value is null.

  • passwordUserProperties: Specifies whether the current user properties (user.name and user.home) should be used as a part of the encryption password. Its default value is true.

  • saltLengthBytes: Length of a random password salt in bytes. Its default value is 8 bytes.

  • secretKeyAlgorithm: Name of the secret-key algorithm to be associated with the generated secret key. Its default value is AES.

  • secretKeyFactoryAlgorithm: Algorithm used to generate a secret key from a password. Its default value is PBKDF2WithHmacSHA256.

  • securityProvider: Name of a Java Security Provider to be used for retrieving the configured secret key factory and the cipher. Its default value is null.

Older Java versions may not support all the algorithms used as defaults. Please use the property values supported by your Java version.

As a usage example, let’s create a password file and generate the encrypted strings out of this file.

1 - Create the password file: echo '/Za-uG3dDfpd,5.-' > /opt/master-password

2 - Define the encrypted variables:

java -cp hazelcast-*.jar \
    -DpasswordFile=/opt/master-password \
    -DpasswordUserProperties=false \
    com.hazelcast.config.replacer.EncryptionReplacer \
    "aGroup"
$ENC{Gw45stIlan0=:531:yVN9/xQpJ/Ww3EYkAPvHdA==}

java -cp hazelcast-*.jar \
    -DpasswordFile=/opt/master-password \
    -DpasswordUserProperties=false \
    com.hazelcast.config.replacer.EncryptionReplacer \
    "aPasswordToEncrypt"
$ENC{wJxe1vfHTgg=:531:WkAEdSi//YWEbwvVNoU9mUyZ0DE49acJeaJmGalHHfA=}

3 - Configure the replacer and put the encrypted variables into the configuration:

<hazelcast>
    <config-replacers>
        <replacer class-name="com.hazelcast.config.replacer.EncryptionReplacer">
            <properties>
                <property name="passwordFile">/opt/master-password</property>
                <property name="passwordUserProperties">false</property>
            </properties>
        </replacer>
    </config-replacers>
    <group>
        <name>$ENC{Gw45stIlan0=:531:yVN9/xQpJ/Ww3EYkAPvHdA==}</name>
        <password>$ENC{wJxe1vfHTgg=:531:WkAEdSi//YWEbwvVNoU9mUyZ0DE49acJeaJmGalHHfA=}</password>
    </group>
</hazelcast>

4 - Check if the decryption works:

java -jar hazelcast-*.jar
Apr 06, 2018 10:15:43 AM com.hazelcast.config.XmlConfigLocator
INFO: Loading 'hazelcast.xml' from working directory.
Apr 06, 2018 10:15:44 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [aGroup] [3.10-SNAPSHOT] Prefer IPv4 stack is true.

As you can see in the logs, the correctly decrypted group name value ("aGroup") is used.

4.10.2. PropertyReplacer

The PropertyReplacer replaces variables with the values of the properties that have the given names. Usually the system properties are used, e.g., ${user.name}. There is no need to define this replacer in the declarative configuration files.

Its full class name is com.hazelcast.config.replacer.PropertyReplacer and the replacer prefix is empty string ("").

4.10.3. Implementing Custom Replacers

You can also provide your own replacer implementations. All replacers have to implement the interface com.hazelcast.config.replacer.spi.ConfigReplacer. A simple snippet is shown below.

public interface ConfigReplacer {
    void init(Properties properties);
    String getPrefix();
    String getReplacement(String maskedValue);
}
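As an illustration of how the three methods fit together, here is a hypothetical replacer that decodes Base64 text. To keep the sketch compilable without Hazelcast on the classpath, it declares a local copy of the interface shown above; in a real project you would implement com.hazelcast.config.replacer.spi.ConfigReplacer instead. The class names, the B64 prefix and the "prefix" property are all assumptions made for this example, and Base64 only obfuscates a value, it does not protect it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Properties;

final class ReplacerSketch {

    // Local copy of the three methods listed above, so the sketch compiles on its own.
    interface ConfigReplacer {
        void init(Properties properties);
        String getPrefix();
        String getReplacement(String maskedValue);
    }

    // Hypothetical replacer that turns Base64 text back into plain text.
    static final class Base64Replacer implements ConfigReplacer {
        private String prefix = "B64";

        @Override
        public void init(Properties properties) {
            // "prefix" is a made-up property name for this example
            prefix = properties.getProperty("prefix", "B64");
        }

        @Override
        public String getPrefix() {
            return prefix;
        }

        @Override
        public String getReplacement(String maskedValue) {
            byte[] decoded = Base64.getDecoder().decode(maskedValue);
            return new String(decoded, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        Base64Replacer replacer = new Base64Replacer();
        replacer.init(new Properties());
        String masked = Base64.getEncoder().encodeToString("aGroup".getBytes(StandardCharsets.UTF_8));
        System.out.println(replacer.getPrefix() + " " + replacer.getReplacement(masked));
    }
}
```

By analogy with $ENC{...} for the ENC prefix and ${...} for the empty prefix, a variable handled by such a replacer would presumably be written as $B64{...} in the configuration file.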

5. Setting Up Clusters

This chapter describes Hazelcast clusters and the methods cluster members and native clients use to form a Hazelcast cluster.

5.1. Discovery Mechanisms

A Hazelcast cluster is a network of cluster members that run Hazelcast. Cluster members (also called nodes) automatically join together to form a cluster. This automatic joining takes place with various discovery mechanisms that the cluster members use to find each other.

Please note that, after a cluster is formed, communication between cluster members is always via TCP/IP, regardless of the discovery mechanism used.

Hazelcast uses the following discovery mechanisms.

You can refer to the Hazelcast IMDG Deployment and Operations Guide for advice on the best discovery mechanism to use.

5.1.1. TCP

You can configure Hazelcast to be a full TCP/IP cluster. Please see the Discovering Members by TCP section for configuration details.

5.1.2. Multicast

The multicast mechanism is not recommended for production since UDP is often blocked in production environments and other discovery mechanisms are more predictable.

With this mechanism, Hazelcast allows cluster members to find each other using multicast communication. Please see the Discovering Members by Multicast section.

5.1.3. AWS Cloud Discovery

Hazelcast supports EC2 auto-discovery. It is useful when you do not want to provide or you cannot provide the list of possible IP addresses. This discovery feature is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.1.4. Apache jclouds® Cloud Discovery

Hazelcast members and native clients support jclouds® for discovery. This mechanism allows applications to be deployed in various cloud infrastructure ecosystems in an infrastructure-agnostic way. This discovery feature is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.1.5. Azure Cloud Discovery

Hazelcast offers a discovery strategy for your Hazelcast applications running on Azure. This strategy provides all of your Hazelcast instances by returning the virtual machines within your Azure resource group that are tagged with a specified value. This discovery feature is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.1.6. Zookeeper Cloud Discovery

This discovery mechanism provides a service-based discovery strategy by using Apache Curator to communicate with your Zookeeper server. You can use this plugin with applications running Hazelcast 3.6.1 or higher that have the Discovery SPI enabled. This is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.1.7. Consul Cloud Discovery

Consul is a highly available and distributed service discovery and key-value store designed with support for the modern data center to make distributed systems and configuration easy. This mechanism provides a Consul based discovery strategy for Hazelcast enabled applications (Hazelcast 3.6 and higher) and enables Hazelcast members to dynamically discover one another via Consul. This discovery feature is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.1.8. etcd Cloud Discovery

This mechanism provides an etcd based discovery strategy for Hazelcast enabled applications (Hazelcast 3.6 and higher). This is an easy to configure plug-and-play Hazelcast discovery strategy that will optionally register each of your Hazelcast members with etcd and enable Hazelcast members to dynamically discover one another via etcd. This discovery feature is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.1.9. Hazelcast for PCF

Using a clickable Hazelcast Tile for Pivotal Cloud Foundry (PCF), you can deploy your Hazelcast cluster on PCF. This feature is provided as a Hazelcast plugin. Please see its documentation on how to install, configure and use the plugin Hazelcast for PCF.

5.1.10. Hazelcast OpenShift Integration

Hazelcast can run inside OpenShift, benefiting from its cluster management software Kubernetes for the discovery of members. Using Hazelcast Docker images, templates and default configuration files, you can deploy Hazelcast IMDG, Hazelcast IMDG Enterprise and Management Center onto OpenShift. Please see its documentation.

Please also see the Hazelcast for OpenShift guide, which presents how to set up local OpenShift environment, start Hazelcast cluster, configure Management Center and finally run a sample client application.

5.1.11. Eureka Cloud Discovery

Eureka is a REST based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers. Hazelcast supports Eureka V1 discovery; Hazelcast members within EC2 Virtual Private Cloud can discover each other using this mechanism. This discovery feature is provided as a Hazelcast plugin. Please see its documentation.

5.1.12. Heroku Cloud Discovery

Heroku is a platform as a service (PaaS) with which you can build, run and operate applications entirely in the cloud. It is a cloud platform based on a managed container system, with integrated data services and a powerful ecosystem. Hazelcast offers a discovery plugin that looks for IP addresses of other members by resolving service names against the Heroku DNS Discovery in Heroku Private Spaces. This discovery feature is provided as a Hazelcast plugin. Please see its documentation.

5.1.13. Kubernetes Cloud Discovery

Kubernetes is an open source system for automating deployment, scaling and management of containerized applications. Hazelcast provides Kubernetes discovery mechanism that looks for IP addresses of other members by resolving the requests against a Kubernetes Service Discovery system. It supports two different options of resolving against the discovery registry: (i) a request to the REST API, (ii) DNS Lookup against a given DNS service name. This discovery feature is provided as a Hazelcast plugin. Please see its documentation for information on configuring and using it.

5.2. Discovering Members by TCP

If multicast is not the preferred way of discovery for your environment, then you can configure Hazelcast to be a full TCP/IP cluster. When you configure Hazelcast to discover members by TCP/IP, you must list all or a subset of the members' hostnames and/or IP addresses as cluster members. You do not have to list all of these cluster members, but at least one of the listed members has to be active in the cluster when a new member joins.

To set your Hazelcast to be a full TCP/IP cluster, set the following configuration elements. Please refer to the tcp-ip element section for the full description of the TCP/IP discovery configuration elements.

  • Set the enabled attribute of the multicast element to "false".

  • Set the enabled attribute of the aws element to "false".

  • Set the enabled attribute of the tcp-ip element to "true".

  • Set your member elements within the tcp-ip element.

The following is an example declarative configuration.

<hazelcast>
   ...
  <network>
    ...
    <join>
      <multicast enabled="false">
      </multicast>
      <tcp-ip enabled="true">
        <member>machine1</member>
        <member>machine2</member>
        <member>machine3:5799</member>
        <member>192.168.1.0-7</member>
        <member>192.168.1.21</member>
      </tcp-ip>
      ...
    </join>
    ...
  </network>
  ...
</hazelcast>

As shown above, you can provide IP addresses or hostnames for member elements. You can also give a range of IP addresses, such as 192.168.1.0-7.

Instead of providing members line-by-line as shown above, you also have the option to use the members element and write comma-separated IP addresses, as shown below.

<members>192.168.1.0-7,192.168.1.21</members>

If you do not provide ports for the members, Hazelcast automatically tries the ports 5701, 5702 and so on.

By default, Hazelcast binds to all local network interfaces to accept incoming traffic. You can change this behavior using the system property hazelcast.socket.bind.any. If you set this property to false, Hazelcast uses the interfaces specified in the interfaces element (please refer to the Interfaces Configuration section). If no interfaces are provided, then it will try to resolve one interface to bind from the member elements.
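To illustrate the range notation used in the member elements, the following sketch expands a value such as 192.168.1.0-7 into individual addresses. It assumes that the range applies to the last octet and that both ends are inclusive; the class MemberRangeSketch is hypothetical and is not how Hazelcast parses member entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative expansion of an IPv4 last-octet range; an assumption-based sketch.
final class MemberRangeSketch {

    private static final Pattern RANGE =
            Pattern.compile("(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.)(\\d{1,3})-(\\d{1,3})");

    /** Expands "a.b.c.x-y" into a.b.c.x .. a.b.c.y; any other value is returned unchanged. */
    static List<String> expand(String member) {
        List<String> addresses = new ArrayList<>();
        Matcher matcher = RANGE.matcher(member);
        if (!matcher.matches()) {
            // Hostname, single IP address or host:port entry
            addresses.add(member);
            return addresses;
        }
        int from = Integer.parseInt(matcher.group(2));
        int to = Integer.parseInt(matcher.group(3));
        for (int octet = from; octet <= to; octet++) {
            addresses.add(matcher.group(1) + octet);
        }
        return addresses;
    }

    public static void main(String[] args) {
        System.out.println(expand("192.168.1.0-7"));
        System.out.println(expand("machine3:5799"));
    }
}
```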

5.3. Discovering Members by Multicast

With the multicast auto-discovery mechanism, Hazelcast allows cluster members to find each other using multicast communication. The cluster members do not need to know the concrete addresses of the other members; they simply send multicast messages that all listening members receive. Whether multicast is possible or allowed depends on your environment.

To set your Hazelcast to multicast auto-discovery, set the following configuration elements. Please refer to the multicast element section for the full description of the multicast discovery configuration elements.

  • Set the enabled attribute of the multicast element to "true".

  • Set multicast-group, multicast-port, multicast-time-to-live, etc. to your multicast values.

  • Set the enabled attribute of both tcp-ip and aws elements to "false".

The following is an example declarative configuration.

<hazelcast>
   ...
  <network>
    ...
        <join>
            <multicast enabled="true">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
                <multicast-time-to-live>32</multicast-time-to-live>
                <multicast-timeout-seconds>2</multicast-timeout-seconds>
                <trusted-interfaces>
                   <interface>192.168.1.102</interface>
                </trusted-interfaces>
            </multicast>
            <tcp-ip enabled="false">
            </tcp-ip>
            <aws enabled="false">
            </aws>
        </join>
  </network>
  ...
</hazelcast>

Pay attention to the multicast-timeout-seconds element. It specifies the time in seconds that a member should wait for a valid multicast response from another member running in the network before declaring itself the leader member (the first member joined to the cluster) and creating its own cluster. This only applies to the startup of members where no leader has been assigned yet. If you specify a high value for multicast-timeout-seconds, such as 60 seconds, each member will wait 60 seconds before moving on until a leader is selected. Be careful when providing a high value. Also, be careful not to set the value too low, or the members might give up too early and create their own clusters.

Multicast auto-discovery is not supported for Hazelcast native clients yet. However, we offer Multicast Discovery Plugin for this purpose. Please refer to the Discovering Native Clients section.

5.4. Discovering Native Clients

Hazelcast members and native Java clients can find each other with multicast discovery plugin. This plugin is implemented using Hazelcast Discovery SPI. You should configure the plugin both at Hazelcast members and Java clients in order to use multicast discovery.

To configure your cluster to have the multicast discovery plugin, follow these steps:

  • Disable the multicast and TCP/IP join mechanisms. To do this, set the enabled attributes of the multicast and tcp-ip elements to false in your hazelcast.xml configuration file.

  • Set the hazelcast.discovery.enabled property to true.

  • Add multicast discovery strategy configuration to your XML file, i.e., <discovery-strategies> element.

The following is an example declarative configuration.

...
<properties>
  <property name="hazelcast.discovery.enabled">true</property>
</properties>
...
<join>
  <multicast enabled="false">
  </multicast>
  <tcp-ip enabled="false">
  </tcp-ip>
  <discovery-strategies>
    <discovery-strategy class="com.hazelcast.spi.discovery.multicast.MulticastDiscoveryStrategy" enabled="true">
      <properties>
        <property name="group">224.2.2.3</property>
        <property name="port">54327</property>
      </properties>
    </discovery-strategy>
  </discovery-strategies>
</join>
...

The following are the multicast discovery plugin configuration properties with their descriptions.

  • group: String value that is used to set the multicast group, so that you can isolate your clusters.

  • port: Integer value that is used to set the multicast port.

5.5. Creating Cluster Groups

You can create cluster groups. To do this, use the group configuration element.

You can separate your clusters in a simple way by specifying group names. Example groupings can be by development, production, test, app, etc. The following is an example declarative configuration.

<hazelcast>
  <group>
    <name>production</name>
  </group>
  ...
</hazelcast>

You can also define the cluster groups using the programmatic configuration. A JVM can host multiple Hazelcast instances. Each Hazelcast instance can participate in only one group: it joins only its own group and does not interact with other groups. The following code example creates three separate Hazelcast instances: h1 belongs to the production cluster, while h2 and h3 belong to the development cluster.

Config configProd = new Config();
configProd.getGroupConfig().setName( "production" );

Config configDev = new Config();
configDev.getGroupConfig().setName( "development" );

HazelcastInstance h1 = Hazelcast.newHazelcastInstance( configProd );
HazelcastInstance h2 = Hazelcast.newHazelcastInstance( configDev );
HazelcastInstance h3 = Hazelcast.newHazelcastInstance( configDev );

5.5.1. Cluster Groups before Hazelcast 3.8.2

If you have a Hazelcast release older than 3.8.2, you also need to provide a group password along with the group name. The following are configuration examples with the password element:

<hazelcast>
  <group>
    <name>production</name>
    <password>prod-pass</password>
  </group>
  ...
</hazelcast>

Config configProd = new Config();
configProd.getGroupConfig().setName( "production" ).setPassword( "prod-pass" );

Config configDev = new Config();
configDev.getGroupConfig().setName( "development" ).setPassword( "dev-pass" );

HazelcastInstance h1 = Hazelcast.newHazelcastInstance( configProd );
HazelcastInstance h2 = Hazelcast.newHazelcastInstance( configDev );
HazelcastInstance h3 = Hazelcast.newHazelcastInstance( configDev );

Starting with 3.8.2, there is no need for a group password.

5.6. Member User Code Deployment - BETA

Hazelcast can dynamically load your custom classes or domain classes from a remote class repository, which typically includes lite members. For this purpose Hazelcast offers a distributed dynamic class loader.

Using this dynamic class loader, you can control the local caching of the classes loaded from other members, control the classes to be served to other members and create blacklists or whitelists of classes and packages. When you enable this feature, you will not have to deploy your classes to all cluster members.

The following is the brief working mechanism of the User Code Deployment feature:

  1. Dynamic class loader first checks the local classes, i.e., your classpath, for your custom class. If it is there, Hazelcast does not try to load it from the remote class repository.

  2. Then, it checks the cache of classes loaded from the remote class repository (for this, caching should have been enabled in your local, please refer to Configuring User Code Deployment section). If your class is found here, again, Hazelcast does not try to load it from the remote class repository.

  3. Finally, dynamic class loader checks the remote class repository. If a member in this repository returns the class, it means your class is found and will be used. You can also put this class into your local class cache as mentioned in the previous step.
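The three steps above can be sketched as a lookup that tries the local classpath, then the cache, then the remote repository. This is a conceptual model only: ClassLookupSketch, its fields and the returned labels are hypothetical, and real class loading involves defining classes rather than returning labels.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Conceptual model of the lookup order; not Hazelcast's class loader.
final class ClassLookupSketch {

    private final Set<String> localClasspath;
    private final Map<String, byte[]> cache = new HashMap<>();
    private final Function<String, byte[]> remoteRepository;
    private final boolean cachingEnabled;

    ClassLookupSketch(Set<String> localClasspath, Function<String, byte[]> remoteRepository,
                      boolean cachingEnabled) {
        this.localClasspath = localClasspath;
        this.remoteRepository = remoteRepository;
        this.cachingEnabled = cachingEnabled;
    }

    /** Returns where the class was found: "local", "cache", "remote" or "not found". */
    String lookup(String className) {
        if (localClasspath.contains(className)) {
            return "local";                    // step 1: local classes first
        }
        if (cachingEnabled && cache.containsKey(className)) {
            return "cache";                    // step 2: previously loaded remote classes
        }
        byte[] bytes = remoteRepository.apply(className);
        if (bytes == null) {
            return "not found";
        }
        if (cachingEnabled) {
            cache.put(className, bytes);       // keep it for the next lookup
        }
        return "remote";                       // step 3: served by a remote member
    }
}
```

With caching enabled, the first lookup of a remote class goes to the remote repository and later lookups of the same class are answered from the cache.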

5.6.1. Configuring User Code Deployment

User Code Deployment feature is not enabled by default. You can configure this feature declaratively or programmatically. Following are example configuration snippets:

Declarative Configuration

<user-code-deployment enabled="true">
        <class-cache-mode>ETERNAL</class-cache-mode>
        <provider-mode>LOCAL_CLASSES_ONLY</provider-mode>
        <blacklist-prefixes>com.foo</blacklist-prefixes>
        <whitelist-prefixes>com.bar.MyClass</whitelist-prefixes>
        <provider-filter>HAS_ATTRIBUTE:lite</provider-filter>
</user-code-deployment>

Programmatic Configuration

Config config = new Config();
UserCodeDeploymentConfig distCLConfig = config.getUserCodeDeploymentConfig();
distCLConfig.setEnabled( true )
        .setClassCacheMode( UserCodeDeploymentConfig.ClassCacheMode.ETERNAL )
        .setProviderMode( UserCodeDeploymentConfig.ProviderMode.LOCAL_CLASSES_ONLY )
        .setBlacklistedPrefixes( "com.foo" )
        .setWhitelistedPrefixes( "com.bar.MyClass" )
        .setProviderFilter( "HAS_ATTRIBUTE:lite" );

User Code Deployment has the following configuration elements and attributes:

  • enabled: Specifies whether dynamic class loading is enabled or not. The feature is disabled by default, so set this attribute to "true" to enable it.

  • <class-cache-mode>: Controls the local caching behavior for the classes loaded from the remote class repository. Available values are as follows:

    • ETERNAL: Cache the loaded classes locally. This is the default value and suitable when you load long-living objects, such as domain objects stored in a map.

    • OFF: Do not cache the loaded classes locally. It is suitable for loading runnables, callables, entry processors, etc.

  • <provider-mode>: Controls how the classes are served to the other cluster members. Available values are as follows:

    • LOCAL_AND_CACHED_CLASSES: Serve classes loaded from both local classpath and from other members. This is the default value.

    • LOCAL_CLASSES_ONLY: Serve classes from the local classpath only. Classes loaded from other members will be used locally, but they are not served to other members.

    • OFF: Never serve classes to other members.

  • <blacklist-prefixes>: Comma separated name prefixes of classes/packages to be prevented from dynamic class loading. For example, if you set it as "com.foo", remote loading of all classes from the "com.foo" package will be blacklisted, including the classes from all its sub-packages. If you set it as "com.foo.Class", then the "Class" and all classes having the "Class" as prefix in the "com.foo" package will be blacklisted. There are some built-in prefixes which are blacklisted by default. These are as follows:

    • javax.

    • java.

    • sun.

    • com.hazelcast.

  • <whitelist-prefixes>: Comma separated name prefixes of classes/packages from which classes are allowed to be loaded. It lets you quickly restrict remote loading to classes from selected packages. It can be used together with blacklisting. For example, you can whitelist the prefix "com.foo" and blacklist the prefix "com.foo.secret".

  • <provider-filter>: Filter to constrain the members to be used for a class loading request when a class is not available locally. The value is in the format "HAS_ATTRIBUTE:foo". When it is set as "HAS_ATTRIBUTE:foo", the class loading request will only be sent to the members which have "foo" as a member attribute. Setting this to null allows loading classes from all members. Please see an example in the below section.
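The interplay of blacklist and whitelist prefixes can be sketched as a simple prefix check. This sketch assumes that a blacklist match always wins and that an empty whitelist means no whitelist restriction; the class PrefixFilterSketch is hypothetical and is not Hazelcast's filter implementation.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative prefix filtering; the precedence rules are assumptions of this sketch.
final class PrefixFilterSketch {

    // Built-in blacklisted prefixes listed in this section
    private static final List<String> BUILT_IN_BLACKLIST =
            Arrays.asList("javax.", "java.", "sun.", "com.hazelcast.");

    /** Returns true if the class may be loaded dynamically from another member. */
    static boolean isAllowed(String className, List<String> blacklist, List<String> whitelist) {
        if (startsWithAny(className, BUILT_IN_BLACKLIST) || startsWithAny(className, blacklist)) {
            return false;
        }
        // Assumption: no whitelist configured means every non-blacklisted class is allowed
        return whitelist.isEmpty() || startsWithAny(className, whitelist);
    }

    private static boolean startsWithAny(String className, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> blacklist = Arrays.asList("com.foo.secret");
        List<String> whitelist = Arrays.asList("com.foo");
        System.out.println(isAllowed("com.foo.Order", blacklist, whitelist));
        System.out.println(isAllowed("com.foo.secret.Key", blacklist, whitelist));
    }
}
```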

5.6.2. Example for Filtering Members

As described above, the configuration element provider-filter is used to constrain a member to load classes only from a subset of all cluster members. The value of the provider-filter must be set as a member attribute in the desired members from which the classes will be loaded. Please see the following example usage provided as programmatic configurations.

The below example configuration will allow the Hazelcast member to load classes only from members with the class-provider attribute set. It will not ask any other member to provide a locally unavailable class:

Config hazelcastConfig = new Config();
UserCodeDeploymentConfig userCodeDeploymentConfig = hazelcastConfig.getUserCodeDeploymentConfig();
userCodeDeploymentConfig.setEnabled(true);
userCodeDeploymentConfig.setProviderFilter("HAS_ATTRIBUTE:class-provider");

HazelcastInstance instance = Hazelcast.newHazelcastInstance(hazelcastConfig);

And the below example configuration sets the attribute class-provider for a member. So, the above member will load classes from the members who have the attribute class-provider:

Config hazelcastConfig = new Config();
MemberAttributeConfig memberAttributes = hazelcastConfig.getMemberAttributeConfig();
memberAttributes.setStringAttribute("class-provider", "true");

HazelcastInstance instance = Hazelcast.newHazelcastInstance(hazelcastConfig);
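The effect of the filter can be pictured as selecting, from all members, those that carry the named attribute. The sketch below models members as a map from a member name to its attribute names; ProviderFilterSketch and this data model are assumptions made for illustration and do not use the Hazelcast API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative evaluation of a "HAS_ATTRIBUTE:<name>" filter over a simplified member model.
final class ProviderFilterSketch {

    private static final String PREFIX = "HAS_ATTRIBUTE:";

    /** Returns the members that a class loading request would be sent to. */
    static List<String> eligibleProviders(String providerFilter, Map<String, Set<String>> attributesByMember) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : attributesByMember.entrySet()) {
            if (providerFilter == null) {
                // No filter: every member may serve classes
                eligible.add(entry.getKey());
            } else if (providerFilter.startsWith(PREFIX)
                    && entry.getValue().contains(providerFilter.substring(PREFIX.length()))) {
                eligible.add(entry.getKey());
            }
        }
        return eligible;
    }
}
```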

5.7. Client User Code Deployment - BETA

You can use the User Code Deployment at the client side for the following situations:

  1. You have objects that will run on the cluster via the clients such as Runnable, Callable and Entry Processors.

  2. You have new or amended user domain objects (in-memory format of the IMap set to Object) which need to be deployed into the cluster.

When this feature is enabled, the clients will deploy these classes to the members. This way, when a client adds a new class, the members do not need to be restarted to include the new class in their classpaths.

You can also use the client permission policy to specify which clients are permitted to use User Code Deployment. Please see the Permissions section.

5.7.1. Configuring Client User Code Deployment

Client User Code Deployment feature is not enabled by default. You can configure this feature declaratively or programmatically. Following are example configuration snippets:

Declarative Configuration

In your hazelcast-client.xml:

<user-code-deployment enabled="true">
    <jarPaths>
        <jarPath>/User/sample/sample.jar</jarPath>
        <jarPath>sample.jar</jarPath> <!--from class path -->
        <jarPath>https://com.sample.com/sample.jar</jarPath>
        <jarPath>file://Users/sample/sample.jar</jarPath>
    </jarPaths>
    <classNames>
            <!-- for the classes available in client class path -->
        <className>sample.ClassName</className>
        <className>sample.ClassName2</className>
    </classNames>
</user-code-deployment>

Programmatic Configuration

ClientConfig clientConfig = new ClientConfig();
ClientUserCodeDeploymentConfig clientUserCodeDeploymentConfig = new ClientUserCodeDeploymentConfig();

clientUserCodeDeploymentConfig.addJar("/User/sample/sample.jar");
clientUserCodeDeploymentConfig.addJar("https://com.sample.com/sample.jar");
clientUserCodeDeploymentConfig.addClass("sample.ClassName");
clientUserCodeDeploymentConfig.addClass("sample.ClassName2");

clientUserCodeDeploymentConfig.setEnabled(true);
clientConfig.setUserCodeDeploymentConfig(clientUserCodeDeploymentConfig);

Important to Know

Note that User Code Deployment should also be enabled on the members to use this feature.

Config config = new Config();
UserCodeDeploymentConfig userCodeDeploymentConfig = config.getUserCodeDeploymentConfig();
userCodeDeploymentConfig.setEnabled( true );

Please refer to the Member User Code Deployment section for more information on enabling it on the member side and its configuration properties.

For the property class-cache-mode, Client User Code Deployment supports only the ETERNAL mode, regardless of the configuration set at the member side (which can be ETERNAL or OFF).

For the property provider-mode, Client User Code Deployment supports only the LOCAL_AND_CACHED_CLASSES mode, regardless of the configuration set at the member side (which can be LOCAL_AND_CACHED_CLASSES, LOCAL_CLASSES_ONLY or OFF).

The remaining properties, which are blacklist-prefixes, whitelist-prefixes and provider-filter configured at the member side, will affect the client user code deployment’s behavior too. For example, assuming that you provide com.foo as a blacklist prefix at the member side, the member will discard the classes with the prefix com.foo loaded by the client.
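The prefix check described above can be pictured with the following minimal sketch. It is not Hazelcast's implementation: the class PrefixFilterSketch and its method are hypothetical, and treating an empty whitelist as "no restriction" is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative stand-in for member-side prefix filtering (hypothetical class). */
final class PrefixFilterSketch {
    private final List<String> blacklistPrefixes;
    private final List<String> whitelistPrefixes;

    PrefixFilterSketch(List<String> blacklistPrefixes, List<String> whitelistPrefixes) {
        this.blacklistPrefixes = blacklistPrefixes;
        this.whitelistPrefixes = whitelistPrefixes;
    }

    /** Returns true if the class name passes both prefix lists. */
    boolean isAccepted(String className) {
        for (String prefix : blacklistPrefixes) {
            if (className.startsWith(prefix)) {
                return false; // a blacklisted prefix always rejects the class
            }
        }
        if (whitelistPrefixes.isEmpty()) {
            return true; // assumption: an empty whitelist means "no restriction"
        }
        for (String prefix : whitelistPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PrefixFilterSketch filter = new PrefixFilterSketch(
                Arrays.asList("com.foo"), Collections.<String>emptyList());
        System.out.println(filter.isAccepted("com.foo.Task"));     // rejected
        System.out.println(filter.isAccepted("sample.ClassName")); // accepted
    }
}
```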

5.8. Partition Group Configuration

Hazelcast distributes key objects into partitions using the consistent hashing algorithm. Multiple replicas are created for each partition and those partition replicas are distributed among Hazelcast members. An entry is stored in the members that own replicas of the partition to which the entry’s key is assigned. The total partition count is 271 by default; you can change it with the configuration property hazelcast.partition.count. Please see the System Properties appendix.
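As a rough illustration of how a key ends up in one of the 271 default partitions, the sketch below maps a hash value to a partition ID with a modulo operation. It is a simplification under stated assumptions: the class PartitionIdSketch is hypothetical, and Object.hashCode() stands in for the hash that Hazelcast computes internally.

```java
/** Simplified key-to-partition mapping (hypothetical class, not Hazelcast code). */
final class PartitionIdSketch {
    static final int DEFAULT_PARTITION_COUNT = 271;

    /** Maps a key to a partition ID in the range [0, partitionCount). */
    static int partitionId(Object key, int partitionCount) {
        // Assumption: hashCode() replaces Hazelcast's internal hash of the key.
        // floorMod keeps the result non-negative even for negative hash values.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int id = partitionId("order-42", DEFAULT_PARTITION_COUNT);
        System.out.println("partition ID: " + id);
    }
}
```

The same key always maps to the same partition, which is why an entry can be located from its key alone.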

The Hazelcast member that owns the primary replica of a partition is called the partition owner. Other replicas are called backups. Based on the configuration, a key object can be kept in multiple replicas of a partition. A member can hold at most one replica of a partition (ownership or backup).

By default, Hazelcast distributes partition replicas randomly and equally among the cluster members, assuming all members in the cluster are identical. But what if some members share the same JVM or physical machine or chassis and you want backups of these members to be assigned to members in another machine or chassis? What if processing or memory capacities of some members are different and you do not want an equal number of partitions to be assigned to all members?

To deal with such scenarios, you can group members in the same JVM (or physical machine) or members located in the same chassis. Or you can group members to create identical capacity. We call these groups partition groups. Partitions are assigned to those partition groups instead of individual members. Backup replicas of a partition which is owned by a partition group are located in other partition groups.

5.8.1. Grouping Types

When you enable partition grouping, Hazelcast presents the following choices for you to configure partition groups.

HOST_AWARE

You can group members automatically using the IP addresses of members, so members sharing the same network interface will be grouped together. All members on the same host (IP address or domain name) will be a single partition group. This helps to avoid data loss when a physical server crashes, because multiple replicas of the same partition are not stored on the same host. But if there are multiple network interfaces or domain names per physical machine, that will make this assumption invalid.
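The idea behind host-aware grouping can be sketched as a simple group-by on the host part of each member address. The class HostAwareGroupingSketch and the "host:port" address strings are assumptions for illustration only; Hazelcast builds these groups internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Groups member addresses by host (hypothetical class, not Hazelcast code). */
final class HostAwareGroupingSketch {

    /** Returns one group per host; every "host:port" address on that host joins the group. */
    static Map<String, List<String>> groupByHost(List<String> memberAddresses) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String address : memberAddresses) {
            String host = address.substring(0, address.lastIndexOf(':'));
            groups.computeIfAbsent(host, h -> new ArrayList<>()).add(address);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("10.0.0.1:5701", "10.0.0.1:5702", "10.0.0.2:5701");
        // Two members share host 10.0.0.1, so they form a single partition group.
        System.out.println(groupByHost(members));
    }
}
```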

Following are declarative and programmatic configuration snippets that show how to enable HOST_AWARE grouping.

<partition-group enabled="true" group-type="HOST_AWARE" />

Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.HOST_AWARE );

CUSTOM

You can do custom grouping using Hazelcast’s interface matching configuration. This way, you can add multiple, different interfaces to a group. You can also use wildcards in the interface addresses. For example, you can create rack-aware or data warehouse partition groups using custom partition grouping.

Following are declarative and programmatic configuration examples that show how to enable and use CUSTOM grouping.

<partition-group enabled="true" group-type="CUSTOM">
<member-group>
  <interface>10.10.0.*</interface>
  <interface>10.10.3.*</interface>
  <interface>10.10.5.*</interface>
</member-group>
<member-group>
  <interface>10.10.10.10-100</interface>
  <interface>10.10.1.*</interface>
  <interface>10.10.2.*</interface>
</member-group>
</partition-group>

Config config = new Config();
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
        .setGroupType( PartitionGroupConfig.MemberGroupType.CUSTOM );

MemberGroupConfig memberGroupConfig = new MemberGroupConfig();
memberGroupConfig.addInterface( "10.10.0.*" )
        .addInterface( "10.10.3.*" ).addInterface("10.10.5.*" );

MemberGroupConfig memberGroupConfig2 = new MemberGroupConfig();
memberGroupConfig2.addInterface( "10.10.10.10-100" )
        .addInterface( "10.10.1.*").addInterface( "10.10.2.*" );

partitionGroupConfig.addMemberGroupConfig( memberGroupConfig );
partitionGroupConfig.addMemberGroupConfig( memberGroupConfig2 );

If you configured your members to discover each other by their IP addresses when the cluster was forming, you should use the IP addresses for the <interface> element. If your members discovered each other by their hostnames, you should use the hostnames.

PER_MEMBER

You can give every member its own group. Each member is then a group of its own, and the primary and backup replicas of a partition are distributed randomly, but never on the same member. This is the default configuration for a Hazelcast cluster and gives the least amount of protection among the grouping types. It provides good redundancy when Hazelcast members are on separate hosts. However, if multiple instances run on the same host, this type is not a good option.

Following are declarative and programmatic configuration snippets that show how to enable PER_MEMBER grouping.

<partition-group enabled="true" group-type="PER_MEMBER" />

Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.PER_MEMBER );

ZONE_AWARE

You can use ZONE_AWARE configuration with Hazelcast AWS, Hazelcast jclouds or Hazelcast Azure Discovery Service plugins.

As discovery services, these plugins put zone information to the Hazelcast member attributes map during the discovery process. When ZONE_AWARE is configured as partition group type, Hazelcast creates the partition groups with respect to member attributes map entries that include zone information. That means backups are created in the other zones and each zone will be accepted as one partition group.

When using the ZONE_AWARE partition grouping, a Hazelcast cluster spanning multiple availability zones (AZs) should have an equal number of members in each AZ. Otherwise, partitions will be distributed unevenly among the members.

The following attributes are supported; the Discovery Service plugins set them during Hazelcast member start-up:

  • hazelcast.partition.group.zone: For the zones in the same area.

  • hazelcast.partition.group.rack: For different racks in the same zone.

  • hazelcast.partition.group.host: For a shared physical member if virtualization is used.

hazelcast-jclouds offers rack or host information in addition to zone information, depending on the cloud provider. In such cases, Hazelcast looks for zone, rack and host information in the given order and creates partition groups with the available information.
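The lookup order mentioned above (zone, then rack, then host) can be sketched as follows. This is an illustrative reading of that sentence, not Hazelcast's internal code: the class ZoneAwareKeySketch is hypothetical, and a plain Map stands in for the member attributes map.

```java
import java.util.HashMap;
import java.util.Map;

/** Picks the grouping attribute in zone, rack, host order (hypothetical class). */
final class ZoneAwareKeySketch {
    static final String[] LOOKUP_ORDER = {
        "hazelcast.partition.group.zone",
        "hazelcast.partition.group.rack",
        "hazelcast.partition.group.host"
    };

    /** Returns the first available grouping value, or null if none of the attributes is set. */
    static String groupingValue(Map<String, String> memberAttributes) {
        for (String key : LOOKUP_ORDER) {
            String value = memberAttributes.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("hazelcast.partition.group.rack", "rack-7");
        attributes.put("hazelcast.partition.group.host", "host-a");
        // No zone attribute is present, so the rack value is used.
        System.out.println(groupingValue(attributes));
    }
}
```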

Following are declarative and programmatic configuration snippets that show how to enable ZONE_AWARE grouping.

<partition-group enabled="true" group-type="ZONE_AWARE" />

Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.ZONE_AWARE );

SPI

You can provide your own partition group implementation using the SPI configuration. To create your partition group implementation, you need to first extend the discovery strategy class of the discovery service plugin (AbstractDiscoveryStrategy in the sample below), override the method public PartitionGroupStrategy getPartitionGroupStrategy() and return your PartitionGroupStrategy implementation in that overridden method.

Following is a sample code covering the implementation steps mentioned in the above paragraph:

public class CustomDiscovery extends AbstractDiscoveryStrategy {

    public CustomDiscovery(ILogger logger, Map<String, Comparable> properties) {
        super(logger, properties);
    }

    @Override
    public Iterable<DiscoveryNode> discoverNodes() {
        // Placeholder: replace with your own discovery logic.
        return java.util.Collections.<DiscoveryNode>emptyList();
    }

    @Override
    public PartitionGroupStrategy getPartitionGroupStrategy() {
        return new CustomPartitionGroupStrategy();
    }

    private class CustomPartitionGroupStrategy implements PartitionGroupStrategy {
        @Override
        public Iterable<MemberGroup> getMemberGroups() {
            // Placeholder: replace with your own grouping logic.
            return java.util.Collections.<MemberGroup>emptyList();
        }
    }
}

5.9. Logging Configuration

Hazelcast has a flexible logging configuration and does not depend on any logging framework except JDK logging. It has built-in adapters for a number of logging frameworks and it also supports custom loggers by providing logging interfaces.

To use built-in adapters, set the hazelcast.logging.type property to one of the predefined types below.

  • jdk: JDK logging (default)

  • log4j: Log4j

  • log4j2: Log4j2

  • slf4j: Slf4j

  • none: disable logging

You can set hazelcast.logging.type through declarative configuration, programmatic configuration, or JVM system property.

If you choose to use log4j, log4j2, or slf4j, you should include the proper dependencies in the classpath.

Declarative Configuration

<hazelcast>
  ....
  <properties>
    <property name="hazelcast.logging.type">log4j</property>
    ....
  </properties>
</hazelcast>

Programmatic Configuration

Config config = new Config();
config.setProperty( "hazelcast.logging.type", "log4j" );

System Property

  • Using JVM parameter: java -Dhazelcast.logging.type=slf4j

  • Using System class: System.setProperty( "hazelcast.logging.type", "none" );

If the provided logging mechanisms are not satisfactory, you can implement your own using the custom logging feature. To use it, implement the com.hazelcast.logging.LoggerFactory and com.hazelcast.logging.ILogger interfaces and set the system property hazelcast.logging.class to the name of your custom LoggerFactory class.

-Dhazelcast.logging.class=foo.bar.MyLoggingFactory

You can also listen to logging events generated by Hazelcast runtime by registering LogListeners to LoggingService.

LogListener listener = new LogListener() {
  public void log( LogEvent logEvent ) {
    // do something
  }
};
HazelcastInstance instance = Hazelcast.newHazelcastInstance();
LoggingService loggingService = instance.getLoggingService();
loggingService.addLogListener( Level.INFO, listener );

Through the LoggingService, you can get the currently used ILogger implementation and log your own messages too.

If you do not configure logging through the command line, be careful about the order of initialization: Hazelcast classes that are loaded before your logging configuration is read may default to JDK logging. Once a logging mechanism is selected, it does not change.
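One way to respect this ordering is to set the property at the very start of your application, before any Hazelcast class is touched. The sketch below uses only the JDK; the class LoggingTypeBootstrap and its helper method are hypothetical names introduced for illustration.

```java
/** Sets hazelcast.logging.type early in application start-up (hypothetical helper). */
final class LoggingTypeBootstrap {
    static final String LOGGING_TYPE_PROPERTY = "hazelcast.logging.type";

    /** Applies the given type unless one was already supplied, e.g. with -D on the command line. */
    static String ensureLoggingType(String type) {
        if (System.getProperty(LOGGING_TYPE_PROPERTY) == null) {
            System.setProperty(LOGGING_TYPE_PROPERTY, type);
        }
        return System.getProperty(LOGGING_TYPE_PROPERTY);
    }

    public static void main(String[] args) {
        // Call this first, before any Hazelcast class is loaded.
        String effective = ensureLoggingType("log4j2");
        System.out.println(LOGGING_TYPE_PROPERTY + "=" + effective);
        // Only after this point would the application create its Hazelcast instance.
    }
}
```

A value given on the command line wins in this sketch, which keeps the behavior predictable across environments.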

Below are example configurations for Log4j2 and Log4j. Note that Hazelcast does not recommend any specific logging library; these examples are provided only to demonstrate how to configure the logging. You can use your custom logging as explained above.

5.9.1. Example Log4j2 Configuration

Specify the logging type as Log4j2 and a separate logging configuration file as shown below.

Using JVM arguments:

-Dhazelcast.logging.type=log4j2
-Dlog4j.configurationFile=/path/to/properties/log4j2.properties

Using declarative configuration (hazelcast.xml):

<hazelcast>
   ...
   <properties>
      <property name="hazelcast.logging.type">log4j2</property>
      <property name="log4j2.configuration">/path/to/properties/log4j2.properties</property>
      ...
   </properties>
   ...
</hazelcast>

Following is an example log4j2.properties file:

rootLogger=file
rootLogger.level=info
property.filepath=/path/to/log/files
property.filename=hazelcast

appender.file.type=RollingFile
appender.file.name=RollingFile
appender.file.fileName=${filepath}/${filename}.log
appender.file.filePattern=${filepath}/${filename}-%d{yyyy-MM-dd}-%i.log.gz
appender.file.layout.type=PatternLayout
appender.file.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
appender.file.policies.type=Policies
appender.file.policies.time.type=TimeBasedTriggeringPolicy
appender.file.policies.time.interval=1
appender.file.policies.time.modulate=true
appender.file.policies.size.type=SizeBasedTriggeringPolicy
appender.file.policies.size.size=50MB
appender.file.strategy.type=DefaultRolloverStrategy
appender.file.strategy.max=100

rootLogger.appenderRefs=file
rootLogger.appenderRef.file.ref=RollingFile

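#Hazelcast specific logs.

#logger.hazelcast.name=com.hazelcast
#logger.hazelcast.level=debug

#logger.cluster.name=com.hazelcast.cluster
#logger.cluster.level=debug
#logger.partition.name=com.hazelcast.partition
#logger.partition.level=debug
#logger.partitionService.name=com.hazelcast.partition.InternalPartitionService
#logger.partitionService.level=debug
#logger.nio.name=com.hazelcast.nio
#logger.nio.level=debug
#logger.hibernate.name=com.hazelcast.hibernate
#logger.hibernate.level=debug

To enable the debug logs for all Hazelcast operations, uncomment the following lines in the above configuration file:

logger.hazelcast.name=com.hazelcast
logger.hazelcast.level=debug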

If you do not need detailed logs, the default settings are enough. Using the Hazelcast specific lines in the above configuration file, you can choose to see specific logs (cluster, partition, hibernate, etc.) at the desired levels.

5.9.2. Example Log4j Configuration

Its configuration is similar to that of Log4j2. Below is the JVM argument way of specifying the logging type and configuration file:

-Dhazelcast.logging.type=log4j
-Dlog4j.configuration=file:/path/to/properties/log4j.properties

Following is an example log4j.properties file:

log4j.rootLogger=INFO,file

log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/path/to/log/files/hazelcast.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %p [%c{1}] - %m%n
log4j.appender.file.maxFileSize=50MB
log4j.appender.file.maxBackupIndex=100
log4j.appender.file.threshold=DEBUG

#log4j.logger.com.hazelcast=debug

#log4j.logger.com.hazelcast.cluster=debug
#log4j.logger.com.hazelcast.partition=debug
#log4j.logger.com.hazelcast.partition.InternalPartitionService=debug
#log4j.logger.com.hazelcast.nio=debug
#log4j.logger.com.hazelcast.hibernate=debug

5.10. Other Network Configurations

All network related configurations are performed via the network element in the Hazelcast XML configuration file or the class NetworkConfig when using programmatic configuration. The following subsections describe the available configurations that you can perform under the network element.

5.10.1. Public Address

public-address overrides the public address of a member. By default, a member selects its socket address as its public address. But behind a network address translation (NAT), two endpoints (members) may not be able to see/access each other. If both members set their public addresses to the addresses defined for them on the NAT, they can communicate with each other. In this case, their public addresses are not addresses of a local network interface but virtual addresses defined by the NAT. Setting this element is optional; it is useful when you have a private cloud. Note that the value for this element should be given in the format host IP address:port number. See the following examples.

Declarative:

<network>
    <public-address>11.22.33.44:5555</public-address>
</network>

Programmatic:

Config config = new Config();
config.getNetworkConfig()
    .setPublicAddress( "11.22.33.44:5555" );

5.10.2. Port

You can specify the ports that Hazelcast will use to communicate between cluster members. Its default value is 5701. The following are example configurations.

Declarative:

<network>
  <port port-count="20" auto-increment="true">5701</port>
</network>

Programmatic:

Config config = new Config();
config.getNetworkConfig().setPort( 5701 )
    .setPortAutoIncrement( true ).setPortCount( 20 );

According to the above example, Hazelcast will try to find free ports between 5701 and 5720.

port has the following attributes.

  • port-count: By default, Hazelcast tries 100 ports to bind. This means that if you set the value of port to 5701, Hazelcast tries to find free ports between 5701 and 5800 as members join the cluster. You may want to change the port count, for example when you run many instances on a single machine or when you want only a few ports to be assigned. The port-count attribute is used for this purpose; its default value is 100.

  • auto-increment: In some cases you may want to choose to use only one port. In that case, you can disable the auto-increment feature of port by setting auto-increment to false. The port-count attribute is not used when auto-increment feature is disabled.
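The interplay of port, port-count and auto-increment can be sketched as a small calculation of the candidate ports. The class PortCandidatesSketch is hypothetical and only mirrors the description above; it does not bind any sockets.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes the ports a member would try, per the description above (hypothetical class). */
final class PortCandidatesSketch {

    static List<Integer> candidatePorts(int port, int portCount, boolean autoIncrement) {
        // port-count is not used when auto-increment is disabled: only the base port is tried.
        int count = autoIncrement ? portCount : 1;
        List<Integer> ports = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ports.add(port + i);
        }
        return ports;
    }

    public static void main(String[] args) {
        List<Integer> ports = candidatePorts(5701, 20, true);
        // With port 5701 and port-count 20, the range is 5701 to 5720.
        System.out.println(ports.get(0) + " to " + ports.get(ports.size() - 1));
    }
}
```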

5.10.3. Outbound Ports

By default, Hazelcast lets the system pick an ephemeral port during the socket bind operation. But security policies/firewalls may require you to restrict the outbound ports used by Hazelcast-enabled applications. To fulfill this requirement, you can configure Hazelcast to use only defined outbound ports. The following are example configurations.

Declarative:

<network>
  <outbound-ports>
    <!-- ports between 33000 and 35000 -->
    <ports>33000-35000</ports>
    <!-- comma separated ports -->
    <ports>37000,37001,37002,37003</ports>
    <ports>38000,38500-38600</ports>
  </outbound-ports>
</network>

Programmatic:

...
NetworkConfig networkConfig = config.getNetworkConfig();
// ports between 35000 and 35100
networkConfig.addOutboundPortDefinition("35000-35100");
// comma separated ports
networkConfig.addOutboundPortDefinition("36001, 36002, 36003");
networkConfig.addOutboundPort(37000);
networkConfig.addOutboundPort(37001);
...

You can use port ranges and/or comma separated ports.

As shown in the programmatic configuration, you use the method addOutboundPort to add only one port. If you need to add a group of ports, then use the method addOutboundPortDefinition.

In the declarative configuration, the element ports can be used for both single and multiple port definitions. When you set this element to 0 or *, your operating system (not Hazelcast) will select a free port from the ephemeral range.
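To make the port definition syntax concrete, the following sketch expands a definition string into individual port numbers. It is an illustration only: the class OutboundPortsSketch is hypothetical, and returning an empty set for 0 or * is an assumption chosen here to represent "let the operating system choose".

```java
import java.util.Set;
import java.util.TreeSet;

/** Expands outbound port definitions such as "38000,38500-38600" (hypothetical class). */
final class OutboundPortsSketch {

    static Set<Integer> parse(String definition) {
        Set<Integer> ports = new TreeSet<>();
        for (String token : definition.split(",")) {
            String part = token.trim();
            if (part.isEmpty()) {
                continue;
            }
            if (part.equals("*") || part.equals("0")) {
                return new TreeSet<>(); // assumption: empty set means the OS picks a port
            }
            int dash = part.indexOf('-');
            if (dash < 0) {
                ports.add(Integer.parseInt(part)); // single port
            } else {
                int from = Integer.parseInt(part.substring(0, dash).trim());
                int to = Integer.parseInt(part.substring(dash + 1).trim());
                for (int p = from; p <= to; p++) {
                    ports.add(p); // inclusive range
                }
            }
        }
        return ports;
    }

    public static void main(String[] args) {
        System.out.println(parse("36001, 36002, 36003"));
        System.out.println(parse("38000,38500-38600").size() + " ports");
    }
}
```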

5.10.4. Reuse Address

When you shut down a cluster member, the server socket port will be in the TIME_WAIT state for the next couple of minutes. If you start the member right after shutting it down, you may not be able to bind it to the same port because it is in the TIME_WAIT state. If you set the reuse-address element to true, the TIME_WAIT state is ignored and you can bind the member to the same port again.
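At the JDK level, this kind of behavior is typically controlled with the SO_REUSEADDR socket option. The sketch below shows that option on a plain ServerSocket; it assumes, without confirmation from this manual, that reuse-address maps to the same option. The class ReuseAddressSketch is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Demonstrates SO_REUSEADDR on a plain JDK server socket (hypothetical class). */
final class ReuseAddressSketch {

    /** Binds a server socket with SO_REUSEADDR enabled and returns the effective flag. */
    static boolean bindWithReuseAddress(int port) {
        try (ServerSocket serverSocket = new ServerSocket()) { // created unbound
            serverSocket.setReuseAddress(true);                // must be set before bind()
            serverSocket.bind(new InetSocketAddress(port));
            return serverSocket.getReuseAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Port 0 asks the operating system for any free port.
        System.out.println("SO_REUSEADDR enabled: " + bindWithReuseAddress(0));
    }
}
```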

The following are example configurations.

Declarative:

<network>
  <reuse-address>true</reuse-address>
</network>

Programmatic:

...
NetworkConfig networkConfig = config.getNetworkConfig();

networkConfig.setReuseAddress( true );
...

5.10.5. Join

The join configuration element is used to discover Hazelcast members and enable them to form a cluster. Hazelcast provides multicast, TCP/IP, EC2 and jclouds® discovery mechanisms. These mechanisms are explained in the Discovery Mechanisms section. This section describes all the sub-elements and attributes of the join element. The following are example configurations.

Declarative:

<network>
     <join>
         <multicast enabled="true">
             <multicast-group>224.2.2.3</multicast-group>
             <multicast-port>54327</multicast-port>
             <multicast-time-to-live>32</multicast-time-to-live>
             <multicast-timeout-seconds>2</multicast-timeout-seconds>
             <trusted-interfaces>
                <interface>192.168.1.102</interface>
             </trusted-interfaces>
         </multicast>
         <tcp-ip enabled="false">
             <required-member>192.168.1.104</required-member>
             <member>192.168.1.104</member>
             <members>192.168.1.105,192.168.1.106</members>
         </tcp-ip>
         <aws enabled="false">
             <access-key>my-access-key</access-key>
             <secret-key>my-secret-key</secret-key>
             <region>us-west-1</region>
             <host-header>ec2.amazonaws.com</host-header>
             <security-group-name>hazelcast-sg</security-group-name>
             <tag-key>type</tag-key>
             <tag-value>hz-members</tag-value>
         </aws>
         <discovery-strategies>
           <discovery-strategy ... />
         </discovery-strategies>
     </join>
</network>

Programmatic:

Config config = new Config();
NetworkConfig network = config.getNetworkConfig();
JoinConfig join = network.getJoin();
join.getMulticastConfig().setEnabled( false )
            .addTrustedInterface( "192.168.1.102" );
join.getTcpIpConfig().addMember( "10.45.67.32" ).addMember( "10.45.67.100" )
            .setRequiredMember( "192.168.10.100" ).setEnabled( true );

The join element has the following sub-elements and attributes.

multicast element

The multicast element includes parameters to fine tune the multicast join mechanism.

  • enabled: Specifies whether the multicast discovery is enabled or not, true or false.

  • multicast-group: The multicast group IP address. Specify it when you want to create clusters within the same network. Values can be between 224.0.0.0 and 239.255.255.255. Its default value is 224.2.2.3.

  • multicast-port: The multicast socket port that the Hazelcast member listens to and sends discovery messages through. Its default value is 54327.

  • multicast-time-to-live: Time-to-live value for multicast packets sent out to control the scope of multicasts. See more information here.

  • multicast-timeout-seconds: Only when the members are starting up, this timeout (in seconds) specifies the period during which a member waits for a multicast response from another member. For example, if you set it as 60 seconds, each member will wait for 60 seconds until a leader member is selected. Its default value is 2 seconds.

  • trusted-interfaces: Includes IP addresses of trusted members. When a member wants to join the cluster, its join request is rejected if it is not a trusted member. You can specify a range of IP addresses using a wildcard (*) or a range (-) in the last octet of the IP address, e.g., 192.168.1.* or 192.168.1.100-110.
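The valid range for multicast-group (224.0.0.0 to 239.255.255.255) matches the IPv4 multicast address block, so a candidate value can be checked with the JDK alone. The class MulticastGroupCheck below is a hypothetical helper, not part of Hazelcast.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Checks whether an IP literal is a multicast address (hypothetical helper). */
final class MulticastGroupCheck {

    static boolean isValidMulticastGroup(String ipLiteral) {
        try {
            // For IP literals, getByName only validates the format; no name lookup is needed.
            return InetAddress.getByName(ipLiteral).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidMulticastGroup("224.2.2.3"));     // the default group
        System.out.println(isValidMulticastGroup("192.168.1.102")); // a unicast address
    }
}
```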

tcp-ip element

The tcp-ip element includes parameters to fine tune the TCP/IP join mechanism.

  • enabled: Specifies whether the TCP/IP discovery is enabled or not. Values can be true or false.

  • required-member: IP address of the required member. The cluster will only be formed if the member with this IP address is found.

  • member: IP address(es) of one or more well known members. Once members are connected to these well known ones, all member addresses will be communicated with each other. You can also give comma separated IP addresses using the members element.

    The tcp-ip element also accepts the interface parameter. Please refer to the Interfaces element description.

  • connection-timeout-seconds: Defines the connection timeout in seconds. This is the maximum amount of time Hazelcast is going to try to connect to a well known member before giving up. Setting it to a too low value could mean that a member is not able to connect to a cluster. Setting it to a too high value means that member startup could slow down because of longer timeouts, for example when a well known member is not up. Increasing this value is recommended if you have many IPs listed and the members cannot properly build up the cluster. Its default value is 5 seconds.

aws element

The aws element includes parameters to allow the members to form a cluster on the Amazon EC2 environment.

  • enabled: Specifies whether the EC2 discovery is enabled or not, true or false.

  • access-key, secret-key: Access and secret keys of your account on EC2.

  • region: The region where your members are running. Its default value is us-east-1. You need to specify this if the region is other than the default one.

  • host-header: The URL that is the entry point for a web service. It is optional.

  • security-group-name: Name of the security group you specified at the EC2 management console. It is used to narrow the Hazelcast members to be within this group. It is optional.

  • tag-key, tag-value: To narrow the members in the cloud down to only Hazelcast members, you can set these parameters as the ones you specified in the EC2 console. They are optional.

  • connection-timeout-seconds: The maximum amount of time, in seconds, Hazelcast will try to connect to a well known member before giving up. Setting this value too low could mean that a member is not able to connect to a cluster. Setting the value too high means that member startup could slow down because of longer timeouts (for example, when a well known member is not up). Increasing this value is recommended if you have many IPs listed and the members cannot properly build up the cluster. Its default value is 5 seconds.

If you are using a cloud provider other than AWS, you can use the programmatic configuration to specify a TCP/IP cluster. The members will need to be retrieved from that provider, e.g., jclouds.

discovery-strategies element

The discovery-strategies element configures internal or external discovery strategies based on the Hazelcast Discovery SPI. For further information, please refer to the Discovery SPI section and the vendor documentation of the used discovery strategy.

5.10.6. AWSClient Configuration

To make sure EC2 instances are found correctly, you can use the AWSClient class. It determines the private IP addresses of EC2 instances to be connected. Give the AWSClient class the values for the parameters that you specified in the aws element, as shown below. You will see whether your EC2 instances are found.

public static void main( String[] args ) throws Exception {
  AwsConfig config = new AwsConfig();
  config.setAccessKey( ... );
  config.setSecretKey( ... );
  config.setRegion( ... );
  config.setSecurityGroupName( ... );
  config.setTagKey( ... );
  config.setTagValue( ... );
  config.setEnabled( true );
  AWSClient client = new AWSClient( config );
  Collection<String> ipAddresses = client.getPrivateIpAddresses();
  System.out.println( "addresses found:" + ipAddresses );
  for ( String ip: ipAddresses ) {
    System.out.println( ip );
  }
}

5.10.7. Interfaces

You can specify which network interfaces Hazelcast should use. Servers mostly have more than one network interface, so you may want to list the valid IPs. Range characters ('*' and '-') can be used for simplicity. For instance, 10.3.10.* refers to IPs between 10.3.10.0 and 10.3.10.255. Interface 10.3.10.4-18 refers to IPs between 10.3.10.4 and 10.3.10.18 (4 and 18 included). If network interface configuration is enabled (it is disabled by default) and Hazelcast cannot find a matching interface, it prints a message on the console and the member does not start.
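The wildcard and range semantics described above can be sketched with a small IPv4 matcher. This is an illustration of the stated rules only: the class InterfacePatternSketch is hypothetical, it handles IPv4 patterns only, and it is not the matcher Hazelcast uses internally.

```java
/** Matches IPv4 addresses against patterns like "10.3.10.*" or "10.3.10.4-18" (hypothetical class). */
final class InterfacePatternSketch {

    static boolean matches(String pattern, String address) {
        String[] patternParts = pattern.split("\\.");
        String[] addressParts = address.split("\\.");
        if (patternParts.length != 4 || addressParts.length != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (!octetMatches(patternParts[i], Integer.parseInt(addressParts[i]))) {
                return false;
            }
        }
        return true;
    }

    private static boolean octetMatches(String patternOctet, int value) {
        if (patternOctet.equals("*")) {
            return true; // wildcard: any value from 0 to 255
        }
        int dash = patternOctet.indexOf('-');
        if (dash < 0) {
            return Integer.parseInt(patternOctet) == value; // exact octet
        }
        int low = Integer.parseInt(patternOctet.substring(0, dash));
        int high = Integer.parseInt(patternOctet.substring(dash + 1));
        return value >= low && value <= high; // inclusive range
    }

    public static void main(String[] args) {
        System.out.println(matches("10.3.10.*", "10.3.10.255"));
        System.out.println(matches("10.3.10.4-18", "10.3.10.19"));
    }
}
```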

The following are example configurations.

Declarative:

<hazelcast>
  ...
  <network>
    ...
    <interfaces enabled="true">
      <interface>10.3.16.*</interface>
      <interface>10.3.10.4-18</interface>
      <interface>192.168.1.3</interface>
    </interfaces>
  </network>
  ...
</hazelcast>

Programmatic:

Config config = new Config();
NetworkConfig network = config.getNetworkConfig();
InterfacesConfig interfaceConfig = network.getInterfaces();
interfaceConfig.setEnabled( true )
            .addInterface( "192.168.1.3" );

5.10.8. IPv6 Support

Hazelcast supports IPv6 addresses seamlessly. This support is switched off by default; please see the note at the end of this section.

All you need is to define IPv6 addresses or interfaces in the network configuration. The only current limitation is that you cannot define wildcard IPv6 addresses in the TCP/IP join configuration (tcp-ip element). The interfaces configuration does not have this limitation; you can configure wildcard IPv6 interfaces in the same way as IPv4 interfaces.

<hazelcast>
  ...
  <network>
    <port auto-increment="true">5701</port>
    <join>
      <multicast enabled="false">
        <multicast-group>FF02:0:0:0:0:0:0:1</multicast-group>
        <multicast-port>54327</multicast-port>
      </multicast>
      <tcp-ip enabled="true">
        <member>[fe80::223:6cff:fe93:7c7e]:5701</member>
        <interface>192.168.1.0-7</interface>
        <interface>192.168.1.*</interface>
        <interface>fe80:0:0:0:45c5:47ee:fe15:493a</interface>
      </tcp-ip>
    </join>
    <interfaces enabled="true">
      <interface>10.3.16.*</interface>
      <interface>10.3.10.4-18</interface>
      <interface>fe80:0:0:0:45c5:47ee:fe15:*</interface>
      <interface>fe80::223:6cff:fe93:0-5555</interface>
    </interfaces>
    ...
  </network>
  ...
</hazelcast>

The JVM has two system properties for setting the preferred protocol stack (IPv4 or IPv6) as well as the preferred address family types (inet4 or inet6). On a dual stack machine, the IPv6 stack is preferred by default; you can change this through the java.net.preferIPv4Stack=<true|false> system property. When querying name services, the JVM prefers IPv4 addresses over IPv6 addresses and will return an IPv4 address if possible. You can change this through the java.net.preferIPv6Addresses=<true|false> system property.

Also see additional details on IPv6 support in Java.

IPv6 support has been switched off by default, since some platforms have issues using the IPv6 stack. Some other platforms such as Amazon AWS have no support at all. To enable IPv6 support, just set configuration property hazelcast.prefer.ipv4.stack to false. Please refer to the System Properties appendix for details.

5.10.9. Member Address Provider SPI

This SPI is not intended to provide addresses of other cluster members with which the Hazelcast instance will form a cluster. To do that, refer to the other network configuration sections above.

By default, Hazelcast chooses the public and bind address. You can influence the choice by defining a public-address in the configuration or by using other properties mentioned above. In some cases, though, these properties are not enough and the default address picking strategy will choose wrong addresses. This may be the case when deploying Hazelcast in some cloud environments, such as AWS, when using Docker or when the instance is deployed behind a NAT and the public-address property is not enough (please see the Public Address section).

In these cases, it is possible to configure the bind and public address in a more advanced way. You can provide an implementation of the com.hazelcast.spi.MemberAddressProvider interface which will provide the bind and public address. The implementation may then choose these addresses in any way - it may read from a system property or file or even invoke a web service to retrieve the public and private address.

The details of the implementation depend heavily on the environment in which Hazelcast is deployed. As such, we will demonstrate how to configure Hazelcast to use a simplified custom member address provider SPI implementation. An example of an implementation is shown below:

public static final class SimpleMemberAddressProvider implements MemberAddressProvider {
    @Override
    public InetSocketAddress getBindAddress() {
        // determine the address using some configuration, calling an API, ...
        return new InetSocketAddress(hostname, port);
    }

    @Override
    public InetSocketAddress getPublicAddress() {
        // determine the address using some configuration, calling an API, ...
        return new InetSocketAddress(hostname, port);
    }
}

Note that if the bind address port is 0, then the member uses a port as configured in the Hazelcast network configuration (see the Port section). If the public address port is set to 0, then the member broadcasts the same port that it is bound to. If you wish to bind to any local interface, you may return new InetSocketAddress((InetAddress) null, port) from the getBindAddress() method.
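The effect of passing a null InetAddress can be checked with the JDK alone: the resulting socket address carries the wildcard ("any local") address. The class BindAddressSketch is a hypothetical helper introduced for this illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

/** Shows that a null InetAddress yields the wildcard address (hypothetical helper). */
final class BindAddressSketch {

    static InetSocketAddress anyLocalInterface(int port) {
        // A null address makes InetSocketAddress use the wildcard address.
        return new InetSocketAddress((InetAddress) null, port);
    }

    public static void main(String[] args) {
        InetSocketAddress address = anyLocalInterface(5701);
        System.out.println("wildcard: " + address.getAddress().isAnyLocalAddress());
        System.out.println("port: " + address.getPort());
    }
}
```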

The following configuration examples contain properties that will be provided to the constructor of the provided class. If you do not provide any properties, the class may have either a no-arg constructor or a constructor accepting a single java.util.Properties instance. On the other hand, if you do provide properties in the configuration, the class must have a constructor accepting a single java.util.Properties instance.

Declarative:

<network>
     <member-address-provider enabled="true">
         <class-name>SimpleMemberAddressProvider</class-name>
         <properties>
             <property name="prop1">prop1-value</property>
             <property name="prop2">prop2-value</property>
         </properties>
     </member-address-provider>
     <!-- other network configuration -->
</network>

Programmatic:

Config config = new Config();
MemberAddressProviderConfig memberAddressProviderConfig = config.getNetworkConfig().getMemberAddressProviderConfig();
memberAddressProviderConfig
      .setEnabled(true)
      .setClassName(SimpleMemberAddressProvider.class.getName());
Properties properties = memberAddressProviderConfig.getProperties();
properties.setProperty("prop1", "prop1-value");
properties.setProperty("prop2", "prop2-value");

config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);

// perform other configuration

Hazelcast.newHazelcastInstance(config);
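
The constructor rule described above (a no-arg constructor is acceptable only when no properties are configured) can be illustrated with a small reflection sketch. This is not Hazelcast's own instantiation code; the class ProviderConstructorSketch, the two nested provider classes and the instantiate method are hypothetical names used only to show the rule.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical illustration of the constructor rule; not Hazelcast code.
final class ProviderConstructorSketch {

    /** Hypothetical provider that reads its settings from Properties. */
    public static final class PropertiesProvider {
        final Properties properties;

        public PropertiesProvider(Properties properties) {
            this.properties = properties;
        }
    }

    /** Hypothetical provider that needs no settings. */
    public static final class NoArgProvider {
        public NoArgProvider() {
        }
    }

    // Prefers a constructor accepting Properties; falls back to the no-arg
    // constructor only when no properties were configured.
    static Object instantiate(Class<?> type, Properties properties) {
        try {
            Constructor<?> withProperties;
            try {
                withProperties = type.getConstructor(Properties.class);
            } catch (NoSuchMethodException e) {
                if (!properties.isEmpty()) {
                    throw new IllegalArgumentException(type.getName()
                            + " needs a constructor accepting java.util.Properties");
                }
                return type.getConstructor().newInstance();
            }
            return withProperties.newInstance(properties);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty("prop1", "prop1-value");
        System.out.println(instantiate(PropertiesProvider.class, properties).getClass().getSimpleName());
        System.out.println(instantiate(NoArgProvider.class, new Properties()).getClass().getSimpleName());
    }
}
```

In this sketch, passing configured properties to a class that only has a no-arg constructor is rejected with an IllegalArgumentException, mirroring the "must have a constructor accepting a single java.util.Properties instance" requirement.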

5.11. Failure Detector Configuration

A failure detector is responsible for determining whether a member in the cluster is unreachable or has crashed. The most important problem in failure detection is to distinguish whether a member is still alive but slow, or has crashed. However, according to the famous FLP result, it is impossible to distinguish a crashed member from a slow one in an asynchronous system. A workaround to this limitation is to use unreliable failure detectors. An unreliable failure detector allows a member to suspect that others have failed, usually based on liveness criteria, but it can make mistakes to a certain degree.

Hazelcast has two built-in failure detectors: Deadline Failure Detector and Phi Accrual Failure Detector.

Since 3.9.1, Hazelcast provides yet another failure detector, Ping Failure Detector, which, if enabled, works in parallel with the above ones but identifies failures on OSI Layer 3 (Network Layer). This detector is disabled by default.

Note that Hazelcast also offers failure detectors for its Java client. Please refer to the Client Failure Detectors section for more information.

5.11.1. Deadline Failure Detector

Deadline Failure Detector uses an absolute timeout for missing/lost heartbeats. After the timeout, a member is considered crashed/unavailable and is marked as suspected.

Deadline Failure Detector has two configuration properties:

  • hazelcast.heartbeat.interval.seconds: This is the interval at which member heartbeat messages are sent to each other.

  • hazelcast.max.no.heartbeat.seconds: This is the timeout which defines when a cluster member is suspected because it has not sent any heartbeats.

To use the Deadline Failure Detector, the configuration property hazelcast.heartbeat.failuredetector.type should be set to "deadline".

<hazelcast>
    [...]
    <properties>
        <property name="hazelcast.heartbeat.failuredetector.type">deadline</property>
        <property name="hazelcast.heartbeat.interval.seconds">5</property>
        <property name="hazelcast.max.no.heartbeat.seconds">120</property>
        [...]
    </properties>
    [...]
</hazelcast>

Config config = ...;
config.setProperty("hazelcast.heartbeat.failuredetector.type", "deadline");
config.setProperty("hazelcast.heartbeat.interval.seconds", "5");
config.setProperty("hazelcast.max.no.heartbeat.seconds", "120");
[...]

Deadline Failure Detector is the default failure detector in Hazelcast.
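
The absolute-timeout idea can be sketched in a few lines of plain Java. This is only an illustration of the principle, not Hazelcast's implementation; the class DeadlineDetectorSketch and its methods are hypothetical, and timestamps are passed in explicitly so that the behavior is easy to follow.

```java
// Hypothetical illustration of a deadline-based detector; not Hazelcast code.
final class DeadlineDetectorSketch {
    private final long maxNoHeartbeatMillis;
    private long lastHeartbeatMillis;

    DeadlineDetectorSketch(long maxNoHeartbeatMillis, long startMillis) {
        this.maxNoHeartbeatMillis = maxNoHeartbeatMillis;
        this.lastHeartbeatMillis = startMillis;
    }

    // Records a heartbeat received at the given timestamp.
    void heartbeat(long timestampMillis) {
        lastHeartbeatMillis = Math.max(lastHeartbeatMillis, timestampMillis);
    }

    // The member is suspected once the time since the last heartbeat
    // exceeds the absolute timeout.
    boolean isSuspected(long nowMillis) {
        return nowMillis - lastHeartbeatMillis > maxNoHeartbeatMillis;
    }

    public static void main(String[] args) {
        // 120 seconds, as in hazelcast.max.no.heartbeat.seconds above.
        DeadlineDetectorSketch detector = new DeadlineDetectorSketch(120_000, 0);
        detector.heartbeat(5_000);
        System.out.println(detector.isSuspected(60_000));  // false
        System.out.println(detector.isSuspected(125_001)); // true
    }
}
```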

5.11.2. Phi Accrual Failure Detector

This is the failure detector based on the paper "The Phi Accrual Failure Detector" by Hayashibara et al.

Phi Accrual Failure Detector keeps track of the intervals between heartbeats in a sliding window of time, measures the mean and variance of these samples and calculates a suspicion level (phi). The value of phi increases as the period since the last heartbeat gets longer. If the network becomes slow or unreliable, the resulting mean and variance increase, so a longer period without heartbeats is needed before the member is suspected.

hazelcast.heartbeat.interval.seconds and hazelcast.max.no.heartbeat.seconds properties will still be used as period of heartbeat messages and deadline of heartbeat messages. Since Phi Accrual Failure Detector is adaptive to network conditions, a much lower hazelcast.max.no.heartbeat.seconds can be defined than Deadline Failure Detector's timeout.

In addition to the above two properties, Phi Accrual Failure Detector has three more configuration properties:

  • hazelcast.heartbeat.phiaccrual.failuredetector.threshold: This is the phi threshold for suspicion. After the calculated phi exceeds this threshold, a member is considered unreachable and is marked as suspected. A low threshold detects member crashes/failures faster but can generate more mistakes and cause wrong member suspicions. A high threshold generates fewer mistakes but is slower to detect actual crashes/failures.

    phi = 1 means the likelihood that we will make a mistake is about 10%. The likelihood is about 1% with phi = 2, 0.1% with phi = 3 and so on. The default phi threshold is 10.

  • hazelcast.heartbeat.phiaccrual.failuredetector.sample.size: Number of samples to keep for history. Its default value is 200.

  • hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis: Minimum standard deviation to use for the normal distribution used when calculating phi. Too low standard deviation might result in too much sensitivity.

To use the Phi Accrual Failure Detector, the configuration property hazelcast.heartbeat.failuredetector.type should be set to "phi-accrual".

<hazelcast>
   [...]
   <properties>
      <property name="hazelcast.heartbeat.failuredetector.type">phi-accrual</property>
      <property name="hazelcast.heartbeat.interval.seconds">1</property>
      <property name="hazelcast.max.no.heartbeat.seconds">60</property>
      <property name="hazelcast.heartbeat.phiaccrual.failuredetector.threshold">10</property>
      <property name="hazelcast.heartbeat.phiaccrual.failuredetector.sample.size">200</property>
      <property name="hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis">100</property>
      [...]
   </properties>
   [...]
</hazelcast>

Config config = ...;
config.setProperty("hazelcast.heartbeat.failuredetector.type", "phi-accrual");
config.setProperty("hazelcast.heartbeat.interval.seconds", "1");
config.setProperty("hazelcast.max.no.heartbeat.seconds", "60");
config.setProperty("hazelcast.heartbeat.phiaccrual.failuredetector.threshold", "10");
config.setProperty("hazelcast.heartbeat.phiaccrual.failuredetector.sample.size", "200");
config.setProperty("hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis", "100");
[...]
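
To make the relation between elapsed time, mean, standard deviation and phi more concrete, here is a minimal sketch of a phi calculation. It is not Hazelcast's implementation. It assumes heartbeat intervals are normally distributed and uses a well-known logistic approximation of the normal cumulative distribution; the constants 1.5976 and 0.070566 belong to that approximation and are not Hazelcast settings. The class PhiAccrualSketch and its methods are hypothetical names.

```java
// Hypothetical illustration of a phi calculation; not Hazelcast code.
final class PhiAccrualSketch {

    // phi = -log10(probability that a heartbeat arrives even later than
    // the time already elapsed since the last one).
    static double phi(double elapsedMillis, double meanMillis, double stdDevMillis) {
        double y = (elapsedMillis - meanMillis) / stdDevMillis;
        // Logistic approximation: CDF(y) is roughly 1 / (1 + e).
        double e = Math.exp(-y * (1.5976 + 0.070566 * y * y));
        // Both branches compute -log10(1 - CDF(y)); they are algebraically
        // equal and are split only for numerical stability.
        if (elapsedMillis > meanMillis) {
            return -Math.log10(e / (1.0 + e));
        }
        return -Math.log10(1.0 - 1.0 / (1.0 + e));
    }

    // Likelihood of a wrong suspicion for a given phi: about 10% for
    // phi = 1, about 1% for phi = 2 and so on.
    static double mistakeProbability(double phi) {
        return Math.pow(10, -phi);
    }

    public static void main(String[] args) {
        // Mean heartbeat interval 1000 ms, standard deviation 100 ms.
        System.out.println(phi(1_000, 1_000, 100));
        System.out.println(phi(1_300, 1_000, 100));
        // A noisier network (larger deviation) lowers phi for the same silence.
        System.out.println(phi(1_300, 1_000, 300));
    }
}
```

The sketch shows the two effects described above: phi grows as the silence since the last heartbeat gets longer, and a larger standard deviation means a longer silence is needed to reach the same phi.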

5.11.3. Ping Failure Detector

The Ping Failure Detector may be configured in addition to one of the Deadline and Phi Accrual Failure Detectors. It operates at Layer 3 of the OSI model and provides much quicker and more deterministic detection of hardware and other lower level events. This detector may be configured to perform an extra check after a member is suspected by one of the other detectors, or it can work in parallel, which is the default. This way hardware and network level issues will be detected more quickly.

This failure detector is based on InetAddress.isReachable(). When the JVM process has enough permissions to create RAW sockets, the implementation will choose to rely on ICMP Echo requests. This is preferred.

If there are not enough permissions, it can be configured to fall back on attempting a TCP Echo on port 7. In that case, either a successful connection or an explicit rejection is treated as "Host is Reachable". This fallback is not preferred, as each call creates a heavyweight socket and, moreover, the Echo service is typically disabled. Alternatively, the detector can be forced to use only RAW sockets.

For the Ping Failure Detector to rely only on ICMP Echo requests, there are some criteria that need to be met.

Requirements and Linux/Unix Configuration
  • Supported OS: as of Java 1.8, only Linux/Unix environments are supported. This detector relies on ICMP, i.e., the protocol behind the ping command. It issues ping attempts periodically and uses their responses to determine the reachability of the remote member. However, you cannot simply create an ICMP Echo Request because this type of packet does not rely on any of the preexisting transport protocols such as TCP. In order to create such a request, you must have the privileges to create RAW sockets (please see https://linux.die.net/man/7/raw). Most operating systems allow this only for root users; Unix-based ones, however, are more flexible and allow the use of custom privileges per process instead of requiring root access. Therefore, this detector is supported only on Linux.

  • The Java executable must have the cap_net_raw capability. As described in the above requirement, on Linux you can grant extra capabilities to a single process, which allows the process to interact with RAW sockets. This interaction is achieved via the capability cap_net_raw (please see https://linux.die.net/man/7/capabilities). To enable this capability, run the following command:

    sudo setcap cap_net_raw=+ep <JDK_HOME>/jre/bin/java

  • When running with custom capabilities, the dynamic linker on Linux rejects loading libraries from untrusted paths. Since the process now has cap_net_raw as a custom capability, it becomes suspicious to the dynamic linker, which throws an error: java: error while loading shared libraries: libjli.so: cannot open shared object file: No such file or directory

    • To overcome this rejection, the <JDK_HOME>/jre/lib/amd64/jli/ path needs to be added to the dynamic linker configuration. Run the following command to do this: echo "<JDK_HOME>/jre/lib/amd64/jli/" >> /etc/ld.so.conf.d/java.conf && sudo ldconfig

  • ICMP Echo Requests must not be blocked by the receiving hosts, i.e., /proc/sys/net/ipv4/icmp_echo_ignore_all must be set to 0. Run the following command:

    echo 0 > /proc/sys/net/ipv4/icmp_echo_ignore_all

If any of the above criteria isn’t met, then isReachable will always fall back on TCP Echo attempts on port 7.

To be able to use the Ping Failure Detector, please add the following properties in your Hazelcast declarative configuration file:

<hazelcast>
   [...]
    <properties>
      <property name="hazelcast.icmp.enabled">true</property>
      <property name="hazelcast.icmp.parallel.mode">true</property>
      <property name="hazelcast.icmp.timeout">1000</property>
      <property name="hazelcast.icmp.max.attempts">3</property>
      <property name="hazelcast.icmp.interval">1000</property>
      <property name="hazelcast.icmp.ttl">0</property>
      [...]
   </properties>
   [...]
</hazelcast>

  • hazelcast.icmp.enabled (default false) - Enables the legacy ICMP detection mode, which works cooperatively with the existing failure detector and only kicks in after a pre-defined period has passed with no heartbeats from a member.

  • hazelcast.icmp.parallel.mode (default true) - Enables the parallel ping detector, which works separately from the other detectors.

  • hazelcast.icmp.timeout (default 1000) - Number of milliseconds until a ping attempt is considered failed if there was no reply.

  • hazelcast.icmp.max.attempts (default 3) - The maximum number of ping attempts before the member/node gets suspected by the detector.

  • hazelcast.icmp.interval (default 1000) - The interval, in milliseconds, between each ping attempt. 1000ms (1 sec) is also the minimum interval allowed.

  • hazelcast.icmp.ttl (default 0) - The maximum number of hops the packets should go through or 0 for the default.

In the above configuration, the Ping detector will attempt 3 pings, one every second, and will wait up to 1 second for each to complete. If there is no successful ping after 3 seconds, the member is suspected.
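
The arithmetic behind this example (3 attempts at a 1000 ms interval give a 3 second detection window) can be sketched as follows. This is not Hazelcast's implementation; the class PingAttemptsSketch is hypothetical, and the sketch assumes that failed attempts are counted consecutively and that a successful ping resets the counter. Ping results are passed in as booleans so that the example does not depend on the network.

```java
// Hypothetical illustration of counting failed ping attempts; not Hazelcast code.
final class PingAttemptsSketch {
    private final int maxAttempts;
    private int consecutiveFailures;

    PingAttemptsSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Records the result of one ping attempt and reports whether the
    // member is suspected afterwards. Assumption of this sketch: a
    // successful ping resets the failure counter.
    boolean record(boolean reachable) {
        consecutiveFailures = reachable ? 0 : consecutiveFailures + 1;
        return consecutiveFailures >= maxAttempts;
    }

    // Time needed to perform all attempts when one attempt is made per interval.
    static long detectionWindowMillis(int maxAttempts, long intervalMillis) {
        return maxAttempts * intervalMillis;
    }

    public static void main(String[] args) {
        // hazelcast.icmp.max.attempts = 3, hazelcast.icmp.interval = 1000
        PingAttemptsSketch pings = new PingAttemptsSketch(3);
        System.out.println(pings.record(false)); // false
        System.out.println(pings.record(false)); // false
        System.out.println(pings.record(false)); // true
        System.out.println(detectionWindowMillis(3, 1000)); // 3000
    }
}
```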

To enforce the requirements listed above, the property hazelcast.icmp.echo.fail.fast.on.startup can also be set to true, in which case Hazelcast will fail to start if any of the requirements isn’t met.

Below is a summary table of all possible configuration combinations of the ping failure detector.

Table 2. Ping Failure Detector Possible Configuration Combinations

ICMP | Parallel | Fail-Fast | Description | Linux | Windows | macOS
false | false | false | Completely disabled | N/A | N/A | N/A
true | false | false | Legacy ping mode. This works hand in hand with the OSI Layer 7 failure detector (see phi or deadline in the sections above). Ping in this mode only kicks in after a period in which no heartbeats are received, in which case the remote Hazelcast member is pinged up to 5 times. If all 5 attempts fail, the member gets suspected. | Supported: ICMP Echo if available, falls back on TCP Echo on port 7 | Supported: TCP Echo on port 7 | Supported: ICMP Echo if available, falls back on TCP Echo on port 7
true | true | false | Parallel ping detector, works in parallel with the configured failure detector. Checks periodically if members are live (OSI Layer 3) and suspects them immediately, regardless of the other detectors. | Supported: ICMP Echo if available, falls back on TCP Echo on port 7 | Supported: TCP Echo on port 7 | Supported: ICMP Echo if available, falls back on TCP Echo on port 7
true | true | true | Parallel ping detector, works in parallel with the configured failure detector. Checks periodically if members are live (OSI Layer 3) and suspects them immediately, regardless of the other detectors. | Supported, requires OS configuration: enforces ICMP Echo if available, no startup if not available | Not supported | Not supported, requires root privileges

6. Rolling Member Upgrades

Hazelcast IMDG Enterprise

This chapter explains the procedure of upgrading the version of Hazelcast members in a running cluster without interrupting the operation of the cluster.

6.1. Terminology

  • Minor version: A version change after the decimal point, e.g., 3.8 and 3.9.

  • Patch version: A version change after the second decimal point, e.g., 3.8.1 and 3.8.2.

  • Member codebase version: The major.minor.patch version of the Hazelcast binary on which the member executes. For example, when running on hazelcast-3.8.jar, your member’s codebase version is 3.8.0.

  • Cluster version: The major.minor version at which the cluster operates. This ensures that cluster members are able to communicate using the same cluster protocol and determines the feature set exposed by the cluster.

6.2. Hazelcast Members Compatibility Guarantees

Hazelcast members operating on binaries of the same major and minor version numbers are compatible regardless of patch version. For example, in a cluster with members running on version 3.7.1, it is possible to perform a rolling upgrade to 3.7.2 by shutting down, upgrading to hazelcast-3.7.2.jar binary and starting each member one by one. Patch level compatibility applies to both Hazelcast IMDG and Hazelcast IMDG Enterprise.

Starting with Hazelcast IMDG Enterprise 3.8, each subsequent minor version released will be compatible with the previous one. For example, it will be possible to perform a rolling upgrade on a cluster running Hazelcast IMDG Enterprise 3.8 to Hazelcast IMDG Enterprise 3.9 whenever that is released. Rolling upgrades across minor versions are a Hazelcast IMDG Enterprise feature, starting with version 3.8.

The compatibility guarantees described above are given in the context of rolling member upgrades and only apply to GA (general availability) releases. It is never advisable to run a cluster with members running on different patch or minor versions for prolonged periods of time.
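
The compatibility rules above can be summarized in a small sketch. It is not Hazelcast code; the class VersionCompatibilitySketch and its methods are hypothetical, and the sketch reads "starting with 3.8" as "the source version must be 3.8 or later", which is an assumption of this example.

```java
// Hypothetical illustration of the compatibility rules; not Hazelcast code.
final class VersionCompatibilitySketch {

    // Parses "major.minor[.patch]" into three numbers; a missing patch is 0.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < parts.length && i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    // The cluster version is the major.minor part of a codebase version.
    static String clusterVersionOf(String codebaseVersion) {
        int[] v = parse(codebaseVersion);
        return v[0] + "." + v[1];
    }

    static boolean canRollingUpgrade(String from, String to, boolean enterprise) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) {
            return false;
        }
        if (a[1] == b[1]) {
            // Patch level compatibility applies to all editions.
            return true;
        }
        boolean nextMinor = b[1] == a[1] + 1;
        boolean fromAtLeast38 = a[0] > 3 || (a[0] == 3 && a[1] >= 8);
        // Minor version upgrades are an Enterprise feature since 3.8.
        return enterprise && nextMinor && fromAtLeast38;
    }

    public static void main(String[] args) {
        System.out.println(clusterVersionOf("3.8.0"));                   // 3.8
        System.out.println(canRollingUpgrade("3.7.1", "3.7.2", false)); // true
        System.out.println(canRollingUpgrade("3.8.0", "3.9.0", true));  // true
        System.out.println(canRollingUpgrade("3.8.0", "3.9.0", false)); // false
    }
}
```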

6.3. Rolling Upgrade Procedure

The version numbers used in the paragraph below are only used as an example.

Let’s assume a cluster with four members running on codebase version 3.8.0 with cluster version 3.8, that should be upgraded to codebase version 3.9.0 and cluster version 3.9. The rolling upgrade process for this cluster, i.e., replacing existing 3.8.0 members one by one with an upgraded one at version 3.9.0, includes the following steps which should be repeated for each member:

  • Gracefully shut down an existing 3.8.0 member.

  • Wait until all partition migrations are completed; during migrations, membership changes (member joins or removals) are not allowed.

  • Update the member with the new 3.9.0 Hazelcast binaries.

  • Start the member and wait until it joins the cluster. You should see something like the following in your logs:

    ...
    INFO: [192.168.2.2]:5701 [cluster] [3.9] Hazelcast 3.9 (20170630 - a67dc3a) starting at [192.168.2.2]:5701
    ...
    INFO: [192.168.2.2]:5701 [cluster] [3.9] Cluster version set to 3.8

The version in brackets [3.9] still denotes the member’s codebase version (running on the hypothetical hazelcast-3.9.jar binary). Once the member locates existing cluster members, it sends its join request to the master. The master validates that the new member is allowed to join the cluster and lets the new member know that the cluster is currently operating at 3.8 cluster version. The new member sets 3.8 as its cluster version and starts operating normally.

At this point, all members of the cluster have been upgraded to codebase version 3.9.0, but the cluster still operates at cluster version 3.8. In order to use 3.9 features, the cluster version must be changed to 3.9. There are two ways to accomplish this:

  • Use Management Center.

  • Use the command line cluster.sh script.

You need to upgrade your Management Center version before upgrading the member version if you want to change the cluster version using Management Center. Management Center is compatible with the previous minor version of Hazelcast, starting with version 3.9. For example, Management Center 3.9 works with both Hazelcast IMDG 3.8 and 3.9. To change your cluster version to 3.9, you need Management Center 3.9.

6.4. Network Partitions and Rolling Upgrades

In the event of network partitions which split your cluster into two subclusters, split-brain handling works as explained in the Network Partitioning chapter, with the additional constraint that two subclusters will only merge as long as they operate on the same cluster version. This is a requirement to ensure that all members participating in each one of the subclusters will be able to operate as members of the merged cluster at the same cluster version.

With regards to rolling upgrades, the above constraint implies that if a network partition occurs while a change of cluster version is in progress, then with some unlucky timing, one subcluster may be upgraded to the new cluster version and another subcluster may have upgraded members but still operate at the old cluster version.

In order for the two subclusters to merge, it is necessary to change the cluster version of the subcluster that still operates at the old cluster version, so that both subclusters will be operating at the same, upgraded cluster version and will be able to merge as soon as the network partition is fixed.

6.5. Rolling Upgrade FAQ

The following are answers to frequently asked questions related to rolling member upgrades.

How is the cluster version set?

When a new member starts, it is not yet joined to a cluster; therefore its cluster version is still undetermined. In order for the cluster version to be set, one of the following must happen:

  • the member cannot locate any members of the cluster to join or is configured without a joiner: in this case, the member will appoint itself as the master of a new single-member cluster and its cluster version will be set to the major.minor version of its own codebase version. So a standalone member running on codebase version 3.8.0 will set its own cluster version to 3.8.

  • the member that is starting locates members of the cluster and identifies which is the master: in this case, the master will validate that the joining member’s codebase version is compatible with the current cluster version. If it is found to be compatible, then the member joins and the master sends the cluster version, which is set on the joining member. Otherwise, the starting member fails to join and shuts down.
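
The two cases above can be sketched as a single decision. This is not Hazelcast's join logic; the class ClusterVersionJoinSketch and its method are hypothetical, and the compatibility check (same major version, and a codebase minor version equal to or one higher than the cluster's) is an assumption made for this example.

```java
import java.util.Optional;

// Hypothetical illustration of how a starting member ends up with a
// cluster version; not Hazelcast code.
final class ClusterVersionJoinSketch {

    // masterClusterVersion is null when no cluster could be located.
    // An empty result means the member fails to join and shuts down.
    static Optional<String> resolveClusterVersion(String codebaseVersion, String masterClusterVersion) {
        String[] codebase = codebaseVersion.split("\\.");
        int major = Integer.parseInt(codebase[0]);
        int minor = Integer.parseInt(codebase[1]);
        if (masterClusterVersion == null) {
            // Standalone member: it becomes master of a new cluster and
            // uses the major.minor part of its own codebase version.
            return Optional.of(major + "." + minor);
        }
        String[] cluster = masterClusterVersion.split("\\.");
        int clusterMajor = Integer.parseInt(cluster[0]);
        int clusterMinor = Integer.parseInt(cluster[1]);
        // Assumed compatibility rule for this sketch.
        boolean compatible = major == clusterMajor
                && (minor == clusterMinor || minor == clusterMinor + 1);
        return compatible ? Optional.of(masterClusterVersion) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(resolveClusterVersion("3.8.0", null));  // Optional[3.8]
        System.out.println(resolveClusterVersion("3.9.0", "3.8")); // Optional[3.8]
        System.out.println(resolveClusterVersion("3.7.0", "3.8")); // Optional.empty
    }
}
```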

What if a new Hazelcast minor version changes fundamental cluster protocol communication, like join messages?

The version numbers used in the paragraph below are only used as an example.

On startup, as answered in the above question (How is the cluster version set?), the cluster version is not yet known to a member that has not joined any cluster. By default the newly started member will use the cluster protocol that corresponds to its codebase version until this member joins a cluster (so for codebase 3.9.0 this means implicitly assuming cluster version 3.9). If, hypothetically, major changes in discovery & join operations have been introduced which do not allow the member to join a 3.8 cluster, then the member should be explicitly configured to start assuming a 3.8 cluster version.

Do I have to upgrade clients to work with rolling upgrades?

Starting with Hazelcast 3.6, the Hazelcast Open Binary Client Protocol was introduced. Clients which implement the Open Binary Client Protocol are compatible with Hazelcast version 3.6 and newer minor versions. Thus older client versions will be compatible with next minor versions. Newer clients connected to a cluster will operate at the lower version of capabilities until all members are upgraded and the cluster version upgrade occurs.

Can I stop and start multiple members at once during a rolling member upgrade?

It is not recommended due to potential network partitions. It is advised to always stop and start one member in each upgrade step.

Can I upgrade my business app together with Hazelcast while doing a rolling member upgrade?

Yes, but make sure to make the new version of your app compatible with the old one since there will be a timespan when both versions interoperate. Checking if two versions of your app are compatible includes verifying binary and algorithmic compatibility and some other steps.

It is worth mentioning that a business app upgrade is orthogonal to a rolling member upgrade. A rolling business app upgrade may be done without upgrading the members.

7. Distributed Data Structures

As mentioned in the Overview section, Hazelcast offers distributed implementations of Java interfaces. Below is the list of these implementations with links to the corresponding sections in this manual.

  • Standard utility collections

    • Map is the distributed implementation of java.util.Map. It lets you read from and write to a Hazelcast map with methods such as get and put.

    • Queue is the distributed implementation of java.util.concurrent.BlockingQueue. You can add an item in one member and remove it from another one.

    • Ringbuffer is a data structure for implementing a reliable eventing system.

    • Set is the distributed and concurrent implementation of java.util.Set. It does not allow duplicate elements and does not preserve their order.

    • List is similar to Hazelcast Set. The only difference is that it allows duplicate elements and preserves their order.

    • Multimap is a specialized Hazelcast map. It is a distributed data structure where you can store multiple values for a single key.

    • Replicated Map does not partition data. It does not spread data to different cluster members. Instead, it replicates the data to all members.

    • Cardinality Estimator is a data structure which implements Flajolet’s HyperLogLog algorithm.

  • Topic is the distributed mechanism for publishing messages that are delivered to multiple subscribers. It is also known as the publish/subscribe (pub/sub) messaging model. Please see the Topic section for more information. Hazelcast also has a structure called Reliable Topic which uses the same interface as Hazelcast Topic. The difference is that it is backed by the Ringbuffer data structure. Please see the Reliable Topic section.

  • Concurrency utilities

    • Lock is the distributed implementation of java.util.concurrent.locks.Lock. When you use lock, the critical section that Hazelcast Lock guards is guaranteed to be executed by only one thread in the entire cluster.

    • ISemaphore is the distributed implementation of java.util.concurrent.Semaphore. When performing concurrent activities, semaphores offer permits to control the thread counts.

    • IAtomicLong is the distributed implementation of java.util.concurrent.atomic.AtomicLong. Most of AtomicLong’s operations are available. However, these operations involve remote calls, so their performance differs from that of AtomicLong.

    • IAtomicReference is the distributed implementation of java.util.concurrent.atomic.AtomicReference. When you need to deal with a reference in a distributed environment, you can use Hazelcast IAtomicReference.

    • IdGenerator is used to generate cluster-wide unique identifiers. ID generation occurs almost at the speed of AtomicLong.incrementAndGet(). This feature is deprecated, please use FlakeIdGenerator instead.

    • ICountDownLatch is the distributed implementation of java.util.concurrent.CountDownLatch. Hazelcast ICountDownLatch is a gatekeeper for concurrent activities. It enables threads to wait for other threads to complete their operations.

    • PN counter is a distributed data structure where each Hazelcast instance can increment and decrement the counter value and these updates are propagated to all replicas.

  • Event Journal is a distributed data structure that stores the history of mutation actions on map or cache.

7.1. Overview of Hazelcast Distributed Objects

Hazelcast has two types of distributed objects in terms of their partitioning strategies:

  1. Data structures where each partition stores a part of the instance, namely partitioned data structures.

  2. Data structures where a single partition stores the whole instance, namely non-partitioned data structures.

Partitioned Hazelcast data structures are:

  • Map

  • MultiMap

  • Cache (Hazelcast JCache implementation)

  • PN Counter

  • Event Journal

Non-partitioned Hazelcast data structures are:

  • Queue

  • Set

  • List

  • Ringbuffer

  • Lock

  • ISemaphore

  • IAtomicLong

  • IAtomicReference

  • FlakeIdGenerator

  • ICountDownLatch

  • Cardinality Estimator

Besides these, Hazelcast also offers the Replicated Map structure as explained in the above Standard utility collections list.

7.1.1. Loading and Destroying a Distributed Object

Hazelcast offers a get method for most of its distributed objects. To load an object, first create a Hazelcast instance and then use the related get method on this instance. The following example code snippet creates a Hazelcast instance and a map on this instance.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
Map<Integer, String> customers = hazelcastInstance.getMap( "customers" );

As to the configuration of distributed objects, Hazelcast uses the default settings from the file hazelcast.xml that comes with your Hazelcast download. Of course, you can provide an explicit configuration in this XML or programmatically according to your needs. Please see the Understanding Configuration section.

Note that most of Hazelcast’s distributed objects are created lazily, i.e., a distributed object is created once the first operation accesses it.

If you want to use an object you loaded in other places, you can safely reload it using its reference without creating a new Hazelcast instance (customers in the above example).

To destroy a Hazelcast distributed object, you can use the method destroy. This method clears and releases all resources of the object. Therefore, you must use it with care, since a reload with the same object reference after the object is destroyed creates a new data structure without an error. Please see the following example code where one of the queues is destroyed and the other one is accessed.

HazelcastInstance hz1 = Hazelcast.newHazelcastInstance();
HazelcastInstance hz2 = Hazelcast.newHazelcastInstance();
IQueue<String> q1 = hz1.getQueue("q");
IQueue<String> q2 = hz2.getQueue("q");
q1.add("foo");
System.out.println("q1.size: "+q1.size()+ " q2.size:"+q2.size());
q1.destroy();
System.out.println("q1.size: " + q1.size() + " q2.size:" + q2.size());

If you run the code above, the output will be as shown below:

q1.size: 1 q2.size:1
q1.size: 0 q2.size:0

As you see, no error is generated and a new queue resource is created.

7.1.2. Controlling Partitions

Hazelcast uses the name of a distributed object to determine which partition it will be put into. Let’s load two semaphores as shown below:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ISemaphore s1 = hazelcastInstance.getSemaphore("s1");
ISemaphore s2 = hazelcastInstance.getSemaphore("s2");

Since these semaphores have different names, they will be placed into different partitions. If you want to put these two into the same partition, you use the @ symbol as shown below:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ISemaphore s1 = hazelcastInstance.getSemaphore("s1@foo");
ISemaphore s2 = hazelcastInstance.getSemaphore("s2@foo");

Now, these two semaphores will be put into the same partition whose partition key is foo. Note that you can use the method getPartitionKey to learn the partition key of a distributed object. It may be useful when you want to create an object in the same partition of an existing object. Please see its usage as shown below:

String partitionKey = s1.getPartitionKey();
ISemaphore s3 = hazelcastInstance.getSemaphore("s3@"+partitionKey);
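
The effect of the @ symbol can be illustrated without a running cluster. The following is a simplified sketch, not Hazelcast's partitioning code: it assumes that the partition key is everything after the first @ in the name (or the whole name if there is no @), and it uses String.hashCode() as a stand-in for the hash function Hazelcast actually applies. The class PartitionKeySketch and the partition count of 271 are used only for illustration.

```java
// Hypothetical illustration of partition keys; not Hazelcast code.
final class PartitionKeySketch {

    // Assumption: the part after the first '@' is the partition key;
    // without '@' the whole name is used.
    static String partitionKey(String objectName) {
        int at = objectName.indexOf('@');
        return at == -1 ? objectName : objectName.substring(at + 1);
    }

    // Stand-in hash: String.hashCode() is NOT the hash Hazelcast uses.
    static int partitionId(String objectName, int partitionCount) {
        return Math.floorMod(partitionKey(objectName).hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int partitionCount = 271;
        System.out.println(partitionKey("s1@foo")); // foo
        System.out.println(partitionKey("s1"));     // s1
        // Same partition key, therefore the same partition.
        System.out.println(partitionId("s1@foo", partitionCount)
                == partitionId("s2@foo", partitionCount)); // true
    }
}
```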

7.1.3. Common Features of all Hazelcast Data Structures

  • If a member goes down, its backup replica (which holds the same data) will dynamically redistribute the data, including the ownership and locks on them, to the remaining live members. As a result, there will not be any data loss.

  • There is no single cluster master that can be a single point of failure. Every member in the cluster has equal rights and responsibilities. No single member is superior. There is no dependency on an external 'server' or 'master'.

7.1.4. Example Distributed Object Code

Here is an example of how you can retrieve existing data structure instances (map, queue, set, lock, topic, etc.) and how you can listen for instance events, such as an instance being created or destroyed.

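import com.hazelcast.config.Config;
import com.hazelcast.core.DistributedObject;
import com.hazelcast.core.DistributedObjectEvent;
import com.hazelcast.core.DistributedObjectListener;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Collection;

public class SampleDOL implements DistributedObjectListener {

    public static void main(String[] args) {
        SampleDOL sample = new SampleDOL();
        Config config = new Config();

        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
        // Get notified whenever a distributed object is created or destroyed.
        hazelcastInstance.addDistributedObjectListener(sample);

        // Print the names of the distributed objects that already exist.
        Collection<DistributedObject> distributedObjects = hazelcastInstance.getDistributedObjects();
        for (DistributedObject distributedObject : distributedObjects) {
            System.out.println(distributedObject.getName());
        }
    }

    @Override
    public void distributedObjectCreated(DistributedObjectEvent event) {
        DistributedObject instance = event.getDistributedObject();
        System.out.println("Created " + instance.getName());
    }

    @Override
    public void distributedObjectDestroyed(DistributedObjectEvent event) {
        DistributedObject instance = event.getDistributedObject();
        System.out.println("Destroyed " + instance.getName());
    }
}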

7.2. Map

Hazelcast Map (IMap) extends the interface java.util.concurrent.ConcurrentMap and hence java.util.Map. It is the distributed implementation of Java map. You can perform operations like reading and writing from/to a Hazelcast map with the well known get and put methods.


IMap data structure can also be used by Hazelcast Jet for Real-Time Stream Processing (by enabling the Event Journal on your map) and Fast Batch Processing. Hazelcast Jet uses IMap as a source (reads data from IMap) and as a sink (writes data to IMap). Please see the Fast Batch Processing and Real-Time Stream Processing use cases for Hazelcast Jet. Please also see here in the Hazelcast Jet Reference Manual to learn how Jet uses IMap, i.e., how it can read from and write to IMap.

7.2.1. Getting a Map and Putting an Entry

Hazelcast will partition your map entries and their backups, and almost evenly distribute them onto all Hazelcast members. Each member carries approximately "number of map entries * 2 * 1/n" entries, where n is the number of members in the cluster. For example, if you have a member with 1000 objects to be stored in the cluster and then you start a second member, each member will both store 500 objects and back up the 500 objects in the other member.
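The arithmetic above can be sketched in a few lines of plain Java. This is only an illustration: the class and method names are hypothetical and not part of the Hazelcast API, and the sketch assumes a perfectly even distribution.

```java
// Hypothetical helper for illustration; not part of the Hazelcast API.
class EntryDistribution {

    // Approximate number of entries (owned plus backup copies) held by each member,
    // assuming a perfectly even distribution across the cluster.
    static long entriesPerMember(long entryCount, int backupCount, int memberCount) {
        if (memberCount < 1) {
            throw new IllegalArgumentException("memberCount must be at least 1");
        }
        // A backup is never kept on the member that owns the entry, so a cluster
        // cannot hold more backup copies than it has other members.
        int effectiveBackups = Math.min(backupCount, memberCount - 1);
        return entryCount * (1 + effectiveBackups) / memberCount;
    }

    public static void main(String[] args) {
        System.out.println(entriesPerMember(1000, 1, 1)); // single member: 1000 owned, no backups
        System.out.println(entriesPerMember(1000, 1, 2)); // 500 owned + 500 backups = 1000
        System.out.println(entriesPerMember(1000, 1, 4)); // 250 owned + 250 backups = 500
    }
}
```

With one backup and two members, each member holds 1000 entries in total, which matches the "500 owned plus 500 backed up" example in the text.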

Let’s create a Hazelcast instance and fill a map named Capitals with key-value pairs using the following code. Use the HazelcastInstance getMap method to get the map, then use the map put method to put an entry into the map.

HazelcastInstance hzInstance = Hazelcast.newHazelcastInstance();
Map<String, String> capitalcities = hzInstance.getMap( "capitals" );
    capitalcities.put( "1", "Tokyo" );
    capitalcities.put( "2", "Paris" );
    capitalcities.put( "3", "Washington" );
    capitalcities.put( "4", "Ankara" );
    capitalcities.put( "5", "Brussels" );
    capitalcities.put( "6", "Amsterdam" );
    capitalcities.put( "7", "New Delhi" );
    capitalcities.put( "8", "London" );
    capitalcities.put( "9", "Berlin" );
    capitalcities.put( "10", "Oslo" );
    capitalcities.put( "11", "Moscow" );
    ...
    capitalcities.put( "120", "Stockholm" );

When you run this code, a cluster member is created with a map whose entries are distributed across the member's partitions. See the below illustration. For now, this is a single member cluster.

Map Entries in a Single Member
Please note that some of the partitions will not contain any data entries since we only have 120 objects and the partition count is 271 by default. This count is configurable and can be changed using the system property hazelcast.partition.count. Please see the System Properties appendix.

7.2.2. Creating A Member for Map Backup

Now let’s create a second member by running the above code again. This will create a cluster with two members. This is also where backups of entries are created - remember the backup partitions mentioned in the Hazelcast Overview section. The following illustration shows two members and how the data and its backups are distributed.

Map Entries with Backups in Two Members

As you see, when a new member joins the cluster, it takes ownership and loads some of the data in the cluster. Eventually, it will carry almost "(1/n * total-data) + backups" of the data, reducing the load on other members.

HazelcastInstance.getMap() returns an instance of com.hazelcast.core.IMap which extends the java.util.concurrent.ConcurrentMap interface. Methods like ConcurrentMap.putIfAbsent(key,value) and ConcurrentMap.replace(key,value) can be used on the distributed map, as shown in the example below.

public class BasicMapOperations {

    private HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

    public Customer getCustomer(String id) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap("customers");
        Customer customer = customers.get(id);
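        if (customer == null) {
            customer = new Customer(id);
            // putIfAbsent returns the previous value, or null if this call stored the new one.
            Customer existing = customers.putIfAbsent(id, customer);
            if (existing != null) {
                customer = existing;
            }
        }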
        return customer;
    }

    public boolean updateCustomer(Customer customer) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap("customers");
        return (customers.replace(customer.getId(), customer) != null);
    }

    public boolean removeCustomer(Customer customer) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap("customers");
        return customers.remove(customer.getId(), customer);
    }
}

All ConcurrentMap operations such as put and remove might wait if the key is locked by another thread in the local or a remote JVM. However, they eventually return with success. ConcurrentMap operations never throw a java.util.ConcurrentModificationException.


7.2.3. Backing Up Maps

Hazelcast distributes map entries onto multiple cluster members (JVMs). Each member holds some portion of the data.

Distributed maps have one backup by default. If a member goes down, your data is recovered using the backups in the cluster. There are two types of backups as described below: sync and async.

Creating Sync Backups

To provide data safety, Hazelcast allows you to specify the number of backup copies you want to have. That way, data on a cluster member will be copied onto other member(s).

To create synchronous backups, select the number of backup copies using the backup-count property.

<hazelcast>
  <map name="default">
    <backup-count>1</backup-count>
  </map>
</hazelcast>

When this count is 1, a map entry will have its backup on one other member in the cluster. If you set it to 2, then a map entry will have its backup on two other members. You can set it to 0 if you do not want your entries to be backed up, e.g., if performance is more important than backing up. The maximum value for the backup count is 6.

Hazelcast supports both synchronous and asynchronous backups. By default, backup operations are synchronous and configured with backup-count. In this case, backup operations block operations until backups are successfully copied to backup members (or deleted from backup members in case of remove) and acknowledgements are received. Therefore, backups are updated before a put operation is completed, provided that the cluster is stable. Sync backup operations have a blocking cost which may lead to latency issues.

Creating Async Backups

Asynchronous backups, on the other hand, do not block operations. They are fire & forget and do not require acknowledgements; the backup operations are performed at some point in time.

To create asynchronous backups, select the number of async backups with the async-backup-count property. An example is shown below.

<hazelcast>
  <map name="default">
    <backup-count>0</backup-count>
    <async-backup-count>1</async-backup-count>
  </map>
</hazelcast>

See Consistency and Replication Model for more detail.

Backups increase memory usage since they are also kept in memory.

A map can have both sync and async backups at the same time.

Enabling Backup Reads

By default, Hazelcast has one sync backup copy. If backup-count is set to 1 or more, each member carries both its owned entries and backup copies of entries owned by other members. So for the map.get(key) call, it is possible that the calling member has a backup copy of that key. By default, map.get(key) always reads the value from the actual owner of the key for consistency.

To enable backup reads (read local backup entries), set the value of the read-backup-data property to true. Its default value is false for consistency. Enabling backup reads can improve performance, but it can also cause stale reads while still preserving the monotonic-reads property.

<hazelcast>
  <map name="default">
    <backup-count>0</backup-count>
    <async-backup-count>1</async-backup-count>
    <read-backup-data>true</read-backup-data>
  </map>
</hazelcast>

This feature is available when there is at least one sync or async backup.

Please note that if you are performing a read from a backup, you should take into account that your hits to the keys in the backups are not reflected as hits to the original keys on the primary members. This has an impact on IMap’s maximum idle seconds or time-to-live seconds expiration. Therefore, even though there is a hit on a key in backups, your original key on the primary member may expire.

7.2.4. Map Eviction

Starting with Hazelcast 3.7, Hazelcast Map uses a new eviction mechanism which is based on the sampling of entries. Please see the Eviction Algorithm section for details.

Unless you delete the map entries manually or use an eviction policy, they will remain in the map. Hazelcast supports policy-based eviction for distributed maps. Currently supported policies are LRU (Least Recently Used) and LFU (Least Frequently Used).
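As a minimal sketch of what the two policies mean, the following plain Java code picks an eviction candidate from a small sample of entries. It is not Hazelcast's implementation: the EntryStats class and both method names are hypothetical, and the sample values are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of LRU and LFU selection; not Hazelcast code.
class EvictionCandidates {

    // Minimal stand-in for the per-entry statistics a policy looks at.
    static final class EntryStats {
        final String key;
        final long lastAccessTime;
        final long hits;

        EntryStats(String key, long lastAccessTime, long hits) {
            this.key = key;
            this.lastAccessTime = lastAccessTime;
            this.hits = hits;
        }
    }

    // LRU: the entry whose last access lies furthest in the past.
    static EntryStats leastRecentlyUsed(List<EntryStats> sample) {
        return Collections.min(sample, Comparator.comparingLong((EntryStats e) -> e.lastAccessTime));
    }

    // LFU: the entry with the fewest recorded hits.
    static EntryStats leastFrequentlyUsed(List<EntryStats> sample) {
        return Collections.min(sample, Comparator.comparingLong((EntryStats e) -> e.hits));
    }

    public static void main(String[] args) {
        List<EntryStats> sample = Arrays.asList(
                new EntryStats("a", 100L, 9L),
                new EntryStats("b", 300L, 1L),
                new EntryStats("c", 200L, 5L));
        System.out.println("LRU evicts " + leastRecentlyUsed(sample).key); // a
        System.out.println("LFU evicts " + leastFrequentlyUsed(sample).key); // b
    }
}
```

Entry "a" was accessed longest ago, so LRU selects it even though it has the most hits; LFU selects "b" because it has the fewest hits.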

Understanding Map Eviction

Hazelcast Map performs eviction based on partitions. For example, when you specify a size using the PER_NODE attribute for max-size (please see Configuring Map Eviction), Hazelcast internally calculates the maximum size for every partition. Hazelcast uses the following equation to calculate the maximum size of a partition:

partition-maximum-size = max-size * member-count / partition-count

If the partition-maximum-size is less than 1 in the equation above, it is set to 1. Otherwise, a partition limit below 1 would cause the partitions to be emptied immediately by eviction.

The eviction process starts according to this calculated partition maximum size when you try to put an entry. When entry count in that partition exceeds partition maximum size, eviction starts on that partition.

Assume that you have the following figures as examples:

  • Partition count: 200

  • Entry count for each partition: 100

  • max-size (PER_NODE): 20000

The total number of entries here is 20000 (partition count * entry count for each partition). This means you are at the eviction threshold since you set the max-size to 20000. When you try to put an entry:

  1. the entry goes to the relevant partition;

  2. the partition checks whether the eviction threshold is reached (max-size);

  3. only one entry will be evicted.

As a result of this eviction process, when you check the size of your map, it is 19999. After this eviction, subsequent put operations will not trigger the next eviction until the map size is again close to the max-size.
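The partition-maximum-size equation can be checked against the figures above with a short sketch. The class and method names are hypothetical, the use of integer division is an assumption (the text does not say how the result is rounded), and the example assumes a single member since the figures do not state a member count.

```java
// Hypothetical helper for illustration; not part of the Hazelcast API.
class PartitionSizing {

    // partition-maximum-size = max-size * member-count / partition-count,
    // raised to 1 when the result would be smaller than 1.
    static long partitionMaxSize(int maxSize, int memberCount, int partitionCount) {
        if (memberCount < 1 || partitionCount < 1) {
            throw new IllegalArgumentException("memberCount and partitionCount must be at least 1");
        }
        // Integer division is an assumption made for this sketch.
        long size = (long) maxSize * memberCount / partitionCount;
        return Math.max(1L, size);
    }

    public static void main(String[] args) {
        System.out.println(partitionMaxSize(20000, 1, 200)); // 100, the figures used above
        System.out.println(partitionMaxSize(5000, 2, 271));  // 36
        System.out.println(partitionMaxSize(100, 1, 271));   // 1, raised from 0
    }
}
```

With max-size 20000, one member and 200 partitions, each partition may hold 100 entries, which is why the example with 100 entries per partition sits exactly at the eviction threshold.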

The above scenario is simply an example that describes how the eviction process works. Hazelcast finds the optimum number of entries to be evicted according to your cluster size and selected policy.

Configuring Map Eviction

The following is an example declarative configuration for map eviction.

<hazelcast>
  <map name="default">
    ...
    <time-to-live-seconds>0</time-to-live-seconds>
    <max-idle-seconds>0</max-idle-seconds>
    <eviction-policy>LRU</eviction-policy>
    <max-size policy="PER_NODE">5000</max-size>
    ...
  </map>
</hazelcast>

Let’s describe each element:

  • time-to-live-seconds: Maximum time in seconds for each entry to stay in the map (TTL). It limits the lifetime of the entries relative to the time of the last write access performed on them. If it is not 0, the entries whose lifetime exceeds this period (without any write access performed on them during this period) are expired and evicted automatically. An individual entry may have its own lifetime limit by using one of the methods accepting a TTL; see the Evicting Specific Entries section. If there is no TTL value provided for the individual entry, it inherits the value set for this element. Valid values are integers between 0 and Integer.MAX_VALUE. Its default value is 0, which means infinite (no expiration and eviction). If it is not 0, entries are evicted regardless of the set eviction-policy described below.

  • max-idle-seconds: Maximum time in seconds for each entry to stay idle in the map. It limits the lifetime of the entries relative to the time of the last read or write access performed on them. The entries whose idle period exceeds this limit are expired and evicted automatically. An entry is idle if no get, put, EntryProcessor.process or containsKey is called on it. Valid values are integers between 0 and Integer.MAX_VALUE. Its default value is 0, which means infinite.

    Both time-to-live-seconds and max-idle-seconds may be used simultaneously on the map entries. In that case, the entry is considered expired if at least one of the policies marks it as expired.

  • eviction-policy: Eviction policy to be applied when the size of map grows larger than the value specified by the max-size element described below. Valid values are:

    • NONE: Default policy. If set, no items will be evicted and the property max-size described below will be ignored. You still can combine it with time-to-live-seconds and max-idle-seconds.

    • LRU: Least Recently Used.

    • LFU: Least Frequently Used.

      Apart from the above values, you can also develop and use your own eviction policy. Please see the Custom Eviction Policy section.

  • max-size: Maximum size of the map. When the maximum size is reached, entries are evicted from the map based on the policy defined. Valid values are integers between 0 and Integer.MAX_VALUE. Its default value is 0, which means infinite. If you want max-size to work, set the eviction-policy property to a value other than NONE. The values of its policy attribute are described below.

    • PER_NODE: Maximum number of map entries in each cluster member. This is the default policy.

      <max-size policy="PER_NODE">5000</max-size>

    • PER_PARTITION: Maximum number of map entries within each partition. Storage size depends on the partition count in a cluster member. This attribute should not be used often. For instance, avoid using this attribute with a small cluster. If the cluster is small, it will be hosting more partitions, and therefore map entries, than that of a larger cluster. Thus, for a small cluster, eviction of the entries will decrease performance (the number of entries is large).

      <max-size policy="PER_PARTITION">27100</max-size>

    • USED_HEAP_SIZE: Maximum used heap size in megabytes per map for each Hazelcast instance. Please note that this policy does not work when in-memory format is set to OBJECT, since the memory footprint cannot be determined when data is put as OBJECT.

      <max-size policy="USED_HEAP_SIZE">4096</max-size>

    • USED_HEAP_PERCENTAGE: Maximum used heap size percentage per map for each Hazelcast instance. If, for example, a JVM is configured to have 1000 MB and this value is 10, then the map entries will be evicted when used heap size exceeds 100 MB. Please note that this policy does not work when in-memory format is set to OBJECT, since the memory footprint cannot be determined when data is put as OBJECT.

      <max-size policy="USED_HEAP_PERCENTAGE">10</max-size>

    • FREE_HEAP_SIZE: Minimum free heap size in megabytes for each JVM.

      <max-size policy="FREE_HEAP_SIZE">512</max-size>

    • FREE_HEAP_PERCENTAGE: Minimum free heap size percentage for each JVM. If, for example, a JVM is configured to have 1000 MB and this value is 10, then the map entries will be evicted when free heap size is below 100 MB.

      <max-size policy="FREE_HEAP_PERCENTAGE">10</max-size>

    • USED_NATIVE_MEMORY_SIZE: (Hazelcast IMDG Enterprise HD) Maximum used native memory size in megabytes per map for each Hazelcast instance.

      <max-size policy="USED_NATIVE_MEMORY_SIZE">1024</max-size>

    • USED_NATIVE_MEMORY_PERCENTAGE: (Hazelcast IMDG Enterprise HD) Maximum used native memory size percentage per map for each Hazelcast instance.

      <max-size policy="USED_NATIVE_MEMORY_PERCENTAGE">65</max-size>

    • FREE_NATIVE_MEMORY_SIZE: (Hazelcast IMDG Enterprise HD) Minimum free native memory size in megabytes for each Hazelcast instance.

      <max-size policy="FREE_NATIVE_MEMORY_SIZE">256</max-size>

    • FREE_NATIVE_MEMORY_PERCENTAGE: (Hazelcast IMDG Enterprise HD) Minimum free native memory size percentage for each Hazelcast instance.

      <max-size policy="FREE_NATIVE_MEMORY_PERCENTAGE">5</max-size>

As of Hazelcast 3.7, the elements eviction-percentage and min-eviction-check-millis are deprecated. They will be ignored if configured since map eviction is based on the sampling of entries. Please see the Eviction Algorithm section for details.
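To make the FREE_HEAP_PERCENTAGE arithmetic from the list above concrete, here is a small sketch in plain Java. The class and method names are hypothetical, and deriving the free heap from java.lang.Runtime is an assumption for illustration; the text does not describe how Hazelcast measures it.

```java
// Hypothetical helper for illustration; not part of the Hazelcast API.
class FreeHeapCheck {

    // True when the free heap is below the configured minimum percentage of the maximum heap.
    static boolean isBelowFreeHeapPercentage(long freeBytes, long maxBytes, int minFreePercent) {
        return freeBytes * 100.0 / maxBytes < minFreePercent;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // Figures from the text: a 1000 MB heap with a minimum of 10 percent free.
        System.out.println(isBelowFreeHeapPercentage(90 * mb, 1000 * mb, 10));  // true
        System.out.println(isBelowFreeHeapPercentage(150 * mb, 1000 * mb, 10)); // false

        // One possible way to read the current values from the running JVM (an assumption).
        Runtime runtime = Runtime.getRuntime();
        long used = runtime.totalMemory() - runtime.freeMemory();
        long free = runtime.maxMemory() - used;
        System.out.println("Free heap now: " + free / mb + " MB of " + runtime.maxMemory() / mb + " MB");
    }
}
```

The USED_HEAP_* policies are measured per map, which needs a per-entry cost estimate and is therefore not covered by this JVM-wide sketch.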

To put it briefly, Hazelcast maps have no restrictions on the size and may grow arbitrarily large, by default. When it comes to reducing the size of a map, there are two concepts: expiration and eviction.

Expiration puts a limit on the maximum lifetime of an entry stored inside the map. When the entry expires it cannot be retrieved from the map any longer and at some point in time it will be cleaned out from the map to free up the memory. Expiration, and hence the eviction based on the expiration, can be configured using the elements time-to-live-seconds and max-idle-seconds as described above.

Eviction puts a limit on the maximum size of the map. If the size of the map grows larger than the maximum allowed size, an eviction policy decides which item to evict from the map to reduce its size. The maximum allowed size can be configured using the element max-size and the eviction policy can be configured using the element eviction-policy as described above.

Eviction and expiration can be used together. In this case, the expiration configurations (time-to-live-seconds and max-idle-seconds) continue to work as usual, cleaning out the expired entries regardless of the map size. Note that locked map entries are not subject to eviction or expiration.
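The way time-to-live-seconds and max-idle-seconds combine can be sketched as a single check. This is an illustration only: the class and method names are hypothetical, and treating an entry as expired strictly after the limit has passed is an assumption about the boundary.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the Hazelcast API.
class ExpirationCheck {

    // An entry is expired if at least one of the two limits marks it as expired.
    // A value of 0 for either limit means "infinite", as in the configuration elements.
    static boolean isExpired(long lastWriteMillis, long lastAccessMillis,
                             int ttlSeconds, int maxIdleSeconds, long nowMillis) {
        // TTL is measured from the last write access.
        boolean ttlExpired = ttlSeconds > 0
                && nowMillis - lastWriteMillis > TimeUnit.SECONDS.toMillis(ttlSeconds);
        // Idle time is measured from the last read or write access.
        boolean idleExpired = maxIdleSeconds > 0
                && nowMillis - lastAccessMillis > TimeUnit.SECONDS.toMillis(maxIdleSeconds);
        return ttlExpired || idleExpired;
    }

    public static void main(String[] args) {
        // TTL 50 seconds, max idle 40 seconds; the entry was written at time 0.
        System.out.println(isExpired(0, 0, 50, 40, 45_000));      // true: idle for 45 s
        System.out.println(isExpired(0, 30_000, 50, 40, 45_000)); // false: read 15 s ago
        System.out.println(isExpired(0, 30_000, 50, 40, 51_000)); // true: written 51 s ago
    }
}
```

The second and third calls show that reading an entry resets only the idle clock; the TTL still runs from the last write.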

Example Eviction Configurations

<map name="documents">
  <max-size policy="PER_NODE">10000</max-size>
  <eviction-policy>LRU</eviction-policy>
  <max-idle-seconds>60</max-idle-seconds>
</map>

In the above example, the documents map starts to evict its entries from a member when the map size exceeds 10000 in that member. Then the least recently used entries will be evicted. The entries not used for more than 60 seconds will be evicted as well.

And the following is an example eviction configuration for a map having NATIVE as the in-memory format:

<map name="nativeMap*">
    <in-memory-format>NATIVE</in-memory-format>
    <eviction-policy>LFU</eviction-policy>
    <max-size policy="USED_NATIVE_MEMORY_PERCENTAGE">99</max-size>
</map>

Evicting Specific Entries

The eviction policies and configurations explained above apply to all the entries of a map. The entries that meet the specified eviction conditions are evicted.

If you want to evict some specific map entries, you can use the ttl and ttlUnit parameters of the method map.put(). An example code line is given below.

myMap.put( "1", "John", 50, TimeUnit.SECONDS );

The map entry with the key "1" will be evicted 50 seconds after it is put into myMap.

You may also use map.setTTL method to alter the time-to-live value of an existing entry. It is done as follows:

myMap.setTTL( "1", 50, TimeUnit.SECONDS );

In addition to the ttl, you may also specify a maximum idle timeout for specific map entries using the maxIdle and maxIdleUnit parameters:

myMap.put( "1", "John", 50, TimeUnit.SECONDS, 40, TimeUnit.SECONDS );

Here ttl is set as 50 seconds and maxIdle is set as 40 seconds. The entry is considered to be evicted if at least one of these policies marks it as expired. If you want to specify only the maxIdle parameter, you need to set ttl as 0 seconds.

Evicting All Entries

To evict all keys from the map except the locked ones, use the method evictAll(). If a MapStore is defined for the map, deleteAll is not called by evictAll. If you want to call the method deleteAll, use clear().

An example is given below.

final int numberOfKeysToLock = 4;
final int numberOfEntriesToAdd = 1000;

HazelcastInstance node1 = Hazelcast.newHazelcastInstance();
HazelcastInstance node2 = Hazelcast.newHazelcastInstance();

IMap<Integer, Integer> map = node1.getMap( "map" );
for (int i = 0; i < numberOfEntriesToAdd; i++) {
    map.put(i, i);
}

for (int i = 0; i < numberOfKeysToLock; i++) {
    map.lock(i);
}

// should keep locked keys and evict all others.
map.evictAll();

System.out.printf("# After calling evictAll...\n");
System.out.printf("# Expected map size\t: %d\n", numberOfKeysToLock);
System.out.printf("# Actual map size\t: %d\n", map.size());

Only the EVICT_ALL event is fired for any registered listeners.

Forced Eviction

Hazelcast IMDG Enterprise

Hazelcast may use forced eviction in the cases when the eviction explained in Understanding Map Eviction is not enough to free up your memory. Note that this is valid if you are using Hazelcast IMDG Enterprise and you set your in-memory format to NATIVE.

Forced eviction mechanism is explained below as steps in the given order:

  • When the normal eviction is not enough, forced eviction is triggered and first it tries to evict approx. 20% of the entries from the current partition. It retries this five times.

  • If the result of above step is still not enough, forced eviction applies the above step to all maps. This time it might perform eviction from some other partitions too, provided that they are owned by the same thread.

  • If that is still not enough to free up your memory, it evicts not the 20% but all the entries from the current partition.

  • If that is not enough, it evicts all the entries from the other data structures in the partitions owned by the local thread.

Finally, when all the above steps are not enough, Hazelcast throws a Native Out of Memory Exception.

Custom Eviction Policy

This section is valid for Hazelcast 3.7 and higher releases.

Apart from the policies such as LRU and LFU, which Hazelcast provides out-of-the-box, you can develop and use your own eviction policy.

To achieve this, you need to provide an implementation of MapEvictionPolicy as in the following OddEvictor example:

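import com.hazelcast.config.Config;
import com.hazelcast.core.EntryEvent;
import com.hazelcast.core.EntryView;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.map.eviction.MapEvictionPolicy;
import com.hazelcast.map.listener.EntryEvictedListener;

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import static com.hazelcast.config.MaxSizeConfig.MaxSizePolicy.PER_NODE;
import static java.lang.String.format;
import static java.lang.System.out;
import static java.util.concurrent.TimeUnit.SECONDS;
import static java.util.concurrent.locks.LockSupport.parkNanos;

public class MapCustomEvictionPolicy {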

    public static void main(String[] args) {
        Config config = new Config();
        config.getMapConfig("test")
                .setMapEvictionPolicy(new OddEvictor())
                .getMaxSizeConfig()
                .setMaxSizePolicy(PER_NODE).setSize(10000);

        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        IMap<Integer, Integer> map = instance.getMap("test");

        final Queue<Integer> oddKeys = new ConcurrentLinkedQueue<Integer>();
        final Queue<Integer> evenKeys = new ConcurrentLinkedQueue<Integer>();

        map.addEntryListener(new EntryEvictedListener<Integer, Integer>() {
            @Override
            public void entryEvicted(EntryEvent<Integer, Integer> event) {
                Integer key = event.getKey();
                if (key % 2 == 0) {
                    evenKeys.add(key);
                } else {
                    oddKeys.add(key);
                }
            }
        }, false);

        for (int i = 0; i < 15000; i++) {
            map.put(i, i);
        }

        // wait some more time to receive evicted-events
        parkNanos(SECONDS.toNanos(5));

        String msg = "IMap uses sampling based eviction. After eviction is completed, we are expecting "
                + "number of evicted-odd-keys should be greater than number of evicted-even-keys"
                + "\nNumber of evicted-odd-keys = %d, number of evicted-even-keys = %d";
        out.println(format(msg, oddKeys.size(), evenKeys.size()));

        instance.shutdown();
    }

    /**
     * Odd evictor tries to evict odd keys first.
     */
    private static class OddEvictor extends MapEvictionPolicy {

        @Override
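        public int compare(EntryView o1, EntryView o2) {
            // Odd keys compare as smaller so that they are preferred for eviction.
            // Looking at both keys keeps the comparison consistent in both directions.
            boolean odd1 = ((Integer) o1.getKey()) % 2 != 0;
            boolean odd2 = ((Integer) o2.getKey()) % 2 != 0;
            if (odd1 == odd2) {
                return 0;
            }
            return odd1 ? -1 : 1;
        }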
    }
}

Then you can enable your policy by setting it via the method MapConfig.setMapEvictionPolicy() programmatically or via XML declaratively. Following is the example declarative configuration for the eviction policy OddEvictor implemented above:

<map name="test">
   ...
   <map-eviction-policy-class-name>com.package.OddEvictor</map-eviction-policy-class-name>
   ....
</map>

If you use Hazelcast with Spring, you can enable your policy as shown below.

<hz:map name="test">
    <hz:map-eviction-policy class-name="com.package.OddEvictor"/>
</hz:map>

7.2.5. Setting In-Memory Format

IMap (and a few other Hazelcast data structures, such as ICache) has an in-memory-format configuration option. By default, Hazelcast stores data into memory in binary (serialized) format. Sometimes it can be efficient to store the entries in their object form, especially in cases of local processing, such as entry processor and queries.

To set how the data will be stored in memory, set in-memory-format in the configuration. You have the following format options:

  • BINARY (default): The data (both the key and value) will be stored in serialized binary format. You can use this option if you mostly perform regular map operations, such as put and get.

  • OBJECT: The data will be stored in deserialized form. This configuration is good for maps where entry processing and queries form the majority of all operations and the objects are complex, making the serialization cost comparatively high. By storing objects, entry processing will not contain the deserialization cost. Note that when you use OBJECT as the in-memory format, the key will still be stored in binary format and the value will be stored in object format.

  • NATIVE: (Hazelcast IMDG Enterprise HD) This format behaves the same as BINARY, however, instead of heap memory, key and value will be stored in the off-heap memory.

Regular operations like get rely on the object instance. When the OBJECT format is used and a get is performed, the map does not return the stored instance, but creates a clone. Therefore, this whole get operation first includes a serialization on the member owning the instance and then a deserialization on the member calling the instance. When the BINARY format is used, only a deserialization is required; BINARY is faster.

Similarly, a put operation is faster when the BINARY format is used. If the format was OBJECT, the map would create a clone of the instance, and there would first be a serialization and then a deserialization. When BINARY is used, only a serialization is needed.

If a value is stored in OBJECT format, a change on a returned value does not affect the stored instance. In this case, the returned instance is not the actual one but a clone. Therefore, changes made on an object after it is returned will not reflect on the actual stored data. Similarly, when a value is written to a map and the value is stored in OBJECT format, it will be a copy of the put value. Therefore, changes made on the object after it is stored will not reflect on the stored data.
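The copy semantics described above can be imitated in plain Java. The sketch below is not Hazelcast code: CopyOnAccessMap and Person are hypothetical names, and standard Java serialization stands in for Hazelcast's serialization to produce the clones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of copy-on-put and copy-on-get; not Hazelcast code.
class CopyOnAccessMap {

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;

        Person(String name) {
            this.name = name;
        }
    }

    private final Map<String, Person> stored = new HashMap<>();

    // Clone a value by serializing it and deserializing the resulting bytes.
    static Person copy(Person value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // The map keeps its own copy of the value that was put.
    void put(String key, Person value) {
        stored.put(key, copy(value));
    }

    // The caller receives a copy, never the stored instance.
    Person get(String key) {
        Person value = stored.get(key);
        return value == null ? null : copy(value);
    }

    public static void main(String[] args) {
        CopyOnAccessMap map = new CopyOnAccessMap();
        Person person = new Person("John");
        map.put("1", person);

        person.name = "Changed after put";      // does not affect the stored copy
        map.get("1").name = "Changed after get"; // changes only the returned copy

        System.out.println(map.get("1").name);  // prints "John"
    }
}
```

To change the stored value in a real map, the modified object has to be written back, for example with another put.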

7.2.6. Using High-Density Memory Store with Map

Hazelcast IMDG Enterprise HD

Hazelcast instances are Java programs. In case of BINARY and OBJECT in-memory formats, Hazelcast stores your distributed data into the heap of its server instances. Java heap is subject to garbage collection (GC). In case of larger heaps, garbage collection might cause your application to pause for tens of seconds (even minutes for really large heaps), badly affecting your application performance and response times.

As the data gets bigger, you either run the application with larger heap, which would result in longer GC pauses or run multiple instances with smaller heap which can turn into an operational nightmare if the number of such instances becomes very high.

To overcome this challenge, Hazelcast offers High-Density Memory Store for your maps. You can configure your map to use High-Density Memory Store by setting the in-memory format to NATIVE. The following snippet is the declarative configuration example.

<map name="nativeMap*">
   <in-memory-format>NATIVE</in-memory-format>
</map>

Keep in mind that you should have already enabled the High-Density Memory Store usage for your cluster. Please see Configuring High-Density Memory Store section.

Required configuration changes when using NATIVE

Note that the eviction mechanism is different for NATIVE in-memory format. The new eviction algorithm for map with High-Density Memory Store is similar to that of JCache with High-Density Memory Store and is described here.

  • Eviction percentage has no effect.

    <map name="nativeMap*">
        <in-memory-format>NATIVE</in-memory-format>
        <eviction-percentage>25</eviction-percentage> <!-- NO IMPACT with NATIVE -->
    </map>

  • These IMap eviction policies for max-size cannot be used: FREE_HEAP_PERCENTAGE, FREE_HEAP_SIZE, USED_HEAP_PERCENTAGE, USED_HEAP_SIZE.

  • Near Cache eviction configuration is also different for NATIVE in-memory format. For a Near Cache configuration with in-memory format set to BINARY:

    <map name="nativeMap*">
       <near-cache>
          <in-memory-format>BINARY</in-memory-format>
          <max-size>10000</max-size> <!-- NO IMPACT with NATIVE -->
          <eviction-policy>LFU</eviction-policy> <!-- NO IMPACT with NATIVE -->
       </near-cache>
    </map>

    the equivalent configuration for NATIVE in-memory format would be similar to the following:

    <map name="nativeMap*">
       <near-cache>
          <in-memory-format>NATIVE</in-memory-format>
          <eviction size="10000" eviction-policy="LFU" max-size-policy="USED_NATIVE_MEMORY_SIZE"/>   <!-- Correct configuration with NATIVE -->
       </near-cache>
    </map>

  • Near Cache eviction policy ENTRY_COUNT cannot be used for max-size-policy.

Please refer to the High-Density Memory Store section for more information.

7.2.7. Loading and Storing Persistent Data

Hazelcast allows you to load and store the distributed map entries from/to a persistent data store such as a relational database. To do this, you can use Hazelcast’s MapStore and MapLoader interfaces.

When you provide a MapLoader implementation and request an entry (IMap.get()) that does not exist in memory, MapLoader’s load method will load that entry from the data store. This loaded entry is placed into the map and will stay there until it is removed or evicted.

When a MapStore implementation is provided, an entry is also put into a user defined data store.

The data store needs to be a centralized system that is accessible from all Hazelcast members. Persistence to a local file system is not supported.

Also note that the MapStore interface extends the MapLoader interface as you can see in the interface code.

Starting with Hazelcast IMDG 3.11, all loads can be listened to via EntryLoadedListener.

Following is a MapStore example.

public class PersonMapStore implements MapStore<Long, Person> {

    private final Connection con;
    private final PreparedStatement allKeysStatement;

    public PersonMapStore() {
        try {
            con = DriverManager.getConnection("jdbc:hsqldb:mydatabase", "SA", "");
            con.createStatement().executeUpdate(
                    "create table if not exists person (id bigint not null, name varchar(45), primary key (id))");
            allKeysStatement = con.prepareStatement("select id from person");
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public synchronized void delete(Long key) {
        System.out.println("Delete:" + key);
        try {
            con.createStatement().executeUpdate(
                    format("delete from person where id = %s", key));
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

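    public synchronized void store(Long key, Person value) {
        // A parameterized statement avoids quoting problems, e.g., names containing an apostrophe.
        try (PreparedStatement statement = con.prepareStatement("insert into person values(?, ?)")) {
            statement.setLong(1, key);
            statement.setString(2, value.getName());
            statement.executeUpdate();
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }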

    public synchronized void storeAll(Map<Long, Person> map) {
        for (Map.Entry<Long, Person> entry : map.entrySet()) {
            store(entry.getKey(), entry.getValue());
        }
    }

    public synchronized void deleteAll(Collection<Long> keys) {
        for (Long key : keys) {
            delete(key);
        }
    }

    public synchronized Person load(Long key) {
        try {
            ResultSet resultSet = con.createStatement().executeQuery(
                    format("select name from person where id =%s", key));
            try {
                if (!resultSet.next()) {
                    return null;
                }
                String name = resultSet.getString(1);
                return new Person(key, name);
            } finally {
                resultSet.close();
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public synchronized Map<Long, Person> loadAll(Collection<Long> keys) {
        Map<Long, Person> result = new HashMap<Long, Person>();
        for (Long key : keys) {
            result.put(key, load(key));
        }
        return result;
    }

    public Iterable<Long> loadAllKeys() {
        return new StatementIterable<Long>(allKeysStatement);
    }
}
During the initial loading process, MapStore uses a thread different from the partition threads that are used by the ExecutorService. After the initialization is completed, the map.get method looks up any nonexistent value from the database in a partition thread, or the map.put method looks up the database to return the previously associated value for a key also in a partition thread.
For more MapStore/MapLoader code samples, please see here.

Hazelcast supports read-through, write-through and write-behind persistence modes, which are explained in the subsections below.

Using Read-Through Persistence

If an entry does not exist in memory when an application asks for it, Hazelcast asks the loader implementation to load that entry from the data store. If the entry exists there, the loader implementation gets it, hands it to Hazelcast, and Hazelcast puts it into memory. This is read-through persistence mode.
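
The read-through flow described above can be sketched in plain Java. This is only an illustration and not Hazelcast code: the class ReadThroughSketch, its loadCalls counter and the in-memory "data store" map are hypothetical names, and the loader function merely stands in for MapLoader.load(key).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java illustration of read-through; all names here are hypothetical.
class ReadThroughSketch<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final Function<K, V> loader; // stands in for MapLoader.load(key)
    int loadCalls = 0;                   // counts how often the data store was asked

    ReadThroughSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            loadCalls++;
            value = loader.apply(key);   // entry is not in memory: ask the data store
            if (value != null) {
                memory.put(key, value);  // keep the loaded entry in memory
            }
        }
        return value;
    }

    public static void main(String[] args) {
        Map<Long, String> dataStore = new HashMap<>();
        dataStore.put(1L, "Alice");
        ReadThroughSketch<Long, String> map = new ReadThroughSketch<>(dataStore::get);
        System.out.println(map.get(1L)); // first read goes to the data store
        System.out.println(map.get(1L)); // second read is served from memory
        System.out.println("loads: " + map.loadCalls);
    }
}
```

In this sketch, a key that is missing from the data store is simply returned as null and nothing is cached for it.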

Setting Write-Through Persistence

MapStore can be configured to be write-through by setting the write-delay-seconds property to 0. This means the entries will be put to the data store synchronously.

In this mode, when the map.put(key,value) call returns:

  • MapStore.store(key,value) is successfully called so the entry is persisted.

  • In-Memory entry is updated.

  • In-Memory backup copies are successfully created on other cluster members (if backup-count is greater than 0).

If MapStore throws an exception then the exception is propagated to the original put or remove call in the form of RuntimeException.
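
As a rough plain-Java sketch of the ordering listed above (store first, then the in-memory update), consider the following. It is not Hazelcast code: WriteThroughSketch is a hypothetical class, the BiConsumer stands in for MapStore.store(key, value), and the assumption that the in-memory entry is left untouched when the store call fails is part of the sketch only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Plain-Java illustration of write-through; all names here are hypothetical.
class WriteThroughSketch<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final BiConsumer<K, V> store; // stands in for MapStore.store(key, value)

    WriteThroughSketch(BiConsumer<K, V> store) {
        this.store = store;
    }

    V put(K key, V value) {
        store.accept(key, value);      // synchronous; an exception escapes to the caller
        return memory.put(key, value); // reached only if the store call succeeded
    }

    V get(K key) {
        return memory.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        WriteThroughSketch<String, String> ok = new WriteThroughSketch<>(db::put);
        ok.put("k", "v");
        System.out.println("persisted: " + db.get("k"));

        WriteThroughSketch<String, String> failing = new WriteThroughSketch<>((k, v) -> {
            throw new IllegalStateException("data store unavailable");
        });
        try {
            failing.put("k", "v");
        } catch (RuntimeException e) {
            System.out.println("put failed: " + e.getMessage());
        }
    }
}
```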

There is a key difference in the behaviors of map.remove(key) and map.delete(key), i.e., the latter results in MapStore.delete(key) to be invoked whereas the former only removes the entry from IMap.
Setting Write-Behind Persistence

You can configure MapStore as write-behind by setting the write-delay-seconds property to a value bigger than 0. This means the modified entries will be put to the data store asynchronously after a configured delay.

In write-behind mode, Hazelcast coalesces updates on a specific key by default, which means it applies only the last update on that key. However, you can set MapStoreConfig.setWriteCoalescing() to false to store all updates performed on a key to the data store.
When you set MapStoreConfig.setWriteCoalescing() to false, subsequent put operations fail with ReachedMaxSizeException once the per-node maximum write-behind queue capacity is reached. This exception is thrown to prevent uncontrolled growth of the write-behind queues. You can set the per-node maximum capacity using the system property hazelcast.map.write.behind.queue.capacity. Please refer to the System Properties section for information on this property and how to set the system properties.
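
The difference between coalescing and non-coalescing queues can be illustrated with a small plain-Java sketch. It does not reflect Hazelcast's internal data structures: WriteBehindQueueSketch is a hypothetical class, the capacity check only mimics the idea behind ReachedMaxSizeException, and IllegalStateException is used as a placeholder for it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of write coalescing; all names here are hypothetical.
class WriteBehindQueueSketch {
    private final boolean coalescing;
    private final int capacity; // only enforced when coalescing is off
    private final Map<String, String> latestByKey = new LinkedHashMap<>();
    private final List<String[]> allUpdates = new ArrayList<>();

    WriteBehindQueueSketch(boolean coalescing, int capacity) {
        this.coalescing = coalescing;
        this.capacity = capacity;
    }

    void offer(String key, String value) {
        if (coalescing) {
            latestByKey.put(key, value); // a later update replaces the pending one
            return;
        }
        if (allUpdates.size() >= capacity) {
            throw new IllegalStateException("write-behind queue is full");
        }
        allUpdates.add(new String[] {key, value});
    }

    // Returns the pending writes in "key=value" form and empties the queue.
    List<String> drain() {
        List<String> out = new ArrayList<>();
        if (coalescing) {
            latestByKey.forEach((k, v) -> out.add(k + "=" + v));
            latestByKey.clear();
        } else {
            for (String[] update : allUpdates) {
                out.add(update[0] + "=" + update[1]);
            }
            allUpdates.clear();
        }
        return out;
    }

    public static void main(String[] args) {
        WriteBehindQueueSketch coalesced = new WriteBehindQueueSketch(true, 2);
        coalesced.offer("a", "1");
        coalesced.offer("a", "2");
        coalesced.offer("b", "1");
        System.out.println(coalesced.drain()); // [a=2, b=1]

        WriteBehindQueueSketch full = new WriteBehindQueueSketch(false, 3);
        full.offer("a", "1");
        full.offer("a", "2");
        full.offer("b", "1");
        System.out.println(full.drain()); // [a=1, a=2, b=1]
    }
}
```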

In write-behind mode, when the map.put(key,value) call returns:

  • In-Memory entry is updated.

  • In-Memory backup copies are successfully created on other cluster members (if backup-count is greater than 0).

  • The entry is marked as dirty so that after write-delay-seconds, it can be persisted with MapStore.store(key,value) call.

  • For fault tolerance, dirty entries are stored in a queue on the primary member and also on a back-up member.

The same behavior applies to map.remove(key); the only difference is that MapStore.delete(key) is called when the entry is to be deleted.

If MapStore throws an exception, then Hazelcast tries to store the entry again. If the entry still cannot be stored, a log message is printed and the entry is re-queued.

For batch write operations, which are only allowed in write-behind mode, Hazelcast will call MapStore.storeAll(map) and MapStore.deleteAll(collection) to do all writes in a single call.

If a map entry is marked as dirty, meaning that it is waiting to be persisted to the MapStore in a write-behind scenario, the eviction process forces the entry to be stored. This way you have control over the number of entries waiting to be stored, and thus you can prevent a possible OutOfMemory exception.
MapStore or MapLoader implementations should not use Hazelcast Map/Queue/MultiMap/List/Set operations. Your implementation should only work with your data store. Otherwise, you may get into deadlock situations.

Here is a sample configuration:

<hazelcast>
  ...
  <map name="default">
    ...
    <map-store enabled="true" initial-mode="LAZY">
      <class-name>com.hazelcast.examples.DummyStore</class-name>
      <write-delay-seconds>60</write-delay-seconds>
      <write-batch-size>1000</write-batch-size>
      <write-coalescing>true</write-coalescing>
    </map-store>
  </map>
</hazelcast>

The following are the descriptions of MapStore configuration elements and attributes:

  • class-name: Name of the class implementing MapLoader and/or MapStore.

  • write-delay-seconds: Number of seconds to delay the call to MapStore.store(key, value). If the value is zero, the mode is write-through, so MapStore.store(key, value) is called as soon as the entry is updated. Otherwise, the mode is write-behind, so updates are stored after the write-delay-seconds value by calling MapStore.storeAll(map). Its default value is 0.

  • write-batch-size: Used to create batch chunks when writing to the map store. In default mode, Hazelcast tries to write all map entries in one go. To create batch chunks, the minimum meaningful value for write-batch-size is 2. For values smaller than 2, it works as in default mode.

  • write-coalescing: In write-behind mode, Hazelcast coalesces updates on a specific key by default; it applies only the last update on it. You can set this element to false to store all updates performed on a key to the data store.

  • enabled: True to enable this map-store, false to disable. Its default value is true.

  • initial-mode: Sets the initial load mode. LAZY is the default load mode, where load is asynchronous. EAGER means load is blocked till all partitions are loaded. Please see the Initializing Map on Startup section for more details.
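
To make the write-batch-size description more concrete, here is a minimal plain-Java sketch of how a list of pending entries could be split into chunks. It is an illustration only, not the Hazelcast implementation; BatchChunkSketch and chunk are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of batch chunking; all names here are hypothetical.
class BatchChunkSketch {
    static <T> List<List<T>> chunk(List<T> entries, int writeBatchSize) {
        List<List<T>> chunks = new ArrayList<>();
        if (entries.isEmpty()) {
            return chunks;
        }
        if (writeBatchSize < 2) {
            chunks.add(new ArrayList<>(entries)); // default mode: everything in one go
            return chunks;
        }
        for (int from = 0; from < entries.size(); from += writeBatchSize) {
            int to = Math.min(from + writeBatchSize, entries.size());
            chunks.add(new ArrayList<>(entries.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> pending = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(chunk(pending, 2)); // [[1, 2], [3, 4], [5]]
        System.out.println(chunk(pending, 1)); // [[1, 2, 3, 4, 5]]
    }
}
```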

Storing Entries to Multiple Maps

A configuration can be applied to more than one map using wildcards (see Using Wildcards), meaning that the configuration is shared among the maps. But MapStore does not know which entries to store when there is one configuration applied to multiple maps.

To store entries when there is one configuration applied to multiple maps, use Hazelcast’s MapStoreFactory interface. Using the MapStoreFactory interface, a MapStore can be created for each map when a wildcard configuration is used. Example code is shown below.

Config config = new Config();
MapConfig mapConfig = config.getMapConfig( "*" );
MapStoreConfig mapStoreConfig = mapConfig.getMapStoreConfig();
mapStoreConfig.setFactoryImplementation( new MapStoreFactory<Object, Object>() {
  @Override
  public MapLoader<Object, Object> newMapStore( String mapName, Properties properties ) {
    return null;
  }
});

To initialize the MapLoader implementation with the given map name, configuration properties and the Hazelcast instance, implement the MapLoaderLifecycleSupport interface. This interface has the methods init() and destroy().

The method init() initializes the MapLoader implementation. Hazelcast calls this method when the map is first used on the Hazelcast instance. The MapLoader implementation can initialize the required resources for implementing MapLoader such as reading a configuration file or creating a database connection.

Hazelcast calls the method destroy() before shutting down. You can override this method to cleanup the resources held by this MapLoader implementation, such as closing the database connections.

Initializing Map on Startup

To pre-populate the in-memory map when the map is first touched/used, use the MapLoader.loadAllKeys API.

If MapLoader.loadAllKeys returns null, nothing is loaded. Your MapLoader.loadAllKeys implementation can return all or some of the keys. For example, you may select and return only the keys that are most important to you and that you want loaded while the map is initialized. MapLoader.loadAllKeys is the fastest way of pre-populating the map since Hazelcast optimizes the loading process by having each cluster member load its owned portion of the entries.

The InitialLoadMode configuration parameter in the class MapStoreConfig has two values: LAZY and EAGER. If InitialLoadMode is set to LAZY, data is not loaded during the map creation. If it is set to EAGER, all the data is loaded while the map is created and everything becomes ready to use. Also, if you add indices to your map with the MapIndexConfig class or the addIndex method, then InitialLoadMode is overridden and MapStoreConfig behaves as if EAGER mode is on.

Here is the MapLoader initialization flow:

  1. When getMap() is first called from any member, initialization will start depending on the value of InitialLoadMode. If it is set to EAGER, initialization starts on all partitions as soon as the map is touched, i.e., all partitions will be loaded when getMap is called. If it is set to LAZY, data will be loaded partition by partition, i.e., each partition will be loaded with its first touch.

  2. Hazelcast will call MapLoader.loadAllKeys() to get all your keys on one of the members.

  3. That member will distribute keys to all other members in batches.

  4. Each member will load values of all its owned keys by calling MapLoader.loadAll(keys).

  5. Each member puts its owned entries into the map by calling IMap.putTransient(key,value).
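
Steps 2 to 4 of the flow above can be sketched in plain Java as follows. This is a simplified illustration, not Hazelcast code: InitialLoadSketch is a hypothetical class, and the ownership rule (key modulo member count) is an assumption made for the sketch; Hazelcast assigns keys to partitions by its own rules.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the initial load flow; all names here are hypothetical.
class InitialLoadSketch {
    // Assumed ownership rule for this sketch only: key modulo member count.
    static int ownerOf(long key, int memberCount) {
        return (int) Math.floorMod(key, (long) memberCount);
    }

    // One member obtains all keys and groups them by their owner member.
    static Map<Integer, List<Long>> distribute(Iterable<Long> allKeys, int memberCount) {
        Map<Integer, List<Long>> keysByMember = new TreeMap<>();
        for (Long key : allKeys) {
            keysByMember.computeIfAbsent(ownerOf(key, memberCount), m -> new ArrayList<>()).add(key);
        }
        return keysByMember;
    }

    // Each member loads the values of its own keys; stands in for MapLoader.loadAll(keys).
    static Map<Long, String> loadAll(List<Long> keys, Map<Long, String> dataStore) {
        Map<Long, String> loaded = new LinkedHashMap<>();
        for (Long key : keys) {
            String value = dataStore.get(key);
            if (value != null) {
                loaded.put(key, value);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        Map<Long, String> dataStore = new TreeMap<>();
        for (long i = 1; i <= 6; i++) {
            dataStore.put(i, "v" + i);
        }
        Map<Integer, List<Long>> keysByMember = distribute(dataStore.keySet(), 3);
        keysByMember.forEach((member, keys) ->
                System.out.println("member " + member + " loads " + loadAll(keys, dataStore)));
    }
}
```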

If the load mode is LAZY and the clear() method is called (which triggers MapStore.deleteAll()), Hazelcast removes ONLY the loaded entries from your map and data store. Since not all the data is loaded in this case (LAZY mode), please note that there may still be entries in your data store.
If you do not want the MapStore to start loading as soon as the first cluster member starts, you can use the system property hazelcast.initial.min.cluster.size. For example, if you set its value to 3, the loading process is blocked until all three members are completely up.
The return type of loadAllKeys() was changed from Set to Iterable with the release of Hazelcast 3.5. MapLoader implementations from previous releases are also supported and do not need to be adapted.
Loading Keys Incrementally

If the number of keys to load is large, it is more efficient to load them incrementally rather than loading them all at once. To support incremental loading, the MapLoader.loadAllKeys() method returns an Iterable which can be lazily populated with the results of a database query.

Hazelcast iterates over the Iterable and, while doing so, sends out the keys to their respective owner members. The Iterator obtained from MapLoader.loadAllKeys() may also implement the Closeable interface, in which case Iterator is closed once the iteration is over. This is intended for releasing resources such as closing a JDBC result set.
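
A lazily populated Iterable whose Iterator also implements Closeable might look like the following plain-Java sketch. It generates keys from a counter instead of a database cursor, and the consume method only imitates the "iterate, then close" behavior described above. All names are hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Plain-Java illustration of incremental key loading; all names here are hypothetical.
class LazyKeysSketch implements Iterable<Long> {
    private final long count;
    boolean closed = false; // set when the iterator is closed
    long produced = 0;      // number of keys handed out so far

    LazyKeysSketch(long count) {
        this.count = count;
    }

    @Override
    public Iterator<Long> iterator() {
        return new KeyIterator();
    }

    // Produces keys one by one; a real loader would read from a result set here.
    class KeyIterator implements Iterator<Long>, Closeable {
        private long next = 0;

        @Override
        public boolean hasNext() {
            return next < count;
        }

        @Override
        public Long next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            produced++;
            return next++;
        }

        @Override
        public void close() {
            closed = true; // a real loader would release its database resources here
        }
    }

    // Imitates the consumer side: iterate over all keys, then close the iterator if possible.
    static long consume(Iterable<Long> keys) {
        long sum = 0;
        Iterator<Long> it = keys.iterator();
        try {
            while (it.hasNext()) {
                sum += it.next();
            }
        } finally {
            if (it instanceof Closeable) {
                try {
                    ((Closeable) it).close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        LazyKeysSketch keys = new LazyKeysSketch(5);
        System.out.println("sum of keys: " + consume(keys));
        System.out.println("closed: " + keys.closed);
    }
}
```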

Forcing All Keys To Be Loaded

The loadAll method loads some or all entries from the data store into the map; because it is a batch operation, an implementation can optimize the multiple load operations. The method has two signatures: one loads the given keys and the other loads all keys. Please see the example code below.

final int numberOfEntriesToAdd = 1000;
final String mapName = LoadAll.class.getCanonicalName();
final Config config = createNewConfig(mapName);
final HazelcastInstance node = Hazelcast.newHazelcastInstance(config);
final IMap<Integer, Integer> map = node.getMap(mapName);

populateMap(map, numberOfEntriesToAdd);
System.out.printf("# Map store has %d elements\n", numberOfEntriesToAdd);

map.evictAll();
System.out.printf("# After evictAll map size\t: %d\n", map.size());

map.loadAll(true);
System.out.printf("# After loadAll map size\t: %d\n", map.size());
Post-Processing Objects in Map Store

In some scenarios, you may need to modify the object after storing it into the map store. For example, your database may auto-generate an ID or version, and you then need to update the object stored in the distributed map without breaking the synchronization between the database and the data grid.

To post-process an object in the map store, implement the PostProcessingMapStore interface to put the modified object into the distributed map. This will trigger an extra step of Serialization, so use it only when needed. (This is only valid when using the write-through map store configuration.)

Here is an example of post processing map store:

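class ProcessingStore implements MapStore<Integer, Employee>, PostProcessingMapStore {
    @Override
    public void store( Integer key, Employee employee ) {
        // Persist the employee and obtain the ID generated by the database.
        EmployeeId id = saveEmployee( employee );
        // Write the generated ID back so that the map holds the post-processed object.
        employee.setId( id.getId() );
    }
}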
Please note that if you are using a post processing map store in combination with entry processors, post-processed values will not be carried to backups.
Accessing a Database Using Properties

You can prepare your own MapLoader to access a database such as Cassandra and MongoDB. For this, you can first declaratively specify the database properties in your hazelcast.xml configuration file and then implement the MapLoaderLifecycleSupport interface to pass those properties.

You can define the database properties, such as its URL and name, using the properties configuration element. The following is a configuration example for MongoDB:

<map name="supplements">
    <map-store enabled="true" initial-mode="LAZY">
        <class-name>com.hazelcast.loader.YourMapStoreImplementation</class-name>
        <properties>
            <property name="mongo.url">mongodb://localhost:27017</property>
            <property name="mongo.db">mydb</property>
            <property name="mongo.collection">supplements</property>
        </properties>
    </map-store>
</map>

After specifying the database properties in your configuration, you need to implement the MapLoaderLifecycleSupport interface and read those properties in the init() method, as shown below:

public class YourMapStoreImplementation implements MapStore<String, Supplement>, MapLoaderLifecycleSupport {

    private MongoClient mongoClient;
    private MongoCollection collection;

    public YourMapStoreImplementation() {
    }

    @Override
    public void init(HazelcastInstance hazelcastInstance, Properties properties, String mapName) {
        String mongoUrl = (String) properties.get("mongo.url");
        String dbName = (String) properties.get("mongo.db");
        String collectionName = (String) properties.get("mongo.collection");
        this.mongoClient = new MongoClient(new MongoClientURI(mongoUrl));
        this.collection = mongoClient.getDatabase(dbName).getCollection(collectionName);
    }

You can refer to the full example here.

7.2.8. Creating Near Cache for Map

The Hazelcast distributed map supports a local Near Cache for remotely stored entries to increase the performance of local read operations. Please refer to the Near Cache section for a detailed explanation of the Near Cache feature and its configuration.

7.2.9. Locking Maps

Hazelcast Distributed Map (IMap) is thread-safe. When your thread-safety requirements grow or you want more control over concurrency, consider the Hazelcast solutions described here.

Let’s work on a sample case as shown below.

public class RacyUpdateMember {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Value> map = hz.getMap( "map" );
        String key = "1";
        map.put( key, new Value() );
        System.out.println( "Starting" );
        for ( int k = 0; k < 1000; k++ ) {
            if ( k % 100 == 0 ) System.out.println( "At: " + k );
            Value value = map.get( key );
            Thread.sleep( 10 );
            value.amount++;
            map.put( key, value );
        }
        System.out.println( "Finished! Result = " + map.get(key).amount );
    }

    static class Value implements Serializable {
        public int amount;
    }
}

If the above code is run by more than one cluster member simultaneously, a race condition is likely. You can solve this condition with Hazelcast using either pessimistic locking or optimistic locking.

Pessimistic Locking

One way to solve the race issue is by using pessimistic locking - lock the map entry until you are finished with it.

To perform pessimistic locking, use the lock mechanism provided by the Hazelcast distributed map, i.e., the map.lock and map.unlock methods. See the below example code.

public class PessimisticUpdateMember {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Value> map = hz.getMap( "map" );
        String key = "1";
        map.put( key, new Value() );
        System.out.println( "Starting" );
        for ( int k = 0; k < 1000; k++ ) {
            map.lock( key );
            try {
                Value value = map.get( key );
                Thread.sleep( 10 );
                value.amount++;
                map.put( key, value );
            } finally {
                map.unlock( key );
            }
        }
        System.out.println( "Finished! Result = " + map.get( key ).amount );
    }

    static class Value implements Serializable {
        public int amount;
    }
}

The IMap lock will automatically be collected by the garbage collector when the lock is released and no other waiting conditions exist on the lock.

The IMap lock is reentrant, but it does not support fairness.

Another way to solve the race issue is by acquiring a predictable Lock object from Hazelcast. This way, every value in the map can be given a lock, or you can create a stripe of locks.
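
The idea of a stripe of locks can be illustrated with plain java.util.concurrent locks. The sketch below is an analogy only: it uses local ReentrantLock objects rather than Hazelcast's distributed locks, and LockStripeSketch and lockFor are hypothetical names.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Plain-Java illustration of lock striping; all names here are hypothetical.
class LockStripeSketch {
    private final Lock[] stripes;

    LockStripeSketch(int stripeCount) {
        stripes = new Lock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    // The same key always maps to the same lock; different keys may share one.
    Lock lockFor(Object key) {
        return stripes[Math.floorMod(key.hashCode(), stripes.length)];
    }

    public static void main(String[] args) {
        LockStripeSketch locks = new LockStripeSketch(16);
        Lock lock = locks.lockFor("1");
        lock.lock();
        try {
            System.out.println("holding the stripe for key 1");
        } finally {
            lock.unlock();
        }
    }
}
```

A fixed number of stripes bounds the number of lock objects, at the cost of occasional contention between unrelated keys that fall on the same stripe.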

Optimistic Locking

In Hazelcast, you can apply the optimistic locking strategy with the map’s replace method. This method compares values in object or data forms depending on the in-memory format configuration. If the values are equal, it replaces the old value with the new one. If you want to use your defined equals method, in-memory-format should be OBJECT. Otherwise, Hazelcast serializes objects to BINARY forms and compares them.

See the below example code.

The below example code is intentionally broken.
public class OptimisticMember {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Value> map = hz.getMap( "map" );
        String key = "1";
        map.put( key, new Value() );
        System.out.println( "Starting" );
        for ( int k = 0; k < 1000; k++ ) {
            if ( k % 10 == 0 ) System.out.println( "At: " + k );
            for (; ; ) {
                Value oldValue = map.get( key );
                Value newValue = new Value( oldValue );
                Thread.sleep( 10 );
                newValue.amount++;
                if ( map.replace( key, oldValue, newValue ) )
                    break;
            }
        }
        System.out.println( "Finished! Result = " + map.get( key ).amount );
    }

    static class Value implements Serializable {
        public int amount;

        public Value() {
        }

        public Value( Value that ) {
            this.amount = that.amount;
        }

        public boolean equals( Object o ) {
            if ( o == this ) return true;
            if ( !( o instanceof Value ) ) return false;
            Value that = ( Value ) o;
            return that.amount == this.amount;
        }
    }
}
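
The compare-and-replace retry pattern can also be tried without a cluster. The following is a minimal plain-Java sketch that uses ConcurrentHashMap.replace(key, oldValue, newValue) from the JDK as a stand-in for the map's replace method. It is an analogy, not Hazelcast code, and the class and method names are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Plain-Java illustration of optimistic updates; all names here are hypothetical.
class OptimisticRetrySketch {
    // Increments the value under the key, retrying until the replace succeeds.
    // Returns the number of retries that were needed.
    static int increment(ConcurrentMap<String, Integer> map, String key) {
        int retries = 0;
        for (;;) {
            Integer oldValue = map.get(key);
            Integer newValue = oldValue + 1;
            if (map.replace(key, oldValue, newValue)) {
                return retries; // nobody changed the value in between
            }
            retries++;          // somebody else won the race: read again and retry
        }
    }

    static int run(int threads, int incrementsPerThread) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("1", 0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < incrementsPerThread; k++) {
                    increment(map, "1");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("1");
    }

    public static void main(String[] args) {
        System.out.println("Finished! Result = " + run(4, 1000)); // 4000, no update is lost
    }
}
```
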
Pessimistic vs. Optimistic Locking

The locking strategy you choose will depend on your locking requirements.

Optimistic locking is better for mostly read-only systems, where it performs better than pessimistic locking.

Pessimistic locking is good if there are lots of updates on the same key. It is more robust than optimistic locking from the perspective of data consistency.

In Hazelcast, use IExecutorService to submit a task to a key owner, or to a member or members. This is the recommended way to perform task executions, rather than using pessimistic or optimistic locking techniques. IExecutorService will have fewer network hops and less data over wire, and tasks will be executed very near to the data. Please refer to the Data Affinity section.

Solving the ABA Problem

The ABA problem occurs in environments when a shared resource is open to change by multiple threads. Even if one thread sees the same value for a particular key in consecutive reads, it does not mean that nothing has changed between the reads. Another thread may change the value, do work and change the value back, while the first thread thinks that nothing has changed.

To prevent this kind of problem, you can assign a version number and check it before any write to be sure that nothing has changed between consecutive reads. Although all the other fields will be equal, the version field will prevent objects from being seen as equal. This is the optimistic locking strategy; it is used in environments that do not expect intensive concurrent changes on a specific key.

In Hazelcast, you can apply the optimistic locking strategy with the map replace method.
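
The effect of a version field can be shown with a small plain-Java sketch. It uses ConcurrentHashMap.replace from the JDK as a stand-in for the map's replace method; VersionedValueSketch, Versioned and staleWriteSucceeds are hypothetical names introduced only for this illustration.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Plain-Java illustration of the ABA problem; all names here are hypothetical.
class VersionedValueSketch {
    static final class Versioned {
        final int amount;
        final long version;

        Versioned(int amount, long version) {
            this.amount = amount;
            this.version = version;
        }

        @Override
        public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof Versioned)) return false;
            Versioned that = (Versioned) o;
            return amount == that.amount && version == that.version;
        }

        @Override
        public int hashCode() {
            return Objects.hash(amount, version);
        }
    }

    // Returns true if a write based on a stale read still succeeds.
    static boolean staleWriteSucceeds(boolean bumpVersion) {
        ConcurrentMap<String, Versioned> map = new ConcurrentHashMap<>();
        Versioned seen = new Versioned(100, 0);
        map.put("k", seen);
        long step = bumpVersion ? 1 : 0;

        // Another writer changes the amount 100 -> 50 -> 100 behind the reader's back.
        Versioned b = new Versioned(50, seen.version + step);
        map.replace("k", seen, b);
        Versioned a = new Versioned(100, b.version + step);
        map.replace("k", b, a);

        // The reader now writes based on its stale snapshot.
        return map.replace("k", seen, new Versioned(101, seen.version + step));
    }

    public static void main(String[] args) {
        System.out.println("without version: " + staleWriteSucceeds(false)); // true
        System.out.println("with version   : " + staleWriteSucceeds(true));  // false
    }
}
```

Without the version bump the stale write goes through because the current value equals the one read earlier; with the bump the values differ in their version and the replace is rejected.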

Lock Split-Brain Protection with Pessimistic Locking

Locks can be configured to check the number of currently present members before applying a locking operation. If the check fails, the lock operation will fail with a QuorumException (see Split-Brain Protection). As pessimistic locking uses lock operations internally, it will also use the configured lock quorum. This means that you can configure a lock quorum with the same name or a pattern that matches the map name. Note that the quorum for IMap locking actions can be different from the quorum for other IMap actions.

The following actions will then check for lock quorum before being applied:

  • IMap.lock(K) and IMap.lock(K, long, java.util.concurrent.TimeUnit)

  • IMap.isLocked()

  • IMap.tryLock(K), IMap.tryLock(K, long, java.util.concurrent.TimeUnit) and IMap.tryLock(K, long, java.util.concurrent.TimeUnit, long, java.util.concurrent.TimeUnit)

  • IMap.unlock()

  • IMap.forceUnlock()

  • MultiMap.lock(K) and MultiMap.lock(K, long, java.util.concurrent.TimeUnit)

  • MultiMap.isLocked()

  • MultiMap.tryLock(K), MultiMap.tryLock(K, long, java.util.concurrent.TimeUnit) and MultiMap.tryLock(K, long, java.util.concurrent.TimeUnit, long, java.util.concurrent.TimeUnit)

  • MultiMap.unlock()

  • MultiMap.forceUnlock()

An example of declarative configuration:

<map name="myMap">
  <quorum-ref>map-actions-quorum</quorum-ref>
</map>

<lock name="myMap">
    <quorum-ref>map-lock-actions-quorum</quorum-ref>
</lock>

Here the configured map will use the map-lock-actions-quorum quorum for map lock actions and the map-actions-quorum quorum for other map actions.

7.2.10. Accessing Entry Statistics

Hazelcast keeps statistics about each map entry, such as creation time, last update time, last access time, number of hits and version. To access the map entry statistics, use an IMap.getEntryView(key) call. Here is an example.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
EntryView entry = hz.getMap( "quotes" ).getEntryView( "1" );
System.out.println ( "size in memory  : " + entry.getCost() );
System.out.println ( "creationTime    : " + entry.getCreationTime() );
System.out.println ( "expirationTime  : " + entry.getExpirationTime() );
System.out.println ( "number of hits  : " + entry.getHits() );
System.out.println ( "lastAccessedTime: " + entry.getLastAccessTime() );
System.out.println ( "lastUpdateTime  : " + entry.getLastUpdateTime() );
System.out.println ( "version         : " + entry.getVersion() );
System.out.println ( "key             : " + entry.getKey() );
System.out.println ( "value           : " + entry.getValue() );

7.2.12. Listening to Map Entries with Predicates

You can listen to the modifications performed on specific map entries. You can think of it as an entry listener with predicates. Please see the Listening for Map Events section for information on how to add entry listeners to a map.

The default backwards-compatible event publishing strategy only publishes UPDATED events when map entries are updated to a value that matches the predicate with which the listener was registered. This implies that when using the default event publishing strategy, your listener will not be notified about an entry whose value is updated from one that matches the predicate to a new value that does not match the predicate.

Since version 3.7, when you configure Hazelcast members with the property hazelcast.map.entry.filtering.natural.event.types set to true, the handling of entry updates conceptually treats a value transition as an entry into, an update within or an exit from the predicate value space. The following table compares how a listener is notified about an update to a map entry value under the default backwards-compatible Hazelcast behavior (when the property hazelcast.map.entry.filtering.natural.event.types is not set or is set to false) versus when it is set to true:

| Value transition | Default | hazelcast.map.entry.filtering.natural.event.types = true |
|---|---|---|
| When old value matches predicate, new value does not match predicate | No event is delivered to entry listener | REMOVED event is delivered to entry listener |
| When old value matches predicate, new value matches predicate | UPDATED event is delivered to entry listener | UPDATED event is delivered to entry listener |
| When old value does not match predicate, new value does not match predicate | No event is delivered to entry listener | No event is delivered to entry listener |
| When old value does not match predicate, new value matches predicate | UPDATED event is delivered to entry listener | ADDED event is delivered to entry listener |
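
The four cases in the table above can be summarized as a small decision function. The following plain-Java sketch only restates the table; PredicateEventSketch and eventForUpdate are hypothetical names and not part of the Hazelcast API.

```java
// Plain-Java restatement of the event table; all names here are hypothetical.
class PredicateEventSketch {
    // Returns the event type delivered for an update, or null when no event is delivered.
    static String eventForUpdate(boolean oldMatches, boolean newMatches, boolean naturalEventTypes) {
        if (naturalEventTypes) {
            if (oldMatches && newMatches) return "UPDATED"; // stays inside the predicate space
            if (oldMatches) return "REMOVED";               // exits the predicate space
            if (newMatches) return "ADDED";                 // enters the predicate space
            return null;                                    // never matched
        }
        // Default behavior: only the new value is considered.
        return newMatches ? "UPDATED" : null;
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        for (boolean oldMatches : flags) {
            for (boolean newMatches : flags) {
                System.out.println("old=" + oldMatches + ", new=" + newMatches
                        + " -> default: " + eventForUpdate(oldMatches, newMatches, false)
                        + ", natural: " + eventForUpdate(oldMatches, newMatches, true));
            }
        }
    }
}
```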

As an example, let’s listen to the changes made on an employee with the surname "Smith". First, let’s create the Employee class.

public class Employee implements Serializable {

    private final String surname;

    public Employee(String surname) {
        this.surname = surname;
    }

    @Override
    public String toString() {
        return "Employee{" +
                "surname='" + surname + '\'' +
                '}';
    }
}

Then, let’s create a listener with predicate by adding a listener that tracks ADDED, UPDATED and REMOVED entry events with the surname predicate.

public class ListenerWithPredicate {

    public static void main(String[] args) {
        Config config = new Config();
        config.setProperty("hazelcast.map.entry.filtering.natural.event.types", "true");
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        IMap<String, String> map = hz.getMap("map");
        map.addEntryListener(new MyEntryListener(),
                new SqlPredicate("surname=smith"), true);
        System.out.println("Entry Listener registered");
    }

    static class MyEntryListener
            implements EntryAddedListener<String, String>,
            EntryUpdatedListener<String, String>,
            EntryRemovedListener<String, String> {
        @Override
        public void entryAdded(EntryEvent<String, String> event) {
            System.out.println("Entry Added:" + event);
        }

        @Override
        public void entryRemoved(EntryEvent<String, String> event) {
            System.out.println("Entry Removed:" + event);
        }

        @Override
        public void entryUpdated(EntryEvent<String, String> event) {
            System.out.println("Entry Updated:" + event);
        }

    }
}

And now, let’s play with the employee "smith" and see how that employee will be listened to.

public class Modify {

    public static void main(String[] args) {
        Config config = new Config();
        config.setProperty("hazelcast.map.entry.filtering.natural.event.types", "true");
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        IMap<String, Employee> map = hz.getMap("map");

        map.put("1", new Employee("smith"));
        map.put("2", new Employee("jordan"));
        System.out.println("done");
        System.exit(0);
    }
}

When you first run the class ListenerWithPredicate and then run Modify, you will see output similar to the listing below.

entryAdded:EntryEvent {Address[192.168.178.10]:5702} key=1,oldValue=null,
value=Person{name= smith }, event=ADDED, by Member [192.168.178.10]:5702
Please refer to Continuous Query Cache for more information.

7.2.13. Removing Map Entries in Bulk with Predicates

You can remove all map entries that match your predicate. For this, Hazelcast offers the method removeAll(). Its syntax is as follows:

void removeAll(Predicate<K, V> predicate);

Normally, the map entries matching the predicate are found with a full scan of the map. If the entries are indexed, Hazelcast uses the index search to find them. With an index, you can expect finding the entries to be faster.

When removeAll() is called, ALL entries in the caller member’s Near Cache are also removed.

7.2.14. Adding Interceptors

You can add interceptors to map operations and execute your own business logic synchronously, blocking the operations. You can change the value returned from a get operation, change the value in a put operation, or cancel operations by throwing an exception.

Interceptors are different from listeners. With listeners, you take an action after the operation has been completed. Interceptor actions are synchronous and you can alter the behavior of operation, change its values, or totally cancel it.

Map interceptors are chained, so adding the same interceptor multiple times to the same map can result in duplicate effects. This can easily happen when the interceptor is added to the map at member initialization, so that each member adds the same interceptor. When you add the interceptor in this way, be sure to implement the hashCode() method to return the same value for every instance of the interceptor. It is not strictly necessary, but it is a good idea to also implement equals() as this will ensure that the map interceptor can be removed reliably.

The IMap API has two methods for adding and removing an interceptor to the map: addInterceptor and removeInterceptor. Please also refer to the MapInterceptor interface to see the methods used to intercept the changes in a map.

The following is an example usage.

public class MapInterceptorMember {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap("themap");
        map.addInterceptor(new MyMapInterceptor());

        map.put("1", "1");
        System.out.println(map.get("1"));
    }

    private static class MyMapInterceptor implements MapInterceptor {

        @Override
        public Object interceptGet(Object value) {
            return value + "-foo";
        }

        @Override
        public void afterGet(Object value) {
        }

        @Override
        public Object interceptPut(Object oldValue, Object newValue) {
            return null;
        }

        @Override
        public void afterPut(Object value) {
        }

        @Override
        public Object interceptRemove(Object removedValue) {
            return null;
        }

        @Override
        public void afterRemove(Object value) {
        }
    }
}

7.2.15. Preventing Out of Memory Exceptions

It is very easy to trigger an out of memory exception (OOME) with query-based map methods, especially with large clusters or heap sizes. For example, on a cluster with five members having 10 GB of data and 25 GB heap size per member, a single call of IMap.entrySet() fetches 50 GB of data and crashes the calling instance.

A call of IMap.values() may return too much data for a single member. This can also happen with a real query and an unlucky choice of predicates, especially when the parameters are chosen by a user of your application.

To prevent this, you can configure a maximum result size limit for query based operations. This is not a limit like SELECT * FROM map LIMIT 100, which you can achieve by a Paging Predicate. A maximum result size limit for query based operations is meant to be a last line of defense to prevent your members from retrieving more data than they can handle.

The Hazelcast component which calculates this limit is the QueryResultSizeLimiter.

Setting Query Result Size Limit

If the QueryResultSizeLimiter is activated, it calculates a result size limit per partition. Each QueryOperation runs on all partitions of a member, so it collects result entries as long as the member limit is not exceeded. If that happens, a QueryResultSizeExceededException is thrown and propagated to the calling instance.

This feature depends on an equal distribution of the data on the cluster members to calculate the result size limit per member. Therefore, there is a minimum value defined in QueryResultSizeLimiter.MINIMUM_MAX_RESULT_LIMIT. Configured values below the minimum will be increased to the minimum.

Local Pre-check

In addition to the distributed result size check in the QueryOperations, there is a local pre-check on the calling instance. If you call the method from a client, the pre-check is executed on the member that invokes the QueryOperations.

Since the local pre-check can increase the latency of a QueryOperation, you can configure how many local partitions should be considered for the pre-check, or you can deactivate the feature completely.

Scope of Result Size Limit

Besides the designated query operations, there are other operations that use predicates internally. Those method calls will throw the QueryResultSizeExceededException as well. Please see the following matrix to see the methods that are covered by the query result size limit.

Methods Covered by Query Result Size Limit

Configuring Query Result Size

The query result size limit is configured via the following system properties.

  • hazelcast.query.result.size.limit: Result size limit for query operations on maps. This value defines the maximum number of returned elements for a single query result. If a query exceeds this number of elements, a QueryResultSizeExceededException is thrown.

  • hazelcast.query.max.local.partition.limit.for.precheck: Maximum value of local partitions to trigger local pre-check for TruePredicate query operations on maps.

Please refer to the System Properties appendix to see the full descriptions of these properties and how to set them.
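As a hedged sketch, the two properties above could also be set programmatically on the member configuration before the instance is started. The class name and the values 200000 and 3 are illustrative assumptions, not recommendations; choose limits that fit your data and heap size.

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

// Hypothetical class name; the property names are the ones listed above.
public class QueryLimitConfigExample {

    public static void main(String[] args) {
        Config config = new Config();

        // Illustrative values only.
        config.setProperty("hazelcast.query.result.size.limit", "200000");
        config.setProperty("hazelcast.query.max.local.partition.limit.for.precheck", "3");

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
    }
}
```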

7.3. Queue

Hazelcast distributed queue is an implementation of java.util.concurrent.BlockingQueue. Being distributed, Hazelcast distributed queue enables all cluster members to interact with it. Using Hazelcast distributed queue, you can add an item in one cluster member and remove it from another one.

7.3.1. Getting a Queue and Putting Items

Use the Hazelcast instance’s getQueue method to get the queue, then use the queue’s put method to put items into the queue.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
BlockingQueue<MyTask> queue = hazelcastInstance.getQueue( "tasks" );
queue.put( new MyTask() );
MyTask task = queue.take();

boolean offered = queue.offer( new MyTask(), 10, TimeUnit.SECONDS );
task = queue.poll( 5, TimeUnit.SECONDS );
if ( task != null ) {
    //process task
}

FIFO ordering will apply to all queue operations across the cluster. The user objects (such as MyTask in the example above) that are enqueued or dequeued have to be Serializable.

Hazelcast distributed queue performs no batching while iterating over the queue. All items will be copied locally and iteration will occur locally.

Hazelcast distributed queue uses ItemListener to listen to the events that occur when items are added to and removed from the queue. Please refer to the Listening for Item Events section for information on how to create an item listener class and register it.

7.3.2. Creating an Example Queue

The following example code illustrates a distributed queue that connects a producer and consumer.

Putting Items on the Queue

Let’s put one integer on the queue every second, 100 integers total.

public class ProducerMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IQueue<Integer> queue = hz.getQueue( "queue" );
        for ( int k = 1; k <= 100; k++ ) {
            queue.put( k );
            System.out.println( "Producing: " + k );
            Thread.sleep(1000);
        }
        queue.put( -1 );
        System.out.println( "Producer Finished!" );
    }
}

Producer puts a -1 on the queue to show that the put operations are finished.

Taking Items off the Queue

Now, let’s create a Consumer class to take a message from this queue, as shown below.

public class ConsumerMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IQueue<Integer> queue = hz.getQueue( "queue" );
        while ( true ) {
            int item = queue.take();
            System.out.println( "Consumed: " + item );
            if ( item == -1 ) {
                queue.put( -1 );
                break;
            }
            Thread.sleep( 5000 );
        }
        System.out.println( "Consumer Finished!" );
    }
}

As seen in the above example code, Consumer waits five seconds before it consumes the next message. It stops once it receives -1. Also note that Consumer puts -1 back on the queue before the loop is ended.

When you first start Producer and then start Consumer, items produced on the queue will be consumed from the same queue.

Balancing the Queue Operations

From the above example code, you can see that an item is produced every second and consumed every five seconds. Therefore, the queue keeps growing. To balance the produce/consume operation, let’s start another consumer. This way, consumption is distributed to these two consumers, as seen in the sample outputs below.

The second consumer is started. After a while, here is the first consumer output:

...
Consumed: 13
Consumed: 15
Consumed: 17
...

Here is the second consumer output:

...
Consumed: 14
Consumed: 16
Consumed: 18
...
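With more than one consumer, putting -1 back on the queue matters: it lets every other consumer see the end marker too. The following is a minimal local sketch of that pattern. It uses java.util.concurrent.LinkedBlockingQueue as a stand-in for the distributed queue, and the class and method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoisonPillDemo {

    static final int POISON_PILL = -1;

    // Runs one producer and several consumers on a shared queue and returns
    // how many real items (everything except the pill) were consumed.
    static int run(int itemCount, int consumers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger consumed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumers);

        for (int c = 0; c < consumers; c++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        int item = queue.take();
                        if (item == POISON_PILL) {
                            // Leave the pill for the remaining consumers.
                            queue.put(POISON_PILL);
                            return;
                        }
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (int k = 1; k <= itemCount; k++) {
                queue.put(k);
            }
            queue.put(POISON_PILL);
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("consumers did not stop in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println("Consumed items: " + run(20, 2)); // Consumed items: 20
    }
}
```

If a consumer did not put the pill back, only the consumer that happened to take it would stop, and the others would block in take() forever.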

In the case of a lot of producers and consumers for the queue, using a list of queues may solve the queue bottlenecks. In this case, be aware that the order of the messages sent to different queues is not guaranteed. Since in most cases strict ordering is not important, a list of queues is a good solution.
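One way to sketch a list of queues is to derive the queue name from a routing key, so that items with the same key always go to the same queue and keep their relative order. The helper below is hypothetical and not part of the Hazelcast API; it only computes names.

```java
final class QueueStriping {

    // Maps a routing key to one of `stripes` queue names,
    // for example "tasks-0" to "tasks-3" when stripes is 4.
    static String queueNameFor(String baseName, Object routingKey, int stripes) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return baseName + "-" + Math.floorMod(routingKey.hashCode(), stripes);
    }

    public static void main(String[] args) {
        for (String customer : new String[] {"alice", "bob", "carol"}) {
            System.out.println(customer + " -> " + queueNameFor("tasks", customer, 4));
        }
    }
}
```

Each computed name could then be passed to the getQueue method of the Hazelcast instance. Ordering holds only within one queue, not across the whole list.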

The items are taken from the queue in the same order they were put on the queue. However, if there is more than one consumer, this order is not guaranteed.

ItemIDs When Offering Items

Hazelcast gives an itemId for each item you offer, which is an incrementing sequence identification for the queue items. You should consider the following to understand the itemId assignment behavior:

  • When a Hazelcast member has a queue and that queue is configured to have at least one backup, and that member is restarted, the itemId assignment resumes from the last known highest itemId before the restart; itemId assignment does not start from the beginning for the new items.

  • When the whole cluster is restarted, the same behavior explained in the above consideration applies if your queue has a persistent data store (QueueStore). If the queue has a QueueStore, the itemId values for the new items start from the highest itemId found in the IDs returned by the method loadAllKeys. If the method loadAllKeys does not return anything, the itemId values start from the beginning after a cluster restart.

  • The above two considerations mean there will be no duplicated `itemId`s in the memory or in the persistent data store.

7.3.3. Setting a Bounded Queue

A bounded queue is a queue with a limited capacity. When the bounded queue is full, no more items can be put into the queue until some items are taken out.

To turn a Hazelcast distributed queue into a bounded queue, set the capacity limit with the max-size property. You can set the max-size property in the configuration, as shown below. max-size specifies the maximum size of the queue. Once the queue size reaches this value, put operations will be blocked until the queue size goes below max-size, which happens when a consumer removes items from the queue.

Let’s set 10 as the maximum size of our example queue in Creating an Example Queue.

<hazelcast>
  ...
  <queue name="queue">
    <max-size>10</max-size>
  </queue>
  ...
</hazelcast>

When the producer is started, ten items are put into the queue and then the queue will not allow more put operations. When the consumer is started, it will remove items from the queue. This means that the producer can put more items into the queue until there are ten items in the queue again, at which point the put operation again becomes blocked.

In this example code, the producer is five times faster than the consumer. It will effectively always be waiting for the consumer to remove items before it can put more on the queue. For this example code, if maximum throughput is the goal, it would be a good option to start multiple consumers to prevent the queue from filling up.
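The blocking behavior described above comes from the BlockingQueue contract. The following local sketch uses java.util.concurrent.ArrayBlockingQueue with a capacity of 10 as a stand-in for the distributed queue with max-size 10. The class and method names are hypothetical. It uses offer with a timeout instead of put so that the example ends instead of blocking.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class BoundedQueueDemo {

    // Tries to enqueue `attempts` items and returns how many were accepted.
    static int fill(BlockingQueue<Integer> queue, int attempts) {
        int accepted = 0;
        try {
            for (int k = 1; k <= attempts; k++) {
                // offer with a timeout gives up on a full queue,
                // whereas put would block until space is available.
                if (queue.offer(k, 10, TimeUnit.MILLISECONDS)) {
                    accepted++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        System.out.println("Accepted: " + fill(queue, 15));            // Accepted: 10
        queue.poll(); // a consumer frees one slot
        System.out.println("Accepted after poll: " + fill(queue, 15)); // Accepted after poll: 1
    }
}
```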

7.3.4. Queueing with Persistent Datastore

Hazelcast allows you to load and store the distributed queue items from/to a persistent datastore using the interface QueueStore. If queue store is enabled, each item added to the queue will also be stored at the configured queue store. When the number of items in the queue exceeds the memory limit, the subsequent items are persisted only in the queue store; they are not stored in the queue memory.

The QueueStore interface enables you to store, load and delete queue items with methods like store, storeAll, load and delete. The following example class includes all of the QueueStore methods.

public class TheQueueStore implements QueueStore<Item> {

    @Override
    public void delete(Long key) {
        System.out.println("delete");
    }

    @Override
    public void store(Long key, Item value) {
        System.out.println("store");
    }

    @Override
    public void storeAll(Map<Long, Item> map) {
        System.out.println("store all");
    }

    @Override
    public void deleteAll(Collection<Long> keys) {
        System.out.println("deleteAll");
    }

    @Override
    public Item load(Long key) {
        System.out.println("load");
        return null;
    }

    @Override
    public Map<Long, Item> loadAll(Collection<Long> keys) {
        System.out.println("loadAll");
        return null;
    }

    @Override
    public Set<Long> loadAllKeys() {
        System.out.println("loadAllKeys");
        return null;
    }
}

Item must be serializable. The following is an example queue store configuration.

<queue-store>
  <class-name>com.hazelcast.QueueStoreImpl</class-name>
  <properties>
    <property name="binary">false</property>
    <property name="memory-limit">1000</property>
    <property name="bulk-load">500</property>
  </properties>
</queue-store>

Let’s explain the queue store properties.

  • Binary: By default, Hazelcast stores the queue items in serialized form, and before it inserts the queue items into the queue store, it deserializes them. If you are not reaching the queue store from an external application, you might prefer that the items be inserted in binary form. Do this by setting the binary property to true: then you can get rid of the deserialization step, which is a performance optimization. The binary property is false by default.

  • Memory Limit: This is the number of items after which Hazelcast will store items only to the datastore. For example, if the memory limit is 1000, then the 1001st item will be put only to the datastore. This feature is useful when you want to avoid out-of-memory conditions. If you want to always use memory, you can set it to Integer.MAX_VALUE. The default number for memory-limit is 1000.

  • Bulk Load: When the queue is initialized, items are loaded from QueueStore in bulks. Bulk load is the size of these bulks. The default value of bulk-load is 250.

7.3.5. Split-Brain Protection for Queue

Queues can be configured to check for a minimum number of available members before applying queue operations (see Split-Brain Protection). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE

    • Collection.addAll()

    • Collection.removeAll(), Collection.retainAll()

    • BlockingQueue.offer(), BlockingQueue.add(), BlockingQueue.put()

    • BlockingQueue.drainTo()

    • IQueue.poll(), Queue.remove(), IQueue.take()

    • BlockingQueue.remove()

  • READ, READ_WRITE

    • Collection.clear()

    • Collection.containsAll(), BlockingQueue.contains()

    • Collection.isEmpty()

    • Collection.iterator(), Collection.toArray()

    • Queue.peek(), Queue.element()

    • Collection.size()

    • BlockingQueue.remainingCapacity()

7.3.6. Configuring Queue

The following are examples of queue configurations. They include the QueueStore configuration, which is explained in the Queueing with Persistent Datastore section.

Declarative:

<queue name="default">
    <max-size>0</max-size>
    <backup-count>1</backup-count>
    <async-backup-count>0</async-backup-count>
    <empty-queue-ttl>-1</empty-queue-ttl>
    <item-listeners>
        <item-listener>com.hazelcast.examples.ItemListener</item-listener>
    </item-listeners>
    <statistics-enabled>true</statistics-enabled>
    <queue-store>
        <class-name>com.hazelcast.QueueStoreImpl</class-name>
        <properties>
            <property name="binary">false</property>
            <property name="memory-limit">10000</property>
            <property name="bulk-load">500</property>
        </properties>
    </queue-store>
    <quorum-ref>quorumname</quorum-ref>
</queue>

Programmatic:

Config config = new Config();
QueueConfig queueConfig = config.getQueueConfig("default");
queueConfig.setName("MyQueue")
        .setBackupCount(1)
        .setMaxSize(0)
        .setStatisticsEnabled(true)
        .setQuorumName("quorumname");
queueConfig.getQueueStoreConfig()
        .setEnabled(true)
        .setClassName("com.hazelcast.QueueStoreImpl")
        .setProperty("binary", "false");
config.addQueueConfig(queueConfig);

Hazelcast distributed queue has one synchronous backup by default. By having this backup, when a cluster member with a queue goes down, another member having the backup of that queue will continue. Therefore, no items are lost. You can define the number of synchronous backups for a queue using the backup-count element in the declarative configuration. A queue can also have asynchronous backups: you can define the number of asynchronous backups using the async-backup-count element.

To set the maximum size of the queue, use the max-size element. To purge unused or empty queues after a period of time, use the empty-queue-ttl element. If you define a value (time in seconds) for the empty-queue-ttl element, then your queue will be destroyed if it stays empty or unused for the time in seconds that you give.

The following is the full list of queue configuration elements with their descriptions.

  • max-size: Maximum number of items in the queue. It is used to set an upper bound for the queue. You will not be able to put more items when the queue reaches this maximum size, whether you have a queue store configured or not.

  • backup-count: Number of synchronous backups. Queue is a non-partitioned data structure, so all entries of a queue reside in one partition. When this parameter is '1', it means there will be one backup of that queue in another member in the cluster. When it is '2', two members will have the backup.

  • async-backup-count: Number of asynchronous backups.

  • empty-queue-ttl: Used to purge unused or empty queues. If you define a value (time in seconds) for this element, then your queue will be destroyed if it stays empty or unused for that time.

  • item-listeners: Adds listeners (listener classes) for the queue items. You can also set the attribute include-value to true if you want the item event to contain the item values. You can set local to true if you want to listen to the items on the local member.

  • queue-store: Includes the queue store factory class name and the properties binary, memory limit and bulk load. Please refer to Queueing with Persistent Datastore.

  • statistics-enabled: If set to true, you can retrieve statistics for this queue using the method getLocalQueueStats().

  • quorum-ref: Name of quorum configuration that you want this queue to use.

7.4. MultiMap

Hazelcast MultiMap is a specialized map where you can store multiple values under a single key. Just like any other distributed data structure implementation in Hazelcast, MultiMap is distributed and thread-safe.

Hazelcast MultiMap is not an implementation of java.util.Map due to the difference in method signatures. It supports most features of Hazelcast Map except for indexing, predicates and MapLoader/MapStore. Yet, like Hazelcast Map, entries are almost evenly distributed onto all cluster members. When a new member joins the cluster, the same ownership logic used in the distributed map applies.

7.4.1. Getting a MultiMap and Putting an Entry

The following example creates a MultiMap and puts items into it. Use the HazelcastInstance getMultiMap method to get the MultiMap, then use the MultiMap put method to put an entry into the MultiMap.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
MultiMap<String, String> map = hazelcastInstance.getMultiMap( "map" );

map.put( "a", "1" );
map.put( "a", "2" );
map.put( "b", "3" );
System.out.println( "PutMember:Done" );

Now let’s print the entries in this MultiMap.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
MultiMap<String, String> map = hazelcastInstance.getMultiMap("map");

// Print every key with the collection of values stored under it.
for (String key : map.keySet()) {
    Collection<String> values = map.get(key);
    System.out.printf("%s -> %s\n", key, values);
}

After you run the first code sample, run the PrintMember sample. You will see the key a has two values, as shown below.

b -> [3]
a -> [2, 1]

Hazelcast MultiMap uses EntryListener to listen to events which occur when entries are added to, updated in or removed from the MultiMap. Please refer to the Listening for MultiMap Events section for information on how to create an entry listener class and register it.

7.4.2. Configuring MultiMap

When using MultiMap, the collection type of the values can be either Set or List. Configure the collection type with the valueCollectionType parameter. If you choose Set, duplicate and null values are not allowed in your collection and ordering is irrelevant. If you choose List, ordering is relevant and your collection can include duplicate and null values.
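The difference in duplicate handling between the two collection types can be sketched locally with plain Java collections. This is only an illustration and not Hazelcast code: HashSet and ArrayList stand in for the SET and LIST value collection types, the class and method names are hypothetical, and the null handling described above is not modeled.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.function.Supplier;

final class ValueCollectionTypeDemo {

    // Puts the same key/value pairs into a local multimap whose per-key
    // collection is created by `factory`, and returns the values under "a".
    static Collection<String> valuesForA(Supplier<Collection<String>> factory) {
        Map<String, Collection<String>> multimap = new HashMap<>();
        String[][] puts = {{"a", "1"}, {"a", "2"}, {"a", "1"}};
        for (String[] p : puts) {
            multimap.computeIfAbsent(p[0], k -> factory.get()).add(p[1]);
        }
        return multimap.get("a");
    }

    public static void main(String[] args) {
        // The duplicate "1" is dropped by the set and kept by the list.
        System.out.println("SET-like size:  " + valuesForA(HashSet::new).size()); // 2
        System.out.println("LIST-like:      " + valuesForA(ArrayList::new));      // [1, 2, 1]
    }
}
```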

You can also enable statistics for your MultiMap with the statisticsEnabled parameter. If you enable statisticsEnabled, statistics can be retrieved with getLocalMultiMapStats() method.

Currently, eviction is not supported for the MultiMap data structure.

The following are example MultiMap configurations.

Declarative:

<multimap name="default">
    <backup-count>0</backup-count>
    <async-backup-count>1</async-backup-count>
    <value-collection-type>SET</value-collection-type>
    <entry-listeners>
        <entry-listener include-value="false" local="false" >com.hazelcast.examples.EntryListener</entry-listener>
    </entry-listeners>
    <quorum-ref>quorumname</quorum-ref>
</multimap>

Programmatic:

MultiMapConfig mmConfig = new MultiMapConfig();
mmConfig.setName( "default" )
        .setBackupCount( 0 ).setAsyncBackupCount( 1 )
        .setValueCollectionType( "SET" )
        .setQuorumName( "quorumname" );

The following are the configuration elements and their descriptions:

  • backup-count: Defines the number of synchronous backups. For example, if it is set to 1, backup of a partition will be placed on one other member. If it is 2, it will be placed on two other members.

  • async-backup-count: The number of asynchronous backups. Behavior is the same as that of the backup-count element.

  • statistics-enabled: You can retrieve some statistics such as owned entry count, backup entry count, last update time and locked entry count by setting this parameter’s value as "true". The method for retrieving the statistics is getLocalMultiMapStats().

  • value-collection-type: Type of the value collection. It can be SET or LIST.

  • entry-listeners: Lets you add listeners (listener classes) for the map entries. You can also set the attribute include-value to true if you want the item event to contain the entry values. You can set local to true if you want to listen to the entries on the local member.

  • quorum-ref: Name of quorum configuration that you want this MultiMap to use. Please see the Split-Brain Protection for MultiMap and TransactionalMultiMap section.

7.4.3. Split-Brain Protection for MultiMap and TransactionalMultiMap

MultiMap and TransactionalMultiMap can be configured to check for a minimum number of available members before applying their operations (see Split-Brain Protection). This is a check to avoid performing successful MultiMap operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

MultiMap:

  • WRITE, READ_WRITE:

    • clear

    • forceUnlock

    • lock

    • put

    • remove

    • tryLock

    • unlock

  • READ, READ_WRITE:

    • containsEntry

    • containsKey

    • containsValue

    • entrySet

    • get

    • isLocked

    • keySet

    • localKeySet

    • size

    • valueCount

    • values

TransactionalMultiMap:

  • WRITE, READ_WRITE:

    • put

    • remove

  • READ, READ_WRITE:

    • size

    • get

    • valueCount

Configuring Split-Brain Protection

Split-Brain protection for MultiMap can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<multimap name="default">
    ...
    <quorum-ref>quorumname</quorum-ref>
    ...
</multimap>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.
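The following is a hedged sketch of the programmatic counterpart. It defines a quorum configuration and then references it from the MultiMap configuration with setQuorumName(). The class name, the minimum size of 2 and the READ_WRITE type are assumptions made for illustration; please see the Split-Brain Protection section for the options that fit your cluster.

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.QuorumConfig;
import com.hazelcast.quorum.QuorumType;

// Hypothetical class name; values are illustrative only.
public class MultiMapQuorumConfigExample {

    public static void main(String[] args) {
        Config config = new Config();

        // Quorum named "quorumname", enabled, requiring at least 2 members.
        QuorumConfig quorumConfig = new QuorumConfig("quorumname", true, 2);
        quorumConfig.setType(QuorumType.READ_WRITE);
        config.addQuorumConfig(quorumConfig);

        // The name must match the quorum configuration defined above.
        config.getMultiMapConfig("default").setQuorumName("quorumname");
    }
}
```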

7.5. Set

Hazelcast Set is a distributed and concurrent implementation of java.util.Set.

  • Hazelcast Set does not allow duplicate elements.

  • Hazelcast Set does not preserve the order of elements.

  • Hazelcast Set is a non-partitioned data structure: all the data that belongs to a set will live on one single partition in that member.

  • Hazelcast Set cannot be scaled beyond the capacity of a single machine. Since the whole set lives on a single partition, storing a large amount of data on a single set may cause memory pressure. Therefore, you should use multiple sets to store a large amount of data. This way, all the sets will be spread across the cluster, sharing the load.

  • A backup of Hazelcast Set is stored on a partition of another member in the cluster so that data is not lost in the event of a primary member failure.

  • All items are copied to the local member and iteration occurs locally.

  • The equals method implemented in Hazelcast Set uses a serialized byte version of objects, as opposed to java.util.HashSet.
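The last point in the list above can be illustrated with a local sketch. Two objects without an equals() override are distinct for java.util.HashSet, but identical when compared by their serialized bytes. Java serialization is used here only as a stand-in for whichever serialization is configured, and the City class and helper names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

final class SerializedEqualityDemo {

    // Deliberately has no equals/hashCode, so HashSet compares by identity.
    static final class City implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        City(String name) {
            this.name = name;
        }
    }

    // Returns the serialized form; ByteBuffer compares by content.
    static ByteBuffer serialized(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return ByteBuffer.wrap(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Set<City> byObject = new HashSet<>();
        byObject.add(new City("Tokyo"));
        byObject.add(new City("Tokyo"));

        Set<ByteBuffer> byBytes = new HashSet<>();
        byBytes.add(serialized(new City("Tokyo")));
        byBytes.add(serialized(new City("Tokyo")));

        System.out.println("HashSet of objects:      " + byObject.size()); // 2
        System.out.println("Set of serialized forms: " + byBytes.size());  // 1
    }
}
```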

7.5.1. Getting a Set and Putting Items

Use the HazelcastInstance getSet method to get the Set, then use the add method to put items into the Set.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
ISet<String> set = hz.getSet("set");
set.add("Tokyo");
set.add("Paris");
set.add("London");
set.add("New York");
System.out.println("Putting finished!");

Hazelcast Set uses ItemListener to listen to events that occur when items are added to and removed from the Set. Please refer to the Listening for Item Events section for information on how to create an item listener class and register it.

7.5.2. Configuring Set

The following are example Set configurations.

Declarative:

<set name="default">
    <backup-count>1</backup-count>
    <async-backup-count>0</async-backup-count>
    <max-size>10</max-size>
    <item-listeners>
        <item-listener>com.hazelcast.examples.ItemListener</item-listener>
    </item-listeners>
    <quorum-ref>quorumname</quorum-ref>
</set>

Programmatic:

Config config = new Config();
CollectionConfig collectionSet = config.getSetConfig("MySet");
collectionSet.setBackupCount(1)
        .setMaxSize(10)
        .setQuorumName("quorumname");

Set configuration has the following elements.

  • statistics-enabled: True (default) if statistics gathering is enabled on the Set, false otherwise.

  • backup-count: Count of synchronous backups. Set is a non-partitioned data structure, so all entries of a Set reside in one partition. When this parameter is '1', it means there will be one backup of that Set in another member in the cluster. When it is '2', two members will have the backup.

  • async-backup-count: Count of asynchronous backups.

  • max-size: The maximum number of entries for this Set. It can be any number between 0 and Integer.MAX_VALUE. Its default value is 0, meaning there is no capacity constraint.

  • item-listeners: Lets you add listeners (listener classes) for the set items. You can also set the attribute include-value to true if you want the item event to contain the item values. You can set local to true if you want to listen to the items on the local member.

  • quorum-ref: Name of quorum configuration that you want this Set to use. Please refer to the Split-Brain Protection for ISet and TransactionalSet section.

7.5.3. Split-Brain Protection for ISet and TransactionalSet

ISet and TransactionalSet can be configured to check for a minimum number of available members before applying their operations (see Split-Brain Protection). This is a check to avoid performing successful set operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

ISet:

  • WRITE, READ_WRITE:

    • add

    • addAll

    • clear

    • remove

    • removeAll

  • READ, READ_WRITE:

    • contains

    • containsAll

    • isEmpty

    • iterator

    • size

    • toArray

TransactionalSet:

  • WRITE, READ_WRITE:

    • add

    • remove

  • READ, READ_WRITE:

    • size

Configuring Split-Brain Protection

Split-Brain protection for ISet can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<set name="default">
    ...
    <quorum-ref>quorumname</quorum-ref>
    ...
</set>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.6. List

Hazelcast List (IList) is similar to Hazelcast Set, but Hazelcast List also allows duplicate elements.

  • Besides allowing duplicate elements, Hazelcast List preserves the order of elements.

  • Hazelcast List is a non-partitioned data structure where values and each backup are represented by their own single partition.

  • Hazelcast List cannot be scaled beyond the capacity of a single machine.

  • All items are copied to the local member and iteration occurs locally.


While IMap and ICache are the recommended data structures to be used by Hazelcast Jet, IList can also be used by it for unit testing or similar non-production situations. Please see the Hazelcast Jet Reference Manual to learn how Jet can use IList, e.g., how it can fill IList with data, consume it in a Jet job and drain the results to another IList. Please also see the Fast Batch Processing and Real-Time Stream Processing use cases for Hazelcast Jet.

7.6.1. Getting a List and Putting Items

Use the HazelcastInstance getList method to get the List, then use the add method to put items into the List.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IList<String> list = hz.getList("list");
list.add("Tokyo");
list.add("Paris");
list.add("London");
list.add("New York");
System.out.println("Putting finished!");

Hazelcast List uses ItemListener to listen to events that occur when items are added to and removed from the List. Please refer to the Listening for Item Events section for information on how to create an item listener class and register it.

7.6.2. Configuring List

The following are example list configurations.

Declarative:

<list name="default">
   <backup-count>1</backup-count>
   <async-backup-count>0</async-backup-count>
   <max-size>10</max-size>
   <item-listeners>
      <item-listener>
          com.hazelcast.examples.ItemListener
      </item-listener>
   </item-listeners>
   <quorum-ref>quorumname</quorum-ref>
</list>

Programmatic:

Config config = new Config();
CollectionConfig collectionList = config.getListConfig("MyList");
collectionList.setBackupCount(1)
        .setMaxSize(10)
        .setQuorumName("quorumname");

List configuration has the following elements.

  • statistics-enabled: True (default) if statistics gathering is enabled on the list, false otherwise.

  • backup-count: Number of synchronous backups. List is a non-partitioned data structure, so all entries of a List reside in one partition. When this parameter is '1', there will be one backup of that List in another member in the cluster. When it is '2', two members will have the backup.

  • async-backup-count: Number of asynchronous backups.

  • max-size: The maximum number of entries for this List.

  • item-listeners: Lets you add listeners (listener classes) for the list items. You can also set the attribute include-value to true if you want the item event to contain the item values. You can set the attribute local to true if you want to listen to the items on the local member.

  • quorum-ref: Name of quorum configuration that you want this List to use. Please see the Split-Brain Protection for IList and TransactionalList section.

7.6.3. Split-Brain Protection for IList and TransactionalList

IList and TransactionalList can be configured to check for a minimum number of available members before applying their operations (see Split-Brain Protection). This is a check to avoid performing successful list operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

IList:

  • WRITE, READ_WRITE:

    • add

    • addAll

    • clear

    • remove

    • removeAll

    • set

  • READ, READ_WRITE:

    • contains

    • containsAll

    • get

    • indexOf

    • isEmpty

    • iterator

    • lastIndexOf

    • listIterator

    • size

    • subList

    • toArray

TransactionalList:

  • WRITE, READ_WRITE:

    • add

    • remove

  • READ, READ_WRITE:

    • size

Configuring Split-Brain Protection

Split-Brain protection for IList can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<list name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</list>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.7. Ringbuffer

Hazelcast Ringbuffer is a replicated but not partitioned data structure that stores its data in a ring-like structure. You can think of it as a circular array with a given capacity. Each Ringbuffer has a tail and a head. The tail is where the items are added and the head is where the items are overwritten or expired. You can reach each element in a Ringbuffer using a sequence ID, which is mapped to the elements between the head and tail (inclusive) of the Ringbuffer.
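The relationship between the head, the tail and the sequence IDs can be sketched with a small single-JVM model. This is not Hazelcast code: the class SequenceRingModel and its methods are hypothetical names, and the model only assumes what is described above (a fixed-capacity circular array where adding at the tail overwrites the oldest item once the buffer is full).

```java
/** Toy, single-JVM model of a ring buffer addressed by sequence; not Hazelcast code. */
class SequenceRingModel {
    private final String[] slots;
    private long head = 0;   // sequence of the oldest readable item
    private long tail = -1;  // sequence of the newest item; -1 while the buffer is empty

    SequenceRingModel(int capacity) {
        slots = new String[capacity];
    }

    /** Adds at the tail, overwriting the oldest item when full; returns the new item's sequence. */
    long add(String item) {
        tail++;
        slots[(int) (tail % slots.length)] = item;
        if (tail - head + 1 > slots.length) {
            head++; // the oldest item was just overwritten
        }
        return tail;
    }

    /** Returns the item stored at the given sequence, or fails if it is outside [head, tail]. */
    String read(long sequence) {
        if (sequence < head || sequence > tail) {
            throw new IllegalArgumentException(
                    "sequence " + sequence + " not in [" + head + ", " + tail + "]");
        }
        return slots[(int) (sequence % slots.length)];
    }

    long headSequence() {
        return head;
    }

    long tailSequence() {
        return tail;
    }

    public static void main(String[] args) {
        SequenceRingModel rb = new SequenceRingModel(3);
        for (String s : new String[] {"a", "b", "c", "d"}) {
            rb.add(s);
        }
        // "a" (sequence 0) was overwritten by "d" (sequence 3).
        System.out.println(rb.headSequence() + ".." + rb.tailSequence()); // prints 1..3
        System.out.println(rb.read(1)); // prints b
    }
}
```

In this model, reading a sequence below the head fails, which mirrors the situation in which an item is no longer available in a real Ringbuffer.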

7.7.1. Getting a Ringbuffer and Reading Items

Reading from Ringbuffer is simple: get the Ringbuffer with the HazelcastInstance getRingbuffer method, get its current head with the headSequence method and start reading. Use the method readOne to return the item at the given sequence; readOne blocks if no item is available. To read the next item, increment the sequence by one.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Ringbuffer<String> ringbuffer = hz.getRingbuffer("rb");
long sequence = ringbuffer.headSequence();
while(true){
    String item = ringbuffer.readOne(sequence);
    sequence++;
    // process item
}

Because the sequence is exposed, you can read the same item from the Ringbuffer again as long as the item is still available. If the item is not available any longer, StaleSequenceException is thrown.

7.7.2. Adding Items to a Ringbuffer

Adding an item to a Ringbuffer is also easy with the Ringbuffer add method:

Ringbuffer<String> ringbuffer = hz.getRingbuffer("SampleRB");
ringbuffer.add("someitem");

The method add returns the sequence of the inserted item; the sequence value will always be unique. You can use this as a very cheap way of generating unique IDs if you are already using Ringbuffer.

7.7.3. IQueue vs. Ringbuffer

Hazelcast Ringbuffer can sometimes be a better alternative to a Hazelcast IQueue. Unlike IQueue, Ringbuffer does not remove the items; it only reads items at a given position. There are many advantages to this approach:

  • The same item can be read multiple times by the same thread. This is useful for realizing semantics of read-at-least-once or read-at-most-once.

  • The same item can be read by multiple threads. Normally you could use an IQueue per thread for the same semantic, but this is less efficient because of the increased remoting. A take from an IQueue is destructive, so the change needs to be applied to the backup as well, which is why a queue.take() is more expensive than a ringbuffer.readOne(…).

  • Reads are extremely cheap since there is no change in the Ringbuffer. Therefore no replication is required.

  • Reads and writes can be batched to speed up performance. Batching can dramatically improve the performance of Ringbuffer.

7.7.4. Configuring Ringbuffer Capacity

By default, a Ringbuffer is configured with a capacity of 10000 items. This creates an array with a size of 10000. If a time-to-live is configured, then an array of longs is also created that stores the expiration time for every item. In a lot of cases you may want to change this capacity number to something that better fits your needs.

Below is a declarative configuration example of a Ringbuffer with a capacity of 2000 items.

<ringbuffer name="rb">
    <capacity>2000</capacity>
</ringbuffer>

Currently, Hazelcast Ringbuffer is not a partitioned data structure; its data is stored in a single partition and the replicas are stored in another partition. Therefore, create a Ringbuffer that can safely fit in a single cluster member.

7.7.5. Backing Up Ringbuffer

Hazelcast Ringbuffer has a single synchronous backup by default. You can control the Ringbuffer backup just like most of the other Hazelcast distributed data structures by setting the synchronous and asynchronous backups: backup-count and async-backup-count. In the example below, a Ringbuffer is configured with no synchronous backups and one asynchronous backup:

<ringbuffer name="rb">
    <backup-count>0</backup-count>
    <async-backup-count>1</async-backup-count>
</ringbuffer>

An asynchronous backup will probably give you better performance. However, there is a chance that the item added will be lost when the member owning the primary crashes before the backup could complete. You may want to consider batching methods if you need high performance but do not want to give up on consistency.

7.7.6. Configuring Ringbuffer Time-To-Live

You can configure Hazelcast Ringbuffer with a time-to-live in seconds. Using this setting, you can control how long the items remain in the Ringbuffer before they are expired. By default, the time-to-live is set to 0, meaning that unless the item is overwritten, it will remain in the Ringbuffer indefinitely. If you set a time-to-live and an item is added while the Ringbuffer is full and its oldest item has not yet expired, then, depending on the Overflow Policy, either the oldest item is overwritten or the call is rejected.

In the example below, a Ringbuffer is configured with a time-to-live of 180 seconds.

<ringbuffer name="rb">
    <time-to-live-seconds>180</time-to-live-seconds>
</ringbuffer>

7.7.7. Setting Ringbuffer Overflow Policy

Using the overflow policy, you can determine what to do if the oldest item in the Ringbuffer is not old enough to expire when more items than the configured Ringbuffer capacity are being added. The below options are currently available.

  • OverflowPolicy.OVERWRITE: The oldest item is overwritten.

  • OverflowPolicy.FAIL: The call is aborted. The methods that make use of the OverflowPolicy return -1 to indicate that adding the item has failed.

Overflow policy gives you fine control on what to do if the Ringbuffer is full. You can also use the overflow policy to apply a back pressure mechanism. The following example code shows the usage of an exponential backoff.

Random random = new Random();
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Ringbuffer<Long> rb = hz.getRingbuffer("rb");

long i = 100;
while (true) {
    long sleepMs = 100;
    for (; ; ) {
        long result = rb.addAsync(i, OverflowPolicy.FAIL).get();
        if (result != -1) {
            break;
        }
        TimeUnit.MILLISECONDS.sleep(sleepMs);
        sleepMs = Math.min(5000, sleepMs * 2);
    }

    // add a bit of random delay to make it look a bit more realistic
    Thread.sleep(random.nextInt(10));

    System.out.println("Written: " + i);
    i++;
}
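To make the backoff in the loop above easier to follow, the sketch below computes only the delays that the loop would sleep between consecutive failed attempts. It uses no Hazelcast classes; BackoffSchedule and delays are hypothetical names, and the start value (100 ms) and cap (5000 ms) are taken from the example above.

```java
import java.util.Arrays;

/** Computes the sleep times of a capped exponential backoff; illustration only. */
class BackoffSchedule {

    /** Returns the delay before each retry: start, 2 * start, 4 * start, ... capped at maxMs. */
    static long[] delays(long startMs, long maxMs, int retries) {
        long[] result = new long[retries];
        long sleepMs = startMs;
        for (int k = 0; k < retries; k++) {
            result[k] = sleepMs;
            sleepMs = Math.min(maxMs, sleepMs * 2); // double the delay, but never exceed the cap
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [100, 200, 400, 800, 1600, 3200, 5000, 5000]
        System.out.println(Arrays.toString(delays(100, 5000, 8)));
    }
}
```

The delay is reset to its start value for every new item, so a single slow period does not penalize later writes.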

7.7.8. Ringbuffer with Persistent Datastore

Hazelcast allows you to load and store the Ringbuffer items from/to a persistent datastore using the interface RingbufferStore. If a Ringbuffer store is enabled, each item added to the Ringbuffer will also be stored at the configured Ringbuffer store.

If the Ringbuffer store is configured, you can get items with sequences which are no longer in the actual Ringbuffer but are only in the Ringbuffer store. This will probably be much slower but still allow you to continue consuming items from the Ringbuffer even if they are overwritten with newer items in the Ringbuffer.

When a Ringbuffer is being instantiated, it will check if the Ringbuffer store is configured and will request the latest sequence in the Ringbuffer store. This is to enable the Ringbuffer to start with sequences larger than the ones in the Ringbuffer store. In this case, the Ringbuffer is empty but you can still request older items from it (which will be loaded from the Ringbuffer store).

The Ringbuffer store will store items in the same format as the Ringbuffer. If the BINARY in-memory format is used, the Ringbuffer store must implement the interface RingbufferStore<byte[]>, meaning that the Ringbuffer store will receive items in the binary format. If the OBJECT in-memory format is used, the Ringbuffer store must implement the interface RingbufferStore<K>, where K is the type of item being stored (meaning that the Ringbuffer store will receive the deserialized object).

When adding items to the Ringbuffer, the method storeAll allows you to store items in batches.

The following example class includes all of the RingbufferStore methods.

public class TheRingbufferObjectStore implements RingbufferStore<Item> {

    @Override
    public void store(long sequence, Item data) {
        System.out.println("Object store");
    }

    @Override
    public void storeAll(long firstItemSequence, Item[] items) {
        System.out.println("Object store all");
    }

    @Override
    public Item load(long sequence) {
        System.out.println("Object load");
        return null;
    }

    @Override
    public long getLargestSequence() {
        System.out.println("Object get largest sequence");
        return -1;
    }
}

Item must be serializable. The following is an example of a Ringbuffer with the Ringbuffer store configured and enabled.

<ringbuffer name="default">
    <capacity>10000</capacity>
    <time-to-live-seconds>30</time-to-live-seconds>
    <backup-count>1</backup-count>
    <async-backup-count>0</async-backup-count>
    <in-memory-format>BINARY</in-memory-format>
    <ringbuffer-store>
        <class-name>com.hazelcast.RingbufferStoreImpl</class-name>
    </ringbuffer-store>
</ringbuffer>

Below are the explanations for the Ringbuffer store configuration elements:

  • class-name: Name of the class implementing the RingbufferStore interface.

  • factory-class-name: Name of the class implementing the RingbufferStoreFactory interface. This interface allows a factory class to be registered instead of a class implementing the RingbufferStore interface.

Either the class-name or the factory-class-name element should be used.

7.7.9. Configuring Ringbuffer In-Memory Format

You can configure Hazelcast Ringbuffer with an in-memory format that controls the format of the Ringbuffer’s stored items. By default, BINARY in-memory format is used, meaning that the object is stored in a serialized form. You can select the OBJECT in-memory format, which is useful when filtering is applied or when the OBJECT in-memory format has a smaller memory footprint than BINARY.

In the declarative configuration example below, a Ringbuffer is configured with the OBJECT in-memory format:

<ringbuffer name="rb">
    <in-memory-format>OBJECT</in-memory-format>
</ringbuffer>

7.7.10. Configuring Split-Brain Protection for Ringbuffer

Ringbuffer can be configured to check for a minimum number of available members before applying Ringbuffer operations. This is a check to avoid performing successful Ringbuffer operations on all parts of a cluster during a network partition and can be configured using the element quorum-ref. You should set this element’s value as the quorum’s name, which you configured under the quorum element as explained in the Split-Brain Protection section. Following is an example snippet:

<ringbuffer name="rb">
    <quorum-ref>quorumname</quorum-ref>
</ringbuffer>

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • add

    • addAllAsync

    • addAsync

  • READ, READ_WRITE:

    • capacity

    • headSequence

    • readManyAsync

    • readOne

    • remainingCapacity

    • size

    • tailSequence

7.7.11. Adding Batched Items

In the previous examples, the method ringBuffer.add() is used to add an item to the Ringbuffer. The problems with this method are that it always overwrites and that it does not support batching. Batching can have a huge impact on the performance. You can use the method addAllAsync to support batching.

Please see the following example code.

List<String> items = Arrays.asList("1","2","3");
ICompletableFuture<Long> f = rb.addAllAsync(items, OverflowPolicy.OVERWRITE);
f.get();

In the above case, three strings are added to the Ringbuffer using the policy OverflowPolicy.OVERWRITE. Please see the Overflow Policy section for more information.

7.7.12. Reading Batched Items

In the previous example, the readOne method read items from the Ringbuffer. readOne is simple but not very efficient for the following reasons:

  • readOne does not use batching.

  • readOne cannot filter items at the source; the items need to be retrieved before being filtered.

The method readManyAsync can read a batch of items and can filter items at the source.

The signature of readManyAsync is as follows.

ICompletableFuture<ReadResultSet<E>> readManyAsync(
   long startSequence,
   int minCount,
   int maxCount,
   IFunction<E, Boolean> filter);

The meanings of the readManyAsync arguments are given below.

  • startSequence: Sequence of the first item to read.

  • minCount: Minimum number of items to read. If you do not want to block, set it to 0. If you want to block for at least one item, set it to 1.

  • maxCount: Maximum number of the items to retrieve. Its value cannot exceed 1000.

  • filter: A function that accepts an item and checks if it should be returned. If no filtering should be applied, set it to null.

A full example is given below.

long sequence = rb.headSequence();
for(;;) {
    ICompletableFuture<ReadResultSet<String>> f = rb.readManyAsync(sequence, 1, 10, null);
    ReadResultSet<String> rs = f.get();
    for (String s : rs) {
        System.out.println(s);
    }
    sequence+=rs.readCount();
}

Please take a careful look at how the sequence is incremented. When a filter is applied, the number of items returned can be smaller than the number of items read, so advance the sequence by readCount() rather than by the number of returned items.
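The difference between the number of items read and the number of items returned can be illustrated with a toy model that does not use Hazelcast. FilteredBatchModel, Batch and readMany are hypothetical names; the model assumes that a batch read scans items in sequence order and stops once maxCount items have passed the filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of a filtered batch read over an in-memory log; not Hazelcast code. */
class FilteredBatchModel {

    /** Result of one batch read. */
    static final class Batch {
        final List<String> items = new ArrayList<>(); // items that passed the filter
        int readCount;                                // items scanned, including rejected ones
    }

    /** Scans the log from startSequence until maxCount items pass the filter or the log ends. */
    static Batch readMany(List<String> log, int startSequence, int maxCount, Predicate<String> filter) {
        Batch batch = new Batch();
        for (int seq = startSequence; seq < log.size() && batch.items.size() < maxCount; seq++) {
            batch.readCount++;
            String item = log.get(seq);
            if (filter == null || filter.test(item)) {
                batch.items.add(item);
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a1", "b1", "a2", "b2", "a3");
        Batch first = readMany(log, 0, 2, s -> s.startsWith("a"));
        System.out.println(first.items);     // prints [a1, a2]
        System.out.println(first.readCount); // prints 3

        // Advancing by readCount (3) continues at "b2"; advancing by items.size() (2)
        // would start again at "a2" and deliver it twice.
        Batch second = readMany(log, first.readCount, 2, s -> s.startsWith("a"));
        System.out.println(second.items);    // prints [a3]
    }
}
```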

7.7.13. Using Async Methods

Hazelcast Ringbuffer provides asynchronous methods for more powerful operations like batched writing or batched reading with filtering. To make these methods synchronous, just call the method get() on the returned future.

Please see the following example code.

ICompletableFuture<Long> f = ringbuffer.addAsync(item, OverflowPolicy.FAIL);
f.get();

However, you can also use ICompletableFuture to get notified when the operation has completed. The advantage of ICompletableFuture is that the thread used for the call is not blocked until the response is returned.

Please see the below code as an example of when you want to get notified when a batch of reads has completed.

ICompletableFuture<ReadResultSet<String>> f = rb.readManyAsync(sequence, min, max, someFilter);
f.andThen(new ExecutionCallback<ReadResultSet<String>>() {
   @Override
   public void onResponse(ReadResultSet<String> response) {
        for (String s : response) {
            System.out.println("Received:" + s);
        }
   }

   @Override
   public void onFailure(Throwable t) {
        t.printStackTrace();
   }
});

7.7.14. Ringbuffer Configuration Examples

The following shows the declarative configuration of a Ringbuffer called rb. The configuration is modeled after the Ringbuffer defaults.

<ringbuffer name="rb">
    <capacity>10000</capacity>
    <backup-count>1</backup-count>
    <async-backup-count>0</async-backup-count>
    <time-to-live-seconds>0</time-to-live-seconds>
    <in-memory-format>BINARY</in-memory-format>
    <quorum-ref>quorumname</quorum-ref>
</ringbuffer>

You can also configure a Ringbuffer programmatically. The following is a programmatic version of the above declarative configuration.

Config config = new Config();
RingbufferConfig rbConfig = config.getRingbufferConfig("rb");
rbConfig.setCapacity(10000)
        .setBackupCount(1)
        .setAsyncBackupCount(0)
        .setTimeToLiveSeconds(0)
        .setInMemoryFormat(InMemoryFormat.BINARY)
        .setQuorumName("quorumname");

7.8. Topic

Hazelcast provides a distribution mechanism for publishing messages that are delivered to multiple subscribers. This is also known as a publish/subscribe (pub/sub) messaging model. Publishing and subscribing operations are cluster wide. When a member subscribes to a topic, it is actually registering for messages published by any member in the cluster, including the new members that join after you add the listener.

The publish operation is asynchronous. It does not wait for operations to run in remote members; it works as fire and forget.

7.8.1. Getting a Topic and Publishing Messages

Use the HazelcastInstance’s getTopic method to get the topic, then use the topic’s publish method to publish your messages. The following is a sample publisher:

public class TopicPublisher {

    public static void main(String[] args) {

        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<Date> topic = hz.getTopic("topic");
        topic.publish(new Date());
    }
}

And here is a sample subscriber:

public class TopicSubscriber {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<Date> topic = hz.getTopic("topic");
        topic.addMessageListener(new MessageListenerImpl());
        System.out.println("Subscribed");
    }

    private static class MessageListenerImpl implements MessageListener<Date> {
        public void onMessage(Message<Date> m) {
            System.out.println("Received: " + m.getMessageObject());
        }
    }
}

Hazelcast Topic uses the MessageListener interface to listen for events that occur when a message is received. Please refer to the Listening for Topic Messages section for information on how to create a message listener class and register it.

7.8.2. Getting Topic Statistics

Topic has two statistic variables that you can query. These values are incremental and local to the member.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ITopic<Object> myTopic = hazelcastInstance.getTopic( "myTopicName" );

myTopic.getLocalTopicStats().getPublishOperationCount();
myTopic.getLocalTopicStats().getReceiveOperationCount();

getPublishOperationCount and getReceiveOperationCount return the total number of published and received messages since the start of this member, respectively. Please note that these values are not backed up, so if the member goes down, these values will be lost.

You can disable this feature with topic configuration. Please see the Configuring Topic section.

These statistics values can also be viewed in Management Center. Please see the Monitoring Topics section in Hazelcast Management Center Reference Manual.

7.8.3. Understanding Topic Behavior

Each cluster member has a list of all registrations in the cluster. When a member registers for a topic, it sends a registration message to all members in the cluster. Also, when a new member joins the cluster, it will receive all registrations made so far in the cluster.

The behavior of a topic varies depending on the value of the configuration parameter globalOrderEnabled.

Ordering Messages as Published

If globalOrderEnabled is disabled, there is no global order across publishers; listeners (subscribers) process the messages of each publisher in the order in which that publisher published them. If cluster member M publishes messages m1, m2, m3, …, mn to a topic T, then Hazelcast makes sure that all of the subscribers of topic T will receive and process m1, m2, m3, …, mn in the given order.

Here is how it works. Let’s say that we have three members (member1, member2 and member3) and that member1 and member2 are registered to a topic named news. Note that all three members know that member1 and member2 are registered to news.

In this example, member1 publishes two messages: a1 and a2. Member3 publishes two messages: c1 and c2. When member1 and member3 publish a message, they will check their local list for registered members, they will discover that member1 and member2 are in their lists, and then they will fire messages to those members. One possible order of the messages received could be the following.

member1 → c1, a1, a2, c2

member2 → c1, c2, a1, a2

Ordering Messages for Members

If globalOrderEnabled is enabled, all members listening to the same topic will get its messages in the same order.

Here is how it works. Let’s say that we have three members (member1, member2 and member3) and that member1 and member2 are registered to a topic named news. Note that all three members know that member1 and member2 are registered to news.

In this example, member1 publishes two messages: a1 and a2. Member3 publishes two messages: c1 and c2. When a member publishes messages over the topic news, it first calculates which partition the news ID corresponds to. Then it sends an operation to the owner of that partition, asking that member to publish the messages. Let’s assume that news corresponds to a partition that member2 owns. member1 and member3 first send all messages to member2. Assume that the messages are published in the following order:

member1 → a1, c1, a2, c2

member2 then publishes these messages by looking at registrations in its local list. It sends these messages to member1 and member2 (it makes a local dispatch for itself).

member1 → a1, c1, a2, c2

member2 → a1, c1, a2, c2

This way we guarantee that all members will see the events in the same order.

Keeping Generated and Published Order the Same

In both cases, there is a StripedExecutor in EventService that is responsible for dispatching the received message. For all events in Hazelcast, the order that events are generated and the order they are published to the user are guaranteed to be the same via this StripedExecutor.

In StripedExecutor, there are as many threads as are specified in the property hazelcast.event.thread.count (default is five). For a specific event source (for a particular topic name), the hash of that source’s name modulo the thread count (% 5 with the default) gives the ID of the responsible thread. Note that another event source (entry listener of a map, item listener of a collection, etc.) can correspond to the same thread. To avoid blocking other messages, do not perform heavy processing in this thread. If there is time-consuming work that needs to be done, hand it over to another thread. Please see the Getting a Topic and Publishing Messages section.
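The following sketch illustrates both points: how a source name can be mapped to one of a fixed number of threads, and how a listener can hand slow work to its own executor. It is a simplified illustration, not Hazelcast's implementation. StripeSelectionSketch and stripeFor are hypothetical names, and the use of String.hashCode() is an assumption, since the exact hash function is not specified here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrates stripe selection by hash and handing work off the event thread. */
class StripeSelectionSketch {

    /** Maps an event source name to a thread index in the range [0, threadCount). */
    static int stripeFor(String sourceName, int threadCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(sourceName.hashCode(), threadCount);
    }

    public static void main(String[] args) throws InterruptedException {
        int threadCount = 5; // default value of hazelcast.event.thread.count
        // The same name always maps to the same index, which preserves per-source ordering.
        System.out.println("news -> thread " + stripeFor("news", threadCount));
        System.out.println("sports -> thread " + stripeFor("sports", threadCount));

        // Inside a listener callback, submit slow work to a separate pool and return quickly.
        ExecutorService workers = Executors.newFixedThreadPool(2);
        workers.submit(() -> System.out.println("slow processing runs off the event thread"));
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Note that handing messages to a pool with more than one thread gives up the ordering guarantee for the handed-over work; use a single-threaded executor if the order matters.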

7.8.4. Configuring Topic

To configure a topic, set the topic name, decide on statistics and global ordering, and set message listeners. Default values are:

  • global-ordering-enabled is false, meaning that by default, there is no guarantee of global order.

  • statistics-enabled is true, meaning that by default, statistics are calculated.

You can see the example configuration snippets below.

Declarative:

<hazelcast>
  ...
  <topic name="yourTopicName">
    <global-ordering-enabled>true</global-ordering-enabled>
    <statistics-enabled>true</statistics-enabled>
    <message-listeners>
      <message-listener>MessageListenerImpl</message-listener>
    </message-listeners>
  </topic>
  ...
</hazelcast>

Programmatic:

TopicConfig topicConfig = new TopicConfig();
topicConfig.setGlobalOrderingEnabled( true );
topicConfig.setStatisticsEnabled( true );
topicConfig.setName( "yourTopicName" );
MessageListener<String> implementation = new MessageListener<String>() {
    @Override
    public void onMessage( Message<String> message ) {
        // process the message
    }
};
topicConfig.addMessageListenerConfig( new ListenerConfig( implementation ) );
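// register the topic configuration before starting the member
Config config = new Config();
config.addTopicConfig( topicConfig );
HazelcastInstance instance = Hazelcast.newHazelcastInstance( config );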

Topic configuration has the following elements.

  • statistics-enabled: Default is true, meaning statistics are calculated.

  • global-ordering-enabled: Default is false, meaning there is no global order guarantee.

  • message-listeners: Lets you add listeners (listener classes) for the topic messages.

Besides the above elements, there are the following system properties that are topic related but not topic specific:

  • hazelcast.event.queue.capacity with a default value of 1,000,000

  • hazelcast.event.queue.timeout.millis with a default value of 250

  • hazelcast.event.thread.count with a default value of 5

For a description of these parameters, please see the Global Event Configuration section.

7.9. Reliable Topic

Reliable Topic uses the same ITopic interface as a regular topic. The main difference is that Reliable Topic is backed by the Ringbuffer data structure. The following are the advantages of this approach:

  • Events are not lost since the Ringbuffer is configured with one synchronous backup by default.

  • Each Reliable ITopic gets its own Ringbuffer; if a topic has a very fast producer, it will not lead to problems at topics that run at a slower pace.

  • Since the event system behind a regular ITopic is shared with other data structures, e.g., collection listeners, you can run into isolation problems. This does not happen with the Reliable ITopic.

Here is a sample code snippet for a publisher using Reliable Topic:

public class PublisherMember {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        Random random = new Random();
        ITopic<Long> topic = hz.getReliableTopic("sometopic");
        long messageId = 0;

        while (true) {
            topic.publish(messageId);
            // print the ID that was just published, then move on to the next one
            System.out.println("Written: " + messageId);
            messageId++;
            sleepMillis(random.nextInt(100));
        }
    }
    public static boolean sleepMillis(int millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }
}

And the following is a sample code snippet for the subscriber:

public class SubscribedMember {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<Long> topic = hz.getReliableTopic("sometopic");
        topic.addMessageListener(new MessageListenerImpl());
    }

    private static class MessageListenerImpl implements MessageListener<Long> {
        public void onMessage(Message<Long> m) {
            System.out.println("Received: " + m.getMessageObject());
        }
    }
}

When you create a Reliable Topic, Hazelcast automatically creates a Ringbuffer for it. You may configure this Ringbuffer by adding a Ringbuffer config with the same name as the Reliable Topic. For instance, if you have a Reliable Topic with the name "sometopic", you should add a Ringbuffer config with the name "sometopic" to configure the backing Ringbuffer. You can configure, for example, the capacity and the time-to-live for the topic messages, and you can even add a Ringbuffer store to make the topic persistent. By default, a Ringbuffer does not have any TTL (time-to-live) and it has a limited capacity; you may want to change that configuration. The following is an example configuration for the "sometopic" given above.

<hazelcast>

    <!-- This is the ringbuffer that is used by the 'sometopic' Reliable-topic. As you can see the
         ringbuffer has the same name as the topic. -->
    <ringbuffer name="sometopic">
        <capacity>1000</capacity>
        <time-to-live-seconds>5</time-to-live-seconds>
    </ringbuffer>

    <reliable-topic name="sometopic">
        <topic-overload-policy>BLOCK</topic-overload-policy>
    </reliable-topic>
</hazelcast>

Please see the Configuring Reliable Topic section below for the descriptions of all Reliable Topic configuration elements.

By default, the Reliable ITopic uses a shared thread pool. If you need better isolation, you can configure a custom executor on the ReliableTopicConfig.

Because the reads on a Ringbuffer are not destructive, batching is easy to apply. ITopic uses read batching and reads ten items at a time (if available) by default. See Reading Batched Items for more information.

7.9.1. Slow Consumers

The Reliable ITopic provides control and a way to deal with slow consumers. It is unwise to keep events for a slow consumer in memory indefinitely since you do not know when the slow consumer is going to catch up. You can control the size of the Ringbuffer by using its capacity. For the cases when a Ringbuffer runs out of its capacity, you can specify the following policies for the TopicOverloadPolicy configuration:

  • DISCARD_OLDEST: Overwrite the oldest item, even if a TTL is set. In this case the fast producer supersedes a slow consumer.

  • DISCARD_NEWEST: Discard the newest item.

  • BLOCK: Wait until the items are expired in the Ringbuffer.

  • ERROR: Immediately throw TopicOverloadException if there is no space in the Ringbuffer.

7.9.2. Configuring Reliable Topic

The following are example Reliable Topic configurations.

Declarative:

<reliable-topic name="default">
    <statistics-enabled>true</statistics-enabled>
    <message-listeners>
        <message-listener>
        ...
        </message-listener>
    </message-listeners>
    <read-batch-size>10</read-batch-size>
    <topic-overload-policy>BLOCK</topic-overload-policy>
</reliable-topic>

Programmatic:

Config config = new Config();
ReliableTopicConfig rtConfig = config.getReliableTopicConfig( "default" );
rtConfig.setTopicOverloadPolicy( TopicOverloadPolicy.BLOCK )
        .setReadBatchSize( 10 )
        .setStatisticsEnabled( true );

Reliable Topic configuration has the following elements.

  • statistics-enabled: Enables or disables the statistics collection for the Reliable Topic. Its default value is true.

  • message-listener: Message listener class that listens to the messages when they are added or removed.

  • read-batch-size: Maximum number of messages that Reliable Topic will try to read in one batch. Its default value is 10.

  • topic-overload-policy: Policy to handle an overloaded topic. Available values are DISCARD_OLDEST, DISCARD_NEWEST, BLOCK and ERROR. Its default value is BLOCK. See Slow Consumers for definitions of these policies.

7.10. Lock

ILock is the distributed implementation of java.util.concurrent.locks.Lock, meaning that if you lock using an ILock, the critical section that it guards is guaranteed to be executed by only one thread in the entire cluster, provided that there are no network failures. Even though locks are great for synchronization, they can lead to problems if not used properly. Also note that Hazelcast Lock does not support fairness.

7.10.1. Using Try-Catch Blocks with Locks

Always release locks in the finally clause of a try block. This ensures that locks are released if an exception is thrown from the code in a critical section. Also note that the lock method is called outside the try block because we do not want to unlock if the lock operation itself fails.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
Lock lock = hazelcastInstance.getLock( "myLock" );
lock.lock();
try {
    // do something here
} finally {
    lock.unlock();
}

7.10.2. Releasing Locks with tryLock Timeout

If a lock is not released in the cluster, another thread that is trying to get the lock can wait forever. To avoid this, use tryLock with a timeout value. You can set a high value (normally it should not take that long) for tryLock. You can check the return value of tryLock as follows:

if ( lock.tryLock ( 10, TimeUnit.SECONDS ) ) {
  try {
    // do some stuff here..
  } finally {
    lock.unlock();
  }
} else {
  // warning
}
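The same pattern can be wrapped in a small helper. The sketch below uses java.util.concurrent.locks.ReentrantLock as a local stand-in so that it runs without a cluster; since ILock implements java.util.concurrent.locks.Lock, a lock obtained from a HazelcastInstance could be passed to the same helper. TryLockTimeoutSketch and runGuarded are hypothetical names.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Runs a task under a lock only if the lock can be acquired within a timeout. */
class TryLockTimeoutSketch {

    /** Returns true if the task ran under the lock, false if the lock was not acquired in time. */
    static boolean runGuarded(Lock lock, long timeout, TimeUnit unit, Runnable task) {
        try {
            if (!lock.tryLock(timeout, unit)) {
                return false; // timed out; the caller decides how to react
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock(); // only reached when the lock was acquired
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Lock free = new ReentrantLock();
        System.out.println(runGuarded(free, 1, TimeUnit.SECONDS, () -> { })); // prints true

        // A lock that another thread acquired and never released.
        ReentrantLock held = new ReentrantLock();
        Thread holder = new Thread(held::lock);
        holder.start();
        holder.join();
        System.out.println(runGuarded(held, 100, TimeUnit.MILLISECONDS, () -> { })); // prints false
    }
}
```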

7.10.3. Avoiding Waiting Threads with Lease Time

You can also avoid indefinitely waiting threads by using lock with lease time - the lock will be released in the given lease time. The lock can be safely unlocked before the lease time expires. Note that the unlock operation can throw an IllegalMonitorStateException if the lock is released because the lease time expires. If that is the case, critical section guarantee is broken.

Please see the below example.

lock.lock( 5, TimeUnit.SECONDS );
try {
  // do some stuff here..
} finally {
  try {
    lock.unlock();
  } catch ( IllegalMonitorStateException ex ){
    // WARNING Critical section guarantee can be broken
  }
}

You can also specify a lease time when trying to acquire a lock: tryLock(time, unit, leaseTime, leaseUnit). In that case, it tries to acquire the lock within the specified waiting time (time, unit) and, if it succeeds, holds the lock for the specified lease time (leaseTime, leaseUnit). If the lock is not available, the current thread becomes disabled for thread scheduling purposes until either it acquires the lock or the specified waiting time elapses. Note that this lease time cannot be longer than the time you specify with the system property hazelcast.lock.max.lease.time.seconds. Please see the System Properties appendix to see the description of this property and to learn how to set a system property.

7.10.4. Understanding Lock Behavior

  • Locks are fail-safe. If a member holds a lock and some other members go down, the cluster will keep your locks safe and available. Moreover, when a member leaves the cluster, all the locks acquired by that dead member will be removed so that those locks are immediately available for live members.

  • Locks are re-entrant. The same thread can lock multiple times on the same lock. Note that for other threads to be able to acquire this lock, the owner of the lock must call unlock as many times as it called lock.

  • In the split-brain scenario, the cluster behaves as if it were two different clusters. Since two separate clusters are not aware of each other, two members from different clusters can acquire the same lock. For more information on places where split-brain syndrome can be handled, please see Split-Brain syndrome.

  • Locks are not automatically removed. If a lock is not used anymore, Hazelcast will not automatically garbage collect the lock. This can lead to an OutOfMemoryError. If you create locks on the fly, make sure they are destroyed.

  • Hazelcast IMap also provides locking support on the entry level with the method IMap.lock(key). Although the same infrastructure is used, IMap.lock(key) is not an ILock and it is not possible to expose it directly.
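The re-entrancy rule in the list above can be tried out locally. The following is only a sketch: it uses the JDK's java.util.concurrent.locks.ReentrantLock as a single-JVM stand-in, not a Hazelcast ILock, and the class and method names are made up for illustration. It shows that one unlock call after two lock calls does not free the lock.

```java
import java.util.concurrent.locks.ReentrantLock;

// Local stand-in for a re-entrant lock. No Hazelcast cluster is involved here.
final class ReentrancySketch {

    // Locks twice, unlocks once and reports whether the lock is still held.
    static boolean stillHeldAfterOneUnlock() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        lock.lock();                    // the same thread re-enters the lock
        lock.unlock();                  // one unlock is not enough to free it
        boolean held = lock.isLocked();
        lock.unlock();                  // the second unlock releases the lock
        return held;
    }

    // Calls unlock as many times as lock and reports whether the lock is free.
    static boolean freeAfterBalancedUnlocks() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        lock.lock();
        lock.unlock();
        lock.unlock();
        return !lock.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("held after one unlock: " + stillHeldAfterOneUnlock());
        System.out.println("free after two unlocks: " + freeAfterBalancedUnlocks());
    }
}
```

With an ILock, the methods isLocked() and getLockCount() listed in the Split-Brain Protection for Lock section can be used for the same kind of inspection.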

ILock vs. IMap.lock

ILock-based locks use system resources even when they are not acquired. You have to call destroy() to free them, but this can have a side effect when another thread is trying to acquire the lock. An ILock-based lock is a good choice when you have a limited number of locks. The IMap-based locks are destroyed automatically. They use no resources when they are not acquired. This implies they are invisible for monitoring unless they are being held by some thread.

7.10.5. Synchronizing Threads with ICondition

ICondition is the distributed implementation of the notify, notifyAll and wait operations on the Java object. You can use it to synchronize threads across the cluster. More specifically, you use ICondition when a thread’s work depends on another thread’s output. A good example is the producer/consumer pattern.

Please see the below code examples for a producer/consumer implementation.

Producer thread:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ILock lock = hazelcastInstance.getLock( "myLockId" ); // ILock, because newCondition( name ) is not part of java.util.concurrent.locks.Lock
ICondition condition = lock.newCondition( "myConditionId" );

lock.lock();
try {
  while ( !shouldProduce() ) {
    condition.await(); // frees the lock and waits for signal
                       // when it wakes up it re-acquires the lock
                       // if available or waits for it to become
                       // available
  }
  produce();
  condition.signalAll();
} finally {
  lock.unlock();
}

The method await() also has an overload that takes a time value and a time unit as arguments. If you specify a negative value for the time, it is interpreted as infinite.

Consumer thread:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ILock lock = hazelcastInstance.getLock( "myLockId" ); // ILock, because newCondition( name ) is not part of java.util.concurrent.locks.Lock
ICondition condition = lock.newCondition( "myConditionId" );

lock.lock();
try {
  while ( !canConsume() ) {
    condition.await(); // frees the lock and waits for signal
                       // when it wakes up it re-acquires the lock if
                       // available or waits for it to become
                       // available
  }
  consume();
  condition.signalAll();
} finally {
  lock.unlock();
}

7.10.6. Split-Brain Protection for Lock

Locks can be configured to check for a minimum number of available members before applying lock operations (see Split-Brain Protection). This is a check to avoid performing successful lock operations on all parts of a cluster during a network partition. Due to the implementation details, the check does not guarantee that the lock will fail in all conditions of a network partition and it can happen that two members can acquire the same lock.

Once the membership change has been detected, the lock operations will fail with a QuorumException if not enough members are present. In essence, this does not provide correctness but rather narrows down the window of opportunity in which locks can continue operations on several members concurrently.

Although the check does not provide correctness, it can still be useful. In cases where members acquire the lock to perform some costly but idempotent operation, configuring lock quorum can further prevent some cases where the cluster has been split into several sub-clusters and more than one member performs the same operation.

Following is a list of methods that now support quorum checks. The list is grouped by quorum type. Additionally, since Hazelcast IMap also provides locking support, certain map and multimap methods also allow quorum checks.

  • WRITE, READ_WRITE

    • Condition.await(), Condition.awaitUninterruptibly(), Condition.awaitNanos(), Condition.awaitUntil()

    • Lock.lockInterruptibly(), ILock.lock(), IMap.tryLock(), IMap.lock(), MultiMap.lock(), MultiMap.tryLock()

    • Condition.signal(), Condition.signalAll()

  • READ, READ_WRITE

    • ILock.getLockCount()

    • ILock.getRemainingLeaseTime()

    • ILock.isLocked(), IMap.isLocked(), MultiMap.isLocked()

    • ILock.forceUnlock(), IMap.forceUnlock(), MultiMap.forceUnlock(), ILock.unlock(), IMap.unlock(), MultiMap.unlock()

7.10.7. Lock Configuration

As mentioned in the Split-brain Protection for Lock section, Lock allows for split-brain protection.

An example of declarative configuration is as follows:

<lock name="myLock">
    <quorum-ref>quorum-name</quorum-ref>
</lock>

  • quorum-ref: Name of the quorum configuration that you want this lock to use.

An example of programmatic configuration is as follows:

Config config = new Config();
LockConfig lockConfig = new LockConfig();
lockConfig.setName("myLock")
        .setQuorumName("quorum-name");
config.addLockConfig(lockConfig);

Note that a quorum definition for a lock whose name, or whose name pattern, matches a map name will also force the locking actions of that map to use the defined quorum. It is important to keep this in mind when using lock quorum and map locking actions.

7.11. IAtomicLong

Hazelcast IAtomicLong is the distributed implementation of java.util.concurrent.atomic.AtomicLong. It offers most of AtomicLong’s operations such as get, set, getAndSet, compareAndSet and incrementAndGet. Since IAtomicLong is a distributed implementation, these operations involve remote calls and thus their performance differs from that of AtomicLong.

The following example code creates an instance, increments it a million times and prints the count.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IAtomicLong counter = hazelcastInstance.getAtomicLong( "counter" );
for ( int k = 0; k < 1000 * 1000; k++ ) {
    if ( k % 500000 == 0 ) {
        System.out.println( "At: " + k );
    }
    counter.incrementAndGet();
}
System.out.printf( "Count is %s\n", counter.get() );

When you start other instances with the code above, you will see the count as member count times a million.

7.11.1. Sending Functions to IAtomicLong

You can send functions to an IAtomicLong. IFunction is a Hazelcast-owned, single-method interface. The following sample IFunction implementation adds two to the original value.

private static class Add2Function implements IFunction<Long, Long> {
    @Override
    public Long apply( Long input ) {
        return input + 2;
    }
}

7.11.2. Executing Functions on IAtomicLong

You can use the following methods to execute functions on IAtomicLong.

  • apply: Applies the function to the value in IAtomicLong without changing the actual value and returns the result.

  • alter: Alters the value stored in the IAtomicLong by applying the function. It does not send back a result.

  • alterAndGet: Alters the value stored in the IAtomicLong by applying the function, stores the result in the IAtomicLong and returns the result.

  • getAndAlter: Alters the value stored in the IAtomicLong by applying the function and returns the original value.

The following sample code includes these methods.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IAtomicLong atomicLong = hazelcastInstance.getAtomicLong( "counter" );

atomicLong.set( 1 );
long result = atomicLong.apply( new Add2Function() );
System.out.println( "apply.result: " + result);
System.out.println( "apply.value: " + atomicLong.get() );

atomicLong.set( 1 );
atomicLong.alter( new Add2Function() );
System.out.println( "alter.value: " + atomicLong.get() );

atomicLong.set( 1 );
result = atomicLong.alterAndGet( new Add2Function() );
System.out.println( "alterAndGet.result: " + result );
System.out.println( "alterAndGet.value: " + atomicLong.get() );

atomicLong.set( 1 );
result = atomicLong.getAndAlter( new Add2Function() );
System.out.println( "getAndAlter.result: " + result );
System.out.println( "getAndAlter.value: " + atomicLong.get() );

The output of the above class when run is as follows:

apply.result: 3
apply.value: 1
alter.value: 3
alterAndGet.result: 3
alterAndGet.value: 3
getAndAlter.result: 1
getAndAlter.value: 3

7.11.3. Reasons to Use Functions with IAtomic

The reason for using a function instead of a simple code line like atomicLong.set(atomicLong.get() + 2); is that the read and the write are two separate IAtomicLong operations, so their combination is not atomic. Since IAtomicLong is a distributed implementation, those operations can be remote ones, and another caller can update the value between them, which may lead to race problems. By using functions, the data is not pulled into the code, but the code is sent to the data. This makes it more scalable.
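The race described above can be reproduced on a single JVM. The following is a minimal sketch, assuming the JDK's AtomicLong as a local stand-in for IAtomicLong and updateAndGet as a stand-in for alterAndGet; the class and method names are hypothetical. The unlucky interleaving of two callers is written out by hand so that the result is deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

// Local stand-in: AtomicLong plays the role of the shared counter.
final class LostUpdateSketch {

    // Two callers each run set(get() + 2). Both reads happen before
    // either write, so the second write overwrites the first one.
    static long readThenWrite() {
        AtomicLong counter = new AtomicLong(0);
        long seenByA = counter.get();
        long seenByB = counter.get();
        counter.set(seenByA + 2);
        counter.set(seenByB + 2);   // caller A's update is lost
        return counter.get();       // returns 2, not 4
    }

    // Each caller hands over the function instead. It is applied
    // atomically to the current value, so no update is lost.
    static long sendFunction() {
        AtomicLong counter = new AtomicLong(0);
        LongUnaryOperator add2 = value -> value + 2;
        counter.updateAndGet(add2);
        counter.updateAndGet(add2);
        return counter.get();       // returns 4
    }

    public static void main(String[] args) {
        System.out.println("read-then-write: " + readThenWrite());
        System.out.println("send function:   " + sendFunction());
    }
}
```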

IAtomicLong has one synchronous backup and no asynchronous backups. Its backup count is not configurable.

7.11.4. Split-Brain Protection for IAtomicLong

IAtomicLong can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful operations on all parts of a cluster during a network partition. Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • addAndGet

    • addAndGetAsync

    • alter

    • alterAndGet

    • alterAndGetAsync

    • alterAsync

    • apply

    • applyAsync

    • compareAndSet

    • compareAndSetAsync

    • decrementAndGet

    • decrementAndGetAsync

    • getAndAdd

    • getAndAddAsync

    • getAndAlter

    • getAndAlterAsync

    • getAndIncrement

    • getAndIncrementAsync

    • getAndSet

    • getAndSetAsync

    • incrementAndGet

    • incrementAndGetAsync

    • set

    • setAsync

  • READ, READ_WRITE:

    • get

    • getAsync

Configuring Split-Brain Protection

Split-Brain protection for IAtomicLong can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<atomic-long name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</atomic-long>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.12. ISemaphore

Hazelcast ISemaphore is the distributed implementation of java.util.concurrent.Semaphore.

7.12.1. Controlling Thread Counts with Permits

Semaphores offer permits to control the thread counts when performing concurrent activities. To execute a concurrent activity, a thread acquires a permit or waits until a permit becomes available. When the execution is completed, the permit is released.

An ISemaphore with a single permit may be considered as a lock. Unlike locks, when semaphores are used, any thread can release the permit, and semaphores can have multiple permits.

Hazelcast ISemaphore does not support fairness at all times. There are some edge cases where the fairness is not honored, e.g., when the permit becomes available at the time when an internal timeout occurs.

When a permit is acquired on ISemaphore:

  • if there are permits, the number of permits in the semaphore is decreased by one and the calling thread performs its activity. If there is contention, the longest waiting thread will acquire the permit before all other threads.

  • if no permits are available, the calling thread blocks until a permit becomes available. When a timeout happens during this block, the thread is interrupted.
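The permit accounting described above can be sketched with the JDK class that ISemaphore mirrors. This is an illustration only: it uses java.util.concurrent.Semaphore on a single JVM, not a Hazelcast ISemaphore, and the class and method names are made up. It uses the non-blocking tryAcquire so that the example never waits.

```java
import java.util.concurrent.Semaphore;

// Local stand-in: a JDK semaphore with three permits.
final class PermitSketch {

    // Takes all three permits, then tries to take a fourth one.
    static boolean fourthAcquireSucceeds() {
        Semaphore semaphore = new Semaphore(3);
        semaphore.tryAcquire();
        semaphore.tryAcquire();
        semaphore.tryAcquire();
        return semaphore.tryAcquire();   // no permit is left
    }

    // Takes all three permits, releases one and tries again.
    static boolean acquireSucceedsAfterRelease() {
        Semaphore semaphore = new Semaphore(3);
        semaphore.tryAcquire(3);
        semaphore.release();             // one permit becomes available
        return semaphore.tryAcquire();
    }

    public static void main(String[] args) {
        System.out.println("fourth acquire: " + fourthAcquireSucceeds());
        System.out.println("after release:  " + acquireSucceedsAfterRelease());
    }
}
```

A blocking acquire() would wait in the first case instead of returning false, which is the behavior the second bullet above describes.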

7.12.2. Example Semaphore Code

The following example code uses an IAtomicLong resource 1000 times, increments the resource when a thread starts to use it and decrements it when the thread completes.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ISemaphore semaphore = hazelcastInstance.getSemaphore( "semaphore" );
IAtomicLong resource = hazelcastInstance.getAtomicLong( "resource" );
for ( int k = 0 ; k < 1000 ; k++ ) {
    System.out.println( "At iteration: " + k + ", Active Threads: " + resource.get() );
    semaphore.acquire();
    try {
        resource.incrementAndGet();
        Thread.sleep( 1000 );
        resource.decrementAndGet();
    } finally {
        semaphore.release();
    }
}
System.out.println("Finished");

Let’s limit the concurrent access to this resource by allowing at most three threads. You can configure it declaratively by setting the initial-permits property, as shown below.

<semaphore name="semaphore">
  <initial-permits>3</initial-permits>
</semaphore>

If there is a shortage of permits while the semaphore is being created, the value of this property can be set to a negative number.

If you execute the above SemaphoreMember class 5 times, the output will be similar to the following:

At iteration: 0, Active Threads: 1
At iteration: 1, Active Threads: 2
At iteration: 2, Active Threads: 3
At iteration: 3, Active Threads: 3
At iteration: 4, Active Threads: 3

As you can see, the count of concurrent threads never exceeds three. If you remove the semaphore acquire/release statements in SemaphoreMember, you will see that there is no limitation on the number of concurrent usages.

Hazelcast also provides backup support for ISemaphore. When a member goes down, you can have another member take over the semaphore with the permit information (permits are automatically released when a member goes down). To enable this, configure synchronous or asynchronous backup with the properties backup-count and async-backup-count (by default, synchronous backup is already enabled).

7.12.3. Configuring Semaphore

The following are example semaphore configurations.

Declarative:

<semaphore name="semaphore">
   <backup-count>1</backup-count>
   <async-backup-count>0</async-backup-count>
   <initial-permits>3</initial-permits>
   <quorum-ref>quorumname</quorum-ref>
</semaphore>

Programmatic:

Config config = new Config();
SemaphoreConfig semaphoreConfig = config.getSemaphoreConfig( "semaphore" );
semaphoreConfig.setName( "semaphore" ).setBackupCount( 1 )
        .setInitialPermits( 3 )
        .setQuorumName( "quorumname" );

Semaphore configuration has the below elements.

  • initial-permits: Thread count to which the concurrent access is limited. For example, if you set it to "3", concurrent access to the object is limited to 3 threads.

  • backup-count: Number of synchronous backups.

  • async-backup-count: Number of asynchronous backups.

  • quorum-ref: Name of quorum configuration that you want this Semaphore to use. Please see the Split-Brain Protection for ISemaphore section below.

If high performance is more important than not losing the permit information, you can disable the backups by setting backup-count to 0.

7.12.4. Split-Brain Protection for ISemaphore

ISemaphore can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful semaphore operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • acquire

    • drainPermits

    • init

    • reducePermits

    • release

    • tryAcquire

  • READ, READ_WRITE:

    • availablePermits

Configuring Split-Brain Protection

Split-Brain protection for ISemaphore can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<semaphore name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</semaphore>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.13. IAtomicReference

The IAtomicLong is very useful if you need to deal with a long, but in some cases you need to deal with a reference. That is why Hazelcast also supports the IAtomicReference which is the distributed version of the java.util.concurrent.atomic.AtomicReference.

Here is an IAtomicReference example.

Config config = new Config();

HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

IAtomicReference<String> ref = hz.getAtomicReference("reference");
ref.set("foo");
System.out.println(ref.get());
System.exit(0);

When you execute the above example, you will see the following output.

foo

7.13.1. Sending Functions to IAtomicReference

Just like IAtomicLong, IAtomicReference has methods that accept a 'function' as an argument, such as alter, alterAndGet, getAndAlter and apply. There are two big advantages of using these methods:

  • From a performance point of view, it is better to send the function to the data than the data to the function. Often the function is a lot smaller than the data and therefore cheaper to send over the line. Also, the function only needs to be transferred once to the target machine, whereas the data would need to be transferred twice.

  • You do not need to deal with concurrency control. If you performed a load, transform and store sequence yourself, you could run into a data race since another thread might have updated the value you are about to overwrite.

7.13.2. Using IAtomicReference

Below are some considerations you need to know when you use IAtomicReference.

  • IAtomicReference works based on the byte-content and not on the object-reference. If you use the compareAndSet method, do not change the original value because its serialized content will then be different. It is also important to know that if you rely on Java serialization, sometimes (especially with hashmaps) the same object can result in different binary content.

  • IAtomicReference will always have one synchronous backup.

  • All methods returning an object will return a private copy. You can modify the private copy, but the rest of the world will be shielded from your changes. If you want these changes to be visible to the rest of the world, you need to write the change back to the IAtomicReference; but be careful about introducing a data-race.

  • The 'in-memory format' of an IAtomicReference is binary. The receiving side does not need to have the class definition available unless it needs to be deserialized on the other side, e.g., because a method like 'alter' is executed. This deserialization is done for every call that needs to have the object instead of the binary content, so be careful with expensive object graphs that need to be deserialized.

  • If you have an object with many fields or an object graph and you only need to calculate some information or need a subset of fields, you can use the apply method. With the apply method, the whole object does not need to be sent over the line; only the information that is relevant is sent.
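The first consideration above, that equal objects can have different binary content, can be observed with plain Java serialization. The following is a sketch under stated assumptions: it uses java.io.ObjectOutputStream directly rather than Hazelcast's serialization service, the class and method names are hypothetical, and it relies on HashMap writing its bucket capacity into the serialized form, as the JDK implementation does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Compares two equal maps by equals() and by their serialized bytes.
final class ByteContentSketch {

    // Serializes a value with standard Java serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Builds a map with one entry and the given initial capacity.
    static Map<String, Integer> mapWithCapacity(int capacity) {
        Map<String, Integer> map = new HashMap<>(capacity);
        map.put("a", 1);
        return map;
    }

    // The two maps hold the same entry, so equals() reports them as equal.
    static boolean equalAsObjects() {
        return mapWithCapacity(16).equals(mapWithCapacity(64));
    }

    // Their serialized forms differ because the capacity is written out too.
    static boolean equalAsBytes() {
        return Arrays.equals(serialize(mapWithCapacity(16)),
                             serialize(mapWithCapacity(64)));
    }

    public static void main(String[] args) {
        System.out.println("equal as objects: " + equalAsObjects());
        System.out.println("equal as bytes:   " + equalAsBytes());
    }
}
```

A compareAndSet that compares byte content would treat these two maps as different values, even though equals() considers them the same.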

7.13.3. Split-Brain Protection for IAtomicReference

IAtomicReference can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • alter

    • alterAndGet

    • alterAndGetAsync

    • alterAsync

    • apply

    • applyAsync

    • clear

    • clearAsync

    • compareAndSet

    • compareAndSetAsync

    • getAndAlter

    • getAndAlterAsync

    • getAndSet

    • getAndSetAsync

    • set

    • setAndGet

    • setAsync

  • READ, READ_WRITE:

    • contains

    • containsAsync

    • get

    • getAsync

    • isNull

    • isNullAsync

Configuring Split-Brain Protection

Split-Brain protection for IAtomicReference can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<atomic-reference name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</atomic-reference>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.14. ICountDownLatch

Hazelcast ICountDownLatch is the distributed implementation of java.util.concurrent.CountDownLatch.

7.14.1. Gate-Keeping Concurrent Activities

CountDownLatch is considered to be a gate keeper for concurrent activities. It enables the threads to wait for other threads to complete their operations.

The following code samples describe the mechanism of ICountDownLatch. Assume that there is a leader process and there are follower processes that will wait until the leader completes. Here is the leader:

public class Leader {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
        ICountDownLatch latch = hazelcastInstance.getCountDownLatch( "countDownLatch" );
        System.out.println( "Starting" );
        latch.trySetCount( 1 );
        Thread.sleep( 30000 );
        latch.countDown();
        System.out.println( "Leader finished" );
        latch.destroy();
    }
}

Since this sample has only a single step to complete, the above code initializes the latch with 1. Then, the code sleeps for a while to simulate a process and counts down the latch. Finally, it destroys the latch. Let’s write a follower:

public class Follower {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
        ICountDownLatch latch = hazelcastInstance.getCountDownLatch( "countDownLatch" );
        System.out.println( "Waiting" );
        boolean success = latch.await( 10, TimeUnit.SECONDS );
        System.out.println( "Complete: " + success );
    }
}

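The follower class above first retrieves the ICountDownLatch and then calls the await method, which blocks the thread until the latch count reaches zero. The method await takes a timeout value as a parameter. This is useful when the countDown method fails. Note that await returns false if the timeout elapses first: with the values above (a 30-second sleep in the leader and a 10-second timeout in the follower), a follower prints Complete: true only if it is started during the last 10 seconds before the leader counts down. To see ICountDownLatch in action, start the leader first and then start one or more followers. You will see that the followers wait until the leader completes or until their timeout elapses.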

7.14.2. Recovering From Failure

In a distributed environment, the cluster member that is counting down may go down. In this case, all listeners are notified immediately and automatically by Hazelcast. The state of the current process just before the failure should be verified and 'how to continue now' should be decided, e.g., restart all process operations, continue with the first failed process operation or throw an exception.

7.14.3. Using ICountDownLatch

Although the ICountDownLatch is a very useful synchronization aid, you will probably not use it on a daily basis. Unlike Java’s implementation, Hazelcast’s ICountDownLatch count can be reset after a countdown has finished, but not during an active count.

ICountDownLatch has 1 synchronous backup and no asynchronous backups. Its backup count is not configurable.

7.14.4. Split-Brain Protection for ICountDownLatch

ICountDownLatch can be configured to check for a minimum number of available members before applying ICountDownLatch operations (see Split-Brain Protection). This is a check to avoid performing successful ICountDownLatch operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • countDown

    • trySetCount

  • READ, READ_WRITE:

    • await

    • getCount

Configuring Split-Brain Protection

Split-Brain protection for ICountDownLatch can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<count-down-latch name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</count-down-latch>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.15. PN Counter

A Conflict-free Replicated Data Type (CRDT) is a distributed data structure that achieves high availability by relaxing consistency constraints. There may be several replicas for the same data and these replicas can be modified concurrently without coordination. This means that you may achieve high throughput and low latency when updating a CRDT data structure. On the other hand, all of the updates are replicated asynchronously. Each replica will then receive updates made on other replicas eventually and if no new updates are done, all replicas which can communicate to each other will return the same state (converge) after some time.

Hazelcast offers a lightweight CRDT PN counter (Positive-Negative Counter) implementation where each Hazelcast instance can increment and decrement the counter value and these updates are propagated to all replicas. Only a Hazelcast member can store state for a counter which means that counter method invocations performed on a Hazelcast member are usually local (depending on the configured replica count). If there is no member failure, it is guaranteed that each replica sees the final value of the counter eventually. Counter’s state converges with each update and all CRDT replicas that can communicate to each other will eventually have the same state.

Using the PN Counter, you can get a distributed counter, increment and decrement it, and query its value with RYW (read-your-writes) and monotonic reads. The implementation borrows most methods from the AtomicLong which should be familiar in most cases and easily interchangeable in the existing code.

Some example use cases for a PN counter are:

  • counting the number of "likes" or "+1"

  • counting the number of logged in users

  • counting the number of page hits/views

How it works

The counter supports adding and subtracting values as well as retrieving the current counter value. Each replica of this counter can perform operations locally without coordination with the other replicas, thus increasing availability. The counter guarantees that whenever two members have received the same set of updates, possibly in a different order, their state is identical, and any conflicting updates are merged automatically. If no new updates are made to the shared state, all members that can communicate will eventually have the same data.
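The convergence property described above can be illustrated with a textbook state-based PN counter. The sketch below is not Hazelcast's implementation and does not use the PNCounter API; the class, its fields and the replica names "A" and "B" are assumptions made for illustration. Each replica tracks the increments and decrements made on every replica, and a merge keeps the per-replica maximum, so merges can arrive in any order and more than once.

```java
import java.util.HashMap;
import java.util.Map;

// Textbook state-based PN counter, kept deliberately small.
final class PNCounterSketch {
    private final String replicaId;
    private final Map<String, Long> increments = new HashMap<>();
    private final Map<String, Long> decrements = new HashMap<>();

    PNCounterSketch(String replicaId) {
        this.replicaId = replicaId;
    }

    // Applies an update locally, without contacting other replicas.
    void add(long delta) {
        if (delta >= 0) {
            increments.merge(replicaId, delta, Long::sum);
        } else {
            decrements.merge(replicaId, -delta, Long::sum);
        }
    }

    // Current value: all known increments minus all known decrements.
    long get() {
        long plus = increments.values().stream().mapToLong(Long::longValue).sum();
        long minus = decrements.values().stream().mapToLong(Long::longValue).sum();
        return plus - minus;
    }

    // Merges another replica's state by taking the per-replica maximum.
    void merge(PNCounterSketch other) {
        other.increments.forEach((id, v) -> increments.merge(id, v, Long::max));
        other.decrements.forEach((id, v) -> decrements.merge(id, v, Long::max));
    }

    // Two replicas update independently, then exchange their states.
    static long[] demo() {
        PNCounterSketch a = new PNCounterSketch("A");
        PNCounterSketch b = new PNCounterSketch("B");
        a.add(5);
        b.add(-2);
        b.add(3);
        a.merge(b);
        b.merge(a);
        a.merge(b);   // a repeated merge does not change the result
        return new long[] { a.get(), b.get() };
    }

    public static void main(String[] args) {
        long[] values = demo();
        System.out.println("replica A: " + values[0] + ", replica B: " + values[1]);
    }
}
```

Both replicas end at 5 + 3 - 2 = 6 once each has seen the other's state.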

The updates to the counter are applied locally when invoked on a CRDT replica. A CRDT replica can be any Hazelcast instance which is NOT a client or a lite member. You can configure the number of replicas in the cluster using the replica-count configuration element.

When invoking updates from a non-replica instance, the invocation is remote. This may lead to indeterminate state - the update may be applied but the response has not been received. In this case, the caller will be notified with a TargetDisconnectedException when invoked from a client or a MemberLeftException when invoked from a member.

The read and write methods provide monotonic read and RYW (read-your-writes) guarantees. These guarantees are session guarantees, which means that if no replica with the previously observed state is reachable, the session guarantees are lost and the method invocation will throw a ConsistencyLostException. This does not mean that an update is lost. All of the updates are part of some replica and will eventually be reflected in the state of all other replicas. This exception just means that you cannot observe your own writes because all replicas that contain your updates are currently unreachable. After you have received a ConsistencyLostException, you can either wait for a sufficiently up-to-date replica to become reachable, in which case the session can be continued, or you can reset the session by calling the method reset(). If you have called this method, a new session is started with the next invocation to a CRDT replica.

The CRDT state is kept entirely on non-lite (data) members. If there aren’t any and the methods here are invoked on a lite member, they fail with a NoDataMemberInClusterException.

The following is an example.

final HazelcastInstance instance = Hazelcast.newHazelcastInstance();
final PNCounter counter = instance.getPNCounter("counter");
counter.addAndGet(5);
final long value = counter.get();

This code snippet creates an instance of a PN counter, increments it by 5 and retrieves the value.

7.15.1. Configuring PN Counter

Following is an example declarative configuration snippet:

<hazelcast>
  <pn-counter name="default">
    <replica-count>10</replica-count>
    <statistics-enabled>true</statistics-enabled>
  </pn-counter>
</hazelcast>

  • name: Name of your PN Counter.

  • replica-count: Number of replicas on which the state for this PN counter will be kept. This number applies in a quiescent state; if there are currently membership changes or clusters are merging, the state may be temporarily kept on more replicas. Its default value is Integer.MAX_VALUE. Generally, keeping the state on more replicas means that more Hazelcast members will be able to perform updates locally, but it also increases the network traffic, decreases the speed at which replica states converge and increases the size of the PN counter state kept on each replica.

  • statistics-enabled: Specifies whether the statistics gathering will be enabled for your PN Counter. If set to false, you cannot collect statistics in your implementation and also Hazelcast Management Center will not show them. Its default value is true.

Following is an equivalent snippet of Java configuration:

PNCounterConfig pnCounterConfig = new PNCounterConfig("default")
        .setReplicaCount(10)
        .setStatisticsEnabled(true);
Config hazelcastConfig = new Config()
        .addPNCounterConfig(pnCounterConfig);

7.15.2. Configuring the CRDT replication mechanism

Configuring the replication mechanism is for advanced use cases only. The default configuration works well in most cases.

In some cases, you may want to configure the replication mechanism for all CRDT implementations. The CRDT states are replicated in rounds (the period is configurable) and in each round the state is replicated up to the configured number of members. Generally speaking, you may increase the speed at which replicas converge at the expense of more network traffic or decrease the network traffic at the expense of slower convergence of replicas. Hazelcast implements the state-based replication mechanism - the CRDT state for changed CRDTs is replicated in its entirety to other replicas on each replication round.

<hazelcast>
  <crdt-replication>
      <max-concurrent-replication-targets>1</max-concurrent-replication-targets>
      <replication-period-millis>1000</replication-period-millis>
  </crdt-replication>
</hazelcast>
  • max-concurrent-replication-targets: The maximum number of target members that we replicate the CRDT states to in one period. A higher count will lead to states being disseminated more rapidly at the expense of burst-like behavior - one update to a CRDT will lead to a sudden burst in the number of replication messages in a short time interval. Its default value is 1 which means that each replica will replicate state to only one other replica in each replication round.

  • replication-period-millis: The period between two replications of CRDT states in milliseconds. A lower value will increase the speed at which changes are disseminated to other cluster members at the expense of burst-like behavior - fewer updates will be batched together in one replication message and one update to a CRDT may cause a sudden burst of replication messages in a short time interval. The value must be a positive integer. Its default value is 1000 milliseconds which means that the changed CRDT state is replicated every 1 second.

Following is an equivalent snippet of Java configuration:

final CRDTReplicationConfig crdtReplicationConfig = new CRDTReplicationConfig()
        .setMaxConcurrentReplicationTargets(1)
        .setReplicationPeriodMillis(1000);
Config hazelcastConfig = new Config()
        .setCRDTReplicationConfig(crdtReplicationConfig);

7.16. IdGenerator

Hazelcast IdGenerator is used to generate cluster-wide unique identifiers. Generated identifiers are long type primitive values between 0 and Long.MAX_VALUE.

This feature is deprecated. The implementation can produce duplicate IDs in case of a network split, even with split-brain protection enabled (during the short window while the split-brain is detected). Please use FlakeIdGenerator for an alternative implementation which does not suffer from this issue. Also see the migration guide at the end of this section.

7.16.1. Generating Cluster-Wide IDs

ID generation occurs almost at the speed of AtomicLong.incrementAndGet(). A group of 10,000 identifiers is allocated for each cluster member. In the background, this allocation takes place with an IAtomicLong incremented by 10,000. Once the allocation is done, IdGenerator increments a local counter to generate IDs. If a cluster member uses all IDs in the group, it gets another 10,000 IDs. This way, network traffic is needed only once per group, meaning that the remaining 9,999 identifiers are generated in memory instead of over the network.
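The block allocation described above can be sketched without a cluster. In the following illustration, a plain java.util.concurrent.atomic.AtomicLong stands in for the cluster-wide IAtomicLong, and the class BlockIdGeneratorSketch is a hypothetical name. It is a sketch of the idea under these assumptions, not the actual IdGenerator code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative block allocation; AtomicLong stands in for the cluster-wide IAtomicLong.
final class BlockIdGeneratorSketch {
    static final long BLOCK_SIZE = 10_000;

    private final AtomicLong clusterCounter; // shared by all simulated members
    private long next;                        // next ID to hand out locally
    private long limit;                       // exclusive end of the current block

    BlockIdGeneratorSketch(AtomicLong clusterCounter) {
        this.clusterCounter = clusterCounter;
    }

    synchronized long newId() {
        if (next == limit) {
            // The only step that would need network traffic in a real cluster.
            next = clusterCounter.getAndAdd(BLOCK_SIZE);
            limit = next + BLOCK_SIZE;
        }
        // All other IDs of the block are produced from local state.
        return next++;
    }
}
```

Two generators sharing one counter receive disjoint blocks, so their IDs never collide as long as the shared counter is consistent. The exact starting value of each block in Hazelcast may differ from this sketch.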

Let’s write a sample identifier generator.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IdGenerator idGen = hazelcastInstance.getIdGenerator( "newId" );
while (true) {
    Long id = idGen.newId();
    System.err.println( "Id: " + id );
    Thread.sleep( 1000 );
}

Let’s run the above code two times. The output will be similar to the following.

Members [1] {
  Member [127.0.0.1]:5701 this
}
Id: 1
Id: 2
Id: 3
Members [2] {
  Member [127.0.0.1]:5701
  Member [127.0.0.1]:5702 this
}
Id: 10001
Id: 10002
Id: 10003

7.16.2. Unique IDs and Duplicate IDs

You can see that the generated IDs are unique and counting upwards. If you see duplicated identifiers, it means your instances could not form a cluster.

Generated IDs are unique during the life cycle of the cluster. If the entire cluster is restarted, IDs start from 0 again, or you can initialize them to a value using the init() method of IdGenerator.
IdGenerator has one synchronous backup and no asynchronous backups. Its backup count is not configurable.

7.16.3. Migrating to FlakeIdGenerator

The Flake ID generator provides similar features with more safety guarantees during network splits. The two generators are completely different implementations, but both generate roughly ordered IDs. To ensure uniqueness of the generated IDs, the Flake ID generator must start at least where the old generator ended. This is likely already the case, because the values from the Flake ID generator are quite large compared to the values from the old generator. The steps you need to take are as follows:

  • Make sure the version of your Hazelcast cluster and of all clients is at least 3.10.

  • If the current ID from the old IdGenerator is higher than the ID from FlakeIdGenerator, you need to configure an ID offset. See FlakeIdMigrationSample for more details.

  • Replace all calls to HazelcastInstance.getIdGenerator() with HazelcastInstance.getFlakeIdGenerator(). If you use Spring configuration, replace <id-generator> with <flake-id-generator>.

7.17. FlakeIdGenerator

Hazelcast Flake ID Generator is used to generate cluster-wide unique identifiers. Generated identifiers are long primitive values and are k-ordered (roughly ordered). IDs are in the range from 0 to Long.MAX_VALUE.

7.17.1. Generating Cluster-Wide IDs

The IDs contain a timestamp component and a node ID component, which is assigned when the member joins the cluster. This allows the IDs to be ordered and unique without any coordination between the members, which makes the generator safe even in split-brain scenarios (for limitations in this case, please see the Node ID assignment section below).

The timestamp component is in milliseconds since 1.1.2018, 0:00 UTC and has 41 bits. This caps the useful lifespan of the generator to a little less than 70 years (until ~2088). The sequence component has 6 bits. If more than 64 IDs are requested in a single millisecond, IDs gracefully overflow to the next millisecond, and uniqueness is guaranteed in this case. The implementation does not allow overflowing by more than 15 seconds; if IDs are requested at a higher rate, the call blocks. Note, however, that clients are able to generate IDs even faster because each call goes to a different (random) member and the 64 IDs/ms limit applies to a single member.

7.17.2. Performance

An operation on a member is always local if the member has a valid node ID; otherwise it is remote. On a client, the newId() method goes to a random member and gets a batch of IDs, which are then returned locally for a limited time. The pre-fetch size and the validity time can be configured for each client and member.

7.17.3. Example

Let’s write a sample identifier generator.

public class FlakeIdGeneratorSample {
    public static void main(String[] args) {
        HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance();

        ClientConfig clientConfig = new ClientConfig()
                .addFlakeIdGeneratorConfig(new FlakeIdGeneratorConfig("idGenerator")
                        .setPrefetchCount(10));
        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);

        FlakeIdGenerator idGenerator = client.getFlakeIdGenerator("idGenerator");
        for (int i = 0; i < 10000; i++) {
            sleepSeconds(1);
            System.out.printf("Id: %s\n", idGenerator.newId());
        }
    }
}

7.17.4. Node ID Assignment

Flake IDs require a unique node ID to be assigned to each member, from which point the member can generate unique IDs without any coordination. Hazelcast uses the member list version from the moment when the member joined the cluster as a unique node ID.

The join algorithm is specifically designed to ensure that member list join version is unique for each member in the cluster. This ensures that IDs are unique even during network splits, with one caveat: at most one member is allowed to join the cluster during a network split. If two members join different subclusters, they are likely to get the same node ID. This will be resolved when the cluster heals, but until then, they can generate duplicate IDs.

Node ID Overflow

Node ID component of the ID has 16 bits. Members with the member list join version higher than 2^16 won’t be able to generate IDs, but functionality will be preserved by forwarding to another member. It is possible to generate IDs on any member or client as long as there is at least one member with join version smaller than 2^16 in the cluster. The remedy is to restart the cluster: the node ID component will be reset and assigned starting from zero again. Uniqueness after the restart will be preserved thanks to the timestamp component.
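The bit widths given in this section (41-bit timestamp, 6-bit sequence, 16-bit node ID) add up to 63 bits, which fits into a non-negative long. The following sketch shows one way such an ID could be composed and decomposed. The order of the components within the long and the class name FlakeIdLayoutSketch are assumptions for illustration; only the bit widths and the epoch come from the text above.

```java
// Illustrative bit layout; the position of each field within the long is an assumption.
final class FlakeIdLayoutSketch {
    static final int BITS_TIMESTAMP = 41;
    static final int BITS_SEQUENCE = 6;
    static final int BITS_NODE_ID = 16;
    // 1.1.2018, 0:00 UTC expressed in milliseconds since the Unix epoch.
    static final long EPOCH_START = 1_514_764_800_000L;

    // Packs the three components; timestamp is relative to EPOCH_START.
    static long compose(long millisSinceEpochStart, int sequence, int nodeId) {
        return (millisSinceEpochStart << (BITS_SEQUENCE + BITS_NODE_ID))
                | ((long) sequence << BITS_NODE_ID)
                | nodeId;
    }

    static long timestampOf(long id) {
        return id >>> (BITS_SEQUENCE + BITS_NODE_ID);
    }

    static int sequenceOf(long id) {
        return (int) ((id >>> BITS_NODE_ID) & ((1 << BITS_SEQUENCE) - 1));
    }

    static int nodeIdOf(long id) {
        return (int) (id & ((1 << BITS_NODE_ID) - 1));
    }

    // Years until a 41-bit millisecond timestamp is exhausted.
    static double lifespanYears() {
        return (double) (1L << BITS_TIMESTAMP) / (1000.0 * 60 * 60 * 24 * 365.25);
    }
}
```

With the timestamp in the most significant bits, IDs created in a later millisecond compare greater than earlier ones, which gives the rough ordering. Two members with the same node ID could produce identical values, which is the duplicate scenario described for network splits.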

7.18. Replicated Map

A Replicated Map is a distributed key-value data structure where the data is replicated to all members in the cluster. It provides full replication of entries to all members for high speed access. The following are its features:

  • When you have a Replicated Map in the cluster, your clients can communicate with any cluster member.

  • All cluster members are able to perform write operations.

  • It supports all methods of the interface java.util.Map.

  • It supports automatic initial fill up when a new member is started.

  • It provides statistics for entry access, write and update so that you can monitor it using Hazelcast Management Center.

  • New members joining the cluster pull all the data from the existing members.

  • You can listen to entry events using listeners. Please refer to Using EntryListener on Replicated Map.

7.18.1. Replicating Instead of Partitioning

A Replicated Map does not partition data (it does not spread data to different cluster members); instead, it replicates the data to all members. All other data structures are partitioned by design.

Replication leads to higher memory consumption. However, a Replicated Map has faster read and write access since the data is available on all members.

To preserve write order, a write may be applied on the local member or on a remote member; it is eventually replicated to all other members.

Replicated Map is suitable for objects, catalog data, or idempotent calculable data (such as HTML pages). It fully implements the java.util.Map interface, but it lacks the methods from java.util.concurrent.ConcurrentMap since there are no atomic guarantees to writes or reads.

If Replicated Map is used from a dummy client and this dummy client is connected to a lite member, the entry listeners cannot be registered/de-registered.
You cannot use Replicated Map from a lite member. A com.hazelcast.replicatedmap.ReplicatedMapCantBeCreatedOnLiteMemberException is thrown if com.hazelcast.core.HazelcastInstance.getReplicatedMap(name) is invoked on a lite member.

7.18.2. Example Replicated Map Code

Here is an example of Replicated Map code. The HazelcastInstance’s getReplicatedMap method gets the Replicated Map, and the Replicated Map’s put method creates map entries.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Map<String, String> map = hz.getReplicatedMap("map");

map.put("1", "Tokyo");
map.put("2", "Paris");
map.put("3", "New York");

System.out.println("Finished loading map");
hz.shutdown();

HazelcastInstance.getReplicatedMap() returns com.hazelcast.core.ReplicatedMap which, as stated above, extends the java.util.Map interface.

The com.hazelcast.core.ReplicatedMap interface has some additional methods for registering entry listeners or retrieving values in an expected order.

7.18.3. Considerations for Replicated Map

If you have a large cluster or very high occurrences of updates, the Replicated Map may not scale linearly as expected since it has to replicate update operations to all members in the cluster.

Since the replication of updates is performed in an asynchronous manner, we recommend you enable back pressure in case your system has high occurrences of updates. Please refer to the Back Pressure section to learn how to enable it.

Replicated Map has an anti-entropy system that will converge values to a common one if some of the members are missing replication updates.

Replicated Map does not guarantee eventual consistency because there are some edge cases that fail to provide consistency.

Replicated Map uses the internal partition system of Hazelcast in order to serialize updates happening on the same key at the same time. This happens by sending updates of the same key to the same Hazelcast member in the cluster.

Due to the asynchronous nature of replication, a Hazelcast member could die before successfully replicating a "write" operation to other members after sending the "write completed" response to its caller during the write process. In this scenario, Hazelcast’s internal partition system will promote one of the replicas of the partition as the primary one. The new primary partition will not have the latest "write" since the dead member could not successfully replicate the update. (This will leave the system in a state that the caller is the only one that has the update and the rest of the cluster have not.) In this case even the anti-entropy system simply could not converge the value since the source of true information is lost for the update. This leads to a break in the eventual consistency because different values can be read from the system for the same key.

Other than the aforementioned scenario, the Replicated Map will behave like an eventually consistent system with read-your-writes and monotonic-reads consistency.

7.18.4. Configuration Design for Replicated Map

There are several technical design decisions you should consider when you configure a Replicated Map.

Initial Provisioning

If a new member joins the cluster, there are two ways you can handle the initial provisioning that is executed to replicate all existing values to the new member. Each involves how you configure the async fill up.

First, you can configure async fill up to true, which does not block reads while the fill up operation is underway. That way, you have immediate access on the new member, but it will take time until all the values are eventually accessible. Not yet replicated values are returned as non-existing (null).

Second, you can configure for a synchronous initial fill up (by configuring the async fill up to false), which blocks every read or write access to the map until the fill up operation is finished. Use this with caution since it might block your application from operating.

7.18.5. Configuring Replicated Map

Replicated Map can be configured programmatically or declaratively.

Declarative Configuration:

You can declare your Replicated Map configuration in the Hazelcast configuration file hazelcast.xml. Please see the following example.

<replicatedmap name="default">
  <in-memory-format>BINARY</in-memory-format>
  <async-fillup>true</async-fillup>
  <statistics-enabled>true</statistics-enabled>
  <entry-listeners>
    <entry-listener include-value="true">
      com.hazelcast.examples.EntryListener
    </entry-listener>
  </entry-listeners>
  <quorum-ref>quorumname</quorum-ref>
</replicatedmap>
  • in-memory-format: Internal storage format. Please see the In-Memory Format section. Its default value is OBJECT.

  • async-fillup: Specifies whether the Replicated Map is available for reads before the initial replication is completed. Its default value is true. If set to true, no exception is thrown when the Replicated Map is not yet ready, but null values can be seen until the initial replication is completed. If set to false (i.e., synchronous initial fill up), read and write access to the map is blocked until the fill up operation is finished.

  • statistics-enabled: If set to true, the statistics such as cache hits and misses are collected. Its default value is true.

  • entry-listener: Full canonical classname of the EntryListener implementation.

    • entry-listener#include-value: Specifies whether the event includes the value or not. Sometimes the key is enough to react on an event. In those situations, setting this value to false will save a deserialization cycle. Its default value is true.

    • entry-listener#local: Not used for Replicated Map since listeners are always local.

  • quorum-ref: Name of quorum configuration that you want this Replicated Map to use. Please see the Split-Brain Protection for Replicated Map section.

Programmatic Configuration:

You can configure a Replicated Map programmatically, as you can do for all other data structures in Hazelcast. You must create the configuration upfront, when you instantiate the HazelcastInstance. A basic example of how to configure the Replicated Map using the programmatic approach is shown in the following snippet.

Config config = new Config();

ReplicatedMapConfig replicatedMapConfig =
        config.getReplicatedMapConfig( "default" );

replicatedMapConfig.setInMemoryFormat( InMemoryFormat.BINARY )
        .setQuorumName( "quorumname" );

All properties that can be configured using the declarative configuration are also available using programmatic configuration by transforming the tag names into getter or setter names.

In-Memory Format on Replicated Map

Currently, two in-memory-format values are usable with the Replicated Map.

  • OBJECT (default): The data will be stored in deserialized form. This configuration is the default choice since the data replication is mostly used for high speed access. Please be aware that changing the values without a Map.put() is not reflected on the other members but is visible on the changing members for later value accesses.

  • BINARY: The data is stored in serialized binary format and has to be deserialized on every request. This option offers higher encapsulation since changes to values are always discarded unless the changed object is explicitly put into the map again using Map.put().
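The difference between the two formats can be illustrated without Hazelcast. In the following sketch, ObjectStore keeps a reference to the value and BinaryStore keeps serialized bytes. Both classes are hypothetical, and plain Java serialization is used only as a stand-in for Hazelcast's own serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustrative stand-ins for the two in-memory formats; not Hazelcast classes.
final class InMemoryFormatSketch {

    // OBJECT-like: keeps the reference, so callers share the stored instance.
    static final class ObjectStore<T> {
        private T value;

        void put(T newValue) {
            value = newValue;
        }

        T get() {
            return value;
        }
    }

    // BINARY-like: keeps serialized bytes and returns a fresh copy on every get.
    static final class BinaryStore<T extends Serializable> {
        private byte[] bytes;

        void put(T newValue) {
            try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                 ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(newValue);
                out.flush();
                bytes = buffer.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @SuppressWarnings("unchecked")
        T get() {
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return (T) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

Mutating the object returned by ObjectStore.get() changes what later reads on the same store see, while mutating the copy returned by BinaryStore.get() has no effect until the value is put again.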

7.18.6. Using EntryListener on Replicated Map

A com.hazelcast.core.EntryListener used on a Replicated Map serves the same purpose as it would on other data structures in Hazelcast. You can use it to react on add, update and remove operations. Replicated Maps do not yet support eviction.

Difference in EntryListener on Replicated Map

The fundamental difference in Replicated Map behavior, compared to the other data structures, is that an EntryListener only reflects changes on local data. Since replication is asynchronous, all listener events are fired only when an operation is finished on a local member. Events can fire at different times on different members.

Example of Replicated Map EntryListener

Here is a code example for using EntryListener on a Replicated Map.

The HazelcastInstance’s getReplicatedMap method gets a Replicated Map (somemap), and the ReplicatedMap’s addEntryListener method adds an entry listener to it. Once the listener is registered, put operations that add or update an entry and remove operations on the map fire the corresponding listener methods.

import com.hazelcast.core.EntryEvent;
import com.hazelcast.core.EntryListener;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.MapEvent;
import com.hazelcast.core.ReplicatedMap;

// The outer class name is illustrative.
public class ListeningMember {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ReplicatedMap<String, String> map = hz.getReplicatedMap("somemap");
        map.addEntryListener(new MyEntryListener());
        System.out.println("EntryListener registered");
    }

    // Prints every event delivered to this member.
    private static class MyEntryListener implements EntryListener<String, String> {

        @Override
        public void entryAdded(EntryEvent<String, String> event) {
            System.out.println("entryAdded: " + event);
        }

        @Override
        public void entryRemoved(EntryEvent<String, String> event) {
            System.out.println("entryRemoved: " + event);
        }

        @Override
        public void entryUpdated(EntryEvent<String, String> event) {
            System.out.println("entryUpdated: " + event);
        }

        @Override
        public void entryEvicted(EntryEvent<String, String> event) {
            System.out.println("entryEvicted: " + event);
        }

        @Override
        public void mapEvicted(MapEvent event) {
            System.out.println("mapEvicted: " + event);
        }

        @Override
        public void mapCleared(MapEvent event) {
            System.out.println("mapCleared: " + event);
        }
    }
}

7.18.7. Split-Brain Protection for Replicated Map

Replicated Map can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful Replicated Map operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • clear

    • put

    • putAll

    • remove

  • READ, READ_WRITE:

    • containsKey

    • containsValue

    • entrySet

    • get

    • isEmpty

    • keySet

    • size

    • values

Configuring Split-Brain Protection

Split-Brain protection for Replicated Map can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<replicatedmap name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</replicatedmap>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

7.19. Cardinality Estimator Service

Hazelcast’s cardinality estimator service is a data structure which implements Flajolet’s HyperLogLog algorithm for estimating cardinalities of unique objects in theoretically huge data sets. The implementation offered by Hazelcast includes improvements from Google’s version of the algorithm, i.e., HyperLogLog++.

The cardinality estimator service does not provide any ways to configure its properties, but rather uses some well tested defaults.

  • P: Precision - 14, using the 14 LSB of the hash for the index.

  • M: 2 ^ P = 16384 (16K) registers

  • P': Sparse Precision - 25

  • Durability: How many backups for each estimator, default 2

It is important to understand that this data structure is not 100% accurate; it is used to provide estimates. The error rate is typically 1.04/sqrt(M), which in our implementation is around 0.81% for high percentiles.
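The arithmetic behind these numbers can be checked with a few lines of Java. The class and method names below are illustrative only; the formula and the precision value are the ones given above.

```java
// Illustrative arithmetic for the defaults listed above.
final class HyperLogLogErrorSketch {

    // M = 2 ^ P registers.
    static int registers(int precision) {
        return 1 << precision;
    }

    // Typical relative error, 1.04 / sqrt(M).
    static double standardError(int precision) {
        return 1.04 / Math.sqrt(registers(precision));
    }

    public static void main(String[] args) {
        int p = 14;
        System.out.printf("M = %d, error = %.2f%%%n",
                registers(p), 100 * standardError(p));
    }
}
```

For P = 14, sqrt(M) is exactly 128, so the error evaluates to 1.04/128 = 0.008125, i.e., roughly 0.81%.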

The memory consumption of this data structure is close to 16K regardless of the size of the source data set or stream.

There are two phases in using the cardinality estimator.

  1. Add objects to the instance of the estimator, e.g., for IPs estimator.add("0.0.0.0"). The provided object is first serialized and then the byte array is used to generate a hash for that object.

    Objects must be serializable in a form that Hazelcast understands.

  2. Compute the estimate of the set so far using estimator.estimate().

Please see the cardinality estimator Javadoc for more information on its API.

The following is an example.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
CardinalityEstimator visitorsEstimator = hz.getCardinalityEstimator("visitors");

InputStreamReader isr = new InputStreamReader(CardinalityEstimatorSample.class.getResourceAsStream("visitors.txt"));
BufferedReader br = new BufferedReader(isr);
try {
    String visitor = br.readLine();
    while (visitor != null) {
        visitorsEstimator.add(visitor);
        visitor = br.readLine();
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    closeResource(br);
    closeResource(isr);
}

System.out.printf("Estimated unique visitors seen so far: %d%n", visitorsEstimator.estimate());

Hazelcast.shutdownAll();

7.19.1. Split-Brain Protection for Cardinality Estimator

Cardinality Estimator can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful Cardinality Estimator operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • add

    • addAsync

  • READ, READ_WRITE:

    • estimate

    • estimateAsync

Configuring Split-Brain Protection

Split-Brain protection for Cardinality Estimator can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<cardinality-estimator name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</cardinality-estimator>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

Configuring Merge Policy

While recovering from a Split-Brain syndrome, Cardinality Estimator in the small cluster merges into the bigger cluster based on a configured merge policy. When an estimator merges into the cluster, an estimator with the same name might already exist in the cluster. So the merge policy resolves these kinds of conflicts with different out-of-the-box strategies. It can be configured programmatically using the method setMergePolicyConfig(), or declaratively using the element merge-policy. Following is an example declarative configuration:

<cardinality-estimator name="default">
   ...
   <merge-policy>HyperLogLogMergePolicy</merge-policy>
   ...
</cardinality-estimator>

Following out-of-the-box merge policies are available:

  • DiscardMergePolicy: Estimator from the smaller cluster will be discarded.

  • HyperLogLogMergePolicy: Estimator will merge with the existing one, using the algorithmic merge for HyperLogLog. This is the default policy.

  • PassThroughMergePolicy: Estimator from the smaller cluster wins.

  • PutIfAbsentMergePolicy: Estimator from the smaller cluster wins if it doesn’t exist in the cluster.
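The algorithmic merge for HyperLogLog is commonly defined as taking the register-wise maximum of the two estimators. The following is a minimal sketch of that idea on plain byte arrays. It assumes a dense register representation of equal length on both sides and does not reflect how Hazelcast stores its registers internally; the class name is hypothetical.

```java
// Illustrative HyperLogLog merge on dense registers; not Hazelcast's internal code.
final class HyperLogLogMergeSketch {

    // Returns a new register array holding the register-wise maximum of a and b.
    static byte[] merge(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("register counts differ");
        }
        byte[] merged = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            merged[i] = (byte) Math.max(a[i], b[i]);
        }
        return merged;
    }
}
```

Because the maximum is commutative and idempotent, merging the estimator of the smaller cluster into the bigger one gives the same registers as if all objects had been added to a single estimator.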

7.20. Event Journal

The event journal is a distributed data structure that stores the history of mutation actions on map or cache. Each action on the map or cache which modifies its contents (such as put, remove or scheduled tasks which are not triggered by using the public API) will create an event which will be stored in the event journal. The event will store the event type as well as the key, old value and updated value for the entry (when applicable). As a user, you can only append to the journal indirectly by using the map and cache methods or by configuring expiration and eviction. By reading from the event journal you can recreate the state of the map or cache at any point in time.

Currently the event journal does not expose a public API for reading the event journal in Hazelcast IMDG. The event journal can be used to stream event data to Hazelcast Jet, so it should be used in conjunction with Hazelcast Jet. Because of this we will describe how to configure it but not how to use it from IMDG. If you enable and configure the event journal, you may only reach it through private API and you will most probably not get any benefits but the journal will retain events nevertheless and consume heap space.

The event journal has a fixed capacity and an expiration time. Internally it is structured as a ringbuffer (partitioned by ringbuffer item) and shares many similarities with it.

7.20.1. Interaction with Evictions and Expiration for IMap

Configuring IMap with eviction and expiration can cause the event journal to contain different events on the different replicas of the same partition. You can run into issues if you are reading from the event journal and the partition owner is terminated. A backup replica will then be promoted into the partition owner but the event journal will contain different events. The event count should stay the same but the entries which you previously thought were evicted and expired could now be "alive" and vice versa.

This is because eviction and expiration randomly choose entries to be evicted/expired, and the choice is not coordinated between partition replicas. In these cases, the event journal diverges and will not converge at any future point; it remains inconsistent, just as the contents of the internal record stores are inconsistent between replicas. You may say that the event journal on a specific replica is in-sync with the record store on that replica but the event journals and record stores between replicas are out-of-sync.

7.20.2. Configuring Event Journal Capacity

By default, an event journal is configured with a capacity of 10000 items. This creates a single array per partition, roughly the size of the capacity divided by the number of partitions. Thus, if the configured capacity is 10000 and the number of partitions is 271, we will create 271 arrays of size 36 (10000/271). If a time-to-live is configured, then an array of longs is also created that stores the expiration time for every item. A single array of the event journal keeps events that are only related to the map entries in that partition. In a lot of cases you may want to change this capacity number to something that better fits your needs. As the capacity is shared between partitions, keep in mind not to set it to a value which is too low for you. Setting the capacity to a number lower than the partition count will result in an error when initializing the event journal.
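The per-partition arithmetic above can be sketched as follows. The method name and the exception type are assumptions for illustration; the text only states that the size is roughly the capacity divided by the partition count and that a capacity lower than the partition count results in an error.

```java
// Illustrative arithmetic for the per-partition array size; names are hypothetical.
final class JournalCapacitySketch {

    // Integer division, as in the 10000 / 271 = 36 example above.
    static int perPartitionCapacity(int capacity, int partitionCount) {
        if (capacity < partitionCount) {
            // The actual error raised by Hazelcast may be of a different type.
            throw new IllegalArgumentException(
                    "capacity " + capacity + " is lower than partition count " + partitionCount);
        }
        return capacity / partitionCount;
    }
}
```

With 271 partitions, a capacity of 5000 leaves only 18 events per partition, which is worth keeping in mind when lowering the capacity.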

Below is a declarative configuration example of an event journal with a capacity of 5000 items for a map and 10000 items for a cache:

<event-journal enabled="true">
    <mapName>myMap</mapName>
    <capacity>5000</capacity>
    <time-to-live-seconds>20</time-to-live-seconds>
</event-journal>

<event-journal enabled="true">
    <cacheName>myCache</cacheName>
    <capacity>10000</capacity>
    <time-to-live-seconds>0</time-to-live-seconds>
</event-journal>

You can also configure an event journal programmatically. The following is a programmatic version of the above declarative configuration:

EventJournalConfig myMapJournalConfig = new EventJournalConfig()
        .setMapName("myMap")
        .setEnabled(true)
        .setCapacity(5000)
        .setTimeToLiveSeconds(20);

EventJournalConfig myCacheJournalConfig = new EventJournalConfig()
        .setCacheName("myCache")
        .setEnabled(true)
        .setCapacity(10000)
        .setTimeToLiveSeconds(0);

Config config = new Config();
config.addEventJournalConfig(myMapJournalConfig);
config.addEventJournalConfig(myCacheJournalConfig);

The mapName and cacheName attributes define the map or cache to which this event journal configuration applies. You can use pattern-matching and the default keyword when doing so. For instance, by using a mapName of journaled*, the journal configuration will apply to all maps whose names start with "journaled" and don’t have other journal configurations that match (e.g., if you would have a more specific journal configuration with an exact name match). If you specify the mapName or cacheName as default, the journal configuration will apply to all maps and caches that don’t have any other journal configuration. This means that potentially all maps and/or caches will have one single event journal configuration.
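The lookup order described above (exact name first, then a matching pattern, then default) can be sketched as follows. This is not Hazelcast's pattern matcher: the class JournalConfigLookupSketch, the use of string labels instead of configuration objects, the support for a single * per pattern and the "longest pattern wins" tie-break are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative name resolution; not Hazelcast's actual config pattern matcher.
final class JournalConfigLookupSketch {
    private final Map<String, String> configsByPattern = new LinkedHashMap<>();

    void add(String pattern, String configLabel) {
        configsByPattern.put(pattern, configLabel);
    }

    String lookup(String name) {
        // 1. An exact name match is the most specific configuration.
        String exact = configsByPattern.get(name);
        if (exact != null) {
            return exact;
        }
        // 2. Otherwise pick the matching wildcard pattern with the most literal characters.
        String best = null;
        int bestLength = -1;
        for (Map.Entry<String, String> entry : configsByPattern.entrySet()) {
            String pattern = entry.getKey();
            int star = pattern.indexOf('*');
            if (star < 0) {
                continue;
            }
            String prefix = pattern.substring(0, star);
            String suffix = pattern.substring(star + 1);
            int literals = prefix.length() + suffix.length();
            boolean matches = name.length() >= literals
                    && name.startsWith(prefix) && name.endsWith(suffix);
            if (matches && literals > bestLength) {
                best = entry.getValue();
                bestLength = literals;
            }
        }
        // 3. Fall back to the configuration named "default", if any.
        return best != null ? best : configsByPattern.get("default");
    }
}
```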

7.20.3. Event Journal Partitioning

The event journal is a partitioned data structure. The partitioning is done by the event key. Because of this, the map and cache entry with a specific key is co-located with the events for that key and will be migrated accordingly. Also, the backup count for the event journal is equal to the backup count of the map or cache for which it contains events. The events on the backup replicas will be created with the map or cache backup operations and no additional network traffic is introduced when appending events to the event journal.

7.20.4. Configuring Event Journal time-to-live

You can configure the Hazelcast event journal with a time-to-live in seconds. Using this setting, you can control how long the items remain in the event journal before they expire. By default, the time-to-live is set to 0, meaning that unless an item is overwritten, it remains in the journal indefinitely. The expiration time of the existing journal events is checked whenever a new event is appended to the event journal or when the event journal is being read. If the journal is neither read nor written to, it may keep expired items indefinitely.

In the example below, an event journal is configured with a time-to-live of 180 seconds:

<event-journal enabled="true">
    <cacheName>myCache</cacheName>
    <capacity>10000</capacity>
    <time-to-live-seconds>180</time-to-live-seconds>
</event-journal>
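The lazy expiration behavior described in this section (expiry is checked only on append or read, and a time-to-live of 0 disables it) can be modeled with a small single-threaded sketch. TtlJournal is a hypothetical class written for illustration; it is not how Hazelcast implements the event journal, and the explicit nowMillis parameter is an assumption made so that the behavior is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only: a single-threaded model of a bounded journal with lazy
// expiration. TtlJournal is hypothetical, not Hazelcast's implementation.
class TtlJournal<E> {

    private static final class Item<E> {
        final E event;
        final long createdAtMillis;

        Item(E event, long createdAtMillis) {
            this.event = event;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final Deque<Item<E>> items = new ArrayDeque<>();
    private final int capacity;
    private final long ttlMillis; // 0 means "never expire"

    TtlJournal(int capacity, int timeToLiveSeconds) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.ttlMillis = timeToLiveSeconds * 1000L;
    }

    // Appending expires old items first, then overwrites the oldest item if full.
    void append(E event, long nowMillis) {
        expire(nowMillis);
        if (items.size() == capacity) {
            items.removeFirst();
        }
        items.addLast(new Item<>(event, nowMillis));
    }

    // Reading also triggers expiration.
    List<E> readAll(long nowMillis) {
        expire(nowMillis);
        List<E> result = new ArrayList<>();
        for (Item<E> item : items) {
            result.add(item.event);
        }
        return result;
    }

    // Number of items physically held, without triggering expiration.
    int storedCount() {
        return items.size();
    }

    private void expire(long nowMillis) {
        if (ttlMillis == 0) {
            return;
        }
        while (!items.isEmpty() && nowMillis - items.peekFirst().createdAtMillis >= ttlMillis) {
            items.removeFirst();
        }
    }
}
```

Because expire runs only inside append and readAll, storedCount can still report items whose time-to-live has passed if the journal has been idle, which mirrors the note above about expired items being kept indefinitely.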

8. Distributed Events

You can register for Hazelcast entry events so you will be notified when those events occur. Event listeners are cluster-wide: when a listener is registered on one member of the cluster, it is actually registered for events that originate at any member in the cluster. When a new member joins, events originating at the new member are also delivered.

An event is created only if you have registered an event listener. If no listener is registered, no event is created. If you provided a predicate when you registered the event listener, the event must pass the predicate before it is sent to the listener (member/client).
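The following sketch models this rule in plain Java: nothing is sent when no listener is registered, and a payload reaches a listener only if it passes that listener's predicate. FilteringPublisher is a hypothetical class used for illustration only; it is not part of Hazelcast's eventing framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Illustrative only: FilteringPublisher is a hypothetical class that models the
// behavior described above; it is not Hazelcast's eventing implementation.
class FilteringPublisher<T> {

    private static final class Registration<T> {
        final Predicate<T> predicate;
        final Consumer<T> listener;

        Registration(Predicate<T> predicate, Consumer<T> listener) {
            this.predicate = predicate;
            this.listener = listener;
        }
    }

    private final List<Registration<T>> registrations = new ArrayList<>();
    private int eventsSent;

    void addListener(Predicate<T> predicate, Consumer<T> listener) {
        registrations.add(new Registration<>(predicate, listener));
    }

    // Nothing is sent when no listener is registered; otherwise the payload is
    // delivered only to listeners whose predicate accepts it.
    void publish(T payload) {
        for (Registration<T> registration : registrations) {
            if (registration.predicate.test(payload)) {
                eventsSent++;
                registration.listener.accept(payload);
            }
        }
    }

    int eventsSent() {
        return eventsSent;
    }
}
```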

As a rule of thumb, your event listener should not implement heavy processes in its event methods that block the thread for a long time. If needed, you can use ExecutorService to transfer long running processes to another thread and thus offload the current listener thread.
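A minimal sketch of this offloading pattern, using only the JDK, is given below. OffloadingListener and its onEvent method are hypothetical stand-ins for a real listener callback such as entryAdded or onMessage; the pool size and the simulated 50 ms delay are arbitrary assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: OffloadingListener is a hypothetical class. In a real
// application, onEvent would be a Hazelcast listener callback.
class OffloadingListener {

    private final ExecutorService worker = Executors.newFixedThreadPool(2);
    private final AtomicInteger processed = new AtomicInteger();

    // Called on the event thread: hand the work off and return immediately.
    void onEvent(String event) {
        worker.execute(() -> slowProcessing(event));
    }

    private void slowProcessing(String event) {
        try {
            Thread.sleep(50); // stands in for a slow call (I/O, remote service, ...)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        processed.incrementAndGet();
    }

    int processedCount() {
        return processed.get();
    }

    // Stops accepting work and waits for the queued tasks to finish.
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that a pool with more than one thread may process offloaded events out of order. If the order matters, a single-threaded executor is the simpler choice.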

In a failover scenario, events are not highly available and may get lost. The eventing mechanism is being improved for failover scenarios.

Hazelcast offers the following event listeners.

For cluster events:

  • Membership Listener for cluster membership events.

  • Distributed Object Listener for distributed object creation and destroy events.

  • Migration Listener for partition migration start and complete events.

  • Partition Lost Listener for partition lost events.

  • Lifecycle Listener for HazelcastInstance lifecycle events.

  • Client Listener for client connection events.

For distributed object events:

  • Entry Listener for IMap and MultiMap entry events.

  • Item Listener for IQueue, ISet and IList item events.

  • Message Listener for ITopic message events.

For Hazelcast JCache implementation:

For Hazelcast clients:

  • Lifecycle Listener

  • Membership Listener

  • Distributed Object Listener

8.1. Cluster Events

8.1.1. Listening for Member Events

The Membership Listener interface has methods that are invoked for the following events.

  • memberAdded: A new member is added to the cluster.

  • memberRemoved: An existing member leaves the cluster.

  • memberAttributeChanged: An attribute of a member is changed. Please refer to Defining Member Attributes to learn about member attributes.

To write a Membership Listener class, you implement the MembershipListener interface and its methods.

The following is an example Membership Listener class.

public class ClusterMembershipListener implements MembershipListener {

    public void memberAdded(MembershipEvent membershipEvent) {
        System.err.println("Added: " + membershipEvent);
    }

    public void memberRemoved(MembershipEvent membershipEvent) {
        System.err.println("Removed: " + membershipEvent);
    }

    public void memberAttributeChanged(MemberAttributeEvent memberAttributeEvent) {
        System.err.println("Member attribute changed: " + memberAttributeEvent);
    }
}

When a respective event is fired, the membership listener outputs the addresses of the members that joined and left, and also which attribute changed on which member.

Registering Membership Listeners

After you create your class, you can configure your cluster to include the membership listener. Below is an example using the method addMembershipListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
hazelcastInstance.getCluster().addMembershipListener( new ClusterMembershipListener() );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

Config config = new Config();
config.addListenerConfig(
        new ListenerConfig( "com.yourpackage.ClusterMembershipListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <listeners>
      <listener>
         com.yourpackage.ClusterMembershipListener
      </listener>
   </listeners>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
 <hz:listener class-name="com.yourpackage.ClusterMembershipListener"/>
 <hz:listener implementation="MembershipListener"/>
</hz:listeners>

8.1.2. Listening for Distributed Object Events

The Distributed Object Listener methods distributedObjectCreated and distributedObjectDestroyed are invoked when a distributed object is created and destroyed throughout the cluster. To write a Distributed Object Listener class, you implement the DistributedObjectListener interface and its methods.

The following is an example Distributed Object Listener class.

public class SampleDistObjListener implements DistributedObjectListener {

    @Override
    public void distributedObjectCreated(DistributedObjectEvent event) {
        DistributedObject instance = event.getDistributedObject();
        System.out.println("Created " + instance.getName() + ", service=" + instance.getServiceName());
    }

    @Override
    public void distributedObjectDestroyed(DistributedObjectEvent event) {
        System.out.println("Destroyed " + event.getObjectName() + ", service=" + event.getServiceName());
    }
}

When a respective event is fired, the distributed object listener outputs the event type, the object name and a service name (for example, for a Map object the service name is "hz:impl:mapService").

Registering Distributed Object Listeners

After you create your class, you can configure your cluster to include distributed object listeners. Below is an example using the method addDistributedObjectListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
SampleDistObjListener sample = new SampleDistObjListener();

hazelcastInstance.addDistributedObjectListener( sample );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register the listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

config.addListenerConfig(
        new ListenerConfig( "com.yourpackage.SampleDistObjListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <listeners>
      <listener>
         com.yourpackage.SampleDistObjListener
      </listener>
   </listeners>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
   <hz:listener class-name="com.yourpackage.SampleDistObjListener"/>
   <hz:listener implementation="DistributedObjectListener"/>
</hz:listeners>

8.1.3. Listening for Migration Events

The Migration Listener interface has methods that are invoked for the following events:

  • migrationStarted: A partition migration is started.

  • migrationCompleted: A partition migration is completed.

  • migrationFailed: A partition migration failed.

To write a Migration Listener class, you implement the MigrationListener interface and its methods.

The following is an example Migration Listener class.

public class ClusterMigrationListener implements MigrationListener {
    @Override
    public void migrationStarted(MigrationEvent migrationEvent) {
        System.err.println("Started: " + migrationEvent);
    }
    @Override
    public void migrationCompleted(MigrationEvent migrationEvent) {
        System.err.println("Completed: " + migrationEvent);
    }
    @Override
    public void migrationFailed(MigrationEvent migrationEvent) {
        System.err.println("Failed: " + migrationEvent);
    }
}

When a respective event is fired, the migration listener outputs the partition ID, status of the migration, the old member and the new member. The following is an example output.

Started: MigrationEvent{partitionId=98, oldOwner=Member [127.0.0.1]:5701,
newOwner=Member [127.0.0.1]:5702 this}

Registering Migration Listeners

After you create your class, you can configure your cluster to include migration listeners. Below is an example using the method addMigrationListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

PartitionService partitionService = hazelcastInstance.getPartitionService();
partitionService.addMigrationListener( new ClusterMigrationListener() );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register the listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

config.addListenerConfig(
        new ListenerConfig( "com.yourpackage.ClusterMigrationListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <listeners>
      <listener>
         com.yourpackage.ClusterMigrationListener
      </listener>
   </listeners>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
   <hz:listener class-name="com.yourpackage.ClusterMigrationListener"/>
   <hz:listener implementation="MigrationListener"/>
</hz:listeners>

8.1.4. Listening for Partition Lost Events

Hazelcast provides fault-tolerance by keeping multiple copies of your data. For each partition, one of your cluster members becomes the owner and some of the other members become replica members, based on your configuration. Nevertheless, data loss may occur if a few members crash simultaneously.

Let’s consider the following example with three members, N1, N2 and N3, for a given partition-0. N1 is the owner of partition-0. N2 and N3 are the first and second replicas, respectively. If N1 and N2 crash simultaneously, any data structure configured with fewer than two backups loses its data in partition-0. For instance, if we configure a map with one backup, that map loses its data in partition-0 since both the owner and the first replica of partition-0 have crashed. However, if we configure our map with two backups, it does not lose any data since a copy of partition-0’s data for the given map also resides in N3.
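The reasoning in this example can be restated as a small check: a data structure with a given backup count keeps copies on replica indices 0 (the owner) through the backup count, and it loses its data in the partition only if all of those replicas crashed. PartitionLossCheck below is a hypothetical helper written for illustration; it is not a Hazelcast API.

```java
import java.util.Set;

// Illustrative only: PartitionLossCheck is a hypothetical helper that restates
// the N1/N2/N3 example as arithmetic; it is not a Hazelcast API.
class PartitionLossCheck {

    // Replica index 0 is the owner, 1 the first backup, 2 the second backup, ...
    // A structure with backupCount backups has copies on replica indices
    // 0..backupCount, so its data in the partition is lost only if every one
    // of those replicas crashed.
    static boolean isDataLost(int backupCount, Set<Integer> crashedReplicaIndices) {
        for (int index = 0; index <= backupCount; index++) {
            if (!crashedReplicaIndices.contains(index)) {
                return false;
            }
        }
        return true;
    }
}
```

In the scenario above, N1 and N2 hold replica indices 0 and 1. With those two indices crashed, the check reports a loss for a backup count of 1 and no loss for a backup count of 2.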

The Partition Lost Listener notifies you of possible data loss and reports how many replicas are lost for a partition. It listens to PartitionLostEvent instances. Partition lost events are dispatched per partition.

Partition loss detection is done after a member crash is detected by the other members and the crashed member is removed from the cluster. Please note that false-positive PartitionLostEvent instances may be fired during network splits.

Writing a Partition Lost Listener Class

To write a Partition Lost Listener, you implement the PartitionLostListener interface and its partitionLost method, which is invoked when a partition loses its owner and all backups.

The following is an example Partition Lost Listener class.

public class ConsoleLoggingPartitionLostListener implements PartitionLostListener {
    @Override
    public void partitionLost(PartitionLostEvent event) {
        System.out.println(event);
    }
}

When a PartitionLostEvent is fired, the partition lost listener given above outputs the partition ID, the replica index that is lost and the member that has detected the partition loss. The following is an example output.

com.hazelcast.partition.PartitionLostEvent{partitionId=242, lostBackupCount=0,
eventSource=Address[192.168.2.49]:5701}

Registering Partition Lost Listeners

After you create your class, you can configure your cluster programmatically or declaratively to include the partition lost listener. Below is an example of its programmatic configuration.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
hazelcastInstance.getPartitionService().addPartitionLostListener( new ConsoleLoggingPartitionLostListener() );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <listeners>
      <listener>
         com.yourpackage.ConsoleLoggingPartitionLostListener
      </listener>
   </listeners>
   ...
</hazelcast>

8.1.5. Listening for Lifecycle Events

The Lifecycle Listener notifies for the following events:

  • STARTING: A member is starting.

  • STARTED: A member started.

  • SHUTTING_DOWN: A member is shutting down.

  • SHUTDOWN: A member’s shutdown has completed.

  • MERGING: A member is merging with the cluster.

  • MERGED: A member’s merge operation has completed.

  • CLIENT_CONNECTED: A Hazelcast Client connected to the cluster.

  • CLIENT_DISCONNECTED: A Hazelcast Client disconnected from the cluster.

The following is an example Lifecycle Listener class.

public class NodeLifecycleListener implements LifecycleListener {
     @Override
     public void stateChanged(LifecycleEvent event) {
         System.err.println(event);
     }
}

This listener is local to an individual member. It notifies the application that uses Hazelcast about the events mentioned above for a particular member.

Registering Lifecycle Listeners

After you create your class, you can configure your cluster to include lifecycle listeners. Below is an example using the method addLifecycleListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
hazelcastInstance.getLifecycleService().addLifecycleListener( new NodeLifecycleListener() );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register the listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

config.addListenerConfig(
    new ListenerConfig( "com.yourpackage.NodeLifecycleListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <listeners>
      <listener>
         com.yourpackage.NodeLifecycleListener
      </listener>
   </listeners>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
   <hz:listener class-name="com.yourpackage.NodeLifecycleListener"/>
   <hz:listener implementation="LifecycleListener"/>
</hz:listeners>

8.1.6. Listening for Clients

The Client Listener is used by the Hazelcast cluster members. It notifies the cluster members when a client is connected to or disconnected from the cluster.

To write a client listener class, you implement the ClientListener interface and its methods clientConnected and clientDisconnected, which are invoked when a client is connected to or disconnected from the cluster. You can add your client listener as shown below.

hazelcastInstance.getClientService().addClientListener(new SampleClientListener());

The following is the equivalent declarative configuration.

<listeners>
   <listener>
      com.yourpackage.SampleClientListener
   </listener>
</listeners>

The following is the equivalent configuration in the Spring context.

<hz:listeners>
   <hz:listener class-name="com.yourpackage.SampleClientListener"/>
   <hz:listener implementation="com.yourpackage.SampleClientListener"/>
</hz:listeners>

You can also add event listeners to a Hazelcast client. Please refer to Client Listenerconfig for the related information.

8.2. Distributed Object Events

8.2.1. Listening for Map Events

You can listen to map-wide or entry-based events using the listeners provided by Hazelcast’s eventing framework. To listen to these events, implement a MapListener sub-interface.

A map-wide event is fired as a result of a map-wide operation, for example, IMap.clear() or IMap.evictAll(). An entry-based event is fired after an operation that affects a specific entry, for example, IMap.remove() or IMap.evict().

Catching a Map Event

To catch an event, you should explicitly implement a corresponding sub-interface of a MapListener, such as EntryAddedListener or MapClearedListener.

The EntryListener interface can still be implemented (we kept it for backward compatibility reasons). However, if you need to listen to a different event, one that is not available in the EntryListener interface, you should also implement a relevant MapListener sub-interface.

Let’s take a look at the following class example.

public class Listen {

    public static void main( String[] args ) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap( "somemap" );
        map.addEntryListener( new MyEntryListener(), true );
        System.out.println( "EntryListener registered" );
    }

    static class MyEntryListener implements
            EntryAddedListener<String, String>,
            EntryRemovedListener<String, String>,
            EntryUpdatedListener<String, String>,
            EntryEvictedListener<String, String>,
            EntryLoadedListener<String,String>,
            MapEvictedListener,
            MapClearedListener   {
        @Override
        public void entryAdded( EntryEvent<String, String> event ) {
            System.out.println( "Entry Added:" + event );
        }

        @Override
        public void entryRemoved( EntryEvent<String, String> event ) {
            System.out.println( "Entry Removed:" + event );
        }

        @Override
        public void entryUpdated( EntryEvent<String, String> event ) {
            System.out.println( "Entry Updated:" + event );
        }

        @Override
        public void entryEvicted( EntryEvent<String, String> event ) {
            System.out.println( "Entry Evicted:" + event );
        }

        @Override
        public void entryLoaded(EntryEvent<String, String> event) {
            System.out.println( "Entry Loaded:" + event );
        }

        @Override
        public void mapEvicted( MapEvent event ) {
            System.out.println( "Map Evicted:" + event );
        }

        @Override
        public void mapCleared( MapEvent event ) {
            System.out.println( "Map Cleared:" + event );
        }
    }
}

Now, let’s perform some modifications on the map entries using the following example code.

public class ModifyMap {

    public static void main( String[] args ) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap( "somemap");
        String key = "" + System.nanoTime();
        String value = "1";
        map.put( key, value );
        map.put( key, "2" );
        map.delete( key );
    }
}

If you execute the Listen class and then the ModifyMap class, you get the following output produced by the Listen class.

Entry Added:EntryEvent{entryEventType=ADDED, member=Member [192.168.1.100]:5702
 - ffedb655-bbad-43ea-aee8-d429d37ce528, name='somemap', key=11455268066242,
 oldValue=null, value=1, mergingValue=null}

Entry Updated:EntryEvent{entryEventType=UPDATED, member=Member [192.168.1.100]:5702
 - ffedb655-bbad-43ea-aee8-d429d37ce528, name='somemap', key=11455268066242,
 oldValue=1, value=2, mergingValue=null}

Entry Removed:EntryEvent{entryEventType=REMOVED, member=Member [192.168.1.100]:5702
 - ffedb655-bbad-43ea-aee8-d429d37ce528, name='somemap', key=11455268066242,
 oldValue=null, value=null, mergingValue=null}

Please note that the method IMap.clear() does not fire an "EntryRemoved" event, but fires a "MapCleared" event.

Listeners have to offload all blocking operations to another thread (pool).

8.2.2. Listening for Lost Map Partitions

You can listen to MapPartitionLostEvent instances by registering an implementation of MapPartitionLostListener, which is also a sub-interface of MapListener.

Let’s consider the following example code:

public class ListenMapPartitionLostEvents {

    public static void main(String[] args) {
        Config config = new Config();
        // keeps its data if a single node crashes
        config.getMapConfig("map").setBackupCount(1);

        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);

        IMap<Object, Object> map = instance.getMap("map");
        map.put(0, 0);

        map.addPartitionLostListener(new MapPartitionLostListener() {
            @Override
            public void partitionLost(MapPartitionLostEvent event) {
                System.out.println(event);
            }
        });
    }
}

Within this example code, a MapPartitionLostListener implementation is registered to a map that is configured with one backup. For this particular map and any of the partitions in the system, if the partition owner member and its first backup member crash simultaneously, the given MapPartitionLostListener receives a corresponding MapPartitionLostEvent. If only a single member crashes in the cluster, there will be no MapPartitionLostEvent fired for this map since backups for the partitions owned by the crashed member are kept on other members.

Please refer to Listening for Partition Lost Events for more information about partition lost detection and partition lost events.

Registering Map Listeners

After you create your listener class, you can configure your cluster to include map listeners using the method addEntryListener (as you can see in the example Listen class above). Below is the related portion from this code, showing how to register a map listener.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IMap<String, String> map = hz.getMap( "somemap" );
map.addEntryListener( new MyEntryListener(), true );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

mapConfig.addEntryListenerConfig(
        new EntryListenerConfig( "com.yourpackage.MyEntryListener",
                false, false ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <map name="somemap">
      ...
      <entry-listeners>
         <entry-listener include-value="false" local="false">
            com.yourpackage.MyEntryListener
         </entry-listener>
      </entry-listeners>
   </map>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:map name="somemap">
   <hz:entry-listeners>
      <hz:entry-listener include-value="true"
         class-name="com.hazelcast.spring.DummyEntryListener"/>
      <hz:entry-listener implementation="dummyEntryListener" local="true"/>
   </hz:entry-listeners>
</hz:map>

Map Listener Attributes

As you see, there are attributes of the map listeners in the above examples: include-value and local. The attribute include-value is a boolean attribute that is optional, and if you set it to true, the map event will contain the map value. Its default value is true.

The attribute local is also a boolean attribute that is optional, and if you set it to true, you can listen to the map on the local member. Its default value is false.

8.2.3. Listening for MultiMap Events

You can listen to entry-based events in the MultiMap using EntryListener. The following is an example entry listener implementation for MultiMap.

public class SampleEntryListener implements EntryListener<String, String> {
    @Override
    public void entryAdded(EntryEvent<String, String> event) {
        System.out.println("Entry Added: " + event);
    }
    @Override
    public void entryRemoved( EntryEvent<String, String> event ) {
        System.out.println( "Entry Removed: " + event );
    }
    @Override
    public void entryUpdated(EntryEvent<String, String> event) {
        System.out.println( "Entry Updated: " + event );
    }
    @Override
    public void entryEvicted(EntryEvent<String, String> event) {
        System.out.println( "Entry evicted: " + event );
    }
    @Override
    public void mapCleared(MapEvent event) {
        System.out.println( "Map Cleared: " + event );
    }
    @Override
    public void mapEvicted(MapEvent event) {
        System.out.println( "Map Evicted: " + event );
    }
}

Registering MultiMap Listeners

After you create your listener class, you can configure your cluster to include MultiMap listeners using the method addEntryListener. Below is a code fragment showing how to register a MultiMap listener.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
MultiMap<String, String> map = hz.getMultiMap( "somemap" );
map.addEntryListener( new SampleEntryListener(), true );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

multiMapConfig.addEntryListenerConfig(
  new EntryListenerConfig( "com.yourpackage.SampleEntryListener",
    false, false ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <multimap name="somemap">
      <value-collection-type>SET</value-collection-type>
      <entry-listeners>
         <entry-listener include-value="false" local="false">
            com.yourpackage.SampleEntryListener
         </entry-listener>
      </entry-listeners>
   </multimap>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:multimap name="somemap" value-collection-type="SET">
   <hz:entry-listeners>
      <hz:entry-listener include-value="false"
         class-name="com.yourpackage.SampleEntryListener"/>
      <hz:entry-listener implementation="EntryListener" local="false"/>
   </hz:entry-listeners>
</hz:multimap>

MultiMap Listener Attributes

As you see, there are attributes of the MultiMap listeners in the above examples: include-value and local. The attribute include-value is a boolean attribute that is optional, and if you set it to true, the MultiMap event will contain the map value. Its default value is true.

The attribute local is also a boolean attribute that is optional, and if you set it to true, you can listen to the MultiMap on the local member. Its default value is false.

8.2.4. Listening for Item Events

The Item Listener is used by the Hazelcast IQueue, ISet and IList interfaces.

To write an Item Listener class, you implement the ItemListener interface and its methods itemAdded and itemRemoved. These methods are invoked when an item is added or removed.

The following is an example Item Listener class for an ISet structure.

public class SampleItemListener implements ItemListener<Price> {

    @Override
    public void itemAdded(ItemEvent<Price> event) {
        System.out.println( "Item added:  " + event );
    }

    @Override
    public void itemRemoved(ItemEvent<Price> event) {
        System.out.println( "Item removed: " + event );
    }
}

You can use ICollection when creating any of the collection (queue, set and list) data structures, as shown in the registration example below. You can also use IQueue, ISet or IList instead of ICollection.

Registering Item Listeners

After you create your class, you can configure your cluster to include item listeners. Below is an example using the method addItemListener for ISet (it also applies to IQueue and IList).

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

ICollection<Price> set = hazelcastInstance.getSet( "default" );
// or ISet<Price> set = hazelcastInstance.getSet( "default" );
set.addItemListener( new SampleItemListener(), true );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

setConfig.addItemListenerConfig(
        new ItemListenerConfig( "com.yourpackage.SampleItemListener", true ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
  <set>
    <item-listeners>
      <item-listener include-value="true">
        com.yourpackage.SampleItemListener
      </item-listener>
    </item-listeners>
  </set>
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:set name="default" >
  <hz:item-listeners>
    <hz:item-listener include-value="true"
      class-name="com.yourpackage.SampleItemListener"/>
  </hz:item-listeners>
</hz:set>

Item Listener Attributes

As you see, there is an attribute in the above examples: include-value. It is a boolean attribute that is optional, and if you set it to true, the item event will contain the item value. Its default value is true.

There is also another attribute called local, which is not shown in the above examples. It is also a boolean attribute that is optional, and if you set it to true, you can listen to the items on the local member. Its default value is false.

8.2.5. Listening for Topic Messages

The Message Listener is used by the ITopic interface. It notifies when a message is received for the registered topic.

To write a Message Listener class, you implement the MessageListener interface and its method onMessage, which is invoked when a message is received for the registered topic.

The following is an example Message Listener class.

public class SampleMessageListener implements MessageListener<MyEvent> {

    public void onMessage( Message<MyEvent> message ) {
        MyEvent myEvent = message.getMessageObject();
        System.out.println( "Message received = " + myEvent.toString() );
    }
}

Registering Message Listeners

After you create your class, you can configure your cluster to include message listeners. Below is an example using the method addMessageListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

ITopic<MyEvent> topic = hazelcastInstance.getTopic( "default" );
topic.addMessageListener( new SampleMessageListener() );

With the above approach, there is the possibility of missing messaging events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register this listener in the configuration. You can register it using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

Config config = new Config();
TopicConfig topicConfig = config.getTopicConfig( "default" );
topicConfig.addMessageListenerConfig(
  new ListenerConfig( "com.yourpackage.SampleMessageListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
   ...
   <topic name="default">
      <message-listeners>
         <message-listener>
            com.yourpackage.SampleMessageListener
         </message-listener>
      </message-listeners>
   </topic>
   ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:topic name="default">
  <hz:message-listeners>
    <hz:message-listener
      class-name="com.yourpackage.SampleMessageListener"/>
  </hz:message-listeners>
</hz:topic>

8.3. Event Listeners for Hazelcast Clients

You can add event listeners to a Hazelcast Java client. You can configure the following listeners to listen to the events on the client side. Please see the respective sections under the Cluster Events section for code examples.

  • Lifecycle Listener: Notifies when the client is starting, started, shutting down and shutdown.

  • Membership Listener: Notifies when a member joins or leaves the cluster to which the client is connected, or when an attribute is changed in a member.

  • DistributedObject Listener: Notifies when a distributed object is created or destroyed throughout the cluster to which the client is connected.

Please refer to the Configuring Client Listeners section for more information.

8.4. Global Event Configuration

You can tune event handling on each member with the following system properties:

  • hazelcast.event.queue.capacity: default value is 1000000

  • hazelcast.event.queue.timeout.millis: default value is 250

  • hazelcast.event.thread.count: default value is 5

A striped executor in each cluster member controls and dispatches the received events. This striped executor also guarantees the event order. For all events in Hazelcast, the order in which events are generated and the order in which they are published are guaranteed for given keys. For map and multimap, the order is preserved for the operations on the same key of the entry. For list, set, topic and queue, the order is preserved for events on that instance of the distributed data structure.

To achieve the order guarantee, only one thread is responsible for a particular set of events (entry events of a key in a map, item events of a collection, etc.) in StripedExecutor (within com.hazelcast.util.executor).

If the event queue reaches its capacity (hazelcast.event.queue.capacity) and the last item cannot be put into the event queue for the period specified in hazelcast.event.queue.timeout.millis, these events will be dropped with a warning message, such as "EventQueue overloaded".
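The drop behavior described above can be pictured with a small, JDK-only toy model. This is a sketch under assumptions, not Hazelcast's internal event queue: the class name BoundedEventQueue and its methods are hypothetical, and a plain ArrayBlockingQueue stands in for the real queue. The constructor arguments play the roles of hazelcast.event.queue.capacity and hazelcast.event.queue.timeout.millis.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical toy model; it only mimics the capacity/timeout semantics.
class BoundedEventQueue {

    private final BlockingQueue<String> queue;
    private final long timeoutMillis;
    private final AtomicInteger dropped = new AtomicInteger();

    BoundedEventQueue(int capacity, long timeoutMillis) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.timeoutMillis = timeoutMillis;
    }

    // Waits up to timeoutMillis for free space; drops the event if none appears.
    boolean publish(String event) throws InterruptedException {
        boolean accepted = queue.offer(event, timeoutMillis, TimeUnit.MILLISECONDS);
        if (!accepted) {
            dropped.incrementAndGet();
            System.err.println("EventQueue overloaded, dropping: " + event);
        }
        return accepted;
    }

    // Called by the listener side to take the next event, or null if empty.
    String poll() {
        return queue.poll();
    }

    int droppedCount() {
        return dropped.get();
    }
}
```

In this model, a slow consumer (one that rarely calls poll()) is what makes publish() time out, which mirrors how long-running listeners can cause events to be dropped.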

If event listeners perform a computation that takes a long time, the event queue can reach its maximum capacity and lose events. For map and multimap, you can configure hazelcast.event.thread.count to a higher value so that fewer collisions occur for keys, and therefore worker threads will not block each other in StripedExecutor. For list, set, topic and queue, you should offload heavy work to another thread. To preserve order guarantee, you should implement similar logic with StripedExecutor in the offloaded thread pool.
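One possible way to offload heavy listener work while keeping the order guarantee is sketched below. It is a minimal, JDK-only illustration of the striping idea and not Hazelcast's StripedExecutor: the class StripedOffloader and its methods are hypothetical names. Each stripe is a single-threaded executor, and all tasks that share a key are routed to the same stripe, so they run in submission order.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Hazelcast API.
class StripedOffloader {

    private final ExecutorService[] stripes;

    StripedOffloader(int stripeCount) {
        stripes = new ExecutorService[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            // One thread per stripe, so tasks in a stripe run in submission order.
            stripes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Tasks with the same key always land on the same stripe.
    void execute(Object key, Runnable task) {
        int index = Math.floorMod(key.hashCode(), stripes.length);
        stripes[index].execute(task);
    }

    // Stops accepting work and waits for the queued tasks to finish.
    boolean shutdownAndAwait(long timeout, TimeUnit unit) throws InterruptedException {
        for (ExecutorService stripe : stripes) {
            stripe.shutdown();
        }
        boolean allDone = true;
        for (ExecutorService stripe : stripes) {
            allDone &= stripe.awaitTermination(timeout, unit);
        }
        return allDone;
    }
}
```

A listener could then hand its work over with a call such as offloader.execute( topicName, heavyTask ) and return immediately, using the name of the data structure (or the entry key) as the striping key.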

9. Hazelcast Jet

This chapter briefly describes Hazelcast Jet. For detailed information and Jet documentation, please visit jet.hazelcast.org.

9.1. Overview

Hazelcast Jet, built on top of the Hazelcast IMDG, is a distributed processing engine for fast stream and batch processing of large data sets. It reuses the features and services of Hazelcast IMDG, but it is a separate product with features not available in IMDG.

With Hazelcast IMDG providing storage functionality, Jet performs parallel execution in a Hazelcast Jet cluster, composed of Jet instances, to enable data-intensive applications to operate in near real-time. Jet uses green threads (threads that are scheduled by a runtime library or VM) to achieve this parallel execution.

Since Jet uses Hazelcast IMDG’s discovery mechanisms, it can be used both on-premises and in cloud environments. Hazelcast Jet typically runs on several machines that form a cluster.

9.1.1. How You Can Use It

The Pipeline API is the primary high-level API of Hazelcast Jet for batch and stream processing. It is easy to use and set up, and it gives you the tools to compose batch computations from building blocks such as filters, aggregators and joiners, saving time and resources. With the Pipeline API, you can build bounded and unbounded data pipelines on a variety of sources and sinks.

In addition to the Pipeline API, Jet also offers a distributed implementation of java.util.stream. You can express your computation over any data source Jet supports using the familiar API from JDK 8. This distributed implementation can be used for simple transform and reduce operations on top of IMap and IList.

There is also Jet’s Core API for advanced users to build custom data sources and sinks, to have low-level control over the data flow, to fine-tune performance and to build DSLs.

Please see the Work with Jet section in the Hazelcast Jet Reference Manual to see a simple example.

9.1.2. Where You Can Use It

Hazelcast Jet is appropriate for applications that require a near real-time experience such as operations in IoT architectures (house thermostats, lighting systems, etc.), in-store e-commerce systems and social media platforms. Typical use cases include the following:

  • Real-time (low-latency) stream processing

  • Fast batch processing

  • Streaming analytics

  • Complex event processing

  • Implementing event sourcing and CQRS (Command Query Responsibility Segregation)

  • Internet-of-things (IoT) data ingestion, processing and storage

  • Data processing microservice architectures

  • Online trading

  • Social media platforms

  • System log events

The aforementioned use cases require huge amounts of data to be processed in near real-time. Hazelcast Jet achieves this by processing incoming records as soon as possible, which lowers latency, and by ingesting data at high velocity. Jet’s execution model, together with keeping both the computation and the data storage in memory, enables high application speeds.

9.1.3. Data Processing Styles

Data processing is traditionally divided into batch and stream processing.

Batch data is considered bounded, i.e., finite. Fast batch processing typically refers to running a job on a data set that is already available in a data center. You simply provide one or more pre-existing data sets and instruct Hazelcast Jet to mine them for the information you need.

Stream data is considered unbounded, i.e., infinite. Stream processing deals with in-flight data before it is stored. It offers lower latency; data is processed on-the-fly and you do not have to wait for the whole data set to arrive in order to run a computation.

9.2. Relationship with Hazelcast IMDG

Hazelcast Jet leans on Hazelcast IMDG for cluster management and deployment, data partitioning and networking; all the services of IMDG are available to your Jet Jobs (units of work which are executed). A Jet instance is also a fully functional Hazelcast IMDG instance and a Jet cluster is also a Hazelcast IMDG cluster.

A Jet job is implemented as a Hazelcast IMDG proxy, similar to the other services and data structures in Hazelcast. Hazelcast operations are used for different actions that can be performed on a job. Jet can also be used with the Hazelcast Client, which uses the Hazelcast Open Binary Protocol to communicate different actions to the server instance.

In the Hazelcast Jet world, Hazelcast IMDG can be used for data ingestion prior to processing, connecting multiple Jet jobs, enriching processed events, caching the remote data, distributing Jet-processed data and running advanced data processing tasks on top of IMDG data structures.

Hazelcast Jet can use Hazelcast IMDG’s IMap, ICache and IList on the embedded cluster as sources (data structures from which Jet reads data) and sinks (data structures to which Jet writes data). IMap and ICache are partitioned data structures distributed across the cluster and Jet members can read from these structures by having each member read just its local partitions. Hazelcast IMDG’s IList is stored on a single partition; all the data will be read on the single member that owns that partition. Please refer to IMap and ICache and IList in the Hazelcast Jet Reference Manual to learn how Jet uses these IMDG data structures. In addition to these data structures, Jet can also process a stream of changes of IMap and ICache, using the Event Journal.

You can use Hazelcast Jet with embedded Hazelcast IMDG or a remote Hazelcast IMDG cluster. Benefits of using Hazelcast Jet with embedded Hazelcast IMDG are as follows:

  • Sharing the processing state among Jet Jobs.

  • Caching intermediate processing results.

  • Enriching processed events; cache remote data, e.g., fact tables from a database, on Jet members.

  • Running advanced data processing tasks on top of Hazelcast data structures.

  • Improving development processes by making start up of a Jet cluster simple and fast.

Jet Jobs can use the Hazelcast IMDG connector to read records from and write records to a remote Hazelcast IMDG instance. You can use a remote Hazelcast IMDG cluster for the following cases:

  • Distributing data across IMap, ICache and IList structures.

  • Sharing state or intermediate results among multiple Jet clusters.

  • Isolating the processing cluster (Jet) from operational data storage cluster (IMDG).

  • Publishing intermediate results, e.g., to show real-time processing stats on a dashboard.

9.3. Hazelcast IMDG Computing vs. Hazelcast Jet

As described in the Fast-Aggregations section, Hazelcast IMDG has native support for aggregation operations on the contents of its distributed data structures.

Fast-Aggregations are a good fit for simple operations (count, distinct, sum, avg, min, max, etc.). However, they may not be sufficient for operations that group data by key and produce results of size O(keyCount). The architecture of Hazelcast aggregations is not well suited to this use case, although it will still work even for moderately sized results (up to 100 MB, as a ballpark figure). Hazelcast Jet can be the preferred choice for larger results and whenever something more than a single aggregation step is needed. Please see the Jet Compared with New Aggregations section.

Another Hazelcast IMDG computing feature is Entry Processors. They are used for fast mutating operations in an atomic way, in which the map entry is mutated by executing logic directly on the JVM where the data resides. This means that network hops are reduced and atomicity is provided in a single step. Keeping this in mind, you can use Hazelcast IMDG Entry Processors to perform bulk mutations of an IMap, where the processing function is fast and involves a single map entry per call. On the other hand, you may prefer Hazelcast Jet when the processing involves multiple entries (aggregations, joins, etc.), or involves multiple computing steps to be run in parallel, or when the data source and sink are not a single IMap instance.

10. Distributed Computing

This chapter explains Hazelcast’s executor service, durable/scheduled executor services and entry processor implementations.

10.1. Executor Service

One of the coolest features of Java is the Executor framework, which allows you to asynchronously execute your tasks (logical units of work), such as database queries, complex calculations and image rendering.

The default implementation of this framework (ThreadPoolExecutor) is designed to run within a single JVM (cluster member). In distributed systems, this implementation is not desired since you may want a task submitted in one JVM and processed in another one. Hazelcast offers IExecutorService for you to use in distributed environments. It implements java.util.concurrent.ExecutorService to serve the applications requiring computational and data processing power.


Note that you may want to use Hazelcast Jet if you want to process batch or real-time streaming data. Please see the Fast Batch Processing and Real-Time Stream Processing use cases for Hazelcast Jet.

With IExecutorService, you can execute tasks asynchronously and perform other useful tasks. If your task execution takes longer than expected, you can cancel the task execution. Tasks should be Serializable since they will be distributed.

In the Java Executor framework, you implement tasks in two ways: Callable or Runnable.

  • Callable: If you need to return a value and submit it to Executor, implement the task as java.util.concurrent.Callable.

  • Runnable: If you do not need to return a value, implement the task as java.lang.Runnable.
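The difference between the two task types can be seen with a plain JDK executor. The sketch below is an assumption-laden illustration: it uses a local java.util.concurrent.ExecutorService rather than IExecutorService, the class name CallableVersusRunnable is hypothetical, and the lambdas are not Serializable, so they would not be suitable for distribution as they stand.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; runs on a local JDK executor, not on a cluster.
class CallableVersusRunnable {

    // A Callable produces a value that is read from the Future.
    static int runCallable(ExecutorService executor) throws Exception {
        Callable<Integer> task = () -> 21 * 2;
        Future<Integer> future = executor.submit(task);
        return future.get();
    }

    // A Runnable produces no value; its Future only signals completion.
    static Object runRunnable(ExecutorService executor) throws Exception {
        Runnable task = () -> System.out.println("no result to return");
        Future<?> future = executor.submit(task);
        return future.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            System.out.println("Callable result: " + runCallable(executor));
            System.out.println("Runnable result: " + runRunnable(executor));
        } finally {
            executor.shutdown();
        }
    }
}
```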

Note that the distributed executor service (IExecutorService) is intended to run processing where the data is hosted: on the server members. In general, you cannot run a Java Runnable or Callable on the clients as the clients may not be Java. Also, the clients do not host any data, so they would have to fetch any data they need from the servers. If you want something to run on all or some clients connected to your cluster, you could implement this using the publish/subscribe mechanism; a payload could be sent to an ITopic with the necessary execution parameters, and clients listening can act on the message.

10.1.1. Implementing a Callable Task

In Hazelcast, when you implement a task as java.util.concurrent.Callable (a task that returns a value), you implement Callable and Serializable.

Below is an example of a Callable task. SumTask prints out map keys and returns the summed map values.

public class SumTask
        implements Callable<Integer>, Serializable, HazelcastInstanceAware {

    private transient HazelcastInstance hazelcastInstance;

    public void setHazelcastInstance( HazelcastInstance hazelcastInstance ) {
        this.hazelcastInstance = hazelcastInstance;
    }

    public Integer call() throws Exception {
        IMap<String, Integer> map = hazelcastInstance.getMap( "map" );
        int result = 0;
        for ( String key : map.localKeySet() ) {
            System.out.println( "Calculating for key: " + key );
            result += map.get( key );
        }
        System.out.println( "Local Result: " + result );
        return result;
    }
}

Another example is the Echo callable below. In its call() method, it returns the local member and the input passed in. Remember that hazelcastInstance.getCluster().getLocalMember() returns the local member and toString() returns the member’s address (IP + port) in String form; this is used just to see which member actually executed the code in our example. Of course, the call() method can do and return anything you like.

public class Echo implements Callable<String>, Serializable, HazelcastInstanceAware {
    String input = null;

    private transient HazelcastInstance hazelcastInstance;

    public Echo() {
    }

    public void setHazelcastInstance( HazelcastInstance hazelcastInstance ) {
        this.hazelcastInstance = hazelcastInstance;
    }

    public Echo(String input) {
        this.input = input;
    }

    public String call() {
        return hazelcastInstance.getCluster().getLocalMember().toString() + ":" + input;
    }
}

Executing a Callable Task

To execute a callable task:

  • Retrieve the Executor from HazelcastInstance.

  • Submit a task which returns a Future.

  • After submitting the task, you do not have to wait for the execution to complete; you can process other things.

  • When ready, use the Future object to retrieve the result as shown in the code example below.

Below, the Echo task is executed.

public class MasterMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance instance = Hazelcast.newHazelcastInstance();
        IExecutorService executorService = instance.getExecutorService( "executorService" );
        Future<String> future = executorService.submit( new Echo( "myinput") );
        //while it is executing, do some useful stuff
        //when ready, get the result of your execution
        String result = future.get();
    }
}

Please note that the Echo callable in the above code sample also implements the Serializable interface, since it may be sent to another member to be processed.

When a task is deserialized, HazelcastInstance needs to be accessed. To do this, the task should implement HazelcastInstanceAware interface. Please see the HazelcastInstanceAware Interface section for more information.

10.1.2. Implementing a Runnable Task

In Hazelcast, when you implement a task as java.lang.Runnable (a task that does not return a value), you implement Runnable and Serializable.

Below is an example Runnable task. It waits for some time and echoes a message.

public class EchoTask implements Runnable, Serializable {

    private final String msg;

    public EchoTask( String msg ) {
        this.msg = msg;
    }

    @Override
    public void run() {
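        try {
            Thread.sleep( 5000 );
        } catch ( InterruptedException e ) {
            // Restore the interrupt flag and stop, so that cancellation is respected.
            Thread.currentThread().interrupt();
            return;
        }
        System.out.println( "echo:" + msg );
    }
}

Executing a Runnable Task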

To execute the runnable task:

  • Retrieve the Executor from HazelcastInstance.

  • Submit the tasks to the Executor.

Now let’s write a class that submits and executes these echo messages. Executor is retrieved from HazelcastInstance and 1000 echo tasks are submitted.

public class RunnableMasterMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
        IExecutorService executor = hazelcastInstance.getExecutorService( "exec" );
        for ( int k = 1; k <= 1000; k++ ) {
            Thread.sleep( 1000 );
            System.out.println( "Producing echo task: " + k );
            executor.execute( new EchoTask( String.valueOf( k ) ) );
        }
        System.out.println( "EchoTaskMain finished!" );
    }
}

10.1.3. Scaling The Executor Service

You can scale the Executor service both vertically (scale up) and horizontally (scale out).

To scale up, you should improve the processing capacity of the cluster member (JVM). You can do this by increasing the pool-size property mentioned in Configuring Executor Service (i.e., increasing the thread count). However, please be aware of your member’s capacity. If you think it cannot handle the additional load caused by increasing the thread count, you may want to consider improving the member’s resources (CPU, memory, etc.). As an example, set the pool-size to 5 and run the above RunnableMasterMember. You will see that EchoTask is run as soon as it is produced.
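The effect of the thread count can be observed locally with a fixed-size JDK thread pool. The following is only a sketch: it uses java.util.concurrent.Executors instead of a configured IExecutorService, and the class PoolSizeDemo and its method are hypothetical. It checks whether a given number of tasks can all be running at the same moment for a given pool size.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; a local stand-in for the pool-size setting.
class PoolSizeDemo {

    // Returns true if all tasks were running at the same moment within waitMillis.
    static boolean allRunTogether(int poolSize, int taskCount, long waitMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(taskCount);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    allStarted.countDown();
                    try {
                        // Hold the thread until every task has started.
                        allStarted.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return allStarted.await(waitMillis, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow(); // interrupts tasks that are still blocked
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size 1: " + allRunTogether(1, 5, 500));
        System.out.println("pool size 5: " + allRunTogether(5, 5, 500));
    }
}
```

With a pool size of 1 the tasks can never all run together, because the single thread is held by the first task. With a pool size of 5, all five tasks can start without queuing behind each other.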

To scale out, add more members instead of increasing only one member’s capacity. In reality, you may want to expand your cluster by adding more physical or virtual machines. For example, in the EchoTask example in the Runnable section, you can create another Hazelcast instance. That instance will automatically get involved in the executions started in RunnableMasterMember and start processing.

10.1.4. Executing Code in the Cluster

The distributed executor service is a distributed implementation of java.util.concurrent.ExecutorService. It allows you to execute your code in the cluster. In this section, the code examples are based on the Echo class above (please note that the Echo class is Serializable). The code examples show how Hazelcast can execute your code (Runnable, Callable):

  • echoOnTheMember: On a specific cluster member you choose with the IExecutorService submitToMember method.

  • echoOnTheMemberOwningTheKey: On the member owning the key you choose with the IExecutorService submitToKeyOwner method.

  • echoOnSomewhere: On the member Hazelcast picks with the IExecutorService submit method.

  • echoOnMembers: On all or a subset of the cluster members with the IExecutorService submitToMembers method.

public void echoOnTheMember( String input, Member member ) throws Exception {
    Callable<String> task = new Echo( input );
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Future<String> future = executorService.submitToMember( task, member );
    String echoResult = future.get();
}

public void echoOnTheMemberOwningTheKey( String input, Object key ) throws Exception {
    Callable<String> task = new Echo( input );
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Future<String> future = executorService.submitToKeyOwner( task, key );
    String echoResult = future.get();
}

public void echoOnSomewhere( String input ) throws Exception {
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Future<String> future = executorService.submit( new Echo( input ) );
    String echoResult = future.get();
}

public void echoOnMembers( String input, Set<Member> members ) throws Exception {
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Map<Member, Future<String>> futures = executorService
      .submitToMembers( new Echo( input ), members );

    for ( Future<String> future : futures.values() ) {
        String echoResult = future.get();
        // ...
    }
}

You can obtain the set of cluster members via the HazelcastInstance.getCluster().getMembers() call.

10.1.5. Canceling an Executing Task

A task in the code that you execute in a cluster might take longer than expected. If you cannot stop/cancel that task, it will keep eating your resources.

To cancel a task, you can use the standard Java executor framework’s cancel() API. This framework encourages us to code and design for cancellations, a highly ignored part of software development.

Example Task to Cancel

The Fibonacci callable class below calculates the Fibonacci number for a given number. In the calculate method, we check if the current thread is interrupted so that the code can respond to cancellations once the execution is started.

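import java.io.Serializable;
import java.util.concurrent.Callable;

public class FibonacciCallable implements Callable<Long>, Serializable {

    private final int input;

    public FibonacciCallable( int input ) {
        this.input = input;
    }

    @Override
    public Long call() {
        return calculate( input );
    }

    private long calculate( int n ) {
        // Stop early if the task was cancelled with future.cancel( true ).
        if ( Thread.currentThread().isInterrupted() ) {
            return 0;
        }
        if ( n <= 1 ) {
            return n;
        } else {
            return calculate( n - 1 ) + calculate( n - 2 );
        }
    }
}

Example Method to Execute and Cancel the Task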

The fib() method below submits the Fibonacci calculation task above for number 'n' and waits a maximum of 3 seconds for the result. If the execution is not completed in three seconds, future.get() throws a TimeoutException; upon catching it, we cancel the execution, saving some CPU cycles.

long fib( int n ) throws Exception {
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService es = hazelcastInstance.getExecutorService("es");
    Future<Long> future = es.submit( new FibonacciCallable( n ) );
    try {
        long result = future.get( 3, TimeUnit.SECONDS );
        System.out.println(result);
        return result;
    } catch ( TimeoutException e ) {
        // The task did not finish in time; interrupt it to free the executor thread.
        future.cancel( true );
        return -1;
    }
}

fib(20) will probably take less than 3 seconds. However, fib(50) will take much longer. (This is not an example for writing better Fibonacci calculation code, but for showing how to cancel a running execution that takes too long.) The method future.cancel(false) can only cancel execution before it is running (executing), but future.cancel(true) can interrupt running executions provided that your code is able to handle the interruption. If you want to be able to cancel an already running task, then your task should be designed to handle interruptions. If the calculate(int n) method did not have the Thread.currentThread().isInterrupted() check, then you would not be able to cancel the execution after it is started.
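The interplay between future.cancel(true) and the interruption check can be tried out without a cluster. The sketch below is a JDK-only illustration under assumptions: it uses a local single-threaded executor instead of IExecutorService, the class CancelDemo is a hypothetical name, and a busy loop stands in for a long calculation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; shows cooperative cancellation on a local executor.
class CancelDemo {

    // Returns true if the task was cancelled and its loop actually stopped.
    static boolean cancelRunningTask() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);
        Future<?> future = executor.submit(() -> {
            started.countDown();
            // Long-running work that checks the interrupt flag on every pass.
            while (!Thread.currentThread().isInterrupted()) {
                // busy work would go here
            }
            stopped.countDown();
        });
        try {
            started.await();
            try {
                future.get(100, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Too slow: interrupt the running task.
                future.cancel(true);
            }
            return future.isCancelled() && stopped.await(5, TimeUnit.SECONDS);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

If the loop did not check isInterrupted(), the Future would still report itself as cancelled, but the worker thread would keep running, which is the situation the paragraph above warns about.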

10.1.6. Callback When Task Completes

You can use the ExecutionCallback offered by Hazelcast to asynchronously be notified when the execution is done.

  • To be notified when your task completes without an error, implement the onResponse method.

  • To be notified when your task completes with an error, implement the onFailure method.

Example Task to Callback

Let’s use the Fibonacci series to explain this. The example code below is the calculation that will be executed. Note that it is Callable and Serializable.

public class Fibonacci2 implements Callable<Long>, Serializable {

    private final int input;

    public Fibonacci2(int input) {
        this.input = input;
    }

    public Long call() {
        return calculate(input);
    }

    private long calculate(int n) {
        if (Thread.currentThread().isInterrupted()) {
            System.out.println("Fibonacci2 is interrupted");
            throw new RuntimeException("Fibonacci2 is interrupted");
        }
        if (n <= 1) {
            return n;
        } else {
            return calculate(n - 1) + calculate(n - 2);
        }
    }
}

Example Method to Callback the Task

The example code below submits the Fibonacci calculation together with an ExecutionCallback and prints the result asynchronously. ExecutionCallback has the methods onResponse and onFailure. In this example code, onResponse is called upon a valid response and prints the calculation result, whereas onFailure is called upon a failure and prints the stacktrace.

public class MasterMemberCallback {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IExecutorService executor = hz.getExecutorService("executor");

        ExecutionCallback<Long> executionCallback = new ExecutionCallback<Long>() {
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }

            public void onResponse(Long response) {
                System.out.println("Result: " + response);
            }
        };

        executor.submit(new Fibonacci2(10), executionCallback);
        System.out.println("Fibonacci task submitted");
    }
}

10.1.7. Selecting Members for Task Execution

As previously mentioned, it is possible to indicate where in the Hazelcast cluster the Runnable or Callable is executed. Usually you execute these in the cluster based on the location of a key or a set of keys, or you allow Hazelcast to select a member.

If you want more control over where your code runs, use the MemberSelector interface. For example, you may want certain tasks to run only on certain members, or you may wish to implement some form of custom load balancing regime. The MemberSelector is an interface that you can implement and then provide to the IExecutorService when you submit or execute.

The select(Member) method is called for every available member in the cluster. Implement this method to decide if the member is going to be used or not.

In a simple example shown below, we select the cluster members based on the presence of an attribute.

public class MyMemberSelector implements MemberSelector {
    public boolean select(Member member) {
        return Boolean.TRUE.equals(member.getBooleanAttribute("my.special.executor"));
    }
}

You can use MemberSelector instances provided by the com.hazelcast.cluster.memberselector.MemberSelectors class. For example, you can select a lite member for running a task using com.hazelcast.cluster.memberselector.MemberSelectors#LITE_MEMBER_SELECTOR.

10.1.8. Configuring Executor Service

The following are example configurations for executor service.

Declarative:

<executor-service name="exec">
   <pool-size>1</pool-size>
   <queue-capacity>10</queue-capacity>
   <statistics-enabled>true</statistics-enabled>
   <quorum-ref>quorumname</quorum-ref>
</executor-service>

Programmatic:

Config config = new Config();
ExecutorConfig executorConfig = config.getExecutorConfig("exec");
executorConfig.setPoolSize( 1 ).setQueueCapacity( 10 )
        .setStatisticsEnabled( true )
        .setQuorumName( "quorumname" );

Executor service configuration has the following elements.

  • pool-size: The number of executor threads per Member for the Executor. By default, Executor is configured to have 16 threads in the pool. You can change that with this element.

  • queue-capacity: Executor’s task queue capacity; the number of tasks this queue can hold.

  • statistics-enabled: You can retrieve some statistics (such as pending operations count, started operations count, completed operations count and cancelled operations count) by setting this parameter’s value to true. The method for retrieving the statistics is getLocalExecutorStats().

  • quorum-ref: Name of quorum configuration that you want this Executor Service to use. Please see the Split-Brain Protection for IExecutorService section.

10.1.9. Split-Brain Protection for IExecutorService

IExecutorService can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful executor operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • execute

    • executeOnAllMembers

    • executeOnKeyOwner

    • executeOnMember

    • executeOnMembers

    • shutdown

    • shutdownNow

    • submit

    • submitToAllMembers

    • submitToKeyOwner

    • submitToMember

    • submitToMembers

Configuring Split-Brain Protection

Split-Brain protection for Executor Service can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<executor-service name="default">
   ...
   <quorum-ref>quorumname</quorum-ref>
   ...
</executor-service>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

10.2. Durable Executor Service

Hazelcast’s durable executor service is a data structure which is able to store an execution task both on the executing Hazelcast member and its backup member(s), if configured. This way, you do not lose any tasks if a member goes down, or any results if the submitter (member or client) goes down while the task is executing. When using the durable executor service you can either submit or execute a task randomly or on the owner of a provided key. Note that in the executor service (IExecutorService), by contrast, you can submit or execute tasks to/on selected member(s).

Processing of the tasks when using durable executor service involves two invocations:

  1. Sending the task to primary Hazelcast member (primary partition) and to its backups, if configured, and executing the task.

  2. Retrieving the result of the task.

As you may already know, Hazelcast’s executor service returns a future representing the task to the user. With the above two-invocations approach, it is guaranteed that the task is executed before the future returns and you can track the response of a submitted task with a unique ID. Hazelcast stores the task on both primary and backup members, and also starts the execution.

With the first invocation, a Ringbuffer stores the task and a generated sequence for the task is returned to the caller as a result. In addition to the storing, the task is executed on the local execution service for the primary member. This way, the task is now resilient to member failures and you are able to track the task with its ID.

After the first invocation has completed and the sequence of the task is returned, the second invocation starts to retrieve the result of the task with that sequence. This retrieval waits in the waiting operations queue until notified, or it runs immediately if the result is already available.

When task execution is completed, Ringbuffer replaces the task with the result for the given task sequence. This replacement notifies the waiting operations queue.
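The flow above can be pictured with a small, single-JVM toy model. This is only a sketch and not Hazelcast's implementation: the class DurableFlowSketch and its methods are hypothetical names, a plain array stands in for the Ringbuffer, a java.util.function.Supplier stands in for the task, and backups are not modeled.

```java
import java.util.function.Supplier;

// Toy model of the two-invocation flow; all names here are hypothetical.
class DurableFlowSketch {

    // Marks a slot whose task has already been replaced by its result.
    private static final class Result {
        final Object value;

        Result(Object value) {
            this.value = value;
        }
    }

    private final Object[] slots; // stands in for the Ringbuffer
    private long nextSequence = 0;

    DurableFlowSketch(int capacity) {
        this.slots = new Object[capacity];
    }

    // First invocation: store the task and return its sequence to the caller.
    synchronized long store(Supplier<?> task) {
        long sequence = nextSequence++;
        slots[index(sequence)] = task;
        return sequence;
    }

    // Execution: run the stored task, replace it with its result, notify waiters.
    void execute(long sequence) {
        Supplier<?> task;
        synchronized (this) {
            task = (Supplier<?>) slots[index(sequence)];
        }
        Object value = task.get();
        synchronized (this) {
            slots[index(sequence)] = new Result(value);
            notifyAll();
        }
    }

    // Second invocation: wait until the slot holds a result, or return at once if it already does.
    synchronized Object retrieve(long sequence) {
        while (!(slots[index(sequence)] instanceof Result)) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for the result", e);
            }
        }
        return ((Result) slots[index(sequence)]).value;
    }

    private int index(long sequence) {
        return (int) (sequence % slots.length);
    }
}
```

In this model, a caller first calls store(task) to obtain the sequence and later calls retrieve(sequence), which mirrors the way the task ID lets you reach the result after the first invocation has returned.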

10.2.1. Configuring Durable Executor Service

This section presents example configurations for durable executor service along with the descriptions of its configuration elements and attributes.

Declarative:

<durable-executor-service name="myDurableExecSvc">
        <pool-size>8</pool-size>
        <durability>1</durability>
        <capacity>1</capacity>
        <quorum-ref>quorumname</quorum-ref>
</durable-executor-service>

Programmatic:

Config config = new Config();
config.getDurableExecutorConfig( "myDurableExecSvc" )
        .setPoolSize ( 8 )
        .setDurability( 1 )
        .setCapacity( 1 )
        .setQuorumName( "quorumname" );

HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance(config);
DurableExecutorService durableExecSvc = hazelcast.getDurableExecutorService("myDurableExecSvc");

Following are the descriptions of each configuration element and attribute:

  • name: Name of the durable executor.

  • pool-size: Number of executor threads per member for the executor.

  • durability: Number of backups in the cluster for the submitted task. Its default value is 1.

  • capacity: Executor’s task queue capacity; the number of tasks this queue can hold.

  • quorum-ref: Name of quorum configuration that you want this Durable Executor Service to use. Please see the Split-Brain Protection for Durable Executor Service section.

10.2.2. Split-Brain Protection for Durable Executor Service

Durable Executor Service can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful executor operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • disposeResult

    • execute

    • executeOnKeyOwner

    • retrieveAndDisposeResult

    • shutdown

    • shutdownNow

    • submit

    • submitToKeyOwner

  • READ, READ_WRITE:

    • retrieveResult

Configuring Split-Brain Protection

Split-Brain protection for Durable Executor Service can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<durable-executor-service name="myDurableExecSvc">
    ...
        <quorum-ref>quorumname</quorum-ref>
        ...
</durable-executor-service>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

10.3. Scheduled Executor Service

Hazelcast’s scheduled executor service (IScheduledExecutorService) is a data structure which partially implements java.util.concurrent.ScheduledExecutorService. Here, partially means that it allows scheduling a single future execution and/or an execution at a fixed rate, but not at a fixed delay.

On top of the vanilla scheduling API, the scheduled executor service offers additional methods such as the following:

  • scheduleOnMember: On a specific cluster member.

  • scheduleOnKeyOwner: On the partition owning that key.

  • scheduleOnAllMembers: On all cluster members.

  • scheduleOnMembers: On all given members.

Please refer to the IScheduledExecutorService Javadoc for its API details.

There are two different modes of durability for the service:

  1. Upon partition-specific scheduling, the future task is stored both in the primary partition and in its N backups, N being the <durability> property in the configuration. More specifically, there are always one or more backups to take ownership of the task in the event of a lost member. If a member is lost, the task is re-scheduled on the backup (new primary) member, which might induce further delays on the subsequent executions of the task. For example, suppose we schedule a task to run 10 seconds from now with schedule(new ExampleTask(), 10, TimeUnit.SECONDS); and the owner member goes down after 5 seconds, before the execution takes place. The backup owner then re-schedules the task to run 10 seconds from that moment. Therefore, from the perspective of the user waiting on the result, the result is available in 10 + 5 = 15 seconds rather than in the 10 seconds originally anticipated. If atFixedRate was used, only the initial delay is affected in this scenario; all subsequent executions should adhere to the given period parameter.

  2. Upon member-specific scheduling, the future task is only stored in the member itself, which means that in the event of a lost member, the task will be lost as well.
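The delay arithmetic of the partition-specific case can be written down as a short calculation. The following is only a sketch under the assumptions stated in the example above (a single owner failure and a backup that re-schedules with the full original delay); the class and method names are hypothetical and not part of the Hazelcast API.

```java
// Hypothetical helper that reproduces the 10 + 5 = 15 seconds example.
class RescheduleDelaySketch {

    /**
     * Returns the number of seconds between the original schedule() call and the
     * moment the result becomes available.
     *
     * @param delaySeconds  the delay passed to schedule()
     * @param ownerLostAtSeconds seconds after scheduling at which the owner member
     *                      is lost, or a negative value if no member is lost
     */
    static long resultAvailableAfter(long delaySeconds, long ownerLostAtSeconds) {
        boolean lostBeforeExecution = ownerLostAtSeconds >= 0 && ownerLostAtSeconds < delaySeconds;
        // The backup owner re-schedules with the full delay, counted from the moment of the loss.
        return lostBeforeExecution ? ownerLostAtSeconds + delaySeconds : delaySeconds;
    }
}
```

With a delay of 10 seconds and an owner lost after 5 seconds, resultAvailableAfter(10, 5) yields 15; without a failure it yields the original 10.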

To accomplish the described durability, all tasks provide a unique identity/name before the scheduling takes place. The name allows the service to reach the scheduled task even after the caller (client or member) goes down, and it also prevents duplicate tasks. If needed, you can define the name of the task yourself by implementing the com.hazelcast.scheduledexecutor.NamedTask interface (a plain wrapper utility is available: com.hazelcast.scheduledexecutor.TaskUtils.named(java.lang.String, java.lang.Runnable)). If the task does not provide a name in its implementation, the service internally assigns a random UUID to it.

Upon scheduling, the service returns an IScheduledFuture which, on top of the java.util.concurrent.ScheduledFuture functionality, provides an API to get the resource handler of the task (ScheduledTaskHandler) and the runtime statistics of the task.

Futures associated with a scheduled task, in order to be aware of lost partitions and/or members, act as listeners on the local member/client. Therefore, they are always strongly referenced, on the member/client side. In order to clean up their resources, once completed, you can use the method dispose(). This method will also cancel further executions of the task if scheduled at fixed rate. You can refer to the IScheduledFuture Javadoc for its API details.

The task handler is a descriptor class holding information for the scheduled future, which is used to pinpoint the actual task in the cluster. It contains the name of the task, the owner (member or partition) and the scheduler name.

The handler is always available after scheduling. It can be stored in a plain string format using com.hazelcast.scheduledexecutor.ScheduledTaskHandler.toUrn() and reconstructed from that string using com.hazelcast.scheduledexecutor.ScheduledTaskHandler.of(). If the handler is lost, you can still find a task under a given scheduler by using the scheduler’s com.hazelcast.scheduledexecutor.IScheduledExecutorService.getAllScheduledFutures().

Last but not least, similar to the executor service, the scheduled executor service allows stateful tasks to be scheduled. Stateful tasks are tasks that require some kind of state during their runtime, and this state must be durable along with the task in the event of a lost partition.

Stateful tasks can be created by implementing the com.hazelcast.scheduledexecutor.StatefulTask interface, providing implementation details for saving the state and loading it back. If a partition is lost, then the re-scheduled task will load the previously saved state before its execution.

As with the tasks, the objects stored in the state map need to be Hazelcast serializable.

10.3.1. Configuring Scheduled Executor Service

This section presents example configurations for scheduled executor service along with the descriptions of its configuration elements and attributes.

Declarative:

<scheduled-executor-service name="myScheduledExecSvc">
        <pool-size>16</pool-size>
        <durability>1</durability>
        <capacity>100</capacity>
        <quorum-ref>quorumname</quorum-ref>
</scheduled-executor-service>

Programmatic:

Config config = new Config();
config.getScheduledExecutorConfig( "myScheduledExecSvc" )
        .setPoolSize ( 16 )
        .setCapacity( 100 )
        .setDurability( 1 )
        .setQuorumName( "quorumname" );

HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance(config);
IScheduledExecutorService myScheduledExecSvc = hazelcast.getScheduledExecutorService("myScheduledExecSvc");

Following are the descriptions of each configuration element and attribute:

  • name: Name of the scheduled executor.

  • pool-size: Number of executor threads per member for the executor.

  • capacity: Maximum number of tasks that a scheduler can have per partition. An attempt to schedule more results in a RejectedExecutionException. To free up capacity, the user should dispose of tasks that are no longer needed.

  • durability: Durability of the executor, i.e., the number of backups (N) kept for each scheduled task, as described above.

  • quorum-ref: Name of quorum configuration that you want this Scheduled Executor Service to use. Please see the Split-Brain Protection for IScheduled Executor Service section.

10.3.2. Examples

Scheduling a callable that computes the cluster size in 10 seconds from now:

static class DelayedClusterSizeTask implements Callable<Integer>, HazelcastInstanceAware, Serializable {

    private transient HazelcastInstance instance;

    @Override
    public Integer call()
            throws Exception {
        return instance.getCluster().getMembers().size();
    }

    @Override
    public void setHazelcastInstance(HazelcastInstance hazelcastInstance) {
        this.instance = hazelcastInstance;
    }
}

HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance();
IScheduledExecutorService executorService = hazelcast.getScheduledExecutorService("myScheduler");
IScheduledFuture<Integer> future = executorService.schedule(
        new DelayedClusterSizeTask(), 10, TimeUnit.SECONDS);

int membersCount = future.get(); // Block until we get the result
ScheduledTaskStatistics stats = future.getStats();
future.dispose(); // Always dispose futures that are not in use any more, to release resources
long totalTaskRuns = stats.getTotalRuns(); // = 1

10.3.3. Split-Brain Protection for IScheduled Executor Service

IScheduledExecutorService can be configured to check for a minimum number of available members before applying its operations (see Split-Brain Protection). This is a check to avoid performing successful scheduling operations on all parts of a cluster during a network partition.

Following is a list of methods that now support Split-Brain Protection checks. The list is grouped by quorum type.

  • WRITE, READ_WRITE:

    • schedule

    • scheduleAtFixedRate

    • scheduleOnAllMembers

    • scheduleOnAllMembersAtFixedRate

    • scheduleOnKeyOwner

    • scheduleOnKeyOwnerAtFixedRate

    • scheduleOnMember

    • scheduleOnMemberAtFixedRate

    • scheduleOnMembers

    • scheduleOnMembersAtFixedRate

    • shutdown

  • READ, READ_WRITE:

    • getAllScheduledFutures

Configuring Split-Brain Protection

Split-Brain protection for Scheduled Executor Service can be configured programmatically using the method setQuorumName(), or declaratively using the element quorum-ref. Following is an example declarative configuration:

<scheduled-executor-service name="myScheduledExecSvc">
    ...
        <quorum-ref>quorumname</quorum-ref>
        ...
</scheduled-executor-service>

The value of quorum-ref should be the quorum configuration name which you configured under the quorum element as explained in the Split-Brain Protection section.

10.4. Entry Processor

Hazelcast supports entry processing. An entry processor is a function that executes your code on a map entry in an atomic way.

An entry processor is a good option if you perform bulk processing on an IMap. Usually you loop over the keys: you execute IMap.get(key), mutate the value and finally put the entry back in the map using IMap.put(key,value). If you perform this process from a client or from a member where the keys do not exist, you effectively perform two network hops for each update: the first to retrieve the data and the second to update the mutated value.

If you are doing the process described above, you should consider using entry processors. An entry processor reads and updates the entry on the member where the data resides. This eliminates the costly network hops described above.

Entry processor is meant to process a single entry per call. Processing multiple entries and data structures in an entry processor is not supported as it may result in deadlocks.

Note that Hazelcast Jet is a good fit when you want to perform processing that involves multiple entries (aggregations, joins, etc.), or involves multiple computing steps to be made parallel. Hazelcast Jet contains an Entry Processor Sink to allow you to update Hazelcast IMDG data as a result of your Hazelcast Jet computation. Please refer to Hazelcast Jet’s Reference Manual.

10.4.1. Performing Fast In-Memory Map Operations

An entry processor enables fast in-memory operations on your map without you having to worry about locks or concurrency issues. You can apply it to a single map entry or to all map entries. Entry processors support choosing target entries using predicates. You do not need any explicit lock on the entry thanks to the isolated threading model: Hazelcast runs the entry processor for all entries on a partition thread, so there is no interleaving of the entry processor and other mutations.

Hazelcast sends the entry processor to each cluster member and these members apply it to map entries. Therefore, if you add more members, your processing completes faster.

Using Indexes

Entry processors can be used with predicates. Predicates help to process a subset of data by selecting eligible entries. This selection can happen either by doing a full-table scan or by using indexes. To accelerate the entry selection step, consider adding indexes. If indexes exist, the entry processor automatically uses them.

Using OBJECT In-Memory Format

If entry processing is the major operation for a map and if the map consists of complex objects, you should use OBJECT as the in-memory-format to minimize serialization cost. By default, the entry value is stored as a byte array (BINARY format). When it is stored as an object (OBJECT format), then the entry processor is applied directly on the object. In that case, no serialization or deserialization is performed. However, if there is a defined event listener, a new entry value will be serialized when passing to the event publisher service.

When in-memory-format is OBJECT, the old value of the updated entry will be null.

Processing Entries

The methods below are in the IMap interface for entry processing.

  • executeOnKey processes an entry mapped by a key.

  • executeOnKeys processes entries mapped by a collection of keys.

  • submitToKey processes an entry mapped by a key while listening to event status.

  • executeOnEntries processes all entries in a map.

  • executeOnEntries can also process all entries in a map with a defined predicate.

When using the executeOnEntries method, if the number of entries is high and you do not need the results, then returning null from the process() method is a good practice. This method is offered by the EntryProcessor interface. When null is returned, the results of the processing are not stored in the result map, and thus out of memory errors are avoided.

If you want to execute a task on a single key, you can also use executeOnKeyOwner provided by IExecutorService. However, in this case you need to perform a lock and serialization.

Entry processors run via operation threads that are dedicated to specific partitions. Therefore, with long running entry processor executions, other partition operations such as map.put(key, value) cannot be processed. With this in mind, it is a good practice to make your entry processor executions as quick as possible.

Respecting Locks on Single Keys

The entry processor respects locks ONLY when its executions are performed on a single key. As explained in the above section, the entry processor has the following methods to process a single key:

Object executeOnKey(K key, EntryProcessor entryProcessor);
ICompletableFuture submitToKey(K key, EntryProcessor entryProcessor);

Therefore, if you perform an entry processor execution on a single key using one of these methods and that key has a lock on it, the execution waits until the lock on that key is removed.

Processing Backup Entries

If your code modifies the data, then you should also provide a processor for backup entries. Providing a backup processor causes the processing to be applied to both the primary and the backup entries, which prevents the primary map entries from having values different from their backups. The interface EntryBackupProcessor offers the method processBackup for this purpose.

It is possible that an entry processor sees that a key exists while its backup processor does not find it at run time, due to an unsent backup of a previous operation, e.g., a previous put operation. In those situations, Hazelcast eventually synchronizes the owner and backup partitions internally, so you will not lose any data. When coding an EntryBackupProcessor, you should take that case into account; otherwise a NullPointerException can occur, since Map.Entry.getValue() may return null.
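The null case described above can be guarded against with an explicit check. The following is a minimal sketch that uses only java.util.Map.Entry, so it runs without a cluster; the class name is hypothetical, and skipping the update when the value is missing is an assumed policy that relies on the later synchronization of owner and backup partitions mentioned above.

```java
import java.util.Map;

// Hypothetical helper showing a null-tolerant body for a processBackup method.
class NullSafeBackupSketch {

    static void processBackup(Map.Entry<Integer, Integer> entry) {
        Integer value = entry.getValue();
        if (value == null) {
            // The backup of a previous operation has not arrived yet; do nothing
            // instead of failing with a NullPointerException (assumed policy).
            return;
        }
        entry.setValue(value + 1);
    }
}
```

The same check can be placed inside the processBackup method of your own EntryBackupProcessor implementation.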

10.4.2. Creating an Entry Processor

The class IncrementingEntryProcessor creates an entry processor to process the map entries. It implements:

  • the map interfaces EntryProcessor and EntryBackupProcessor.

  • java.io.Serializable interface.

  • EntryProcessor methods process and getBackupProcessor.

  • EntryBackupProcessor method processBackup.

public class IncrementingEntryProcessor
        implements EntryProcessor<Integer, Integer>, EntryBackupProcessor<Integer, Integer>, Serializable {

    public Object process( Map.Entry<Integer, Integer> entry ) {
        Integer value = entry.getValue();
        entry.setValue( value + 1 );
        return value + 1;
    }

    public EntryBackupProcessor<Integer, Integer> getBackupProcessor() {
        return IncrementingEntryProcessor.this;
    }

    public void processBackup( Map.Entry<Integer, Integer> entry ) {
        entry.setValue( entry.getValue() + 1 );
    }
}

A sample usage is shown below:

IMap<Integer, Integer> map = hazelcastInstance.getMap( "myMap" );
for ( int i = 0; i < 100; i++ ) {
    map.put( i, i );
}
Map<Integer, Object> res = map.executeOnEntries( new IncrementingEntryProcessor() );

You should explicitly call the setValue method of Map.Entry when modifying data in the entry processor. Otherwise, the entry processor will be accepted as read-only.

An entry processor instance is not thread safe. If you are storing a partition specific state between invocations, be sure to register this in a thread-local. An entry processor instance can be used by multiple partition threads.

10.4.3. Abstract Entry Processor

You can use the AbstractEntryProcessor class when the same processing will be performed both on the primary and backup map entries, i.e., the same logic applies to them. If you implement EntryProcessor directly, you need to apply the same logic to the backup entries separately. The AbstractEntryProcessor class makes this primary/backup processing easier.

You can use the AbstractEntryProcessor class to create your own abstract entry processor. The method getBackupProcessor in this class returns an EntryBackupProcessor instance. This means the same processing will be applied to both the primary and backup entries. If you want to apply the processing only upon the primary entries, make the getBackupProcessor method return null.

Beware of the null issue described above. Due to a yet unsent backup from a previous operation, an EntryBackupProcessor may temporarily receive null from Map.Entry.getValue() even though the value actually exists in the map. If you decide to use AbstractEntryProcessor, make sure your code logic is not sensitive to null values, or you may encounter NullPointerException during runtime.

10.4.4. Entry Processor Performance Optimizations

By default the entry processor executes on a partition thread. A partition thread is responsible for handling one or more partitions. The design of entry processor assumes users have fast user code execution of the process() method. In the pathological case where the code is very heavy and executes in multi-milliseconds, this may create a bottleneck.

We have a slow user code detector which can be used to log a warning controlled by the following system properties:

  • hazelcast.slow.operation.detector.enabled (default: true)

  • hazelcast.slow.operation.detector.threshold.millis (default: 10000)

The defaults catch extremely slow operations but you should set this much lower, say to 1ms, at development time to catch entry processors that could be problematic in production. These are good candidates for our optimizations.
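One way to lower the threshold at development time is to set the two properties listed above as JVM system properties before the Hazelcast member is started. The sketch below assumes exactly that; the class and method names are hypothetical, and the value 1 is the development-time threshold suggested above.

```java
// Hypothetical development-time helper; call it before creating the Hazelcast instance.
class SlowOperationDetectorDevSettings {

    static void apply() {
        // Keep the detector switched on (this is already the default).
        System.setProperty("hazelcast.slow.operation.detector.enabled", "true");
        // Log a warning for operations slower than 1 ms instead of the default 10000 ms.
        System.setProperty("hazelcast.slow.operation.detector.threshold.millis", "1");
    }
}
```

Because such a low threshold produces many warnings, it is meant for development environments only.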

We have two optimizations:

  • Offloadable which moves execution off the partition thread to an executor thread

  • ReadOnly which means we can avoid taking a lock on the key

These are enabled very simply by implementing these interfaces in your EntryProcessor.

As of 3.9, these optimizations apply to the following IMap methods only:

  • executeOnKey(Object, EntryProcessor)

  • submitToKey(Object, EntryProcessor)

  • submitToKey(Object, EntryProcessor, ExecutionCallback)

Offloadable Entry Processor

If an entry processor implements the Offloadable interface, the process() method will be executed in the executor specified by the Offloadable’s getExecutorName() method.

Offloading will unblock the partition thread allowing the user to profit from much higher throughput. The key will be locked for the time span of the processing in order to not generate a write conflict.

In this case the threading looks as follows:

  1. partition thread (fetch entry & lock key)

  2. execution thread (process(entry) method)

  3. partition thread (set new value & unlock key, or just unlock key if the entry has not been modified)
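The three steps above can be modeled with plain java.util.concurrent classes. This is a toy sketch and not how Hazelcast implements offloading: the class name, the single-threaded executor standing in for the partition thread, and the set of locked keys are all assumptions made for illustration, and the sketch always writes the new value back instead of checking whether the entry was modified.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.UnaryOperator;

// Toy model of the offloaded threading; all names are hypothetical.
class OffloadThreadingSketch {

    static int run(Map<String, Integer> store, String key, UnaryOperator<Integer> process) {
        ExecutorService partitionThread = Executors.newSingleThreadExecutor();
        ExecutorService offloadExecutor = Executors.newFixedThreadPool(2);
        Set<String> lockedKeys = ConcurrentHashMap.newKeySet();
        try {
            return CompletableFuture
                    // 1. partition thread: fetch the entry and lock the key
                    .supplyAsync(() -> {
                        lockedKeys.add(key);
                        return store.get(key);
                    }, partitionThread)
                    // 2. execution thread: run the (potentially heavy) processing
                    .thenApplyAsync(process, offloadExecutor)
                    // 3. partition thread: set the new value and unlock the key
                    .thenApplyAsync(newValue -> {
                        store.put(key, newValue);
                        lockedKeys.remove(key);
                        return newValue;
                    }, partitionThread)
                    .join();
        } finally {
            partitionThread.shutdown();
            offloadExecutor.shutdown();
        }
    }
}
```

While step 2 runs on the other executor, the single "partition thread" of the model is free to serve other work, which is the point of the optimization.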

The method getExecutorName() may also return two constants defined in the Offloadable interface:

  • NO_OFFLOADING: Processing will not be offloaded if the method getExecutorName() returns this constant; it will be executed as if it does not implement the Offloadable interface.

  • OFFLOADABLE_EXECUTOR: Processing will be offloaded to the default ExecutionService.OFFLOADABLE_EXECUTOR.

Note that if no executor is configured with the name returned by the method getExecutorName(), then the default executor service is used. Here is the configuration for the "default" executor:

<executor-service name="default">
    <pool-size>16</pool-size>
    <queue-capacity>0</queue-capacity>
</executor-service>

An example executor configuration for an Offloadable whose executor name is "OffloadedInventoryEntryProcessor" would be as follows:

<executor-service name="OffloadedInventoryEntryProcessor">
    <pool-size>30</pool-size>
    <queue-capacity>0</queue-capacity>
</executor-service>

Remember to set the pool-size (count of executor threads per member) according to your execution needs. Please refer to the Configuring Executor Service section for the configuration details.

ReadOnly Entry Processor

By default, an entry processor will not run if the key is locked. It will wait until the key has been unlocked (this applies to the executeOnKey and submitToKey methods mentioned before).

If the entry processor implements the ReadOnly interface without implementing the Offloadable interface, the processing will not be offloaded to an external executor. However, the entry processor will not check whether the key of the processed entry is locked, nor will it try to acquire the lock, since the entry processor will not make any modifications.

If the entry processor implements ReadOnly and modifies the entry, an UnsupportedOperationException will be thrown.

ReadOnly and Offloadable Entry Processor

If the entry processor implements both ReadOnly and Offloadable interfaces, we will observe the combination of both optimizations described above.

The process() method will be executed in the executor specified by the Offloadable’s getExecutorName() method. Also, the entry processor will not check whether the key of the processed entry is locked, nor will it try to acquire the lock, since the entry processor will not make any modifications.

In this case the threading looks as follows:

  1. partition thread (fetch entry)

  2. execution thread (process(entry))

In this case, EntryProcessor.getBackupProcessor() has to return null; otherwise an IllegalArgumentException is thrown.

If the entry processor implements ReadOnly and modifies the entry, an UnsupportedOperationException will be thrown.

Putting it all together:

public class OffloadableReadOnlyEntryProcessor implements EntryProcessor<String, Employee>,
        Offloadable, ReadOnly {

    @Override
    public Object process(Map.Entry<String, Employee> entry) {
        // heavy logic
        return null;
    }

    @Override
    public EntryBackupProcessor<String, Employee> getBackupProcessor() {
        // ReadOnly EntryProcessor has to return null, since it's just a read-only operation that will not be
        // executed on the backup
        return null;
    }

    @Override
    public String getExecutorName() {
        return OFFLOADABLE_EXECUTOR;
    }
}

11. Distributed Query

Distributed queries access data from multiple data sources stored on either the same or different members.

Hazelcast partitions your data and spreads it across the cluster members. You can iterate over the map entries and look for certain entries (specified by predicates) you are interested in. However, this is not very efficient because you have to bring the entire entry set to the caller and iterate locally. Instead, Hazelcast allows you to run distributed queries on your distributed map.

11.1. How Distributed Query Works

  1. The requested predicate is sent to each member in the cluster.

  2. Each member looks at its own local entries and filters them according to the predicate. At this stage, key/value pairs of the entries are deserialized and then passed to the predicate.

  3. The predicate requester merges all the results coming from each member into a single set.

Distributed query is highly scalable. If you add new members to the cluster, the partition count for each member is reduced and thus the time spent by each member on iterating its entries is reduced. In addition, the pool of partition threads evaluates the entries concurrently in each member and the network traffic is also reduced since only filtered data is sent to the requester.
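The three steps can be pictured with a small, single-JVM model in which each "member" is just a local map. This is only an illustration of the filter-then-merge idea, not Hazelcast code: the class name is hypothetical, java.util.function.Predicate stands in for a Hazelcast predicate, and no serialization or networking is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Toy model of a distributed query over maps of name -> age; names are hypothetical.
class DistributedQuerySketch {

    // Step 2: a member looks only at its own local entries and filters them.
    static List<String> filterLocally(Map<String, Integer> localEntries, Predicate<Integer> predicate) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : localEntries.entrySet()) {
            if (predicate.test(entry.getValue())) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    // Step 1 (send the predicate to every member) and step 3 (merge the partial results).
    static Set<String> query(List<Map<String, Integer>> members, Predicate<Integer> predicate) {
        Set<String> merged = new TreeSet<>();
        for (Map<String, Integer> member : members) {
            // Only the filtered keys travel back to the requester.
            merged.addAll(filterLocally(member, predicate));
        }
        return merged;
    }
}
```

Adding another map to the members list in this model corresponds to adding a member: each one has fewer entries to scan, while the requester still only receives the filtered results.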

Hazelcast offers the following APIs for distributed query purposes:

  • Criteria API

  • Distributed SQL Query

11.1.1. Employee Map Query Example

Assume that you have an "employee" map containing values of Employee objects, as coded below.

public class Employee implements Serializable {
    private String name;
    private int age;
    private boolean active;
    private double salary;

    public Employee(String name, int age, boolean active, double salary) {
        this.name = name;
        this.age = age;
        this.active = active;
        this.salary = salary;
    }

    public Employee() {
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public double getSalary() {
        return salary;
    }

    public boolean isActive() {
        return active;
    }
}

Now let’s look for the employees who are active and have an age less than 30 using the aforementioned APIs (Criteria API and Distributed SQL Query). The following subsections describe each query mechanism for this example.

When using Portable objects, if one field of an object exists on one member but does not exist on another one, Hazelcast does not throw an unknown field exception. Instead, Hazelcast treats that predicate, which tries to perform a query on an unknown field, as an always false predicate.

11.1.2. Querying with Criteria API

Criteria API is a programming interface offered by Hazelcast that is similar to the Java Persistence Query Language (JPQL). Below is the code for the above example query.

IMap<String, Employee> map = hazelcastInstance.getMap( "employee" );

EntryObject e = new PredicateBuilder().getEntryObject();
Predicate predicate = e.is( "active" ).and( e.get( "age" ).lessThan( 30 ) );

Collection<Employee> employees = map.values( predicate );

In the above example code, predicate verifies whether the entry is active and its age value is less than 30. This predicate is applied to the employee map using the map.values(predicate) method. This method sends the predicate to all cluster members and merges the results coming from them. Since the predicate is communicated between the members, it needs to be serializable.

Predicates can also be applied to keySet, entrySet and localKeySet of the Hazelcast distributed map.

Predicates Class Operators

The Predicates class offered by Hazelcast includes many operators for your query requirements. Some of them are explained below.

  • equal: Checks if the result of an expression is equal to a given value.

  • notEqual: Checks if the result of an expression is not equal to a given value.

  • instanceOf: Checks if the result of an expression has a certain type.

  • like: Checks if the result of an expression matches some string pattern. % (percentage sign) is the placeholder for many characters, _ (underscore) is the placeholder for only one character.

  • greaterThan: Checks if the result of an expression is greater than a certain value.

  • greaterEqual: Checks if the result of an expression is greater than or equal to a certain value.

  • lessThan: Checks if the result of an expression is less than a certain value.

  • lessEqual: Checks if the result of an expression is less than or equal to a certain value.

  • between: Checks if the result of an expression is between two values (this is inclusive).

  • in: Checks if the result of an expression is an element of a certain collection.

  • isNot: Checks if the result of an expression is false.

  • regex: Checks if the result of an expression matches some regular expression.

Please see the Predicates Javadoc for all predicates provided.

Combining Predicates with AND, OR, NOT

You can combine predicates using the and, or and not operators, as shown in the below examples.

public Collection<Employee> getWithNameAndAge( String name, int age ) {
    Predicate namePredicate = Predicates.equal( "name", name );
    Predicate agePredicate = Predicates.equal( "age", age );
    Predicate predicate = Predicates.and( namePredicate, agePredicate );
    return employeeMap.values( predicate );
}
public Collection<Employee> getWithNameOrAge( String name, int age ) {
    Predicate namePredicate = Predicates.equal( "name", name );
    Predicate agePredicate = Predicates.equal( "age", age );
    Predicate predicate = Predicates.or( namePredicate, agePredicate );
    return employeeMap.values( predicate );
}
public Collection<Employee> getNotWithName( String name ) {
    Predicate namePredicate = Predicates.equal( "name", name );
    Predicate predicate = Predicates.not( namePredicate );
    return employeeMap.values( predicate );
}

Simplifying with PredicateBuilder

You can simplify predicate usage with the PredicateBuilder class, which offers simpler predicate building. Please see the below example code which selects all people with a certain name and age.

public Collection<Employee> getWithNameAndAgeSimplified( String name, int age ) {
    EntryObject e = new PredicateBuilder().getEntryObject();
    Predicate agePredicate = e.get( "age" ).equal( age );
    Predicate predicate = e.get( "name" ).equal( name ).and( agePredicate );
    return employeeMap.values( predicate );
}

11.1.3. Querying with SQL

com.hazelcast.query.SqlPredicate takes the regular SQL where clause. Here is an example:

IMap<String, Employee> map = hazelcastInstance.getMap( "employee" );
Collection<Employee> employees = map.values( new SqlPredicate( "active AND age < 30" ) );
Supported SQL Syntax

AND/OR: <expression> AND <expression> AND <expression>…

  • active AND age>30

  • active=false OR age = 45 OR name = 'Joe'

  • active AND ( age > 20 OR salary < 60000 )

Equality: =, !=, <, <=, >, >=

  • <expression> = value

  • age <= 30

  • name = 'Joe'

  • salary != 50000

BETWEEN: <attribute> [NOT] BETWEEN <value1> AND <value2>

  • age BETWEEN 20 AND 33 ( same as age >= 20 AND age <= 33 )

  • age NOT BETWEEN 30 AND 40 ( same as age < 30 OR age > 40 )

IN: <attribute> [NOT] IN (val1, val2,…​)

  • age IN ( 20, 30, 40 )

  • age NOT IN ( 60, 70 )

  • active AND ( salary >= 50000 OR ( age NOT BETWEEN 20 AND 30 ) )

  • age IN ( 20, 30, 40 ) AND salary BETWEEN 50000 AND 80000

LIKE: <attribute> [NOT] LIKE 'expression'

The % (percentage sign) is a placeholder for any number of characters, and an _ (underscore) is a placeholder for exactly one character.

  • name LIKE 'Jo%' (true for 'Joe', 'Josh', 'Joseph' etc.)

  • name LIKE 'Jo_' (true for 'Joe'; false for 'Josh')

  • name NOT LIKE 'Jo_' (true for 'Josh'; false for 'Joe')

  • name LIKE 'J_s%' (true for 'Josh', 'Joseph'; false for 'John', 'Joe')

ILIKE: <attribute> [NOT] ILIKE 'expression'

Similar to LIKE predicate but in a case-insensitive manner.

  • name ILIKE 'Jo%' (true for 'Joe', 'joe', 'jOe','Josh','joSH', etc.)

  • name ILIKE 'Jo_' (true for 'Joe' or 'jOE'; false for 'Josh')

REGEX: <attribute> [NOT] REGEX 'expression'

  • name REGEX 'abc-.*' (true for 'abc-123'; false for 'abx-123')
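
The LIKE and ILIKE wildcard rules listed above can be illustrated with a small plain-Java sketch. It does not use the Hazelcast API and is not how SqlPredicate is implemented; the class LikePatternSketch and its methods are hypothetical names introduced only for this illustration. It assumes that % can be translated to the regular expression ".*" and _ to ".".

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Hazelcast API.
final class LikePatternSketch {

    // Translates a LIKE pattern into a java.util.regex.Pattern:
    // '%' becomes ".*", '_' becomes ".", every other character is matched literally.
    static Pattern toRegex(String likePattern, boolean caseInsensitive) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        int flags = caseInsensitive ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(regex.toString(), flags);
    }

    // Case-sensitive match, in the spirit of LIKE.
    static boolean like(String value, String pattern) {
        return toRegex(pattern, false).matcher(value).matches();
    }

    // Case-insensitive match, in the spirit of ILIKE.
    static boolean ilike(String value, String pattern) {
        return toRegex(pattern, true).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("Joseph", "J_s%"));
        System.out.println(ilike("jOE", "Jo_"));
    }
}
```

With this translation, 'Jo_' matches 'Joe' but not 'Josh', which agrees with the examples in the lists above.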

Querying Entry Keys with Predicates

You can use __key attribute to perform a predicated search for entry keys. Please see the following example:

IMap<String, Person> personMap = hazelcastInstance.getMap("persons");
personMap.put("Alice", new Person("Alice", 35, Gender.FEMALE));
personMap.put("Andy",  new Person("Andy",  37, Gender.MALE));
personMap.put("Bob",   new Person("Bob",   22, Gender.MALE));
[...]
Predicate predicate = new SqlPredicate("__key like A%");
Collection<Person> startingWithA = personMap.values(predicate);

In this example, the code creates a collection of the values whose keys start with the letter "A".

11.1.4. Filtering with Paging Predicates

Hazelcast provides paging for defined predicates. With its PagingPredicate class, you can get a collection of keys, values, or entries page by page by filtering them with predicates and giving the size of the pages. Also, you can sort the entries by specifying comparators.

In the example code below:

  • The greaterEqual predicate gets values from the "students" map. This predicate has a filter to retrieve the objects with an "age" greater than or equal to 18.

  • Then a PagingPredicate is constructed with a page size of 5, so there will be five objects in each page. The first call to values() retrieves the first page.

  • Subsequent pages are retrieved by calling the nextPage() method of PagingPredicate and querying the map again with the updated PagingPredicate.

IMap<Integer, Student> map = hazelcastInstance.getMap( "students" );
Predicate greaterEqual = Predicates.greaterEqual( "age", 18 );
PagingPredicate pagingPredicate = new PagingPredicate( greaterEqual, 5 );
// Retrieve the first page
Collection<Student> values = map.values( pagingPredicate );
...
// Set up next page
pagingPredicate.nextPage();
// Retrieve next page
values = map.values( pagingPredicate );
...

If a comparator is not specified for PagingPredicate, but you want to get a collection of keys or values page by page, the keys or values must implement java.lang.Comparable. Otherwise, a java.lang.IllegalArgumentException is thrown.

Starting with Hazelcast 3.6, you can also access a specific page directly using the setPage() method. This way, if you query for the hundredth page, for example, Hazelcast gets all 100 pages at once instead of reaching the hundredth page one by one with nextPage(). Please note that this feature puts pressure on memory; refer to the PagingPredicate Javadoc for details.

Paging Predicate, also known as Order & Limit, is not supported in Transactional Context.
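
The filter, sort and page steps described in this section can be pictured with the following plain-Java sketch. It is only a single-JVM illustration of the semantics and does not use or reproduce PagingPredicate; the class PagingSketch, its page() method and the zero-based page index are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of "filter, order, then cut into pages"; not Hazelcast code.
final class PagingSketch {

    // Returns the page with the given zero-based index, or an empty list past the last page.
    static <T> List<T> page(List<T> values, Predicate<T> filter, Comparator<T> order,
                            int pageSize, int pageIndex) {
        if (pageSize <= 0 || pageIndex < 0) {
            throw new IllegalArgumentException("pageSize must be > 0 and pageIndex >= 0");
        }
        List<T> matching = new ArrayList<>();
        for (T value : values) {
            if (filter.test(value)) {
                matching.add(value);
            }
        }
        matching.sort(order);
        int from = Math.min(pageIndex * pageSize, matching.size());
        int to = Math.min(from + pageSize, matching.size());
        return new ArrayList<>(matching.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(17, 30, 18, 25, 21, 19, 40);
        for (int pageIndex = 0; pageIndex < 3; pageIndex++) {
            System.out.println(
                page(ages, age -> age >= 18, Comparator.<Integer>naturalOrder(), 5, pageIndex));
        }
    }
}
```

The sketch also shows why ordering matters: without a comparator or naturally comparable elements, the boundaries between pages would not be well defined.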

11.1.5. Filtering with Partition Predicate

You can run queries on a single partition in your cluster using the partition predicate (PartitionPredicate).

It takes a predicate and partition key as parameters, gets the partition ID using the key and runs that predicate only on the partition where that key belongs.

Please see the following code snippet:

...
Predicate predicate = new PartitionPredicate<String, Integer>(partitionKey, TruePredicate.INSTANCE);

Collection<Integer> values = map.values(predicate);
Collection<String> keys = map.keySet(predicate);
...

By default there are 271 partitions, and with a regular predicate each of them needs to be accessed. The partition predicate, however, accesses only a single partition, which can lead to a big performance gain.

For the partition predicate to work correctly, you need to know which partition your data belongs to so that you can send the request to the correct partition. One of the ways of doing it is to make use of the PartitionAware interface when data is inserted, thereby controlling the owning partition. Please see the PartitionAware section for more information and examples.

A concrete example may be a webshop that sells phones and accessories. To find all the accessories of a phone, a query could be executed that selects all accessories for that phone. This query is executed on all members in the cluster and therefore could generate quite a lot of load. However, if the accessories are stored in the same partition as the phone, the partition predicate can use the partitionKey of the phone to select the right partition and query only that partition for the accessories. This reduces the load on the system and returns query results faster.
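
The webshop idea can be sketched in plain Java as follows. This is only an illustration of why a shared partition key places related data in one partition: the hashCode()-based mapping is an assumption made for brevity and is not Hazelcast's actual hashing, and PartitionSketch and AccessoryKey are hypothetical names that merely imitate the role of a PartitionAware key.

```java
// Hypothetical illustration; the real partition ID computation in Hazelcast differs.
final class PartitionSketch {

    // Default partition count mentioned in the text above.
    static final int PARTITION_COUNT = 271;

    // Imitates a key that reports the phone ID as its partition key,
    // so that every accessory of a phone shares the phone's partition.
    static final class AccessoryKey {
        final String accessoryId;
        final String phoneId;

        AccessoryKey(String accessoryId, String phoneId) {
            this.accessoryId = accessoryId;
            this.phoneId = phoneId;
        }

        String getPartitionKey() {
            return phoneId;
        }
    }

    // Simplified mapping from a partition key to a partition ID in [0, PARTITION_COUNT).
    static int partitionId(Object partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), PARTITION_COUNT);
    }

    public static void main(String[] args) {
        AccessoryKey caseKey = new AccessoryKey("case-7", "phone-1");
        AccessoryKey chargerKey = new AccessoryKey("charger-3", "phone-1");
        System.out.println(partitionId("phone-1"));
        System.out.println(partitionId(caseKey.getPartitionKey()));
        System.out.println(partitionId(chargerKey.getPartitionKey()));
    }
}
```

Because both accessory keys report the same partition key as the phone, all three lookups resolve to the same partition, which is the property a partition predicate relies on.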

11.1.6. Indexing Queries

Hazelcast distributed queries run on each member in parallel and return only the results to the caller. The results are then merged on the caller side.

When a query runs on a member, Hazelcast iterates through all the owned entries and finds the matching ones. This can be made faster by indexing the most frequently queried fields, just like you would do for your database. Indexing adds overhead to each write operation, but queries will be a lot faster. If you query your map a lot, make sure to add indexes for the most frequently queried fields. For example, if you do an active and age < 30 query, make sure you add an index for the active and age fields. The following example code does that by getting the map from the Hazelcast instance and adding indexes to the map with the IMap addIndex method.

IMap map = hazelcastInstance.getMap( "employees" );
// ordered, since we have ranged queries for this field
map.addIndex( "age", true );
// not ordered, because boolean field cannot have range
map.addIndex( "active", false );
Indexing Ranged Queries

IMap.addIndex(fieldName, ordered) is used to add an index. For each indexed field, if you have ranged queries such as age>30 or age BETWEEN 40 AND 60, you should set the ordered parameter to true. Otherwise, set it to false.
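
The reason an ordered index helps ranged queries can be pictured with a sorted map. The following plain-Java sketch is an analogy only and makes no claim about how Hazelcast stores its indexes internally; OrderedIndexSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical analogy: a sorted map from attribute value to the keys of matching entries.
final class OrderedIndexSketch {

    // Builds the "index": age -> keys of the entries that have this age.
    static NavigableMap<Integer, Set<String>> buildIndex(Map<String, Integer> ageByKey) {
        NavigableMap<Integer, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : ageByKey.entrySet()) {
            index.computeIfAbsent(entry.getValue(), age -> new TreeSet<>()).add(entry.getKey());
        }
        return index;
    }

    // Answers "age BETWEEN low AND high" (inclusive) by visiting only the matching slice
    // of the sorted map instead of every entry.
    static Set<String> between(NavigableMap<Integer, Set<String>> index, int low, int high) {
        Set<String> result = new TreeSet<>();
        if (low > high) {
            return result;
        }
        for (Set<String> keys : index.subMap(low, true, high, true).values()) {
            result.addAll(keys);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> ageByKey = new HashMap<>();
        ageByKey.put("Alice", 35);
        ageByKey.put("Andy", 37);
        ageByKey.put("Bob", 22);
        System.out.println(between(buildIndex(ageByKey), 30, 40));
    }
}
```

A hash-based structure could answer only equality lookups efficiently, which is why an unordered index is sufficient for fields such as a boolean flag.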

Configuring IMap Indexes

Also, you can define IMap indexes in configuration. An example is shown below.

<map name="default">
  ...
  <indexes>
    <index ordered="false">name</index>
    <index ordered="true">age</index>
  </indexes>
</map>

You can also define IMap indexes using programmatic configuration, as in the example below.

mapConfig.addMapIndexConfig( new MapIndexConfig( "name", false ) );
mapConfig.addMapIndexConfig( new MapIndexConfig( "age", true ) );

The following is the Spring declarative configuration for the same sample.

<hz:map name="default">
  <hz:indexes>
    <hz:index attribute="name"/>
    <hz:index attribute="age" ordered="true"/>
  </hz:indexes>
</hz:map>
Non-primitive types to be indexed should implement Comparable.
Starting with Hazelcast 3.9, if you configure the data structure to use High-Density Memory Store and indexes, the indexes are automatically stored in the High-Density Memory Store as well. This prevents full GCs when the index is updated frequently.
Copying Indexes

The underlying data structures used by the indexes need to copy the query results to make sure that the results are correct. This copying process is performed either when reading the index from the data structure (on-read) or writing to it (on-write).

On-read copying means that, for each index-read operation, the result of the query is copied before it is sent to the caller. Depending on the query result’s size, this type of index copying may be slower since the result is stored in a map, i.e., all entries need to have the hash calculated before being stored. Unlike the index-read operations, each index-write operation is fast, since there will be no copying taking place. So, this option can be preferred in index-write intensive cases.

On-write copying means that each index-write operation completely copies the underlying map to provide the copy-on-write semantics and this may be a slow operation depending on the index size. Unlike index-write operations, each index-read operation is fast since the operation only includes accessing the map that stores the results and returning them to the caller.

Another option is never copying the results of a query to a separate map. This means the results backed by the underlying index-map can change after the query has been executed (such as an entry might have been added or removed from an index, or it might have been remapped). This option can be preferred if you expect "mostly correct" results, i.e., if it is not a problem when some entries returned in the query result set do not match the initial query criteria. This is the fastest option since there is no copying.

You can set one of these options using the system property hazelcast.index.copy.behavior. The following values, which are explained in the above paragraphs, can be set:

  • COPY_ON_READ (the default value)

  • COPY_ON_WRITE

  • NEVER

Usage of this system property is supported for the BINARY and OBJECT in-memory formats. In Hazelcast 3.8.7 only, it is also supported for the NATIVE in-memory format.
Indexing Attributes with ValueExtractor

You can also define custom attributes that may be referenced in predicates, queries and indexes. Custom attributes can be defined by implementing a ValueExtractor. Please see the Custom Attributes section for details.

Using "this" as an Attribute

You can use the keyword this as an attribute name while adding an index or creating a predicate. A basic usage is shown below.

map.addIndex("this", true);
Predicate<Integer, Integer> between = Predicates.between("this", 12, 20);

Another basic sample using SqlPredicate is shown below.

new SqlPredicate("this = 'jones'")
new SqlPredicate("this.age > 33")

The special attribute this acts on the value of a map entry. Typically, you do not need to specify it while accessing a property of an entry’s value, since its presence is implicitly assumed if the special attribute __key is not specified.

11.1.7. Configuring Query Thread Pool

You can change the size of the thread pool dedicated to query operations using the pool-size property. Each query consumes a single thread from a Generic Operations ThreadPool on each Hazelcast member; let’s call it the query-orchestrating thread. That thread is blocked for the whole execution span of the query on that member.

The query-orchestrating thread will use the threads from the query-thread pool in two cases:

  • if you run a PagingPredicate - since each page is run as a separate task,

  • if you set the system property hazelcast.query.predicate.parallel.evaluation to true - since the predicates are evaluated in parallel.

Please see Filtering with Paging Predicates and System Properties sections for information on paging predicates and for description of the above system property.

Below is an example of that declarative configuration.

<executor-service name="hz:query">
  <pool-size>100</pool-size>
</executor-service>

Below is the equivalent programmatic configuration.

Config cfg = new Config();
cfg.getExecutorConfig("hz:query").setPoolSize(100);
Query Requests from Clients

When dealing with the query requests coming from the clients to your members, Hazelcast offers the following system properties to tune your thread pools:

  • hazelcast.clientengine.thread.count which is the number of threads to process non-partition-aware client requests, like map.size() and executor tasks. Its default value is the number of cores multiplied by 20.

  • hazelcast.clientengine.query.thread.count which is the number of threads to process query requests coming from the clients. Its default value is the number of cores.

If there are a lot of query requests from the clients, you may want to increase the value of hazelcast.clientengine.query.thread.count. In addition to this tuning, you may also consider increasing the value of hazelcast.clientengine.thread.count if the CPU load in your system is not high and there is plenty of free memory.

11.2. Querying in Collections and Arrays

Hazelcast allows querying in collections and arrays. Querying in collections and arrays is compatible with all Hazelcast serialization methods, including the Portable serialization.

Let’s have a look at the following data structure expressed in pseudo-code:

class Motorbike {
    Wheel wheels[2];
}

class Wheel {
   String name;

}

In order to query a single element of a collection/array, you can execute the following query:

// it matches all motorbikes where the zero wheel's name is 'front-wheel'
Predicate p = Predicates.equal("wheels[0].name", "front-wheel");
Collection<Motorbike> result = map.values(p);

It is also possible to query a collection/array using the any semantic as shown below:

// it matches all motorbikes where any wheel's name is 'front-wheel'
Predicate p = Predicates.equal("wheels[any].name", "front-wheel");
Collection<Motorbike> result = map.values(p);

The exact same query may be executed using the SqlPredicate as shown below:

Predicate p = new SqlPredicate("wheels[any].name = 'front-wheel'");
Collection<Motorbike> result = map.values(p);

[] notation applies to both collections and arrays.

11.2.1. Indexing in Collections and Arrays

You can also create an index using a query in collections and arrays.

Please note that in order to leverage the index, the attribute name used in the query has to be the same as the one used in the index definition.

Let’s assume you have the following index definition:

<indexes>
  <index ordered="false">wheels[any].name</index>
</indexes>

The following query will use the index:

Predicate p = Predicates.equal("wheels[any].name", "front-wheel");

The following query, however, will NOT leverage the index, since it does not use exactly the same attribute name that was used in the index:

Predicates.equal("wheels[0].name", "front-wheel")

In order to use the index in the case mentioned above, you have to create another index, as shown below:

<indexes>
  <index ordered="false">wheels[0].name</index>
</indexes>

11.2.2. Corner cases

Handling of corner cases may be a bit different than in a programming language like Java.

Let’s have a look at the following examples in order to understand the differences. To make the analysis simpler, let’s assume that there is only one Motorbike object stored in a Hazelcast Map.

Id  Query                                              Data State              Extraction Result  Match
1   Predicates.equal("wheels[7].name", "front-wheel")  wheels.size() == 1      null               No
2   Predicates.equal("wheels[7].name", null)           wheels.size() == 1      null               Yes
3   Predicates.equal("wheels[0].name", "front-wheel")  wheels[0].name == null  null               No
4   Predicates.equal("wheels[0].name", null)           wheels[0].name == null  null               Yes
5   Predicates.equal("wheels[0].name", "front-wheel")  wheels[0] == null       null               No
6   Predicates.equal("wheels[0].name", null)           wheels[0] == null       null               Yes
7   Predicates.equal("wheels[0].name", "front-wheel")  wheels == null          null               No
8   Predicates.equal("wheels[0].name", null)           wheels == null          null               Yes

As you can see, no NullPointerException or IndexOutOfBoundsException is thrown in the extraction process, even though parts of the expression are null.

Looking at examples 4, 6 and 8, we can also easily notice that it is impossible to distinguish which part of the expression was null. If we execute the following query wheels[1].name = null, it may be evaluated to true because:

  • wheels collection/array is null.

  • index == 1 is out of bounds.

  • name attribute of the wheels[1] object is null.

In order to make the query unambiguous, extra conditions would have to be added, e.g., wheels != null AND wheels[1].name = null.
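
The corner-case behavior described above can be imitated with a null-tolerant getter in plain Java. The sketch below only mirrors the documented outcomes (a missing part of the path yields null); it is not Hazelcast's extraction code, and NullSafeExtractionSketch, wheelName() and nameEquals() are hypothetical names.

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical imitation of extracting "wheels[index].name" without throwing exceptions.
final class NullSafeExtractionSketch {

    static final class Wheel {
        final String name;

        Wheel(String name) {
            this.name = name;
        }
    }

    static final class Motorbike {
        final List<Wheel> wheels;

        Motorbike(List<Wheel> wheels) {
            this.wheels = wheels;
        }
    }

    // Returns null if the collection is null, the index is out of bounds,
    // the element is null, or the name itself is null.
    static String wheelName(Motorbike bike, int index) {
        if (bike == null || bike.wheels == null) {
            return null;
        }
        if (index < 0 || index >= bike.wheels.size()) {
            return null;
        }
        Wheel wheel = bike.wheels.get(index);
        return wheel == null ? null : wheel.name;
    }

    // Imitates an equality check on the extracted value; null matches only null.
    static boolean nameEquals(Motorbike bike, int index, String expected) {
        return Objects.equals(wheelName(bike, index), expected);
    }

    public static void main(String[] args) {
        Motorbike bike = new Motorbike(Collections.singletonList(new Wheel("front-wheel")));
        System.out.println(nameEquals(bike, 7, null));
        System.out.println(nameEquals(bike, 0, "front-wheel"));
    }
}
```

Since all four causes collapse into the same null result, a comparison against null alone cannot tell them apart, which is why the extra condition on wheels is needed to make such a query unambiguous.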

11.3. Custom Attributes

It is possible to define a custom attribute that may be referenced in predicates, queries and indexes.

A custom attribute is a "synthetic" attribute that does not exist as a field or a getter in the object that it is extracted from. Thus, it is necessary to define the policy on how the attribute is supposed to be extracted. Currently the only way to extract a custom attribute is to implement a com.hazelcast.query.extractor.ValueExtractor that encompasses the extraction logic.

Custom Attributes are compatible with all Hazelcast serialization methods, including the Portable serialization.

11.3.1. Implementing a ValueExtractor

In order to implement a ValueExtractor, extend the abstract com.hazelcast.query.extractor.ValueExtractor class and implement the extract() method. This method does not return any values since the extracted value is collected by the ValueCollector. In order to return multiple results from a single extraction, invoke the ValueCollector.collect() method multiple times, so that the collector collects all results.

Please refer to the ValueExtractor and ValueCollector Javadocs.

ValueExtractor with Portable Serialization

Portable serialization is a special kind of serialization where there is no need to have the class of the serialized object on the classpath in order to read its attributes. That is the reason why the target object passed to the ValueExtractor.extract() method will not be of the exact type that has been stored. Instead, an instance of a com.hazelcast.query.extractor.ValueReader will be passed. ValueReader enables reading the attributes of a Portable object in a generic and type-agnostic way. It contains two methods:

  • read(String path, ValueCollector<T> collector) - enables passing all results directly to the ValueCollector.

  • read(String path, ValueCallback<T> callback) - enables filtering, transforming and grouping the result of the read operation and manually passing it to the ValueCollector.

Please