Hazelcast IMDG

Hazelcast IMDG Reference Manual

Version 4.0-SNAPSHOT

Preface

Welcome to the Hazelcast IMDG (In-Memory Data Grid) Reference Manual. This manual includes concepts, instructions and examples to guide you on how to use Hazelcast and build Hazelcast IMDG applications.

As the reader of this manual, you must be familiar with the Java programming language and you should have installed your preferred Integrated Development Environment (IDE).

Hazelcast IMDG Editions

This Reference Manual covers all editions of Hazelcast IMDG. Throughout this manual:

  • Hazelcast or Hazelcast IMDG refers to the open source edition of Hazelcast in-memory data grid middleware. Hazelcast is also the name of the company (Hazelcast, Inc.) providing the Hazelcast product.

  • Hazelcast IMDG Enterprise is a commercially licensed edition of Hazelcast IMDG which provides high-value enterprise features in addition to Hazelcast IMDG.

  • Hazelcast IMDG Enterprise HD is a commercially licensed edition of Hazelcast IMDG which provides High-Density (HD) Memory Store and Hot Restart Persistence features in addition to Hazelcast IMDG Enterprise.

Hazelcast IMDG Architecture

You can see the features for all Hazelcast IMDG editions in the following architecture diagram.

Hazelcast Architecture
You can see small "HD" boxes for some features in the above diagram. Those features can use High-Density (HD) Memory Store when it is available. It means if you have Hazelcast IMDG Enterprise HD, you can use those features with HD Memory Store.

For more information on Hazelcast IMDG’s Architecture, see the white paper An Architect’s View of Hazelcast.

Hazelcast IMDG Plugins

You can extend Hazelcast IMDG’s functionality by using its plugins. These plugins have their own lifecycles. See the Plugins page to learn about Hazelcast plugins you can use. Hazelcast plugins are marked with Plugin label throughout this manual. See also the Hazelcast Plugins chapter for more information.

Licensing

Hazelcast IMDG and Hazelcast Reference Manual are free and provided under the Apache License, Version 2.0. Hazelcast IMDG Enterprise and Hazelcast IMDG Enterprise HD is commercially licensed by Hazelcast, Inc.

For more detailed information on licensing, see the License Questions appendix.

Trademarks

Hazelcast is a registered trademark of Hazelcast, Inc. All other trademarks in this manual are held by their respective owners.

Customer Support

Support for Hazelcast is provided via GitHub\, Mail Group and StackOverflow.

For information on the commercial support for Hazelcast IMDG and Hazelcast IMDG Enterprise, see hazelcast.com.

Release Notes

See the Release Notes document for the new features, enhancements and fixes performed for each Hazelcast IMDG release.

Contributing to Hazelcast IMDG

You can contribute to the Hazelcast IMDG code, report a bug, or request an enhancement. See the following resources.

Partners

Hazelcast partners with leading hardware and software technologies, system integrators, resellers and OEMs including Amazon Web Services, Vert.x, Azul Systems, C2B2. See the Partners page for the full list of and information on our partners.

Phone Home

Hazelcast uses phone home data to learn about usage of Hazelcast IMDG.

Hazelcast IMDG member instances call our phone home server initially when they are started and then every 24 hours. This applies to all the instances joined to the cluster.

What is sent in?

The following information is sent in a phone home:

  • Hazelcast IMDG version

  • Local Hazelcast IMDG member UUID

  • Download ID

  • A hash value of the cluster ID

  • Cluster size bands for 5, 10, 20, 40, 60, 100, 150, 300, 600 and > 600

  • Number of connected clients bands of 5, 10, 20, 40, 60, 100, 150, 300, 600 and > 600

  • Cluster uptime

  • Member uptime

  • Environment Information:

    • Name of operating system

    • Kernel architecture (32-bit or 64-bit)

    • Version of operating system

    • Version of installed Java

    • Name of Java Virtual Machine

  • Hazelcast IMDG Enterprise specific:

    • Number of clients by language (Java, C++, C#)

    • Flag for Hazelcast Enterprise

    • Hash value of license key

    • Native memory usage

  • Hazelcast Management Center specific:

    • Hazelcast Management Center version

    • Hash value of Hazelcast Management Center license key

Phone Home Code

The phone home code itself is open source. See the code here.

Disabling Phone Homes

Set the hazelcast.phone.home.enabled system property to false either in the config or on the Java command line. See the System Properties appendix for information on how to set a property.

You can also disable the phone home using the environment variable HZ_PHONE_HOME_ENABLED.

Simply add the following line to your .bash_profile:

export HZ_PHONE_HOME_ENABLED=false

Phone Home URLs

For versions 1.x and 2.x: http://www.hazelcast.com/version.jsp.

For versions 3.x up to 3.6: http://versioncheck.hazelcast.com/version.jsp.

For versions after 3.6: http://phonehome.hazelcast.com/ping.

1. Document Revision History

This chapter lists the changes made to this document from the previous release.

See the Release Notes for the new features, enhancements and fixes performed for each Hazelcast release.
Table 1. Revision History

Chapter

Description

All

Most of this Reference Manual has been generally improved/updated due to IMDG 4.0 API changes. Other than the specific updates listed in the following table rows, here are the ones which affected large portions of the document:

  • Renamed the group configuration element as cluster.

  • Renamed the quorum term as "split-brain protection" (also in the codes and configuration).

  • Reorganized the whole document after the legacy concurrency APIs are removed.

  • Reorganized the entry processor and WAN replication content.

  • Renamed max-size configuration element for eviction as size.

Understanding Configuration

Added content related to newly introduced complete example configurations for Hazelcast Java client and client failover.

[setting-up-clusters<]

Rolling Member Upgrades

Added Enabling Auto-Upgrading as a new section.

Distributed Data Structures

Added new content related to EntryStore and EntryLoader to the Loading and Storing Persistent Data section.

Distributed Events

Updated the Listening for Migration Events according to the changes in the MigrationListener API.

Distributed Query

Removed the MapReduce and Aggregators content. (Previously Section 11.4 and 11.5, respectively.)

Renamed the Fast-Aggregations section as [aggregations].

CP Subsystem

Added CP Subsystem Persistence as a new section.

Added the description of initial-permits configuration element which specifies the number of permits to initialize a Semaphore.

Storage

Added Using Persistent Memory as a new section.

Added Encryption at Rest as a new section.

Hazelcast Clients

Added Checking the Name of the Instance for REST Client as a new section describing how to check your Hazelcast instance names.

Added Configuring Backup Acknowledgment as a new section.

Management

Updated the Using the cluster.sh Script and Using the healthcheck.sh Script sections to include HTTPS support.

Security

The whole chapter has been updated with the latest changes. These include the following:

  • Member security configuration

  • LDAP and X.509 certificate authentication

  • JAAS authentication

Added Security Debugging as a new parent section for Java security and TLS debugging.

Extending Hazelcast

Removed the User Defined Services section since it is not supported anymore.

Migration Guides

Added the migration guide for IMDG 4.0 release.

System Properties

Added the descriptions for the following new system properties:

  • hazelcast.member.naming.moby.enabled

  • hazelcast.client.cleanup.period.millis

  • hazelcast.client.cleanup.timeout.millis

  • hazelcast.client.protocol.max.message.bytes

  • hazelcast.clientengine.query.thread.count

2. Getting Started

This chapter explains how to install Hazelcast and start a Hazelcast member and client. It describes the executable files in the download package and also provides the fundamentals for configuring Hazelcast and its deployment options.

2.1. Installation

The following sections explain the installation of Hazelcast IMDG and Hazelcast IMDG Enterprise. It also includes notes and changes to consider when upgrading Hazelcast.

2.1.1. Installing Hazelcast IMDG

You can find Hazelcast in standard Maven repositories. If your project uses Maven, you do not need to add additional repositories to your pom.xml or add hazelcast-4.0-SNAPSHOT.jar file into your classpath (Maven does that for you). Just add the following lines to your pom.xml:

<dependencies>
    <dependency>
        <groupId>com.hazelcast</groupId>
        <artifactId>hazelcast</artifactId>
        <version>4.0-SNAPSHOT</version>
    </dependency>
</dependencies>

As an alternative, you can download and install Hazelcast IMDG yourself. You only need to:

  • download the package hazelcast-4.0-SNAPSHOT.zip or hazelcast-4.0-SNAPSHOT.tar.gz from hazelcast.org

  • extract the downloaded hazelcast-4.0-SNAPSHOT.zip or hazelcast-4.0-SNAPSHOT.tar.gz

  • and add the file hazelcast-4.0-SNAPSHOT.jar to your classpath.

2.1.2. Installing Hazelcast IMDG Enterprise

There are two Maven repositories defined for Hazelcast IMDG Enterprise:

<repository>
    <id>Hazelcast Private Snapshot Repository</id>
    <url>https://repository.hazelcast.com/snapshot/</url>
</repository>
<repository>
    <id>Hazelcast Private Release Repository</id>
    <url>https://repository.hazelcast.com/release/</url>
</repository>

Hazelcast IMDG Enterprise customers may also define dependencies. See the following example:

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-enterprise</artifactId>
    <version>4.0-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-enterprise-all</artifactId>
    <version>4.0-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-enterprise-all</artifactId>
    <version>4.0-SNAPSHOT</version>
    <classifier>javadoc</classifier>
</dependency>

2.1.3. Setting the License Key

Hazelcast IMDG Enterprise offers you two types of licenses: Enterprise and Enterprise HD. The supported features differ in your Hazelcast setup according to the license type you own.

  • Enterprise license: In addition to the open source edition of Hazelcast, Enterprise features are the following:

    • Security

    • WAN Replication

    • Clustered REST

    • Clustered JMX

    • Striim Hot Cache

    • Rolling Upgrades

  • Enterprise HD license: In addition to the Enterprise features, Enterprise HD features are the following:

    • High-Density Memory Store

    • Hot Restart Persistence

To use Hazelcast IMDG Enterprise, you need to set the provided license key using one of the configuration methods shown below.

Hazelcast IMDG Enterprise license keys are required only for members. You do not need to set a license key for your Java clients for which you want to use IMDG Enterprise features.

Declarative Configuration:

Add the below line to any place you like in the file hazelcast.xml. This XML file offers you a declarative way to configure your Hazelcast. It is included in the Hazelcast download package. When you extract the downloaded package, you will see the file hazelcast.xml under the /bin directory.

<hazelcast>
    ...
    <license-key>Your Enterprise License Key</license-key>
    ...
</hazelcast>

Programmatic Configuration:

Alternatively, you can set your license key programmatically as shown below.

Config config = new Config();
config.setLicenseKey( "Your Enterprise License Key" );

Spring XML Configuration:

If you are using Spring with Hazelcast, then you can set the license key using the Spring XML schema, as shown below.

<hz:config>
    ...
    <hz:license-key>Your Enterprise License Key</hz:license-key>
    ...
</hz:config>

JVM System Property:

As another option, you can set your license key using the below command (the "-D" command line option).

-Dhazelcast.enterprise.license.key=Your Enterprise License Key
License Key Format

License keys have the following format:

<Name of the Hazelcast edition>#<Count of the Members>#<License key>

The strings before the <License key> is the human readable part. You can use your license key with or without this human readable part. So, both the following example license keys are valid:

HazelcastEnterpriseHD#2Nodes#1q2w3e4r5t
1q2w3e4r5t

2.1.4. License Information

License information is available through the following Hazelcast APIs.

JMX

The MBean HazelcastInstance.LicenseInfo holds all the relative license details and can be accessed through Hazelcast’s JMX port (if enabled). The following parameters represent these details:

  • maxNodeCountAllowed: Maximum members allowed to form a cluster under the current license.

  • expiryDate: Expiration date of the current license.

  • typeCode: Type code of the current license.

  • type: Type of the current license.

  • ownerEmail: Email of the current license’s owner.

  • companyName: Company name on the current license.

Following is the list of license types and typeCodes:

MANAGEMENT_CENTER(1, "Management Center"),
ENTERPRISE(0, "Enterprise"),
ENTERPRISE_SECURITY_ONLY(2, "Enterprise only with security"),
ENTERPRISE_HD(3, "Enterprise HD"),
CUSTOM(4, "Custom");
REST

You can access the license details by issuing a GET request through the REST API (if enabled; see the Using the REST Endpoint Groups section) on the /license resource, as shown below.

curl -v http://localhost:5701/hazelcast/rest/license

Its output is similar to the following:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 5701 (#0)
> GET /hazelcast/rest/license HTTP/1.1
> Host: localhost:5701
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Content-Length: 187
<
licenseInfo{"expiryDate":1560380399161,"maxNodeCount":10,"type":-1,
"companyName":"ExampleCompany","ownerEmail":"info@example.com",
"keyHash":"ml/u6waTNQ+T4EWxnDRykJpwBmaV9uj+skZzv0SzDhs="}

To update the license of a running cluster, you can issue a POST request through the REST API (if enabled; see the Using the REST Endpoint Groups section) on the /license as shown below:

curl --data "${CLUSTERNAME}&${PASSWORD}&${LICENSE}" http://localhost:5001/hazelcast/rest/license
The request parameters must be properly URL-encoded as described in the REST Client section.

The above command updates the license on all running Hazelcast members of the cluster. If successful, the response looks as follows:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 5001 (#0)
> POST /hazelcast/rest/license HTTP/1.1
> Host: 127.0.0.1:5001
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 164
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 164 out of 164 bytes
< HTTP/1.1 200 OK
< Content-Type: application/javascript
< Content-Length: 364
<
* Connection #0 to host 127.0.0.1 left intact
{"status":"success","licenseInfo":{"expiryDate":1560380399161,"maxNodeCount":10,
"type":-1,"companyName":"ExampleCompany","ownerEmail":"info@example.com",
"keyHash":"ml/u6waTNQ+T4EWxnDRykJpwBmaV9uj+skZzv0SzDhs="},
"message":"License updated at run time - please make sure to update the license
in the persistent configuration to avoid losing the changes on restart."}

As the message in the above example indicates, the license is updated only at runtime. The persistent configuration of each member needs to be updated manually to ensure that the license change is not lost on restart. The same message is logged as a warning in each member’s log.

It is only possible to update to a license that expires at the same time or after the current license. The new license must allow the exact same list of features and the same number of members.

If, for any reason, updating the license fails on some members (member does not respond, license is not compatible, etc.), the whole operation fails, leaving the cluster in a potentially inconsistent state (some members have been switched to the new license while some have not). It is up to you to resolve this situation manually.

Logs

Besides the above approaches (JMX and REST) to access the license details, Hazelcast also starts to log a license information banner into the log files when the license expiration is approaching.

During the last two months prior to the expiration, this license information banner is logged daily, as a reminder to renew your license to avoid any interruptions. Once the expiration is due to a month, the frequency of logging this banner becomes hourly (instead of daily). Lastly, when the expiration is due in a week, this banner is printed every 30 minutes.

Similar alerts are also present on the Hazelcast Management Center.

The banner has the following format:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ WARNING @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
HAZELCAST LICENSE WILL EXPIRE IN 29 DAYS.
Your Hazelcast cluster will stop working after this time.

Your license holder is customer@example-company.com, you should have them contact
our license renewal department, urgently on info@hazelcast.com
or call us on +1 (650) 521-5453

Please quote license id CUSTOM_TEST_KEY

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Please pay attention to the license warnings to prevent any possible interruptions in the operation of your Hazelcast applications.

2.2. Supported Java Virtual Machines

Following table summarizes the version compatibility between Hazelcast IMDG and various vendors' Java Virtual Machines (JVMs).

Table 2. Supported JVMs
Hazelcast IMDG Version JDK Version Oracle JDK IBM SDK, Java Technology Edition Azul Zing JDK OpenJDK

Up to 3.11

(JDK 6 support is dropped with the release of Hazelcast IMDG 3.12)

6

Up to 3.11

(JDK 7 support is dropped with the release of Hazelcast IMDG 3.12)

7

Up to current

8

  • 3.11 and newer: Fully supported.

  • 3.10 and older: Partially supported.

9

(JDK not available yet)

(JDK not available yet)

  • 3.11 and newer: Fully supported.

  • 3.10 and older: Partially supported.

10

(JDK not available yet)

(JDK not available yet)

  • 3.11 and newer: Fully supported.

  • 3.10 and older: Partially supported.

11

(JDK not available yet)

(JDK not available yet)

(JDK not available yet)

Hazelcast IMDG 3.10 and older releases are not fully tested on JDK 9 and newer, so there may be some features that are not working properly.

See the following sections for the details of Hazelcast IMDG supporting JDK 9 and newer:

2.3. Running in Modular Java

Java project Jigsaw brought a new Module System into Java 9 and newer. Hazelcast supports running in the modular environment. If you want to run your application with Hazelcast libraries on the modulepath, use the following module name:

  • com.hazelcast.core for hazelcast-4.0-SNAPSHOT.jar and hazelcast-enterprise-4.0-SNAPSHOT.jar

Don’t use hazelcast-all-4.0-SNAPSHOT.jar or hazelcast-enterprise-all-4.0-SNAPSHOT.jar on the modulepath as it could lead to problems in module dependencies for your application. You can still use them on the classpath.

The Java Module System comes with stricter visibility rules. It affects Hazelcast which uses internal Java API to reach the best performance results.

Hazelcast needs the java.se module and access to the following Java packages for a proper work:

  • java.base/jdk.internal.ref

  • java.base/java.nio (reflective access)

  • java.base/sun.nio.ch (reflective access)

  • java.base/java.lang (reflective access)

  • jdk.management/com.ibm.lang.management.internal (reflective access)

  • jdk.management/com.sun.management.internal (reflective access)

  • java.management/sun.management (reflective access)

You can provide the access to the above mentioned packages by using --add-exports and --add-opens (for the reflective access) Java arguments.

Example: Running a member on the classpath

java --add-modules java.se \
  --add-exports java.base/jdk.internal.ref=ALL-UNNAMED \
  --add-opens java.base/java.lang=ALL-UNNAMED \
  --add-opens java.base/java.nio=ALL-UNNAMED \
  --add-opens java.base/sun.nio.ch=ALL-UNNAMED \
  --add-opens java.management/sun.management=ALL-UNNAMED \
  --add-opens jdk.management/com.ibm.lang.management.internal=ALL-UNNAMED \
  --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED \
  -jar hazelcast-4.0-SNAPSHOT.jar

Example: Running a member on the modulepath

java --add-modules java.se \
  --add-exports java.base/jdk.internal.ref=com.hazelcast.core \
  --add-opens java.base/java.lang=com.hazelcast.core \
  --add-opens java.base/java.nio=com.hazelcast.core \
  --add-opens java.base/sun.nio.ch=com.hazelcast.core \
  --add-opens java.management/sun.management=com.hazelcast.core \
  --add-opens jdk.management/com.ibm.lang.management.internal=com.hazelcast.core \
  --add-opens jdk.management/com.sun.management.internal=com.hazelcast.core \
  --module-path lib \
  --module com.hazelcast.core/com.hazelcast.core.server.HazelcastMemberStarter

This example expects hazelcast-4.0-SNAPSHOT.jar placed in the lib directory.

2.4. Starting the Member and Client

Having installed Hazelcast, you can get started.

In this short tutorial, you perform the following activities:

  1. Create a simple Java application using the Hazelcast distributed map and queue.

  2. Run our application twice to have a cluster with two members (JVMs).

  3. Connect to our cluster from another Java application by using the Hazelcast Native Java Client API.

Let’s begin.

  • The following code starts the first Hazelcast member and creates and uses the customers map and queue.

    Config cfg = new Config();
    HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg);
    Map<Integer, String> mapCustomers = instance.getMap("customers");
    mapCustomers.put(1, "Joe");
    mapCustomers.put(2, "Ali");
    mapCustomers.put(3, "Avi");
    
    System.out.println("Customer with key 1: "+ mapCustomers.get(1));
    System.out.println("Map Size:" + mapCustomers.size());
    
    Queue<String> queueCustomers = instance.getQueue("customers");
    queueCustomers.offer("Tom");
    queueCustomers.offer("Mary");
    queueCustomers.offer("Jane");
    System.out.println("First customer: " + queueCustomers.poll());
    System.out.println("Second customer: "+ queueCustomers.peek());
    System.out.println("Queue size: " + queueCustomers.size());
  • Run this GettingStarted class a second time to get the second member started. The members form a cluster and the output is similar to the following.

    Members {size:2, ver:2} [
        Member [127.0.0.1]:5701 - e40081de-056a-4ae5-8ffe-632caf8a6cf1 this
        Member [127.0.0.1]:5702 - 93e82109-16bf-4b16-9c87-f4a6d0873080
    ]

    Here, you can see the size of your cluster (size) and member list version (ver). The member list version is incremented when changes happen to the cluster, e.g., a member leaving from or joining to the cluster.

    The above member list format is introduced with Hazelcast 3.9. You can enable the legacy member list format, which was used for the releases before Hazelcast 3.9, using the system property hazelcast.legacy.memberlist.format.enabled. See the System Properties appendix. The following is an example for the legacy member list format:

    Members [2] {
        Member [127.0.0.1]:5701 - c1ccc8d4-a549-4bff-bf46-9213e14a9fd2 this
        Member [127.0.0.1]:5702 - 33a82dbf-85d6-4780-b9cf-e47d42fb89d4
    }
  • Now, add the hazelcast-client-4.0-SNAPSHOT.jar library to your classpath. This is required to use a Hazelcast client.

  • The following code starts a Hazelcast Client, connects to our cluster, and prints the size of the customers map.

    public class GettingStartedClient {
        public static void main( String[] args ) {
            ClientConfig clientConfig = new ClientConfig();
            HazelcastInstance client = HazelcastClient.newHazelcastClient( clientConfig );
            IMap map = client.getMap( "customers" );
            System.out.println( "Map Size:" + map.size() );
        }
    }
  • When you run it, you see the client properly connecting to the cluster and printing the map size as 3.

Hazelcast also offers a tool, Management Center, that enables you to monitor your cluster. It is included in your Hazelcast IMDG download package and can also be downloaded from the Hazelcast website’s download page. You can use this tool to monitor your cluster, cluster members, clients, data structures and WAN replications. See the documentation for details on Hazelcast Management Center.

By default, Hazelcast uses multicast to discover other members that can form a cluster. If you are working with other Hazelcast developers on the same network, you may find yourself joining their clusters under the default settings. Hazelcast provides a way to segregate clusters within the same network when using multicast. See the Creating Cluster Groups section for more information. Alternatively, if you do not wish to use the default multicast mechanism, you can provide a fixed list of IP addresses that are allowed to join. See the Join configuration section for more information.

Multicast mechanism is not recommended for production since UDP is often blocked in production environments and other discovery mechanisms are more definite. See the Discovery Mechanisms section.
You can also check the video tutorials here.

2.5. Using the Scripts In The Package

When you download and extract the Hazelcast ZIP or TAR.GZ package, you will see the following scripts under the /bin folder that provide basic functionalities for member and cluster management.

The following are the names and descriptions of each script:

  • start.sh / start.bat: Starts a Hazelcast member with default configuration in the working directory.

  • stop.sh / stop.bat: Stops the Hazelcast member that was started in the current working directory.

    start.sh / start.bat scripts lets you start one Hazelcast instance per folder. To start a new instance, please unzip Hazelcast ZIP or TAR.GZ package in a new folder.
    You can also use the start scripts to deploy your own library to a Hazelcast member. See the Adding User Library to CLASSPATH section.
  • cluster.sh: Provides basic functionalities for cluster management, such as getting and changing the cluster state, shutting down the cluster or forcing the cluster to clean its persisted data and make a fresh start. See the Using the Script cluster.sh section to learn the usage of this script.

  • cp-subsystem.sh: Provides access to the CP subsystem management APIs using the REST interface. See the CP Subsystem Management APIs section.

  • healthcheck.sh: Provides basic information about your clusters such as the state and size. See the Using the healthcheck.sh Script section.

2.6. Deploying using Hazelcast Cloud

A simple option for deploying Hazelcast is Hazelcast Cloud. It delivers enterprise-grade Hazelcast software in the cloud. You can deploy, scale and update your Hazelcast easily using Hazelcast Cloud; it maintains the clusters for you. You can use Hazelcast Cloud as a low-latency high-performance caching or data layer for your microservices, and it is also a nice solution for state management of serverless functions (AWS Lambda).

Hazelcast Cloud uses Docker and Kubernetes, and is powered by Hazelcast IMDG Enterprise HD. It is initially available on Amazon Web Services (AWS), to be followed by Microsoft Azure and Google Cloud Platform (GCP). Since it is based on Hazelcast IMDG Enterprise HD, it features advanced functionality such as TLS, multi-region, persistence, and high availability.

2.7. Deploying On Amazon EC2

You can easily deploy your Hazelcast projects on Amazon EC2 instances. For this, you can use Hazelcast’s AWS cloud discovery module. This module helps the cluster members discover each other and form a cluster on EC2. It has easy-to-apply features including tagging, IAM roles, and connections to the cluster from clients outside the cloud.

See the Hazelcast AWS cloud discovery module’s documentation to learn more about this module.

Alternative to the discovery module mentioned above, if you are a Vagrant and/or Chef user, you can also check our sample project that uses these 3rd party tools to deploy a Hazelcast cluster on EC2.

2.8. Deploying On Microsoft Azure

You can deploy your Hazelcast cluster onto a Microsoft Azure environment. For this, your cluster should make use of Hazelcast Discovery Plugin for Microsoft Azure. You can find information about this plugin on its GitHub repository at Hazelcast Azure.

For information on how to automatically deploy your cluster onto Azure, see the Deployment section of the Hazelcast Azure plugin repository.

2.9. Deploying On Pivotal Cloud Foundry

You can deploy your Hazelcast cluster onto Pivotal Cloud Foundry. It is available as a Pivotal Cloud Foundry Tile which you can download at here. You can find the installation and usage instructions and the release notes documents here.

2.10. Deploying using Docker

You can deploy your Hazelcast projects using the Docker containers. Hazelcast has the following images on Docker:

  • Hazelcast IMDG

  • Hazelcast IMDG Enterprise

  • Hazelcast Management Center

  • Hazelcast OpenShift

After you pull an image from the Docker registry, you can run your image to start the Management Center or a Hazelcast instance with Hazelcast’s default configuration. All repositories provide the latest stable releases but you can pull a specific release, too. You can also specify environment variables when running the image.

If you want to start a customized Hazelcast instance, you can extend the Hazelcast image by providing your own configuration file.

This feature is provided as a Hazelcast plugin. See its own GitHub repo at Hazelcast Docker for details on configurations and usages.

3. Hazelcast Overview

Hazelcast is an open source In-Memory Data Grid (IMDG). It provides elastically scalable distributed In-Memory computing, widely recognized as the fastest and most scalable approach to application performance. Hazelcast does this in open source. More importantly, Hazelcast makes distributed computing simple by offering distributed implementations of many developer-friendly interfaces from Java such as Map, Queue, ExecutorService, Lock and JCache. For example, the Map interface provides an In-Memory Key Value store which confers many of the advantages of NoSQL in terms of developer friendliness and developer productivity.

In addition to distributing data In-Memory, Hazelcast provides a convenient set of APIs to access the CPUs in your cluster for maximum processing speed. Hazelcast is designed to be lightweight and easy to use. Since Hazelcast is delivered as a compact library (JAR) and since it has no external dependencies other than Java, it easily plugs into your software solution and provides distributed data structures and distributed computing utilities.

Hazelcast is highly scalable and available. Distributed applications can use Hazelcast for distributed caching, synchronization, clustering, processing, pub/sub messaging, etc. Hazelcast is implemented in Java and has clients for Java, C/C++, .NET, REST, Python, Go and Node.js. Hazelcast also speaks Memcached protocol. It plugs into Hibernate and can easily be used with any existing database system.

If you are looking for in-memory speed, elastic scalability and the developer friendliness of NoSQL, Hazelcast is a great choice.

Hazelcast is Simple

Hazelcast is written in Java with no other dependencies. It exposes the same API from the familiar Java util package, exposing the same interfaces. Just add hazelcast.jar to your classpath and you can quickly enjoy JVMs clustering and start building scalable applications.

Hazelcast is Peer-to-Peer

Unlike many NoSQL solutions, Hazelcast is peer-to-peer. There is no master and slave; there is no single point of failure. All members store equal amounts of data and do equal amounts of processing. You can embed Hazelcast in your existing application or use it in client and server mode where your application is a client to Hazelcast members.

Hazelcast is Scalable

Hazelcast is designed to scale up to hundreds and thousands of members. Simply add new members; they automatically discover the cluster and linearly increase both the memory and processing capacity. The members maintain a TCP connection between each other and all communication is performed through this layer.

Hazelcast is Fast

Hazelcast stores everything in-memory. It is designed to perform very fast reads and updates.

Hazelcast is Redundant

Hazelcast keeps the backup of each data entry on multiple members. On a member failure, the data is restored from the backup and the cluster continues to operate without downtime.

3.1. Why Hazelcast?

A Glance at Traditional Data Persistence

Data is at the core of software systems. In conventional architectures, a relational database persists and provides access to data. Applications are talking directly with a database which has its backup as another machine. To increase performance, tuning or a faster machine is required. This can cost a large amount of money or effort.

There is also the idea of keeping copies of data next to the database, which is performed using technologies like external key-value stores or second level caching that help offload the database. However, when the database is saturated or the applications perform mostly "put" operations (writes), this approach is of no use because it insulates the database only from the "get" loads (reads). Even if the applications are read-intensive there can be consistency problems - when data changes, what happens to the cache and how are the changes handled? This is when concepts like time-to-live (TTL) or write-through come in.

In the case of TTL, if the access is less frequent than the TTL, the result is always a cache miss. On the other hand, in the case of write-through caches, if there are more than one of these caches in a cluster, it means there are consistency issues. This can be avoided by having the members communicate with each other so that entry invalidations can be propagated.

We can conclude that an ideal cache would combine TTL and write-through features. There are several cache servers and in-memory database solutions in this field. However, these are stand-alone single instances with a distribution mechanism that is provided by other technologies to an extent. So, we are back to square one; we experience saturation or capacity issues if the product is a single instance or if consistency is not provided by the distribution.

And, there is Hazelcast

Hazelcast, a brand new approach to data, is designed around the concept of distribution. Hazelcast shares data around the cluster for flexibility and performance. It is an in-memory data grid for clustering and highly scalable data distribution.

One of the main features of Hazelcast is that it does not have a master member. Each cluster member is configured to be the same in terms of functionality. The oldest member (the first member created in the cluster) automatically performs the data assignment to cluster members. If the oldest member dies, the second oldest member takes over.

You can come across with the term "master" or "master member" in some sections of this manual. They are used for contextual clarification purposes; please remember that they refer to the "oldest member" which is explained in the above paragraph.

Another main feature of Hazelcast is that the data is held entirely in-memory. This is fast. In the case of a failure, such as a member crash, no data is lost since Hazelcast distributes copies of the data across all the cluster members.

As shown in the feature list in the Distributed Data Structures chapter, Hazelcast supports a number of distributed data structures and computing utilities. These provide powerful ways of accessing distributed clustered memory and accessing CPUs for true distributed computing.

Hazelcast’s Distinctive Strengths

  • Hazelcast is open source.

  • Hazelcast is only a JAR file. You do not need to install software.

  • Hazelcast is a library, it does not impose an architecture on Hazelcast users.

  • Hazelcast provides out-of-the-box distributed data structures, such as Map, Queue, MultiMap, Topic, Lock and Executor.

  • There is no "master," meaning no single point of failure in a Hazelcast cluster; each member in the cluster is configured to be functionally the same.

  • When the size of your memory and compute requirements increase, new members can be dynamically joined to the Hazelcast cluster to scale elastically.

  • Data is resilient to member failure. Data backups are distributed across the cluster. This is a big benefit when a member in the cluster crashes as data is not lost.

  • Members are always aware of each other unlike in traditional key-value caching solutions.

  • You can build your own custom-distributed data structures using the Service Programming Interface (SPI) if you are not happy with the data structures provided.

Finally, Hazelcast has a vibrant open source community enabling it to be continuously developed.

Hazelcast is a fit when you need:

  • analytic applications requiring big data processing by partitioning the data

  • to retain frequently accessed data in the grid

  • a cache, particularly an open source JCache provider with elastic distributed scalability

  • a primary data store for applications with utmost performance, scalability and low-latency requirements

  • an In-Memory NoSQL Key Value Store

  • publish/subscribe communication at highest speed and scalability between applications

  • applications that need to scale elastically in distributed and cloud environments

  • a highly available distributed cache for applications

  • an alternative to Coherence and Terracotta.

3.2. Use Cases

Hazelcast can be used:

  • to share server configuration/information to see how a cluster performs

  • to cluster highly changing data with event notifications, e.g., user based events, and to queue and distribute background tasks

  • as a simple Memcache with Near Cache

  • as a cloud-wide scheduler of certain processes that need to be performed on some members

  • to share information (user information, queues, maps, etc.) on the fly with multiple members in different installations under OSGI environments

  • to share thousands of keys in a cluster where there is a web service interface on an application server and some validation

  • as a distributed topic (publish/subscribe server) to build scalable chat servers for smartphones

  • as a front layer for a Cassandra back-end

  • to distribute user object states across the cluster, to pass messages between objects and to share system data structures (static initialization state, mirrored objects, object identity generators)

  • as a multi-tenancy cache where each tenant has its own map

  • to share datasets, e.g., table-like data structure, to be used by applications

  • to distribute the load and collect status from Amazon EC2 servers where the front-end is developed using, for example, Spring framework

  • as a real-time streamer for performance detection

  • as storage for session data in web applications (enables horizontal scalability of the web application).

3.3. Hazelcast Topology

You can deploy a Hazelcast cluster in two ways: Embedded or Client/Server.

If you have an application whose main focal point is asynchronous or high performance computing and lots of task executions, then Embedded deployment is the preferred way. In Embedded deployment, members include both the application and Hazelcast data and services. The advantage of the Embedded deployment is having a low-latency data access.

See the below illustration.

Embedded Deployment

In the Client/Server deployment, Hazelcast data and services are centralized in one or more server members and they are accessed by the application through clients. You can have a cluster of server members that can be independently created and scaled. Your clients communicate with these members to reach to Hazelcast data and services on them.

See the below illustration.

Client/Server Deployment

Hazelcast provides native clients (Java, .NET and C++), Memcache and REST clients, Scala, Python and Node.js client implementations.

Client/Server deployment has advantages including more predictable and reliable Hazelcast performance, easier identification of problem causes and, most importantly, better scalability. When you need to scale in this deployment type, just add more Hazelcast server members. You can address client and server scalability concerns separately.

Note that Hazelcast member libraries are available only in Java. Therefore, embedding a member to a business service, it is only possible with Java. Applications written in other languages (.NET, C++, Node.js, etc.) can use Hazelcast client libraries to access the cluster. See the Hazelcast Clients chapter for information on the clients and other language implementations.

If you want low-latency data access, as in the Embedded deployment, and you also want the scalability advantages of the Client/Server deployment, you can consider defining Near Caches for your clients. This enables the frequently used data to be kept in the client’s local memory. See the Configuring Client Near Cache section.

3.4. Sharding in Hazelcast

Hazelcast shards are called partitions. By default, Hazelcast has 271 partitions. Given a key, we serialize, hash and mod it with the number of partitions to find the partition which the key belongs to. The partitions themselves are distributed equally among the members of the cluster. Hazelcast also creates the backups of partitions and distributes them among members for redundancy.

See the Data Partitioning section for more information on how Hazelcast partitions your data.

3.5. Data Partitioning

As you read in the Sharding in Hazelcast section, Hazelcast shards are called Partitions. Partitions are memory segments that can contain hundreds or thousands of data entries each, depending on the memory capacity of your system. Each Hazelcast partition can have multiple replicas, which are distributed among the cluster members. One of the replicas becomes the primary and other replicas are called backups. Cluster member which owns primary replica of a partition is called partition owner. When you read or write a particular data entry, you transparently talk to the owner of the partition that contains the data entry.

By default, Hazelcast offers 271 partitions. When you start a cluster with a single member, it owns all of 271 partitions (i.e., it keeps primary replicas for 271 partitions). The following illustration shows the partitions in a Hazelcast cluster with single member.

Single Member with Partitions

When you start a second member on that cluster (creating a Hazelcast cluster with two members), the partition replicas are distributed as shown in the illustration here.

Partition distributions in the below illustrations are shown for the sake of simplicity and for descriptive purposes. Normally, the partitions are not distributed in any order, as they are shown in these illustrations, but are distributed randomly (they do not have to be sequentially distributed to each member). The important point here is that Hazelcast equally distributes the partition primaries and their backup replicas among the members.
Cluster with Two Members - Backups are Created

In the illustration, the partition replicas with black text are primaries and the partition replicas with blue text are backups. The first member has primary replicas of 135 partitions (black) and each of these partitions are backed up in the second member (i.e., the second member owns the backup replicas) (blue). At the same time, the first member also has the backup replicas of the second member’s primary partition replicas.

As you add more members, Hazelcast moves some of the primary and backup partition replicas to the new members one by one, making all members equal and redundant. Thanks to the consistent hashing algorithm, only the minimum amount of partitions are moved to scale out Hazelcast. The following is an illustration of the partition replica distributions in a Hazelcast cluster with four members.

Cluster with Four Members

Hazelcast distributes partitions' primary and backup replicas equally among the members of the cluster. Backup replicas of the partitions are maintained for redundancy.

Your data can have multiple copies on partition primaries and backups, depending on your backup count. See the Backing Up Maps section.

Hazelcast also offers lite members. These members do not own any partition. Lite members are intended for use in computationally-heavy task executions and listener registrations. Although they do not own any partitions, they can access partitions that are owned by other members in the cluster.

3.5.1. How the Data is Partitioned

Hazelcast distributes data entries into the partitions using a hashing algorithm. Given an object key (for example, for a map) or an object name (for example, for a topic or list):

  • the key or name is serialized (converted into a byte array)

  • this byte array is hashed

  • the result of the hash is mod by the number of partitions.

The result of this modulo - MOD(hash result, partition count) - is the partition in which the data will be stored, that is the partition ID. For ALL members you have in your cluster, the partition ID for a given key is always the same.

3.5.2. Partition Table

When you start a member, a partition table is created within it. This table stores the partition IDs and the cluster members to which they belong. The purpose of this table is to make all members (including lite members) in the cluster aware of this information, making sure that each member knows where the data is.

The oldest member in the cluster (the one that started first) periodically sends the partition table to all members. In this way each member in the cluster is informed about any changes to partition ownership. The ownerships may be changed when, for example, a new member joins the cluster, or when a member leaves the cluster.

If the oldest member of the cluster goes down, the next oldest member sends the partition table information to the other ones.

You can configure the frequency (how often) that the member sends the partition table the information by using the hazelcast.partition.table.send.interval system property. The property is set to every 15 seconds by default.

3.5.3. Repartitioning

Repartitioning is the process of redistribution of partition ownerships. Hazelcast performs the repartitioning when a member joins or leaves the cluster.

In these cases, the partition table in the oldest member is updated with the new partition ownerships. Note that if a lite member joins or leaves a cluster, repartitioning is not triggered since lite members do not own any partitions.

3.6. Resources

4. Understanding Configuration

This chapter describes the options to configure your Hazelcast applications and explains the utilities which you can make use of while configuring. You can configure Hazelcast using one or mix of the following options:

  • Declarative way

  • Programmatic way

  • Using Hazelcast system properties

  • Within the Spring context

  • Dynamically adding configuration on a running cluster

4.1. Configuring Declaratively

This is the configuration option where you use an XML or a YAML configuration file. When you download and unzip hazelcast-4.0-SNAPSHOT .zip, you see the following files present in the /bin folder, which are standard configuration files:

  • hazelcast.xml: Default declarative XML configuration file for Hazelcast. The configuration for the distributed data structures in this XML file should be fine for most of the Hazelcast users. If not, you can tailor this XML file according to your needs by adding/removing/modifying properties. Also see the Setting Up Clusters chapter for the network related configurations.

  • hazelcast.yaml: Default YAML configuration file identical to hazelcast.xml in content.

  • hazelcast-full-example.xml: Configuration file which includes all Hazelcast configuration elements and attributes with their descriptions. It is the "superset" of hazelcast.xml. You can use hazelcast-full-example.xml as a reference document to learn about any element or attribute, or you can change its name to hazelcast.xml and start to use it as your Hazelcast configuration file.

  • hazelcast-full-example.yaml: YAML configuration file identical to hazelcast-full-example.xml in content.

  • hazelcast-client-full-example.xml: Complete Hazelcast Java client example configuration file which includes all configuration elements and attributes with their descriptions. Read more about Java client configuration here.

  • hazelcast-client-full-example.yaml: YAML configuration file identical to hazelcast-client-full-example.xml in content.

  • hazelcast-client-failover-full-example.xml: Complete Hazelcast client failover example configuration file which includes all Hazelcast client failover configuration elements and attributes with their descriptions. Read about Blue-Green Deployment and Disaster Recovery here.

  • hazelcast-client-failover-full-example.yaml: YAML configuration file identical to hazelcast-client-failover-full-example.xml in content.

A part of hazelcast.xml is shown as an example below.

<hazelcast>
    ...
    <cluster-name>dev</cluster-name>
    <management-center enabled="false">http://localhost:8080/mancenter</management-center>
    <network>
        <port auto-increment="true" port-count="100">5701</port>
        <outbound-ports>
        <!--
        Allowed port range when connecting to other members.
        0 or * means the port provided by the system.
        -->
            <ports>0</ports>
        </outbound-ports>
        <join>
            <multicast enabled="true">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
            <tcp-ip enabled="false">
                <interface>127.0.0.1</interface>
                <member-list>
                    <member>127.0.0.1</member>
                </member-list>
            </tcp-ip>
        </join>
    </network>
    <map name="default">
        <time-to-live-seconds>0</time-to-live-seconds>
    </map>
    ...
</hazelcast>

The identical part of the configuration extracted from hazelcast.yaml is shown as below.

hazelcast:
  ...
  cluster:
    name: dev
    password: dev-pass
  management-center:
    enabled: false
    url: http://localhost:8080/hazelcast-mancenter
  network:
    port:
      auto-increment: true
      port-count: 100
      port: 5701
    outbound-ports:
      # Allowed port range when connecting to other nodes.
      # 0 or * means use system provided port.
      - 0
    join:
      multicast:
        enabled: true
        multicast-group: 224.2.2.3
        multicast-port: 54327
      tcp-ip:
        enabled: false
        interface: 127.0.0.1
        member-list:
          - 127.0.0.1
  map:
    default:
      time-to-live-seconds: 0
    ...

4.1.1. Composing Declarative Configuration

You can compose the declarative configuration of your Hazelcast member or Hazelcast client from multiple declarative configuration snippets. In order to compose a declarative configuration, you can import different declarative configuration files. Composing configuration files is supported both in XML and YAML configurations with the limitation that only configuration files written in the same language can be composed.

Let’s say you want to compose the declarative configuration for Hazelcast out of two XML configurations: development-cluster-config.xml and development-network-config.xml. These two configurations are shown below.

development-cluster-config.xml:

<hazelcast>
    <cluster-name>dev</cluster-name>
</hazelcast>

development-network-config.xml:

<hazelcast>
    <network>
        <port auto-increment="true" port-count="100">5701</port>
        <join>
            <multicast enabled="true">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
        </join>
    </network>
</hazelcast>

To get your example Hazelcast declarative configuration out of the above two, use the <import/> element as shown below.

<hazelcast>
    <import resource="development-group-config.xml"/>
    <import resource="development-network-config.xml"/>
</hazelcast>

The above example using the YAML configuration files looks like the following:

development-cluster-config.yaml:

hazelcast:
  cluster:
    name: dev

development-network-config.yaml:

hazelcast:
  network:
    port:
      auto-increment: true
      port-count: 100
      port: 5701
    join:
      multicast:
        enabled: true
        multicast-group: 224.2.2.3
        multicast-port: 54327

Composing the above two YAML configuration files needs them to be imported as shown below.

hazelcast:
  import:
    - development-group-config.yaml
    - development-network-config.yaml

This feature also applies to the declarative configuration of Hazelcast client. See the following examples.

client-cluster-config.xml:

<hazelcast-client>
    <cluster-name>dev</cluster-name>
</hazelcast-client>

client-network-config.xml:

<hazelcast-client>
    <network>
        <cluster-members>
            <address>127.0.0.1:7000</address>
        </cluster-members>
    </network>
</hazelcast-client>

To get a Hazelcast client declarative configuration from the above two examples, use the <import/> element as shown below.

<hazelcast-client>
    <import resource="client-cluster-config.xml"/>
    <import resource="client-network-config.xml"/>
</hazelcast>

The same client configuration using the YAML language is shown below.

client-cluster-config.yaml:

hazelcast-client:
  cluster:
    name: dev

client-network-config.yaml:

hazelcast-client:
  network:
    cluster-members:
      - 127.0.0.1:7000

Composing a Hazelcast client declarative configuration by importing the above two examples is shown below.

hazelcast-client:
  import:
    - client-cluster-config.yaml
    - client-network-config.yaml
Use <import/> element on top level of the XML hierarchy.
Use the import mapping on top level of the YAML hierarchy.

Resources from the classpath and file system may also be used to compose a declarative configuration:

<hazelcast>
    <import resource="file:///etc/hazelcast/development-cluster-config.xml"/> <!-- loaded from filesystem -->
    <import resource="classpath:development-network-config.xml"/>  <!-- loaded from classpath -->
</hazelcast>
hazelcast:
  import:
    # loaded from filesystem
    - file:///etc/hazelcast/development-cluster-config.yaml
    # loaded from classpath
    - classpath:development-network-config.yaml

Importing resources with variables in their names is also supported. See the following example snippets:

<hazelcast>
    <import resource="${environment}-cluster-config.xml"/>
    <import resource="${environment}-network-config.xml"/>
</hazelcast>
hazelcast:
  import:
    - ${environment}-cluster-config.yaml
    - ${environment}-network-config.yaml
See the Using Variables section to learn how you can set the configuration elements with variables.

4.1.2. Configuring Declaratively with YAML

You can configure the Hazelcast members and Java clients declaratively with YAML configuration files in installations of Hazelcast running on Java runtime version 8 or above.

The structure of the YAML configuration follows the structure of the XML configuration. Therefore, you can rewrite the existing XML configurations in YAML easily. There are some differences between the XML and YAML languages that make the two declarative configurations to slightly derive as the the following examples show.

In the YAML declarative configuration, mappings are used in which the name of the mapping node needs to be unique within its enclosing mapping. See the following example with configuring two maps in the same configuration file.

In the XML configuration files, we have two <map> elements as shown below.

<hazelcast>
    ...
    <map name="map1">
        <!-- map1 configuration -->
    </map>
    <map name="map2">
        <!-- map2 configuration -->
    </map>
    ...
</hazelcast>

In the YAML configuration, the map can be configured under a mapping map as shown in the following example.

hazelcast:
    ...
    map:
        map1:
          # map1 configuration
        map2:
          # map2 configuration
    ...

The XML and YAML configurations above define the same maps map1 and map2. Please note that in the YAML configuration file there is no name node, instead, the name of the map is used as the name of the mapping for configuring the given map.

There are other configuration entries that have no unique names and are listed in the same enclosing entry. Examples to this kind of configurations are listing the member addresses, interfaces in the networking configurations and defining listeners. The following example configures listeners to illustrate this.

<hazelcast>
    ...
    <listeners>
        <listener>com.hazelcast.examples.MembershipListener</listener>
        <listener>com.hazelcast.examples.InstanceListener</listener>
        <listener>com.hazelcast.examples.MigrationListener</listener>
        <listener>com.hazelcast.examples.PartitionLostListener</listener>
    </listeners>
    ...
</hazelcast>

In the YAML configuration, the listeners are defined as a sequence.

hazelcast:
  ...
  listeners:
    - com.hazelcast.examples.MembershipListener
    - com.hazelcast.examples.InstanceListener
    - com.hazelcast.examples.MigrationListener
    - com.hazelcast.examples.PartitionLostListener
  ...

Another notable difference between XML and YAML is the lack of the attributes in the case of YAML. Everything that can be configured with an attribute in the XML configuration is a scalar node in the YAML configuration with the same name. See the following example.

hazelcast:
<hazelcast>
    ...
    <network>
        <join>
            <multicast enabled="true">
                <multicast-group>1.2.3.4</multicast-group>
                <!-- other multicast configuration options -->
            </multicast>
        </join>
    </network>
    ...
</hazelcast>

In the identical YAML configuration, the enabled attribute of the XML configuration is a scalar node on the same level with the other items of the multicast configuration.

hazelcast:
  ...
  network:
    join:
      multicast:
        enabled: true
        multicast-group: 1.2.3.4
        # other multicast configuration options
  ...

You can refer to the full example YAML configuration files placed in the /bin folder of the downloadable hazelcast-4.0-SNAPSHOT.zip after unzipping it. Please see the complete list of the full example YAML configurations here.

4.2. Configuring Programmatically

Besides declarative configuration, you can configure your cluster programmatically. For this you can create a Config object, set/change its properties and attributes and use this Config object to create a new Hazelcast member. Following is an example code which configures some network and Hazelcast Map properties.

Config config = new Config();
config.getNetworkConfig().setPort( 5900 )
        .setPortAutoIncrement( false );

MapConfig mapConfig = new MapConfig();
mapConfig.setName( "testMap" )
        .setBackupCount( 2 )
        .setTimeToLiveSeconds( 300 );

To create a Hazelcast member with the above example configuration, pass the configuration object as shown below:

HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance( config );
The Config must not be modified after the Hazelcast instance is started. In other words, all configuration must be completed before creating the HazelcastInstance. Certain additional configuration elements can be added at runtime as described in the Dynamically Adding Data Structure Configuration on a Cluster section.

You can also create a named Hazelcast member. In this case, you should set instanceName of Config object as shown below:

Config config = new Config();
config.setInstanceName( "my-instance" );
Hazelcast.newHazelcastInstance( config );

To retrieve an existing Hazelcast member by its name, use the following:

Hazelcast.getHazelcastInstanceByName( "my-instance" );

To retrieve all existing Hazelcast members, use the following:

Hazelcast.getAllHazelcastInstances();
Hazelcast performs schema validation through the file hazelcast-config-4.0-SNAPSHOT.xsd which comes with your Hazelcast libraries. Hazelcast throws a meaningful exception if there is an error in the declarative or programmatic configuration.

If you want to specify your own configuration file to create Config, Hazelcast supports several ways including filesystem, classpath, InputStream and URL.

Building Config from the XML declarative configuration:

  • Config cfg = new XmlConfigBuilder(xmlFileName).build();

  • Config cfg = new XmlConfigBuilder(inputStream).build();

  • Config cfg = new ClasspathXmlConfig(xmlFileName);

  • Config cfg = new FileSystemXmlConfig(configFilename);

  • Config cfg = new UrlXmlConfig(url);

  • Config cfg = new InMemoryXmlConfig(xml);

Building Config from the YAML declarative configuration:

  • Config cfg = new YamlConfigBuilder(yamlFileName).build();

  • Config cfg = new YamlConfigBuilder(inputStream).build();

  • Config cfg = new ClasspathYamlConfig(yamlFileName);

  • Config cfg = new FileSystemYamlConfig(configFilename);

  • Config cfg = new UrlYamlConfig(url);

  • Config cfg = new InMemoryYamlConfig(yaml);

4.3. Configuring with System Properties

You can use system properties to configure some aspects of Hazelcast. You set these properties as name and value pairs through declarative configuration, programmatic configuration or JVM system property. Following are examples for each option.

Declarative Configuration:

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.property.foo">value</property>
    </properties>
    ...
</hazelcast>
hazelcast:
    ...
    properties:
      hazelcast.property.foo: value
    ...

Programmatic Configuration:

Config config = new Config() ;
config.setProperty( "hazelcast.property.foo", "value" );

Using JVM’s System class or -D argument:

System.setProperty( "hazelcast.property.foo", "value" );

or

java -Dhazelcast.property.foo=value

You will see Hazelcast system properties mentioned throughout this Reference Manual as required in some of the chapters and sections. All Hazelcast system properties are listed in the System Properties appendix with their descriptions, default values and property types as a reference for you.

4.4. Configuring within Spring Context

If you use Hazelcast with Spring you can declare beans using the namespace hazelcast. When you add the namespace declaration to the element beans in the Spring context file, you can start to use the namespace shortcut hz to be used as a bean declaration. Following is an example Hazelcast configuration when integrated with Spring:

<hz:hazelcast id="instance">
    <hz:config>
        <hz:cluster name="dev"/>
        <hz:network port="5701" port-auto-increment="false">
            <hz:join>
                <hz:multicast enabled="false"/>
                <hz:tcp-ip enabled="true">
                    <hz:members>10.10.1.2, 10.10.1.3</hz:members>
                </hz:tcp-ip>
            </hz:join>
        </hz:network>
    </hz:config>
</hz:hazelcast>

See the Integration with Spring section for more information on Hazelcast-Spring integration.

4.5. Dynamically Adding Data Structure Configuration on a Cluster

As described above, Hazelcast can be configured in a declarative or programmatic way; configuration must be completed before starting a Hazelcast member and this configuration cannot be altered at runtime, thus we refer to this as static configuration.

It is possible to dynamically add configuration for certain data structures at runtime; these can be added by invoking one of the Config.add*Config methods on the Config object obtained from a running member’s HazelcastInstance.getConfig() method. For example:

Config config = new Config();
MapConfig mapConfig = new MapConfig("sessions");
config.addMapConfig(mapConfig);
HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
MapConfig noBackupsMap = new MapConfig("dont-backup").setBackupCount(0);
instance.getConfig().addMapConfig(noBackupsMap);

Dynamic configuration elements must be fully configured before the invocation of add*Config method: at that point, the configuration object is delivered to every member of the cluster and added to each member’s dynamic configuration, so mutating the configuration object after the add*Config invocation does not have an effect.

As dynamically added data structure configuration is propagated across all cluster members, failures may occur due to conditions such as timeout and network partition. The configuration propagation mechanism internally retries adding the configuration whenever a membership change is detected. However if an exception is thrown from add*Config method, the configuration may have been partially propagated to some cluster members and adding the configuration should be retried by the user.

Adding a new dynamic configuration is supported for all add*Config methods except the following:

  • JobTracker: It has been deprecated since Hazelcast 3.8.

  • SplitBrainProtectionConfig: A new split-brain protection configuration cannot be dynamically added but other configuration can reference split-brain protections configured in the existing static configuration.

  • WanReplicationConfig: A new WAN replication configuration cannot be dynamically added, however existing static ones can be referenced from other configurations, e.g., a new dynamic MapConfig may include a WanReplicationRef to a statically configured WAN replication.

  • ListenerConfig: Listeners can be instead added at runtime via other API such as HazelcastInstance.getCluster().addMembershipListener and HazelcastInstance.getPartitionService().addMigrationListener.

Keep in mind that this feature also works for Hazelcast Java clients. See the following example:

HazelcastInstance client = HazelcastClient.newHazelcastClient();
MapConfig mCfg = new MapConfig("test");
mCfg.setTimeToLiveSeconds(15);
client.getConfig().addMapConfig(mCfg);
HazelcastClient.shutdownAll();
If your cluster has data structures with configurations added during runtime, those configurations are lost when a cluster restart occurs due to any reason since they are not persisted. This will be improved in the future Hazelcast IMDG releases.

4.5.1. Handling Configuration Conflicts

Attempting to add a dynamic configuration, when a static configuration for the same element already exists, throws ConfigurationException. For example, assuming we start a member with the following fragment in hazelcast.xml configuration:

<hazelcast>
    ...
    <map name="sessions">
        ...
    </map>
    ...
</hazelcast>

Then adding a dynamic configuration for a map with the name sessions throws a ConfigurationException:

HazelcastInstance instance = Hazelcast.newHazelcastInstance();

MapConfig sessionsMapConfig = new MapConfig("sessions");

// this will throw ConfigurationException:
instance.getConfig().addMapConfig(sessionsMapConfig);

When attempting to add dynamic configuration for an element for which dynamic configuration has already been added, then if a configuration conflict is detected a ConfigurationException is thrown. For example:

HazelcastInstance instance = Hazelcast.newHazelcastInstance();

MapConfig sessionsMapConfig = new MapConfig("sessions").setBackupCount(0);
instance.getConfig().addMapConfig(sessionsMapConfig);

MapConfig sessionsWithBackup = new MapConfig("sessions").setBackupCount(1);
// throws ConfigurationException because the new MapConfig conflicts with existing one
instance.getConfig().addMapConfig(sessionsWithBackup);

MapConfig sessionsWithoutBackup = new MapConfig("sessions").setBackupCount(0);
// does not throw exception: new dynamic config is equal to existing dynamic config of same name
instance.getConfig().addMapConfig(sessionsWithoutBackup);

4.5.2. Dynamic Data Structure Configuration and User Customizations

Dynamically added data structure configuration may reference user customizations, such as a user-provided MapLoader implementation referenced by a MapConfig. User customizations can be usually configured using either of the following:

  • by specifying a class or factory class name, e.g., MapStoreConfig.setClassName, and letting the Hazelcast members instantiate the object

  • by providing an existing instance, e.g., MapStoreConfig.setImplementation.

When dynamically adding new a data structure configuration with user customizations, take the following considerations into account:

  • For the user customizations submitted as a class name or factory class name, the referenced classes are resolved lazily. Therefore, they should be either already on each member’s local classpath or resolvable via user code deployment.

  • When the user customizations are submitted as instances (or similarly factory instances), the instances themselves have to be serializable. This is because the entire configuration needs to be sent over the network to all cluster members, and their classes have to be available on each member’s local classpath.

4.6. Checking Configuration

When you start a Hazelcast member without passing a Config object, as explained in the Configuring Programmatically section, Hazelcast checks the member’s configuration as follows:

  • First, it looks for the hazelcast.config system property. If it is set, its value is used as the path. This is useful if you want to be able to change your Hazelcast configuration; you can do this because it is not embedded within the application. You can set the config option with the following command:

    -Dhazelcast.config=`*`<path to the hazelcast.xml or hazelcast.yaml>

    The suffix of the filename is used to determine the language of the configuration. If the suffix is .xml the configuration file is parsed as an XML configuration file. If it is .yaml, the configuration file is parsed as a YAML configuration file.

    The path can be a regular one or a classpath reference with the prefix classpath:.

  • If the above system property is not set, Hazelcast then checks whether there is a hazelcast.xml file in the working directory.

  • If not, it then checks whether hazelcast.xml exists on the classpath.

  • If not, it then checks whether hazelcast.yaml exists in the working directory.

  • If not, it then checks whether hazelcast.yaml exists on the classpath.

  • If none of the above works, Hazelcast loads the default configuration (hazelcast.xml) that comes with your Hazelcast package.

Before configuring Hazelcast, please try to work with the default configuration to see if it works for you. This default configuration should be fine for most of the users. If not, you can consider to modify the configuration to be more suitable for your environment.

4.7. Configuration Pattern Matcher

You can give a custom strategy to match an item name to a configuration pattern. By default Hazelcast uses a simplified wildcard matching. See Using Wildcards section for this. A custom configuration pattern matcher can be given by using either member or client config objects, as shown below:

// Setting a custom config pattern matcher via member config object
Config config = new Config();
config.setConfigPatternMatcher(new ExampleConfigPatternMatcher());

And the following is an example pattern matcher:

class ExampleConfigPatternMatcher extends MatchingPointConfigPatternMatcher {

    @Override
    public String matches(Iterable<String> configPatterns, String itemName) throws InvalidConfigurationException {
        String matches = super.matches(configPatterns, itemName);
        if (matches == null) throw new InvalidConfigurationException("No config found for " + itemName);
        return matches;
    }
}

4.8. Using Wildcards

Hazelcast supports wildcard configuration for all distributed data structures that can be configured using Config, that is, for all except IAtomicLong, IAtomicReference. Using an asterisk (*) character in the name, different instances of maps, queues, topics, semaphores, etc. can be configured by a single configuration.

A single asterisk (*) can be placed anywhere inside the configuration name.

For instance, a map named com.hazelcast.test.mymap can be configured using one of the following configurations:

<hazelcast>
    ...
    <map name="com.hazelcast.test.*">
        ...
    </map>

    <!-- OR -->

    <map name="com.hazel*">
        ...
    </map>

    <!-- OR -->

    <map name="*.test.mymap">
        ...
    </map>

    <!-- OR -->

    <map name="com.*test.mymap">
        ...
    </map>
    ...
</hazelcast>

A queue named com.hazelcast.test.myqueue can be configured using one of the following configurations:

<hazelcast>
    ...
    <queue name="*hazelcast.test.myqueue">
        ...
    </queue>

    <!-- OR -->

    <queue name="com.hazelcast.*.myqueue">
        ...
    </queue>
    ...
</hazelcast>
  • You can use only a single asterisk as a wildcard for each data structure configuration.

  • If you have matching wildcard configurations for a data structure, the most specific (longest) one is used when configuring it. Let’s say you have a map named mymap.customer.name and you have map configurations mymap.* and mymap.customer.*. Hazelcast uses mymap.customer.* to configure this map.

    As another example, assume that you have a map named mymap.customer.name, and map configurations mymap.*.name and mymap.customer.*. Hazelcast uses mymap.customer.* to configure this map. As you see, the longest character length before the asterisk makes it the most specific, so it wins the configuration.

4.9. Using Variables

In your Hazelcast and/or Hazelcast Client declarative configuration, you can use variables to set the values of the elements. This is valid when you set a system property programmatically or you use the command line interface. You can use a variable in the declarative configuration to access the values of the system properties you set.

For example, see the following command that sets two system properties.

-Dcluster.name=dev

Let’s get the values of these system properties in the declarative configuration of Hazelcast, as shown below.

In the XML configuration:

<hazelcast>
    <cluster-name>${cluster.name}</cluster-name>
</hazelcast>

In the YAML configuration:

hazelcast:
  cluster:
    name: ${cluster.name}

This also applies to the declarative configuration of Hazelcast Java Client, as shown below.

<hazelcast-client>
    <cluster-name>${cluster.name}</cluster-name>
</hazelcast-client>
hazelcast-client:
  cluster:
    name: ${cluster.name}

If you do not want to rely on the system properties, you can use the XmlConfigBuilder or YamlConfigBuilder and explicitly set a Properties instance, as shown below.

Properties properties = new Properties();

// fill the properties, e.g., from database/LDAP, etc.

XmlConfigBuilder builder = new XmlConfigBuilder();
builder.setProperties(properties);
Config config = builder.build();
HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

4.10. Variable Replacers

Variable replacers are used to replace custom strings during loading the configuration, e.g., they can be used to mask sensitive information such as usernames and passwords. Of course their usage is not limited to security related information.

Variable replacers implement the interface com.hazelcast.config.replacer.spi.ConfigReplacer and they are configured only declaratively: in the Hazelcast’s declarative configuration files, i.e., hazelcast.xml, hazelcast.yaml and hazelcast-client .xml, hazelcast-client.yaml. See the ConfigReplacers Javadoc for basic information on how a replacer works.

Variable replacers are configured within the element <config-replacers> under <hazelcast>, as shown below.

In the XML configuration:

<hazelcast>
    ...
    <config-replacers fail-if-value-missing="false">
        <replacer class-name="com.acme.MyReplacer">
            <properties>
                <property name="propName">value</property>
                ...
            </properties>
        </replacer>
        <replacer class-name="example.AnotherReplacer"/>
    </config-replacers>
    ...
</hazelcast>

In the YAML configuration:

hazelcast:
    ...
    config-replacers:
      fail-if-value-missing: false
      replacers:
        - class-name: com.acme.MyReplacer
          properties:
            propName: value
            ...
        - class-name: example.AnotherReplacer
    ...

As you can see, <config-replacers> is the parent element for your replacers, which are declared using the <replacer> sub-elements. You can define multiple replacers under the <config-replacers>. Here are the descriptions of elements and attributes used for the replacer configuration:

  • fail-if-value-missing: Specifies whether the loading configuration process stops when a replacement value is missing. It is an optional attribute and its default value is true.

  • class-name: Full class name of the replacer.

  • <properties>: Contains names and values of the properties used to configure a replacer. Each property is defined using the <property> sub-element. All of the properties are explained in the upcoming sections.

The following replacer classes are provided by Hazelcast as example implementations of the ConfigReplacer interface. Note that you can also implement your own replacers.

  • EncryptionReplacer

  • PropertyReplacer

There is also a ExecReplacer which runs an external command and uses its standard output as the value for the variable. See its code sample.

Each example replacer is explained in the below sections.

4.10.1. EncryptionReplacer

This example EncryptionReplacer replaces encrypted variables by its plain form. The secret key for encryption/decryption is generated from a password which can be a value in a file and/or environment specific values, such as MAC address and actual user data.

Its full class name is com.hazelcast.config.replacer.EncryptionReplacer and the replacer prefix is ENC. The following are the properties used to configure this example replacer:

  • cipherAlgorithm: Cipher algorithm used for the encryption/decryption. Its default value is AES.

  • keyLengthBits: Length of the secret key to be generated in bits. Its default value is 128 bits.

  • passwordFile: Path to a file whose content should be used as a part of the encryption password. When the property is not provided no file is used as a part of the password. Its default value is null.

  • passwordNetworkInterface: Name of network interface whose MAC address should be used as a part of the encryption password. When the property is not provided no network interface property is used as a part of the password. Its default value is null.

  • passwordUserProperties: Specifies whether the current user properties (user.name and user.home) should be used as a part of the encryption password. Its default value is true.

  • saltLengthBytes: Length of a random password salt in bytes. Its default value is 8 bytes.

  • secretKeyAlgorithm: Name of the secret-key algorithm to be associated with the generated secret key. Its default value is AES.

  • secretKeyFactoryAlgorithm: Algorithm used to generate a secret key from a password. Its default value is PBKDF2WithHmacSHA256.

  • securityProvider: Name of a Java Security Provider to be used for retrieving the configured secret key factory and the cipher. Its default value is null.

Older Java versions may not support all the algorithms used as defaults. Please use the property values supported your Java version.

As a usage example, let’s create a password file and generate the encrypted string out of this file as instructed below:

  1. Create the password file: echo '/Za-uG3dDfpd,5.-' > /opt/master-password

  2. Define the encrypted variables:

    java -cp hazelcast-*.jar \
        -DpasswordFile=/opt/master-password \
        -DpasswordUserProperties=false \
        com.hazelcast.config.replacer.EncryptionReplacer \
        "aCluster"
    $ENC{Gw45stIlan0=:531:yVN9/xQpJ/Ww3EYkAPvHdA==}
  3. Configure the replacer and put the encrypted variables into the configuration:

    <hazelcast>
        <config-replacers>
            <replacer class-name="com.hazelcast.config.replacer.EncryptionReplacer">
                <properties>
                    <property name="passwordFile">/opt/master-password</property>
                    <property name="passwordUserProperties">false</property>
                </properties>
            </replacer>
        </config-replacers>
        <cluster-name>$ENC{Gw45stIlan0=:531:yVN9/xQpJ/Ww3EYkAPvHdA==}</cluster-name>
    </hazelcast>
  4. Check if the decryption works:

    java -jar hazelcast-*.jar
    Apr 06, 2018 10:15:43 AM com.hazelcast.config.XmlConfigLocator
    INFO: Loading 'hazelcast.xml' from working directory.
    Apr 06, 2018 10:15:44 AM com.hazelcast.instance.AddressPicker
    INFO: [LOCAL] [aCluster] [3.10-SNAPSHOT] Prefer IPv4 stack is true.

As you can see in the logs, the correctly decrypted cluster name value ("aCluster") is used.

4.10.2. PropertyReplacer

The PropertyReplacer replaces variables by properties with the given name. Usually the system properties are used, e.g., ${user.name}. There is no need to define it in the declarative configuration files.

Its full class name is com.hazelcast.config.replacer.PropertyReplacer and the replacer prefix is empty string ("").

4.10.3. Implementing Custom Replacers

You can also provide your own replacer implementations. All replacers have to implement the interface com.hazelcast.config.replacer.spi.ConfigReplacer. A simple snippet is shown below.

public interface ConfigReplacer {
    void init(Properties properties);
    String getPrefix();
    String getReplacement(String maskedValue);
}

5. Setting Up Clusters

This chapter describes Hazelcast clusters and the methods cluster members and native clients use to form a Hazelcast cluster.

5.1. Discovery Mechanisms

A Hazelcast cluster is a network of cluster members that run Hazelcast. Cluster members automatically join together to form a cluster. This automatic joining takes place with various discovery mechanisms that the cluster members use to find each other.

Please note that, after a cluster is formed, communication between cluster members is always via TCP/IP, regardless of the discovery mechanism used.

Hazelcast uses the following discovery mechanisms.

See the Hazelcast IMDG Deployment and Operations Guide for advices on the best discovery mechanism to use.

5.1.1. TCP

You can configure Hazelcast to be a full TCP/IP cluster. See the Discovering Members by TCP section for configuration details.

5.1.2. Multicast

Multicast mechanism is not recommended for production since UDP is often blocked in production environments and other discovery mechanisms are more definite.

With this mechanism, Hazelcast allows cluster members to find each other using multicast communication. See the Discovering Members by Multicast section.

5.1.3. AWS Cloud Discovery

Hazelcast supports EC2 auto-discovery. It is useful when you do not want to provide or you cannot provide the list of possible IP addresses. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.4. GCP Cloud Discovery

Hazelcast supports discovering members in the GCP Compute Engine environment. You can easily configure Hazelcast members discovery, WAN replication, and Hazelcast Client to work seamlessly on the native GCP VM Instances. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.5. Apache jclouds® Cloud Discovery

Hazelcast members and native clients support jclouds® for discovery. This mechanism allows applications to be deployed in various cloud infrastructure ecosystems in an infrastructure-agnostic way. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.6. Azure Cloud Discovery

Hazelcast offers a discovery strategy for your Hazelcast applications running on Azure. This strategy provides all of your Hazelcast instances by returning the virtual machines within your Azure resource group that are tagged with a specified value. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.7. Zookeeper Cloud Discovery

This discovery mechanism provides a service based discovery strategy by using Apache Curator to communicate with your Zookeeper server. You can use this plugin with Discovery SPI enabled applications. This is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.8. Consul Cloud Discovery

Consul is a highly available and distributed service discovery and key-value store designed with support for the modern data center to make distributed systems and configuration easy. This mechanism provides a Consul based discovery strategy for Hazelcast enabled applications and enables Hazelcast members to dynamically discover one another via Consul. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.9. etcd Cloud Discovery

This mechanism provides an etcd based discovery strategy for Hazelcast enabled applications. This is an easy to configure plug-and-play Hazelcast discovery strategy that optionally registers each of your Hazelcast members with etcd and enables Hazelcast members to dynamically discover one another via etcd. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.1.10. Hazelcast for PCF

Using a clickable Hazelcast Tile for Pivotal Cloud Foundry (PCF), you can deploy your Hazelcast cluster on PCF. This feature is provided as a Hazelcast plugin. See its documentation on how to install, configure and use the plugin Hazelcast for PCF.

5.1.11. Hazelcast OpenShift Integration

Hazelcast can run inside OpenShift benefiting from its cluster management software Kubernetes for discovery of members. Using Hazelcast Docker images, templates and default configuration files, you can deploy Hazelcast IMDG, Hazelcast IMDG Enterprise and Management Center onto OpenShift. See the following related documentation:

See also the Hazelcast for OpenShift guide, which presents how to set up the local OpenShift environment, start a Hazelcast cluster, configure the Management Center and finally run a sample client application.

5.1.12. Eureka Cloud Discovery

Eureka is a REST based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers. Hazelcast supports Eureka V1 discovery; Hazelcast members within EC2 Virtual Private Cloud can discover each other using this mechanism. This discovery feature is provided as a Hazelcast plugin. See its documentation.

5.1.13. Heroku Cloud Discovery

Heroku is a platform as a service (PaaS) with which you can build, run and operate applications entirely in the cloud. It is a cloud platform based on a managed container system, with integrated data services and a powerful ecosystem. Hazelcast offers a discovery plugin that looks for IP addresses of other members by resolving service names against the Heroku DNS Discovery in Heroku Private Spaces. This discovery feature is provided as a Hazelcast plugin. See its documentation.

5.1.14. Kubernetes Cloud Discovery

Kubernetes is an open source system for automating deployment, scaling and management of containerized applications. Hazelcast provides Kubernetes discovery mechanism that looks for IP addresses of other members by resolving the requests against a Kubernetes Service Discovery system. It supports two different options of resolving against the discovery registry: (i) a request to the REST API, (ii) DNS Lookup against a given DNS service name. This discovery feature is provided as a Hazelcast plugin. See its documentation for information on configuring and using it.

5.2. Discovering Members by TCP

If multicast is not the preferred way of discovery for your environment, then you can configure Hazelcast to be a full TCP/IP cluster. When you configure Hazelcast to discover members by TCP/IP, you must list all or a subset of the members' host names and/or IP addresses as cluster members. You do not have to list all of these cluster members, but at least one of the listed members has to be active in the cluster when a new member joins.

To configure your Hazelcast to be a full TCP/IP cluster, set the following configuration elements. See the tcp-ip element section for the full descriptions of the TCP/IP discovery configuration elements.

  • Set the enabled attribute of the multicast element to false.

  • Set the enabled attribute of the aws element to false.

  • Set the enabled attribute of the tcp-ip element to true.

  • Provide your member elements within the tcp-ip element.

The following is an example declarative configuration.

<hazelcast>
    ...
    <network>
        <join>
            <multicast enabled="false">
            </multicast>
            <tcp-ip enabled="true">
                <member>machine1</member>
                <member>machine2</member>
                <member>machine3:5799</member>
                <member>192.168.1.0-7</member>
                <member>192.168.1.21</member>
            </tcp-ip>
        </join>
    </network>
    ...
</hazelcast>

As shown above, you can provide IP addresses or host names for member elements. You can also give a range of IP addresses, such as 192.168.1.0-7.

Instead of providing members line-by-line as shown above, you also have the option to use the members element and write comma-separated IP addresses, as shown below.

<members>192.168.1.0-7,192.168.1.21</members>

If you do not provide ports for the members, Hazelcast automatically tries the ports 5701, 5702 and so on.

By default, Hazelcast binds to all local network interfaces to accept incoming traffic. You can change this behavior using the system property hazelcast.socket.bind.any. If you set this property to false, Hazelcast uses the interfaces specified in the interfaces element (see the Interfaces Configuration section). If no interfaces are provided, then it tries to resolve one interface to bind from the member elements.

5.3. Discovering Members by Multicast

With the multicast auto-discovery mechanism, Hazelcast allows cluster members to find each other using multicast communication. The cluster members do not need to know the concrete addresses of the other members, as they just multicast to all the other members for listening. Whether multicast is possible or allowed depends on your environment.

To set your Hazelcast to multicast auto-discovery, set the following configuration elements. See the multicast element section for the full description of the multicast discovery configuration elements.

  • Set the enabled attribute of the multicast element to "true".

  • Set multicast-group, multicast-port, multicast-time-to-live, etc. to your multicast values.

  • Set the enabled attribute of both tcp-ip and aws elements to "false".

The following is an example declarative configuration.

<hazelcast>
    ...
    <network>
        <join>
            <multicast enabled="true">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
                <multicast-time-to-live>32</multicast-time-to-live>
                <multicast-timeout-seconds>2</multicast-timeout-seconds>
                <trusted-interfaces>
                    <interface>192.168.1.102</interface>
                </trusted-interfaces>
            </multicast>
            <tcp-ip enabled="false">
            </tcp-ip>
            <aws enabled="false">
            </aws>
        </join>
    </network>
    ...
</hazelcast>

Pay attention to the multicast-timeout-seconds element. multicast-timeout-seconds specifies the time in seconds that a member should wait for a valid multicast response from another member running in the network before declaring itself the leader member (the first member joined to the cluster) and creating its own cluster. This only applies to the startup of members where no leader has been assigned yet. If you specify a high value to multicast-timeout-seconds, such as 60 seconds, it means that until a leader is selected, each member waits 60 seconds before moving on. Be careful when providing a high value. Also, be careful not to set the value too low, or the members might give up too early and create their own cluster.

Multicast auto-discovery is not supported for Hazelcast native clients yet. However, we offer Multicast Discovery Plugin for this purpose. See the Discovering Native Clients section.

5.4. Discovering Native Clients

Hazelcast members and native Java clients can find each other with multicast discovery plugin. This plugin is implemented using Hazelcast Discovery SPI. You should configure the plugin both at Hazelcast members and Java clients in order to use multicast discovery.

To configure your cluster to have the multicast discovery plugin, follow these steps:

  • Disable the multicast and TCP/IP join mechanisms. To do this, set the enabled attributes of the multicast and tcp-ip elements to false in your hazelcast.xml configuration file

  • Set the enabled attribute of the hazelcast.discovery.enabled property to true.

  • Add multicast discovery strategy configuration to your XML file, i.e., <discovery-strategies> element.

The following is an example declarative configuration.

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.discovery.enabled">true</property>
    </properties>
    <network>
        <join>
            <multicast enabled="false">
            </multicast>
            <tcp-ip enabled="false">
            </tcp-ip>
            <discovery-strategies>
                <discovery-strategy class="com.hazelcast.spi.discovery.multicast.MulticastDiscoveryStrategy" enabled="true">
                    <properties>
                        <property name="group">224.2.2.3</property>
                        <property name="port">54327</property>
                    </properties>
                </discovery-strategy>
            </discovery-strategies>
        </join>
    </network>
    ...
</hazelcast>

The following are the multicast discovery plugin configuration properties with their descriptions:

  • group: String value that is used to set the multicast group, so that you can isolate your clusters.

  • port: Integer value that is used to set the multicast port.

5.5. Creating Clusters

You can create clusters using the cluster-name configuration element.

You can separate and group your clusters in a simple way by specifying cluster names. Example groupings can be by development, production, test, app, etc. The following is an example declarative configuration.

<hazelcast>
    <cluster-name>production</cluster-name>
</hazelcast>

You can also define the cluster configuration programmatically. A JVM can host multiple Hazelcast instances. Each Hazelcast instance can only participate in one group. Each Hazelcast instance only joins to its own group and does not interact with other groups. The following code example creates three separate Hazelcast instances--h1 belongs to the production cluster, while h2 and h3 belong to the development cluster.

Config configProd = new Config();
configProd.setClusterName( "production" );

Config configDev = new Config();
configDev.setClusterName( "development" );

HazelcastInstance h1 = Hazelcast.newHazelcastInstance( configProd );
HazelcastInstance h2 = Hazelcast.newHazelcastInstance( configDev );
HazelcastInstance h3 = Hazelcast.newHazelcastInstance( configDev );

5.5.1. Cluster Groups before Hazelcast 3.8.2

If you have a Hazelcast release older than 3.8.2, you need to provide also a group password along with the group name. The following are the configuration examples with the password element:

<hazelcast>
    <group>
        <name>production</name>
        <password>prod-pass</password>
    </group>
</hazelcast>
Config configProd = new Config();
configProd.setClusterName( "production" );

Config configDev = new Config();
configDev.setClusterName( "development" );

HazelcastInstance h1 = Hazelcast.newHazelcastInstance( configProd );
HazelcastInstance h2 = Hazelcast.newHazelcastInstance( configDev );
HazelcastInstance h3 = Hazelcast.newHazelcastInstance( configDev );
Starting with 3.8.2, members no longer perform a password check during the cluster join process. Starting with 3.11, members no longer perform a password check when a client connects to the cluster.

5.6. Deploying User Code on the Member

Hazelcast can dynamically load your custom classes or domain classes from other members. A lite member can be designated as a class repository, but any member can provide classes to other members. For this purpose Hazelcast offers a distributed dynamic class loader.

The following is a brief working mechanism of the User Code Deployment feature:

  1. A new dynamic class loader is created to handle each operation.

  2. It first checks locally available classes, i.e. the member’s classpath. If the class is found, it is used.

  3. Then it checks the cache of classes loaded from remote members or clients (if caching is enabled on your local member, see the Configuring User Code Deployment section). If your class is found there, it is used.

  4. Finally, the dynamic class loader checks configured remote members, one by one. If some member returns the class, it will be used. It can also put this class into the local class cache as mentioned in the previous step.

  5. If the class is not found, ClassNotFoundException is thrown.

  6. The dynamic class loader is released after the operation is handled. A next operation will load the class from the cache or re-fetch it.

5.6.1. Configuring User Code Deployment

User Code Deployment feature is not enabled by default. You can control local caching of the classes loaded from other members, control classes to be provided to other members and create blacklists and whitelists of classes and packages.

Following are example configuration snippets:

Declarative Configuration:

<hazelcast>
    ...
    <user-code-deployment enabled="true">
        <class-cache-mode>ETERNAL</class-cache-mode>
        <provider-mode>LOCAL_AND_CACHED_CLASSES</provider-mode>
        <blacklist-prefixes>com.foo,com.bar</blacklist-prefixes>
        <whitelist-prefixes>com.bar.MyClass</whitelist-prefixes>
        <provider-filter>HAS_ATTRIBUTE:lite</provider-filter>
    </user-code-deployment>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
UserCodeDeploymentConfig distCLConfig = config.getUserCodeDeploymentConfig();
distCLConfig.setEnabled( true )
        .setClassCacheMode( UserCodeDeploymentConfig.ClassCacheMode.ETERNAL )
        .setProviderMode( UserCodeDeploymentConfig.ProviderMode.LOCAL_AND_CACHED_CLASSES )
        .setBlacklistedPrefixes( "com.foo,com.bar" )
        .setWhitelistedPrefixes( "com.bar.MyClass" )
        .setProviderFilter( "HAS_ATTRIBUTE:lite" );

User Code Deployment on the member has the following configuration:

  • enabled: Specifies whether dynamic class loading is enabled or not. Its default value is "false" and it’s a mandatory attribute. If feature is disabled, the member will never load classes from other members or clients.

  • <class-cache-mode>: Controls the local caching behavior for the classes loaded from remote members (classes loaded from clients are always cached). Available values are:

    • ETERNAL: Cache the loaded classes locally. This is the default value and suitable when you load long-living objects, such as domain objects stored in a map.

    • OFF: Do not cache the loaded classes locally. It is suitable for loading runnables, callables, entry processors, etc.

  • <provider-mode>: Controls which classes are served to other cluster members. Available values are:

    • LOCAL_AND_CACHED_CLASSES: Serve classes loaded from both local classpath and from other members. This is the default value.

    • LOCAL_CLASSES_ONLY: Serve classes from the local classpath only. Classes loaded from other members are used locally, but they are not served to other members.

    • OFF: Never serve classes to other members.

  • <blacklist-prefixes>: Comma separated class/package name prefixes that the member will never attempt to load from other members and that the client won’t be allowed to upload. For example, if you set it to "com.foo", remote loading of all classes from the "com.foo" package is prevented, including the classes from all its sub-packages. If you set it to "com.foo.Class", then "Class" and all classes starting with "Class" in the "com.foo" package are blacklisted. There are built-in prefixes which are always blacklisted. These are as follows:

    • javax.

    • java.

    • sun.

    • com.hazelcast.

  • <whitelist-prefixes>: Comma separated name prefixes of classes/packages only from which the classes are allowed to be loaded. It allows to quickly configure remote loading only for classes from selected packages. It can be used together with blacklisting. For example, you can whitelist the prefix "com.foo" and blacklist the prefix "com.foo.secret". If the list is empty, all classes are allowed.

  • <provider-filter>: Filter to constrain members that can be used for a class loading request when a class is not available locally. The value is in the format "HAS_ATTRIBUTE:foo". When it is set to "HAS_ATTRIBUTE:foo", the class loading request is only sent to the members which have "foo" as a member attribute. Setting this to null allows loading of classes from all members. See an example in the next section.

5.6.2. Example for Filtering of Members

As described above, the configuration element provider-filter is used to limit members that can be used to load classes. The attribute required in the provider-filter must be set as a member attribute on the members from which the classes are to be loaded. See the following examples provided as programmatic configurations.

The example configuration below allows the Hazelcast member to load classes only from members with the class-provider attribute set. It prevents from asking any other member to provide a locally unavailable class:

Config hazelcastConfig = new Config();
UserCodeDeploymentConfig ucdConfig = hazelcastConfig.getUserCodeDeploymentConfig();
ucdConfig.setProviderFilter("HAS_ATTRIBUTE:class-provider");

HazelcastInstance instance = Hazelcast.newHazelcastInstance(hazelcastConfig);

The example configuration below sets the attribute class-provider for a member. Therefore the above member will be able to load classes from this member:

Config hazelcastConfig = new Config();
MemberAttributeConfig memberAttributes = hazelcastConfig.getMemberAttributeConfig();
memberAttributes.setAttribute("class-provider", "true");

HazelcastInstance instance = Hazelcast.newHazelcastInstance(hazelcastConfig);

5.7. Deploying User Code from Clients

You can also deploy your code from the client side for the following situations:

  1. You have objects that run on the cluster via the clients such as Runnable, Callable and EntryProcessor.

  2. You have new user domain objects which need to be deployed into the cluster.

When this feature is enabled on the client, the client will deploy the classes to the members when connecting. This way, when a client adds a new class, the members do not require a restart to include it in their classpath.

You can also use the client permission policy to specify which clients are permitted to use User Code Deployment. See the Permissions section.

5.7.1. Configuring Client User Code Deployment

Client User Code Deployment feature is not enabled by default. You can configure this feature declaratively or programmatically. Following are example configuration snippets:

Declarative Configuration:

In your hazelcast-client.xml:

<hazelcast>
    ...
    <user-code-deployment enabled="true">
        <jarPaths>
            <jarPath>/User/example/example.jar</jarPath>
            <jarPath>example.jar</jarPath> <!--from class path -->
            <jarPath>https://com.example.com/example.jar</jarPath>
            <jarPath>file://Users/example/example.jar</jarPath>
        </jarPaths>
        <classNames>
            <!-- for classes available in client's class path -->
            <className>example.ClassName</className>
            <className>example.ClassName2</className>
        </classNames>
    </user-code-deployment>
    ...
</hazelcast>

Programmatic Configuration:

ClientConfig clientConfig = new ClientConfig();
ClientUserCodeDeploymentConfig clientUserCodeDeploymentConfig = new ClientUserCodeDeploymentConfig();

clientUserCodeDeploymentConfig.addJar("/User/example/example.jar");
clientUserCodeDeploymentConfig.addJar("https://com.example.com/example.jar");
clientUserCodeDeploymentConfig.addClass("example.ClassName");
clientUserCodeDeploymentConfig.addClass("example.ClassName2");

clientUserCodeDeploymentConfig.setEnabled(true);
clientConfig.setUserCodeDeploymentConfig(clientUserCodeDeploymentConfig);
Important to Know

The members have to be configured in a specific way for the feature to work correctly:

  • User Code Deployment must be enabled on the members. Otherwise, the classes from the client will be ignored. Also blacklisted and non-whitelisted classes will be ignored.

  • All members must be providers, provider-mode must be set to LOCAL_AND_CACHED_CLASSES on all members.

  • No provider-filter must be configured.

The client uploads the classes only to one member. If the members don’t load classes from each other, other members won’t see the class.

Here’s a programmatic configuration of the members that will work with client user code deployment:

Config config = new Config();
UserCodeDeploymentConfig ucdConfig = config.getUserCodeDeploymentConfig();
ucdConfig.setEnabled(true);
// following two configs are defaults, we show them for clarity
ucdConfig.setProviderMode(ProviderMode.LOCAL_AND_CACHED_CLASSES);
ucdConfig.setProviderFilter(null);

See the Member User Code Deployment section for more information on enabling it on the member side and the configuration properties.

Classes deployed from clients are always cached on the members, no matter whether ETERNAL or OFF is configured on the members.

Performance Considerations

The client always uploads all added classes and jars to one of the members, whether it has them or not. So avoid adding large jar files for each connection - if configured properly, the member will have the class the next time the client connects.

Two Versions of a Class

If the client uploads a class and the member already has that class, an exception is thrown if the byte code is different. If byte code is same, it is ignored. Therefore classes uploaded from the client can’t be updated with a new version.

5.7.2. Adding User Library to CLASSPATH

When you want to use a Hazelcast feature in a non-Java client, you need to make sure that the Hazelcast member recognizes it. For this, you can use the /user-lib directory that comes with the Hazelcast package and deploy your own library to the member. Let’s say you use Hazelcast Node.js client and want to use an entry processor. This processor should be IdentifiedDataSerializable or Portable in the Node.js client. You need to implement the Java equivalents of the processor and its factory on the member side, and put these compiled class or JAR files into the /user-lib directory. Then you can run the start.sh script which adds them to the classpath.

The following is an example code which can be the Java equivalent of entry processor in the Node.js client:

public class IdentifiedEntryProcessor implements EntryProcessor<String, String, String>, IdentifiedDataSerializable {
    static final int CLASS_ID = 1;
    private String value;
    public IdentifiedEntryProcessor() {
    }
    @Override
    public int getFactoryId() {
        return IdentifiedFactory.FACTORY_ID;
    }
    @Override
    public int getClassId() {
        return CLASS_ID;
    }
    @Override
    public void writeData(ObjectDataOutput out) throws IOException {
        out.writeUTF(value);
    }
    @Override
    public void readData(ObjectDataInput in) throws IOException {
        value = in.readUTF();
    }
    @Override
    public String process(Map.Entry<String, String> entry) {
        entry.setValue(value);
        return value;
    }
}

You can implement the above processor’s factory as follows:

public class IdentifiedFactory implements DataSerializableFactory {
    public static final int FACTORY_ID = 5;
    @Override
    public IdentifiedDataSerializable create(int typeId) {
        if (typeId == IdentifiedEntryProcessor.CLASS_ID) {
            return new IdentifiedEntryProcessor();
        }
        return null;
    }
}

And the following is the configuration for the above factory:

<hazelcast>
    <serialization>
        <data-serializable-factories>
            <data-serializable-factory factory-id="5">
                IdentifiedFactory
            </data-serializable-factory>
        </data-serializable-factories>
    </serialization>
</hazelcast>

Then, you can start your Hazelcast member by using the start scripts (start.sh or start.bat) in the /bin directory. The start scripts automatically adds your class and JAR files to the classpath.

5.8. Partition Group Configuration

Hazelcast distributes key objects into partitions using the consistent hashing algorithm. Multiple replicas are created for each partition and those partition replicas are distributed among Hazelcast members. An entry is stored in the members that own replicas of the partition to which the entry’s key is assigned. The total partition count is 271 by default; you can change it with the configuration property hazelcast.partition.count. See the System Properties appendix.

Hazelcast member that owns the primary replica of a partition is called as the partition owner. Other replicas are called backups. Based on the configuration, a key object can be kept in multiple replicas of a partition. A member can hold at most one replica of a partition (ownership or backup).

By default, Hazelcast distributes partition replicas randomly and equally among the cluster members, assuming all members in the cluster are identical. But what if some members share the same JVM or physical machine or chassis and you want backups of these members to be assigned to members in another machine or chassis? What if processing or memory capacities of some members are different and you do not want an equal number of partitions to be assigned to all members?

To deal with such scenarios, you can group members in the same JVM (or physical machine) or members located in the same chassis. Or you can group members to create identical capacity. We call these groups partition groups. Partitions are assigned to those partition groups instead of individual members. Backup replicas of a partition which is owned by a partition group are located in other partition groups.

5.8.1. Grouping Types

When you enable partition grouping, Hazelcast presents the following choices for you to configure partition groups.

HOST_AWARE

You can group members automatically using the IP addresses of members, so members sharing the same network interface are grouped together. All members on the same host (IP address or domain name) form a single partition group. This helps to avoid data loss when a physical server crashes, because multiple replicas of the same partition are not stored on the same host. But if there are multiple network interfaces or domain names per physical machine, this assumption is invalid.

The following are declarative and programmatic configuration snippets that show how to enable HOST_AWARE grouping:

<partition-group enabled="true" group-type="HOST_AWARE" />
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.HOST_AWARE );
CUSTOM

You can do custom grouping using Hazelcast’s interface matching configuration. This way, you can add different and multiple interfaces to a group. You can also use wildcards in the interface addresses. For example, the users can create rack-aware or data warehouse partition groups using custom partition grouping.

The following are declarative and programmatic configuration examples that show how to enable and use CUSTOM grouping:

<hazelcast>
    ...
    <partition-group enabled="true" group-type="CUSTOM">
        <member-group>
            <interface>10.10.0.*</interface>
            <interface>10.10.3.*</interface>
            <interface>10.10.5.*</interface>
        </member-group>
        <member-group>
            <interface>10.10.10.10-100</interface>
            <interface>10.10.1.*</interface>
            <interface>10.10.2.*</interface>
        </member-group>
    </partition-group>
    ...
</hazelcast>
Config config = new Config();
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
        .setGroupType( PartitionGroupConfig.MemberGroupType.CUSTOM );

MemberGroupConfig memberGroupConfig = new MemberGroupConfig();
memberGroupConfig.addInterface( "10.10.0.*" )
        .addInterface( "10.10.3.*" ).addInterface("10.10.5.*" );

MemberGroupConfig memberGroupConfig2 = new MemberGroupConfig();
memberGroupConfig2.addInterface( "10.10.10.10-100" )
        .addInterface( "10.10.1.*").addInterface( "10.10.2.*" );

partitionGroupConfig.addMemberGroupConfig( memberGroupConfig );
partitionGroupConfig.addMemberGroupConfig( memberGroupConfig2 );
While your cluster was forming, if you configured your members to discover each other by their IP addresses, you should use the IP addresses for the <interface> element. If your members discovered each other by their host names, you should use host names.
PER_MEMBER

You can give every member its own group. Each member is a group of its own and primary and backup partitions are distributed randomly (not on the same physical member). This gives the least amount of protection and is the default configuration for a Hazelcast cluster. This grouping type provides good redundancy when Hazelcast members are on separate hosts. However, if multiple instances run on the same host, this type is not a good option.

The following are declarative and programmatic configuration snippets that show how to enable PER_MEMBER grouping:

<partition-group enabled="true" group-type="PER_MEMBER" />
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.PER_MEMBER );
ZONE_AWARE

You can use ZONE_AWARE configuration with Hazelcast Kubernetes, Hazelcast AWS, Hazelcast GCP, Hazelcast jclouds or Hazelcast Azure Discovery Service plugins.

As discovery services, these plugins put zone information to the Hazelcast member attributes map during the discovery process. When ZONE_AWARE is configured as partition group type, Hazelcast creates the partition groups with respect to member attributes map entries that include zone information. That means backups are created in the other zones and each zone is accepted as one partition group.

When using the ZONE_AWARE partition grouping, a Hazelcast cluster spanning multiple AZs should have an equal number of members in each AZ. Otherwise, it results in uneven partition distribution among the members.

The following is the list of supported attributes which is set by the Discovery Service plugins during a Hazelcast member start-up:

  • hazelcast.partition.group.zone: For the zones in the same area.

  • hazelcast.partition.group.rack: For different racks in the same zone.

  • hazelcast.partition.group.host: For a shared physical member if virtualization is used.

Hazelcast jclouds plugin offers rack or host information in addition to zone information based on the cloud provider. In such cases, Hazelcast looks for zone, rack and host information in the given order and create partition groups with available information.

The following are declarative and programmatic configuration snippets that show how to enable ZONE_AWARE grouping:

<partition-group enabled="true" group-type="ZONE_AWARE" />
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.ZONE_AWARE );
SPI

You can provide your own partition group implementation using the SPI configuration. To create your partition group implementation, you need to first extend the DiscoveryStrategy class of the discovery service plugin, override the method public PartitionGroupStrategy getPartitionGroupStrategy() and return the PartitionGroupStrategy configuration in that overridden method.

The following code covers the implementation steps mentioned in the above paragraph:

public class CustomDiscovery extends AbstractDiscoveryStrategy {

    public CustomDiscovery(ILogger logger, Map<String, Comparable> properties) {
        super(logger, properties);
    }

    @Override
    public Iterable<DiscoveryNode> discoverNodes() {
        Iterable<DiscoveryNode> iterable = //your implementation
        return iterable;
    }

    @Override
    public PartitionGroupStrategy getPartitionGroupStrategy() {
        return new CustomPartitionGroupStrategy();
    }

    private class CustomPartitionGroupStrategy implements PartitionGroupStrategy {
        @Override
        public Iterable<MemberGroup> getMemberGroups() {
            Iterable<MemberGroup> iterable = //your implementation
            return iterable;
        }
    }
}

5.9. Logging Configuration

Hazelcast has a flexible logging configuration and does not depend on any logging framework except JDK logging. It has built-in adapters for a number of logging frameworks and it also supports custom loggers by providing logging interfaces.

To use the built-in adapters, set the hazelcast.logging.type property to one of the predefined types below:

  • jdk: JDK logging (default)

  • log4j: Log4j

  • log4j2: Log4j2

  • slf4j: Slf4j

  • none: disable logging

You can set hazelcast.logging.type through declarative configuration, programmatic configuration or JVM system property.

If you choose to use log4j, log4j2, or slf4j, you should include the proper dependencies in the classpath.

Declarative Configuration:

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.logging.type">log4j</property>
    </properties>
    ...
</hazelcast>

Programmatic Configuration

Config config = new Config() ;
config.setProperty( "hazelcast.logging.type", "log4j" );

System Property

  • using the java -Dhazelcast.logging.type=slf4j JVM parameter

  • using System.setProperty( "hazelcast.logging.type", "none" ); System class

If the provided logging mechanisms are not satisfactory, you can implement your own using the custom logging feature. To use it, implement the com.hazelcast.logging.LoggerFactory and com.hazelcast.logging.ILogger interfaces and set the system property hazelcast.logging.class as your custom LoggerFactory class name.

-Dhazelcast.logging.class=foo.bar.MyLoggingFactory

You can also listen to logging events generated by Hazelcast runtime by registering LogListeners to LoggingService.

LogListener listener = new LogListener() {
  public void log( LogEvent logEvent ) {
    // do something
  }
};
HazelcastInstance instance = Hazelcast.newHazelcastInstance();
LoggingService loggingService = instance.getLoggingService();
loggingService.addLogListener( Level.INFO, listener );

Through the LoggingService, you can get the currently used ILogger implementation and log your own messages too.

If you are not using command line for configuring logging, you should be careful about Hazelcast classes. They may be defaulted to jdk logging before newly configured logging is read. When logging mechanism is selected, it will not change.

Below are example configurations for Log4j2 and Log4j. Note that Hazelcast does not recommend any specific logging library, these examples are provided only to demonstrate how to configure the logging. You can use your custom logging as explained above.

5.9.1. Example Log4j2 Configuration

Specify the logging type as Log4j2 and a separate logging configuration file as shown below.

Using JVM arguments:

-Dhazelcast.logging.type=log4j2
-Dlog4j.configurationFile=/path/to/properties/log4j2.properties

Using declarative configuration (hazelcast.xml):

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.logging.type">log4j2</property>
        <property name="log4j2.configuration">/path/to/properties/log4j2.properties</property>
    </properties>
    ...
</hazelcast>

Following is an example log4j2.properties file:

rootLogger=file
rootLogger.level=info
property.filepath=/path/to/log/files
property.filename=hazelcast

appender.file.type=RollingFile
appender.file.name=RollingFile
appender.file.fileName=${filepath}/${filename}.log
appender.file.filePattern=${filepath}/${filename}-%d{yyyy-MM-dd}-%i.log.gz
appender.file.layout.type=PatternLayout
appender.file.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
appender.file.policies.type=Policies
appender.file.policies.time.type=TimeBasedTriggeringPolicy
appender.file.policies.time.interval=1
appender.file.policies.time.modulate=true
appender.file.policies.size.type=SizeBasedTriggeringPolicy
appender.file.policies.size.size=50MB
appender.file.strategy.type=DefaultRolloverStrategy
appender.file.strategy.max=100

rootLogger.appenderRefs=file
rootLogger.appenderRef.file.ref=RollingFile

#Hazelcast specific logs.

#log4j.logger.com.hazelcast=debug

#log4j.logger.com.hazelcast.cluster=debug
#log4j.logger.com.hazelcast.partition=debug
#log4j.logger.com.hazelcast.partition.InternalPartitionService=debug
#log4j.logger.com.hazelcast.nio=debug
#log4j.logger.com.hazelcast.hibernate=debug

To enable the debug logs for all Hazelcast operations uncomment the below line in the above configuration file:

log4j.logger.com.hazelcast=debug

If you do not need detailed logs, the default settings is enough. Using the Hazelcast specific lines in the above configuration file, you can select to see specific logs (cluster, partition, hibernate, etc.) in desired levels.

5.9.2. Example Log4j Configuration

Its configuration is similar to that of Log4j2. Below is the JVM argument way of specifying the logging type and configuration file:

-Dhazelcast.logging.type=log4j
-Dlog4j.configuration=file:/path/to/properties/log4j.properties

Following is an example log4j.properties file:

log4j.rootLogger=INFO,file

log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/path/to/log/files/hazelcast.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %p [%c{1}] - %m%n
log4j.appender.file.maxFileSize=50MB
log4j.appender.file.maxBackupIndex=100
log4j.appender.file.threshold=DEBUG

#log4j.logger.com.hazelcast=debug

#log4j.logger.com.hazelcast.cluster=debug
#log4j.logger.com.hazelcast.partition=debug
#log4j.logger.com.hazelcast.partition.InternalPartitionService=debug
#log4j.logger.com.hazelcast.nio=debug
#log4j.logger.com.hazelcast.hibernate=debug

5.10. Other Network Configurations

All network related configurations are performed via the network element in the Hazelcast XML configuration file or the class NetworkConfig when using programmatic configuration. Following subsections describe the available configurations that you can perform under the network element.

5.10.1. Public Address

public-address overrides the public address of a member. By default, a member selects its socket address as its public address. But behind a network address translation (NAT), two endpoints (members) may not be able to see/access each other. If both members set their public addresses to their defined addresses on NAT, then that way they can communicate with each other. In this case, their public addresses are not an address of a local network interface but a virtual address defined by NAT. It is optional to set and useful when you have a private cloud. Note that, the value for this element should be given in the format host IP address:port number. See the following examples.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <public-address>11.22.33.44:5555</public-address>
    </network>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
config.getNetworkConfig()
    .setPublicAddress( "11.22.33.44:5555" );

5.10.2. Port

You can specify the ports that Hazelcast uses to communicate between cluster members. Its default value is 5701. The following are example configurations.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <port port-count="20" auto-increment="true">5701</port>
    </network>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
config.getNetworkConfig().setPort( 5701 )
    .setPortAutoIncrement( true ).setPortCount( 20 );

According to the above example, Hazelcast tries to find free ports between 5701 and 5720.

port has the following attributes.

  • port-count: By default, Hazelcast tries 100 ports to bind. Meaning that, if you set the value of port as 5701, as members are joining to the cluster, Hazelcast tries to find ports between 5701 and 5801. You can choose to change the port count in the cases like having large instances on a single machine or willing to have only a few ports to be assigned. The parameter port-count is used for this purpose, whose default value is 100.

  • auto-increment: In some cases you may want to choose to use only one port. In that case, you can disable the auto-increment feature of port by setting auto-increment to false. The port-count attribute is not used when auto-increment feature is disabled.

5.10.3. Outbound Ports

By default, Hazelcast lets the system pick up an ephemeral port during socket bind operation. But security policies/firewalls may require you to restrict outbound ports to be used by Hazelcast-enabled applications. To fulfill this requirement, you can configure Hazelcast to use only defined outbound ports. The following are example configurations.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <outbound-ports>
            <!-- ports between 33000 and 35000 -->
            <ports>33000-35000</ports>
            <!-- comma separated ports -->
            <ports>37000,37001,37002,37003</ports>
            <ports>38000,38500-38600</ports>
        </outbound-ports>
    </network>
    ...
</hazelcast>

Programmatic Configuration:

...
NetworkConfig networkConfig = config.getNetworkConfig();
// ports between 35000 and 35100
networkConfig.addOutboundPortDefinition("35000-35100");
// comma separated ports
networkConfig.addOutboundPortDefinition("36001, 36002, 36003");
networkConfig.addOutboundPort(37000);
networkConfig.addOutboundPort(37001);
...
You can use port ranges and/or comma separated ports.

As shown in the programmatic configuration, you use the method addOutboundPort to add only one port. If you need to add a group of ports, then use the method addOutboundPortDefinition.

In the declarative configuration, the element ports can be used for both single and multiple port definitions. When you set this element to 0 or *, your operating system (not Hazelcast) selects a free port from the ephemeral range.

5.10.4. Reuse Address

When you shutdown a cluster member, the server socket port goes into the TIME_WAIT state for the next couple of minutes. If you start the member right after shutting it down, you may not be able to bind it to the same port because it is in the TIME_WAIT state. If you set the reuse-address element to true, the TIME_WAIT state is ignored and you can bind the member to the same port again.

The following are example configurations.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <reuse-address>true</reuse-address>
    </network>
    ...
</hazelcast>

Programmatic Configuration:

...
NetworkConfig networkConfig = config.getNetworkConfig();

networkConfig.setReuseAddress( true );
...

5.10.5. Join

The join configuration element is used to discover Hazelcast members and enable them to form a cluster. Hazelcast provides multicast, TCP/IP, EC2 and jclouds® discovery mechanisms. These mechanisms are explained the Discovery Mechanisms section. This section describes all the sub-elements and attributes of join element. The following are example configurations.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <join>
            <multicast enabled="true">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
                <multicast-time-to-live>32</multicast-time-to-live>
                <multicast-timeout-seconds>2</multicast-timeout-seconds>
                <trusted-interfaces>
                    <interface>192.168.1.102</interface>
                </trusted-interfaces>
            </multicast>
            <tcp-ip enabled="false">
                <required-member>192.168.1.104</required-member>
                <member>192.168.1.104</member>
                <members>192.168.1.105,192.168.1.106</members>
            </tcp-ip>
            <aws enabled="false">
                <access-key>my-access-key</access-key>
                <secret-key>my-secret-key</secret-key>
                <region>us-west-1</region>
                <host-header>ec2.amazonaws.com</host-header>
                <security-group-name>hazelcast-sg</security-group-name>
                <tag-key>type</tag-key>
                <tag-value>hz-members</tag-value>
            </aws>
            <discovery-strategies>
                <discovery-strategy ... />
            </discovery-strategies>
        </join>
    </network>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
NetworkConfig network = config.getNetworkConfig();
JoinConfig join = network.getJoin();
join.getMulticastConfig().setEnabled( false )
            .addTrustedInterface( "192.168.1.102" );
join.getTcpIpConfig().addMember( "10.45.67.32" ).addMember( "10.45.67.100" )
            .setRequiredMember( "192.168.10.100" ).setEnabled( true );

The join element has the following sub-elements and attributes.

multicast element

The multicast element includes parameters to fine tune the multicast join mechanism.

  • enabled: Specifies whether the multicast discovery is enabled or not, true or false.

  • multicast-group: The multicast group IP address. Specify it when you want to create clusters within the same network. Values can be between 224.0.0.0 and 239.255.255.255. Its default value is 224.2.2.3.

  • multicast-port: The multicast socket port that the Hazelcast member listens to and sends discovery messages through. Its default value is 54327.

  • multicast-time-to-live: Time-to-live value for multicast packets sent out to control the scope of multicasts. See more information here.

  • multicast-timeout-seconds: Only when the members are starting up, this timeout (in seconds) specifies the period during which a member waits for a multicast response from another member. For example, if you set it as 60 seconds, each member waits for 60 seconds until a leader member is selected. Its default value is 2 seconds.

  • trusted-interfaces: Includes IP addresses of trusted members. When a member wants to join to the cluster, its join request is rejected if it is not a trusted member. You can give an IP addresses range using the wildcard (*) on the last digit of IP address, e.g., 192.168.1.* or 192.168.1.100-110.

Multicast mechanism is not recommended for production since UDP is often blocked in production environments and other join mechanisms are more definite.
tcp-ip element

The tcp-ip element includes parameters to fine tune the TCP/IP join mechanism.

  • enabled: Specifies whether the TCP/IP discovery is enabled or not. Values can be true or false.

  • required-member: IP address of the required member. Cluster is only formed if the member with this IP address is found.

  • member: IP address(es) of one or more well known members. Once members are connected to these well known ones, all member addresses are communicated with each other. You can also give comma separated IP addresses using the members element.

    tcp-ip element also accepts the interface parameter. See the Interfaces element description.
  • connection-timeout-seconds: Defines the connection timeout in seconds. This is the maximum amount of time Hazelcast is going to try to connect to a well known member before giving up. Setting it to a too low value could mean that a member is not able to connect to a cluster. Setting it to a too high value means that member startup could slow down because of longer timeouts, for example when a well known member is not up. Increasing this value is recommended if you have many IPs listed and the members cannot properly build up the cluster. Its default value is 5 seconds.

aws element

The aws element includes parameters to allow the members to form a cluster on the Amazon EC2 environment.

  • enabled: Specifies whether the EC2 discovery is enabled or not, true or false.

  • access-key, secret-key: Access and secret keys of your account on EC2.

  • region: The region where your members are running. Its default value is us-east-1. You need to specify this if the region is other than the default one.

  • host-header: The URL that is the entry point for a web service. It is optional.

  • security-group-name: Name of the security group you specified at the EC2 management console. It is used to narrow the Hazelcast members to be within this group. It is optional.

  • tag-key, tag-value: To narrow the members in the cloud down to only Hazelcast members, you can set these parameters as the ones you specified in the EC2 console. They are optional.

  • connection-timeout-seconds: The maximum amount of time, in seconds, Hazelcast tries to connect to a well known member before giving up. Setting this value too low could mean that a member is not able to connect to a cluster. Setting the value too high means that member startup could slow down because of longer timeouts (for example, when a well known member is not up). Increasing this value is recommended if you have many IPs listed and the members cannot properly build up the cluster. Its default value is 5 seconds.

If you are using a cloud provider other than AWS, you can use the programmatic configuration to specify a TCP/IP cluster. The members need to be retrieved from that provider, e.g., jclouds.

discovery-strategies element

The discovery-strategies element configures internal or external discovery strategies based on the Hazelcast Discovery SPI. For further information, see the Discovery SPI section and the vendor documentation of the used discovery strategy.

5.10.6. AWSClient Configuration

To make sure EC2 instances are found correctly, you can use the AWSClient class. It determines the private IP addresses of EC2 instances to be connected. Give the AWSClient class the values for the parameters that you specified in the aws element, as shown below. You will see whether your EC2 instances are found.

public static void main( String[] args )throws Exception{
  AwsConfig config = new AwsConfig();
  config.setSecretKey( ... ) ;
  config.setSecretKey( ... );
  config.setRegion( ... );
  config.setSecurityGroupName( ... );
  config.setTagKey( ... );
  config.setTagValue( ... );
  config.setEnabled( true );
  AWSClient client = new AWSClient( config );
  Collection<String> ipAddresses = client.getPrivateIpAddresses();
  System.out.println( "addresses found:" + ipAddresses );
  for ( String ip: ipAddresses ) {
    System.out.println( ip );
  }
}

5.10.7. Interfaces

You can specify which network interfaces that Hazelcast should use. Servers mostly have more than one network interface, so you may want to list the valid IPs. Range characters (* and -) can be used for simplicity. For instance, 10.3.10.* refers to IPs between 10.3.10.0 and 10.3.10.255. Interface 10.3.10.4-18 refers to IPs between 10.3.10.4 and 10.3.10.18 (4 and 18 included). If network interface configuration is enabled (it is disabled by default) and if Hazelcast cannot find a matching interface, then it prints a message on the console and does not start on that member.

The following are example configurations.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <interfaces enabled="true">
            <interface>10.3.16.*</interface>
            <interface>10.3.10.4-18</interface>
            <interface>192.168.1.3</interface>
        </interfaces>
    </network>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
NetworkConfig network = config.getNetworkConfig();
InterfacesConfig interfaceConfig = network.getInterfaces();
interfaceConfig.setEnabled( true )
            .addInterface( "192.168.1.3" );

5.10.8. IPv6 Support

Hazelcast supports IPv6 addresses seamlessly (This support is switched off by default, see the note at the end of this section).

All you need is to define IPv6 addresses or interfaces in the network configuration. The only current limitation is that you cannot define wildcard IPv6 addresses in the TCP/IP join configuration (tcp-ip element). Interfaces configuration does not have this limitation, you can configure wildcard IPv6 interfaces in the same way as IPv4 interfaces.

<hazelcast>
    ...
    <network>
        <port auto-increment="true">5701</port>
        <join>
            <multicast enabled="false">
                <multicast-group>FF02:0:0:0:0:0:0:1</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
            <tcp-ip enabled="true">
                <member>[fe80::223:6cff:fe93:7c7e]:5701</member>
                <interface>192.168.1.0-7</interface>
                <interface>192.168.1.*</interface>
                <interface>fe80:0:0:0:45c5:47ee:fe15:493a</interface>
            </tcp-ip>
        </join>
        <interfaces enabled="true">
            <interface>10.3.16.*</interface>
            <interface>10.3.10.4-18</interface>
            <interface>fe80:0:0:0:45c5:47ee:fe15:*</interface>
            <interface>fe80::223:6cff:fe93:0-5555</interface>
        </interfaces>
    </network>
    ...
</hazelcast>

JVM has two system properties for setting the preferred protocol stack (IPv4 or IPv6) as well as the preferred address family types (inet4 or inet6). On a dual stack machine, IPv6 stack is preferred by default, you can change this through the java.net.preferIPv4Stack=<true|false> system property. When querying name services, JVM prefers IPv4 addresses over IPv6 addresses and returns an IPv4 address if possible. You can change this through java.net.preferIPv6Addresses=<true|false> system property.

See also additional details on IPv6 support in Java.

IPv6 support has been switched off by default, since some platforms have issues using the IPv6 stack. Some other platforms such as Amazon AWS have no support at all. To enable IPv6 support, just set configuration property hazelcast.prefer.ipv4.stack to false. See the System Properties appendix for details.

5.10.9. Member Address Provider SPI

This SPI is not intended to provide addresses of other cluster members with which the Hazelcast instance forms a cluster. To do that, see the previous sections above.

By default, Hazelcast chooses the public and bind address. You can influence on the choice by defining a public-address in the configuration or by using other properties mentioned above. In some cases, though, these properties are not enough and the default address picking strategy chooses wrong addresses. This may be the case when deploying Hazelcast in some cloud environments, such as AWS, when using Docker or when the instance is deployed behind a NAT and the public-address property is not enough (see the Public Address section).

In these cases, it is possible to configure the bind and public address in a more advanced way. You can provide an implementation of the com.hazelcast.spi.MemberAddressProvider interface which provides the bind and public address. The implementation may then choose these addresses in any way - it may read from a system property or file or even invoke a web service to retrieve the public and private address.

The details of the implementation depend heavily on the environment in which Hazelcast is deployed. As such, we now demonstrate how to configure Hazelcast to use a simplified custom member address provider SPI implementation. An example implementation is shown below:

public static final class SimpleMemberAddressProvider implements MemberAddressProvider {
    @Override
    public InetSocketAddress getBindAddress() {
        // determine the address using some configuration, calling an API, ...
        return new InetSocketAddress(hostname, port);
    }

    @Override
    public InetSocketAddress getPublicAddress() {
        // determine the address using some configuration, calling an API, ...
        return new InetSocketAddress(hostname, port);
    }
}

Note that if the bind address port is 0 then it uses a port as configured in the Hazelcast network configuration (see the Port section). If the public address port is set to 0 then it broadcasts the same port that it is bound to. If you wish to bind to any local interface, you may return new InetSocketAddress((InetAddress) null, port) from the getBindAddress() address.

The following configuration examples contain properties that are provided to the constructor of the provider class. If you do not provide any properties, the class may have either a no-arg constructor or a constructor accepting a single java.util.Properties instance. On the other hand, if you do provide properties in the configuration, the class must have a constructor accepting a single java.util.Properties instance.

Declarative Configuration:

<hazelcast>
    ...
    <network>
        <member-address-provider enabled="true">
            <class-name>SimpleMemberAddressProvider</class-name>
            <properties>
                <property name="prop1">prop1-value</property>
                <property name="prop2">prop2-value</property>
            </properties>
        </member-address-provider>
        <!-- other network configurations -->
    </network>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
MemberAddressProviderConfig memberAddressProviderConfig = config.getNetworkConfig().getMemberAddressProviderConfig();
memberAddressProviderConfig
      .setEnabled(true)
      .setClassName(MemberAddressProviderWithStaticProperties.class.getName());
Properties properties = memberAddressProviderConfig.getProperties();
properties.setProperty("prop1", "prop1-value");
properties.setProperty("prop2", "prop2-value");

config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);

// perform other configuration

Hazelcast.newHazelcastInstance(config);

5.11. Failure Detector Configuration

A failure detector is responsible to determine if a member in the cluster is unreachable or crashed. The most important problem in failure detection is to distinguish whether a member is still alive but slow or has crashed. But according to the famous FLP result, it is impossible to distinguish a crashed member from a slow one in an asynchronous system. A workaround to this limitation is to use unreliable failure detectors. An unreliable failure detector allows a member to suspect that others have failed, usually based on liveness criteria but it can make mistakes to a certain degree.

Hazelcast has the following built-in failure detectors: Deadline Failure Detector and Phi Accrual Failure Detector.

There is also a Ping Failure Detector, that, if enabled, works in parallel with the above ones, but identifies the failures on OSI Layer 3 (Network Layer). This detector is by default disabled.

Note that, Hazelcast also offers failure detectors for its Java client. See the Client Failure Detectors section for more information.

5.11.1. Deadline Failure Detector

Deadline Failure Detector uses an absolute timeout for missing/lost heartbeats. After timeout, a member is considered as crashed/unavailable and marked as suspected.

Deadline Failure Detector has the following configuration properties:

  • hazelcast.heartbeat.interval.seconds: This is the interval at which member heartbeat messages are sent to each other.

  • hazelcast.max.no.heartbeat.seconds: This is the timeout which defines when a cluster member is suspected because it has not sent any heartbeats.

To use Deadline Failure Detector configuration property hazelcast.heartbeat.failuredetector.type should be set to "deadline".

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.heartbeat.failuredetector.type">deadline</property>
        <property name="hazelcast.heartbeat.interval.seconds">5</property>
        <property name="hazelcast.max.no.heartbeat.seconds">120</property>
    </properties>
    ...
</hazelcast>
Config config = ...;
config.setProperty("hazelcast.heartbeat.failuredetector.type", "deadline");
config.setProperty("hazelcast.heartbeat.interval.seconds", "5");
config.setProperty("hazelcast.max.no.heartbeat.seconds", "120");
[...]
Deadline Failure Detector is the default failure detector in Hazelcast.

5.11.2. Phi Accrual Failure Detector

This is the failure detector based on The Phi Accrual Failure Detector' by Hayashibara et al.

Phi Accrual Failure Detector keeps track of the intervals between heartbeats in a sliding window of time and measures the mean and variance of these samples and calculates a value of suspicion level (Phi). The value of phi increases when the period since the last heartbeat gets longer. If the network becomes slow or unreliable, the resulting mean and variance increase, there needs to be a longer period for which no heartbeat is received before the member is suspected. 

The hazelcast.heartbeat.interval.seconds and hazelcast.max.no.heartbeat.seconds properties still can be used as period of heartbeat messages and deadline of heartbeat messages. Since Phi Accrual Failure Detector is adaptive to network conditions, a much lower hazelcast.max.no.heartbeat.seconds can be defined than Deadline Failure Detector's timeout.

In addition to the above two properties, Phi Accrual Failure Detector has the following configuration properties:

  • hazelcast.heartbeat.phiaccrual.failuredetector.threshold: This is the phi threshold for suspicion. After calculated phi exceeds this threshold, a member is considered as unreachable and marked as suspected. A low threshold allows to detect member crashes/failures faster but can generate more mistakes and cause wrong member suspicions. A high threshold generates fewer mistakes but is slower to detect actual crashes/failures.

    phi = 1 means likeliness that we will make a mistake is about 10%. The likeliness is about 1% with phi = 2, 0.1% with phi = 3 and so on. Default phi threshold is 10.

  • hazelcast.heartbeat.phiaccrual.failuredetector.sample.size: Number of samples to keep for history. Its default value is 200.

  • hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis: Minimum standard deviation to use for the normal distribution used when calculating phi. Too low standard deviation might result in too much sensitivity.

To use Phi Accrual Failure Detector, configuration property hazelcast.heartbeat.failuredetector.type should be set to "phi-accrual".

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.heartbeat.failuredetector.type">phi-accrual</property>
        <property name="hazelcast.heartbeat.interval.seconds">1</property>
        <property name="hazelcast.max.no.heartbeat.seconds">60</property>
        <property name="hazelcast.heartbeat.phiaccrual.failuredetector.threshold">10</property>
        <property name="hazelcast.heartbeat.phiaccrual.failuredetector.sample.size">200</property>
        <property name="hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis">100</property>
    </properties>
    ...
</hazelcast>
Config config = ...;
config.setProperty("hazelcast.heartbeat.failuredetector.type", "phi-accrual");
config.setProperty("hazelcast.heartbeat.interval.seconds", "1");
config.setProperty("hazelcast.max.no.heartbeat.seconds", "60");
config.setProperty("hazelcast.heartbeat.phiaccrual.failuredetector.threshold", "10");
config.setProperty("hazelcast.heartbeat.phiaccrual.failuredetector.sample.size", "200");
config.setProperty("hazelcast.heartbeat.phiaccrual.failuredetector.min.std.dev.millis", "100");
[...]

5.11.3. Ping Failure Detector

The Ping Failure Detector may be configured in addition to one of Deadline and Phi Accrual Failure Detectors. It operates at Layer 3 of the OSI protocol and provides much quicker and more deterministic detection of hardware and other lower level events. This detector may be configured to perform an extra check after a member is suspected by one of the other detectors, or it can work in parallel, which is the default. This way hardware and network level issues are detected more quickly.

This failure detector is based on InetAddress.isReachable(). When the JVM process has enough permissions to create RAW sockets, the implementation chooses to rely on ICMP Echo requests. This is preferred.

If there are not enough permissions, it can be configured to fallback on attempting a TCP Echo on port 7. In the latter case, both a successful connection or an explicit rejection is treated as "Host is Reachable". Or, it can be forced to use only RAW sockets. This is not preferred as each call creates a heavy weight socket and moreover the Echo service is typically disabled.

For the Ping Failure Detector to rely only on ICMP Echo requests, there are some criteria that need to be met.

Requirements and Linux/Unix Configuration
  • Supported OS: as of Java 1.8 only Linux/Unix environments are supported. This detector relies on ICMP, i.e., the protocol behind the ping command. It tries to issue the ping attempts periodically, and their responses are used to determine the reachability of the remote member. However, you cannot simply create an ICMP Echo Request because these type of packets do not rely on any of the preexisting transport protocols such as TCP. In order to create such a request, you must have the privileges to create RAW sockets (see https://linux.die.net/man/7/raw). Most operating systems allow this to the root users, however Unix based ones are more flexible and allow the use of custom privileges per process instead of requiring root access. Therefore, this detector is supported only on Linux.

  • The Java executable must have the cap_net_raw capability. As described in the above requirement, on Linux, you have the ability to define extra capabilities to a single process, which would allow the process to interact with the RAW sockets. This interaction is achieved via the capability cap_net_raw (see https://linux.die.net/man/7/capabilities). To enable this capability run the following command:

    sudo setcap cap_net_raw=+ep <JDK_HOME>/jre/bin/java

  • When running with custom capabilities, the dynamic linker on Linux rejects loading the libs from untrusted paths. Since you have now cap_net_raw as a custom capability for a process, it becomes suspicious to the dynamic linker and throws an error: java: error while loading shared libraries: libjli.so: cannot open shared object file: No such file or directory

    • To overcome this rejection, the <JDK_HOME>/jre/lib/amd64/jli/ path needs to be added in the ld.conf. Run the following command to do this: echo "<JDK_HOME>/jre/lib/amd64/jli/" >> /etc/ld.so.conf.d/java.conf && sudo ldconfig

  • ICMP Echo Requests must not be blocked by the receiving hosts. /proc/sys/net/ipv4/icmp_echo_ignore_all set to 0. Run the following command:

    echo 0 > /proc/sys/net/ipv4/icmp_echo_ignore_all

If any of the above criteria isn’t met, then the isReachable always falls back on TCP Echo attempts on port 7.

To be able to use the Ping Failure Detector, you can configure it using the icmp element in your Hazelcast IMDG declarative configuration file, e.g., hazelcast.xml. An example is shown below:

<hazelcast>
    <network>
    ...
        <failure-detector>
            <icmp enabled="true">
                <timeout-milliseconds>1000</timeout-milliseconds>
                <fail-fast-on-startup>true</fail-fast-on-startup>
                <interval-milliseconds>1000</interval-milliseconds>
                <max-attempts>3</max-attempts>
                <parallel-mode>true</parallel-mode>
                <ttl>0</ttl>
            </icmp>
        </failure-detector>
    </network>
    ...
</hazelcast>

The following are the element and attribute descriptions:

  • enabled: Specifies whether the legacy ICMP detection mode is enabled; works cooperatively with the existing failure detector and only kicks-in after a pre-defined period has passed with no heartbeats from a member. Its default value is false.

  • parallel-mode: Specifies whether the parallel ping detector is enabled; works separately from the other detectors. Its default value is true.

  • timeout-milliseconds: Number of milliseconds until a ping attempt is considered failed if there was no reply. Its default value is 1000 milliseconds.

  • max-attempts: Maximum number of ping attempts before the member/node gets suspected by the detector. Its default value is 2.

  • interval-milliseconds: Interval, in milliseconds, between each ping attempt. 1000ms (1 sec) is also the minimum interval allowed. Its default value is 1000 milliseconds.

  • ttl: Maximum number of hops the packets should go through. Its default value is 0.

  • fail-fast-on-startup: Specifies whether the cluster member fails to start if it is unable to action an ICMP ping command when ICMP is enabled. Failure is usually due to OS level restrictions.

In the above example configuration, the Ping detector attempts 3 pings, one every second and waits up to 1 second for each to complete. If after 3 seconds, there was no successful ping, the member gets suspected.

Until Hazelcast IMDG 3.10, the following system properties (corresponding to the above configuration elements) have been used to configure Ping Failure Detector. These have become deprecated starting with Hazelcast IMDG 3.10:

<hazelcast>
    ...
    <properties>
        <property name="hazelcast.icmp.enabled">true</property>
        <property name="hazelcast.icmp.parallel.mode">true</property>
        <property name="hazelcast.icmp.timeout">1000</property>
        <property name="hazelcast.icmp.max.attempts">3</property>
        <property name="hazelcast.icmp.interval">1000</property>
        <property name="hazelcast.icmp.ttl">0</property>
    </properties>
    ...
</hazelcast>

To enforce the Requirements, the property hazelcast.icmp.echo.fail.fast.on.startup can also be set to true, in which case, if any of the requirements isn’t met, Hazelcast fails to start.

Below is a summary table of all possible configuration combinations of the ping failure detector.

Table 3. Ping Failure Detector Possible Configuration Combinations
ICMP Parallel Fail-Fast Description Linux Windows macOS

false

false

false

Completely disabled

N/A

N/A

N/A

true

false

false

Legacy ping mode. This works hand-to-hand with the OSI Layer 7 failure detector (see. phi or deadline in the sections above). Ping in this mode only kicks in after a period when there are no heartbeats received, in which case the remote Hazelcast member is pinged up to a configurable count of attempts. If all those attempts fail, the member gets suspected. You can configure this attempt count using the hazelcast.icmp.max.attempts system property.

Supported ICMP Echo if available - Falls back on TCP Echo on port 7

Supported TCP Echo on port 7

Supported ICMP Echo if available - Falls back on TCP Echo on port 7

true

true

false

Parallel ping detector, works in parallel with the configured failure detector. Checks periodically if members are live (OSI Layer 3) and suspects them immediately, regardless of the other detectors.

Supported ICMP Echo if available - Falls back on TCP Echo on port 7

Supported TCP Echo on port 7

Supported ICMP Echo if available - Falls back on TCP Echo on port 7

true

true

true

Parallel ping detector, works in parallel with the configured failure detector. Checks periodically if members are live (OSI Layer 3) and suspects them immediately, regardless of the other detectors.

Supported - Requires OS Configuration Enforcing ICMP Echo if available - No start up if not available

Not Supported

Not Supported - Requires root privileges

5.12. Advanced Network Configuration

Up to and including Hazelcast 3.11, Hazelcast members use a single server socket for all kinds of connections: cluster members, Hazelcast clients implementing the Open Binary Client Protocol and HTTP protocol clients connect to a single server socket that handles all the protocols.

Starting with Hazelcast 3.12, it is possible to configure the Hazelcast members with separate server sockets using a different network configuration for different protocols. This configuration scheme allows more flexibility when deploying Hazelcast as described in the following cases:

  • For security, it is possible to bind the member protocol server socket on a protected internal network interface, while the client connections can be established on another network interface accessible by the Hazelcast clients.

  • Different kinds of network connections can be established with different socket options. For example varying send/receive window size to optimize the network usage, TLS for connections over WAN while member-to-member connections may remain unencrypted, etc.

In the following example we introduce the advanced network configuration for a member to listen for member-to-member connections on the default port 5701 while listening for client connections on the port 9090:

Config config = new Config();
config.getAdvancedNetworkConfig().setEnabled(true);
config.getAdvancedNetworkConfig().setClientEndpointConfig(
        new ServerSocketEndpointConfig().setPort(9090)
);
HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
System.out.println(instance.getCluster().getLocalMember().getAddressMap());

Running this example prints something similar to the following output, indicating that the member listens for the specified protocols on the respective configured ports:

{EndpointQualifier{type='CLIENT'}=[10.212.134.156]:9090, EndpointQualifier{type='MEMBER'}=[10.212.134.156]:5701}

The following is the equivalent declarative configuration:

<hazelcast>
    ...
    <advanced-network enabled="true">
        <member-server-socket-endpoint-config>
            <port>5701</port>
        </member-server-socket-endpoint-config>
        <client-server-socket-endpoint-config>
            <port>9090</port>
        </client-server-socket-endpoint-config>
    </advanced-network>
    ...
</hazelcast>

5.12.1. Setting Up Cluster Members for Advanced Network Configuration

Advanced network configuration and single-socket network configuration are mutually exclusive: either an enabled AdvancedNetworkConfig or the NetworkConfig object is used to configure a member’s networking, including the joiner, discovery, failure detectors, etc. as described in the previous sections of this chapter.

You cannot define both elements in the declarative configuration, i.e., the network and advanced-network elements cannot be configured at the same time. In the programmatic configuration, an enabled AdvancedNetworkConfig takes precedence over the NetworkConfig. AdvancedNetworkConfig is disabled by default, therefore the unisocket member configuration under NetworkConfig is used in the default case.

When using the advanced network configuration, the following configurations are defined member-wide:

  • Joiner and cluster discovery (Multicast, TCP/IP, AWS, Eureka, etc.)

  • MemberAddressProvider configuration

  • Failure detector configuration

In addition to the above, the advanced network configuration allows the configuration of multiple endpoints: each endpoint configuration applies for a specific protocol, e.g., MEMBER and CLIENT. An additional optional identifier can be configured to separate the configuration of multiple WAN protocol endpoints.

The supported protocols are as follows:

  • MEMBER: A member server socket is required for Hazelcast to operate. The default advanced network configuration defines a member endpoint configuration listening on port 5701 (same as the single-socket Hazelcast member configuration).

  • CLIENT: A single server socket handling the Hazelcast Open Binary Client Protocol can be optionally configured. If no such endpoint is configured, then the clients will not be able to connect to the Hazelcast member.

  • REST: A REST server socket is optional.

  • MEMCACHE: When accessing a Hazelcast cluster over the Memcache text protocol, an endpoint listening to MEMCACHE protocol must be defined.

  • WAN: Multiple WAN endpoint configurations can be defined to determine the network settings of outgoing connections (from the members of a source cluster to the target WAN cluster members) or to establish server sockets on which a target WAN member can listen for the incoming connections from the source cluster.

5.12.2. Server Socket Endpoint Configuration

The server socket endpoint configuration is common for all protocols. The elements comprising a server socket endpoint configuration are identical to their single-socket network configuration counterparts.

The following declarative configuration example includes all the common server socket endpoint elements:

<hazelcast>
   ...
   <advanced-network enabled="true">
       <member-server-socket-endpoint-config>
           <port auto-increment="true" port-count="100">5701</port>
           <outbound-ports>
               <ports>33000-35000</ports>
               <ports>37000,37001,37002,37003</ports>
               <ports>38000,38500-38600</ports>
           </outbound-ports>
           <interfaces enabled="true">
               <interface>10.10.1.*</interface>
           </interfaces>
           <ssl enabled="true">
               <factory-class-name>com.hazelcast.examples.MySSLContextFactory</factory-class-name>
               <properties>
                   <property name="foo">bar</property>
               </properties>
           </ssl>
           <symmetric-encryption>
               <algorithm>ALGO</algorithm>
               <salt>SALT</salt>
               <password>PASS</password>
               <iteration-count>10000</iteration-count>
           </symmetric-encryption>
           <socket-interceptor enabled="true">
               <class-name>com.hazelcast.examples.MySocketInterceptor</class-name>
               <properties>
                   <property name="foo">bar</property>
               </properties>
           </socket-interceptor>
           <socket-options>
               <buffer-direct>true</buffer-direct>
               <tcp-no-delay>true</tcp-no-delay>
               <keep-alive>true</keep-alive>
               <connect-timeout-seconds>64</connect-timeout-seconds>
               <send-buffer-size-kb>25</send-buffer-size-kb>
               <receive-buffer-size-kb>33</receive-buffer-size-kb>
               <linger-seconds>99</linger-seconds>
           </socket-options>
           <public-address>dummy</public-address>
           <reuse-address>true</reuse-address>
        </member-server-socket-endpoint-config>
    </advanced-network>
    ...
</hazelcast>

When using the declarative configuration, specific element names introduce the server socket endpoint configuration for each protocol:

  • member-server-socket-endpoint-config for MEMBER protocol

  • client-server-socket-endpoint-config for CLIENT protocol

  • rest-server-socket-endpoint-config for REST endpoint

  • memcache-server-socket-endpoint-config for MEMCACHE endpoint

  • wan-server-socket-endpoint-config for WAN endpoints

When using the programmatic configuration, corresponding methods set the respective server socket endpoint configuration:

config.getAdvancedNetworkConfig().setMemberEndpointConfig(
        new ServerSocketEndpointConfig()
            .setPort(5701)
            .setPortAutoIncrement(false)
            .setSSLConfig(new SSLConfig())
            .setReuseAddress(true)
            .setSocketTcpNoDelay(true)
);

5.12.3. Setting Up REST Server Socket Endpoint Configuration

In addition to the common server socket configuration described above, the REST endpoint configuration includes certain additional elements which are used to enable/disable the REST functionality groups.

config.getAdvancedNetworkConfig().setRestEndpointConfig(
        new RestServerEndpointConfig()
            .setPort(8080)
            .setPortAutoIncrement(false)
            .enableGroups(WAN, CLUSTER_READ, HEALTH_CHECK)
);

The following is the equivalent declarative configuration:

<hazelcast>
    ...
    <advanced-network enabled="true">
        <rest-server-socket-endpoint-config>
            <port auto-increment="false">8080</port>
            <endpoint-groups>
                <endpoint-group name="WAN" enabled="true"/>
                <endpoint-group name="CLUSTER_READ" enabled="true"/>
                <endpoint-group name="HEALTH_CHECK" enabled="true"/>
            </endpoint-groups>
        </rest-server-socket-endpoint-config>
    </advanced-network>
    ...
</hazelcast>

5.12.4. Setting Up WAN Endpoints Configuration

Multiple WAN endpoint configurations can be defined to configure the outgoing connections and server sockets, depending on the role of the member in the WAN replication. The configuration examples are provided in the following sections for both active and passive side of the WAN replication.

Configuring the WAN Active Side

The members on the active cluster initiate connections to the target cluster members, so there is no need to create a server socket. A plain EndpointConfig is created that supplies the configuration for the client side of connections that the active members will create:

config.getAdvancedNetworkConfig().addWanEndpointConfig(
        new EndpointConfig().setName("tokyo")
                .setSSLConfig(new SSLConfig()
                                    .setEnabled(true)
                                    .setFactoryClassName("com.hazelcast.examples.MySSLContextFactory")
                                    .setProperty("foo", "bar"))
);
WanReplicationConfig wanReplicationConfig = new WanReplicationConfig();
WanBatchReplicationPublisherConfig publisherConfig = new WanBatchReplicationPublisherConfig()
                        .setEndpoint("tokyo")
                        .setTargetEndpoints("tokyo.hazelcast.com:8765");
wanReplicationConfig.addWanBatchReplicationPublisherConfig(publisherConfig);
config.addWanReplicationConfig(wanReplicationConfig);

config.getMapConfig("customers").setWanReplicationRef(
        new WanReplicationRef("replicate-to-tokyo", "com.company.MergePolicy", emptyList(), false)
);

The following is the equivalent declarative configuration:

<hazelcast>
    ...
    <advanced-network enabled="true">
        <wan-endpoint-config name="tokyo">
            <ssl enabled="true">
                <factory-class-name>com.hazelcast.examples.MySSLContextFactory</factory-class-name>
                <properties>
                    <property name="endpoints">tokyo.example.com:11010</property>
                </properties>
            </ssl>
        </wan-endpoint-config>
    </advanced-network>
    ...
    <wan-replication name="replicate-to-tokyo">
        <batch-publisher>
            <cluster-name>clusterB</cluster-name>
            <target-endpoints>...</target-endpoints>
        </batch-publisher>
    </wan-replication>
    ...
    <map name="customer">
        <wan-replication-ref name="replicate-to-tokyo">
            <merge-policy>...</merge-policy>
        </wan-replication-ref>
    </map>
    ...
</hazelcast>

The wan-endpoint-config element contains the same sub-elements as the member-server-socket-endpoint-config element described above except port, public-address and reuse-address

Configuring the WAN Passive Side

On the passive cluster, a server socket is configured on the members to listen for the incoming WAN connections, matching the network configuration (SSL configuration, etc.) configured on the active side of the WAN replication.

config.getAdvancedNetworkConfig().addWanEndpointConfig(
        new ServerSocketEndpointConfig()
                .setName("tokyo")
                .setPort(11010)
                .setPortAutoIncrement(false)
                .setSSLConfig(new SSLConfig()
                        .setEnabled(true)
                        .setFactoryClassName("com.hazelcast.examples.MySSLContextFactory")
                        .setProperty("foo", "bar")
                ));

The following is the equivalent declarative configuration:

<hazelcast>
    ...
    <advanced-network enabled="true">
        <wan-server-socket-endpoint-config name="tokyo">
            <port auto-increment="false">11010</port>
            <ssl enabled="true">
                <factory-class-name>com.hazelcast.examples.MySSLContextFactory</factory-class-name>
                <properties>
                    <property name="foo">bar</property>
                </properties>
            </ssl>
        </wan-server-socket-endpoint-config>
    </advanced-network>
    ...
</hazelcast>

5.12.5. Advanced Network Configuration FAQ

  1. Can I multiplex protocols on a single advanced network endpoint? For example, can I use a single server socket to listen for MEMBER and CLIENT protocols?

    No, each endpoint configuration that defines a server socket must bind to a different socket address.

  2. Can I mix unisocket and advanced network members in the same cluster?

    No, the results will be undefined.

  3. Can I configure multiple server socket endpoints for the same protocol?

    You can only configure multiple server socket endpoints for WAN protocol. For other protocols (MEMBER, CLIENT, REST, MEMCACHE), a single server socket can be configured.

6. Rolling Member Upgrades

Hazelcast IMDG Enterprise

This chapter explains the procedure of upgrading the version of Hazelcast members in a running cluster without interrupting the operation of the cluster.

6.1. Terminology

  • Minor version: A version change after the decimal point, e.g., 3.12 and 3.13.

  • Patch version: A version change after the second decimal point, e.g., 3.12.1 and 3.12.2.

  • Member codebase version: The major.minor.patch version of the Hazelcast binary on which the member executes. For example, when running on hazelcast-3.12.jar, your member’s codebase version is 3.12.0.

  • Cluster version: The major.minor version at which the cluster operates. This ensures that cluster members are able to communicate using the same cluster protocol and determines the feature set exposed by the cluster.

6.2. Hazelcast Members Compatibility Guarantees

Hazelcast members operating on binaries of the same major and minor version numbers are compatible regardless of patch version. For example, in a cluster with members running on version 3.11.1, it is possible to perform a rolling upgrade to 3.11.2 by shutting down, upgrading to hazelcast-3.11.2.jar binary and starting each member one by one. Patch level compatibility applies to both Hazelcast IMDG and Hazelcast IMDG Enterprise.

Also, each minor version is compatible with the previous one (back until Hazelcast IMDG 3.8). For example, it is possible to perform a rolling upgrade on a cluster running Hazelcast IMDG Enterprise 3.11 to Hazelcast IMDG Enterprise 3.12. Rolling upgrades across minor versions is a Hazelcast IMDG Enterprise feature.

The compatibility guarantees described above are given in the context of rolling member upgrades and only apply to GA (general availability) releases. It is never advisable to run a cluster with members running on different patch or minor versions for prolonged periods of time.

6.3. Rolling Upgrade Procedure

The version numbers used in the paragraph below are only used as an example.

Let’s assume a cluster with four members running on codebase version 3.12.0 with cluster version 3.12, that should be upgraded to codebase version 3.13.0 and cluster version 3.13. The rolling upgrade process for this cluster, i.e., replacing existing 3.12.0 members one by one with an upgraded one at version 3.13.0, includes the following steps which should be repeated for each member:

  • Gracefully shut down an existing 3.12.0 member.

  • Wait until all partition migrations are completed; during migrations, membership changes (member joins or removals) are not allowed.

  • Update the member with the new 3.13.0 Hazelcast binaries.

  • Start the member and wait until it joins the cluster. You should see something like the following in your logs:

    ...
    INFO: [192.168.2.2]:5701 [cluster] [3.13] Hazelcast 3.9 (20170630 - a67dc3a) starting at [192.168.2.2]:5701
    ...
    INFO: [192.168.2.2]:5701 [cluster] [3.13] Cluster version set to 3.12

The version in brackets ([3.13]) still denotes the member’s codebase version (running on the hypothetical hazelcast-3.13.jar binary). Once the member locates the existing cluster members, it sends its join request to the master. The master validates that the new member is allowed to join the cluster and lets the new member know that the cluster is currently operating at 3.12 cluster version. The new member sets 3.12 as its cluster version and starts operating normally.

At this point all members of the cluster have been upgraded to codebase version 3.13.0 but the cluster still operates at cluster version 3.12. In order to use 3.13 features the cluster version must be changed to 3.13.

Rolling upgrade can be used for one version at a time, e.g., 3.n to 3.n+1. You cannot upgrade your members, for example, from 3.13 to 3.15 in a single rolling upgrade session.

6.4. Upgrading Cluster Version

You have the following options to upgrade the cluster version:

Note that you need to enable the REST API to use either of the above methods to upgrade your cluster version. For this, enable the CLUSTER_WRITE REST endpoint group (its default is disabled). See the Using the REST Endpoint Groups section on how to enable them.

Also note that you need to upgrade your Management Center version before upgrading the member version if you want to change cluster version using Management Center. Management Center is compatible with the previous minor version of Hazelcast, starting with version 3.9. For example, Management Center 3.12 works with both Hazelcast IMDG 3.11 and 3.12. To change your cluster version to 3.12, you need Management Center 3.12.

Upgrading Cluster Version From IMDG 3.11 to 3.12

For the IMDG versions before 3.12, REST API could be enabled by using the hazelcast.rest.enabled system property, which is deprecated for 3.12.x versions. IMDG 3.12 and newer versions introduce the rest-api configuration element along with REST endpoint groups. Therefore, a configuration change is needed specifically when performing a rolling member upgrade from IMDG 3.11 to 3.12.

So, the steps listed in the above Rolling Upgrade Procedure section should be as follows:

  1. Shutdown the 3.11 member

  2. Wait until all partition migrations are completed

  3. Update the member with 3.12 binaries

  4. Update the configuration (see below)

  5. Start the member

For the 4th step ("Update the configuration"), the configuration should be updated as follows:

<hazelcast>
    ...
    <rest-api enabled="true">
        <endpoint-group name="CLUSTER_WRITE" enabled="true"/>
    </rest-api>
    ...
</hazelcast>

See the Using the REST Endpoint Groups section for more information.

6.5. Enabling Auto-Upgrading

The cluster can automatically upgrade its version. As soon as it detects that all its members have a version higher than the current cluster version, it upgrades the cluster version to match it. This feature is disabled by default. To enable it, set the system property hazelcast.cluster.version.auto.upgrade.enabled to true.

There is one tricky detail here: as you are shutting down and upgrading the members one by one, when you shut down the last one, all the members in the remaining cluster have the newer version, but you don’t want the auto-upgrade to kick in before you have successfully upgraded the last member as well. To avoid this, you can use the hazelcast.cluster.version.auto.upgrade.min.cluster.size system property. You should set it to the size of your cluster, and then Hazelcast will wait for the last member to join before it can proceed with the auto-upgrade.

6.6. Network Partitions and Rolling Upgrades

In the event of network partitions which split your cluster into two subclusters, split-brain handling works as explained in the Network Partitioning chapter, with the additional constraint that two subclusters only merge as long as they operate on the same cluster version. This is a requirement to ensure that all members participating in each one of the subclusters are able to operate as members of the merged cluster at the same cluster version.

With regards to rolling upgrades, the above constraint implies that if a network partition occurs while a change of cluster version is in progress, then with some unlucky timing, one subcluster may be upgraded to the new cluster version and another subcluster may have upgraded members but still operate at the old cluster version.

In order for the two subclusters to merge, it is necessary to change the cluster version of the subcluster that still operates on the old cluster version, so that both subclusters will be operating at the same, upgraded cluster version and able to merge as soon as the network partition is fixed.

6.7. Rolling Upgrade FAQ

The following provide answers to the frequently asked questions related to rolling member upgrades.

How is the cluster version set?

When a new member starts, it is not yet joined to a cluster; therefore its cluster version is still undetermined. In order for the cluster version to be set, one of the following must happen:

  • the member cannot locate any members of the cluster to join or is configured without a joiner: in this case, the member appoints itself as the master of a new single-member cluster and its cluster version is set to the major.minor version of its own codebase version. So a standalone member running on codebase version 3.12.0 sets its own cluster version to 3.12.

  • the member that is starting locates members of the cluster and identifies which is the master: in this case, the master validates that the joining member’s codebase version is compatible with the current cluster version. If it is found to be compatible, then the member joins and the master sends the cluster version, which is set on the joining member. Otherwise, the starting member fails to join and shuts down.

What if a new Hazelcast minor version changes fundamental cluster protocol communication, like join messages?

The version numbers used in the paragraph below are only used as an example.

On startup, as answered in the above question (How is the cluster version set?), the cluster version is not yet known to a member that has not joined any cluster. By default the newly started member uses the cluster protocol that corresponds to its codebase version until this member joins a cluster (so for codebase 3.12.0 this means implicitly assuming cluster version 3.12). If, hypothetically, major changes in discovery & join operations have been introduced which do not allow the member to join a 3.11 cluster, then the member should be explicitly configured to start assuming a 3.11 cluster version.

Do I have to upgrade clients to work with rolling upgrades?

Clients which implement the Open Binary Client Protocol are compatible with Hazelcast version 3.6 and newer minor versions. Thus older client versions are compatible with next minor versions. Newer clients connected to a cluster operate at the lower version of capabilities until all members are upgraded and the cluster version upgrade occurs.

Can I stop and start multiple members at once during a rolling member upgrade?

It is not recommended due to potential network partitions. It is advised to always stop and start one member in each upgrade step.

Can I upgrade my business app together with Hazelcast while doing a rolling member upgrade?

Yes, but make sure to make the new version of your app compatible with the old one since there will be a timespan when both versions interoperate. Checking if two versions of your app are compatible includes verifying binary and algorithmic compatibility and some other steps.

It is worth mentioning that a business app upgrade is orthogonal to a rolling member upgrade. A rolling business app upgrade may be done without upgrading the members.

7. Distributed Data Structures

As mentioned in the Overview section, Hazelcast offers distributed implementations of many common data structures. For each of the client languages, Hazelcast mimics as closely as possible the natural interface of the structure. So, for example in Java, the map follows java.util.Map semantics. In the descriptions below, we mention each structure’s Java equivalent interface. All of these structures are usable from Java, .NET, C++, Node.js, Python, Go and Scala.

  • Standard utility collections

    • Map is the distributed implementation of java.util.Map. It lets you read from and write to a Hazelcast map with methods such as get and put.

    • Queue is the distributed implementation of java.util.concurrent.BlockingQueue. You can add an item in one member and remove it from another one.

    • Ringbuffer is implemented for reliable eventing system.

    • Set is the distributed and concurrent implementation of java.util.Set. It does not allow duplicate elements and does not preserve their order.

    • List is similar to Hazelcast Set. The only difference is that it allows duplicate elements and preserves their order.

    • Multimap is a specialized Hazelcast map. It is a distributed data structure where you can store multiple values for a single key.

    • Replicated Map does not partition data. It does not spread data to different cluster members. Instead, it replicates the data to all members.

    • Cardinality Estimator is a data structure which implements Flajolet’s HyperLogLog algorithm.

  • Topic is the distributed mechanism for publishing messages that are delivered to multiple subscribers. It is also known as the publish/subscribe (pub/sub) messaging model. See the Topic section for more information. Hazelcast also has a structure called Reliable Topic which uses the same interface of Hazelcast Topic. The difference is that it is backed up by the Ringbuffer data structure. See the Reliable Topic section.

  • Concurrency utilities

    • FencedLock is the distributed implementation of java.util.concurrent.locks.Lock. When you use lock, the critical section that Hazelcast Lock guards is guaranteed to be executed by only one thread in the entire cluster.

    • ISemaphore is the distributed implementation of java.util.concurrent.Semaphore. When performing concurrent activities, semaphores offer permits to control the thread counts.

    • IAtomicLong is the distributed implementation of java.util.concurrent.atomic.AtomicLong. Most of AtomicLong’s operations are available. However, these operations involve remote calls and hence their performances differ from AtomicLong, due to being distributed.

    • IAtomicReference is the distributed implementation of java.util.concurrent.atomic.AtomicReference. When you need to deal with a reference in a distributed environment, you can use Hazelcast IAtomicReference.

    • FlakeIdGenerator is used to generate cluster-wide unique identifiers.

    • ICountdownLatch is the distributed implementation of java.util.concurrent.CountDownLatch. Hazelcast CountDownLatch is a gate keeper for concurrent activities. It enables the threads to wait for other threads to complete their operations.

    • PN counter is a distributed data structure where each Hazelcast instance can increment and decrement the counter value and these updates are propagated to all replicas.

  • Event Journal is a distributed data structure that stores the history of mutation actions on map or cache.

7.1. Overview of Hazelcast Distributed Objects

Hazelcast has two types of distributed objects in terms of their partitioning strategies:

  1. Data structures where each partition stores a part of the instance, namely partitioned data structures.

  2. Data structures where a single partition stores the whole instance, namely non-partitioned data structures.

The following are the partitioned Hazelcast data structures:

  • Map

  • MultiMap

  • Cache (Hazelcast JCache implementation)

  • Event Journal

The following are the non-partitioned Hazelcast data structures:

  • Queue

  • Set

  • List

  • Ringbuffer

  • FencedLock

  • ISemaphore

  • IAtomicLong

  • IAtomicReference

  • FlakeIdGenerator

  • ICountdownLatch

  • Cardinality Estimator

  • PN Counter

Besides these, Hazelcast also offers the Replicated Map structure as explained in the above Standard utility collections list.

7.1.1. Loading and Destroying a Distributed Object

Hazelcast offers a get method for most of its distributed objects. To load an object, first create a Hazelcast instance and then use the related get method on this instance. Following example code snippet creates an Hazelcast instance and a map on this instance.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
Map<Integer, String> customers = hazelcastInstance.getMap( "customers" );

As to the configuration of distributed object, Hazelcast uses the default settings from the file hazelcast.xml that comes with your Hazelcast download. Of course, you can provide an explicit configuration in this XML or programmatically according to your needs. See the Understanding Configuration section.

Note that, most of Hazelcast’s distributed objects are created lazily, i.e., a distributed object is created once the first operation accesses it.

If you want to use an object you loaded in other places, you can safely reload it using its reference without creating a new Hazelcast instance (customers in the above example).

To destroy a Hazelcast distributed object, you can use the method destroy. This method clears and releases all resources of the object. Therefore, you must use it with care since a reload with the same object reference after the object is destroyed creates a new data structure without an error. See the following example code where one of the queues are destroyed and the other one is accessed.

HazelcastInstance hz1 = Hazelcast.newHazelcastInstance();
HazelcastInstance hz2 = Hazelcast.newHazelcastInstance();
IQueue<String> q1 = hz1.getQueue("q");
IQueue<String> q2 = hz2.getQueue("q");
q1.add("foo");
System.out.println("q1.size: "+q1.size()+ " q2.size:"+q2.size());
q1.destroy();
System.out.println("q1.size: " + q1.size() + " q2.size:" + q2.size());

If you start the Member above, the output is as shown below:

q1.size: 1 q2.size:1
q1.size: 0 q2.size:0

As you see, no error is generated and a new queue resource is created.

Hazelcast is designed to create any distributed data structure whenever it is accessed, i.e., whenever a call is made to the data structure. Therefore, keep in mind that a data structure is recreated when you perform an operation on it even after you have destroyed it.

7.1.2. Controlling Partitions

Hazelcast uses the name of a distributed object to determine which partition it will be put. Let’s load two queues as shown below:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IQueue q1 = hazelcastInstance.getQueue("q1");
IQueue q2 = hazelcastInstance.getQueue("q2");

Since these queues have different names, they will be placed into different partitions. If you want to put these two into the same partition, you use the @ symbol as shown below:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IQueue q1 = hazelcastInstance.getQueue("q1@foo");
IQueue q2 = hazelcastInstance.getQueue("q2@foo");

Now, these two queues will be put into the same partition whose partition key is foo. Note that you can use the method getPartitionKey to learn the partition key of a distributed object. It may be useful when you want to create an object in the same partition of an existing object. See its usage as shown below:

String partitionKey = q1.getPartitionKey();
IQueue q3 = hazelcastInstance.getQueue("q3@"+partitionKey);

7.1.3. Common Features of all Hazelcast Data Structures

  • If a member goes down, its backup replica (which holds the same data) dynamically redistributes the data, including the ownership and locks on them, to the remaining live members. As a result, there will not be any data loss.

  • There is no single cluster master that can be a single point of failure. Every member in the cluster has equal rights and responsibilities. No single member is superior. There is no dependency on an external 'server' or 'master'.

7.1.4. Example Distributed Object Code

Here is an example of how you can retrieve existing data structure instances (map, queue, set, topic, etc.) and how you can listen for instance events, such as an instance being created or destroyed.

    ExampleDOL example = new ExampleDOL();
    Config config = new Config();

    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
    hazelcastInstance.addDistributedObjectListener(example);

    Collection<DistributedObject> distributedObjects = hazelcastInstance.getDistributedObjects();
    for (DistributedObject distributedObject : distributedObjects) {
        System.out.println(distributedObject.getName());
    }
}

@Override
public void distributedObjectCreated(DistributedObjectEvent event) {
    DistributedObject instance = event.getDistributedObject();
    System.out.println("Created " + instance.getName());
}

@Override
public void distributedObjectDestroyed(DistributedObjectEvent event) {
    DistributedObject instance = event.getDistributedObject();
    System.out.println("Destroyed " + instance.getName());
}

7.2. Map

Hazelcast Map (IMap) extends the interface java.util.concurrent.ConcurrentMap and hence java.util.Map. It is the distributed implementation of Java map. You can perform operations like reading and writing from/to a Hazelcast map with the well known get and put methods.


IMap data structure can also be used by Hazelcast Jet for Real-Time Stream Processing (by enabling the Event Journal on your map) and Fast Batch Processing. Hazelcast Jet uses IMap as a source (reads data from IMap) and as a sink (writes data to IMap). See the Fast Batch Processing and Real-Time Stream Processing use cases for Hazelcast Jet. See also here in the Hazelcast Jet Reference Manual to learn how Jet uses IMap, i.e., how it can read from and write to IMap.

7.2.1. Getting a Map and Putting an Entry

Hazelcast partitions your map entries and their backups, and almost evenly distribute them onto all Hazelcast members. Each member carries approximately "number of map entries * 2 * 1/n" entries, where n is the number of members in the cluster. For example, if you have a member with 1000 objects to be stored in the cluster and then you start a second member, each member will both store 500 objects and back up the 500 objects in the other member.

Let’s create a Hazelcast instance and fill a map named Capitals with key-value pairs using the following code. Use the HazelcastInstance getMap method to get the map, then use the map put method to put an entry into the map.

HazelcastInstance hzInstance = Hazelcast.newHazelcastInstance();
Map<String, String> capitalcities = hzInstance.getMap( "capitals" );
    capitalcities.put( "1", "Tokyo" );
    capitalcities.put( "2", "Paris" );
    capitalcities.put( "3", "Washington" );
    capitalcities.put( "4", "Ankara" );
    capitalcities.put( "5", "Brussels" );
    capitalcities.put( "6", "Amsterdam" );
    capitalcities.put( "7", "New Delhi" );
    capitalcities.put( "8", "London" );
    capitalcities.put( "9", "Berlin" );
    capitalcities.put( "10", "Oslo" );
    capitalcities.put( "11", "Moscow" );
    ...
    capitalcities.put( "120", "Stockholm" );

When you run this code, a cluster member is created with a map whose entries are distributed across the members' partitions. See the below illustration. For now, this is a single member cluster.

Map Entries in a Single Member
Please note that some of the partitions do not contain any data entries since we only have 120 objects and the partition count is 271 by default. This count is configurable and can be changed using the system property hazelcast.partition.count. See the System Properties appendix.

7.2.2. Creating A Member for Map Backup

Now let’s create a second member by running the above code again. This creates a cluster with two members. This is also where backups of entries are created - remember the backup partitions mentioned in the Hazelcast Overview section. The following illustration shows two members and how the data and its backup is distributed.

Map Entries with Backups in Two Members

As you see, when a new member joins the cluster, it takes ownership and loads some of the data in the cluster. Eventually, it will carry almost "(1/n * total-data) + backups" of the data, reducing the load on other members.

HazelcastInstance.getMap() returns an instance of com.hazelcast.core.IMap which extends the java.util.concurrent.ConcurrentMap interface. Methods like ConcurrentMap.putIfAbsent(key,value) and ConcurrentMap.replace(key,value) can be used on the distributed map, as shown in the example below.

public class BasicMapOperations {

    private HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

    public Customer getCustomer(String id) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap("customers");
        Customer customer = customers.get(id);
        if (customer == null) {
            customer = new Customer(id);
            customer = customers.putIfAbsent(id, customer);
        }
        return customer;
    }

    public boolean updateCustomer(Customer customer) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap("customers");
        return (customers.replace(customer.getId(), customer) != null);
    }

    public boolean removeCustomer(Customer customer) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap("customers");
        return customers.remove(customer.getId(), customer);
    }
}

All ConcurrentMap operations such as put and remove might wait if the key is locked by another thread in the local or remote JVM. But, they will eventually return with success. ConcurrentMap operations never throw a java.util.ConcurrentModificationException.

7.2.3. Backing Up Maps

Hazelcast distributes map entries onto multiple cluster members (JVMs). Each member holds some portion of the data.

Distributed maps have one backup by default. If a member goes down, your data is recovered using the backups in the cluster. There are two types of backups as described below: sync and async.

Creating Sync Backups

To provide data safety, Hazelcast allows you to specify the number of backup copies you want to have. That way, data on a cluster member is copied onto other member(s).

To create synchronous backups, select the number of backup copies using the backup-count property.

<hazelcast>
    ...
    <map name="default">
        <backup-count>1</backup-count>
    </map>
    ...
</hazelcast>

When this count is 1, a map entry will have its backup on one other member in the cluster. If you set it to 2, then a map entry will have its backup on two other members. You can set it to 0 if you do not want your entries to be backed up, e.g., if performance is more important than backing up. The maximum value for the backup count is 6.

Hazelcast supports both synchronous and asynchronous backups. By default, backup operations are synchronous and configured with backup-count. In this case, backup operations block operations until backups are successfully copied to backup members (or deleted from backup members in case of remove) and acknowledgements are received. Therefore, backups are updated before a put operation is completed, provided that the cluster is stable. Sync backup operations have a blocking cost which may lead to latency issues.

Creating Async Backups

Asynchronous backups, on the other hand, do not block operations. They are fire & forget and do not require acknowledgements; the backup operations are performed at some point in time.

To create asynchronous backups, select the number of async backups with the async-backup-count property. An example is shown below.

<hazelcast>
    ...
    <map name="default">
        <backup-count>0</backup-count>
        <async-backup-count>1</async-backup-count>
    </map>
    ...
</hazelcast>

See Consistency and Replication Model for more detail.

Backups increase memory usage since they are also kept in memory.
A map can have both sync and async backups at the same time.
Enabling Backup Reads

By default, Hazelcast has one sync backup copy. If backup-count is set to more than 1, then each member will carry both owned entries and backup copies of other members. So for the map.get(key) call, it is possible that the calling member has a backup copy of that key. By default, map.get(key) always reads the value from the actual owner of the key for consistency.

To enable backup reads (read local backup entries), set the value of the read-backup-data property to true. Its default value is false for consistency. Enabling backup reads can improve performance but on the other hand it can cause stale reads while still preserving monotonic-reads property.

<hazelcast>
    ...
    <map name="default">
        <backup-count>0</backup-count>
        <async-backup-count>1</async-backup-count>
        <read-backup-data>true</read-backup-data>
    </map>
    ...
</hazelcast>

This feature is available when there is at least one sync or async backup.

Please note that if you are performing a read from a backup, you should take into account that your hits to the keys in the backups are not reflected as hits to the original keys on the primary members. This has an impact on IMap’s maximum idle seconds or time-to-live seconds expiration. Therefore, even though there is a hit on a key in backups, your original key on the primary member may expire.

7.2.4. Map Eviction

Starting with Hazelcast 3.7, Hazelcast Map uses a new eviction mechanism which is based on the sampling of entries. See the Eviction Algorithm section for details.

Unless you delete the map entries manually or use an eviction policy, they will remain in the map. Hazelcast supports policy-based eviction for distributed maps. Currently supported policies are LRU (Least Recently Used) and LFU (Least Frequently Used).

Understanding Map Eviction

Hazelcast Map performs eviction based on partitions. For example, when you specify a size using the PER_NODE attribute for max-size (see the Configuring Map Eviction section), Hazelcast internally calculates the maximum size for every partition. Hazelcast uses the following equation to calculate the maximum size of a partition:

partition-maximum-size = max-size * member-count / partition-count
If the partition-maximum-size is less than 1 in the equation above, it will be set to 1 (otherwise, the partitions would be emptied immediately by eviction due to the exceedance of max-size being less than 1).

The eviction process starts according to this calculated partition maximum size when you try to put an entry. When entry count in that partition exceeds partition maximum size, eviction starts on that partition.

Assume that you have the following figures as examples:

  • partition count: 200

  • entry count for each partition: 100

  • max-size (PER_NODE): 20000

The total number of entries here is 20000 (partition count * entry count for each partition). This means you are at the eviction threshold since you set the max-size to 20000. When you try to put an entry:

  1. the entry goes to the relevant partition

  2. the partition checks whether the eviction threshold is reached (max-size)

  3. only one entry will be evicted.

As a result of this eviction process, when you check the size of your map, it is 19999. After this eviction, subsequent put operations do not trigger the next eviction until the map size is again close to the max-size.

The above scenario is simply an example that describes how the eviction process works. Hazelcast finds the most optimum number of entries to be evicted according to your cluster size and selected policy.
Configuring Map Eviction

The following is an example declarative configuration for map eviction.

<hazelcast>
    ...
    <map name="default">
        <time-to-live-seconds>0</time-to-live-seconds>
        <max-idle-seconds>0</max-idle-seconds>
        <eviction eviction-policy="LRU" max-size-policy="PER_NODE" size="5000"/>
    </map>
    ...
</hazelcast>

The following are the configuration element descriptions:

  • time-to-live-seconds: Maximum time in seconds for each entry to stay in the map (TTL). It limits the lifetime of the entries relative to the time of the last write access performed on them. If it is not 0, the entries whose lifetime exceeds this period (without any write access performed on them during this period) are expired and evicted automatically. An individual entry may have its own lifetime limit by using one of the methods accepting a TTL; see Evicting Specific Entries section. If there is no TTL value provided for the individual entry, it inherits the value set for this element. Valid values are integers between 0 and Integer.MAX VALUE. Its default value is 0, which means infinite (no expiration and eviction). If it is not 0, entries are evicted regardless of the set eviction-policy described below.

  • max-idle-seconds: Maximum time in seconds for each entry to stay idle in the map. It limits the lifetime of the entries relative to the time of the last read or write access performed on them. The entries whose idle period exceeds this limit are expired and evicted automatically. An entry is idle if no get, put, EntryProcessor.process or containsKey is called on it. Valid values are integers between 0 and Integer.MAX VALUE. Its default value is 0, which means infinite.

    Setting this property to 1 second expires the entry after 1 second, regardless of the operations done on that entry in-between, due to the loss of millisecond resolution on the entry timestamps. Assume that you create a record at time = 1 second (1000 milliseconds) and access it at wall clock time 1100 milliseconds and then again at 1400 milliseconds. In this case, the entry is deemed as not touched. So, setting this property to 1 second is not supported.
    Both time-to-live-seconds and max-idle-seconds may be used simultaneously on the map entries. In that case, the entry is considered expired if at least one of the policies marks it as expired.
  • eviction-policy: Eviction policy to be applied when the size of map grows larger than the value specified by the size element described below. Valid values are:

    • NONE: Default policy. If set, no items are evicted and the property size described below is ignored. You still can combine it with time-to-live-seconds and max-idle-seconds.

    • LRU: Least Recently Used.

    • LFU: Least Frequently Used.

      Apart from the above values, you can also develop and use your own eviction policy. See the Custom Eviction Policy section.

  • size: Maximum size of the map. When maximum size is reached, the map is evicted based on the policy defined. Valid values are integers between 0 and Integer.MAX VALUE. Its default value is 0, which means infinite. If you want size to work, set the eviction-policy property to a value other than NONE. Its attributes are described below.

    • PER_NODE: Maximum number of map entries in each cluster member. This is the default policy.

      <eviction max-size-policy="PER_NODE" size="5000"/>

    • PER_PARTITION: Maximum number of map entries within each partition. Storage size depends on the partition count in a cluster member. This attribute should not be used often. For instance, avoid using this attribute with a small cluster. If the cluster is small, it hosts more partitions, and therefore map entries, than that of a larger cluster. Thus, for a small cluster, eviction of the entries decreases performance (the number of entries is large).

      <eviction max-size-policy="PER_PARTITION" size="27100" />

    • USED_HEAP_SIZE: Maximum used heap size in megabytes per map for each Hazelcast instance. Please note that this policy does not work when in-memory format is set to OBJECT, since the memory footprint cannot be determined when data is put as OBJECT.

      <eviction max-size-policy="USED_HEAP_SIZE" size="4096" />

    • USED_HEAP_PERCENTAGE: Maximum used heap size percentage per map for each Hazelcast instance. If, for example, a JVM is configured to have 1000 MB and this value is 10, then the map entries will be evicted when used heap size exceeds 100 MB. Please note that this policy does not work when in-memory format is set to OBJECT, since the memory footprint cannot be determined when data is put as OBJECT.

      <eviction max-size-policy="USED_HEAP_PERCENTAGE" size="10" />

    • FREE_HEAP_SIZE: Minimum free heap size in megabytes for each JVM.

      <eviction max-size-policy="FREE_HEAP_SIZE" size="512" />

    • FREE_HEAP_PERCENTAGE: Minimum free heap size percentage for each JVM. If, for example, a JVM is configured to have 1000 MB and this value is 10, then the map entries will be evicted when free heap size is below 100 MB.

      <eviction max-size-policy="FREE_HEAP_PERCENTAGE" size="10" />

    • USED_NATIVE_MEMORY_SIZE: (Hazelcast IMDG Enterprise HD) Maximum used native memory size in megabytes per map for each Hazelcast instance.

      <eviction max-size-policy="USED_NATIVE_MEMORY_SIZE" size="1024" />

    • USED_NATIVE_MEMORY_PERCENTAGE: (Hazelcast IMDG Enterprise HD) Maximum used native memory size percentage per map for each Hazelcast instance.

      <eviction max-size-policy="USED_NATIVE_MEMORY_PERCENTAGE" size="65" />

    • FREE_NATIVE_MEMORY_SIZE: (Hazelcast IMDG Enterprise HD) Minimum free native memory size in megabytes for each Hazelcast instance.

      <eviction max-size-policy="FREE_NATIVE_MEMORY_SIZE" size="256" />

    • FREE_NATIVE_MEMORY_PERCENTAGE: (Hazelcast IMDG Enterprise HD) Minimum free native memory size percentage for each Hazelcast instance.

      <eviction max-size-policy="FREE_NATIVE_MEMORY_PERCENTAGE" size="5" />

To put it briefly, Hazelcast maps have no restrictions on the size and may grow arbitrarily large, by default. When it comes to reducing the size of a map, there are two concepts: expiration and eviction.

Expiration puts a limit on the maximum lifetime of an entry stored inside the map. When the entry expires it cannot be retrieved from the map any longer and at some point in time it will be cleaned out from the map to free up the memory. Expiration, and hence the eviction based on the expiration, can be configured using the element time-to-live-seconds and max-idle-seconds as described above.

Eviction puts a limit on the maximum size of the map. If the size of the map grows larger than the maximum allowed size, an eviction policy decides which item to evict from the map to reduce its size. The maximum allowed size can be configured using the element size and the eviction policy can be configured using the element eviction-policy as described above.

Eviction and expiration can be used together. In this case, the expiration configurations (time-to-live-seconds and max-idle-seconds) continue to work as usual cleaning out the expired entries regardless of the map size. Note that locked map entries are not the subjects for eviction and expiration.

Example Eviction Configurations
<hazelcast>
    ...
    <map name="documents">
        <eviction eviction-policy="LRU" max-size-policy="PER_NODE" size="10000"/>
        <max-idle-seconds>60</max-idle-seconds>
    </map>
    ...
</hazelcast>

In the above example, documents map starts to evict its entries from a member when the map size exceeds 10000 in that member. Then the entries least recently used will be evicted. The entries not used for more than 60 seconds will be evicted as well.

And the following is an example eviction configuration for a map having NATIVE as the in-memory format:

<hazelcast>
    ...
    <map name="nativeMap*">
        <in-memory-format>NATIVE</in-memory-format>
        <eviction max-size-policy="USED_NATIVE_MEMORY_PERCENTAGE" eviction-policy="LFU" size="99"/>
    </map>
    ...
</hazelcast>
Evicting Specific Entries

The eviction policies and configurations explained above apply to all the entries of a map. The entries that meet the specified eviction conditions are evicted.

If you want to evict some specific map entries, you can use the ttl and ttlUnit parameters of the method map.put(). An example code line is given below.

myMap.put( "1", "John", 50, TimeUnit.SECONDS )

The map entry with the key "1" will be evicted 50 seconds after it is put into myMap.

You may also use map.setTTL method to alter the time-to-live value of an existing entry. It is done as follows:

myMap.setTTL( "1", 50, TimeUnit.SECONDS )

In addition to the ttl, you may also specify a maximum idle timeout for specific map entries using the maxIdle and maxIdleUnit parameters:

myMap.put( "1", "John", 50, TimeUnit.SECONDS, 40, TimeUnit.SECONDS )

Here ttl is set as 50 seconds and maxIdle is set as 40 seconds. The entry is considered to be evicted if at least one of these policies marks it as expired. If you want to specify only the maxIdle parameter, you need to set ttl as 0 seconds.

Evicting All Entries

To evict all keys from the map except the locked ones, use the method evictAll(). If a MapStore is defined for the map, deleteAll is not called by evictAll. If you want to call the method deleteAll, use clear().

An example is given below.

final int numberOfKeysToLock = 4;
final int numberOfEntriesToAdd = 1000;

HazelcastInstance node1 = Hazelcast.newHazelcastInstance();
HazelcastInstance node2 = Hazelcast.newHazelcastInstance();

IMap<Integer, Integer> map = node1.getMap( "map" );
for (int i = 0; i < numberOfEntriesToAdd; i++) {
    map.put(i, i);
}

for (int i = 0; i < numberOfKeysToLock; i++) {
    map.lock(i);
}

// should keep locked keys and evict all others.
map.evictAll();

System.out.printf("# After calling evictAll...\n");
System.out.printf("# Expected map size\t: %d\n", numberOfKeysToLock);
System.out.printf("# Actual map size\t: %d\n", map.size());
Only EVICT_ALL event is fired for any registered listeners.
Forced Eviction

Hazelcast IMDG Enterprise

Hazelcast may use forced eviction in the cases when the eviction explained in Understanding Map Eviction is not enough to free up your memory. Note that this is valid if you are using Hazelcast IMDG Enterprise and you set your in-memory format to NATIVE.

The forced eviction mechanism is explained below as steps in the given order:

  • When the normal eviction is not enough, forced eviction is triggered and first it tries to evict approx. 20% of the entries from the current partition. It retries this five times.

  • If the result of above step is still not enough, forced eviction applies the above step to all maps. This time it might perform eviction from some other partitions too, provided that they are owned by the same thread.

  • If that is still not enough to free up your memory, it evicts not the 20% but all the entries from the current partition.

  • If that is not enough, it will evict all the entries from the other data structures; from the partitions owned by the local thread.

Finally, when all the above steps are not enough, Hazelcast throws a native OutOfMemoryException.

When you have an evictable cache/map, you should safely put entries to it without facing with any memory shortages. Forced eviction helps to achieve this. Regular eviction removes one entry at a time while forced eviction can remove multiple entries, which can even be owned by another caches/maps.

Custom Eviction Policy

Apart from the policies such as LRU and LFU, which Hazelcast provides out-of-the-box, you can develop and use your own eviction policy.

To achieve this, you need to provide an implementation of MapEvictionPolicyComparator as in the following OddEvictor example:

public class MapCustomEvictionPolicyComparator {

    public static void main(String[] args) {
        Config config = new Config();
        config.getMapConfig("test")
                .getEvictionConfig()
                .setComparator(new OddEvictor())
                .setMaxSizePolicy(PER_NODE)
                .setSize(10000);

        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        IMap<Integer, Integer> map = instance.getMap("test");

        final Queue<Integer> oddKeys = new ConcurrentLinkedQueue<Integer>();
        final Queue<Integer> evenKeys = new ConcurrentLinkedQueue<Integer>();

        map.addEntryListener((EntryEvictedListener<Integer, Integer>) event -> {
            Integer key = event.getKey();
            if (key % 2 == 0) {
                evenKeys.add(key);
            } else {
                oddKeys.add(key);
            }
        }, false);

        // wait some more time to receive evicted-events
        parkNanos(SECONDS.toNanos(5));

        for (int i = 0; i < 15000; i++) {
            map.put(i, i);
        }

        String msg = "IMap uses sampling based eviction. After eviction"
                + " is completed, we are expecting number of evicted-odd-keys"
                + " should be greater than number of evicted-even-keys. \nNumber"
                + " of evicted-odd-keys = %d, number of evicted-even-keys = %d";
        out.println(format(msg, oddKeys.size(), evenKeys.size()));

        instance.shutdown();
    }

    /**
     * Odd evictor tries to evict odd keys first.
     */
    private static class OddEvictor
            implements MapEvictionPolicyComparator<Integer, Integer> {

        @Override
        public int compare(EntryView<Integer, Integer> e1,
                           EntryView<Integer, Integer> e2) {

            Integer key1 = e1.getKey();
            if (key1 % 2 != 0) {
                return -1;
            }

            Integer key2 = e2.getKey();
            if (key2 % 2 != 0) {
                return 1;
            }

            return 0;
        }

    }
}

Then you can enable your policy by setting it via the method MapConfig.getEvictionConfig().setComparatorClassName() programmatically or via XML declaratively. Following is the example declarative configuration for the eviction policy OddEvictor implemented above:

<hazelcast>
    ...
    <map name="test">
        ...
        <eviction comparator-class-name="com.mycompany.OddEvictor"/>
        ...
    </map>
</hazelcast>

If you Hazelcast with Spring, you can enable your policy as shown below.

<hz:map name="test">
    <hz:map-eviction comparator-class-name="com.package.OddEvictor"/>
</hz:map>

7.2.5. Setting In-Memory Format

IMap (and a few other Hazelcast data structures, such as ICache) has an in-memory-format configuration option. By default, Hazelcast stores data into memory in binary (serialized) format. Sometimes it can be efficient to store the entries in their object form, especially in cases of local processing, such as entry processor and queries.

Specify the in-memory-format element in the configuration to set how the data will be stored in the memory. You have the following format options:

  • BINARY (default): The data (both the key and value) is stored in serialized binary format. You can use this option if you mostly perform regular map operations, such as put and get.

  • OBJECT: The data is stored in deserialized form. This configuration is good for maps where entry processing and queries form the majority of all operations and the objects are complex, making the serialization cost comparatively high. By storing objects, entry processing does not contain the deserialization cost. Note that when you use OBJECT as the in-memory format, the key is still stored in binary format and the value is stored in object format.

  • NATIVE: (Hazelcast IMDG Enterprise HD) This format behaves the same as BINARY, however, instead of heap memory, key and value are stored in the off-heap memory.

Regular operations like get rely on the object instance. When the OBJECT format is used and a get is performed, the map does not return the stored instance, but creates a clone. Therefore, this whole get operation first includes a serialization on the member owning the instance and then a deserialization on the member calling the instance. When the BINARY format is used, only a deserialization is required; BINARY is faster.

Similarly, a put operation is faster when the BINARY format is used. If the format was OBJECT, the map would create a clone of the instance, and there would first be a serialization and then a deserialization. When BINARY is used, only a deserialization is needed.

If a value is stored in OBJECT format, a change on a returned value does not affect the stored instance. In this case, the returned instance is not the actual one but a clone. Therefore, changes made on an object after it is returned will not reflect on the actual stored data. Similarly, when a value is written to a map and the value is stored in OBJECT format, it will be a copy of the put value. Therefore, changes made on the object after it is stored will not reflect on the stored data.

7.2.6. Using High-Density Memory Store with Map

Hazelcast IMDG Enterprise HD

Hazelcast instances are Java programs. In case of BINARY and OBJECT in-memory formats, Hazelcast stores your distributed data into the heap of its server instances. Java heap is subject to garbage collection (GC). In case of larger heaps, garbage collection might cause your application to pause for tens of seconds (even minutes for really large heaps), badly affecting your application performance and response times.

As the data gets bigger, you either run the application with larger heap, which would result in longer GC pauses or run multiple instances with smaller heap which can turn into an operational nightmare if the number of such instances becomes very high.

To overcome this challenge, Hazelcast offers High-Density Memory Store for your maps. You can configure your map to use High-Density Memory Store by setting the in-memory format to NATIVE. The following snippet is the declarative configuration example.

<hazelcast>
    ...
    <map name="nativeMap*">
        <in-memory-format>NATIVE</in-memory-format>
    </map>
    ...
</hazelcast>

Keep in mind that you should have already enabled the High-Density Memory Store usage for your cluster. See the Configuring High-Density Memory Store section.

You can also benefit from the persistent memory technologies such as Intel® Optane™ DC to be used by the High-Density Memory Store. See the Using Persistent Memory section.

Required configuration changes when using NATIVE

Note that the eviction mechanism is different for NATIVE in-memory format. The new eviction algorithm for map with High-Density Memory Store is similar to that of JCache with High-Density Memory Store and is described here.

+

<hazelcast>
    ...
    <map name="nativeMap*">
        <in-memory-format>NATIVE</in-memory-format>
        <eviction-percentage>25</eviction-percentage> <--! NO IMPACT with NATIVE -->
    </map>
    ...
</hazelcast>

+ * These IMap eviction policies for size cannot be used: FREE_HEAP_PERCENTAGE, FREE_HEAP_SIZE, USED_HEAP_PERCENTAGE, USED_HEAP_SIZE.

+ * Near Cache eviction policy ENTRY_COUNT cannot be used for max-size-policy.

See the High-Density Memory Store section for more information.

7.2.7. Metadata Policy

Hazelcast IMap offers automatic preprocessing of various data types on the update time to make queries faster. It is currently supported only by the HazelcastJsonValue type. When metadata creation is on, IMap creates additional metadata about the objects of supported types and uses this metadata during the querying. It does not affect the latency and throughput of the object of any type except the supported types.

This feature is on by default. You can configure it using the metadata-policy configuration element.

Declarative Configuration:

<hazelcast>
    ...
    <map name="map-a">
        <!--
        valid values for metadata-policy are:
          - OFF
          - CREATE_ON_UPDATE (default)
        -->
        <metadata-policy>OFF</metadata-policy>
    </map>
    ...
</hazelcast>

Programmatic Configuration:

MapConfig mapConfig = new MapConfig();
mapConfig.setMetadataPolicy(MetadataPolicy.OFF);

7.2.8. Loading and Storing Persistent Data

Hazelcast allows you to load and store the distributed map entries from/to a persistent data store such as a relational database. To do this, you can use Hazelcast’s MapStore and MapLoader interfaces.

When you provide a MapLoader implementation and request an entry (IMap.get()) that does not exist in memory, MapLoader's load method loads that entry from the data store. This loaded entry is placed into the map and will stay there until it is removed or evicted.

When a MapStore implementation is provided, an entry is also put into a user defined data store.

Data store needs to be a centralized system that is accessible from all Hazelcast members. Persistence to a local file system is not supported.
Also note that the MapStore interface extends the MapLoader interface as you can see in the interface code.
Starting with Hazelcast IMDG 3.11, all loads can be listened via EntryLoadedListener.

Following is a MapStore example.

public class PersonMapStore implements MapStore<Long, Person> {

    private final Connection con;
    private final PreparedStatement allKeysStatement;

    public PersonMapStore() {
        try {
            con = DriverManager.getConnection("jdbc:hsqldb:mydatabase", "SA", "");
            con.createStatement().executeUpdate(
                    "create table if not exists person (id bigint not null, name varchar(45), primary key (id))");
            allKeysStatement = con.prepareStatement("select id from person");
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public synchronized void delete(Long key) {
        System.out.println("Delete:" + key);
        try {
            con.createStatement().executeUpdate(
                    format("delete from person where id = %s", key));
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public synchronized void store(Long key, Person value) {
        try {
            con.createStatement().executeUpdate(
                    format("insert into person values(%s,'%s')", key, value.getName()));
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public synchronized void storeAll(Map<Long, Person> map) {
        for (Map.Entry<Long, Person> entry : map.entrySet()) {
            store(entry.getKey(), entry.getValue());
        }
    }

    public synchronized void deleteAll(Collection<Long> keys) {
        for (Long key : keys) {
            delete(key);
        }
    }

    public synchronized Person load(Long key) {
        try {
            ResultSet resultSet = con.createStatement().executeQuery(
                    format("select name from person where id =%s", key));
            try {
                if (!resultSet.next()) {
                    return null;
                }
                String name = resultSet.getString(1);
                return new Person(key, name);
            } finally {
                resultSet.close();
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public synchronized Map<Long, Person> loadAll(Collection<Long> keys) {
        Map<Long, Person> result = new HashMap<Long, Person>();
        for (Long key : keys) {
            result.put(key, load(key));
        }
        return result;
    }

    public Iterable<Long> loadAllKeys() {
        return new StatementIterable<Long>(allKeysStatement);
    }
}
During the initial loading process, MapStore uses a thread different from the partition threads that are used by the ExecutorService. After the initialization is completed, the map.get method looks up any nonexistent value from the database in a partition thread, or the map.put method looks up the database to return the previously associated value for a key also in a partition thread.

Entries loaded by MapLoader do not have a set time-to-live property. Therefore, they live until evicted or explicitly removed. It is possible to enforce time-to-live on the entries by using EntryLoader. EntryLoader allows you to set time-to-live values per key before handing the values to Hazelcast. Therefore, you can store and load key specific time-to-live values in the external storage.

Similar to EntryLoader, in order to store custom expiration times associated with the entries, you may use EntryStore. EntryStore allows you to retrieve associated expiration date for each entry. The expiration date is an offset from an epoch in milliseconds. Epoch is January 1, 1970 UTC which is used by System.currentTimeMillis().

Although the expiration date is expressed in milliseconds, IMap has second granularity when it comes to expiration. Therefore, the expiration date is rounded to the nearest lower whole second.

EntryLoader and EntryStore extend from MapLoader and MapStore, respectively. Therefore, all features and configuration parameters of MapLoader and MapStore apply to them, too.

Following is an EntryStore example.

public class PersonEntryStore implements EntryStore<Long, Person> {

    private final Connection con;
    private final PreparedStatement allKeysStatement;

    public PersonEntryStore() {
        try {
            con = DriverManager.getConnection("jdbc:hsqldb:mydatabase", "SA", "");
            con.createStatement().executeUpdate(
                    "create table if not exists person (id bigint not null, name varchar(45), expiration-date bigint, primary key (id))");
            allKeysStatement = con.prepareStatement("select id from person");
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public synchronized void delete(Long key) {
        System.out.println("Delete:" + key);
        try {
            con.createStatement().executeUpdate(
                    format("delete from person where id = %s", key));
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public synchronized void store(Long key, MetadataAwareValue<Person> value) {
        try {
            con.createStatement().executeUpdate(
                    format("insert into person values(%s,'%s', %d)", key, value.getValue().getName(), value.getExpirationTime()));
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void storeAll(Map<Long, MetadataAwareValue<Person>> map) {
        for (Map.Entry<Long, MetadataAwareValue<Person>> entry : map.entrySet()) {
            store(entry.getKey(), entry.getValue());
        }
    }

    @Override
    public synchronized void deleteAll(Collection<Long> keys) {
        for (Long key : keys) {
            delete(key);
        }
    }

    @Override
    public synchronized MetadataAwareValue<Person> load(Long key) {
        try {
            ResultSet resultSet = con.createStatement().executeQuery(
                    format("select name,expiration-date from person where id =%s", key));
            try {
                if (!resultSet.next()) {
                    return null;
                }
                String name = resultSet.getString(1);
                Long expirationDate = resultSet.getLong(2);
                return new MetadataAwareValue<>(new Person(key, name), expirationDate);
            } finally {
                resultSet.close();
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public synchronized Map<Long, MetadataAwareValue<Person>> loadAll(Collection<Long> keys) {
        Map<Long, MetadataAwareValue<Person>> result = new HashMap<>();
        for (Long key : keys) {
            result.put(key, load(key));
        }
        return result;
    }

    public Iterable<Long> loadAllKeys() {
        return new StatementIterable<Long>(allKeysStatement);
    }
}
For more MapStore/MapLoader code samples, see here.

Hazelcast supports read-through, write-through and write-behind persistence modes, which are explained in the subsections below.

Using Read-Through Persistence

If an entry does not exist in memory when an application asks for it, Hazelcast asks the loader implementation to load that entry from the data store. If the entry exists there, the loader implementation gets it, hands it to Hazelcast, and Hazelcast puts it into memory. This is read-through persistence mode.

As you can remember from the introduction of this section, the IMap.get() method triggers the load() method in your MapLoader implementation if an entry does not exist in the memory. In this case, note that the IMap.get() method does not create backup copies for such entries, when the mode is read-through persistence: there is no need for backups for these entries since if the primary entry is lost, then a read for the key triggers the load() method and loads the entry from the persistence layer.

Setting Write-Through Persistence

MapStore can be configured to be write-through by setting the write-delay-seconds property to 0. This means the entries are put to the data store synchronously.

In this mode, when the map.put(key,value) call returns:

  • MapStore.store(key,value) is successfully called so the entry is persisted.

  • In-Memory entry is updated.

  • In-Memory backup copies are successfully created on other cluster members (if backup-count is greater than 0).

If MapStore throws an exception then the exception is propagated to the original put or remove call in the form of RuntimeException.

There is a key difference in the behaviors of map.remove(key) and map.delete(key), i.e., the latter results in MapStore.delete(key) to be invoked whereas the former only removes the entry from IMap.
Setting Write-Behind Persistence

You can configure MapStore as write-behind by setting the write-delay-seconds property to a value bigger than 0. This means the modified entries will be put to the data store asynchronously after a configured delay.

In write-behind mode, Hazelcast coalesces updates on a specific key by default, which means it applies only the last update on that key. However, you can set MapStoreConfig.setWriteCoalescing() to FALSE and you can store all updates performed on a key to the data store.
When you set MapStoreConfig.setWriteCoalescing() to FALSE, after you reached per-node maximum write-behind-queue capacity, subsequent put operations will fail with ReachedMaxSizeException. This exception is thrown to prevent uncontrolled grow of write-behind queues. You can set per-node maximum capacity using the system property hazelcast.map.write.behind.queue.capacity. See the System Properties appendix for information on this property and how to set the system properties.

In write-behind mode, when the map.put(key,value) call returns:

  • in-memory entry is updated

  • in-memory backup copies are successfully created on the other cluster members (if backup-count is greater than 0)

  • the entry is marked as dirty so that after write-delay-seconds, it can be persisted with MapStore.store(key,value) call

  • and for fault tolerance, dirty entries are stored in a queue on the primary member and also on a back-up member.

The same behavior goes for the map.remove(key), the only difference is that MapStore.delete(key) is called when the entry will be deleted.

If MapStore throws an exception, then Hazelcast tries to store the entry again. If the entry still cannot be stored, a log message is printed and the entry is re-queued.

For batch write operations, which are only allowed in write-behind mode, Hazelcast calls the MapStore.storeAll(map) and MapStore.deleteAll(collection) methods to do all writes in a single call.

If a map entry is marked as dirty, meaning that it is waiting to be persisted to the MapStore in a write-behind scenario, the eviction process forces the entry to be stored. This way you have control over the number of entries waiting to be stored, and thus you can prevent a possible OutOfMemory exception.
MapStore or MapLoader implementations should not use Hazelcast Map/Queue/MultiMap/List/Set operations. Your implementation should only work with your data store. Otherwise, you may get into deadlock situations.

Here is an example configuration:

<hazelcast>
    ...
    <map name="default">
        <map-store enabled="true" initial-mode="LAZY">
            <class-name>com.hazelcast.examples.DummyStore</class-name>
            <write-delay-seconds>60</write-delay-seconds>
            <write-batch-size>1000</write-batch-size>
            <write-coalescing>true</write-coalescing>
        </map-store>
    </map>
    ...
</hazelcast>

The following are the descriptions of MapStore configuration elements and attributes:

  • class-name: Name of the class implementing MapLoader and/or MapStore.

  • write-delay-seconds: Number of seconds to delay to call the MapStore.store(key, value). If the value is zero then it is write-through, so the MapStore.store(key,value) method is called as soon as the entry is updated. Otherwise, it is write-behind; so the updates will be stored after the write-delay-seconds value by calling the Hazelcast.storeAll(map) method. Its default value is 0.

  • write-batch-size: Used to create batch chunks when writing map store. In default mode, all map entries are tried to be written in one go. To create batch chunks, the minimum meaningful value for write-batch-size is 2. For values smaller than 2, it works as in default mode.

  • write-coalescing: In write-behind mode, Hazelcast coalesces updates on a specific key by default; it applies only the last update on it. You can set this element to false to store all updates performed on a key to the data store.

  • enabled: True to enable this map-store, false to disable. Its default value is true.

  • initial-mode: Sets the initial load mode. LAZY is the default load mode, where load is asynchronous. EAGER means load is blocked till all partitions are loaded. See the Initializing Map on Startup section for more details.

Storing Entries to Multiple Maps

A configuration can be applied to more than one map using wildcards (see Using Wildcards), meaning that the configuration is shared among the maps. But MapStore does not know which entries to store when there is one configuration applied to multiple maps.

To store entries when there is one configuration applied to multiple maps, use Hazelcast’s MapStoreFactory interface. Using the MapStoreFactory interface, MapStores for each map can be created when a wildcard configuration is used. Example code is shown below.

Config config = new Config();
MapConfig mapConfig = config.getMapConfig( "*" );
MapStoreConfig mapStoreConfig = mapConfig.getMapStoreConfig();
mapStoreConfig.setFactoryImplementation( new MapStoreFactory<Object, Object>() {
    @Override
    public MapLoader<Object, Object> newMapStore( String mapName, Properties properties ) {
        return null;
    }
});

To initialize the MapLoader implementation with the given map name, configuration properties and the Hazelcast instance, implement the MapLoaderLifecycleSupport interface. This interface has the methods init() and destroy().

The method init() initializes the MapLoader implementation. Hazelcast calls this method when the map is first used on the Hazelcast instance. The MapLoader implementation can initialize the required resources for implementing MapLoader such as reading a configuration file or creating a database connection.

Hazelcast calls the method destroy() before shutting down. You can override this method to cleanup the resources held by this MapLoader implementation, such as closing the database connections.

Initializing Map on Startup

To pre-populate the in-memory map when the map is first touched/used, use the MapLoader.loadAllKeys API.

If MapLoader.loadAllKeys returns NULL, then nothing will be loaded. Your MapLoader.loadAllKeys implementation can return all or some of the keys. For example, you may select and return only the keys which are most important to you that you want to load them while initializing the map. MapLoader.loadAllKeys is the fastest way of pre-populating the map since Hazelcast optimizes the loading process by having each cluster member load its owned portion of the entries.

The InitialLoadMode configuration parameter in the class MapStoreConfig has two values: LAZY and EAGER. If InitialLoadMode is set to LAZY, data is not loaded during the map creation. If it is set to EAGER, all the data is loaded while the map is created and everything becomes ready to use. Also, if you add indices to your map with the IndexConfig class or the addIndex method, then InitialLoadMode is overridden and MapStoreConfig behaves as if EAGER mode is on.

Here is the MapLoader initialization flow:

  1. When getMap() is first called from any member, initialization starts depending on the value of InitialLoadMode. If it is set to EAGER, initialization starts on all partitions as soon as the map is touched, i.e., all partitions are loaded when getMap is called. If it is set to LAZY, data is loaded partition by partition, i.e., each partition is loaded with its first touch.

  2. Hazelcast calls MapLoader.loadAllKeys() to get all your keys on one of the members.

  3. That member distributes keys to all other members in batches.

  4. Each member loads values of all its owned keys by calling MapLoader.loadAll(keys).

  5. Each member puts its owned entries into the map by calling IMap.putTransient(key,value).

If the load mode is LAZY and the clear() method is called (which triggers MapStore.deleteAll()), Hazelcast removes ONLY the loaded entries from your map and datastore. Since all the data is not loaded in this case (LAZY mode), please note that there may still be entries in your datastore.
If you do not want the MapStore start to load as soon as the first cluster member starts, you can use the system property hazelcast.initial.min.cluster.size. For example, if you set its value as 3, loading process will be blocked until all three members are completely up.
The return type of loadAllKeys() is changed from Set to Iterable with the release of Hazelcast 3.5. MapLoader implementations from previous releases are also supported and do not need to be adapted.
Loading Keys Incrementally

If the number of keys to load is large, it is more efficient to load them incrementally rather than loading them all at once. To support incremental loading, the MapLoader.loadAllKeys() method returns an Iterable which can be lazily populated with the results of a database query.

Hazelcast iterates over the Iterable and, while doing so, sends out the keys to their respective owner members. The Iterator obtained from MapLoader.loadAllKeys() may also implement the Closeable interface, in which case Iterator is closed once the iteration is over. This is intended for releasing resources such as closing a JDBC result set.

Forcing All Keys To Be Loaded

The method loadAll loads some or all keys into a data store in order to optimize the multiple load operations. The method has two signatures; the same method can take two different parameter lists. One signature loads the given keys and the other loads all keys. See the example code below.

final int numberOfEntriesToAdd = 1000;
final String mapName = LoadAll.class.getCanonicalName();
final Config config = createNewConfig(mapName);
final HazelcastInstance node = Hazelcast.newHazelcastInstance(config);
final IMap<Integer, Integer> map = node.getMap(mapName);

populateMap(map, numberOfEntriesToAdd);
System.out.printf("# Map store has %d elements\n", numberOfEntriesToAdd);

map.evictAll();
System.out.printf("# After evictAll map size\t: %d\n", map.size());

map.loadAll(true);
System.out.printf("# After loadAll map size\t: %d\n", map.size());
Post-Processing Objects in Map Store

In some scenarios, you may need to modify the object after storing it into the map store. For example, you can get an ID or version auto-generated by your database and then need to modify your object stored in the distributed map, but not to break the synchronization between the database and the data grid.

To post-process an object in the map store, implement the PostProcessingMapStore interface to put the modified object into the distributed map. This triggers an extra step of Serialization, so use it only when needed. (This is only valid when using the write-through map store configuration.)

Here is an example of post processing map store:

class ProcessingStore implements MapStore<Integer, Employee>, PostProcessingMapStore {
    @Override
    public void store( Integer key, Employee employee ) {
        EmployeeId id = saveEmployee();
        employee.setId( id.getId() );
    }
}
Please note that if you are using a post-processing map store in combination with the entry processors, post-processed values will not be carried to backups.
Accessing a Database Using Properties

You can prepare your own MapLoader to access a database such as Cassandra and MongoDB. For this, you can first declaratively specify the database properties in your hazelcast.xml configuration file and then implement the MapLoaderLifecycleSupport interface to pass those properties.

You can define the database properties, such as its URL and name, using the properties configuration element. The following is a configuration example for MongoDB:

<hazelcast>
    ...
    <map name="supplements">
        <map-store enabled="true" initial-mode="LAZY">
            <class-name>com.hazelcast.loader.YourMapStoreImplementation</class-name>
            <properties>
                <property name="mongo.url">mongodb://localhost:27017</property>
                <property name="mongo.db">mydb</property>
                <property name="mongo.collection">supplements</property>
            </properties>
        </map-store>
    </map>
    ...
</hazelcast>

After you specified the database properties in your configuration, you need to implement the MapLoaderLifecycleSupport interface and give those properties in the init() method, as shown below:

public class YourMapStoreImplementation implements MapStore<String, Supplement>, MapLoaderLifecycleSupport {

    private MongoClient mongoClient;
    private MongoCollection collection;

    public YourMapStoreImplementation() {
    }

    @Override
    public void init(HazelcastInstance hazelcastInstance, Properties properties, String mapName) {
        String mongoUrl = (String) properties.get("mongo.url");
        String dbName = (String) properties.get("mongo.db");
        String collectionName = (String) properties.get("mongo.collection");
        this.mongoClient = new MongoClient(new MongoClientURI(mongoUrl));
        this.collection = mongoClient.getDatabase(dbName).getCollection(collectionName);
    }

See the full example here.

MapStore and MapLoader Methods Triggered by IMap Operations

As it is explained in the above sections, you can configure Hazelcast maps to be backed by a map store to persist the entries. In this case many of the IMap methods call MapLoader or MapStore methods to load, store or remove data. This section summarizes these methods. Here are the Hazelcast IMap operations that may trigger the MapStore or MapLoader methods:

IMap Method Impact on the MapStore/MapLoader

flush()

If the map has a MapStore, this method flushes all the local dirty entries. It calls the MapStore.storeAll(Map) or MapStore.deleteAll(Collection) methods with the elements marked as dirty.

  • put()

  • putAll()

  • putAsync()

  • tryPut()

  • putIfAbsent()

These methods are used to put entries to the map. They call the MapLoader.load(Object) method for each entry not found in the memory to load the value from the map store backing the map. They also call the MapStore.store(Object, Object) method for each entry, if write-through persistence mode is configured before the entry is added into the memory.

  • set()

  • setAsync()

These methods put an entry into the map without returning the old value. They call the MapStore.store(Object, Object) method if write-through persistence mode is configured before the entry is added into the memory, to write the value into the map store.

remove()

Removes the mapping for a key from the map if it is present. It calls the MapLoader.load(Object) method if no value is found with key in the memory, to load the value from the map store backing the map. It also calls the MapStore.delete(Object) method if write-through persistence mode is configured before the value is removed from the memory, to remove the value from the map store.

  • removeAll()

  • delete()

  • removeAsync()

  • tryRemove()

These methods are used to remove entries from the map for various conditions. They call the MapStore.delete(Object) method if write-through persistence mode is configured before the value is removed from the memory, to remove the value from the map store.

  • setTtl

This method updates time-to-live of an existing entry. It calls the MapLoader.load(Object) method if no value is found in the memory. It also calls EntryStore.store(Object, MetadataAwareValue) with the entry whose time-to-live has been updated.

clear()

It clears the map and deletes the items from the backing map store. It calls the MapStore.deleteAll(Collection) method on each partition with the keys that the given partition stores.

replace()

It replaces the entry for a key only if currently mapped to a given value. It calls the MapStore.store(Object, Object) method if write-through persistence mode is configured before the value is stored in the memory, to write the value into the map store. 

  • executeOnKey()

  • executeOnKeys()

  • submitToKey()

  • executeOnAllEntries()

These methods apply the user defined entry processors to the entry or entries. They call the MapLoader.load(Object) method if the value with key is not found in the memory, to load the value from the map store backing the map. If the entry processor updates the entry and write-through persistence mode is configured, before the value is stored in memory, they call the MapStore.store(Object, Object) method to write the value into the map store. If the entry processor updates the entry’s value to null value and write-through persistence mode is configured, before the value is removed from the memory, they call the MapStore.delete(Object) method to delete the value from the map store.

7.2.9. Creating Near Cache for Map

The Hazelcast distributed map supports a local Near Cache for remotely stored entries to increase the performance of local read operations. See the Near Cache section for a detailed explanation of the Near Cache feature and its configuration.

7.2.10. Locking Maps

Hazelcast Distributed Map (IMap) is thread-safe to meet your thread safety requirements. When these requirements increase or you want to have more control on the concurrency, consider the Hazelcast solutions described here.

Consider the following example:

public class RacyUpdateMember {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Value> map = hz.getMap( "map" );
        String key = "1";
        map.put( key, new Value() );
        System.out.println( "Starting" );
        for ( int k = 0; k < 1000; k++ ) {
            if ( k % 100 == 0 ) System.out.println( "At: " + k );
            Value value = map.get( key );
            Thread.sleep( 10 );
            value.amount++;
            map.put( key, value );
        }
        System.out.println( "Finished! Result = " + map.get(key).amount );
    }

    static class Value implements Serializable {
        public int amount;
    }
}

If the above code is run by more than one cluster member simultaneously, a race condition is likely. You can solve this condition with Hazelcast using either pessimistic or optimistic locking.

Pessimistic Locking

One way to solve the race issue is by using pessimistic locking - lock the map entry until you are finished with it.

To perform pessimistic locking, use the lock mechanism provided by the Hazelcast distributed map, i.e., the map.lock and map.unlock methods. See the below example code.

public class PessimisticUpdateMember {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Value> map = hz.getMap( "map" );
        String key = "1";
        map.put( key, new Value() );
        System.out.println( "Starting" );
        for ( int k = 0; k < 1000; k++ ) {
            map.lock( key );
            try {
                Value value = map.get( key );
                Thread.sleep( 10 );
                value.amount++;
                map.put( key, value );
            } finally {
                map.unlock( key );
            }
        }
        System.out.println( "Finished! Result = " + map.get( key ).amount );
    }

    static class Value implements Serializable {
        public int amount;
    }
}

The IMap lock will automatically be collected by the garbage collector when the lock is released and no other waiting conditions exist on the lock.

The IMap lock is reentrant, but it does not support fairness.

In some cases, a client application connected to your cluster may cause the entries in a map to remain locked after the application has been restarted (which were already locked before such a restart). This can be due to the reasons such as incomplete/incorrect client implementations. In these cases, you can unlock the entries, either from the thread which locked them using the IMap.unlock() method, or check if the entry is locked using the IMap.isLock() method and then call IMap.forceUnlock().
For the above case, as a workaround, you can also kill all the applications connected to the cluster and use the Management Center’s scripting functionality to clear the map and release the locks (instead of using IMap.forceUnlock()). Keep in mind that the scripting functionality is limited to working with maps that have primitive key types, e.g., string keys and limited to relaying only a single string of output per member to the result panel in the Management Center.

Another way to solve the race issue is by acquiring a predictable Lock object from Hazelcast. This way, every value in the map can be given a lock, or you can create a stripe of locks.

Optimistic Locking

In Hazelcast, you can apply the optimistic locking strategy with the map’s replace method. This method compares values in object or data forms depending on the in-memory format configuration. If the values are equal, it replaces the old value with the new one. If you want to use your defined equals method, in-memory-format should be OBJECT. Otherwise, Hazelcast serializes objects to BINARY forms and compares them.

See the below example code.

The below example code is intentionally broken.
public class OptimisticMember {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Value> map = hz.getMap( "map" );
        String key = "1";
        map.put( key, new Value() );
        System.out.println( "Starting" );
        for ( int k = 0; k < 1000; k++ ) {
            if ( k % 10 == 0 ) System.out.println( "At: " + k );
            for (; ; ) {
                Value oldValue = map.get( key );
                Value newValue = new Value( oldValue );
                Thread.sleep( 10 );
                newValue.amount++;
                if ( map.replace( key, oldValue, newValue ) )
                    break;
            }
        }
        System.out.println( "Finished! Result = " + map.get( key ).amount );
    }

    static class Value implements Serializable {
        public int amount;

        public Value() {
        }

        public Value( Value that ) {
            this.amount = that.amount;
        }

        public boolean equals( Object o ) {
            if ( o == this ) return true;
            if ( !( o instanceof Value ) ) return false;
            Value that = ( Value ) o;
            return that.amount == this.amount;
        }
    }
}
Pessimistic vs. Optimistic Locking

The locking strategy you choose depends on your locking requirements.

Optimistic locking is better for mostly read-only systems. It has a performance boost over pessimistic locking.

Pessimistic locking is good if there are lots of updates on the same key. It is more robust than optimistic locking from the perspective of data consistency.

In Hazelcast, use IExecutorService to submit a task to a key owner, or to a member or members. This is the recommended way to perform task executions, rather than using pessimistic or optimistic locking techniques. IExecutorService has fewer network hops and less data over wire, and tasks are executed very near to the data. See the Data Affinity section.

Solving the ABA Problem

The ABA problem occurs in environments when a shared resource is open to change by multiple threads. Even if one thread sees the same value for a particular key in consecutive reads, it does not mean that nothing has changed between the reads. Another thread may change the value, do work and change the value back, while the first thread thinks that nothing has changed.

To prevent these kind of problems, you can assign a version number and check it before any write to be sure that nothing has changed between consecutive reads. Although all the other fields are equal, the version field will prevent objects from being seen as equal. This is the optimistic locking strategy; it is used in environments that do not expect intensive concurrent changes on a specific key.

In Hazelcast, you can apply the optimistic locking strategy with the map replace method.

Lock Split-Brain Protection with Pessimistic Locking

Locks can be configured to check the number of currently present members before applying a locking operation. If the check fails, the lock operation fails with a SplitBrainProtectionException (see the Split-Brain Protection section). As pessimistic locking uses lock operations internally, it also uses the configured lock split-brain protection. This means that you can configure a lock split-brain protection with the same name or a pattern that matches the map name. Note that the split-brain protection for IMap locking actions can be different from the split-brain protection for other IMap actions.

The following actions check for lock split-brain protection before being applied:

  • IMap.lock(K) and IMap.lock(K, long, java.util.concurrent.TimeUnit)

  • IMap.isLocked()

  • IMap.tryLock(K), IMap.tryLock(K, long, java.util.concurrent.TimeUnit) and IMap.tryLock(K, long, java.util.concurrent.TimeUnit, long, java.util.concurrent.TimeUnit)

  • IMap.unlock()

  • IMap.forceUnlock()

  • MultiMap.lock(K) and MultiMap.lock(K, long, java.util.concurrent.TimeUnit)

  • MultiMap.isLocked()

  • MultiMap.tryLock(K), MultiMap.tryLock(K, long, java.util.concurrent.TimeUnit) and MultiMap.tryLock(K, long, java.util.concurrent.TimeUnit, long, java.util.concurrent.TimeUnit)

  • MultiMap.unlock()

  • MultiMap.forceUnlock()

An example of declarative configuration:

<hazelcast>
    ...
    <map name="myMap">
        <split-brain-protection-ref>map-actions-split-brain-protection</split-brain-protection-ref>
    </map>
    <lock name="myMap">
        <split-brain-protection-ref>map-lock-actions-split-brain-protection</split-brain-protection-ref>
    </lock>
    ...
</hazelcast>

Here the configured map uses the map-lock-actions-split-brain-protection for map lock actions and the map-actions-split-brain-protection for other map actions.

7.2.11. Accessing Map and Entry Statistics

You can retrieve the statistics of the map in your Hazelcast IMDG member using the getLocalMapStats() method, which is the programmatic approach. It returns information such as primary and backup entry count, last update time and locked entry count. If you need the cluster-wide map statistics, you can get the local map statistics from all members of the cluster and combine them. Alternatively, you can see the map statistics on the Hazelcast Management Center.

To be able to retrieve the map statistics, the statistics-enabled element under the map configuration should be set as true, which is the default value:

<hazelcast>
    ...
    <map name="myMap">
        <statistics-enabled>true</statistics-enabled>
    </map>
    ...
</hazelcast>

When this element is set to false, the statistics are not gathered for the map and cannot be seen on the Hazelcast Management Center, nor retrieved by the getLocalMapStats() method.

Hazelcast also keeps statistics about each map entry, such as creation time, last update time, last access time, and number of hits and version. To access the map entry statistics, use an IMap.getEntryView(key) call. Here is an example.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
EntryView entry = hz.getMap( "quotes" ).getEntryView( "1" );
System.out.println ( "size in memory  : " + entry.getCost() );
System.out.println ( "creationTime    : " + entry.getCreationTime() );
System.out.println ( "expirationTime  : " + entry.getExpirationTime() );
System.out.println ( "number of hits  : " + entry.getHits() );
System.out.println ( "lastAccessedTime: " + entry.getLastAccessTime() );
System.out.println ( "lastUpdateTime  : " + entry.getLastUpdateTime() );
System.out.println ( "version         : " + entry.getVersion() );
System.out.println ( "key             : " + entry.getKey() );
System.out.println ( "value           : " + entry.getValue() );

7.2.13. Listening to Map Entries with Predicates

You can listen to the modifications performed on specific map entries. You can think of it as an entry listener with predicates. See the Listening for Map Events section for information on how to add entry listeners to a map.

The default backwards-compatible event publishing strategy only publishes UPDATED events when map entries are updated to a value that matches the predicate with which the listener was registered. This implies that when using the default event publishing strategy, your listener is not notified about an entry whose value is updated from one that matches the predicate to a new value that does not match the predicate.

Since version 3.7, when you configure Hazelcast members with property hazelcast.map.entry.filtering.natural.event.types set to true, handling of entry updates conceptually treats value transition as entry, update or exit with regards to the predicate value space. The following table compares how a listener is notified about an update to a map entry value under the default backwards-compatible Hazelcast behavior (when property hazelcast.map.entry.filtering.natural.event.types is not set or is set to false) versus when set to true:

Default

hazelcast.map.entry.filtering.natural.event.types = true

When old value matches predicate, new value does not match predicate

No event is delivered to entry listener

REMOVED event is delivered to entry listener

When old value matches predicate, new value matches predicate

UPDATED event is delivered to entry listener

UPDATED event is delivered to entry listener

When old value does not match predicate, new value does not match predicate

No event is delivered to entry listener

No event is delivered to entry listener

When old value does not match predicate, new value matches predicate

UPDATED event is delivered to entry listener

ADDED event is delivered to entry listener

As an example, let’s listen to the changes made on an employee with the surname "Smith". First, let’s create the Employee class.

public class Employee implements Serializable {

    private final String surname;

    public Employee(String surname) {
        this.surname = surname;
    }

    @Override
    public String toString() {
        return "Employee{" +
                "surname='" + surname + '\'' +
                '}';
    }
}

Then, let’s create a listener with predicate by adding a listener that tracks ADDED, UPDATED and REMOVED entry events with the surname predicate.

public class ListenerWithPredicate {

    public static void main(String[] args) {
        Config config = new Config();
        config.setProperty("hazelcast.map.entry.filtering.natural.event.types", "true");
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        IMap<String, String> map = hz.getMap("map");
        map.addEntryListener(new MyEntryListener(),
                Predicates.sql("surname=smith"), true);
        System.out.println("Entry Listener registered");
    }

    static class MyEntryListener
            implements EntryAddedListener<String, String>,
            EntryUpdatedListener<String, String>,
            EntryRemovedListener<String, String> {
        @Override
        public void entryAdded(EntryEvent<String, String> event) {
            System.out.println("Entry Added:" + event);
        }

        @Override
        public void entryRemoved(EntryEvent<String, String> event) {
            System.out.println("Entry Removed:" + event);
        }

        @Override
        public void entryUpdated(EntryEvent<String, String> event) {
            System.out.println("Entry Updated:" + event);
        }
    }
}

And now, let’s play with the employee "smith" and see how that employee is listened to.

public class Modify {

    public static void main(String[] args) {
        Config config = new Config();
        config.setProperty("hazelcast.map.entry.filtering.natural.event.types", "true");
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        IMap<String, Employee> map = hz.getMap("map");

        map.put("1", new Employee("smith"));
        map.put("2", new Employee("jordan"));
        System.out.println("done");
        System.exit(0);
    }
}

When you first run the class ListenerWithPredicate and then run Modify, an output similar to the one below appears.

entryAdded:EntryEvent {Address[192.168.178.10]:5702} key=1,oldValue=null,
value=Person{name= smith }, event=ADDED, by Member [192.168.178.10]:5702
See the Continuous Query Cache section for more information.

7.2.14. Removing Map Entries in Bulk with Predicates

You can remove all map entries that match your predicate. For this, Hazelcast offers the method removeAll(). Its syntax is as follows:

void removeAll(Predicate<K, V> predicate);

Normally the map entries matching the predicate are found with a full scan of the map. If the entries are indexed, Hazelcast uses the index search to find them. With index, you can expect that finding the entries is faster.

When removeAll() is called, ALL entries in the caller member’s Near Cache are also removed.

7.2.15. Adding Interceptors

You can add intercept operations and execute your own business logic synchronously blocking the operations. You can change the returned value from a get operation, change the value in put, or cancel operations by throwing an exception.

Interceptors are different from listeners. With listeners, you take an action after the operation has been completed. Interceptor actions are synchronous and you can alter the behavior of operation, change its values, or totally cancel it.

Map interceptors are chained, so adding the same interceptor multiple times to the same map can result in duplicate effects. This can easily happen when the interceptor is added to the map at member initialization, so that each member adds the same interceptor. When you add the interceptor in this way, be sure to implement the hashCode() method to return the same value for every instance of the interceptor. It is not strictly necessary, but it is a good idea to also implement equals() as this ensures that the map interceptor can be removed reliably.

The IMap API has two methods for adding and removing an interceptor to the map: addInterceptor and removeInterceptor. See also the MapInterceptor interface to learn about the methods used to intercept the changes in a map.

The following is an example usage.

public class MapInterceptorMember {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap("themap");
        map.addInterceptor(new MyMapInterceptor());

        map.put("1", "1");
        System.out.println(map.get("1"));
    }

    private static class MyMapInterceptor implements MapInterceptor {

        @Override
        public Object interceptGet(Object value) {
            return value + "-foo";
        }

        @Override
        public void afterGet(Object value) {
        }

        @Override
        public Object interceptPut(Object oldValue, Object newValue) {
            return null;
        }

        @Override
        public void afterPut(Object value) {
        }

        @Override
        public Object interceptRemove(Object removedValue) {
            return null;
        }

        @Override
        public void afterRemove(Object value) {
        }
    }
}

7.2.16. Preventing Out of Memory Exceptions

It is very easy to trigger an out of memory exception (OOME) with query-based map methods, especially with large clusters or heap sizes. For example, on a cluster with five members having 10 GB of data and 25 GB heap size per member, a single call of IMap.entrySet() fetches 50 GB of data and crashes the calling instance.

A call of IMap.values() may return too much data for a single member. This can also happen with a real query and an unlucky choice of predicates, especially when the parameters are chosen by a user of your application.

To prevent this, you can configure a maximum result size limit for query based operations. This is not a limit like SELECT * FROM map LIMIT 100, which you can achieve by a Paging Predicate. A maximum result size limit for query based operations is meant to be a last line of defense to prevent your members from retrieving more data than they can handle.

The Hazelcast component which calculates this limit is the QueryResultSizeLimiter.

Setting Query Result Size Limit

If the QueryResultSizeLimiter is activated, it calculates a result size limit per partition. Each QueryOperation runs on all partitions of a member, so it collects result entries as long as the member limit is not exceeded. If that happens, a QueryResultSizeExceededException is thrown and propagated to the calling instance.

This feature depends on an equal distribution of the data on the cluster members to calculate the result size limit per member. Therefore, there is a minimum value defined in QueryResultSizeLimiter.MINIMUM_MAX_RESULT_LIMIT. Configured values below the minimum will be increased to the minimum.

Local Pre-check

In addition to the distributed result size check in the QueryOperations, there is a local pre-check on the calling instance. If you call the method from a client, the pre-check is executed on the member that invokes the QueryOperations.

Since the local pre-check can increase the latency of a QueryOperation, you can configure how many local partitions should be considered for the pre-check, or you can deactivate the feature completely.

Scope of Result Size Limit

Besides the designated query operations, there are other operations that use predicates internally. Those method calls throw the QueryResultSizeExceededException as well. See the following matrix for the methods that are covered by the query result size limit.

Methods Covered by Query Result Size Limit
Configuring Query Result Size

The query result size limit is configured via the following system properties.

  • hazelcast.query.result.size.limit: Result size limit for query operations on maps. This value defines the maximum number of returned elements for a single query result. If a query exceeds this number of elements, a QueryResultSizeExceededException is thrown.

  • hazelcast.query.max.local.partition.limit.for.precheck: Maximum value of local partitions to trigger local pre-check for Predicates#alwaysTrue() query operations on maps.

See the System Properties appendix to see the full descriptions of these properties and how to set them.

7.3. Queue

Hazelcast distributed queue is an implementation of java.util.concurrent.BlockingQueue. Being distributed, Hazelcast distributed queue enables all cluster members to interact with it. Using Hazelcast distributed queue, you can add an item in one cluster member and remove it from another one.

7.3.1. Getting a Queue and Putting Items

Use the Hazelcast instance’s getQueue method to get the queue, then use the queue’s put method to put items into the queue.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
BlockingQueue<MyTask> queue = hazelcastInstance.getQueue( "tasks" );
queue.put( new MyTask() );
MyTask task = queue.take();

boolean offered = queue.offer( new MyTask(), 10, TimeUnit.SECONDS );
task = queue.poll( 5, TimeUnit.SECONDS );
if ( task != null ) {
    //process task
}

FIFO ordering applies to all queue operations across the cluster. The user objects (such as MyTask in the example above) that are enqueued or dequeued have to be Serializable.

Hazelcast distributed queue performs no batching while iterating over the queue. All items are copied locally and iteration occurs locally.

Hazelcast distributed queue uses ItemListener to listen to the events that occur when items are added to and removed from the queue. See the Listening for Item Events section for information on how to create an item listener class and register it.

7.3.2. Creating an Example Queue

The following example code illustrates a distributed queue that connects a producer and consumer.

Putting Items on the Queue

Let’s put one integer on the queue every second, 100 integers total.

public class ProducerMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IQueue<Integer> queue = hz.getQueue( "queue" );
        for ( int k = 1; k < 100; k++ ) {
            queue.put( k );
            System.out.println( "Producing: " + k );
            Thread.sleep(1000);
        }
        queue.put( -1 );
        System.out.println( "Producer Finished!" );
    }
}

Producer puts a -1 on the queue to show that the puts are finished.

Taking Items off the Queue

Now, let’s create a Consumer class to take a message from this queue, as shown below.

public class ConsumerMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IQueue<Integer> queue = hz.getQueue( "queue" );
        while ( true ) {
            int item = queue.take();
            System.out.println( "Consumed: " + item );
            if ( item == -1 ) {
                queue.put( -1 );
                break;
            }
            Thread.sleep( 5000 );
        }
        System.out.println( "Consumer Finished!" );
    }
}

As seen in the above example code, Consumer waits five seconds before it consumes the next message. It stops once it receives -1. Also note that Consumer puts -1 back on the queue before the loop is ended.

When you first start Producer and then start Consumer, items produced on the queue will be consumed from the same queue.

Balancing the Queue Operations

From the above example code, you can see that an item is produced every second and consumed every five seconds. Therefore, the consumer keeps growing. To balance the produce/consume operation, let’s start another consumer. This way, consumption is distributed to these two consumers, as seen in the example outputs below.

The second consumer is started. After a while, here is the first consumer output:

...
Consumed 13
Consumed 15
Consumer 17
...

Here is the second consumer output:

...
Consumed 14
Consumed 16
Consumer 18
...

In the case of a lot of producers and consumers for the queue, using a list of queues may solve the queue bottlenecks. In this case, be aware that the order of the messages sent to different queues is not guaranteed. Since in most cases strict ordering is not important, a list of queues is a good solution.

The items are taken from the queue in the same order they were put on the queue. However, if there is more than one consumer, this order is not guaranteed.
ItemIDs When Offering Items

Hazelcast gives an itemId for each item you offer, which is an incrementing sequence identification for the queue items. You should consider the following to understand the itemId assignment behavior:

  • When a Hazelcast member has a queue and that queue is configured to have at least one backup, and that member is restarted, the itemId assignment resumes from the last known highest itemId before the restart; itemId assignment does not start from the beginning for the new items.

  • When the whole cluster is restarted, the same behavior explained in the above consideration applies if your queue has a persistent data store (QueueStore). If the queue has QueueStore, the itemId for the new items are given, starting from the highest itemId found in the IDs returned by the method loadAllKeys. If the method loadAllKeys does not return anything, the itemIds starts from the beginning after a cluster restart.

  • The above two considerations mean there are no duplicated itemIds in the memory or in the persistent data store.

7.3.3. Setting a Bounded Queue

A bounded queue is a queue with a limited capacity. When the bounded queue is full, no more items can be put into the queue until some items are taken out.

To turn a Hazelcast distributed queue into a bounded queue, set the capacity limit with the max-size property. You can set the max-size property in the configuration, as shown below. The max-size element specifies the maximum size of the queue. Once the queue size reaches this value, put operations are blocked until the queue size goes below max-size, which happens when a consumer removes items from the queue.

Let’s set 10 as the maximum size of our example queue in Creating an Example Queue.

<hazelcast>
    ...
    <queue name="queue">
        <max-size>10</max-size>
    </queue>
    ...
</hazelcast>

When the producer is started, ten items are put into the queue and then the queue will not allow more put operations. When the consumer is started, it will remove items from the queue. This means that the producer can put more items into the queue until there are ten items in the queue again, at which point the put operation again becomes blocked.

In this example code, the producer is five times faster than the consumer. It will effectively always be waiting for the consumer to remove items before it can put more on the queue. For this example code, if maximum throughput is the goal, it would be a good option to start multiple consumers to prevent the queue from filling up.

7.3.4. Queueing with Persistent Datastore

Hazelcast allows you to load and store the distributed queue items from/to a persistent datastore using the interface QueueStore. If queue store is enabled, each item added to the queue is also stored at the configured queue store. When the number of items in the queue exceeds the memory limit, the subsequent items are persisted in the queue store, they are not stored in the queue memory.

The QueueStore interface enables you to store, load and delete queue items with methods like store, storeAll, load and delete. The following example class includes all of the QueueStore methods.

public class TheQueueStore implements QueueStore<Item> {

    @Override
    public void delete(Long key) {
        System.out.println("delete");
    }

    @Override
    public void store(Long key, Item value) {
        System.out.println("store");
    }

    @Override
    public void storeAll(Map<Long, Item> map) {
        System.out.println("store all");
    }

    @Override
    public void deleteAll(Collection<Long> keys) {
        System.out.println("deleteAll");
    }

    @Override
    public Item load(Long key) {
        System.out.println("load");
        return null;
    }

    @Override
    public Map<Long, Item> loadAll(Collection<Long> keys) {
        System.out.println("loadALl");
        return null;
    }

    @Override
    public Set<Long> loadAllKeys() {
        System.out.println("loadAllKeys");
        return null;
    }
}

Item must be serializable. The following is an example queue store configuration.

<hazelcast>
    ...
    <queue name="queue">
        <max-size>10</max-size>
        <queue-store>
            <class-name>com.hazelcast.QueueStoreImpl</class-name>
            <properties>
                <property name="binary">false</property>
                <property name="memory-limit">1000</property>
                <property name="bulk-load">500</property>
            </properties>
        </queue-store>
    </queue>
    ...
</hazelcast>

The following are the descriptions for each queue store property:

  • Binary: By default, Hazelcast stores the queue items in serialized form, and before it inserts the queue items into the queue store, it deserializes them. If you are not reaching the queue store from an external application, you might prefer that the items be inserted in binary form. Do this by setting the binary property to true: then you can get rid of the deserialization step, which is a performance optimization. The binary property is false by default.

  • Memory Limit: This is the number of items after which Hazelcast stores items only to the datastore. For example, if the memory limit is 1000, then the 1001st item is put only to the datastore. This feature is useful when you want to avoid out-of-memory conditions. If you want to always use memory, you can set it to Integer.MAX_VALUE. The default number for memory-limit is 1000.

  • Bulk Load: When the queue is initialized, items are loaded from QueueStore in bulks. Bulk load is the size of these bulks. The default value of bulk-load is 250.

7.3.5. Split-Brain Protection for Queue

Queues can be configured to check for a minimum number of available members before applying queue operations (see the Split-Brain Protection section). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

The following is a list of methods, grouped by the protection types, that support split-brain protection checks:

  • WRITE, READ_WRITE

    • Collection.addAll()

    • Collection.removeAll(), Collection.retainAll()

    • BlockingQueue.offer(), BlockingQueue.add(), BlockingQueue.put()

    • BlockingQueue.drainTo()

    • IQueue.poll(), Queue.remove(), IQueue.take()

    • BlockingQueue.remove()

  • READ, READ_WRITE

    • Collection.clear()

    • Collection.containsAll(), BlockingQueue.contains()

    • Collection.isEmpty()

    • Collection.iterator(), Collection.toArray()

    • Queue.peek(), Queue.element()

    • Collection.size()

    • BlockingQueue.remainingCapacity()

7.3.6. Configuring Queue

The following are examples of queue configurations. It includes the QueueStore configuration, which is explained in the Queueing with Persistent Datastore section.

Declarative Configuration:

<hazelcast>
    ...
    <queue name="default">
        <max-size>0</max-size>
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <empty-queue-ttl>-1</empty-queue-ttl>
        <item-listeners>
            <item-listener>com.hazelcast.examples.ItemListener</item-listener>
        </item-listeners>
        <statistics-enabled>true</statistics-enabled>
        <queue-store>
            <class-name>com.hazelcast.QueueStoreImpl</class-name>
            <properties>
                <property name="binary">false</property>
                <property name="memory-limit">10000</property>
                <property name="bulk-load">500</property>
            </properties>
        </queue-store>
        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
    </queue>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
QueueConfig queueConfig = config.getQueueConfig("default");
queueConfig.setName("MyQueue")
        .setBackupCount(1)
        .setMaxSize(0)
        .setStatisticsEnabled(true)
        .setSplitBrainProtectionName("splitbrainprotectionname");
queueConfig.getQueueStoreConfig()
        .setEnabled(true)
        .setClassName("com.hazelcast.QueueStoreImpl")
        .setProperty("binary", "false");
config.addQueueConfig(queueConfig);

Hazelcast distributed queue has one synchronous backup by default. By having this backup, when a cluster member with a queue goes down, another member having the backup of that queue will continue. Therefore, no items are lost. You can define the number of synchronous backups for a queue using the backup-count element in the declarative configuration. A queue can also have asynchronous backups: you can define the number of asynchronous backups using the async-backup-count element.

To set the maximum size of the queue, use the max-size element. To purge unused or empty queues after a period of time, use the empty-queue-ttl element. If you define a value (time in seconds) for the empty-queue-ttl element, then your queue will be destroyed if it stays empty or unused for the time in seconds that you give.

The following is the full list of queue configuration elements with their descriptions:

  • max-size: Maximum number of items in the queue. It is used to set an upper bound for the queue. You will not be able to put more items when the queue reaches to this maximum size whether you have a queue store configured or not.

  • backup-count: Number of synchronous backups. Queue is a non-partitioned data structure, so all entries of a queue reside in one partition. When this parameter is '1', it means there will be one backup of that queue in another member in the cluster. When it is '2', two members will have the backup.

  • async-backup-count: Number of asynchronous backups.

  • empty-queue-ttl: Used to purge unused or empty queues. If you define a value (time in seconds) for this element, then your queue will be destroyed if it stays empty or unused for that time.

  • item-listeners: Adds listeners (listener classes) for the queue items. You can also set the attribute include-value to true if you want the item event to contain the item values. You can set local to true if you want to listen to the items on the local member.

  • queue-store: Includes the queue store factory class name and the properties binary, memory limit and bulk load. See the Queueing with Persistent Datastore section.

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your queue. If set to false, you cannot collect statistics in your implementation (using getLocalQueueStats()) and also Hazelcast Management Center will not show them. Its default value is true.

  • split-brain-protection-ref : Name of the split-brain protection configuration that you want this queue to use.

7.4. MultiMap

Hazelcast MultiMap is a specialized map where you can store multiple values under a single key. Just like any other distributed data structure implementation in Hazelcast, MultiMap is distributed and thread-safe.

Hazelcast MultiMap is not an implementation of java.util.Map due to the difference in method signatures. It supports most features of Hazelcast Map except for indexing, predicates and MapLoader/MapStore. Yet, like Hazelcast Map, entries are almost evenly distributed onto all cluster members. When a new member joins the cluster, the same ownership logic used in the distributed map applies.

7.4.1. Getting a MultiMap and Putting an Entry

The following example creates a MultiMap and puts items into it:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
MultiMap<String , String > map = hazelcastInstance.getMultiMap( "map" );

map.put( "a", "1" );
map.put( "a", "2" );
map.put( "b", "3" );
System.out.println( "PutMember:Done" );

We use the getMultiMap method to create the MultiMap and then use the put method to put an entry into it.

Now let’s print the entries in this MultiMap using the following code:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
MultiMap<String, String> map = hazelcastInstance.getMultiMap("map");

map.put("a", "1");
map.put("a", "2");
map.put("b", "3");
System.out.printf("PutMember:Done");

for (String key: map.keySet()){
    Collection<String> values = map.get(key);
    System.out.printf("%s -> %s\n", key, values);
}

After you run ExampleMultiMap, run PrintMember. You will see the key a has two values, as shown below:

b → [3]

a → [2, 1]

Hazelcast MultiMap uses EntryListener to listen to events which occur when entries are added to, updated in or removed from the MultiMap. See the Listening for MultiMap Events section for information on how to create an entry listener class and register it.

7.4.2. Configuring MultiMap

When using MultiMap, the collection type of the values can be either Set or List. Configure the collection type with the valueCollectionType parameter. If you choose Set, duplicate and null values are not allowed in your collection and ordering is irrelevant. If you choose List, ordering is relevant and your collection can include duplicate and null values.

You can also enable statistics for your MultiMap with the statisticsEnabled parameter. If you enable statisticsEnabled, statistics can be retrieved with getLocalMultiMapStats() method.

Currently, eviction is not supported for the MultiMap data structure.

The following are the example MultiMap configurations.

Declarative Configuration:

<hazelcast>
    ...
    <multimap name="default">
        <backup-count>0</backup-count>
        <async-backup-count>1</async-backup-count>
        <value-collection-type>SET</value-collection-type>
        <entry-listeners>
            <entry-listener include-value="false" local="false" >com.hazelcast.examples.EntryListener</entry-listener>
        </entry-listeners>
        <split-brain-protection-ref>split-brain-protection-name</split-brain-protection-ref>
    </multimap>
    ...
</hazelcast>

Programmatic Configuration:

MultiMapConfig mmConfig = new MultiMapConfig();
mmConfig.setName( "default" )
        .setBackupCount( 0 ).setAsyncBackupCount( 1 )
        .setValueCollectionType( "SET" )
        .setSplitBrainProtectionName( "splitbrainprotectionname" );

The following are the configuration elements and their descriptions:

  • backup-count: Defines the number of synchronous backups. For example, if it is set to 1, backup of a partition will be placed on one other member. If it is 2, it will be placed on two other members.

  • async-backup-count: The number of asynchronous backups. Behavior is the same as that of the backup-count element.

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your MultiMap. If set to false, you cannot collect statistics in your implementation (using getLocalMultiMapStats()) and also Hazelcast Management Center will not show them. Its default value is true.

  • value-collection-type: Type of the value collection. It can be SET or LIST.

  • entry-listeners: Lets you add listeners (listener classes) for the map entries. You can also set the attribute include-value to true if you want the item event to contain the entry values. You can set local to true if you want to listen to the entries on the local member.

  • split-brain-protection-ref: Name of the split-brain protection configuration that you want this MultiMap to use. See the Split-Brain Protection for MultiMap and TransactionalMultiMap section.

7.4.3. Split-Brain Protection for MultiMap and TransactionalMultiMap

MultiMap & TransactionalMultiMap can be configured to check for a minimum number of available members before applying their operations (see the Split-Brain Protection section). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

The following is a list of methods that support split-brain protection checks. The list is grouped by the protection types.

MultiMap:

  • WRITE, READ_WRITE:

    • clear

    • forceUnlock

    • lock

    • put

    • remove

    • tryLock

    • unlock

  • READ, READ_WRITE:

    • containsEntry

    • containsKey

    • containsValue

    • entrySet

    • get

    • isLocked

    • keySet

    • localKeySet

    • size

    • valueCount

    • values

TransactionalMultiMap:

  • WRITE, READ_WRITE:

    • put

    • remove

  • READ, READ_WRITE:

    • size

    • get

    • valueCount

Configuring Split-Brain Protection

Split-brain protection for MultiMap can be configured programmatically using the method setSplitBrainProtectionName(), or declaratively using the element split-brain-protection-ref. Following is an example declarative configuration:

<hazelcast>
    ...
    <multimap name="default">
        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
    </multimap>
    ...
</hazelcast>

The value of split-brain-protection-ref should be the split-brain protection configuration name which you configured under the split-brain-protection element as explained in the Split-Brain Protection section.

7.5. Set

Hazelcast Set (ISet) is a distributed and concurrent implementation of java.util.Set. It has the following features:

  • Hazelcast Set does not allow duplicate elements.

  • Hazelcast Set does not preserve the order of elements.

  • Hazelcast Set is a non-partitioned data structure: all the data that belongs to a set lives on one single partition in that member.

  • Hazelcast Set cannot be scaled beyond the capacity of a single machine. Since the whole set lives on a single partition, storing a large amount of data on a single set may cause memory pressure. Therefore, you should use multiple sets to store a large amount of data. This way, all the sets are spread across the cluster, sharing the load.

  • A backup of Hazelcast Set is stored on a partition of another member in the cluster so that data is not lost in the event of a primary member failure.

  • All items are copied to the local member and iteration occurs locally.

  • The equals method implemented in Hazelcast Set uses a serialized byte version of objects, as opposed to java.util.HashSet.

7.5.1. Getting a Set and Putting Items

Use the HazelcastInstances getSet method to get the Set, then use the add method to put items into it.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
ISet<String> set = hz.getSet("set");
set.add("Tokyo");
set.add("Paris");
set.add("London");
set.add("New York");
System.out.println("Putting finished!");

Hazelcast Set uses ItemListener to listen to events that occur when items are added to and removed from the Set. See the Listening for Item Events section for information on how to create an item listener class and register it.

7.5.2. Configuring Set

The following are the example Hazelcast Set configurations.

Declarative Configuration:

<hazelcast>
    ...
    <set name="default">
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <max-size>10</max-size>
        <item-listeners>
            <item-listener>com.hazelcast.examples.ItemListener</item-listener>
        </item-listeners>
        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
    </set>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
CollectionConfig collectionSet = config.getSetConfig("MySet");
collectionSet.setBackupCount(1)
        .setMaxSize(10)
        .setSplitBrainProtectionName("splitbrainprotectionname");

Hazelcast Set configuration has the following elements:

  • statistics-enabled: True (default) if statistics gathering is enabled on the Set, false otherwise.

  • backup-count: Count of synchronous backups. Set is a non-partitioned data structure, so all entries of a Set reside in one partition. When this parameter is '1', it means there will be one backup of that Set in another member in the cluster. When it is '2', two members will have the backup.

  • async-backup-count: Count of asynchronous backups.

  • max-size: The maximum number of entries for this Set. It can be any number between 0 and Integer.MAX_VALUE. Its default value is 0, meaning there is no capacity constraint.

  • item-listeners: Lets you add listeners (listener classes) for the list items. You can also set the attributes include-value to true if you want the item event to contain the item values. You can set local to true if you want to listen to the items on the local member.

  • split-brain-protection-ref: Name of the split-brain protection configuration that you want this Set to use. See the Split-Brain Protection for ISet and TransactionalSet section.

7.5.3. Split-Brain Protection for ISet and TransactionalSet

ISet & TransactionalSet can be configured to check for a minimum number of available members before applying queue operations (see the Split-Brain Protection section). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

The following is a list of methods, grouped by the protection types, that support split-brain protection checks:

ISet:

  • WRITE, READ_WRITE:

    • add

    • addAll

    • clear

    • remove

    • removeAll

  • READ, READ_WRITE:

    • contains

    • containsAll

    • isEmpty

    • iterator

    • size

    • toArray

TransactionalSet:

  • WRITE, READ_WRITE:

    • add

    • remove

  • READ, READ_WRITE:

    • size

Configuring Split-Brain Protection

Split-brain protection for ISet can be configured programmatically using the method setSplitBrainProtectionName(), or declaratively using the element split-brain-protection-ref. The following is an example declarative configuration:

<hazelcast>
    ...
    <set name="default">
        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
    </set>
    ...
</hazelcast>

The value of split-brain-protection-ref should be the split-brain protection configuration name which you configured under the split-brain-protection element as explained in the Split-Brain Protection section.

7.6. List

Hazelcast List (IList) is similar to Hazelcast Set, but it also allows duplicate elements.

  • Besides allowing duplicate elements, Hazelcast List preserves the order of elements.

  • Hazelcast List is a non-partitioned data structure where values and each backup are represented by their own single partition.

  • Hazelcast List cannot be scaled beyond the capacity of a single machine.

  • All items are copied to local and iteration occurs locally.


While IMap and ICache are the recommended data structures to be used by Hazelcast Jet, IList can also be used by it for unit testing or similar non-production situations. See here in the Hazelcast Jet Reference Manual to learn how Jet can use IList, e.g., how it can fill IList with data, consume it in a Jet job and drain the results to another IList. See also the Fast Batch Processing and Real-Time Stream Processing use cases for Hazelcast Jet.

7.6.1. Getting a List and Putting Items

Use the HazelcastInstances getList method to get the List, then use the add method to put items into it.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IList<String> list = hz.getList("list");
list.add("Tokyo");
list.add("Paris");
list.add("London");
list.add("New York");
System.out.println("Putting finished!");

Hazelcast List uses ItemListener to listen to events that occur when items are added to and removed from the List. See the Listening for Item Events section for information on how to create an item listener class and register it.

7.6.2. Configuring List

The following are the example Hazelcast List configurations.

Declarative Configuration:

<hazelcast>
    ...
    <list name="default">
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <max-size>10</max-size>
        <item-listeners>
            <item-listener>
                com.hazelcast.examples.ItemListener
            </item-listener>
        </item-listeners>
        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
    </list>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
CollectionConfig collectionList = config.getListConfig("MyList");
collectionList.setBackupCount(1)
        .setMaxSize(10)
        .setSplitBrainProtectionName("splitbrainprotectionname");

Hazelcast List configuration has the following elements:

  • statistics-enabled: True (default) if statistics gathering is enabled on the list, false otherwise.

  • backup-count: Number of synchronous backups. List is a non-partitioned data structure, so all entries of a List reside in one partition. When this parameter is '1', there will be one backup of that List in another member in the cluster. When it is '2', two members will have the backup.

  • async-backup-count: Number of asynchronous backups.

  • max-size: The maximum number of entries for this List.

  • item-listeners: Lets you add listeners (listener classes) for the list items. You can also set the attribute include-value to true if you want the item event to contain the item values. You can set the attribute local to true if you want to listen the items on the local member.

  • split-brain-protection-ref: Name of the split-brain protection configuration that you want this List to use. See the Split-Brain Protection for IList and TransactionalList section.

7.6.3. Split-Brain Protection for IList and TransactionalList

IList & TransactionalList can be configured to check for a minimum number of available members before applying queue operations (see the Split-Brain Protection section). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

The following is a list of methods, grouped by the protection types, that support split-brain protection checks:

IList:

  • WRITE, READ_WRITE:

    • add

    • addAll

    • clear

    • remove

    • removeAll

    • set

  • READ, READ_WRITE:

    • add

    • contains

    • containsAll

    • get

    • indexOf

    • isEmpty

    • iterator

    • lastIndexOf

    • listIterator

    • size

    • subList

    • toArray

TransactionalList:

  • WRITE, READ_WRITE:

    • add

    • remove

  • READ, READ_WRITE:

    • size

Configuring Split-Brain Protection

Split-brain protection for IList can be configured programmatically using the method setSplitBrainProtectionName(), or declaratively using the element split-brain-protection-ref. Following is an example declarative configuration:

<hazelcast>
    ...
    <list name="default">
        <split-brain-protection-ref>splitbrainprotection-name</split-brain-protection-ref>
    </list>
    ...
</hazelcast>

The value of split-brain-protection-ref should be the split-brain protection configuration name which you configured under the split-brain-protection element as explained in the Split-Brain Protection section.

7.7. Ringbuffer

Hazelcast Ringbuffer is a replicated but not partitioned data structure that stores its data in a ring-like structure. You can think of it as a circular array with a given capacity. Each Ringbuffer has a tail and a head. The tail is where the items are added and the head is where the items are overwritten or expired. You can reach each element in a Ringbuffer using a sequence ID, which is mapped to the elements between the head and tail (inclusive) of the Ringbuffer.

7.7.1. Getting a Ringbuffer and Reading Items

Reading from Ringbuffer is simple: get the Ringbuffer with the HazelcastInstance getRingbuffer method, get its current head with the headSequence method and start reading. Use the method readOne to return the item at the given sequence; readOne blocks if no item is available. To read the next item, increment the sequence by one.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Ringbuffer<String> ringbuffer = hz.getRingbuffer("rb");
long sequence = ringbuffer.headSequence();
while(true){
    String item = ringbuffer.readOne(sequence);
    sequence++;
    // process item
}

By exposing the sequence, you can now move the item from the Ringbuffer as long as the item is still available. If the item is not available any longer, StaleSequenceException is thrown.

7.7.2. Adding Items to a Ringbuffer

Adding an item to a Ringbuffer is also easy with the Ringbuffer add method:

Ringbuffer<String> ringbuffer = hz.getRingbuffer("ExampleRB");
ringbuffer.add("someitem");

Use the method add to return the sequence of the inserted item; the sequence value is always unique. You can use this as a very cheap way of generating unique IDs if you are already using Ringbuffer.

7.7.3. IQueue vs. Ringbuffer

Hazelcast Ringbuffer can sometimes be a better alternative than an Hazelcast IQueue. Unlike IQueue, Ringbuffer does not remove the items, it only reads items using a certain position. There are many advantages to this approach as described below:

  • The same item can be read multiple times by the same thread. This is useful for realizing semantics of read-at-least-once or read-at-most-once.

  • The same item can be read by multiple threads. Normally you could use an IQueue per thread for the same semantic, but this is less efficient because of the increased remoting. A take from an IQueue is destructive, so the change needs to be applied for backup also, which is why a queue.take() is more expensive than a ringBuffer.read(…​).

  • Reads are extremely cheap since there is no change in the Ringbuffer. Therefore no replication is required.

  • Reads and writes can be batched to speed up performance. Batching can dramatically improve the performance of Ringbuffer.

7.7.4. Configuring Ringbuffer Capacity

By default, a Ringbuffer is configured with a capacity of 10000 items. This creates an array with a size of 10000. If a time-to-live is configured, then an array of longs is also created that stores the expiration time for every item. In a lot of cases you may want to change this capacity number to something that better fits your needs.

Below is a declarative configuration example of a Ringbuffer with a capacity of 2000 items.

<hazelcast>
    ...
    <ringbuffer name="rb">
        <capacity>2000</capacity>
    </ringbuffer>
    ...
</hazelcast>

Currently, Hazelcast Ringbuffer is not a partitioned data structure; its data is stored in a single partition and the replicas are stored in another partition. Therefore, create a Ringbuffer that can safely fit in a single cluster member.

7.7.5. Backing Up Ringbuffer

Hazelcast Ringbuffer has a single synchronous backup by default. You can control the Ringbuffer backup just like most of the other Hazelcast distributed data structures by setting the synchronous and asynchronous backups: backup-count and async-backup-count. In the example below, a Ringbuffer is configured with no synchronous backups and one asynchronous backup:

<hazelcast>
    ...
    <ringbuffer name="rb">
        <backup-count>0</backup-count>
        <async-backup-count>1</async-backup-count>
    </ringbuffer>
    ...
</hazelcast>

An asynchronous backup probably gives you better performance. However, there is a chance that the item added will be lost when the member owning the primary crashes before the backup could complete. You may want to consider batching methods if you need high performance but do not want to give up on consistency.

7.7.6. Configuring Ringbuffer Time-To-Live

You can configure Hazelcast Ringbuffer with a time-to-live in seconds. Using this setting, you can control how long the items remain in the Ringbuffer before they are expired. By default, the time-to-live is set to 0, meaning that unless the item is overwritten, it will remain in the Ringbuffer indefinitely. If you set a time-to-live and an item is added, then, depending on the Overflow Policy, either the oldest item is overwritten, or the call is rejected.

In the example below, a Ringbuffer is configured with a time-to-live of 180 seconds.

<hazelcast>
    ...
    <ringbuffer name="rb">
        <time-to-live-seconds>180</time-to-live-seconds>
    </ringbuffer>
    ...
</hazelcast>

7.7.7. Setting Ringbuffer Overflow Policy

Using the overflow policy, you can determine what to do if the oldest item in the Ringbuffer is not old enough to expire when more items than the configured Ringbuffer capacity are being added. The below options are currently available:

  • OverflowPolicy.OVERWRITE: The oldest item is overwritten.

  • OverflowPolicy.FAIL: The call is aborted. The methods that make use of the OverflowPolicy return -1 to indicate that adding the item has failed.

Overflow policy gives you fine control on what to do if the Ringbuffer is full. You can also use the overflow policy to apply a back pressure mechanism. The following example code shows the usage of an exponential backoff.

Random random = new Random();
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Ringbuffer<Long> rb = hz.getRingbuffer("rb");

long i = 100;
while (true) {
    long sleepMs = 100;
    for (; ; ) {
        long result = rb.addAsync(i, OverflowPolicy.FAIL).toCompletableFuture().get();
        if (result != -1) {
            break;
        }
        TimeUnit.MILLISECONDS.sleep(sleepMs);
        sleepMs = min(5000, sleepMs * 2);
    }

    // add a bit of random delay to make it look a bit more realistic
    Thread.sleep(random.nextInt(10));

    System.out.println("Written: " + i);
    i++;
}

7.7.8. Ringbuffer with Persistent Datastore

Hazelcast allows you to load and store the Ringbuffer items from/to a persistent datastore using the interface RingbufferStore. If a Ringbuffer store is enabled, each item added to the Ringbuffer will also be stored at the configured Ringbuffer store.

If the Ringbuffer store is configured, you can get items with sequences which are no longer in the actual Ringbuffer but are only in the Ringbuffer store. This is probably much slower but still allows you to continue consuming items from the Ringbuffer even if they are overwritten with newer items in the Ringbuffer.

When a Ringbuffer is being instantiated, it checks if the Ringbuffer store is configured and requests the latest sequence in the Ringbuffer store. This is to enable the Ringbuffer to start with sequences larger than the ones in the Ringbuffer store. In this case, the Ringbuffer is empty but you can still request older items from it (which will be loaded from the Ringbuffer store).

The Ringbuffer store stores items in the same format as the Ringbuffer. If the BINARY in-memory format is used, the Ringbuffer store must implement the interface RingbufferStore<byte[]> meaning that the Ringbuffer receives items in the binary format. If the OBJECT in-memory format is used, the Ringbuffer store must implement the interface RingbufferStore<K>, where K is the type of item being stored (meaning that the Ringbuffer store receives the deserialized object).

When adding items to the Ringbuffer, the method storeAll allows you to store items in batches.

The following example class includes all of the RingbufferStore methods.

public class TheRingbufferObjectStore implements RingbufferStore<Item> {

    @Override
    public void store(long sequence, Item data) {
        System.out.println("Object store");
    }

    @Override
    public void storeAll(long firstItemSequence, Item[] items) {
        System.out.println("Object store all");
    }

    @Override
    public Item load(long sequence) {
        System.out.println("Object load");
        return null;
    }

    @Override
    public long getLargestSequence() {
        System.out.println("Object get largest sequence");
        return -1;
    }
}

Item must be serializable. The following is an example of a Ringbuffer with the Ringbuffer store configured and enabled.

<hazelcast>
    ...
    <ringbuffer name="default">
        <capacity>10000</capacity>
        <time-to-live-seconds>30</time-to-live-seconds>
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <in-memory-format>BINARY</in-memory-format>
        <ringbuffer-store>
            <class-name>com.hazelcast.RingbufferStoreImpl</class-name>
        </ringbuffer-store>
    </ringbuffer>
    ...
</hazelcast>

The following are the explanations for the Ringbuffer store configuration elements:

  • class-name: Name of the class implementing the `RingbufferStore interface.

  • factory-class-name: Name of the class implementing the RingbufferStoreFactory interface. This interface allows a factory class to be registered instead of a class implementing the RingbufferStore interface.

Either the class-name or the factory-class-name element should be used.

7.7.9. Configuring Ringbuffer In-Memory Format

You can configure Hazelcast Ringbuffer with an in-memory format that controls the format of the Ringbuffer’s stored items. By default, BINARY in-memory format is used, meaning that the object is stored in a serialized form. You can select the OBJECT in-memory format, which is useful when filtering is applied or when the OBJECT in-memory format has a smaller memory footprint than BINARY.

In the declarative configuration example below, a Ringbuffer is configured with the OBJECT in-memory format:

<hazelcast>
    ...
    <ringbuffer name="rb">
        <in-memory-format>OBJECT</in-memory-format>
    </ringbuffer>
    ...
</hazelcast>

7.7.10. Configuring Split-Brain Protection for Ringbuffer

Ringbuffer can be configured to check for a minimum number of available members before applying Ringbuffer operations. This is a check to avoid performing successful Ringbuffer operations on all parts of a cluster during a network partition and can be configured using the element split-brain-protection-ref. You should set this element’s value as the quorum’s name, which you configured under the split-brain-protection element as explained in the Split-Brain Protection section. Following is an example snippet:

<hazelcast>
    ...
    <ringbuffer name="rb">
        <split-brain-protection-ref>quorumname</split-brain-protection-ref>
    </ringbuffer>
    ...
</hazelcast>

The following is a list of methods, grouped by the protection types, that support split-brain protection checks:

  • WRITE, READ_WRITE:

    • add

    • addAllAsync

    • addAsync

  • READ, READ_WRITE:

    • capacity

    • headSequence

    • readManyAsync

    • readOne

    • remainingCapacity

    • size

    • tailSequence

7.7.11. Adding Batched Items

In the previous examples, the method ringBuffer.add() is used to add an item to the Ringbuffer. The problems with this method are that it always overwrites and that it does not support batching. Batching can have a huge impact on the performance. You can use the method addAllAsync to support batching.

See the following example code.

List<String> items = Arrays.asList("1","2","3");
CompletionStage<Long> s = rb.addAllAsync(items, OverflowPolicy.OVERWRITE);
// block until all items are added
s.toCompletableFuture().join();

In the above case, three strings are added to the Ringbuffer using the policy OverflowPolicy.OVERWRITE. See the Overflow Policy section for more information.

7.7.12. Reading Batched Items

In the previous example, the readOne method read items from the Ringbuffer. readOne is simple but not very efficient for the following reasons:

  • readOne does not use batching.

  • readOne cannot filter items at the source; the items need to be retrieved before being filtered.

The method readManyAsync can read a batch of items and can filter items at the source.

See the following example code.

CompletionStage<ReadResultSet<E>> readManyAsync(
    long startSequence,
    int minCount,
    int maxCount,
    IFunction<E, Boolean> filter);

The meanings of the readManyAsync arguments are given below:

  • startSequence: Sequence of the first item to read.

  • minCount: Minimum number of items to read. If you do not want to block, set it to 0. If you want to block for at least one item, set it to 1.

  • maxCount: Maximum number of the items to retrieve. Its value cannot exceed 1000.

  • filter: A function that accepts an item and checks if it should be returned. If no filtering should be applied, set it to null.

A full example is given below.

long sequence = rb.headSequence();
for(;;) {
    CompletionStage<ReadResultSet<String>> f = rb.readManyAsync(sequence, 1, 10, null);
    CompletionStage<Integer> readCountStage = f.thenApplyAsync(rs -> {
        for (String s : rs) {
            System.out.println(s);
        }
        return rs.readCount();
    });
    sequence += readCountStage.toCompletableFuture().join();
}

Please take a careful look at how your sequence is being incremented. You cannot always rely on the number of items being returned if the items are filtered out.

There is not any filtering applied in the above example. The following example shows how you can apply a filter when reading batched items. First, let’s create our filter as shown below:

public class FruitFilter implements IFunction<String, Boolean> {
    public FruitFilter() {}

    public Boolean apply(String s) {
        return s.startsWith("a");
    }
}

So, the FruitFilter checks whether a String object starts with the letter "a". You can see this filter in action in the below example:

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Ringbuffer<String> rb = hz.getRingbuffer("rb");

rb.add("apple");
rb.add("orange");
rb.add("pear");
rb.add("peach");
rb.add("avocado");

long sequence = rb.headSequence();
CompletableFuture<ReadResultSet<String>> f = rb.readManyAsync(sequence, 2, 5, new FruitFilter()).toCompletableFuture();

ReadResultSet<String> rs = f.join();
for (String s : rs) {
    System.out.println(s);
}

7.7.13. Using Async Methods

Hazelcast Ringbuffer provides asynchronous methods for more powerful operations like batched writing or batched reading with filtering. To wait for the result of the operation in a blocking way, obtain a CompletableFuture from the returned CompletionStage by invoking CompletionStage#toCompletableFuture() method, then use either CompletableFuture#get() or CompletableFuture#join().

See the following example code.

CompletionStage<Long> f = ringbuffer.addAsync(item, OverflowPolicy.FAIL);
f.toCompletableFuture().get();

However, you can also use CompletionStage API to add subsequent dependent computation stages which will be executed when the operation has completed. This way the thread used for the call is not blocked until the response is returned.

See the below code as an example of when you want to get notified when a batch of reads has completed.

CompletionStage<ReadResultSet<String>> stage = rb.readManyAsync(sequence, min, max, someFilter);
stage.whenCompleteAsync((response, throwable) -> {
    if (throwable == null) {
         for (String s : response) {
             System.out.println("Received:" + s);
         }
    } else {
        throwable.printStackTrace();
    }
});

7.7.14. Ringbuffer Configuration Examples

The following shows the declarative configuration of a Ringbuffer called rb. The configuration is modeled after the Ringbuffer defaults.

<hazelcast>
    ...
    <ringbuffer name="rb">
        <capacity>10000</capacity>
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <time-to-live-seconds>0</time-to-live-seconds>
        <in-memory-format>BINARY</in-memory-format>
        <split-brain-protection-ref>quorumname</split-brain-protection-ref>
    </ringbuffer>
    ...
</hazelcast>

You can also configure a Ringbuffer programmatically. The following is a programmatic version of the above declarative configuration.

Config config = new Config();
RingbufferConfig rbConfig = config.getRingbufferConfig("myRB");
rbConfig.setCapacity(10000)
        .setBackupCount(1)
        .setAsyncBackupCount(0)
        .setTimeToLiveSeconds(0)
        .setInMemoryFormat(InMemoryFormat.BINARY)
        .setSplitBrainProtectionName("splitbrainprotectionname");

7.8. Topic

Hazelcast provides a distribution mechanism for publishing messages that are delivered to multiple subscribers. This is also known as a publish/subscribe (pub/sub) messaging model. Publishing and subscribing operations are cluster wide. When a member subscribes to a topic, it is actually registering for messages published by any member in the cluster, including the new members that joined after you add the listener.

Publish operation is async. It does not wait for operations to run in remote members; it works as fire and forget.

7.8.1. Getting a Topic and Publishing Messages

Use the HazelcastInstance’s getTopic method to get the topic, then use the topic’s publish method to publish your messages. The following is an example publisher:

public class TopicPublisher {

    public static void main(String[] args) {

        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<Date> topic = hz.getTopic("topic");
        topic.publish(new Date());
    }
}

And here is an example subscriber:

public class TopicSubscriber {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<Date> topic = hz.getTopic("topic");
        topic.addMessageListener(new MessageListenerImpl());
        System.out.println("Subscribed");
    }

    private static class MessageListenerImpl implements MessageListener<Date> {
        public void onMessage(Message<Date> m) {
            System.out.println("Received: " + m.getMessageObject());
        }
    }
}

Hazelcast Topic uses the MessageListener interface to listen for events that occur when a message is received. See the Listening for Topic Messages section for information on how to create a message listener class and register it.

7.8.2. Getting Topic Statistics

Topic has two statistic variables that you can query. These values are incremental and local to the member.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ITopic<Object> myTopic = hazelcastInstance.getTopic( "myTopicName" );

myTopic.getLocalTopicStats().getPublishOperationCount();
myTopic.getLocalTopicStats().getReceiveOperationCount();

getPublishOperationCount and getReceiveOperationCount returns the total number of published and received messages since the start of this member, respectively. Note that these values are not backed up, so if the member goes down, these values will be lost.

You can disable this feature with topic configuration. See the Configuring Topic section.

These statistics values can be also viewed in Management Center. See the Monitoring Topics section in Hazelcast Management Center Reference Manual.

7.8.3. Understanding Topic Behavior

Each cluster member has a list of all registrations in the cluster. When a new member is registered for a topic, it sends a registration message to all members in the cluster. Also, when a new member joins the cluster, it receives all registrations made so far in the cluster.

The behavior of a topic varies depending on the value of the configuration parameter globalOrderEnabled.

Ordering Messages as Published

If globalOrderEnabled is disabled, messages are not ordered and listeners (subscribers) process the messages in the order that the messages are published. If cluster member M publishes messages m1, m2, m3, …​, mn to a topic T, then Hazelcast makes sure that all of the subscribers of topic T receive and process m1, m2, m3, …​, mn in the given order.

Here is how it works: Let’s say that we have three members (member1, member2 and member3) and that member1 and member2 are registered to a topic named news. Note that all three members know that member1 and member2 are registered to news.

In this example, member1 publishes two messages: a1 and a2. Member3 publishes two messages: c1 and c2. When member1 and member3 publish a message, they check their local list for registered members, discover that member1 and member2 are in their lists, and then they fire messages to those members. One possible order of the messages received could be the following.

member1c1, a1, a2, c2

member2c1, c2, a1, a2

Ordering Messages for Members

If globalOrderEnabled is enabled, all members listening to the same topic get its messages in the same order.

Here is how it works. Let’s say that we have three members (member1, member2 and member3) and that member1 and member2 are registered to a topic named news. Note that all three members know that member1 and member2 are registered to news.

In this example, member1 publishes two messages: a1 and a2. Member3 publishes two messages: c1 and c2. When a member publishes messages over the topic news, it first calculates which partition the news ID corresponds to. Then it sends an operation to the owner of the partition for that member to publish messages. Let’s assume that news corresponds to a partition that member2 owns. member1 and member3 first sends all messages to member2. Assume that the messages are published in the following order:

member1a1, c1, a2, c2

member2 then publishes these messages by looking at registrations in its local list. It sends these messages to member1 and member2 (it makes a local dispatch for itself).

member1a1, c1, a2, c2

member2a1, c1, a2, c2

This way we guarantee that all members see the events in the same order.

Keeping Generated and Published Order the Same

In both cases, there is a StripedExecutor in EventService that is responsible for dispatching the received message. For all events in Hazelcast, the order that events are generated and the order they are published to the user are guaranteed to be the same via this StripedExecutor.

In StripedExecutor, there are as many threads as are specified in the property hazelcast.event.thread.count (default is five). For a specific event source (for a particular topic name), hash of that source’s name % 5 gives the ID of the responsible thread. Note that there can be another event source (entry listener of a map, item listener of a collection, etc.) corresponding to the same thread. In order not to make other messages to block, heavy processing should not be done in this thread. If there is time-consuming work that needs to be done, the work should be handed over to another thread. See the Getting a Topic and Publishing Messages section.

7.8.4. Configuring Topic

To configure a topic, set the topic name, decide on statistics and global ordering, and set the message listeners. The following are the default values:

  • global-ordering is false, meaning that by default, there is no guarantee of global order.

  • statistics is true, meaning that by default, statistics are calculated.

You can see the example configuration snippets below.

Declarative Configuration:

<hazelcast>
    ...
    <topic name="yourTopicName">
        <global-ordering-enabled>true</global-ordering-enabled>
        <statistics-enabled>true</statistics-enabled>
        <message-listeners>
            <message-listener>MessageListenerImpl</message-listener>
        </message-listeners>
    </topic>
    ...
</hazelcast>

Programmatic Configuration:

TopicConfig topicConfig = new TopicConfig();
topicConfig.setGlobalOrderingEnabled( true );
topicConfig.setStatisticsEnabled( true );
topicConfig.setName( "yourTopicName" );
MessageListener<String> implementation = new MessageListener<String>() {
    @Override
    public void onMessage( Message<String> message ) {
        // process the message
    }
};
topicConfig.addMessageListenerConfig( new ListenerConfig( implementation ) );
HazelcastInstance instance = Hazelcast.newHazelcastInstance();

Topic configuration has the following elements:

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your topic. If set to false, you cannot collect statistics in your implementation (using getLocalTopicStats()) and also Hazelcast Management Center will not show them. Its default value is true.

  • global-ordering-enabled: Default is false, meaning there is no global order guarantee.

  • message-listeners: Lets you add listeners (listener classes) for the topic messages.

Besides the above elements, there are the following system properties that are topic related but not topic specific:

  • hazelcast.event.queue.capacity with a default value of 1,000,000

  • hazelcast.event.queue.timeout.millis with a default value of 250

  • hazelcast.event.thread.count with a default value of 5

For the descriptions of these parameters, see the Global Event Configuration section.

7.9. Reliable Topic

Reliable Topic uses the same ITopic interface as a regular topic. The main difference is that Reliable Topic is backed up by the Ringbuffer data structure. The following are the advantages of this approach:

  • Events are not lost since the Ringbuffer is configured with one synchronous backup by default.

  • Each Reliable ITopic gets its own Ringbuffer; if a topic has a very fast producer, it will not lead to problems at topics that run at a slower pace.

  • Since the event system behind a regular ITopic is shared with other data structures, e.g., collection listeners, you can run into isolation problems. This does not happen with the Reliable ITopic.

Here is an example for a publisher using Reliable Topic:

public class PublisherMember {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        Random random = new Random();
        ITopic<Long> topic = hz.getReliableTopic("sometopic");
        long messageId = 0;

        while (true) {
            topic.publish(messageId);
            messageId++;
            System.out.println("Written: " + messageId);
            sleepMillis(random.nextInt(100));
        }
    }
    public static boolean sleepMillis(int millis) {
        try {
            MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }
}

And the following is an example for the subscriber:

public class SubscribedMember {

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        ITopic<Long> topic = hz.getReliableTopic("sometopic");
        topic.addMessageListener(new MessageListenerImpl());
    }

    private static class MessageListenerImpl implements MessageListener<Long> {
        public void onMessage(Message<Long> m) {
            System.out.println("Received: " + m.getMessageObject());
        }
    }
}

When you create a Reliable Topic, Hazelcast automatically creates a Ringbuffer for it. You may configure this Ringbuffer by adding a Ringbuffer config with the same name as the Reliable Topic. For instance, if you have a Reliable Topic with the name "sometopic", you should add a Ringbuffer config with the name "sometopic" to configure the backing Ringbuffer. Some of the things that you may configure are the capacity, the time-to-live for the topic messages, and you can even add a Ringbuffer store which allows you to have a persistent topic. By default, a Ringbuffer does not have any TTL (time-to-live) and it has a limited capacity; you may want to change that configuration. The following is an example configuration for the "sometopic" given above.

<hazelcast>
    ...
    <!-- This is the ringbuffer that is used by the 'sometopic' Reliable-topic. As you can see the
         ringbuffer has the same name as the topic. -->
    <ringbuffer name="sometopic">
        <capacity>1000</capacity>
        <time-to-live-seconds>5</time-to-live-seconds>
    </ringbuffer>
    <reliable-topic name="sometopic">
        <topic-overload-policy>BLOCK</topic-overload-policy>
    </reliable-topic>
    ...
</hazelcast>

See the Configuring Reliable Topic section below for the descriptions of all Reliable Topic configuration elements.

By default, the Reliable ITopic uses a shared thread pool. If you need a better isolation, you can configure a custom executor on the ReliableTopicConfig.

Because the reads on a Ringbuffer are not destructive, batching is easy to apply. ITopic uses read batching and reads ten items at a time (if available) by default. See Reading Batched Items for more information.

7.9.1. Slow Consumers

The Reliable ITopic provides control and a way to deal with slow consumers. It is unwise to keep events for a slow consumer in memory indefinitely since you do not know when the slow consumer is going to catch up. You can control the size of the Ringbuffer by using its capacity. For the cases when a Ringbuffer runs out of its capacity, you can specify the following policies for the TopicOverloadPolicy configuration:

  • DISCARD_OLDEST: Overwrite the oldest item, even if a TTL is set. In this case the fast producer supersedes a slow consumer.

  • DISCARD_NEWEST: Discard the newest item.

  • BLOCK: Wait until the items are expired in the Ringbuffer.

  • ERROR: Immediately throw TopicOverloadException if there is no space in the Ringbuffer.

7.9.2. Configuring Reliable Topic

The following are example Reliable Topic configurations.

Declarative Configuration:

<hazelcast>
    ...
    <reliable-topic name="default">
        <statistics-enabled>true</statistics-enabled>
        <message-listeners>
            <message-listener>
                ...
            </message-listener>
        </message-listeners>
        <read-batch-size>10</read-batch-size>
        <topic-overload-policy>BLOCK</topic-overload-policy>
    </reliable-topic>
    ...
</hazelcast>

Programmatic Configuration:

Config config = new Config();
ReliableTopicConfig rtConfig = config.getReliableTopicConfig( "default" );
rtConfig.setTopicOverloadPolicy( TopicOverloadPolicy.BLOCK )
    .setReadBatchSize( 10 )
    .setStatisticsEnabled( true );

Reliable Topic configuration has the following elements:

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your Reliable Topic. If set to false, you cannot collect statistics in your implementation and also Hazelcast Management Center will not show them. Its default value is true.

  • message-listener: Message listener class that listens to the messages when they are added or removed.

  • read-batch-size: Minimum number of messages that Reliable Topic tries to read in batches. Its default value is 10.

  • topic-overload-policy: Policy to handle an overloaded topic. Available values are DISCARD_OLDEST, DISCARD_NEWEST, BLOCK and ERROR. Its default value is BLOCK. See Slow Consumers for definitions of these policies.

7.10. FencedLock

FencedLock is a member of CP Subsystem API. For detailed information, see the CP Subsystem chapter.

FencedLock is a linearizable and distributed implementation of java.util.concurrent.locks.Lock, meaning that if you lock using a FencedLock, the critical section that it guards is guaranteed to be executed by only one thread in the entire cluster. Even though locks are great for synchronization, they can lead to problems if not used properly. Also note that Hazelcast Lock does not support fairness.

For detailed information and configuration, see the FencedLock section under the CP Subsystem chapter.

7.10.1. Using Try-Catch Blocks with Locks

Always use locks with try-catch blocks. This ensures that locks are released if an exception is thrown from the code in a critical section. Also note that the lock method is outside the try-catch block because we do not want to unlock if the lock operation itself fails.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

Lock lock = hazelcastInstance.getCPSubsystem().getLock("myLock");
lock.lock();
try {
    // do something here
} finally {
    lock.unlock();
}

7.10.2. Releasing Locks with tryLock Timeout

If a lock is not released in the cluster, another thread that is trying to get the lock can wait forever. To avoid this, use tryLock with a timeout value. You can set a high value (normally it should not take that long) for tryLock. You can check the return value of tryLock as follows:

if ( lock.tryLock ( 10, TimeUnit.SECONDS ) ) {
  try {
    // do some stuff here..
  } finally {
    lock.unlock();
  }
} else {
  // warning
}

7.10.3. Understanding Lock Behavior

  • Locks are fail-safe. If a member holds a lock and some other members go down, the cluster will keep your locks safe and available. Moreover, when a member leaves the cluster, all the locks acquired by that dead member will be removed so that those locks are immediately available for live members.

  • Locks are not automatically removed. If a lock is not used anymore, Hazelcast does not automatically perform garbage collection in the lock. This can lead to an OutOfMemoryError. If you create locks on the fly, make sure they are destroyed.

  • Locks are re-entrant. The same thread can lock multiple times on the same lock. Note that for other threads to be able to require this lock, the owner of the lock must call unlock as many times as the owner called lock.

7.11. IAtomicLong

IAtomicLong is a member of CP Subsystem API. For detailed information, see the CP Subsystem chapter.

Hazelcast IAtomicLong is the distributed implementation of java.util.concurrent.atomic.AtomicLong. It offers most of AtomicLong’s operations such as get, set, getAndSet, compareAndSet and incrementAndGet. Since IAtomicLong is a distributed implementation, these operations involve remote calls and thus their performances differ from AtomicLong.

The following example code creates an instance, increments it by a million and prints the count.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IAtomicLong counter = hazelcastInstance.getCPSubsystem().getAtomicLong( "counter" );
for ( int k = 0; k < 1000 * 1000; k++ ) {
    if ( k % 500000 == 0 ) {
        System.out.println( "At: " + k );
    }
    counter.incrementAndGet();
}
System.out.printf( "Count is %s\n", counter.get() );

When you start other instances with the code above, you will see the count as member count times a million.

7.11.1. Sending Functions to IAtomicLong

You can send functions to an IAtomicLong. IFunction is a Hazelcast owned, single method interface. The following example IFunction implementation adds two to the original value.

private static class Add2Function implements IFunction<Long, Long> {
    @Override
    public Long apply( Long input ) {
        return input + 2;
    }
}

7.11.2. Executing Functions on IAtomicLong

You can use the following methods to execute functions on IAtomicLong:

  • apply: Applies the function to the value in IAtomicLong without changing the actual value and returning the result.

  • alter: Alters the value stored in the IAtomicLong by applying the function. It does not send back a result.

  • alterAndGet: Alters the value stored in the IAtomicLong by applying the function, storing the result in the IAtomicLong and returning the result.

  • getAndAlter: Alters the value stored in the IAtomicLong by applying the function and returning the original value.

The following example includes these methods.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IAtomicLong atomicLong = hazelcastInstance.getCPSubsystem().getAtomicLong( "counter" );

atomicLong.set( 1 );
long result = atomicLong.apply( new Add2Function() );
System.out.println( "apply.result: " + result);
System.out.println( "apply.value: " + atomicLong.get() );

atomicLong.set( 1 );
atomicLong.alter( new Add2Function() );
System.out.println( "alter.value: " + atomicLong.get() );

atomicLong.set( 1 );
result = atomicLong.alterAndGet( new Add2Function() );
System.out.println( "alterAndGet.result: " + result );
System.out.println( "alterAndGet.value: " + atomicLong.get() );

atomicLong.set( 1 );
result = atomicLong.getAndAlter( new Add2Function() );
System.out.println( "getAndAlter.result: " + result );
System.out.println( "getAndAlter.value: " + atomicLong.get() );

The output of the above class when run is as follows:

apply.result: 3
apply.value: 1
alter.value: 3
alterAndGet.result: 3
alterAndGet.value: 3
getAndAlter.result: 1
getAndAlter.value: 3

7.11.3. Reasons to Use Functions with IAtomicLong

The reason for using a function instead of a simple code line like atomicLong.set(atomicLong.get() + 2)); is that the IAtomicLong read and write operations are not atomic. Since IAtomicLong is a distributed implementation, those operations can be remote ones, which may lead to race problems. By using functions, the data is not pulled into the code, but the code is sent to the data. This makes it more scalable.

7.12. ISemaphore

ISemaphore is a member of CP Subsystem API. For detailed information, see the CP Subsystem chapter.

Hazelcast ISemaphore is the distributed implementation of java.util.concurrent.Semaphore.

7.12.1. Controlling Thread Counts with Permits

Semaphores offer permits to control the thread counts when performing concurrent activities. To execute a concurrent activity, a thread grants a permit or waits until a permit becomes available. When the execution is completed, the permit is released.

ISemaphore with a single permit may be considered as a lock. Unlike the locks, when semaphores are used, any thread can release the permit depending on the configuration, and semaphores can have multiple permits. For more information, see the Semaphore Configuration section.
Hazelcast ISemaphore does not support fairness at all times. There are some edge cases where the fairness is not honored, e.g., when the permit becomes available at the time when an internal timeout occurs.

When a permit is acquired on ISemaphore:

  • If there are permits, the number of permits in the semaphore is decreased by one and the calling thread performs its activity. If there is contention, the longest waiting thread acquires the permit before all other threads.

  • If no permits are available, the calling thread blocks until a permit becomes available. When a timeout happens during this block, the thread is interrupted.

7.12.2. Example Semaphore Code

The following example code uses an IAtomicLong resource 1000 times, increments the resource when a thread starts to use it and decrements it when the thread completes.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ISemaphore semaphore = hazelcastInstance.getCPSubsystem().getSemaphore( "semaphore" );
IAtomicLong resource = hazelcastInstance.getCPSubsystem().getAtomicLong( "resource" );
for ( int k = 0 ; k < 1000 ; k++ ) {
    System.out.println( "At iteration: " + k + ", Active Threads: " + resource.get() );
    semaphore.acquire();
    try {
        resource.incrementAndGet();
        Thread.sleep( 1000 );
        resource.decrementAndGet();
    } finally {
        semaphore.release();
    }
}
System.out.println("Finished");

If you execute the above SemaphoreMember class 5 times, the following output appears:

At iteration: 0, Active Threads: 1

At iteration: 1, Active Threads: 2

At iteration: 2, Active Threads: 3

At iteration: 3, Active Threads: 3

At iteration: 4, Active Threads: 3

As you can see, the maximum count of concurrent threads is equal or smaller than three. If you remove the semaphore acquire/release statements in SemaphoreMember, you will see that there is no limitation on the number of concurrent usages.

7.13. IAtomicReference

IAtomicReference is a member of CP Subsystem API. For detailed information, see the CP Subsystem chapter.

The IAtomicLong is very useful if you need to deal with a long, but in some cases you need to deal with a reference. That is why Hazelcast also supports the IAtomicReference which is the distributed version of the java.util.concurrent.atomic.AtomicReference.

Here is an IAtomicReference example.

Config config = new Config();

HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

IAtomicReference<String> ref = hz.getCPSubsystem().getAtomicReference("reference");
ref.set("foo");
System.out.println(ref.get());
System.exit(0);

When you execute the above example, the output is as follows:

foo

7.13.1. Sending Functions to IAtomicReference

Just like IAtomicLong, IAtomicReference has methods that accept a 'function' as an argument, such as alter, alterAndGet, getAndAlter and apply. There are two big advantages of using these methods:

  • From a performance point of view, it is better to send the function to the data than the data to the function. Often the function is a lot smaller than the data and therefore cheaper to send over the line. Also the function only needs to be transferred once to the target machine and the data needs to be transferred twice.

  • You do not need to deal with concurrency control. If you would perform a load, transform, store, you could run into a data race since another thread might have updated the value you are about to overwrite.

7.13.2. Using IAtomicReference

The following are some considerations you need to know when you use IAtomicReference:

  • IAtomicReference works based on the byte-content and not on the object-reference. If you use the compareAndSet method, do not change to the original value because its serialized content will then be different. It is also important to know that if you rely on Java serialization, sometimes (especially with hashmaps) the same object can result in different binary content.

  • All methods returning an object return a private copy. You can modify the private copy, but the rest of the world is shielded from your changes. If you want these changes to be visible to the rest of the world, you need to write the change back to the IAtomicReference; but be careful about introducing a data-race.

  • The 'in-memory format' of an IAtomicReference is binary. The receiving side does not need to have the class definition available unless it needs to be deserialized on the other side, e.g., because a method like 'alter' is executed. This deserialization is done for every call that needs to have the object instead of the binary content, so be careful with expensive object graphs that need to be deserialized.

  • If you have an object with many fields or an object graph and you only need to calculate some information or need a subset of fields, you can use the apply method. With the apply method, the whole object does not need to be sent over the line; only the information that is relevant is sent.

7.14. ICountDownLatch

ICountDownLatch is a member of CP Subsystem API. For detailed information, see the CP Subsystem chapter.

Hazelcast ICountDownLatch is the distributed implementation of java.util.concurrent.CountDownLatch. But unlike Java’s implementation, Hazelcast’s ICountDownLatch count can be reset after a countdown has finished, but not during an active count.

7.14.1. Gate-Keeping Concurrent Activities

ICountDownLatch is considered to be a gate keeper for concurrent activities. It enables the threads to wait for other threads to complete their operations. The following examples describe the mechanism of ICountDownLatch.

Assume that there is a leader process and there are follower processes that will wait until the leader completes. Here is the leader:

public class Leader {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
        ICountDownLatch latch = hazelcastInstance.getCPSubsystem().getCountDownLatch( "countDownLatch" );
        System.out.println( "Starting" );
        latch.trySetCount( 1 );
        Thread.sleep( 30000 );
        latch.countDown();
        System.out.println( "Leader finished" );
        latch.destroy();
    }
}

Since only a single step is needed to be completed as a sample, the above code initializes the latch with 1. Then, the code sleeps for a while to simulate a process and starts the countdown. Finally, it clears up the latch. Let’s write a follower:

public class Follower {
    public static void main( String[] args ) throws Exception {
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
        ICountDownLatch latch = hazelcastInstance.getCPSubsystem().getCountDownLatch( "countDownLatch" );
        System.out.println( "Waiting" );
        boolean success = latch.await( 10, TimeUnit.SECONDS );
        System.out.println( "Complete: " + success );
    }
}

The follower class above first retrieves ICountDownLatch and then calls the await method to enable the thread to listen for the latch. The method await has a timeout value as a parameter. This is useful when the countDown method fails. To see ICountDownLatch in action, start the leader first and then start one or more followers. You will see that the followers wait until the leader completes.

7.15. PN Counter

A Conflict-free Replicated Data Type (CRDT) is a distributed data structure that achieves high availability by relaxing consistency constraints. There may be several replicas for the same data and these replicas can be modified concurrently without coordination. This means that you may achieve high throughput and low latency when updating a CRDT data structure. On the other hand, all of the updates are replicated asynchronously. Each replica then receives updates made on other replicas eventually and if no new updates are done, all replicas which can communicate to each other return the same state (converge) after some time.

Hazelcast offers a lightweight CRDT PN counter (Positive-Negative Counter) implementation where each Hazelcast instance can increment and decrement the counter value and these updates are propagated to all replicas. Only a Hazelcast member can store state for a counter which means that counter method invocations performed on a Hazelcast member are usually local (depending on the configured replica count). If there is no member failure, it is guaranteed that each replica sees the final value of the counter eventually. Counter’s state converges with each update and all CRDT replicas that can communicate to each other will eventually have the same state.

Using the PN Counter, you can get a distributed counter, increment and decrement it, and query its value with RYW (read-your-writes) and monotonic reads. The implementation borrows most methods from the AtomicLong which should be familiar in most cases and easily interchangeable in the existing code.

Some examples of PN counter are:

  • counting the number of "likes" or "+1"

  • counting the number of logged in users

  • counting the number of page hits/views.

How it works

The counter supports adding and subtracting values as well as retrieving the current counter value. Each replica of this counter can perform operations locally without coordination with the other replicas, thus increasing availability. The counter guarantees that whenever two members have received the same set of updates, possibly in a different order, their state is identical, and any conflicting updates are merged automatically. If no new updates are made to the shared state, all members that can communicate will eventually have the same data.

The updates to the counter are applied locally when invoked on a CRDT replica. A CRDT replica can be any Hazelcast instance which is NOT a client or a lite member. You can configure the number of replicas in the cluster using the replica-count configuration element.

When invoking updates from a non-replica instance, the invocation is remote. This may lead to indeterminate state - the update may be applied but the response has not been received. In this case, the caller is notified with a TargetDisconnectedException when invoked from a client or a MemberLeftException when invoked from a member.

The read and write methods provide monotonic read and RYW (read-your-write) guarantees. These guarantees are session guarantees which mean that if no replica with the previously observed state is reachable, the session guarantees are lost and the method invocation throws a ConsistencyLostException. This does not mean that an update is lost. All of the updates are part of some replica and eventually reflected in the state of all other replicas. This exception just means that you cannot observe your own writes because all replicas that contain your updates are currently unreachable. After you have received a ConsistencyLostException, you can either wait for a sufficiently up-to-date replica to become reachable in which case the session can be continued or you can reset the session by calling the method `reset(). If you have called this method, a new session is started with the next invocation to a CRDT replica.

The CRDT state is kept entirely on non-lite (data) members. If there aren’t any and the methods here are invoked on a lite member, they fail with a NoDataMemberInClusterException.

The following is an example code.

final HazelcastInstance instance = Hazelcast.newHazelcastInstance();
final PNCounter counter = instance.getPNCounter("counter");
counter.addAndGet(5);
final long value = counter.get();

This code snippet creates an instance of a PN counter, increments it by 5 and retrieves the value.

7.15.1. Configuring PN Counter

Following is an example declarative configuration snippet:

<hazelcast>
    ...
    <pn-counter name="default">
        <replica-count>10</replica-count>
        <statistics-enabled>true</statistics-enabled>
    </pn-counter>
    ...
</hazelcast>

PN Counter has the following configuration elements:

  • name: Name of your PN Counter.

  • replica-count: Number of replicas on which state for this PN counter is kept. This number applies in quiescent state, if there are currently membership changes or clusters are merging, the state may be temporarily kept on more replicas. Its default value is Integer.MAX_VALUE. Generally, keeping the state on more replicas means that more Hazelcast members are able to perform updates locally but it also means that the PN counter state is kept on more replicas, increasing the network traffic, decreasing the speed at which replica states converge and increasing the size of the PN counter state kept on each replica.

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your PN Counter. If set to false, you cannot collect statistics in your implementation (using getLocalPNCounterStats()) and also Hazelcast Management Center will not show them. Its default value is true.

Following is an equivalent snippet of Java configuration:

PNCounterConfig pnCounterConfig = new PNCounterConfig("default")
        .setReplicaCount(10)
        .setStatisticsEnabled(true);
Config hazelcastConfig = new Config()
        .addPNCounterConfig(pnCounterConfig);

7.15.2. Configuring the CRDT Replication Mechanism

Configuring the replication mechanism is for advanced use cases only - usually the default configuration works fine for most cases.

In some cases, you may want to configure the replication mechanism for all CRDT implementations. The CRDT states are replicated in rounds (the period is configurable) and in each round the state is replicated up to the configured number of members. Generally speaking, you may increase the speed at which replicas converge at the expense of more network traffic or decrease the network traffic at the expense of slower convergence of replicas. Hazelcast implements the state-based replication mechanism - the CRDT state for changed CRDTs is replicated in its entirety to other replicas on each replication round.

<hazelcast>
    ...
    <crdt-replication>
        <max-concurrent-replication-targets>1</max-concurrent-replication-targets>
        <replication-period-millis>1000</replication-period-millis>
    </crdt-replication>
    ...
</hazelcast>

CRDT replication has the following configuration elements:

  • max-concurrent-replication-targets: The maximum number of target members that we replicate the CRDT states to in one period. A higher count leads to states being disseminated more rapidly at the expense of burst-like behavior - one update to a CRDT leads to a sudden burst in the number of replication messages in a short time interval. Its default value is 1 which means that each replica replicates state to only one other replica in each replication round.

  • replication-period-millis: The period between two replications of CRDT states in milliseconds. A lower value increases the speed at which changes are disseminated to other cluster members at the expense of burst-like behavior - less updates are batched together in one replication message, and one update to a CRDT may cause a sudden burst of replication messages in a short time interval. The value must be a positive non-null integer. Its default value is 1000 milliseconds which means that the changed CRDT state is replicated every 1 second.

Following is an equivalent snippet of Java configuration:

final CRDTReplicationConfig crdtReplicationConfig = new CRDTReplicationConfig()
        .setMaxConcurrentReplicationTargets(1)
        .setReplicationPeriodMillis(1000);
Config hazelcastConfig = new Config()
        .setCRDTReplicationConfig(crdtReplicationConfig);

7.16. Flake ID Generator

Hazelcast Flake ID Generator is used to generate cluster-wide unique identifiers. Generated identifiers are long primitive values and are k-ordered (roughly ordered). IDs are in the range from 0 to Long.MAX_VALUE.

7.16.1. Generating Cluster-Wide IDs

The IDs contain timestamp component and a node ID component, which is assigned when the member joins the cluster. This allows the IDs to be ordered and unique without any coordination between the members, which makes the generator safe even in split-brain scenarios (for limitations in this case, see the Node ID assignment section below).

Timestamp component is in milliseconds since 1.1.2018, 0:00 UTC and has 41 bits. This caps the useful lifespan of the generator to little less than 70 years (until ~2088). The sequence component is 6 bits. If more than 64 IDs are requested in single millisecond, IDs gracefully overflow to the next millisecond and uniqueness is guaranteed in this case. The implementation does not allow overflowing by more than 15 seconds, if IDs are requested at higher rate, the call blocks. Note, however, that clients are able to generate even faster because each call goes to a different (random) member and the 64 IDs/ms limit is for single member.

7.16.2. Performance

Operation on member is always local, if the member has valid node ID, otherwise it’s remote. On the client, the newId() method goes to a random member and gets a batch of IDs, which is then returned locally for a limited time. The pre-fetch size and the validity time can be configured for each client and member.

7.16.3. Example

Let’s write an example identifier generator.

public class ExampleFlakeIdGenerator {
    public static void main(String[] args) {
        HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance();

        ClientConfig clientConfig = new ClientConfig()
                .addFlakeIdGeneratorConfig(new ClientFlakeIdGeneratorConfig("idGenerator")
                        .setPrefetchCount(10)
                        .setPrefetchValidityMillis(MINUTES.toMillis(10)));
        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);

        FlakeIdGenerator idGenerator = client.getFlakeIdGenerator("idGenerator");
        for (int i = 0; i < 10000; i++) {
            sleepSeconds(1);
            System.out.printf("Id: %s\n", idGenerator.newId());
        }
    }
}

7.16.4. Node ID Assignment

Flake IDs require a unique node ID to be assigned to each member, from which point the member can generate unique IDs without any coordination. Hazelcast uses the member list version from the moment when the member joined the cluster as a unique node ID.

The join algorithm is specifically designed to ensure that member list join version is unique for each member in the cluster. This ensures that IDs are unique even during network splits, with one caveat: at most one member is allowed to join the cluster during a network split. If two members join different subclusters, they are likely to get the same node ID. This is resolved when the cluster heals, but until then, they can generate duplicate IDs.

Node ID Overflow

Node ID component of the ID has 16 bits. Members with the member list join version higher than 2^16 won’t be able to generate IDs, but functionality is preserved by forwarding to another member. It is possible to generate IDs on any member or client as long as there is at least one member with join version smaller than 2^16 in the cluster. The remedy is to restart the cluster: the node ID component will be reset and assigned starting from zero again. Uniqueness after the restart will be preserved thanks to the timestamp component.

7.16.5. Configuring Flake ID Generator

Following is an example declarative configuration snippet:

<hazelcast>
    ...
    <flake-id-generator name="default">
        <prefetch-count>100</prefetch-count>
        <prefetch-validity-millis>600000</prefetch-validity-millis>
        <id-offset>0</id-offset>
        <node-id-offset>0</node-id-offset>
        <statistics-enabled>true</statistics-enabled>
    </flake-id-generator>
    ...
</hazelcast>

The following are the descriptions of configuration elements and attributes:

  • name: Name of your Flake ID Generator. It is a required attribute.

  • prefetch-count: Count of IDs which are pre-fetched on the background when one call to FlakeIdGenerator.newId() is made. Its value must be in the range 1 -100,000. Its default value is 100. This setting pertains only to newId() calls made on the member that configured it.

  • prefetch-validity-millis: Specifies for how long the pre-fetched IDs can be used. After this time elapses, a new batch of IDs are fetched. Time unit is milliseconds. Its default value is 600,000 milliseconds (10 minutes). The IDs contain a timestamp component, which ensures a rough global ordering of them. If an ID is assigned to an object that was created later, it will be out of order. If ordering is not important, set this value to 0. This setting pertains only to newId() calls made on the member that configured it.

  • id-offset: Specifies the offset that is added to the returned IDs. Its default value is 0. Setting might be useful when migrating from ID Generator. The default value works for all green-field projects. For example, assume the largest ID returned from ID Generator is 150. And, Flake ID Generator now returns 100. If you set this element to 50 and stop using the ID Generator, the next ID from Flake ID Generator will be 151 or larger and no duplicate IDs will be generated. In real-life, the IDs are much larger. You also need to add a reserve to the offset because the IDs from Flake ID Generator are only roughly ordered. Recommended reserve is 2^38, that is 274877906944. Negative values are allowed to increase the lifespan of the generator, however keep in mind that the generated IDs might also be negative.

  • node-id-offset: Specifies the offset that is added to the node ID assigned to cluster member for this generator. Might be useful in A/B deployment scenarios where you have cluster A which you want to upgrade. You create cluster B and for some time both will generate IDs and you want to have them unique. In this case, configure node ID offset for generators on cluster B.

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your Flake ID Generator. If set to false, you cannot collect statistics in your implementation (using getLocalFlakeIdGeneratorStats()) and also Hazelcast Management Center will not show them. Its default value is true.

7.17. Replicated Map

A Replicated Map is a distributed key-value data structure where the data is replicated to all members in the cluster. It provides full replication of entries to all members for high speed access.

The following are the features of Replicated Map:

  • When you have a Replicated Map in the cluster, your clients can communicate with any cluster member.

  • All cluster members are able to perform write operations.

  • It supports all methods of the interface java.util.Map.

  • It supports automatic initial fill up when a new member is started.

  • It provides statistics for entry access, write and update so that you can monitor it using Hazelcast Management Center.

  • New members joining to the cluster pull all the data from the existing members.

  • You can listen to entry events using listeners. See the Using EntryListener on Replicated Map section.

7.17.1. Replicating Instead of Partitioning

A Replicated Map does not partition data (it does not spread data to different cluster members); instead, it replicates the data to all members.

Replication leads to higher memory consumption. However, a Replicated Map has faster read and write access since the data is available on all members.

Writes could take place on local/remote members in order to provide write-order, eventually being replicated to all other members.

Replicated Map is suitable for objects, catalog data, or idempotent calculable data (such as HTML pages). It fully implements the java.util.Map interface, but it lacks the methods from java.util.concurrent.ConcurrentMap since there are no atomic guarantees to writes or reads.

If Replicated Map is used from a unisocket client and this unisocket client is connected to a lite member, the entry listeners cannot be registered/de-registered.
You cannot use Replicated Map from a lite member. A com.hazelcast.replicatedmap.ReplicatedMapCantBeCreatedOnLiteMemberException is thrown if com.hazelcast.core.HazelcastInstance.getReplicatedMap(name) is invoked on a lite member.

7.17.2. Example Replicated Map Code

Here is an example of Replicated Map code. The HazelcastInstance’s getReplicatedMap method gets the Replicated Map, and the Replicated Map’s put method creates map entries.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
Map<String, String> map = hz.getReplicatedMap("map");

map.put("1", "Tokyo");
map.put("2", "Paris");
map.put("3", "New York");

System.out.println("Finished loading map");
hz.shutdown();

HazelcastInstance.getReplicatedMap() returns com.hazelcast.core.ReplicatedMap which, as stated above, extends the java.util.Map interface.

The com.hazelcast.core.ReplicatedMap interface has some additional methods for registering entry listeners or retrieving values in an expected order.

7.17.3. Considerations for Replicated Map

If you have a large cluster or very high occurrences of updates, the Replicated Map may not scale linearly as expected since it has to replicate update operations to all members in the cluster.

Since the replication of updates is performed in an asynchronous manner, we recommend you enable back pressure in case your system has high occurrences of updates. See the Back Pressure section to learn how to enable it.

Replicated Map has an anti-entropy system that converges values to a common one if some of the members are missing replication updates.

Replicated Map does not guarantee eventual consistency because there are some edge cases that fail to provide consistency.

Replicated Map uses the internal partition system of Hazelcast in order to serialize updates happening on the same key at the same time. This happens by sending updates of the same key to the same Hazelcast member in the cluster.

Due to the asynchronous nature of replication, a Hazelcast member could die before successfully replicating a "write" operation to other members after sending the "write completed" response to its caller during the write process. In this scenario, Hazelcast’s internal partition system promotes one of the replicas of the partition as the primary one. The new primary partition does not have the latest "write" since the dead member could not successfully replicate the update. (This leaves the system in a state that the caller is the only one that has the update and the rest of the cluster have not.) In this case even the anti-entropy system simply could not converge the value since the source of true information is lost for the update. This leads to a break in the eventual consistency because different values can be read from the system for the same key.

Other than the aforementioned scenario, the Replicated Map behaves like an eventually consistent system with read-your-writes and monotonic-reads consistency.

7.17.4. Configuration Design for Replicated Map

There are several technical design decisions you should consider when you configure a Replicated Map.

Initial Provisioning

If a new member joins the cluster, there are two ways you can handle the initial provisioning that is executed to replicate all existing values to the new member. Each involves how you configure the async fill up.

First, you can configure async fill up to true, which does not block reads while the fill up operation is underway. That way, you have immediate access on the new member, but it will take time until all the values are eventually accessible. Not yet replicated values are returned as non-existing (null).

Second, you can configure for a synchronous initial fill up (by configuring the async fill up to false), which blocks every read or write access to the map until the fill up operation is finished. Use this with caution since it might block your application from operating.

7.17.5. Configuring Replicated Map

Replicated Map can be configured programmatically or declaratively.

Declarative Configuration:

You can declare your Replicated Map configuration in the Hazelcast configuration file hazelcast.xml. See the following example:

<hazelcast>
    ...
    <replicatedmap name="default">
        <in-memory-format>BINARY</in-memory-format>
        <async-fillup>true</async-fillup>
        <statistics-enabled>true</statistics-enabled>
        <entry-listeners>
            <entry-listener include-value="true">
                com.hazelcast.examples.EntryListener
            </entry-listener>
       </entry-listeners>
       <split-brain-protection-ref>quorumname</split-brain-protection-ref>
    </replicatedmap>
    ...
</hazelcast>

Replicated Map has the following configuration elements:

  • in-memory-format: Internal storage format. See the In-Memory Format section. Its default value is OBJECT.

  • async-fillup: Specifies whether the Replicated Map is available for reads before the initial replication is completed. Its default value is true. If set to false, i.e., synchronous initial fill up, no exception is thrown when the Replicated Map is not yet ready, but null values can be seen until the initial replication is completed.

  • statistics-enabled: Specifies whether the statistics gathering is enabled for your Replicated Map. If set to false, you cannot collect statistics in your implementation (using getLocalReplicatedMapStats()) and also Hazelcast Management Center will not show them. Its default value is true.

  • entry-listener: Full canonical classname of the EntryListener implementation.

    • entry-listener#include-value: Specifies whether the event includes the value or not. Sometimes the key is enough to react on an event. In those situations, setting this value to false saves a deserialization cycle. Its default value is true.

    • entry-listener#local: Not used for Replicated Map since listeners are always local.

  • split-brain-protection-ref: Name of quorum configuration that you want this Replicated Map to use. See the Split-Brain Protection for Replicated Map section.

Programmatic Configuration:

You can configure a Replicated Map programmatically, as you can do for all other data structures in Hazelcast. You must create the configuration upfront, when you nstantiate the HazelcastInstance. A basic example of how to configure the Replicated Map using the programmatic approach is shown in the following snippet.

Config config = new Config();

ReplicatedMapConfig replicatedMapConfig =
        config.getReplicatedMapConfig( "default" );

replicatedMapConfig.setInMemoryFormat( InMemoryFormat.BINARY )
        .setSplitBrainProtectionName( "splitbrainprotectionname" );

All properties that can be configured using the declarative configuration are also available using programmatic configuration by transforming the tag names into getter or setter names.

In-Memory Format on Replicated Map

Currently, you can use the following in-memory-format options with the Replicated Map:

  • OBJECT (default): The data is stored in deserialized form. This configuration is the default choice since the data replication is mostly used for high speed access. Please be aware that changing the values without a Map.put() is not reflected on the other members but is visible on the changing members for later value accesses.

  • BINARY: The data is stored in serialized binary format and has to be deserialized on every request. This option offers higher encapsulation since changes to values are always discarded as long as the newly changed object is not explicitly Map.put() into the map again.

7.17.6. Using EntryListener on Replicated Map

A com.hazelcast.core.EntryListener used on a Replicated Map serves the same purpose as it would on other data structures in Hazelcast. You can use it to react on add, update and remove operations. Replicated Maps do not yet support eviction.

Difference in EntryListener on Replicated Map

The fundamental difference in Replicated Map behavior, compared to the other data structures, is that an EntryListener only reflects changes on local data. Since replication is asynchronous, all listener events are fired only when an operation is finished on a local member. Events can fire at different times on different members.

Example of Replicated Map EntryListener

Here is a code example for using EntryListener on a Replicated Map.

The HazelcastInstance s getReplicatedMap method gets a Replicated Map (customers), and the ReplicatedMap s addEntryListener method adds an entry listener to the Replicated Map. Then, the ReplicatedMap s put method adds a Replicated Map entry and updates it. The method remove removes the entry.

    HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    ReplicatedMap<String, String> map = hz.getReplicatedMap("somemap");
    map.addEntryListener(new MyEntryListener());
    System.out.println("EntryListener registered");
}

private static class MyEntryListener implements EntryListener<String, String> {

    @Override
    public void entryAdded(EntryEvent<String, String> event) {
        System.out.println("entryAdded: " + event);
    }

    @Override
    public void entryRemoved(EntryEvent<String, String> event) {
        System.out.println("entryRemoved: " + event);
    }

    @Override
    public void entryUpdated(EntryEvent<String, String> event) {
        System.out.println("entryUpdated: " + event);
    }

    @Override
    public void entryEvicted(EntryEvent<String, String> event) {
        System.out.println("entryEvicted: " + event);
    }
    @Override
    public void entryExpired(EntryEvent<String, String> event) {
        System.out.println( "Entry expired: " + event );
    }
    @Override
    public void mapEvicted(MapEvent event) {
        System.out.println("mapEvicted:" + event);

    }

    @Override
    public void mapCleared(MapEvent event) {
        System.out.println("mapCleared: " + event);
    }

7.17.7. Split-Brain Protection for Replicated Map

Replicated Map can be configured to check for a minimum number of available members before applying its operations (see the Split-Brain Protection section). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

The following is a list of methods, grouped by the protection types, that support split-brain protection checks:

  • WRITE, READ_WRITE:

    • clear

    • put

    • putAll

    • remove

  • READ, READ_WRITE:

    • containsKey

    • containsValue

    • entrySet

    • get

    • isEmpty

    • keySet

    • size

    • values

Configuring Split-Brain Protection

Split-brain protection for Replicated Map can be configured programmatically using the method setSplitBrainProtectionName(), or declaratively using the element split-brain-protection-ref. Following is an example declarative configuration:

<hazelcast>
    ...
    <replicatedmap name="default">
        <split-brain-protection-ref>quorumname</split-brain-protection-ref>
    </replicatedmap>
    ...
</hazelcast>

The value of split-brain-protection-ref should be the split-brain protection configuration name which you configured under the split-brain-protection element as explained in the Split-Brain Protection section.

7.18. Cardinality Estimator Service

Hazelcast’s cardinality estimator service is a data structure which implements Flajolet’s HyperLogLog algorithm for estimating cardinalities of unique objects in theoretically huge data sets. The implementation offered by Hazelcast includes improvements from Google’s version of the algorithm, i.e., HyperLogLog++.

The cardinality estimator service does not provide any ways to configure its properties, but rather uses some well tested defaults:

  • P: Stands for precision with a default value of 14 (using the 14 LSB of the hash for the index)

  • M: 2 ^ P = 16384 (16K) registers

  • P': Stands for sparse precision with a default value of 25

  • Durability: Count of backups for each estimator with a default value of 2

It is important to understand that this data structure is not 100% accurate, it is used to provide estimates. The error rate is typically a result of 1.04/sqrt(M) which in our implementation is around 0.81% for high percentiles.

The memory consumption of this data structure is close to 16K despite the size of elements in the source data set or stream.

There are two phases in using the cardinality estimator.

  1. Add objects to the instance of the estimator, e.g., for IPs estimator.add("0.0.0.0."). The provided object is first serialized and then the byte array is used to generate a hash for that object.

    Objects must be serializable in a form that Hazelcast understands.
  2. Compute the estimate of the set so far estimator.estimate().

See the cardinality estimator Javadoc for more information on its API.

The following is an example code.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
CardinalityEstimator visitorsEstimator = hz.getCardinalityEstimator("visitors");

InputStreamReader isr = new InputStreamReader(ExampleCardinalityEstimator.class.getResourceAsStream("visitors.txt"));
BufferedReader br = new BufferedReader(isr);
try {
    String visitor = br.readLine();
    while (visitor != null) {
        visitorsEstimator.add(visitor);
        visitor = br.readLine();
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    closeResource(br);
    closeResource(isr);
}

System.out.printf("Estimated unique visitors seen so far: %d%n", visitorsEstimator.estimate());

Hazelcast.shutdownAll();

7.18.1. Split-Brain Protection for Cardinality Estimator

Cardinality Estimator can be configured to check for a minimum number of available members before applying its operations (see the Split-Brain Protection section). This is a check to avoid performing successful queue operations on all parts of a cluster during a network partition.

The following is a list of methods, grouped by the protection types, that support split-brain protection checks:

  • WRITE, READ_WRITE:

    • add

    • addAsync

  • READ, READ_WRITE:

    • estimate

    • estimateAsync

Configuring Split-Brain Protection

Split-brain protection for Cardinality Estimator can be configured programmatically using the method setSplitBrainProtectionName(), or declaratively using the element split-brain-protection-ref. Following is an example declarative configuration:

<hazelcast>
    ...
    <cardinality-estimator name="default">
        <split-brain-protection-ref>quorumname</split-brain-protection-ref>
    </cardinality-estimator>
    ...
</hazelcast>

The value of split-brain-protection-ref should be the split-brain protection configuration name which you configured under the split-brain-protection element as explained in the Split-Brain Protection section.

Configuring Merge Policy

While recovering from a split-brain syndrome, Cardinality Estimator in the small cluster merges into the bigger cluster based on a configured merge policy. When an estimator merges into the cluster, an estimator with the same name might already exist in the cluster. So the merge policy resolves these kinds of conflicts with different out-of-the-box strategies. It can be configured programmatically using the method setMergePolicyConfig(), or declaratively using the element merge-policy. Following is an example declarative configuration:

<hazelcast>
    ...
    <cardinality-estimator name="default">
        <merge-policy>HyperLogLogMergePolicy</merge-policy>
    </cardinality-estimator>
    ...
</hazelcast>

The following out-of-the-box merge policies are available:

  • DiscardMergePolicy: Estimator from the smaller cluster is discarded.

  • HyperLogLogMergePolicy: Estimator merges with the existing one, using the algorithmic merge for HyperLogLog. This is the default policy.

  • PassThroughMergePolicy: Estimator from the smaller cluster wins.

  • PutIfAbsentMergePolicy: Estimator from the smaller cluster wins if it doesn’t exist in the cluster.

7.19. Event Journal

The event journal is a distributed data structure that stores the history of mutation actions on map or cache. Each action on the map or cache which modifies its contents (such as put, remove or scheduled tasks which are not triggered by using the public API) creates an event which is stored in the event journal. The event stores the event type as well as the key, old value and updated value for the entry (when applicable). As a user, you can only append to the journal indirectly by using the map and cache methods or configuring the expiration and eviction. By reading from the event journal you can recreate the state of the map or cache at any point in time.

Currently the event journal does not expose a public API for reading the event journal in Hazelcast IMDG. The event journal can be used to stream event data to Hazelcast Jet, so it should be used in conjunction with Hazelcast Jet. Because of this we describe how to configure it but not how to use it from IMDG in this section. If you enable and configure the event journal, you may only reach it through private API and you most probably do not get any benefits but the journal retains events nevertheless and consumes heap space.

The event journal has a fixed capacity and an expiration time. Internally it is structured as a ringbuffer (partitioned by ringbuffer item) and shares many similarities with it.

7.19.1. Interaction with Evictions and Expiration for IMap

Configuring IMap with eviction and expiration can cause the event journal to contain different events on the different replicas of the same partition. You can run into issues if you are reading from the event journal and the partition owner is terminated. A backup replica is then promoted into the partition owner but the event journal will contain different events. The event count should stay the same but the entries which you previously thought were evicted and expired could now be "alive" and vice versa.

This is because eviction and expiration randomly choose entries to be evicted/expired. The entry is not coordinated between partition replicas. In these cases, the event journal diverges and will not converge at any future point, but will remain inconsistent just as well as the contents of the internal record stores are inconsistent between replicas. You may say that the event journal on a specific replica is in-sync with the record store on that replica but the event journals and record stores between replicas are out-of-sync.

7.19.2. Configuring Event Journal Capacity

By default, an event journal is configured with a capacity of 10000 items. This creates a single array per partition, roughly the size of the capacity divided by the number of partitions. Thus, if the configured capacity is 10000 and number of partitions is 271, we create 271 arrays of size 36 (10000/271). If a time-to-live is configured, then an array of longs is also created that stores the expiration time for every item. A single array of the event journal keeps events that are only related to the map entries in that partition. In a lot of cases you may want to change this capacity number to something that better fits your needs. As the capacity is shared between partitions, keep in mind not to set it to a value which is too low for you. Setting the capacity to a number lower than the partition count results in an error when initializing the event journal.

Below is a declarative configuration example of an event journal with a capacity of 5000 items for a map and 10000 items for a cache:

<hazelcast>
    ...
    <event-journal enabled="true">
        <mapName>myMap</mapName>
        <capacity>5000</capacity>
        <time-to-live-seconds>20</time-to-live-seconds>
    </event-journal>
    <event-journal enabled="true">
        <cacheName>myCache</cacheName>
        <capacity>10000</capacity>
        <time-to-live-seconds>0</time-to-live-seconds>
    </event-journal>
    ...
</hazelcast>

You can also configure an event journal programmatically. The following is a programmatic version of the above declarative configuration:

EventJournalConfig eventJournalMapConfig = new EventJournalConfig()
        .setEnabled(true)
        .setCapacity(5000)
        .setTimeToLiveSeconds(20);

EventJournalConfig eventJournalCacheConfig = new EventJournalConfig()
        .setEnabled(true)
        .setCapacity(10000)
        .setTimeToLiveSeconds(0);

Config config = new Config();
config.getMapConfig("myMap").setEventJournalConfig(eventJournalMapConfig);
config.getCacheConfig("myCache").setEventJournalConfig(eventJournalCacheConfig);

The mapName and cacheName attributes define the map or cache to which this event journal configuration applies. You can use pattern-matching and the default keyword when doing so. For instance, by using a mapName of journaled*, the journal configuration applies to all maps whose names start with "journaled" and don’t have other journal configurations that match (e.g., if you would have a more specific journal configuration with an exact name match). If you specify the mapName or cacheName as default, the journal configuration applies to all maps and caches that don’t have any other journal configuration. This means that potentially all maps and/or caches have one single event journal configuration.

7.19.3. Event Journal Partitioning

The event journal is a partitioned data structure. The partitioning is done by the event key. Because of this, the map and cache entry with a specific key is co-located with the events for that key and will be migrated accordingly. Also, the backup count for the event journal is equal to the backup count of the map or cache for which it contains events. The events on the backup replicas will be created with the map or cache backup operations and no additional network traffic is introduced when appending events to the event journal.

7.19.4. Configuring Event Journal time-to-live

You can configure Hazelcast event journal with a time-to-live in seconds. Using this setting, you can control how long the items remain in the event journal before they are expired. By default, the time-to-live is set to 0, meaning that unless the item is overwritten, it remains in the journal indefinitely. The expiration time of the existing journal events is checked whenever a new event is appended to the event journal or when the event journal is being read. If the journal is not being read from or written to, the journal may keep expired items indefinitely.

In the example below, an event journal is configured with a time-to-live of 180 seconds:

<hazelcast>
    ...
    <event-journal enabled="true">
        <cacheName>myCache</cacheName>
        <capacity>10000</capacity>
        <time-to-live-seconds>180</time-to-live-seconds>
    </event-journal>
    ...
</hazelcast>

8. Distributed Events

You can register for Hazelcast entry events so you are notified when those events occur. Event listeners are cluster-wide: when a listener is registered in one member of cluster, it is actually registered for the events that originated at any member in the cluster. When a new member joins, events originated at the new member are also delivered.

An event is created only if you registered an event listener. If no listener is registered, then no event is created. If you provided a predicate when you registered the event listener, pass the predicate before sending the event to the listener (member/client).

As a rule of thumb, your event listener should not implement heavy processes in its event methods that block the thread for a long time. If needed, you can use ExecutorService to transfer long running processes to another thread and thus offload the current listener thread.

In a failover scenario, events are not highly available and may get lost. However, you can perform workarounds such as configuring the event queue capacity as explained in the Global Event Configuration section.

Hazelcast offers the following event listeners.

For cluster events:

  • Membership Listener for cluster membership events

  • Distributed Object Listener for distributed object creation and destruction events

  • Migration Listener for partition migration start and completion events

  • Partition Lost Listener for partition lost events

  • Lifecycle Listener for HazelcastInstance lifecycle events

  • Client Listener for client connection events

For distributed object events:

  • Entry Listener for IMap and MultiMap entry events

  • Item Listener for IQueue, ISet and IList item events

  • Message Listener for ITopic message events

For Hazelcast JCache implementation:

For Hazelcast clients:

  • Lifecycle Listener

  • Membership Listener

  • Distributed Object Listener

8.1. Cluster Events

8.1.1. Listening for Member Events

The Membership Listener interface has methods that are invoked for the following events:

  • memberAdded: A new member is added to the cluster.

  • memberRemoved: An existing member leaves the cluster.

  • memberAttributeChanged: An attribute of a member is changed. See the Defining Member Attributes section to learn about member attributes.

To write a Membership Listener class, you implement the MembershipListener interface and its methods.

The following is an example Membership Listener class.

public class ClusterMembershipListener implements MembershipListener {

    public void memberAdded(MembershipEvent membershipEvent) {
        System.err.println("Added: " + membershipEvent);
    }

    public void memberRemoved(MembershipEvent membershipEvent) {
        System.err.println("Removed: " + membershipEvent);
    }

    public void memberAttributeChanged(MemberAttributeEvent memberAttributeEvent) {
        System.err.println("Member attribute changed: " + memberAttributeEvent);
    }
}

When a respective event is fired, the membership listener outputs the addresses of the members that joined and left, and also which attribute changed on which member.

Registering Membership Listeners

After you create your class, you can configure your cluster to include the membership listener. Below is an example using the method addMembershipListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
hazelcastInstance.getCluster().addMembershipListener( new ClusterMembershipListener() );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

Config config = new Config();
config.addListenerConfig(
new ListenerConfig( "com.yourpackage.ClusterMembershipListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.ClusterMembershipListener
        </listener>
    </listeners>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
    <hz:listener class-name="com.yourpackage.ClusterMembershipListener"/>
    <hz:listener implementation="MembershipListener"/>
</hz:listeners>

8.1.2. Listening for Distributed Object Events

The Distributed Object Listener methods distributedObjectCreated and distributedObjectDestroyed are invoked when a distributed object is created and destroyed throughout the cluster. To write a Distributed Object Listener class, you implement the DistributedObjectListener interface and its methods.

The following is an example Distributed Object Listener class.

public class ExampleDistObjListener implements DistributedObjectListener {

    @Override
    public void distributedObjectCreated(DistributedObjectEvent event) {
        DistributedObject instance = event.getDistributedObject();
        System.out.println("Created " + instance.getName() + ", service=" + instance.getServiceName());
    }

    @Override
    public void distributedObjectDestroyed(DistributedObjectEvent event) {
        System.out.println("Destroyed " + event.getObjectName() + ", service=" + event.getServiceName());
    }
}

When a respective event is fired, the distributed object listener outputs the event type, the object name and a service name (for example, for a Map object the service name is "hz:impl:mapService").

Registering Distributed Object Listeners

After you create your class, you can configure your cluster to include distributed object listeners. Below is an example using the method addDistributedObjectListener. You can also see this portion in the above class creation.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
ExampleDistObjListener example = new ExampleDistObjListener();

hazelcastInstance.addDistributedObjectListener( example );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register the listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

config.addListenerConfig(
new ListenerConfig( "com.yourpackage.ExampleDistObjListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.ExampleDistObjListener
        </listener>
    </listeners>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
    <hz:listener class-name="com.yourpackage.ExampleDistObjListener"/>
    <hz:listener implementation="DistributedObjectListener"/>
</hz:listeners>

8.1.3. Listening for Migration Events

The Migration Listener interface has methods that are invoked for the following events:

  • migrationStarted: The migration starts. A migration consists of a group of replica migrations which are planned together. The MigrationState parameter of the migrationStarted method shows information about the migration: start time of the process, number of the planned migrations, etc.

  • migrationFinished: The migration finishes. MigrationState parameter shows the result of the migration: number of the completed migrations, number of the remaining migrations, total elapsed time, etc.

  • replicaMigrationCompleted: A partition replica migration starts. Method’s parameter, ReplicaMigrationEvent, shows information about a replica migration: partition ID, replica index, source and destination members of the migration and elapsed time for this replica migration. Also it shows the progress of the overall migration: number of the completed and remaining replica migrations and total elapsed time.

  • replicaMigrationFailed: A partition replica migration fails. The MigrationEvent parameter shows the information about this replica migration and overall migration similar to the migrationCompleted method.

To write a Migration Listener class, you implement the MigrationListener interface and its methods.

The following is an example Migration Listener class.

public class ClusterMigrationListener implements MigrationListener {

    @Override
    public void migrationStarted(MigrationState state) {
        System.out.println("Migration Started: " + state);
    }

    @Override
    public void migrationFinished(MigrationState state) {
        System.out.println("Migration Finished: " + state);
    }

    @Override
    public void replicaMigrationCompleted(ReplicaMigrationEvent event) {
        System.out.println("Replica Migration Completed: " + event);
    }

    @Override
    public void replicaMigrationFailed(ReplicaMigrationEvent event) {
        System.out.println("Replica Migration Failed: " + event);
    }
}
Registering Migration Listeners

After you create your class, you can configure your cluster to include migration listeners. Below is an example using the method addMigrationListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

PartitionService partitionService = hazelcastInstance.getPartitionService();
partitionService.addMigrationListener( new ClusterMigrationListener() );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register the listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

config.addListenerConfig(
new ListenerConfig( "com.yourpackage.ClusterMigrationListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.ClusterMigrationListener
        </listener>
    </listeners>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
    <hz:listener class-name="com.yourpackage.ClusterMigrationListener"/>
    <hz:listener implementation="MigrationListener"/>
</hz:listeners>

8.1.4. Listening for Partition Lost Events

Hazelcast provides fault-tolerance by keeping multiple copies of your data. For each partition, one of your cluster members becomes the owner and some of the other members become replica members, based on your configuration. Nevertheless, data loss may occur if a few members crash simultaneously.

Let’s consider the following example with three members: N1, N2, N3 for a given partition-0. N1 is owner of partition-0. N2 and N3 are the first and second replicas respectively. If N1 and N2 crash simultaneously, partition-0 loses its data that is configured with less than two backups. For instance, if we configure a map with one backup, that map loses its data in partition-0 since both owner and first replica of partition-0 have crashed. However, if we configure our map with two backups, it does not lose any data since a copy of partition-0’s data for the given map also resides in N3.

The Partition Lost Listener notifies for possible data loss occurrences with the information of how many replicas are lost for a partition. It listens to PartitionLostEvent instances. Partition lost events are dispatched per partition.

Partition loss detection is done after a member crash is detected by the other members and the crashed member is removed from the cluster. Please note that false-positive PartitionLostEvent instances may be fired on the network split errors.

Writing a Partition Lost Listener Class

To write a Partition Lost Listener, you implement the PartitionLostListener interface and its partitionLost method, which is invoked when a partition loses its owner and all backups.

The following is an example Partition Lost Listener class.

public class ConsoleLoggingPartitionLostListener implements PartitionLostListener {
    @Override
    public void partitionLost(PartitionLostEvent event) {
        System.out.println(event);
    }
}

When a PartitionLostEvent is fired, the partition lost listener given above outputs the partition ID, the replica index that is lost and the member that has detected the partition loss. The following is an example output.

com.hazelcast.partition.PartitionLostEvent{partitionId=242, lostBackupCount=0,
eventSource=Address[192.168.2.49]:5701}
Registering Partition Lost Listeners

After you create your class, you can configure your cluster programmatically or declaratively to include the partition lost listener. Below is an example of its programmatic configuration.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
hazelcastInstance.getPartitionService().addPartitionLostListener( new ConsoleLoggingPartitionLostListener() );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.ConsoleLoggingPartitionLostListener
        </listener>
    </listeners>
    ...
</hazelcast>

8.1.5. Listening for Lifecycle Events

The Lifecycle Listener notifies for the following events:

  • STARTING: A member is starting.

  • STARTED: A member started.

  • SHUTTING_DOWN: A member is shutting down.

  • SHUTDOWN: A member’s shutdown has completed.

  • MERGING: A member is merging with the cluster.

  • MERGED: A member’s merge operation has completed.

  • CLIENT_CONNECTED: A Hazelcast Client connected to the cluster.

  • CLIENT_DISCONNECTED: A Hazelcast Client disconnected from the cluster.

The following is an example Lifecycle Listener class.

public class NodeLifecycleListener implements LifecycleListener {
     @Override
     public void stateChanged(LifecycleEvent event) {
         System.err.println(event);
     }
}

This listener is local to an individual member. It notifies the application that uses Hazelcast about the events mentioned above for a particular member.

Registering Lifecycle Listeners

After you create your class, you can configure your cluster to include lifecycle listeners. Below is an example using the method addLifecycleListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
hazelcastInstance.getLifecycleService().addLifecycleListener( new NodeLifecycleListener() );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register the listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

config.addListenerConfig(
    new ListenerConfig( "com.yourpackage.NodeLifecycleListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.NodeLifecycleListener
        </listener>
    </listeners>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:listeners>
    <hz:listener class-name="com.yourpackage.NodeLifecycleListener"/>
    <hz:listener implementation="LifecycleListener"/>
</hz:listeners>

8.1.6. Listening for Clients

The client listener is used by the Hazelcast cluster members. It notifies the cluster member when a client is connected to or disconnected from it, i.e., the clients fire an event from only one member they are connected to. Other cluster members do not fire a "client is connected" or "client is disconnected" event.

To write a client listener class, you implement the ClientListener interface and its methods clientConnected and clientDisconnected, which are invoked when a client is connected to or disconnected from the cluster. You can add your client listener as shown below.

hazelcastInstance.getClientService().addClientListener(new ExampleClientListener());

The following is the equivalent declarative configuration.

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.ExampleClientListener
        </listener>
    </listeners>
    ...
</hazelcast>

The following is the equivalent configuration in the Spring context.

<hz:listeners>
    <hz:listener class-name="com.yourpackage.ExampleClientListener"/>
    <hz:listener implementation="com.yourpackage.ExampleClientListener"/>
</hz:listeners>
You can also add event listeners to a Hazelcast client. See the Client Listenerconfig section for the related information.

8.2. Distributed Object Events

8.2.1. Listening for Map Events

You can listen to map-wide or entry-based events using the listeners provided by the Hazelcast’s eventing framework. To listen to these events, implement a MapListener sub-interface.

A map-wide event is fired as a result of a map-wide operation. For example, IMap.clear() or IMap.evictAll(). An entry-based event is fired after the operations that affect a specific entry. For example, IMap.remove() or IMap.evict().

Catching a Map Event

To catch an event, you should explicitly implement a corresponding sub-interface of a MapListener, such as EntryAddedListener or MapClearedListener.

The EntryListener interface still can be implemented (we kept it for backward compatibility reasons). However, if you need to listen to a different event, one that is not available in the EntryListener interface, you should also implement a relevant MapListener sub-interface.

Let’s take a look at the following class example.

public class Listen {

    public static void main( String[] args ) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap( "somemap" );
        map.addEntryListener( new MyEntryListener(), true );
        System.out.println( "EntryListener registered" );
    }

    static class MyEntryListener implements
            EntryAddedListener<String, String>,
            EntryRemovedListener<String, String>,
            EntryUpdatedListener<String, String>,
            EntryEvictedListener<String, String>,
            EntryLoadedListener<String,String>,
            MapEvictedListener,
            MapClearedListener   {
        @Override
        public void entryAdded( EntryEvent<String, String> event ) {
            System.out.println( "Entry Added:" + event );
        }

        @Override
        public void entryRemoved( EntryEvent<String, String> event ) {
            System.out.println( "Entry Removed:" + event );
        }

        @Override
        public void entryUpdated( EntryEvent<String, String> event ) {
            System.out.println( "Entry Updated:" + event );
        }

        @Override
        public void entryEvicted( EntryEvent<String, String> event ) {
            System.out.println( "Entry Evicted:" + event );
        }

        @Override
        public void entryLoaded( EntryEvent<String, String> event ) {
            System.out.println( "Entry Loaded:" + event );
        }

        @Override
        public void mapEvicted( MapEvent event ) {
            System.out.println( "Map Evicted:" + event );
        }

        @Override
        public void mapCleared( MapEvent event ) {
            System.out.println( "Map Cleared:" + event );
        }
    }
}

Now, let’s perform some modifications on the map entries using the following example code.

public class ModifyMap {

    public static void main( String[] args ) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap( "somemap");
        String key = "" + System.nanoTime();
        String value = "1";
        map.put( key, value );
        map.put( key, "2" );
        map.delete( key );
    }
}

If you execute the Listen class and then the Modify class, you get the following output produced by the Listen class.

Entry Added:EntryEvent{entryEventType=ADDED, member=Member [192.168.1.100]]:5702
 - ffedb655-bbad-43ea-aee8-d429d37ce528, name='somemap', key=11455268066242,
 oldValue=null, value=1, mergingValue=null}

Entry Updated:EntryEvent{entryEventType=UPDATED, member=Member [192.168.1.100]]:5702
 - ffedb655-bbad-43ea-aee8-d429d37ce528, name='somemap', key=11455268066242,
 oldValue=1, value=2, mergingValue=null}

Entry Removed:EntryEvent{entryEventType=REMOVED, member=Member [192.168.1.100]]:5702
 - ffedb655-bbad-43ea-aee8-d429d37ce528, name='somemap', key=11455268066242,
 oldValue=null, value=null, mergingValue=null}
Please note that the method IMap.clear() does not fire an "EntryRemoved" event, but fires a "MapCleared" event.
Listeners have to offload all blocking operations to another thread (pool).

8.2.2. Listening for Lost Map Partitions

You can listen to MapPartitionLostEvent instances by registering an implementation of MapPartitionLostListener, which is also a sub-interface of MapListener.

Let’s consider the following example code:

public class ListenMapPartitionLostEvents {

    public static void main(String[] args) {
        Config config = new Config();
        // keeps its data if a single node crashes
        config.getMapConfig("map").setBackupCount(1);

        HazelcastInstance instance = HazelcastInstanceFactory.newHazelcastInstance(config);

        IMap<Object, Object> map = instance.getMap("map");
        map.put(0, 0);

        map.addPartitionLostListener(new MapPartitionLostListener() {
            @Override
            public void partitionLost(MapPartitionLostEvent event) {
                System.out.println(event);
            }
        });
    }
}

Within this example code, a MapPartitionLostListener implementation is registered to a map that is configured with one backup. For this particular map and any of the partitions in the system, if the partition owner member and its first backup member crash simultaneously, the given MapPartitionLostListener receives a corresponding MapPartitionLostEvent. If only a single member crashes in the cluster, there is no MapPartitionLostEvent fired for this map since backups for the partitions owned by the crashed member are kept on other members.

See the Listening for Partition Lost Events section for more information about partition lost detection and partition lost events.

Registering Map Listeners

After you create your listener class, you can configure your cluster to include map listeners using the method addEntryListener (as you can see in the example Listen class above). Below is the related portion from this code, showing how to register a map listener.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IMap<String, String> map = hz.getMap( "somemap" );
map.addEntryListener( new MyEntryListener(), true );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

mapConfig.addEntryListenerConfig(
new EntryListenerConfig( "com.yourpackage.MyEntryListener",
                                 false, false ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <map name="somemap">
        <entry-listeners>
            <entry-listener include-value="false" local="false">
                com.yourpackage.MyEntryListener
            </entry-listener>
        </entry-listeners>
    </map>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:map name="somemap">
    <hz:entry-listeners>
        <hz:entry-listener include-value="true"
            class-name="com.hazelcast.spring.DummyEntryListener"/>
        <hz:entry-listener implementation="dummyEntryListener" local="true"/>
    </hz:entry-listeners>
</hz:map>
Map Listener Attributes

As you see, there are attributes of the map listeners in the above examples: include-value and local. The attribute include-value is a boolean attribute that is optional, and if you set it to true, the map event contains the map value. Its default value is true.

The attribute local is also a boolean attribute that is optional, and if you set it to true, you can listen to the map on the local member. Its default value is false.

8.2.3. Listening for MultiMap Events

You can listen to entry-based events in the MultiMap using EntryListener. The following is an example entry listener implementation for MultiMap.

public class ExampleEntryListener implements EntryListener<String, String> {
    @Override
    public void entryAdded(EntryEvent<String, String> event) {
        System.out.println("Entry Added: " + event);
    }
    @Override
    public void entryRemoved( EntryEvent<String, String> event ) {
        System.out.println( "Entry Removed: " + event );
    }
    @Override
    public void entryUpdated(EntryEvent<String, String> event) {
        System.out.println( "Entry Updated: " + event );
    }
    @Override
    public void entryEvicted(EntryEvent<String, String> event) {
        System.out.println( "Entry evicted: " + event );
    }
    @Override
    public void entryExpired(EntryEvent<String, String> event) {
        System.out.println( "Entry expired: " + event );
    }
    @Override
    public void mapCleared(MapEvent event) {
        System.out.println( "Map Cleared: " + event );
    }
    @Override
    public void mapEvicted(MapEvent event) {
        System.out.println( "Map Evicted: " + event );
    }
}
Registering MultiMap Listeners

After you create your listener class, you can configure your cluster to include MultiMap listeners using the method addEntryListener. Below is the related portion from a code, showing how to register a map listener.

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
MultiMap<String, String> map = hz.getMultiMap( "somemap" );
map.addEntryListener( new ExampleEntryListener(), true );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

multiMapConfig.addEntryListenerConfig(
  new EntryListenerConfig( "com.yourpackage.ExampleEntryListener",
    false, false ) );
[source,xml]

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <multimap name="somemap">
        <value-collection-type>SET</value-collection-type>
        <entry-listeners>
            <entry-listener include-value="false" local="false">
                com.yourpackage.ExampleEntryListener
            </entry-listener>
        </entry-listeners>
    </multimap>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:multimap name="somemap" value-collection-type="SET">
    <hz:entry-listeners>
        <hz:entry-listener include-value="false"
            class-name="com.yourpackage.ExampleEntryListener"/>
        <hz:entry-listener implementation="EntryListener" local="false"/>
    </hz:entry-listeners>
</hz:multimap>
MultiMap Listener Attributes

As you see, there are attributes of the MultiMap listeners in the above examples: include-value and local. The attribute include-value is a boolean attribute that is optional, and if you set it to true, the MultiMap event contains the map value. Its default value is true.

The attribute local is also a boolean attribute that is optional, and if you set it to true, you can listen to the MultiMap on the local member. Its default value is false.

8.2.4. Listening for Item Events

The Item Listener is used by the Hazelcast IQueue, ISet and IList interfaces.

To write an Item Listener class, you implement the ItemListener interface and its methods itemAdded and itemRemoved. These methods are invoked when an item is added or removed.

The following is an example Item Listener class for an ISet structure.

public class ExampleItemListener implements ItemListener<Price> {

    @Override
    public void itemAdded(ItemEvent<Price> event) {
        System.out.println( "Item added:  " + event );
    }

    @Override
    public void itemRemoved(ItemEvent<Price> event) {
        System.out.println( "Item removed: " + event );
    }
}
You can use ICollection when creating any of the collection (queue, set and list) data structures, as shown above. You can also use IQueue, ISet or IList instead of ICollection.
Registering Item Listeners

After you create your class, you can configure your cluster to include item listeners. Below is an example using the method addItemListener for ISet (it applies also to IQueue and IList). You can also see this portion in the above class creation.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

ICollection<Price> set = hazelcastInstance.getSet( "default" );
// or ISet<Prices> set = hazelcastInstance.getSet( "default" );
set.addItemListener( new ExampleItemListener(), true );

With the above approach, there is the possibility of missing events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register listeners in the configuration. You can register listeners using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

setConfig.addItemListenerConfig(
new ItemListenerConfig( "com.yourpackage.ExampleItemListener", true ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <set>
        <item-listeners>
            <item-listener include-value="true">
                com.yourpackage.ExampleItemListener
            </item-listener>
        </item-listeners>
    </set>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:set name="default" >
    <hz:item-listeners>
        <hz:item-listener include-value="true"
            class-name="com.yourpackage.ExampleItemListener"/>
    </hz:item-listeners>
</hz:set>
Item Listener Attributes

As you see, there is an attribute in the above examples: include-value. It is a boolean attribute that is optional, and if you set it to true, the item event contains the item value. Its default value is true.

There is also another attribute called local, which is not shown in the above examples. It is also a boolean attribute that is optional, and if you set it to true, you can listen to the items on the local member. Its default value is false.

8.2.5. Listening for Topic Messages

The Message Listener is used by the ITopic interface. It notifies when a message is received for the registered topic.

To write a Message Listener class, you implement the MessageListener interface and its method onMessage, which is invoked when a message is received for the registered topic.

The following is an example Message Listener class.

public class ExampleMessageListener implements MessageListener<MyEvent> {

    public void onMessage( Message<MyEvent> message ) {
        MyEvent myEvent = message.getMessageObject();
        System.out.println( "Message received = " + myEvent.toString() );
    }
}
Registering Message Listeners

After you create your class, you can configure your cluster to include message listeners. Below is an example using the method addMessageListener.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

ITopic topic = hazelcastInstance.getTopic( "default" );
topic.addMessageListener( new ExampleMessageListener() );

With the above approach, there is the possibility of missing messaging events between the creation of the instance and registering the listener. To overcome this race condition, Hazelcast allows you to register this listener in the configuration. You can register it using declarative, programmatic, or Spring configuration, as shown below.

The following is an example programmatic configuration.

topicConfig.addMessageListenerConfig(
  new ListenerConfig( "com.yourpackage.ExampleMessageListener" ) );

The following is an example of the equivalent declarative configuration.

<hazelcast>
    ...
    <topic name="default">
        <message-listeners>
            <message-listener>
                com.yourpackage.ExampleMessageListener
            </message-listener>
        </message-listeners>
    </topic>
    ...
</hazelcast>

The following is an example of the equivalent Spring configuration.

<hz:topic name="default">
    <hz:message-listeners>
        <hz:message-listener
            class-name="com.yourpackage.ExampleMessageListener"/>
    </hz:message-listeners>
</hz:topic>

8.3. Event Listeners for Hazelcast Clients

You can add event listeners to a Hazelcast Java client. You can configure the following listeners to listen to the events on the client side. See the respective content under the Cluster Events section for example codes.

  • Lifecycle Listener: Notifies when the client is starting, started, shutting down and shutdown.

  • Membership Listener: Notifies when a member joins to/leaves the cluster to which the client is connected, or when an attribute is changed in a member.

  • DistributedObject Listener: Notifies when a distributed object is created or destroyed throughout the cluster to which the client is connected.

See the Configuring Client Listeners section for more information.

8.4. Global Event Configuration

  • hazelcast.event.queue.capacity: default value is 1000000

  • hazelcast.event.queue.timeout.millis: default value is 250

  • hazelcast.event.thread.count: default value is 5

A striped executor in each cluster member controls and dispatches the received events. This striped executor also guarantees the event order. For all events in Hazelcast, the order in which events are generated and the order in which they are published are guaranteed for given keys. For map and multimap, the order is preserved for the operations on the same key of the entry. For list, set, topic and queue, the order is preserved for events on that instance of the distributed data structure.

To achieve the order guarantee, you make only one thread responsible for a particular set of events (entry events of a key in a map, item events of a collection, etc.) in StripedExecutor (within com.hazelcast.util.executor).

If the event queue reaches its capacity (hazelcast.event.queue.capacity) and the last item cannot be put into the event queue for the period specified in hazelcast.event.queue.timeout.millis, these events are dropped with a warning message, such as "EventQueue overloaded".

If event listeners perform a computation that takes a long time, the event queue can reach its maximum capacity and lose events. For map and multimap, you can configure hazelcast.event.thread.count to a higher value so that fewer collisions occur for keys, and therefore worker threads do not block each other in StripedExecutor. For list, set, topic and queue, you should offload heavy work to another thread. To preserve order guarantee, you should implement similar logic with StripedExecutor in the offloaded thread pool.

9. Hazelcast Jet

This chapter briefly describes Hazelcast Jet. For detailed information and Jet documentation, please visit jet.hazelcast.org.

9.1. Overview

Hazelcast Jet, built on top of the Hazelcast IMDG, is a distributed processing engine for fast stream and batch processing of large data sets. It reuses the features and services of Hazelcast IMDG, but it is a separate product with features not available in IMDG.

With Hazelcast IMDG providing storage functionality, Jet performs parallel execution in a Hazelcast Jet cluster, composed of Jet instances, to enable data-intensive applications to operate in near real-time. Jet uses green threads (threads that are scheduled by a runtime library or VM) to achieve this parallel execution.

Since Jet uses Hazelcast IMDG’s discovery mechanisms, it can be used both on-premises and on the cloud environments. Hazelcast Jet typically runs on several machines that form a cluster.

9.1.1. How You Can Use It

The Pipeline API is the primary high-level API of Hazelcast Jet for batch and stream processing. This API is easy-to-use and set-up providing you with the tools to compose batch computations from building blocks such as filters, aggregators and joiners - saving time and resource. With Pipeline API, you can build bounded and unbounded data pipelines on a variety of sources and sinks.

In addition to the Pipeline API, Jet also offers a distributed implementation of java.util.stream. You can express your computation over any data source Jet supports using the familiar API from the JDK 8. This distributed implementation can be used for simple transform and reduce operations on top of IMap and IList.

There is also Jet’s Core API for advanced users to build custom data sources and sinks, to have a low-level control over the data flow, to fine-tune performance and build DSLs.

See the Work with Jet section in the Hazelcast Jet Reference Manual to see a simple example.

9.1.2. Where You Can Use It

Hazelcast Jet is appropriate for applications that require a near real-time experience such as operations in IoT architectures (house thermostats, lighting systems, etc.), in-store e-commerce systems and social media platforms. Typical use cases include the following:

  • Real-time (low-latency) stream processing

  • Fast batch processing

  • Streaming analytics

  • Complex event processing

  • Implementing event sourcing and CQRS (Command Query Responsibility Segregation)

  • Internet-of-things (IoT) data ingestion, processing and storage

  • Data processing microservice architectures

  • Online trading

  • Social media platforms

  • System log events

The aforementioned use cases require huge amounts of data to be processed in near real-time. Hazelcast Jet achieves this by processing the incoming records as soon as possible, hence lowering the latency, and ingesting the data at high-velocity. Jet’s execution model and keeping both the computation and data storage in memory enables high application speeds.

9.1.3. Data Processing Styles

The data processing is traditionally divided into batch and stream processing.

Batch data is considered as bounded, i.e., finite, and fast batch processing typically may refer to running a job on a data set which is available in a data center. You simply provide one or more pre-existing datasets and order Hazelcast Jet to mine them for the information you need.

Stream data is considered as unbounded, i.e., infinite, and infinite stream processing deals with in-flight data before it is stored. It offers lower latency; data is processed on-the-fly and you do not have to wait for the whole data set to arrive in order to run a computation.

9.2. Relationship with Hazelcast IMDG

Hazelcast Jet leans on Hazelcast IMDG for cluster management and deployment, data partitioning and networking; all the services of IMDG are available to your Jet Jobs (units of work which are executed). A Jet instance is also a fully functional Hazelcast IMDG instance and a Jet cluster is also a Hazelcast IMDG cluster.

A Jet job is implemented as a Hazelcast IMDG proxy, similar to the other services and data structures in Hazelcast. Hazelcast operations are used for different actions that can be performed on a job. Jet can also be used with the Hazelcast Client, which uses the Hazelcast Open Binary Protocol to communicate different actions to the server instance.

In the Hazelcast Jet world, Hazelcast IMDG can be used for data ingestion prior to processing, connecting multiple Jet jobs, enriching processed events, caching the remote data, distributing Jet-processed data and running advanced data processing tasks on top of IMDG data structures.

Hazelcast Jet can use Hazelcast IMDG’s IMap, ICache and IList on the embedded cluster as sources (data structures from which Jet reads data) and sinks (data structures to which Jet writes data). IMap and ICache are partitioned data structures distributed across the cluster and Jet members can read from these structures by having each member read just its local partitions. Hazelcast IMDG’s IList is stored on a single partition; all the data is read on the single member that owns that partition. See the IMap and ICache and IList sections in the Hazelcast Jet Reference Manual to learn how Jet uses these IMDG data structures. In addition to these data structures, Jet can also process a stream of changes of IMap and ICache, using the Event Journal.

You can use Hazelcast Jet with embedded Hazelcast IMDG or a remote Hazelcast IMDG cluster. Benefits of using Hazelcast Jet with embedded Hazelcast IMDG are as follows:

  • sharing the processing state among Jet Jobs

  • caching intermediate processing results

  • enriching processed events; cache remote data, e.g., fact tables from a database, on the Jet members

  • running advanced data processing tasks on top of Hazelcast data structures

  • improving development processes by making start up of a Jet cluster simple and fast

Jet Jobs use Hazelcast IMDG connector by allowing reading and writing records to/from a remote Hazelcast IMDG instance. You can use a remote Hazelcast IMDG cluster for the following cases:

  • distributing data across IMap, ICache and IList structures

  • sharing state or intermediate results among more Jet clusters

  • isolating the processing cluster (Jet) from operational data storage cluster (IMDG)

  • publishing intermediate results, e.g., to show real-time processing stats on a dashboard

9.3. Hazelcast IMDG Computing vs. Hazelcast Jet

As described in the Aggregations section Hazelcast IMDG has native support for aggregation operations on the contents of its distributed map (IMap) instances.

Aggregations are a good fit for simple operations (count, distinct, sum, avg, min, max, etc.). However, they may not be sufficient for operations that group data by key and produce the results of size O(keyCount). The architecture of Hazelcast aggregations is not well suited to this use case, although it still works even for moderately sized results (up to 100 MB, as a ballpark figure). Hazelcast Jet can be the preferred choice for larger sized results and whenever something more than a single aggregation step is needed. See the Jet Compared with New Aggregations section.

Another Hazelcast IMDG computing feature is Entry Processors. They are used for fast mutating operations in an atomic way, in which the map entry is mutated by executing logic directly on the JVM where the data resides. And this means the network hops are reduced and atomicity is provided in a single step. Keeping this in mind, you can use Hazelcast IMDG Entry Processors when they perform bulk mutations of an IMap, where the processing function is fast and involves a single map entry per call. On the other hand, you can prefer to use Hazelcast Jet when the processing involves multiple entries (aggregations, joins, etc.), or involves multiple computing steps to be made parallel, or when the data source and sink are not a single IMap instance.

10. Distributed Computing

This chapter explains Hazelcast’s executor service, durable/scheduled executor services and entry processor implementations.

10.1. Executor Service

One of the coolest features of Java is the Executor framework, which allows you to asynchronously execute your tasks (logical units of work), such as database queries, complex calculations and image rendering.

The default implementation of this framework (ThreadPoolExecutor) is designed to run within a single JVM (cluster member). In distributed systems, this implementation is not desired since you may want a task submitted in one JVM and processed in another one. Hazelcast offers IExecutorService for you to use in distributed environments. It implements java.util.concurrent.ExecutorService to serve the applications requiring computational and data processing power.


Note that you may want to use Hazelcast Jet if you want to process batch or real-time streaming data. See the Fast Batch Processing and Real-Time Stream Processing use cases for Hazelcast Jet.

With IExecutorService, you can execute tasks asynchronously and perform other useful tasks. If your task execution takes longer than expected, you can cancel the task execution. Tasks should be Serializable since they are distributed.

In the Java Executor framework, you implement tasks two ways: Callable or Runnable.

  • Callable: If you need to return a value and submit it to Executor, implement the task as java.util.concurrent.Callable.

  • Runnable: If you do not need to return a value, implement the task as java.util.concurrent.Runnable.

Note that, the distributed executor service (IExecutorService) is intended to run processing where the data is hosted: on the server members. In general, you cannot run a Java Runnable or Callable on the clients as the clients may not be Java. Also, the clients do not host any data, so they would have to fetch what data they need from the servers potentially. If you want something to run on all or some clients connected to your cluster, you could implement this using the publish/subscribe mechanism; a payload could be sent to an ITopic with the necessary execution parameters, and clients listening can act on the message.

10.1.1. Implementing a Callable Task

In Hazelcast, when you implement a task as java.util.concurrent.Callable (a task that returns a value), you implement Callable and Serializable.

Below is an example of a Callable task. SumTask prints out map keys and returns the summed map values.

public class SumTask
        implements Callable<Integer>, Serializable, HazelcastInstanceAware {

    private transient HazelcastInstance hazelcastInstance;

    public void setHazelcastInstance( HazelcastInstance hazelcastInstance ) {
        this.hazelcastInstance = hazelcastInstance;
    }

    public Integer call() throws Exception {
        IMap<String, Integer> map = hazelcastInstance.getMap( "map" );
        int result = 0;
        for ( String key : map.localKeySet() ) {
            System.out.println( "Calculating for key: " + key );
            result += map.get( key );
        }
        System.out.println( "Local Result: " + result );
        return result;
    }
}

Another example is the Echo callable below. In its call() method, it returns the local member and the input passed in. Remember that instance.getCluster().getLocalMember() returns the local member and toString() returns the member’s address (IP + port) in String form, just to see which member actually executed the code for our example. Of course, the call() method can do and return anything you like.

public class Echo implements Callable<String>, Serializable, HazelcastInstanceAware {
    String input = null;

    private transient HazelcastInstance hazelcastInstance;

    public Echo() {
    }

    public void setHazelcastInstance( HazelcastInstance hazelcastInstance ) {
        this.hazelcastInstance = hazelcastInstance;
    }

    public Echo(String input) {
        this.input = input;
    }

    public String call() {
        return hazelcastInstance.getCluster().getLocalMember().toString() + ":" + input;
    }
}
Executing a Callable Task

To execute a callable task:

  • retrieve the Executor from HazelcastInstance

  • submit a task which returns a Future

  • after executing the task, you do not have to wait for the execution to complete, you can process other things

  • when ready, use the Future object to retrieve the result as shown in the code example below.

Below, the Echo task is executed.

public class MasterMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance instance = Hazelcast.newHazelcastInstance();
        IExecutorService executorService = instance.getExecutorService( "executorService" );
        Future<String> future = executorService.submit( new Echo( "myinput") );
        //while it is executing, do some useful stuff
        //when ready, get the result of your execution
        String result = future.get();
    }
}

Please note that the Echo callable in the above example also implements a Serializable interface, since it may be sent to another member to be processed.

When a task is deserialized, HazelcastInstance needs to be accessed. To do this, the task should implement HazelcastInstanceAware interface. See the HazelcastInstanceAware Interface section for more information.

10.1.2. Implementing a Runnable Task

In Hazelcast, when you implement a task as java.util.concurrent.runnable (a task that does not return a value), you implement Runnable and Serializable.

Below is Runnable example code. It is a task that waits for some time and echoes a message.

public class EchoTask implements Runnable, Serializable {

    private final String msg;

    public EchoTask( String msg ) {
        this.msg = msg;
    }

    @Override
    public void run() {
        try {
            Thread.sleep( 5000 );
        } catch ( InterruptedException e ) {
        }
        System.out.println( "echo:" + msg );
    }
}
Executing a Runnable Task

To execute the runnable task:

  • retrieve the Executor from HazelcastInstance

  • submit the tasks to the Executor.

Now let’s write a class that submits and executes these echo messages. Executor is retrieved from HazelcastInstance and 1000 echo tasks are submitted.

public class RunnableMasterMember {

    public static void main( String[] args ) throws Exception {
        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
        IExecutorService executor = hazelcastInstance.getExecutorService( "exec" );
        for ( int k = 1; k <= 1000; k++ ) {
            Thread.sleep( 1000 );
            System.out.println( "Producing echo task: " + k );
            executor.execute( new EchoTask( String.valueOf( k ) ) );
        }
        System.out.println( "EchoTaskMain finished!" );
    }
}

10.1.3. Scaling The Executor Service

You can scale the Executor service both vertically (scale up) and horizontally (scale out).

To scale up, you should improve the processing capacity of the cluster member (JVM). You can do this by increasing the pool-size property mentioned in Configuring Executor Service (i.e., increasing the thread count). However, please be aware of your member’s capacity. If you think it cannot handle such an additional load caused by increasing the thread count, you may want to consider improving the member’s resources (CPU, memory, etc.). As an example, set the pool-size to 5 and run the above MasterMember. You will see that EchoTask is run as soon as it is produced.

To scale out, add more members instead of increasing only one member’s capacity. In reality, you may want to expand your cluster by adding more physical or virtual machines. For example, in the EchoTask example in the Runnable section, you can create another Hazelcast instance. That instance automatically gets involved in the executions started in MasterMember and start processing.

10.1.4. Executing Code in the Cluster

The distributed executor service is a distributed implementation of java.util.concurrent.ExecutorService. It allows you to execute your code in the cluster. In this section, the code examples are based on the Echo class above (please note that the Echo class is Serializable). The code examples show how Hazelcast can execute your code (Runnable, Callable):

  • echoOnTheMember: On a specific cluster member you choose with the IExecutorService submitToMember method.

  • echoOnTheMemberOwningTheKey: On the member owning the key you choose with the IExecutorService submitToKeyOwner method.

  • echoOnSomewhere: On the member Hazelcast picks with the IExecutorService submit method.

  • echoOnMembers: On all or a subset of the cluster members with the IExecutorService submitToMembers method.

public void echoOnTheMember( String input, Member member ) throws Exception {
    Callable<String> task = new Echo( input );
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Future<String> future = executorService.submitToMember( task, member );
    String echoResult = future.get();
}
public void echoOnTheMemberOwningTheKey( String input, Object key ) throws Exception {
    Callable<String> task = new Echo( input );
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Future<String> future = executorService.submitToKeyOwner( task, key );
    String echoResult = future.get();
}
public void echoOnSomewhere( String input ) throws Exception {
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Future<String> future = executorService.submit( new Echo( input ) );
    String echoResult = future.get();
}
public void echoOnMembers( String input, Set<Member> members ) throws Exception {
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService executorService =
      hazelcastInstance.getExecutorService( "default" );

    Map<Member, Future<String>> futures = executorService
      .submitToMembers( new Echo( input ), members );

    for ( Future<String> future : futures.values() ) {
        String echoResult = future.get();
        // ...
    }
}
You can obtain the set of cluster members via HazelcastInstance.getCluster().getMembers() call.

10.1.5. Canceling an Executing Task

A task in the code that you execute in a cluster might take longer than expected. If you cannot stop/cancel that task, it keeps eating your resources.

To cancel a task, you can use the standard Java executor framework’s cancel() API. This framework encourages us to code and design for cancellations, a highly ignored part of software development.

Example Task to Cancel

The Fibonacci callable class below calculates the Fibonacci number for a given number. In the calculate method, we check if the current thread is interrupted so that the code can respond to cancellations once the execution is started.

int input = 0;

public FibonacciCallable( int input ) {
    this.input = input;
}

public Long call() {
    return calculate( input );
}

private long calculate( int n ) {
    if ( Thread.currentThread().isInterrupted() ) {
        return 0;
    }
    if ( n <= 1 ) {
        return n;
    } else {
        return calculate( n - 1 ) + calculate( n - 2 );
    }
}
Example Method to Execute and Cancel the Task

The fib() method below submits the Fibonacci calculation task above for number 'n' and waits a maximum of 3 seconds for the result. If the execution does not completed in three seconds, the future.get() method throws a TimeoutException and upon catching it, we cancel the execution, saving some CPU cycles.

long fib( int n ) throws Exception {
    HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
    IExecutorService es = hazelcastInstance.getExecutorService("es");
    Future<Long> future = es.submit( new FibonacciCallable( n ) );
    try {
        long result = future.get( 3, TimeUnit.SECONDS );
        System.out.println(result);
    } catch ( TimeoutException e ) {
        future.cancel( true );
    }
    return -1;
}

fib(20) probably takes less than 3 seconds. However, fib(50) takes much longer. (This is not an example for writing better Fibonacci calculation code, but for showing how to cancel a running execution that takes too long.) The method future.cancel(false) can only cancel execution before it is running (executing), but future.cancel(true) can interrupt running executions provided that your code is able to handle the interruption. If you are willing to cancel an already running task, then your task should be designed to handle interruptions. If the calculate (int n) method did not have the (Thread.currentThread().isInterrupted()) line, then you would not be able to cancel the execution after it is started.

10.1.6. Callback When Task Completes

You can use the ExecutionCallback offered by Hazelcast to asynchronously be notified when the execution is done. To be notified when your task completes without an error, implement the onResponse method. To be notified when your task completes with an error, implement the onFailure method.

Example Task to Callback

Let’s use the Fibonacci series to explain this. The example code below is the calculation that is executed. Note that it is Callable and Serializable.

public class Fibonacci2 implements Callable<Long>, Serializable {

    private final int input;

    public Fibonacci2(int input) {
        this.input = input;
    }

    public Long call() {
        return calculate(input);
    }

    private long calculate(int n) {
        if (Thread.currentThread().isInterrupted()) {
            System.out.println(