Chapter 17. Internals

Table of Contents

17.1. Internals 1: Threads
17.2. Internals 2: Serialization
17.3. Internals 3: Cluster Membership
17.4. Internals 4: Distributed Map

17.1. Internals 1: Threads

In this section, we will go over Hazelcast's internal threads, the client threads executing the Hazelcast API, and how these threads work together. When developing Hazelcast, you should know which thread will execute your code, which variables are local to that thread, and how you should interact with other threads.

1. Client Threads:

Client threads are your threads: the user's application threads, or the application/web server's threads, that execute the Hazelcast API. For example, Hazelcast.getQueue("myqueue"), map.put(key, value) and set.size() calls are initiated and finalized in the client threads. Serialization of the objects also happens in the client threads, which also eliminates the problems associated with classloaders. Client threads initiate the calls and serialize the objects into a Hazelcast com.hazelcast.nio.Data object, which holds a java.nio.ByteBuffer. A Data object is the binary representation of the application objects (key, value, item, callable objects). All Hazelcast threads, such as ServiceThread, InThread and OutThread, work with Data objects; they don't know anything about the actual application objects. When a call is finalized, if the return type is an object, a Data object is returned to the client thread and the client thread then deserializes the Data (binary representation) back into the application object.

For each client thread, there is a com.hazelcast.impl.ThreadContext thread-local instance attached, which holds per-thread context information such as the current transaction.
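For illustration, here is a minimal sketch of client-thread usage, assuming the static com.hazelcast.core.Hazelcast API used elsewhere in this manual; the map/queue names and values are made up. The calls below are initiated, serialized and finalized on the calling application thread.

  import com.hazelcast.core.Hazelcast;
  import com.hazelcast.core.IMap;
  import com.hazelcast.core.IQueue;

  public class ClientThreadExample {
      public static void main(String[] args) {
          // These calls run on this (client) thread.
          IMap<String, String> map = Hazelcast.getMap("customers");
          IQueue<String> queue = Hazelcast.getQueue("myqueue");

          // The key and value are serialized into Data on this thread
          // before being handed to ServiceThread/OutThread.
          map.put("1", "Joe");
          queue.offer("task-1");

          System.out.println("size=" + map.size());
      }
  }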

2. ServiceThread:

ServiceThread, implemented at com.hazelcast.impl.ClusterService, is the most significant thread in Hazelcast. Almost all non-IO operations happen in this thread. ServiceThread serves local client calls and the calls coming from other members.

If you look at the ClusterService class, you will see that it constantly dequeues its queue and processes local and remote events. The ClusterService queue receives local events in the form of com.hazelcast.impl.BaseManager.Processable instances and remote events in the form of com.hazelcast.nio.PacketQueue.Packet instances from InThread.

All distributed data structures (queue, map, set) are eventually modified in this thread, so there is no synchronization needed when the data structures are accessed/modified.
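The pattern here is a plain single-threaded event loop. The following sketch is not Hazelcast's actual code; with made-up names, it only illustrates how a single thread can drain a queue of Processable-style tasks and remain the only thread that ever touches the shared state.

  import java.util.HashMap;
  import java.util.Map;
  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.LinkedBlockingQueue;

  // Illustrative only: one service thread drains a task queue and is the
  // sole mutator of the (therefore unsynchronized) data structures.
  public class ServiceLoopSketch implements Runnable {

      interface Processable {        // stands in for BaseManager.Processable
          void process();
      }

      private final BlockingQueue<Processable> queue = new LinkedBlockingQueue<Processable>();
      // Only ever read/written from run(), so no synchronization is needed.
      private final Map<Object, Object> state = new HashMap<Object, Object>();
      private volatile boolean running = true;

      public void enqueue(Processable task) {   // called from client/IO threads
          queue.offer(task);
      }

      public void run() {                       // the "ServiceThread" loop
          while (running) {
              try {
                  Processable task = queue.take();
                  task.process();               // must finish in microseconds
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                  running = false;
              }
          }
      }
  }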

It is important to understand that client threads initiate/finalize the calls, in/out threads handle the socket reads/writes, and ServiceThread does the actual manipulation of the data structures. No other thread is allowed to touch the data structures (maps, queues), so there is no need to protect them from multithreaded access: no synchronization when accessing data structures. It may sound inefficient to allow only one thread to do all data structure updates, but here is the logic behind it (please note that the numbers given here are not exact but should give you an idea): If there is only one member (no IO), ServiceThread utilization will be about 95% and it will process between 30K and 120K operations per second, depending on the server. As the number of members in the cluster increases, the IO threads will use more CPU and eventually ServiceThread will go down to about 35% CPU utilization, so ServiceThread will process between 10K and 40K operations per second. ServiceThread CPU utilization will stay at about 35% regardless of the size of the cluster (the only thing that can affect that is network utilization). This also means that the total number of operations processed by the entire cluster will be between N*10K and N*40K, N being the number of nodes in the cluster. About half of these operations will be backup operations (assuming one backup), so the client threads will realize between N*5K and N*20K operations per second in total; for a 10-node cluster, for example, that is roughly 50K to 200K client-visible operations per second. Since there is only one thread accessing the data structures, an increase in the number of nodes doesn't create any contention, so access to the data structures will always be at the same speed. This is very important for Hazelcast's scalability.

This also makes writing code much easier, because a significant portion of the code is effectively single-threaded, hence less error-prone and easier to maintain.

No synchronization or long-running jobs are allowed in this thread. All operations running in this thread have to complete in microseconds.

3. InThread and OutThread:

Hazelcast separates reads and writes by having two separate threads: one for reading and the other for writing. Each IO thread uses its own NIO selector instance. InThread handles the OP_ACCEPT and OP_READ socket operations, while OutThread handles the OP_CONNECT and OP_WRITE operations.

Each IO thread has its own queue that it dequeues to register these operations with its selector, so operation registration and operation processing happen on the same thread.

InThread's runnable is com.hazelcast.nio.InSelector and OutThread's runnable is com.hazelcast.nio.OutSelector. They both extend SelectorBase, which constantly processes its registration queue ('selectorQueue') and the selected keys.
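This loop follows the standard java.nio pattern. The sketch below is illustrative only (class and queue names are made up); it shows how one thread can process a registration queue and the selected-key set, dispatching each ready key to a handler attached to it, as described for ReadHandler/WriteHandler further down.

  import java.io.IOException;
  import java.nio.channels.SelectionKey;
  import java.nio.channels.Selector;
  import java.util.Iterator;
  import java.util.Queue;
  import java.util.concurrent.ConcurrentLinkedQueue;

  // Illustrative selector loop: one thread both registers interest ops
  // (from its own queue) and dispatches ready keys to attached handlers.
  public class SelectorLoopSketch implements Runnable {

      interface SelectionHandler {          // stands in for ReadHandler/WriteHandler
          void handle();
      }

      private final Selector selector;
      private final Queue<Runnable> registrationQueue = new ConcurrentLinkedQueue<Runnable>();
      private volatile boolean live = true;

      public SelectorLoopSketch() throws IOException {
          this.selector = Selector.open();
      }

      // Called from other threads (e.g. ServiceThread) to request an
      // interest-op registration; the actual register runs on this thread.
      public void addTask(Runnable registrationTask) {
          registrationQueue.offer(registrationTask);
          selector.wakeup();                // break out of select() to process it
      }

      public void run() {
          while (live) {
              try {
                  Runnable task;
                  while ((task = registrationQueue.poll()) != null) {
                      task.run();           // e.g. channel.register(selector, OP_WRITE, handler)
                  }
                  if (selector.select(100) == 0) {
                      continue;
                  }
                  Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                  while (it.hasNext()) {
                      SelectionKey key = it.next();
                      it.remove();
                      SelectionHandler handler = (SelectionHandler) key.attachment();
                      if (handler != null) {
                          handler.handle(); // ReadHandler.handle() / WriteHandler.handle()
                      }
                  }
              } catch (IOException e) {
                  live = false;
              }
          }
      }
  }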

Members are connected to each other via TCP/IP. If there are N members in the cluster, then there will be N*(N-1) connection end points and N*(N-1)/2 connections; in a 4-member cluster, for example, that is 12 end points and 6 connections. There can be only one connection between two members, meaning that if m2 creates a connection to m1, m1 doesn't create another connection to m2; the rule here is that new members connect to the older members.

If you look at com.hazelcast.nio.Connection, you will see that each connection represents a socket channel and has com.hazelcast.nio.ReadHandler and WriteHandler instances, which are attached to the socket channel's OP_READ and OP_WRITE selectionKeys respectively. When InSelector selects an OP_READ selection key (when this operation is ready for the selector), InSelector will get the attached ReadHandler instance from the selectionKey and call its ReadHandler.handle() method. The same goes for OutSelector: when OutSelector selects an OP_WRITE selection key, it will get the attached WriteHandler instance from the selectionKey and call its WriteHandler.handle() method.

When ServiceThread wants to send an Invocation instance to a member, it will (a simplified sketch of this write path follows the references below):

  1. get the Connection for this member by calling com.hazelcast.nio.ConnectionManager.get().getConnection(address)

  2. check if the connection is live; Connection.live()

  3. if live, it will get the WriteHandler instance of the Connection

  4. enqueue the invocation into the WriteHandler's queue

  5. and add a registration task into OutSelector's queue, if necessary

  6. OutSelector processes the OP_WRITE operation registration with its selector

  7. when the selector is ready for the OP_WRITE operation, OutSelector will get the WriteHandler instance from selectionKey and call its WriteHandler.handle()

see com.hazelcast.impl.BaseManager.send(Packet, Address).

see com.hazelcast.nio.InOutSelector.run().
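A condensed sketch of that write path is shown below. It is not the actual BaseManager/ConnectionManager code; the class and method names are invented to illustrate the pattern: the calling thread only enqueues, and the OP_WRITE registration and actual socket write happen on the OutSelector thread.

  import java.nio.ByteBuffer;
  import java.util.Queue;
  import java.util.concurrent.ConcurrentLinkedQueue;

  // Illustrative only: the sender enqueues the packet and (if needed) asks the
  // selector thread to register OP_WRITE; it never writes to the socket itself.
  public class WritePathSketch {

      static class WriteHandlerSketch {
          final Queue<ByteBuffer> writeQueue = new ConcurrentLinkedQueue<ByteBuffer>();
          volatile boolean registeredForWrite = false;
      }

      static class ConnectionSketch {
          final WriteHandlerSketch writeHandler = new WriteHandlerSketch();
          volatile boolean live = true;
      }

      interface OutSelectorSketch {
          void addTask(Runnable registrationTask);  // processed on the selector thread
      }

      // Roughly what a send(...) called from the service thread would do.
      static boolean send(ConnectionSketch connection, ByteBuffer packet,
                          OutSelectorSketch outSelector) {
          if (connection == null || !connection.live) {        // step 2: liveness check
              return false;
          }
          final WriteHandlerSketch handler = connection.writeHandler; // step 3
          handler.writeQueue.offer(packet);                    // step 4: enqueue
          if (!handler.registeredForWrite) {                   // step 5: register if needed
              outSelector.addTask(new Runnable() {
                  public void run() {
                      // On the selector thread: register OP_WRITE for this channel.
                      // When the channel is ready, the selector calls the handler,
                      // which drains writeQueue into the socket (steps 6-7).
                      handler.registeredForWrite = true;
                  }
              });
          }
          return true;
      }
  }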

Connections are always registered for (interested in) OP_READ operations. When InSelector is ready to read from a socket channel, it will get the ReadHandler instance from the selectionKey and call its handle() method. The handle() method reads Invocation instances from the underlying socket and, when an Invocation instance is fully read, enqueues it into ServiceThread's (ClusterService class) queue to be processed.
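The read side can be pictured as below. Again this is only a sketch with made-up names, assuming a simple length-prefixed framing rather than Hazelcast's actual wire format; it shows a non-blocking read that accumulates bytes until a complete packet is available and then hands it to the processing queue.

  import java.io.IOException;
  import java.nio.ByteBuffer;
  import java.nio.channels.SocketChannel;
  import java.util.Queue;

  // Illustrative ReadHandler: reads whatever is available without blocking,
  // reassembles length-prefixed packets and enqueues complete ones for the
  // service thread; a partially read packet stays in the buffer until more
  // bytes arrive and handle() is called again.
  public class ReadHandlerSketch {

      private final SocketChannel channel;
      private final Queue<byte[]> serviceQueue;                 // drained by the service thread
      private final ByteBuffer buffer = ByteBuffer.allocate(64 * 1024);

      public ReadHandlerSketch(SocketChannel channel, Queue<byte[]> serviceQueue) {
          this.channel = channel;
          this.serviceQueue = serviceQueue;
      }

      // Called by the selector thread when the key is ready for OP_READ.
      public void handle() throws IOException {
          if (channel.read(buffer) < 0) {
              channel.close();                                  // remote side closed
              return;
          }
          buffer.flip();
          while (buffer.remaining() >= 4) {
              buffer.mark();
              int length = buffer.getInt();                     // assumed 4-byte length prefix
              if (buffer.remaining() < length) {
                  buffer.reset();                               // wait for the rest of the packet
                  break;
              }
              byte[] packet = new byte[length];
              buffer.get(packet);
              serviceQueue.offer(packet);                       // hand off to the service thread
          }
          buffer.compact();                                     // keep any partial packet
      }
  }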

4. MulticastThread:

If multicast discovery is enabled (this is the default) and the node is the master (oldest member) in the cluster, then MulticastThread is started to listen for join requests from new members. When it receives a join request (a com.hazelcast.nio.MulticastService.JoinInfo instance), it checks whether the node is allowed to join; if so, it sends its address to the sender so that the sender node can create a TCP/IP connection to the master and send a JoinRequest.
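The listening side of that handshake boils down to a multicast receive loop like the sketch below. The group address, port, validation and reply format are placeholders and do not reflect Hazelcast's actual wire format; the sketch only illustrates receiving a datagram, validating the sender and replying with the master's address.

  import java.net.DatagramPacket;
  import java.net.InetAddress;
  import java.net.MulticastSocket;

  // Illustrative multicast listener on the master node: receive a join request
  // datagram, decide whether the sender may join, and reply with this node's
  // address so the sender can open a TCP/IP connection and send a JoinRequest.
  public class MulticastListenerSketch implements Runnable {

      private final String groupAddress;   // e.g. "224.2.2.3" (example value)
      private final int port;              // e.g. 54327 (example value)
      private volatile boolean running = true;

      public MulticastListenerSketch(String groupAddress, int port) {
          this.groupAddress = groupAddress;
          this.port = port;
      }

      public void run() {
          try {
              MulticastSocket socket = new MulticastSocket(port);
              socket.joinGroup(InetAddress.getByName(groupAddress));
              byte[] buf = new byte[1024];
              while (running) {
                  DatagramPacket request = new DatagramPacket(buf, buf.length);
                  socket.receive(request);                      // blocks for a join request
                  if (isAllowedToJoin(request)) {
                      byte[] reply = myTcpAddress().getBytes(); // tell the sender where to connect
                      socket.send(new DatagramPacket(reply, reply.length,
                              request.getAddress(), request.getPort()));
                  }
              }
              socket.close();
          } catch (Exception e) {
              running = false;
          }
      }

      private boolean isAllowedToJoin(DatagramPacket request) {
          return true;                                          // placeholder validation
      }

      private String myTcpAddress() {
          return "10.0.0.1:5701";                               // placeholder address
      }
  }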

5. Executor Threads:

Each node employs a local ExecutorService whose threads handle the event listener calls and distributed executions. The number of core and max threads is configurable.
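As a hedged example of what ends up running on these executor threads, the snippet below submits a serializable Callable through the distributed executor. It assumes the Hazelcast.getExecutorService() and DistributedTask API of this era of Hazelcast; the Echo class is made up for the example.

  import com.hazelcast.core.DistributedTask;
  import com.hazelcast.core.Hazelcast;

  import java.io.Serializable;
  import java.util.concurrent.Callable;
  import java.util.concurrent.ExecutorService;

  public class ExecutorExample {

      // The task must be Serializable so it can be sent to another member,
      // where it will run on that member's executor threads.
      public static class Echo implements Callable<String>, Serializable {
          private final String input;

          public Echo(String input) {
              this.input = input;
          }

          public String call() {
              return Thread.currentThread().getName() + ": " + input;
          }
      }

      public static void main(String[] args) throws Exception {
          ExecutorService executor = Hazelcast.getExecutorService();
          DistributedTask<String> task = new DistributedTask<String>(new Echo("hello"));
          executor.execute(task);                 // runs on some member's executor thread
          System.out.println(task.get());         // DistributedTask extends FutureTask
      }
  }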