com.hazelcast.mapreduce (Hazelcast Root 3.4.2 API)

@Beta

Package com.hazelcast.mapreduce

This package contains the MapReduce API definition for Hazelcast.
All MapReduce operations run in a distributed manner inside the active Hazelcast cluster.

Interface Summary
Collator<ValueIn,ValueOut>	This interface can be implemented to define a Collator which is executed after calculation of the MapReduce algorithm on remote cluster nodes but before returning the final result. Collator can for example be used to sum up a final value.
CombinerFactory<KeyIn,ValueIn,ValueOut>	A CombinerFactory implementation is used to build `Combiner` instances per key. An implementation needs to be serializable by Hazelcast since it is distributed together with the `Mapper` implementation to run alongside.
Context<K,V>	The Context interface is used for emitting keys and values to the intermediate working space of the MapReduce algorithm.
Job<KeyIn,ValueIn>	This interface describes a MapReduce Job that is built by `JobTracker.newJob(KeyValueSource)`. It is used to execute mappings and calculations on the different cluster nodes and reduce or collate these mapped values to results.
JobCompletableFuture<V>	This is a special version of ICompletableFuture to return the assigned job id of the submit operation.
JobPartitionState	An implementation of this interface contains current information about the status of a process piece while the operation is executing.
JobProcessInformation	This interface holds basic information about a running MapReduce job, such as the state of the different partitions and the number of currently processed records. The number of processed records is not a real-time value; it is updated at regular intervals (after every 1000 processed elements per node).
JobTracker	The JobTracker interface is used to create instances of `Job`s depending on the given data structure / data source.
KeyPredicate<Key>	This interface is used to pre-evaluate keys before spreading the MapReduce task to the cluster.
LifecycleMapper<KeyIn,ValueIn,KeyOut,ValueOut>	The LifecycleMapper interface is a more sophisticated version of `Mapper`, normally used for more complex algorithms that need initialization and finalization.
Mapper<KeyIn,ValueIn,KeyOut,ValueOut>	The interface Mapper is used to build mappers for the `Job`.
MappingJob<EntryKey,KeyIn,ValueIn>	This interface describes a mapping MapReduce Job. For further information see `Job`.
PartitionIdAware	This interface can be used to mark an implementation as being aware of the data partition it is currently working on.
ReducerFactory<KeyIn,ValueIn,ValueOut>	A ReducerFactory implementation is used to build `Reducer` instances per key. An implementation needs to be serializable by Hazelcast since it might be distributed inside the cluster to do parallel calculations of the reducing step.
ReducingJob<EntryKey,KeyIn,ValueIn>	This interface describes a reducing MapReduce Job. For further information see `Job`.
ReducingSubmittableJob<EntryKey,KeyIn,ValueIn>	This interface describes a submittable MapReduce Job. For further information see `Job`.
TrackableJob<V>	This interface describes a trackable job.
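
The `JobProcessInformation` entry above notes that the processed-record count is published in batches rather than in real time. The following is a minimal plain-Java sketch of that idea, not Hazelcast's implementation: the class `BatchedProgressSketch`, its methods, and the single-worker assumption are all hypothetical, and only the interval of 1000 comes from the summary above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; this is not a Hazelcast class.
final class BatchedProgressSketch {

    // Interval taken from the summary text: publish after 1000 processed elements.
    static final int PUBLISH_INTERVAL = 1000;

    // Value that observers (for example a monitoring thread) are allowed to read.
    private final AtomicLong published = new AtomicLong();

    // Records counted locally but not yet published (assumes one worker thread).
    private int pending;

    // Called by the worker for every processed record.
    void recordProcessed() {
        pending++;
        if (pending == PUBLISH_INTERVAL) {
            published.addAndGet(pending);
            pending = 0;
        }
    }

    // The value an observer sees; it can lag by up to PUBLISH_INTERVAL - 1 records.
    long visibleProcessedRecords() {
        return published.get();
    }

    public static void main(String[] args) {
        BatchedProgressSketch progress = new BatchedProgressSketch();
        for (int i = 0; i < 2500; i++) {
            progress.recordProcessed();
        }
        System.out.println(progress.visibleProcessedRecords());
    }
}
```

In this sketch, an observer sees 2000 after 2500 records have been processed, which is why such a counter is better suited to coarse progress reporting than to exact accounting.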

Class Summary
Combiner<ValueIn,ValueOut>	The abstract Combiner class is used to build combiners for the `Job`. These Combiners are distributed inside the cluster and run alongside the `Mapper` implementations on the same node. Combiners are called in a thread-safe way, so internal locking is not required.
KeyValueSource<K,V>	The abstract KeyValueSource class is used to implement custom data sources for MapReduce algorithms. The default shipped implementations contain KeyValueSources for Hazelcast data structures like `IMap` and `MultiMap`.
LifecycleMapperAdapter<KeyIn,ValueIn,KeyOut,ValueOut>	The abstract LifecycleMapperAdapter superclass is used to ease building mappers for the `Job`.
Reducer<ValueIn,ValueOut>	The abstract Reducer class is used to build reducers for the `Job`. Reducers may be distributed inside of the cluster but there is always only one Reducer per key.

Enum Summary
JobPartitionState.State	Definition of the processing states
TopologyChangedStrategy	This enum class is used to define how a MapReduce job behaves if the job owner recognizes a topology changed event. When members leave the cluster, processed data chunks that were already sent to the reducers on the leaving node might be lost. In addition, any topology change causes a redistribution of the member-assigned partitions, which means that a map job might be unable to finish its currently processed partition. The default behavior is to immediately cancel the running task and throw a `TopologyChangedException`, but it is possible to submit the same job configuration again if `JobTracker.getTrackableJob(String)` returns null for the requested job id.
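
The resubmission rule described for `TopologyChangedStrategy` can be sketched in plain Java as a bounded retry loop. This is an assumption-laden illustration and does not use the Hazelcast API: `TopologyChangedSignal` is a hypothetical stand-in for `TopologyChangedException`, `trackedJobLookup` is a hypothetical stand-in for a call such as `JobTracker.getTrackableJob(String)`, and the `maxAttempts` limit is an addition that the summary above does not mention.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these types are Hazelcast classes.
final class ResubmitSketch {

    // Stand-in for the exception raised when a job is cancelled by a topology change.
    static final class TopologyChangedSignal extends RuntimeException {
        TopologyChangedSignal(String message) {
            super(message);
        }
    }

    // Runs the job and resubmits it after a topology change, but only while the
    // lookup reports that the old job is no longer tracked (returns null) and
    // the attempt limit has not been reached.
    static <T> T runWithResubmit(Supplier<T> job, Supplier<?> trackedJobLookup, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return job.get();
            } catch (TopologyChangedSignal e) {
                boolean stillTracked = trackedJobLookup.get() != null;
                if (stillTracked || attempt >= maxAttempts) {
                    throw e; // give up: job still registered or too many attempts
                }
                // otherwise fall through and submit the same job again
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated job: cancelled twice by a topology change, then succeeds.
        String result = runWithResubmit(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new TopologyChangedSignal("member left the cluster");
            }
            return "done";
        }, () -> null, 5);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Bounding the number of attempts is a design choice of this sketch; without it, a cluster whose membership keeps changing could cause the caller to resubmit indefinitely.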

Exception Summary
RemoteMapReduceException	This exception class is used to show stacktraces of multiple failed remote operations at once.
TopologyChangedException	This exception is thrown when a topology change happens during the execution of a map reduce job and the `TopologyChangedStrategy` is set to `TopologyChangedStrategy.CANCEL_RUNNING_OPERATION`.

Package com.hazelcast.mapreduce Description

This package contains the MapReduce API definition for Hazelcast.
All MapReduce operations run in a distributed manner inside the active Hazelcast cluster. Therefore, Mapper, Combiner and Reducer implementations need to be fully serializable by Hazelcast. Any of the existing serialization patterns is available for these classes, too.
If a custom KeyValueSource is provided, the above statement also applies to that implementation.

For a basic idea of how to use this framework, see Job, Mapper, Combiner or Reducer.
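
To make the roles of the mapping, combining, reducing and collating steps concrete, here is a minimal single-process word count in plain Java. It deliberately does not use the Hazelcast classes from this package, so nothing in it needs to be serializable: `MapReduceSketch` and all of its methods are hypothetical names, and each input string stands in for the data held by one simulated cluster node.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these types are Hazelcast classes.
final class MapReduceSketch {

    // Map step: emit a (word, 1) pair for every word of one document.
    static List<Map.Entry<String, Long>> map(String document) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        for (String word : document.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                emitted.add(new AbstractMap.SimpleEntry<>(word, 1L));
            }
        }
        return emitted;
    }

    // Combine step: pre-aggregate the pairs emitted on one simulated node,
    // so that fewer values have to travel to the reducing step.
    static Map<String, Long> combine(List<Map.Entry<String, Long>> emitted) {
        Map<String, Long> partial = new HashMap<>();
        for (Map.Entry<String, Long> pair : emitted) {
            partial.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return partial;
    }

    // Reduce step: merge the per-node partial results into one total per key.
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((word, count) -> totals.merge(word, count, Long::sum));
        }
        return totals;
    }

    // Collate step: fold the reduced per-key results into a single final value.
    static long collate(Map<String, Long> reduced) {
        long sum = 0;
        for (long count : reduced.values()) {
            sum += count;
        }
        return sum;
    }

    // Runs map and combine once per simulated node, then reduces all partials.
    static Map<String, Long> wordCount(List<String> documentPerNode) {
        List<Map<String, Long>> partials = new ArrayList<>();
        for (String document : documentPerNode) {
            partials.add(combine(map(document)));
        }
        return reduce(partials);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = wordCount(Arrays.asList("to be or not", "to be"));
        System.out.println(counts);          // per-word totals
        System.out.println(collate(counts)); // total number of words
    }
}
```

In the real framework these steps run on different cluster members, which is why the Mapper, Combiner and Reducer implementations must be serializable; the sketch only shows how data flows between the steps.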

Since: 3.2