com.hazelcast.mapreduce
Interface Mapper<KeyIn,ValueIn,KeyOut,ValueOut>

Type Parameters:
KeyIn - The type of key used in the KeyValueSource
ValueIn - The type of value used in the KeyValueSource
KeyOut - The key type for mapped results
ValueOut - The value type for mapped results
All Superinterfaces:
Serializable
All Known Subinterfaces:
LifecycleMapper<KeyIn,ValueIn,KeyOut,ValueOut>
All Known Implementing Classes:
LifecycleMapperAdapter

@Beta
public interface Mapper<KeyIn,ValueIn,KeyOut,ValueOut>
extends Serializable

The Mapper interface is used to build mappers for a Job. Most mappers only need to implement this interface and its map(Object, Object, Context) method to collect and emit the needed key-value pairs.
More complex algorithms can implement the LifecycleMapper interface instead and additionally override the LifecycleMapper.initialize(Context) and LifecycleMapper.finalized(Context) methods.
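
The following is a minimal sketch of how a lifecycle-aware mapper might be structured. It does not use the Hazelcast classes: the nested Context, Mapper and LifecycleMapper interfaces are hypothetical stand-ins that only mirror the method names mentioned above, so that the example compiles on its own. The call order used in run() (initialize, then map for each pair, then finalized) is an assumption made for illustration.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class LifecycleMapperSketch {

    // Hypothetical stand-ins mirroring the shape of the Hazelcast types; not the real API.
    interface Context<K, V> {
        void emit(K key, V value);
    }

    interface Mapper<KeyIn, ValueIn, KeyOut, ValueOut> extends Serializable {
        void map(KeyIn key, ValueIn value, Context<KeyOut, ValueOut> context);
    }

    interface LifecycleMapper<KeyIn, ValueIn, KeyOut, ValueOut>
            extends Mapper<KeyIn, ValueIn, KeyOut, ValueOut> {
        void initialize(Context<KeyOut, ValueOut> context);

        void finalized(Context<KeyOut, ValueOut> context);
    }

    // Accumulates a local sum in map() and emits it once in finalized().
    static class SumMapper implements LifecycleMapper<Integer, Integer, String, Integer> {
        private static final long serialVersionUID = 1L;
        private transient int sum;

        @Override
        public void initialize(Context<String, Integer> context) {
            sum = 0; // reset local state before any pair is mapped
        }

        @Override
        public void map(Integer key, Integer value, Context<String, Integer> context) {
            sum += value; // nothing is emitted per pair
        }

        @Override
        public void finalized(Context<String, Integer> context) {
            context.emit("sum", sum); // emit the aggregate once at the end
        }
    }

    // Drives the mapper by hand in the assumed order: initialize, map each pair, finalized.
    public static List<String> run() {
        List<String> emitted = new ArrayList<>();
        Context<String, Integer> context = (k, v) -> emitted.add(k + "=" + v);
        SumMapper mapper = new SumMapper();
        mapper.initialize(context);
        mapper.map(1, 10, context);
        mapper.map(2, 20, context);
        mapper.map(3, 12, context);
        mapper.finalized(context);
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [sum=42]
    }
}
```

Emitting once in finalized() instead of once per pair is one possible reason to prefer the lifecycle variant, since it reduces the number of emitted pairs.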

A simple mapper could look like the following example:

 public class MyMapper implements Mapper<Integer, Integer, String, Integer>
 {
   public void map( Integer key, Integer value, Context<String, Integer> context )
   {
     context.emit( String.valueOf( key ), value );
   }
 }
 

If you want to know more about the implementation of MapReduce algorithms, read the Google whitepaper on MapReduce.

Since:
3.2

Method Summary
 void map(KeyIn key, ValueIn value, Context<KeyOut,ValueOut> context)
          The map method is called for every single key-value pair in the bound KeyValueSource instance on this cluster node and partition.
Because Hazelcast is a data grid, it distributes values across the cluster, so this method is executed on multiple servers at the same time.
If you want to know more about the implementation of MapReduce algorithms, read the Google whitepaper on MapReduce.
 

Method Detail

map

void map(KeyIn key,
         ValueIn value,
         Context<KeyOut,ValueOut> context)
The map method is called for every single key-value pair in the bound KeyValueSource instance on this cluster node and partition.
Because Hazelcast is a data grid, it distributes values across the cluster, so this method is executed on multiple servers at the same time.
If you want to know more about the implementation of MapReduce algorithms, read the Google whitepaper on MapReduce.

Parameters:
key - key to map
value - value to map
context - Context to be used for emitting values
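
As a rough illustration of the per-pair invocation described above, the sketch below calls map once for every entry of two in-memory maps that stand in for partitions. The nested Context and Mapper interfaces are hypothetical stand-ins with the same method shapes as documented here, not the Hazelcast classes, and the partitions are processed sequentially in one JVM, whereas a real cluster would run them on several nodes at the same time.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapPerEntrySketch {

    // Hypothetical stand-ins mirroring the documented method shapes; not the real API.
    interface Context<K, V> {
        void emit(K key, V value);
    }

    interface Mapper<KeyIn, ValueIn, KeyOut, ValueOut> extends Serializable {
        void map(KeyIn key, ValueIn value, Context<KeyOut, ValueOut> context);
    }

    // Same logic as the MyMapper example: emit the key as a String with its value.
    static class MyMapper implements Mapper<Integer, Integer, String, Integer> {
        private static final long serialVersionUID = 1L;

        @Override
        public void map(Integer key, Integer value, Context<String, Integer> context) {
            context.emit(String.valueOf(key), value);
        }
    }

    public static List<String> run() {
        // Two in-memory maps standing in for partitions held by different nodes.
        Map<Integer, Integer> partitionA = new LinkedHashMap<>();
        partitionA.put(1, 100);
        partitionA.put(2, 200);
        Map<Integer, Integer> partitionB = new LinkedHashMap<>();
        partitionB.put(3, 300);

        List<Map<Integer, Integer>> partitions = new ArrayList<>();
        partitions.add(partitionA);
        partitions.add(partitionB);

        // Context stand-in that records every emitted pair as "key=value".
        List<String> emitted = new ArrayList<>();
        Context<String, Integer> context = (k, v) -> emitted.add(k + "=" + v);

        Mapper<Integer, Integer, String, Integer> mapper = new MyMapper();
        for (Map<Integer, Integer> partition : partitions) {
            for (Map.Entry<Integer, Integer> entry : partition.entrySet()) {
                // map is invoked once per key-value pair in the partition.
                mapper.map(entry.getKey(), entry.getValue(), context);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [1=100, 2=200, 3=300]
    }
}
```

Because the real method runs on several servers concurrently, a mapper written this way keeps no shared mutable state and relies only on the pair it is given and the context.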


Copyright © 2014 Hazelcast, Inc. All Rights Reserved.