com.hazelcast.mapreduce
Class KeyValueSource<K,V>

java.lang.Object
  extended by com.hazelcast.mapreduce.KeyValueSource<K,V>
Type Parameters:
K - key type
V - value type
All Implemented Interfaces:
Closeable
Direct Known Subclasses:
ListKeyValueSource, MapKeyValueSource, MultiMapKeyValueSource, SetKeyValueSource

@Beta
public abstract class KeyValueSource<K,V>
extends Object
implements Closeable

The abstract KeyValueSource class is used to implement custom data sources for mapreduce algorithms.
The default implementations shipped with Hazelcast include KeyValueSources for Hazelcast data structures such as IMap and MultiMap. Custom implementations could wrap external files, URLs, or any other data source that can be represented as key-value pairs.

Since:
3.2

Constructor Summary
KeyValueSource()
           
 
Method Summary
abstract  Map.Entry<K,V> element()
          Returns the element at the current index. Calls to this method won't change state.
static <V> KeyValueSource<String,V> fromList(IList<V> list)
          A helper method to build a KeyValueSource implementation based on the specified IList.
The key returned by this KeyValueSource implementation is ALWAYS the name of the list itself, whereas the values are the entries of the list, one by one.
static <K,V> KeyValueSource<K,V> fromMap(IMap<K,V> map)
          A helper method to build a KeyValueSource implementation based on the specified IMap
static <K,V> KeyValueSource<K,V> fromMultiMap(MultiMap<K,V> multiMap)
          A helper method to build a KeyValueSource implementation based on the specified MultiMap
static <V> KeyValueSource<String,V> fromSet(ISet<V> set)
          A helper method to build a KeyValueSource implementation based on the specified ISet.
The key returned by this KeyValueSource implementation is ALWAYS the name of the set itself, whereas the values are the entries of the set, one by one.
 Collection<K> getAllKeys()
           If isAllKeysSupported() returns true a call to this method has to return all clusterwide available keys.
protected  Collection<K> getAllKeys0()
          This method is meant for overriding to implement collecting of all clusterwide available keys and returning them from getAllKeys().
abstract  boolean hasNext()
          Called to request if at least one more key-value pair is available from this data source.
 boolean isAllKeysSupported()
           If it is possible to collect all clusterwide available keys for this KeyValueSource implementation then this method should return true.
If true is returned, a call to getAllKeys() must return all available keys so that the interesting partitions / nodes can be preselected based on the returned keys.
abstract  K key()
          Returns the key at the current index for KeyPredicate analysis.
abstract  boolean open(NodeEngine nodeEngine)
          This method is called before accessing the key-value pairs of this KeyValueSource
abstract  boolean reset()
          This method needs to reset all internal state as if this were a new instance.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.io.Closeable
close
 

Constructor Detail

KeyValueSource

public KeyValueSource()
Method Detail

open

public abstract boolean open(NodeEngine nodeEngine)
This method is called before accessing the key-value pairs of this KeyValueSource

Parameters:
nodeEngine - nodeEngine of this cluster node
Returns:
true if the operation succeeded, otherwise false

hasNext

public abstract boolean hasNext()
Called to request if at least one more key-value pair is available from this data source. If so, this method returns true; otherwise it returns false. Calls to this method will change the state: if an element is found, the index is set to the found element, and subsequent calls to the key() and element() methods will return that element.

Returns:
true if at least one more key-value pair is available from this data source, false otherwise.
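
The cursor contract described for hasNext(), key() and element() can be illustrated with a minimal sketch. The MapCursor class below is a hypothetical stand-in written against plain java.util types; it is not the Hazelcast class and omits open(NodeEngine), reset() and close(). It only shows that hasNext() advances the index while key() and element() are read-only.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in that mirrors only the cursor contract described above.
// It does not extend com.hazelcast.mapreduce.KeyValueSource.
final class CursorProtocolSketch {

    static final class MapCursor<K, V> {
        private final Iterator<Map.Entry<K, V>> iterator;
        private Map.Entry<K, V> current; // null until hasNext() first succeeds

        MapCursor(Map<K, V> data) {
            this.iterator = data.entrySet().iterator();
        }

        // State-changing: moves the cursor to the next pair if one exists.
        boolean hasNext() {
            if (!iterator.hasNext()) {
                return false;
            }
            current = iterator.next();
            return true;
        }

        // Read-only: repeated calls return the same key until hasNext() is called again.
        K key() {
            return current.getKey();
        }

        // Read-only: repeated calls return the same pair until hasNext() is called again.
        Map.Entry<K, V> element() {
            return current;
        }
    }

    // Typical consumer loop: advance with hasNext(), then read the current pair.
    static int sumValues(Map<String, Integer> data) {
        MapCursor<String, Integer> cursor = new MapCursor<>(data);
        int sum = 0;
        while (cursor.hasNext()) {
            sum += cursor.element().getValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> data = new LinkedHashMap<>();
        data.put("a", 1);
        data.put("b", 2);
        data.put("c", 3);
        System.out.println(sumValues(data));
    }
}
```

Note that, unlike java.util.Iterator.hasNext(), calling hasNext() twice in this contract skips a pair, so a consumer should call it exactly once per element.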

key

public abstract K key()
Returns the key at the current index for KeyPredicate analysis. This is called to prevent a possible deserialization of unneeded values when the key is not interesting for the running mapreduce algorithm. Calls to this method won't change state.

Returns:
the current index key for KeyPredicate analysis
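
A rough sketch of why key() exists separately from element(): a consumer can test the key first and only pay for the value when the key is interesting. Everything below is hypothetical. LazySource simulates deserialization with a counter, and java.util.function.Predicate stands in for Hazelcast's KeyPredicate.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types are part of Hazelcast.
final class KeyFirstSketch {

    static final class LazySource {
        private final List<String> keys;
        private int index = -1;
        private int deserializations = 0;

        LazySource(List<String> keys) {
            this.keys = keys;
        }

        boolean hasNext() {
            if (index + 1 >= keys.size()) {
                return false;
            }
            index++;
            return true;
        }

        // Cheap: only the key is touched.
        String key() {
            return keys.get(index);
        }

        // Expensive: simulates deserializing the value for the current key.
        Map.Entry<String, String> element() {
            deserializations++;
            String key = keys.get(index);
            return new AbstractMap.SimpleImmutableEntry<>(key, "value-of-" + key);
        }

        int deserializations() {
            return deserializations;
        }
    }

    // Checks key() against the filter before calling element().
    static List<String> collect(LazySource source, Predicate<String> keyFilter) {
        List<String> values = new ArrayList<>();
        while (source.hasNext()) {
            if (keyFilter.test(source.key())) {
                values.add(source.element().getValue());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        LazySource source = new LazySource(Arrays.asList("a", "b", "c"));
        System.out.println(collect(source, "b"::equals));
        System.out.println(source.deserializations());
    }
}
```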

element

public abstract Map.Entry<K,V> element()
Returns the element at the current index. Calls to this method won't change state.

Returns:
the current index element

reset

public abstract boolean reset()
This method needs to reset all internal state as if this were a new instance. The same instance of the KeyValueSource may be used multiple times in a row depending on the internal implementation, especially when the KeyValueSource implements PartitionIdAware.
If the instance is reused, a sequence of reset(), open(com.hazelcast.spi.NodeEngine) and Closeable.close() is called multiple times, with the other methods called between open(...) and close().

Returns:
true if reset was successful otherwise false
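
The reuse sequence described above might look roughly like the following. This is a sketch under stated assumptions: ReusableSource is a hypothetical class, its open(List) takes the partition data directly in place of a NodeEngine, and the reset, open, iterate, close ordering per partition is inferred from the description rather than taken from the Hazelcast source.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a source that is reused across several partitions.
final class LifecycleSketch {

    static final class ReusableSource implements Closeable {
        private List<String> data;
        private int index = -1;
        private boolean open;

        // Stand-in for open(NodeEngine): receives the data to iterate directly.
        boolean open(List<String> partitionData) {
            data = partitionData;
            open = true;
            return true;
        }

        boolean hasNext() {
            if (!open || index + 1 >= data.size()) {
                return false;
            }
            index++;
            return true;
        }

        String key() {
            return data.get(index);
        }

        // Returns the instance to the state of a freshly constructed one.
        boolean reset() {
            data = null;
            index = -1;
            open = false;
            return true;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    // Drives one source instance through reset(), open(), iteration and close() per partition.
    static List<String> readAll(ReusableSource source, List<List<String>> partitions) {
        List<String> keys = new ArrayList<>();
        for (List<String> partition : partitions) {
            source.reset();
            source.open(partition);
            while (source.hasNext()) {
                keys.add(source.key());
            }
            source.close();
        }
        return keys;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = new ArrayList<>();
        partitions.add(Arrays.asList("a", "b"));
        partitions.add(Arrays.asList("c"));
        System.out.println(readAll(new ReusableSource(), partitions));
    }
}
```

In this sketch, skipping reset() would leave the index pointing past the start of the second partition, which is the kind of stale state the method is meant to clear.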

getAllKeys

public final Collection<K> getAllKeys()

If isAllKeysSupported() returns true, a call to this method has to return all clusterwide available keys. If there is no chance to precollect all keys due to partitioning of the data, isAllKeysSupported() must return false.

If this functionality is not available and Job.onKeys(Object[]), Job.onKeys(Iterable) or Job.keyPredicate(KeyPredicate) is used, a preselection of the interesting partitions / nodes is not available and the overall processing speed may be degraded.

If isAllKeysSupported() returns false this method throws an UnsupportedOperationException.

Returns:
a collection of all clusterwide available keys
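
The relationship between the final getAllKeys(), isAllKeysSupported() and the overridable getAllKeys0() resembles a template method. The sketch below uses a hypothetical SourceBase class to show that shape. The default return values of isAllKeysSupported() and getAllKeys0(), as well as the exception message, are assumptions and are not taken from the Hazelcast source.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical sketch of the getAllKeys() / getAllKeys0() split.
final class AllKeysSketch {

    abstract static class SourceBase<K> {

        // Final entry point: guards the call, then delegates to the hook.
        public final Collection<K> getAllKeys() {
            if (!isAllKeysSupported()) {
                throw new UnsupportedOperationException("getAllKeys is not supported by this source");
            }
            return getAllKeys0();
        }

        // Assumed default: key collection is not supported unless overridden.
        public boolean isAllKeysSupported() {
            return false;
        }

        // Hook for subclasses; assumed default is an empty collection.
        protected Collection<K> getAllKeys0() {
            return Collections.emptyList();
        }
    }

    // Opts in by overriding both the capability flag and the hook.
    static final class FixedKeys extends SourceBase<String> {
        @Override
        public boolean isAllKeysSupported() {
            return true;
        }

        @Override
        protected Collection<String> getAllKeys0() {
            return Arrays.asList("a", "b");
        }
    }

    // Keeps the defaults, so getAllKeys() throws.
    static final class Unsupported extends SourceBase<String> {
    }

    public static void main(String[] args) {
        System.out.println(new FixedKeys().getAllKeys());
        try {
            new Unsupported().getAllKeys();
        } catch (UnsupportedOperationException e) {
            System.out.println("unsupported");
        }
    }
}
```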

isAllKeysSupported

public boolean isAllKeysSupported()

If it is possible to collect all clusterwide available keys for this KeyValueSource implementation then this method should return true.
If true is returned, a call to getAllKeys() must return all available keys so that the interesting partitions / nodes can be preselected based on the returned keys.

If this functionality is not available and Job.onKeys(Object[]), Job.onKeys(Iterable) or Job.keyPredicate(KeyPredicate) is used, a preselection of the interesting partitions / nodes is not available and the overall processing speed may be degraded.

Returns:
true if collecting clusterwide keys is available, otherwise false

getAllKeys0

protected Collection<K> getAllKeys0()
This method is meant for overriding to implement collecting of all clusterwide available keys and returning them from getAllKeys().

Returns:
a collection of all clusterwide available keys

fromMap

public static <K,V> KeyValueSource<K,V> fromMap(IMap<K,V> map)
A helper method to build a KeyValueSource implementation based on the specified IMap

Type Parameters:
K - key type of the map
V - value type of the map
Parameters:
map - map to build a KeyValueSource implementation with
Returns:
KeyValueSource implementation based on the specified map

fromMultiMap

public static <K,V> KeyValueSource<K,V> fromMultiMap(MultiMap<K,V> multiMap)
A helper method to build a KeyValueSource implementation based on the specified MultiMap

Type Parameters:
K - key type of the multiMap
V - value type of the multiMap
Parameters:
multiMap - multiMap to build a KeyValueSource implementation with
Returns:
KeyValueSource implementation based on the specified multiMap

fromList

public static <V> KeyValueSource<String,V> fromList(IList<V> list)
A helper method to build a KeyValueSource implementation based on the specified IList.
The key returned by this KeyValueSource implementation is ALWAYS the name of the list itself, whereas the values are the entries of the list, one by one. So this implementation behaves like a MultiMap with a single key but multiple values.

Type Parameters:
V - value type of the list
Parameters:
list - list to build a KeyValueSource implementation with
Returns:
KeyValueSource implementation based on the specified list
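
To make the "single key, multiple values" shape concrete, the sketch below builds the key-value pairs that a list-backed source would expose according to the description above. The pairs helper and the list name "orders" are hypothetical; the code uses a plain java.util.List rather than a Hazelcast IList and does not call fromList itself.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the pair shape described for list-backed sources.
final class ListPairsSketch {

    // Every entry of the list is paired with the same key: the list's name.
    static <V> List<Map.Entry<String, V>> pairs(String listName, List<V> values) {
        List<Map.Entry<String, V>> result = new ArrayList<>();
        for (V value : values) {
            result.add(new AbstractMap.SimpleImmutableEntry<>(listName, value));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> pair : pairs("orders", Arrays.asList(10, 20, 30))) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

A practical consequence is that a mapper over such a source sees the same key for every value, so any grouping has to come from the keys the mapper itself emits.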

fromSet

public static <V> KeyValueSource<String,V> fromSet(ISet<V> set)
A helper method to build a KeyValueSource implementation based on the specified ISet.
The key returned by this KeyValueSource implementation is ALWAYS the name of the set itself, whereas the values are the entries of the set, one by one. So this implementation behaves like a MultiMap with a single key but multiple values.

Type Parameters:
V - value type of the set
Parameters:
set - set to build a KeyValueSource implementation with
Returns:
KeyValueSource implementation based on the specified set


Copyright © 2015 Hazelcast, Inc. All Rights Reserved.