com.hazelcast.mapreduce
Class KeyValueSource<K,V>

java.lang.Object
  extended by com.hazelcast.mapreduce.KeyValueSource<K,V>
Type Parameters:
K - key type
V - value type
All Implemented Interfaces:
Closeable
Direct Known Subclasses:
ListKeyValueSource, MapKeyValueSource, MultiMapKeyValueSource, SetKeyValueSource

@Beta
public abstract class KeyValueSource<K,V>
extends Object
implements Closeable

The abstract KeyValueSource class is used to implement custom data sources for mapreduce algorithms.
The default shipped implementations include KeyValueSources for Hazelcast data structures such as IMap and MultiMap. Custom implementations could wrap external files, URLs, or any other data source that can be represented as key-value pairs.

Since:
3.2

Constructor Summary
KeyValueSource()
           
 
Method Summary
abstract  Map.Entry<K,V> element()
          Returns the current index element. Calls to this method won't change state.
static
<V> KeyValueSource<String,V>
fromList(IList<V> list)
          A helper method to build a KeyValueSource implementation based on the specified IList.
The key returned by this KeyValueSource implementation is ALWAYS the name of the list itself, whereas the values are the entries of the list, one by one.
static
<K,V> KeyValueSource<K,V>
fromMap(IMap<K,V> map)
          A helper method to build a KeyValueSource implementation based on the specified IMap.
static
<K,V> KeyValueSource<K,V>
fromMultiMap(MultiMap<K,V> multiMap)
          A helper method to build a KeyValueSource implementation based on the specified MultiMap.
static
<V> KeyValueSource<String,V>
fromSet(ISet<V> set)
          A helper method to build a KeyValueSource implementation based on the specified ISet.
The key returned by this KeyValueSource implementation is ALWAYS the name of the set itself, whereas the values are the entries of the set, one by one.
 Collection<K> getAllKeys()
           If isAllKeysSupported() returns true, a call to this method returns all clusterwide available keys.
protected  Collection<K> getAllKeys0()
          This method is meant to be overridden to implement collecting of all clusterwide available keys and return them from getAllKeys().
abstract  boolean hasNext()
          Called to check whether at least one more key-value pair is available from this data source.
 boolean isAllKeysSupported()
           If it is possible to collect all clusterwide available keys for this KeyValueSource implementation then this method returns true.
If true is returned, a call to getAllKeys() must return all available keys so that interesting partitions / nodes can be preselected based on the returned keys.
abstract  K key()
          Returns the current index key for KeyPredicate analysis.
abstract  boolean open(NodeEngine nodeEngine)
          This method is called before accessing the key-value pairs of this KeyValueSource.
abstract  boolean reset()
          This method resets all internal state so that the instance behaves like a new one.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.io.Closeable
close
 

Constructor Detail

KeyValueSource

public KeyValueSource()
Method Detail

open

public abstract boolean open(NodeEngine nodeEngine)
This method is called before accessing the key-value pairs of this KeyValueSource.

Parameters:
nodeEngine - nodeEngine of this cluster node
Returns:
true if the operation succeeded, false otherwise

hasNext

public abstract boolean hasNext()
Called to check whether at least one more key-value pair is available from this data source. If so, this method returns true; otherwise it returns false. Calls to this method change state: if an element is found, the index is set to that element, and subsequent calls to the key() and element() methods return it.

Returns:
true if at least one more key-value pair is available from this data source, false otherwise.
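
The cursor contract described above can be sketched with a small in-memory class. This is only an illustration under stated assumptions: MapCursor and CursorDemo are hypothetical names, the class does not extend the real KeyValueSource, and it needs no Hazelcast dependency. It shows hasNext() moving the index while key() and element() only read it.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory cursor; it mirrors hasNext()/key()/element() only
// and is NOT com.hazelcast.mapreduce.KeyValueSource.
class MapCursor<K, V> {
    private final Iterator<Map.Entry<K, V>> iterator;
    private Map.Entry<K, V> current; // the "current index" element

    MapCursor(Map<K, V> data) {
        this.iterator = data.entrySet().iterator();
    }

    // State-changing: moves the index to the next pair if one exists.
    boolean hasNext() {
        if (!iterator.hasNext()) {
            return false;
        }
        Map.Entry<K, V> next = iterator.next();
        current = new SimpleImmutableEntry<>(next.getKey(), next.getValue());
        return true;
    }

    // Read-only: repeated calls return the same key until hasNext() runs again.
    K key() {
        return requireCurrent().getKey();
    }

    // Read-only: returns the pair the index currently points at.
    Map.Entry<K, V> element() {
        return requireCurrent();
    }

    private Map.Entry<K, V> requireCurrent() {
        if (current == null) {
            throw new IllegalStateException("call hasNext() first");
        }
        return current;
    }
}

class CursorDemo {
    // Reads every remaining pair and renders it as "key=value".
    static <K, V> List<String> drain(MapCursor<K, V> source) {
        List<String> out = new ArrayList<>();
        while (source.hasNext()) {
            Map.Entry<K, V> entry = source.element();
            out.add(source.key() + "=" + entry.getValue());
        }
        return out;
    }
}
```

The loop shape in drain (hasNext() in the condition, reads in the body) is the usual way to consume such a cursor, since only hasNext() advances.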

key

public abstract K key()
Returns the current index key for KeyPredicate analysis. It is called first so that a value need not be deserialized when its key is not of interest to the running mapreduce algorithm. Calls to this method won't change state.

Returns:
the current index key for KeyPredicate analysis
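
Why a separate key() call can save work is easier to see in a sketch. The following is a hypothetical stand-in, not the Hazelcast implementation: LazyValueSource keeps values as raw UTF-8 bytes and decodes them only in element(), and the decodeCount field exists purely so the effect can be observed.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: keys are plain strings, values stay as raw bytes
// until element() is called. NOT the real KeyValueSource.
class LazyValueSource {
    private final List<String> keys;
    private final List<byte[]> rawValues;
    private int index = -1;
    int decodeCount = 0; // counts how often a value was actually decoded

    LazyValueSource(List<String> keys, List<byte[]> rawValues) {
        if (keys.size() != rawValues.size()) {
            throw new IllegalArgumentException("keys and values must have equal size");
        }
        this.keys = keys;
        this.rawValues = rawValues;
    }

    boolean hasNext() {
        if (index + 1 >= keys.size()) {
            return false;
        }
        index++;
        return true;
    }

    // Cheap: touches only the key, never the value bytes.
    String key() {
        return keys.get(index);
    }

    // Expensive step in this sketch: decodes the value bytes.
    Map.Entry<String, String> element() {
        decodeCount++;
        String value = new String(rawValues.get(index), StandardCharsets.UTF_8);
        return new SimpleImmutableEntry<>(keys.get(index), value);
    }
}

class KeyFilterDemo {
    // Decodes only the values whose key equals wantedKey; returns the decode count.
    static int scan(LazyValueSource source, String wantedKey) {
        while (source.hasNext()) {
            if (wantedKey.equals(source.key())) {
                source.element();
            }
        }
        return source.decodeCount;
    }
}
```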

element

public abstract Map.Entry<K,V> element()
Returns the current index element. Calls to this method won't change state.

Returns:
the current index element

reset

public abstract boolean reset()
This method resets all internal state so that the instance behaves like a new one. The same instance of the KeyValueSource may be used multiple times in a row depending on the internal implementation, especially when the KeyValueSource implements PartitionIdAware.
If the instance is reused, the sequence reset(), open(com.hazelcast.spi.NodeEngine), Closeable.close() is repeated for each use, with the other methods called between open(...) and close().

Returns:
true if reset was successful, false otherwise
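
The reuse sequence described above can be sketched as a driver loop. All names here (ReusableSource, LifecycleDemo, value()) are hypothetical, and open() omits the NodeEngine parameter of the real method so that the example compiles without Hazelcast on the classpath. The calls list is there only to make the call order visible.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in; the real open() takes a NodeEngine argument.
class ReusableSource implements Closeable {
    private final List<String> data;
    private Iterator<String> iterator;
    private String current;
    final List<String> calls = new ArrayList<>(); // records lifecycle order

    ReusableSource(List<String> data) {
        this.data = data;
    }

    // Drops all per-pass state so the instance behaves like a new one.
    boolean reset() {
        calls.add("reset");
        iterator = null;
        current = null;
        return true;
    }

    boolean open() {
        calls.add("open");
        iterator = data.iterator();
        return true;
    }

    boolean hasNext() {
        if (iterator == null || !iterator.hasNext()) {
            return false;
        }
        current = iterator.next();
        return true;
    }

    String value() {
        return current;
    }

    @Override
    public void close() {
        calls.add("close");
        iterator = null;
    }
}

class LifecycleDemo {
    // Drives the same instance through the given number of passes.
    static List<String> run(ReusableSource source, int passes) {
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < passes; i++) {
            source.reset();
            if (!source.open()) {
                continue; // skip this pass if the source could not be opened
            }
            try {
                while (source.hasNext()) {
                    seen.add(source.value());
                }
            } finally {
                source.close();
            }
        }
        return seen;
    }
}
```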

getAllKeys

public final Collection<K> getAllKeys()

If isAllKeysSupported() returns true, a call to this method returns all clusterwide available keys. If there is no chance to precollect all keys due to partitioning of the data, isAllKeysSupported() returns false.

If this functionality is not available and Job.onKeys(Object[]), Job.onKeys(Iterable), or Job.keyPredicate(KeyPredicate) is used, a preselection of the interesting partitions / nodes is not available and the overall processing speed may be degraded.

If isAllKeysSupported() returns false this method throws an UnsupportedOperationException.

Returns:
a collection of all clusterwide available keys

isAllKeysSupported

public boolean isAllKeysSupported()

If it is possible to collect all clusterwide available keys for this KeyValueSource implementation then this method returns true.
If true is returned, a call to getAllKeys() must return all available keys so that interesting partitions / nodes can be preselected based on the returned keys.

If this functionality is not available and Job.onKeys(Object[]), Job.onKeys(Iterable), or Job.keyPredicate(KeyPredicate) is used, a preselection of the interesting partitions / nodes is not available and the overall processing speed may be degraded.

Returns:
true if collecting clusterwide keys is available, false otherwise

getAllKeys0

protected Collection<K> getAllKeys0()
This method is meant to be overridden to implement collecting of all clusterwide available keys and return them from getAllKeys().

Returns:
a collection of all clusterwide available keys
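
How getAllKeys(), isAllKeysSupported() and getAllKeys0() relate can be sketched as a template-method arrangement. The classes below are hypothetical stand-ins rather than the Hazelcast source, and the default return values chosen for isAllKeysSupported() and getAllKeys0() are assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical stand-in showing only the key-collection methods.
abstract class KeySourceSketch<K> {

    // Final entry point: guards the call, then delegates to the overridable hook.
    public final Collection<K> getAllKeys() {
        if (!isAllKeysSupported()) {
            throw new UnsupportedOperationException("collecting all keys is not supported");
        }
        return getAllKeys0();
    }

    // Assumed default for this sketch: key collection is not supported.
    public boolean isAllKeysSupported() {
        return false;
    }

    // Assumed default for this sketch: no keys.
    protected Collection<K> getAllKeys0() {
        return Collections.emptyList();
    }
}

// A source that can enumerate its keys overrides both methods together.
class FixedKeySource extends KeySourceSketch<String> {
    @Override
    public boolean isAllKeysSupported() {
        return true;
    }

    @Override
    protected Collection<String> getAllKeys0() {
        return Arrays.asList("a", "b");
    }
}

// A source that cannot enumerate its keys keeps the defaults.
class OpaqueKeySource extends KeySourceSketch<String> {
}
```

With this arrangement a caller would check isAllKeysSupported() before calling getAllKeys(), and an implementer would only ever override the two non-final methods.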

fromMap

public static <K,V> KeyValueSource<K,V> fromMap(IMap<K,V> map)
A helper method to build a KeyValueSource implementation based on the specified IMap.

Type Parameters:
K - key type of the map
V - value type of the map
Parameters:
map - map to build a KeyValueSource implementation
Returns:
KeyValueSource implementation based on the specified map

fromMultiMap

public static <K,V> KeyValueSource<K,V> fromMultiMap(MultiMap<K,V> multiMap)
A helper method to build a KeyValueSource implementation based on the specified MultiMap.

Type Parameters:
K - key type of the multiMap
V - value type of the multiMap
Parameters:
multiMap - multiMap to build a KeyValueSource implementation
Returns:
KeyValueSource implementation based on the specified multiMap

fromList

public static <V> KeyValueSource<String,V> fromList(IList<V> list)
A helper method to build a KeyValueSource implementation based on the specified IList.
The key returned by this KeyValueSource implementation is ALWAYS the name of the list itself, whereas the values are the entries of the list, one by one. This implementation behaves like a MultiMap with a single key but multiple values.

Type Parameters:
V - value type of the list
Parameters:
list - list to build a KeyValueSource implementation
Returns:
KeyValueSource implementation based on the specified list
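
The "single key, multiple values" shape described above can be illustrated with plain java.util collections. SingleKeyPairs and its pairs method are hypothetical helpers invented for this sketch; they only model the key-value pairs such a source would expose and do not call any Hazelcast API. The same shape applies to the set-based source.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

class SingleKeyPairs {
    // Pairs every entry of the collection with the collection's name as the key.
    static <V> List<Map.Entry<String, V>> pairs(String collectionName, Collection<V> values) {
        List<Map.Entry<String, V>> out = new ArrayList<>();
        for (V value : values) {
            out.add(new SimpleImmutableEntry<>(collectionName, value));
        }
        return out;
    }
}
```

For a list named "orders" holding two entries, this would yield two pairs that share the key "orders", one per entry and in list order.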

fromSet

public static <V> KeyValueSource<String,V> fromSet(ISet<V> set)
A helper method to build a KeyValueSource implementation based on the specified ISet.
The key returned by this KeyValueSource implementation is ALWAYS the name of the set itself, whereas the values are the entries of the set, one by one. This implementation behaves like a MultiMap with a single key but multiple values.

Type Parameters:
V - value type of the set
Parameters:
set - set to build a KeyValueSource implementation
Returns:
KeyValueSource implementation based on the specified set


Copyright © 2015 Hazelcast, Inc. All Rights Reserved.