public final class HdfsSources extends Object
Modifier and Type | Method and Description |
---|---|
static <K,V> Source<Map.Entry<K,V>> |
hdfs(org.apache.hadoop.mapred.JobConf jobConf)
Convenience for
hdfs(JobConf, DistributedBiFunction)
with Map.Entry as its output type. |
static <K,V,E> Source<E> |
hdfs(org.apache.hadoop.mapred.JobConf jobConf,
DistributedBiFunction<K,V,E> mapper)
Returns a source that reads records from Apache Hadoop HDFS and emits
the results of transforming each record (a key-value pair) with the
supplied mapping function.
|
@Nonnull public static <K,V,E> Source<E> hdfs(@Nonnull org.apache.hadoop.mapred.JobConf jobConf, @Nonnull DistributedBiFunction<K,V,E> mapper)
This source splits and balances the input data among Jet processors
, doing its best to achieve
data locality. To this end the Jet cluster topology should be aligned
with Hadoop's — on each Hadoop member there should be a Jet
member.
This source does not save any state to snapshot. If the job is restarted, all entries will be emitted again.
K
- key type of the recordsV
- value type of the recordsE
- the type of the emitted valuejobConf
- JobConf for reading files with the appropriate input format and pathmapper
- mapper which can be used to map the key and value to another valueCopyright © 2017 Hazelcast, Inc.. All Rights Reserved.