public final class HdfsSinks extends Object
Modifier and Type | Method and Description
---|---
static <K,V> Sink<Map.Entry<K,V>> | hdfs(org.apache.hadoop.mapred.JobConf jobConf) Convenience for hdfs(JobConf, DistributedFunction, DistributedFunction) which expects Map.Entry<K, V> as input and extracts its key and value parts to be written to HDFS.
static <E,K,V> Sink<E> | hdfs(org.apache.hadoop.mapred.JobConf jobConf, DistributedFunction<? super E,K> extractKeyF, DistributedFunction<? super E,V> extractValueF) Returns a sink that writes to Apache Hadoop HDFS.
@Nonnull public static <E,K,V> Sink<E> hdfs(@Nonnull org.apache.hadoop.mapred.JobConf jobConf, @Nonnull DistributedFunction<? super E,K> extractKeyF, @Nonnull DistributedFunction<? super E,V> extractValueF)

Returns a sink that writes to Apache Hadoop HDFS. It transforms each received item into a key-value pair using the supplied extractKeyF and extractValueF functions.

The sink creates a number of files in the output path, identified by the cluster member ID and the Processor ID. Unlike MapReduce, the data in the files is not sorted by key.

The supplied JobConf must specify an OutputFormat with a path.

No state is saved to snapshot for this sink. After the job is restarted, the items will likely be duplicated, providing an at-least-once guarantee.

The default local parallelism for this processor is 2 (or fewer if fewer CPUs are available).
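The key-value decomposition performed by the two extractor functions can be sketched with plain JDK types. This is a minimal illustration only: TradeEvent and toPair are hypothetical names, and java.util.function.Function stands in for DistributedFunction; none of them are part of the Hazelcast Jet API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.function.Function;

public class HdfsExtractorSketch {
    // Hypothetical item type, standing in for the sink's stream item type E.
    static final class TradeEvent {
        final String ticker;
        final long volume;
        TradeEvent(String ticker, long volume) {
            this.ticker = ticker;
            this.volume = volume;
        }
    }

    // What the sink conceptually does with each item: apply extractKeyF and
    // extractValueF to produce the key-value pair written to the output file.
    static <E, K, V> Map.Entry<K, V> toPair(E item,
                                            Function<? super E, K> extractKeyF,
                                            Function<? super E, V> extractValueF) {
        return new SimpleEntry<>(extractKeyF.apply(item), extractValueF.apply(item));
    }

    public static void main(String[] args) {
        Map.Entry<String, Long> pair =
                toPair(new TradeEvent("AAPL", 100L), e -> e.ticker, e -> e.volume);
        System.out.println(pair.getKey() + "=" + pair.getValue()); // AAPL=100
    }
}
```

In the real sink, this pair production happens once per received item before the pair is handed to the configured OutputFormat.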
Type Parameters:
E - stream item type
K - type of key to write to HDFS
V - type of value to write to HDFS

Parameters:
jobConf - JobConf used for output format configuration
extractKeyF - mapper to map a key to another key
extractValueF - mapper to map a value to another value

@Nonnull public static <K,V> Sink<Map.Entry<K,V>> hdfs(@Nonnull org.apache.hadoop.mapred.JobConf jobConf)
Convenience for hdfs(JobConf, DistributedFunction, DistributedFunction) which expects Map.Entry<K, V> as input and extracts its key and value parts to be written to HDFS.

Copyright © 2018 Hazelcast, Inc. All rights reserved.
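The convenience overload behaves as if Entry::getKey and Entry::getValue were passed as the extractor functions. A JDK-only sketch of that decomposition follows; EntryConvenienceSketch and asRecordLine are hypothetical names, and the tab-separated formatting is only illustrative, since the actual on-disk record layout depends on the configured OutputFormat.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class EntryConvenienceSketch {
    // The no-extractor overload splits each Map.Entry into its key and value
    // parts, equivalent to passing Entry::getKey and Entry::getValue to the
    // three-argument form.
    static <K, V> String asRecordLine(Map.Entry<K, V> entry) {
        // Illustrative text formatting of one sink record: key part, then
        // value part. The real format is decided by the OutputFormat.
        return entry.getKey() + "\t" + entry.getValue();
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> entry = new SimpleEntry<>("word", 42);
        System.out.println(asRecordLine(entry)); // word	42
    }
}
```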