public class HadoopWordCount extends Object
For more details about the word count pipeline itself, please see the JavaDoc
of the WordCount class.
HadoopSources.inputFormat(Configuration, BiFunctionEx) is a source
that can be used for reading from HDFS given a Configuration
with input paths and input formats. The files in the input folder
will be split among Jet processors, using Hadoop input splits.
HadoopSinks.outputFormat(Configuration, FunctionEx, FunctionEx)
writes the output to the given output path, with each
processor writing to a single file within the path. The files are
identified by the member ID and the local ID of the writing processor.
Unlike in MapReduce, the data in the output files is not sorted by key.
In this example, files are read using TextInputFormat and written using
TextOutputFormat, but the
example can be adjusted to be used with any input/output format.
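
The output behaviour described above (one file per writing processor, contents not sorted by key) can be illustrated with a small plain-Java model. This is only a sketch and does not use the Jet or Hadoop APIs: the class name PartitionedWordCount, the count method, the hash-based routing and the "part-N" labels are all assumptions made for illustration, not the names or file layout that Jet actually produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the distributed job; not part of Jet or Hadoop.
final class PartitionedWordCount {

    /**
     * Counts words, routing each word to one of {@code processors}
     * partitions by the hash of the word. Each returned map models the
     * single output file written by one processor.
     */
    static List<Map<String, Long>> count(List<String> lines, int processors) {
        List<Map<String, Long>> parts = new ArrayList<>();
        for (int i = 0; i < processors; i++) {
            // LinkedHashMap keeps arrival order, so entries are not sorted by key
            parts.add(new LinkedHashMap<>());
        }
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split() can yield a leading empty token
                }
                // The same word always lands in the same partition
                int target = Math.floorMod(word.hashCode(), processors);
                parts.get(target).merge(word, 1L, Long::sum);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "that is the question");
        List<Map<String, Long>> parts = count(lines, 3);
        for (int i = 0; i < parts.size(); i++) {
            // "part-i" is a made-up label, not the file name Jet generates
            System.out.println("part-" + i + " " + parts.get(i));
        }
    }
}
```

Because every occurrence of a word is routed to the same partition, each word's total appears in exactly one modelled output file, and a consumer that needs key order has to sort the merged files itself.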
Copyright © 2020 Hazelcast, Inc. All rights reserved.