The hazelcast-jet-hadoop module lets Jet jobs read from and write to
Apache Hadoop.
The ReadHdfsP and WriteHdfsP classes provide the source and sink
processors used for reading and writing, respectively. Each processor
takes a JobConf as its parameter, which specifies the InputFormat or
OutputFormat to use and the corresponding input or output paths.
Example:
// inputPath and outputPath are org.apache.hadoop.fs.Path instances;
// dag is an existing Jet DAG.
JobConf jobConf = new JobConf();
jobConf.setInputFormat(TextInputFormat.class);
jobConf.setOutputFormat(TextOutputFormat.class);
TextInputFormat.addInputPath(jobConf, inputPath);
TextOutputFormat.setOutputPath(jobConf, outputPath);
// The same JobConf configures both the source and the sink.
Vertex source = dag.newVertex("source", ReadHdfsP.readHdfs(jobConf));
Vertex sink = dag.newVertex("sink", WriteHdfsP.writeHdfs(jobConf));
...
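It may help to see what kind of records a TextInputFormat-based source works with. Hadoop's TextInputFormat produces one record per line, keyed by the byte offset at which the line starts. The sketch below mimics that record shape in plain Java, with no Hadoop or Jet dependency. The names TextRecordShape and toRecords are hypothetical, Long and String stand in for Hadoop's LongWritable and Text, and it is an assumption (not stated above) that the source passes these records downstream as key-value entries.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or Jet: illustrates record shape only.
class TextRecordShape {

    // Splits text into (byte offset, line) pairs, one per line.
    // Simplifications: only '\n' is treated as a line terminator, the text is
    // assumed to be UTF-8 on disk, and the whole input is a single split.
    static List<Map.Entry<Long, String>> toRecords(String contents) {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        long byteOffset = 0;
        int start = 0;
        while (start < contents.length()) {
            int end = contents.indexOf('\n', start);
            if (end < 0) {
                end = contents.length(); // last line has no trailing newline
            }
            String line = contents.substring(start, end);
            records.add(new SimpleImmutableEntry<>(byteOffset, line));
            // Advance by the encoded size of the line plus one byte for '\n'.
            byteOffset += line.getBytes(StandardCharsets.UTF_8).length + 1;
            start = end + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        // "\u00f6" is a two-byte character in UTF-8, so byte offsets and
        // character offsets diverge after the second line.
        for (Map.Entry<Long, String> e : toRecords("hello\nw\u00f6rld\nend\n")) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

For the three-line input in main, the keys are 0, 6 and 13. The keys are byte offsets rather than line numbers, so a downstream vertex that only cares about the text would typically use the entry's value and ignore the key.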
See the HDFS code sample for a fully worked example.
