A main concern when writing a custom source is that the underlying data is typically spread across multiple machines and partitions, so the work of reading it must be distributed across multiple members and processors.
Jet provides a flexible ProcessorMetaSupplier and ProcessorSupplier
API which can be used to control how a source is distributed across the
network.
The procedure for generating Processor instances is as follows:
- The ProcessorMetaSupplier for the Vertex is serialized and sent to the coordinating member.
- The coordinator calls ProcessorMetaSupplier.get() once for each member in the cluster and a ProcessorSupplier is created for each member.
- The ProcessorSupplier for each member is serialized and sent to that member.
- Each member calls its own ProcessorSupplier with the correct count parameter, which corresponds to the localParallelism setting of that vertex.
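
The four steps above can be modelled in plain Java. The sketch below does not use Jet's own classes: SketchMetaSupplier, SketchProcessorSupplier, sendOverNetwork and the member names are hypothetical stand-ins chosen for illustration, processors are represented as simple strings, and the "network" is simulated with a Java serialization round trip. The real ProcessorMetaSupplier and ProcessorSupplier signatures may differ, so treat this only as an outline of the control flow.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SupplierFlowSketch {

    /** Hypothetical stand-in for the per-member supplier: get(count) creates count processors. */
    interface SketchProcessorSupplier extends Serializable {
        List<String> get(int count);
    }

    /** Hypothetical stand-in for the meta-supplier: get(member) creates that member's supplier. */
    interface SketchMetaSupplier extends Serializable {
        SketchProcessorSupplier get(String member);
    }

    /** Creates "processors" (plain strings here) tagged with the member they belong to. */
    static final class MemberSupplier implements SketchProcessorSupplier {
        private final String member;

        MemberSupplier(String member) {
            this.member = member;
        }

        @Override
        public List<String> get(int count) {
            List<String> processors = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                processors.add(member + "#" + i);
            }
            return processors;
        }
    }

    /** Creates one MemberSupplier per member. */
    static final class MetaSupplier implements SketchMetaSupplier {
        @Override
        public SketchProcessorSupplier get(String member) {
            return new MemberSupplier(member);
        }
    }

    /** Simulates sending an object to another member by serializing and deserializing it. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T sendOverNetwork(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Runs the four steps and returns every processor created across all members. */
    static List<String> run(int localParallelism, String... members) {
        // Step 1: the meta-supplier is serialized and sent to the coordinator.
        SketchMetaSupplier onCoordinator = sendOverNetwork(new MetaSupplier());
        List<String> created = new ArrayList<>();
        for (String member : members) {
            // Step 2: the coordinator obtains one supplier for each member.
            SketchProcessorSupplier supplier = onCoordinator.get(member);
            // Step 3: that supplier is serialized and sent to its member.
            SketchProcessorSupplier onMember = sendOverNetwork(supplier);
            // Step 4: the member calls its supplier with count == localParallelism.
            created.addAll(onMember.get(localParallelism));
        }
        return created;
    }

    public static void main(String[] args) {
        for (String processor : run(2, "member-1", "member-2")) {
            System.out.println(processor);
        }
    }
}
```

With two members and a local parallelism of 2, the sketch creates four processors in total, two per member. Because both supplier levels travel over the network, anything they capture (here, only the member name) has to be serializable.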
