One of the main concerns when writing custom sources is that the source is typically distributed across multiple machines and partitions, and the work needs to be distributed across multiple members and processors.
Jet provides a flexible API which can be used to control how a source is distributed across members and processors.
The procedure for generating `Processor` instances is as follows:
- The `ProcessorMetaSupplier` for the `Vertex` is serialized and sent to the coordinating member.
- The coordinator calls `ProcessorMetaSupplier.get()` once for each member in the cluster and a `ProcessorSupplier` is created for each member.
- The `ProcessorSupplier` for each member is serialized and sent to that member.
- Each member calls its own `ProcessorSupplier` with the correct `count` parameter, which corresponds to the `localParallelism` setting of that vertex.
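
The steps above can be pictured with a small plain-Java simulation. This is only a sketch: `SketchMetaSupplier`, `SketchSupplier`, `simulate` and the member names are hypothetical stand-ins invented for illustration, not the Jet interfaces, and serialization and network transfer are left out. It does not try to reproduce the exact signatures of `ProcessorMetaSupplier` or `ProcessorSupplier`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.IntFunction;

/** Simulation of the supplier hierarchy; all types are hypothetical stand-ins. */
public class SupplierFlowSketch {

    /** Stand-in for a per-member supplier: given a count, creates that many processors. */
    interface SketchSupplier extends IntFunction<List<String>> {}

    /** Stand-in for a meta-supplier: given a member, creates that member's supplier. */
    interface SketchMetaSupplier extends Function<String, SketchSupplier> {}

    /** Returns, for each member, the labels of the processors created on it. */
    static Map<String, List<String>> simulate(List<String> members, int localParallelism) {
        // Step 1: the meta-supplier is available on the coordinator
        // (in a real cluster it would arrive there in serialized form).
        SketchMetaSupplier metaSupplier = member -> count -> {
            List<String> processors = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                // A processor is represented only by a label here.
                processors.add(member + "#" + i);
            }
            return processors;
        };

        Map<String, List<String>> processorsByMember = new LinkedHashMap<>();
        for (String member : members) {
            // Step 2: the coordinator obtains one supplier per member.
            SketchSupplier supplier = metaSupplier.apply(member);
            // Step 3: the supplier would now be serialized and sent to that member (omitted).
            // Step 4: the member calls its supplier with count = localParallelism.
            processorsByMember.put(member, supplier.apply(localParallelism));
        }
        return processorsByMember;
    }

    public static void main(String[] args) {
        simulate(List.of("member-A", "member-B"), 2)
                .forEach((member, processors) -> System.out.println(member + " -> " + processors));
    }
}
```

In this sketch the total number of processors is the number of members multiplied by the local parallelism, which is why a custom source usually needs to decide, per member and per processor index, which part of the input it is responsible for.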