Package com.hazelcast.jet.pipeline
Interface Stage
- All Known Subinterfaces:
- BatchStage<T>, GeneralStage<T>, SinkStage, StreamStage<T>
public interface Stage
A stage is the basic element of a Jet pipeline and represents a computation step. It accepts input from its upstream stages (if any) and passes its output to its downstream stages (if any). Jet differentiates between batch stages, which represent finite data sets (batches), and stream stages, which represent infinite data streams. Some operations only make sense on a batch stage, and others only on a stream stage.
To build a pipeline, start with pipeline.readFrom() to get the initial stage and then use its methods to attach further downstream stages. Terminate the pipeline by calling stage.writeTo(sink), which will attach a SinkStage.
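The build sequence described above can be sketched as follows. This is a minimal illustration, not a canonical example: it assumes Jet 4.0 or later on the classpath, and the choice of `TestSources.items(...)` as the source, `Sinks.logger()` as the sink, and the class name `PipelineSketch` are assumptions made for the example.

```java
import com.hazelcast.jet.pipeline.BatchStage;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.SinkStage;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.TestSources;

// Hypothetical holder class for the sketch; not part of the Jet API.
class PipelineSketch {

    // Attaches three stages to the given pipeline and returns the last one.
    static SinkStage build(Pipeline p) {
        // Initial stage: a finite (batch) source of three in-memory items.
        BatchStage<String> words = p.readFrom(TestSources.items("a", "b", "c"));
        // Downstream stage, attached by calling a method on the upstream stage.
        BatchStage<String> upper = words.map(String::toUpperCase);
        // Terminal stage: writeTo attaches a SinkStage.
        return upper.writeTo(Sinks.logger());
    }

    public static void main(String[] args) {
        Pipeline p = Pipeline.create();
        SinkStage sink = build(p);
        // Every stage can report the pipeline it belongs to.
        System.out.println(sink.getPipeline() == p);
    }
}
```

Building the pipeline only describes the computation; nothing runs until the pipeline is submitted to a cluster as a job, which this sketch leaves out.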
- Since:
- Jet 3.0
-
Method Summary
Modifier and Type | Method | Description
Pipeline | getPipeline() | Returns the Pipeline this stage belongs to.
String | name() | Returns the name of this stage.
Stage | setLocalParallelism(int localParallelism) | Sets the preferred local parallelism (number of processors per Jet cluster member) this stage will configure its DAG vertices with.
Stage | setName(String name) | Overrides the default name of the stage with the name you choose and returns the stage.
-
Method Details
-
getPipeline
Pipeline getPipeline()
Returns the Pipeline this stage belongs to.
-
setLocalParallelism
Sets the preferred local parallelism (number of processors per Jet cluster member) this stage will configure its DAG vertices with. Jet always uses the same number of processors on each member, so the total parallelism automatically increases if another member joins the cluster.
While most stages are backed by 1 vertex, there are exceptions. If a stage uses two vertices, each of them will have the given local parallelism, so in total there will be twice as many processors per member.
The default value is -1, which tells Jet to choose the local parallelism itself. In that case Jet determines the vertex's local parallelism during job initialization from the global default and the processor meta-supplier's preferred value.
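The processor arithmetic described above can be worked through with a small helper. This is only a sketch: `ProcessorCount` and its methods are hypothetical names, not part of the Jet API, and it covers only an explicitly set local parallelism, since the value Jet picks for -1 is resolved at job initialization and cannot be computed here.

```java
// Hypothetical helper for illustration; not part of the Jet API.
final class ProcessorCount {

    // Processors one stage contributes on a single cluster member:
    // each vertex backing the stage gets the given local parallelism.
    static int perMember(int localParallelism, int vertices) {
        if (localParallelism < 1 || vertices < 1) {
            throw new IllegalArgumentException(
                    "needs an explicit local parallelism and at least one vertex");
        }
        return localParallelism * vertices;
    }

    // Processors the stage contributes across the whole cluster:
    // every member runs the same number of processors.
    static int clusterWide(int localParallelism, int vertices, int members) {
        if (members < 1) {
            throw new IllegalArgumentException("needs at least one member");
        }
        return perMember(localParallelism, vertices) * members;
    }

    public static void main(String[] args) {
        System.out.println(perMember(4, 1));      // 4: one vertex
        System.out.println(perMember(4, 2));      // 8: two vertices double it
        System.out.println(clusterWide(4, 2, 3)); // 24: three members
        System.out.println(clusterWide(4, 2, 4)); // 32: a fourth member joins
    }
}
```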
- Returns:
- this stage
-
setName
Overrides the default name of the stage with the name you choose and returns the stage. This can be useful for debugging purposes, to better distinguish pipeline stages in the diagnostic output.
- Parameters:
- name - the stage name
- Returns:
- this stage
-
name
Returns the name of this stage. It's used in diagnostic output.
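As a rough illustration of setName and name() together, the sketch below names a stage and reads the name back. It assumes Jet 4.0 or later on the classpath; the class name `StageNaming`, the stage name "to-upper-case", and the use of `TestSources.items(...)` and `Sinks.logger()` are choices made for the example.

```java
import com.hazelcast.jet.pipeline.BatchStage;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.TestSources;

// Hypothetical holder class for the sketch; not part of the Jet API.
class StageNaming {

    // Builds a mapping stage and overrides its default name.
    static BatchStage<String> namedStage(Pipeline p, String name) {
        return p.readFrom(TestSources.items("a", "b"))
                .map(String::toUpperCase)
                // setName returns the stage, so it fits into the call chain.
                .setName(name);
    }

    public static void main(String[] args) {
        Pipeline p = Pipeline.create();
        BatchStage<String> stage = namedStage(p, "to-upper-case");
        stage.writeTo(Sinks.logger());
        // name() returns the name that diagnostic output will show.
        System.out.println(stage.name());
    }
}
```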
-