public interface Processor
Processor
on each cluster member to do the work of a given vertex. The
vertex's localParallelism
property controls the number of
processors per member.
The processor is a single-threaded processing unit that performs the computation needed to transform zero or more input data streams into zero or more output streams. Each input/output stream corresponds to an edge on the vertex. The correspondence between a stream and an edge is established via the edge's ordinal.
The special case of zero input streams applies to a source vertex, which gets its data from the environment. The special case of zero output streams applies to a sink vertex, which pushes its data to the environment.
The processor accepts input from instances of Inbox
and pushes
its output to an instance of Outbox
.
See the isCooperative()
for important restrictions to how the
processor should work.
Modifier and Type | Interface and Description |
---|---|
static interface |
Processor.Context
Context passed to the processor in the
init() call. |
Modifier and Type | Method and Description |
---|---|
default void |
close()
Called as the last method in the processor lifecycle.
|
default boolean |
complete()
Called after all the inbound edges' streams are exhausted.
|
default boolean |
completeEdge(int ordinal)
Called after the edge input with the supplied
ordinal is
exhausted. |
default boolean |
finishSnapshotRestore()
Called after a job was restarted from a snapshot and the processor
has consumed all the snapshot data.
|
default void |
init(Outbox outbox,
Processor.Context context)
Initializes this processor with the outbox that the processing methods
must use to deposit their output items.
|
default boolean |
isCooperative()
Tells whether this processor is able to participate in cooperative
multithreading.
|
default void |
process(int ordinal,
Inbox inbox)
Called with a batch of items retrieved from an inbound edge's stream.
|
default void |
restoreFromSnapshot(Inbox inbox)
Called when a batch of items is received during the "restore from
snapshot" operation.
|
default boolean |
saveToSnapshot()
Stores its snapshotted state by adding items to the outbox's snapshot bucket.
|
default boolean |
tryProcess()
This method will be called periodically and only when the current batch
of items in the inbox has been exhausted.
|
default boolean |
tryProcessWatermark(Watermark watermark)
Tries to process the supplied watermark.
|
default boolean isCooperative()
A cooperative processor should also not attempt any blocking operations,
such as I/O operations, waiting for locks/semaphores or sleep
operations. Violations of this rule will manifest as less than 100% CPU
usage under maximum load. The processor must also return as soon as the
outbox rejects an item (that is when the offer()
method returns false
).
If this processor declares itself cooperative, it will share a thread with other cooperative processors. Otherwise it will run in a dedicated Java thread.
Jet prefers cooperative processors because they result in a greater overall throughput. A processor should be non-cooperative only if it involves blocking operations, which would cause all other processors on the same shared thread to starve.
Processor instances of a single vertex are allowed to return different values, but a single processor instance must always return the same value.
The default implementation returns true
.
default void init(@Nonnull Outbox outbox, @Nonnull Processor.Context context)
process(int, Inbox)
and complete()
).
The default implementation does nothing.
default void process(int ordinal, @Nonnull Inbox inbox)
If the method returns with items still present in the inbox, it will be called again before proceeding to call any other methods. There is at least one item in the inbox when this method is called.
The default implementation does nothing.
ordinal
- ordinal of the inbound edgeinbox
- the inbox containing the pending itemsdefault boolean tryProcessWatermark(@Nonnull Watermark watermark)
maximum retention time
has elapsed.
The implementation may choose to process only partially and return
false
, in which case it will be called again later with the same
timestamp
before any other processing method is called. When the
method returns true
, the watermark is forwarded to the
downstream processors.
The default implementation just returns true
.
watermark
- watermark to be processedtrue
if this watermark has now been processed,
false
otherwise.default boolean tryProcess()
If the call returns false
, it will be called again before proceeding
to call any other method. Default implementation returns true
.
default boolean completeEdge(int ordinal)
ordinal
is
exhausted. If it returns false
, it will be called again before
proceeding to call any other method.true
if the processor is now done completing the edge,
false
otherwise.default boolean complete()
false
, it will be invoked again until it returns true
.
For example, a streaming source processor will return false
forever. Unlike other methods which guarantee that no other method is
called until they return true
, saveToSnapshot()
can be
called even though this method returned false
. If you returned
because Outbox.offer()
returned false
, make sure to
first offer the pending item to the outbox in saveToSnapshot()
before continuing to offer to
snapshot.
After this method is called, no other processing methods are called on
this processor, except for saveToSnapshot()
.
Non-cooperative processors are required to return from this method from time to time to give the system a chance to check for snapshot requests and job cancellation. The time the processor spends in this method affects the latency of snapshots and job cancellations.
true
if the completing step is now done, false
otherwise.default boolean saveToSnapshot()
false
, it will be called again before proceeding to call any
other method.
This method will only be called after a call to process()
returns and the inbox is empty. After all the input is
exhausted, it may also be called between complete()
calls. Once
complete()
returns true
, this method won't be called
anymore.
Note: if you returned from complete()
because some of
the Outbox.offer()
method returned false, you need to make sure
to re-offer the pending item in this method before offering any items to
Outbox.offerToSnapshot(java.lang.Object, java.lang.Object)
.
The default implementation takes no action and returns true
.
default void restoreFromSnapshot(@Nonnull Inbox inbox)
Map.Entry
. May emit items to the outbox.
If it returns with items still present in the inbox, it will be called again before proceeding to call any other methods. It is never called with an empty inbox.
The default implementation throws an exception.
default boolean finishSnapshotRestore()
If it returns false
, it will be called again before proceeding
to call any other methods.
The default implementation takes no action and returns true
.
default void close() throws Exception
ProcessorSupplier.close(java.lang.Throwable)
is called on this member. The method might get
called even if init(com.hazelcast.jet.core.Outbox, com.hazelcast.jet.core.Processor.Context)
method was not yet called.
The method will be called right after complete()
returns true
, that is before the job is finished. The job might still be
running other processors.
If this method throws an exception, it is logged but it won't be reported as a job failure or cause the job to fail.
Exception
Copyright © 2018 Hazelcast, Inc.. All rights reserved.