Interface Job
DAG
or
Pipeline
to the cluster. See JetService
for methods to
submit jobs and to get a handle to an existing job.- Since:
- Jet 3.0
-
Method Summary
Modifier and TypeMethodDescriptionaddStatusListener
(JobStatusListener listener) Associates the given listener to this job.void
cancel()
Makes a request to cancel this job and returns.Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation).exportSnapshot
(String name) Exports a state snapshot and saves it under the given name.Returns the configuration this job was submitted with.Gets the future associated with the job.long
getId()
Returns the ID of this job.default String
Returns the string representation of this job's ID.Returns a snapshot of the current values of all job-specific metrics.getName()
Returns the name of this job ornull
if no name was supplied.Returns the current status of this job.long
Returns the time when the job was submitted to the cluster.Return adescription of the cause
that has led to the suspension of the job.boolean
Returnstrue
if this instance represents a light job.boolean
Returns true, if the job is user-cancelled.default void
join()
Waits for the job to complete and throws an exception if the job completes with an error.boolean
Stops delivering all events to the listener with the given registration id.void
restart()
Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster.void
resume()
Resumes a suspended job.void
suspend()
Gracefully suspends the current execution of the job.updateConfig
(DeltaJobConfig deltaConfig) Applies the specified delta configuration to asuspended
job and returns the updated configuration.
-
Method Details
-
isLightJob
boolean isLightJob()Returnstrue
if this instance represents a light job. For a light job, many of the methods in this interface throwUnsupportedOperationException
. -
getId
long getId()Returns the ID of this job. -
getIdString
Returns the string representation of this job's ID. -
join
default void join()Waits for the job to complete and throws an exception if the job completes with an error. Never returns for streaming (unbounded) jobs, unless they fail or are cancelled. In rare cases it can happen that after this method returns, the job is not yet fully cleaned up.Joining a suspended job will block until that job is resumed and completes.
Shorthand for
job.getFuture().join()
.- Throws:
CancellationException
- if the job was cancelled
-
getFuture
Gets the future associated with the job. The returned future is not cancellable. To cancel the job, thecancel()
method should be used.- Throws:
IllegalStateException
- if the job has not started yet.
-
cancel
void cancel()Makes a request to cancel this job and returns. The job will complete after its execution has stopped on all the nodes, which can happen some time after this method returns.After cancellation,
join()
will throw aCancellationException
.If the job is already suspended, Jet will delete its runtime resources and snapshots and it won't be able to resume again.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of cancellation, it may end up getting restarted after the cluster has stabilized and won't be cancelled. Call
getStatus()
to find out and possibly try to cancel again.The job status will be
JobStatus.FAILED
after cancellation.See
cancelAndExportSnapshot(String)
to cancel with a terminal snapshot.- Throws:
IllegalStateException
- if the cluster is not in a state to restart the job, for example when coordinator member left and new coordinator did not yet load job's metadata.JobNotFoundException
- for light jobs, if the job already completed
-
getSubmissionTime
long getSubmissionTime()Returns the time when the job was submitted to the cluster.The time is assigned by reading
System.currentTimeMillis()
of the coordinator member that executes the job for the first time. It doesn't change on restart. -
getName
Returns the name of this job ornull
if no name was supplied.Jobs can be named through
JobConfig.setName(String)
prior to submission. For light jobs it always returnsnull
. -
getStatus
-
isUserCancelled
boolean isUserCancelled()Returns true, if the job is user-cancelled. Returns false, if it completed normally or failed due to another error. Jobs running in clusters before version 5.3 lack this information and will always return false.- Throws:
IllegalStateException
- if the job is not done.- Since:
- 5.3
-
addStatusListener
Associates the given listener to this job. The listener is automatically removed after a terminal event.- Returns:
- The registration id
- Throws:
UnsupportedOperationException
- if the cluster version is less than 5.3IllegalStateException
- if the job is completed or failed- Since:
- 5.3
-
removeStatusListener
Stops delivering all events to the listener with the given registration id.- Returns:
- Whether the specified registration was removed
- Throws:
UnsupportedOperationException
- if the cluster version is less than 5.3- Since:
- 5.3
-
getConfig
Returns the configuration this job was submitted with. Changes made to the returned config object will not have any effect. -
updateConfig
Applies the specified delta configuration to asuspended
job and returns the updated configuration.- Throws:
IllegalStateException
- if this job is not suspendedUnsupportedOperationException
- if called for a light job- Since:
- 5.3
-
getSuspensionCause
Return adescription of the cause
that has led to the suspension of the job. Throws anIllegalStateException
if the job is not currently suspended. Not supported for light jobs.- Throws:
UnsupportedOperationException
- if called for a light job- Since:
- Jet 4.3
-
getMetrics
Returns a snapshot of the current values of all job-specific metrics. Not supported for light jobs.While the job is running the metric values are updated periodically (see metrics collection frequency), assuming that both global metrics collection and per-job metrics collection are enabled. Otherwise empty metrics will be returned.
Keep in mind that the collections may occur at different times on each member, metrics from various members aren't from the same instant.
When a job is restarted (or resumed after being previously suspended) the metrics are reset too, their values will reflect only updates from the latest execution of the job.
Once a job completes successfully, the metrics will have their most recent values (i.e. the last metric values from the moment before the job completed), assuming that
metrics storage
was enabled. If a job fails, is cancelled or suspended, empty metrics will be returned.- Throws:
UnsupportedOperationException
- if called for a light job- Since:
- Jet 3.2
-
restart
void restart()Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster. Can be called to manually make use of added members, if auto scaling is disabled. Only a running job can be restarted; a suspended job must be resumed.Conceptually this call is equivalent to
suspend()
&resume()
. Not supported for light jobs.- Throws:
IllegalStateException
- if the job is not running, for example it has already completed, is not yet running, is already restarting, suspended etc.UnsupportedOperationException
- if called for a light job
-
suspend
void suspend()Gracefully suspends the current execution of the job. The job's status will becomeJobStatus.SUSPENDED
. To resume the job, callresume()
. Not supported for light jobs.You can suspend a job even if it's not configured for snapshotting. Such a job will resume with empty state, as if it has just been started.
This call just initiates the suspension process and doesn't wait for it to complete. Suspension starts with creating a terminal state snapshot. Should the terminal snapshot fail, the job will suspend anyway, but the previous snapshot (if there was one) won't be deleted. When the job resumes, its processing starts from the point of the last snapshot.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of being suspended, it may end up getting immediately restarted. Call
getStatus()
to find out and possibly try to suspend again.- Throws:
IllegalStateException
- if the job is not runningUnsupportedOperationException
- if called for a light job
-
resume
void resume()Resumes a suspended job. The job will resume from the last known successful snapshot, if there is one. Not supported for light jobsIf the job is not suspended, it does nothing.
- Throws:
UnsupportedOperationException
- if called for a light job
-
cancelAndExportSnapshot
Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation). It's similar tosuspend()
followed by acancel()
, except that it won't process any more data after the snapshot. Not supported for light jobs.You can use the exported snapshot as a starting point for a new job. The job doesn't need to execute the same Pipeline as the job that created it, it must just be compatible with its state data. To achieve this, use
JobConfig.setInitialSnapshotName(String)
.Unlike
exportSnapshot(java.lang.String)
method, when a snapshot is created using this method Jet will commit the external transactions because this snapshot is the last one created for the job and it's safe to use it to continue the processing.If the terminal snapshot fails, Jet will suspend this job instead of cancelling it.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of being cancelled, it may end up getting immediately restarted. Call
getStatus()
to find out and possibly try to cancel again. Should this restart happen, created snapshot cannot be regarded as exported terminal snapshot. It shall be neither overwritten nor deleted until there is a new snapshot created, either automatic or via cancelAndExportSnapshot method, otherwise data can be lost. If cancelAndExportSnapshot is used again, ensure that there was a regular snapshot made or use different snapshot name.You can call this method for a suspended job, too: in that case it will export the last successful snapshot and cancel the job.
The method call will block until it has fully exported the snapshot, but may return before the job has stopped executing.
For more information about "exported state" see
exportSnapshot(String)
.The job status will be
JobStatus.FAILED
after cancellation,join()
will throw aCancellationException
.- Parameters:
name
- name of the snapshot. If name is already used, it will be overwritten- Throws:
JetException
- if the job is in an incorrect state: completed, cancelled or is in the process of restarting or suspending.UnsupportedOperationException
- if called for a light job
-
exportSnapshot
Exports a state snapshot and saves it under the given name. You can start a new job using the exported state usingJobConfig.setInitialSnapshotName(String)
. Not supported for light jobs.The snapshot will be independent of the job that created it. Jet won't automatically delete the IMap it is exported into. You must manually call snapshot.destroy() to delete it. If your state is large, make sure you have enough memory to store it. The snapshot created using this method will also not be used for automatic restart - should the job fail, the previous automatically saved snapshot will be used.
For transactional sources or sinks (that is those which use transactions to confirm reads or to commit writes), Jet will not commit the transactions when creating a snapshot using this method. The reason for this is that such connectors only achieve exactly-once guarantee if the job restarts from the latest snapshot. But, for example, if the job fails after exporting a snapshot but before it creates a new automatic one, the job would restart from the previous automatic snapshot and the stored internal and committed external state will be from a different point in time and a data loss will occur.
If a snapshot with the same name already exists, it will be overwritten. If a snapshot is already in progress for this job (either automatic or user-requested), the requested one will wait and start immediately after the previous one completes. If a snapshot with the same name is requested for two jobs at the same time, their data will likely be damaged (similar to two processes writing to the same file).
You can call this method on a suspended job: in that case it will export the last successful snapshot. You can also export the state of non-snapshotted jobs (those with
ProcessingGuarantee.NONE
).If you issue any graceful job-control actions such as a graceful member shutdown or suspending a snapshotted job while Jet is exporting a snapshot, they will wait in a queue for this snapshot to complete. Forceful job-control actions will interrupt the export procedure.
You can access the exported state using
JetService.getJobStateSnapshot(String)
.The method call will block until it has fully exported the snapshot.
- Parameters:
name
- name of the snapshot. If name is already used, it will be overwritten- Throws:
JetException
- if the job is in an incorrect state: completed, cancelled or is in the process of restarting or suspending.UnsupportedOperationException
- if called for a light job
-