public interface Job

A handle to a Jet computation job created by submitting a DAG or Pipeline to the cluster. See JetService for methods to submit jobs and to get a handle to an existing job.

Modifier and Type | Method and Description |
---|---|
void | cancel() Makes a request to cancel this job and returns. |
JobStateSnapshot | cancelAndExportSnapshot(String name) Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation). |
JobStateSnapshot | exportSnapshot(String name) Exports a state snapshot and saves it under the given name. |
JobConfig | getConfig() Returns the configuration this job was submitted with. |
CompletableFuture<Void> | getFuture() Gets the future associated with the job. |
long | getId() Returns the ID of this job. |
default String | getIdString() Returns the string representation of this job's ID. |
JobMetrics | getMetrics() Returns a snapshot of the current values of all job-specific metrics. |
default String | getName() Returns the name of this job or null if no name was supplied. |
JobStatus | getStatus() Returns the current status of this job. |
long | getSubmissionTime() Returns the time when the job was submitted to the cluster. |
JobSuspensionCause | getSuspensionCause() Returns a description of the cause that has led to the suspension of the job. |
boolean | isLightJob() Returns true if this instance represents a light job. |
default void | join() Waits for the job to complete and throws an exception if the job completes with an error. |
void | restart() Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster. |
void | resume() Resumes a suspended job. |
void | suspend() Gracefully suspends the current execution of the job. |

boolean isLightJob()
Returns true if this instance represents a light job. For a light job, many of the methods in this interface throw UnsupportedOperationException.

long getId()
Returns the ID of this job.

@Nonnull default String getIdString()
Returns the string representation of this job's ID.

default void join()
Waits for the job to complete and throws an exception if the job completes with an error. Joining a suspended job will block until that job is resumed and completes.
Shorthand for job.getFuture().join().
Throws:
CancellationException - if the job was cancelled

@Nonnull CompletableFuture<Void> getFuture()
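Gets the future associated with the job. The returned future is not cancellable. To cancel the job, the cancel() method should be used.
Throws:
IllegalStateException - if the job has not started yet.

void cancel()
Makes a request to cancel this job and returns. After cancellation, join() will throw a CancellationException.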
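As an illustration of how a caller might tell the possible outcomes of join() apart, the following minimal sketch uses a plain CompletableFuture<Void> as a stand-in for the future returned by getFuture(); no Jet classes are involved. The names JoinOutcomeSketch and describeOutcome are hypothetical, and completing the stand-in future with a CancellationException is an assumption made only to simulate a cancelled job.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of the Jet API.
final class JoinOutcomeSketch {

    // Blocks until the future completes and reports how it ended.
    static String describeOutcome(CompletableFuture<Void> jobFuture) {
        try {
            jobFuture.join();
            return "completed";
        } catch (CancellationException e) {
            // The documented outcome of join() after a cancellation.
            return "cancelled";
        } catch (RuntimeException e) {
            // Any other exceptional completion is treated as a failure.
            return "failed";
        }
    }

    public static void main(String[] args) {
        // Assumption: a cancelled job is simulated by completing the
        // stand-in future exceptionally with a CancellationException.
        CompletableFuture<Void> cancelled = new CompletableFuture<>();
        cancelled.completeExceptionally(new CancellationException());
        System.out.println(describeOutcome(cancelled));

        // A future that completed normally stands in for a finished job.
        System.out.println(describeOutcome(CompletableFuture.completedFuture(null)));
    }
}
```

With a real job handle, the same try/catch shape would wrap job.join() directly.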
If the job is already suspended, Jet will delete its runtime resources and snapshots and it won't be able to resume again.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of cancellation, it may end up getting restarted after the cluster has stabilized and won't be cancelled. Call getStatus() to find out and possibly try to cancel again.
The job status will be JobStatus.FAILED after cancellation.
See cancelAndExportSnapshot(String) to cancel with a terminal snapshot.
Throws:
IllegalStateException - if the cluster is not in a state to restart the job, for example when the coordinator member left and the new coordinator has not yet loaded the job's metadata.
JobNotFoundException - for light jobs, if the job already completed

long getSubmissionTime()
Returns the time when the job was submitted to the cluster.
The time is assigned by reading System.currentTimeMillis() of the coordinator member that executes the job for the first time. It doesn't change on restart.
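Because the value comes from System.currentTimeMillis(), it can be treated as milliseconds since the Unix epoch. The sketch below shows one way to turn such a value into java.time types. The class and method names are hypothetical and only JDK classes are used. Note that the timestamp is taken on the coordinator member, so an age computed against the local clock is only as accurate as the clock synchronization between the two machines.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of the Jet API.
final class SubmissionTimeSketch {

    // Interprets the submission time as epoch milliseconds.
    static Instant toInstant(long submissionTimeMillis) {
        return Instant.ofEpochMilli(submissionTimeMillis);
    }

    // Time elapsed between submission and the given "now", both in epoch millis.
    static Duration age(long submissionTimeMillis, long nowMillis) {
        return Duration.ofMillis(nowMillis - submissionTimeMillis);
    }

    public static void main(String[] args) {
        // Stand-in for job.getSubmissionTime(): one minute before now.
        long now = System.currentTimeMillis();
        long submitted = now - 60_000L;
        System.out.println("Submitted at " + toInstant(submitted));
        System.out.println("Age: " + age(submitted, now));
    }
}
```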

@Nullable default String getName()
Returns the name of this job or null if no name was supplied.
Jobs can be named through JobConfig.setName(String) prior to submission. For light jobs it always returns null.

@Nonnull JobConfig getConfig()
Returns the configuration this job was submitted with.

@Nonnull JobSuspensionCause getSuspensionCause()
Returns a description of the cause that has led to the suspension of the job. Throws an IllegalStateException if the job is not currently suspended. Not supported for light jobs.
Throws:
UnsupportedOperationException - if called for a light job

@Nonnull JobMetrics getMetrics()
Returns a snapshot of the current values of all job-specific metrics.
While the job is running, the metric values are updated periodically (see metrics collection frequency), assuming that both global metrics collection and per-job metrics collection are enabled. Otherwise empty metrics will be returned.
Keep in mind that the collections may occur at different times on each member, so metrics from various members aren't from the same instant.
When a job is restarted (or resumed after being previously suspended) the metrics are reset too; their values will reflect only updates from the latest execution of the job.
Once a job completes successfully, the metrics will have their most recent values (i.e. the last metric values from the moment before the job completed), assuming that metrics storage was enabled. If a job fails, is cancelled or suspended, empty metrics will be returned.
Throws:
UnsupportedOperationException - if called for a light job

void restart()
Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster. Conceptually this call is equivalent to suspend() & resume(). Not supported for light jobs.
Throws:
IllegalStateException - if the job is not running, for example it has already completed, is not yet running, is already restarting, suspended etc.
UnsupportedOperationException - if called for a light job

void suspend()
Gracefully suspends the current execution of the job. The job's status will become JobStatus.SUSPENDED. To resume the job, call resume(). Not supported for light jobs.
You can suspend a job even if it's not configured for snapshotting. Such a job will resume with empty state, as if it has just been started.
This call just initiates the suspension process and doesn't wait for it to complete. Suspension starts with creating a terminal state snapshot. Should the terminal snapshot fail, the job will suspend anyway, but the previous snapshot (if there was one) won't be deleted. When the job resumes, its processing starts from the point of the last snapshot.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of being suspended, it may end up getting immediately restarted. Call getStatus() to find out and possibly try to suspend again.
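The documented failure modes of suspend() can be handled defensively, as in the following minimal sketch. It does not use the real Job interface: JobHandle is a hypothetical stand-in that declares only isLightJob() and suspend(), and the Outcome enum and requestSuspend method are likewise invented for illustration.

```java
// Hypothetical helper, not part of the Jet API.
final class SuspendSketch {

    // Hypothetical stand-in for the two Job methods this sketch calls.
    interface JobHandle {
        boolean isLightJob();
        void suspend();
    }

    enum Outcome { REQUESTED, LIGHT_JOB, NOT_RUNNING }

    // Requests suspension without waiting for it to complete.
    static Outcome requestSuspend(JobHandle job) {
        if (job.isLightJob()) {
            // suspend() is documented as unsupported for light jobs,
            // so skip the call instead of triggering the exception.
            return Outcome.LIGHT_JOB;
        }
        try {
            job.suspend(); // only initiates the suspension process
            return Outcome.REQUESTED;
        } catch (IllegalStateException e) {
            // Documented for a job that is not running.
            return Outcome.NOT_RUNNING;
        }
    }
}
```

A REQUESTED result only means the request was accepted. As the note above says, the job may still end up restarted, so a caller would check getStatus() afterwards and possibly try again.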
Throws:
IllegalStateException - if the job is not running
UnsupportedOperationException - if called for a light job

void resume()
Resumes a suspended job. If the job is not suspended, it does nothing.
Throws:
UnsupportedOperationException - if called for a light job

JobStateSnapshot cancelAndExportSnapshot(String name)
Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation). It's similar to suspend() followed by a cancel(), except that it won't process any more data after the snapshot. Not supported for light jobs.
You can use the exported snapshot as a starting point for a new job. The job doesn't need to execute the same Pipeline as the job that created it; it only needs to be compatible with its state data. To achieve this, use JobConfig.setInitialSnapshotName(String).
Unlike with the exportSnapshot(String) method, when a snapshot is created using this method Jet will commit the external transactions, because this snapshot is the last one created for the job and it's safe to use it to continue the processing.
If the terminal snapshot fails, Jet will suspend this job instead of cancelling it.
You can call this method for a suspended job, too: in that case it will export the last successful snapshot and cancel the job.
The method call will block until it has fully exported the snapshot, but may return before the job has stopped executing.
For more information about "exported state" see exportSnapshot(String).
The job status will be JobStatus.FAILED after cancellation, and join() will throw a CancellationException.
Parameters:
name - name of the snapshot. If name is already used, it will be overwritten
Throws:
JetException - if the job is in an incorrect state: completed, cancelled or is in the process of restarting or suspending.
UnsupportedOperationException - if called for a light job

JobStateSnapshot exportSnapshot(String name)
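Exports a state snapshot and saves it under the given name. You can start a new job from the exported state using JobConfig.setInitialSnapshotName(String). Not supported for light jobs.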
The snapshot will be independent from the job that created it. Jet won't automatically delete the IMap it is exported into. You must manually call snapshot.destroy() to delete it. If your state is large, make sure you have enough memory to store it. The snapshot created using this method will also not be used for automatic restart - should the job fail, the previous automatically saved snapshot will be used.
For transactional sources or sinks (that is those which use transactions to confirm reads or to commit writes), Jet will not commit the transactions when creating a snapshot using this method. The reason for this is that such connectors only achieve exactly-once guarantee if the job restarts from the latest snapshot. But, for example, if the job fails after exporting a snapshot but before it creates a new automatic one, the job would restart from the previous automatic snapshot and the stored internal and committed external state will be from a different point in time and a data loss will occur.
If a snapshot with the same name already exists, it will be overwritten. If a snapshot is already in progress for this job (either automatic or user-requested), the requested one will wait and start immediately after the previous one completes. If a snapshot with the same name is requested for two jobs at the same time, their data will likely be damaged (similar to two processes writing to the same file).
You can call this method on a suspended job: in that case it will export the last successful snapshot. You can also export the state of non-snapshotted jobs (those with ProcessingGuarantee.NONE).
If you issue any graceful job-control actions such as a graceful member shutdown or suspending a snapshotted job while Jet is exporting a snapshot, they will wait in a queue for this snapshot to complete. Forceful job-control actions will interrupt the export procedure.
You can access the exported state using JetService.getJobStateSnapshot(String).
The method call will block until it has fully exported the snapshot.
Parameters:
name - name of the snapshot. If name is already used, it will be overwritten
Throws:
JetException - if the job is in an incorrect state: completed, cancelled or is in the process of restarting or suspending.
UnsupportedOperationException - if called for a light job

Copyright © 2022 Hazelcast, Inc. All rights reserved.