Class DAG

java.lang.Object
com.hazelcast.jet.core.DAG
All Implemented Interfaces:
DataSerializable, IdentifiedDataSerializable, Iterable<Vertex>

public class DAG extends Object implements IdentifiedDataSerializable, Iterable<Vertex>
Describes a computation to be performed by the Jet computation engine. A vertex represents a unit of data processing and an edge represents the path along which the data travels to the next vertex.

The work of a single vertex is parallelized and distributed: each member runs several instances of the Processor type that corresponds to the vertex. Whenever possible, each instance should be tasked with only a slice of the total data, and a partitioning strategy can be employed to ensure that the data sent to each vertex is collated by a partitioning key.

There are three basic kinds of vertices:

  1. source with just outbound edges;
  2. processor with both inbound and outbound edges;
  3. sink with just inbound edges.
Data travels from sources to sinks and is transformed and reshaped as it passes through the processors.
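
The classification above depends only on which kinds of edges touch a vertex. The following is a minimal plain-Java sketch of that rule, not Jet code: the class `VertexKindSketch`, the `Kind` enum and the representation of an edge as a `{from, to}` pair of names are assumptions made for illustration.

```java
import java.util.List;

// Hypothetical model for illustration only; none of these names belong to the Jet API.
final class VertexKindSketch {
    enum Kind { SOURCE, PROCESSOR, SINK, ISOLATED }

    // Each edge is modelled as a two-element array: {fromVertexName, toVertexName}.
    static Kind classify(String vertex, List<String[]> edges) {
        boolean hasInbound = false;
        boolean hasOutbound = false;
        for (String[] edge : edges) {
            if (edge[0].equals(vertex)) {
                hasOutbound = true;
            }
            if (edge[1].equals(vertex)) {
                hasInbound = true;
            }
        }
        if (hasInbound && hasOutbound) {
            return Kind.PROCESSOR;   // both inbound and outbound edges
        }
        if (hasOutbound) {
            return Kind.SOURCE;      // just outbound edges
        }
        if (hasInbound) {
            return Kind.SINK;        // just inbound edges
        }
        return Kind.ISOLATED;        // no edges at all: none of the three basic kinds
    }

    public static void main(String[] args) {
        List<String[]> edges = List.of(
                new String[] {"read", "map"},
                new String[] {"map", "write"});
        for (String name : List.of("read", "map", "write")) {
            System.out.println(name + " -> " + classify(name, edges));
        }
    }
}
```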

Note that iterator() must be invoked at least once in order to validate the DAG and check against cycles.

Since:
Jet 3.0
  • Constructor Details

    • DAG

      public DAG()
  • Method Details

    • newVertex

      @Nonnull public Vertex newVertex(@Nonnull String name, @Nonnull SupplierEx<? extends Processor> simpleSupplier)
      Creates a vertex from a Supplier<Processor> and adds it to this DAG.
      Parameters:
      name - the unique name of the vertex
      simpleSupplier - the simple, parameterless supplier of Processor instances
    • newUniqueVertex

      @Nonnull public Vertex newUniqueVertex(@Nonnull String namePrefix, @Nonnull SupplierEx<? extends Processor> simpleSupplier)
      Creates a vertex from a Supplier<Processor> and adds it to this DAG. The vertex will be given a unique name created from the namePrefix.
      Parameters:
      namePrefix - the prefix for the unique name of the vertex
      simpleSupplier - the simple, parameterless supplier of Processor instances
      Since:
      Jet 4.4
    • newVertex

      @Nonnull public Vertex newVertex(@Nonnull String name, @Nonnull ProcessorSupplier processorSupplier)
      Creates a vertex from a ProcessorSupplier and adds it to this DAG.
      Parameters:
      name - the unique name of the vertex
      processorSupplier - the supplier of Processor instances which will be used on all members
    • newUniqueVertex

      @Nonnull public Vertex newUniqueVertex(@Nonnull String namePrefix, @Nonnull ProcessorSupplier processorSupplier)
      Creates a vertex from a ProcessorSupplier and adds it to this DAG. The vertex will be given a unique name created from the namePrefix.
      Parameters:
      namePrefix - the prefix for the unique name of the vertex
      processorSupplier - the supplier of Processor instances which will be used on all members
      Since:
      Jet 4.4
    • newVertex

      @Nonnull public Vertex newVertex(@Nonnull String name, @Nonnull ProcessorMetaSupplier metaSupplier)
      Creates a vertex from a ProcessorMetaSupplier and adds it to this DAG.
      Parameters:
      name - the unique name of the vertex
      metaSupplier - the meta-supplier of ProcessorSuppliers for each member
    • newUniqueVertex

      @Nonnull public Vertex newUniqueVertex(@Nonnull String namePrefix, @Nonnull ProcessorMetaSupplier metaSupplier)
      Creates a vertex from a ProcessorMetaSupplier and adds it to this DAG. The vertex will be given a unique name created from the namePrefix.
      Parameters:
      namePrefix - the prefix for the unique name of the vertex
      metaSupplier - the meta-supplier of ProcessorSuppliers for each member
      Since:
      Jet 4.4
    • vertex

      @Nonnull public DAG vertex(@Nonnull Vertex vertex)
      Adds a vertex to this DAG. The vertex name must be unique.
    • edge

      @Nonnull public DAG edge(@Nonnull Edge edge)
      Adds an edge to this DAG. The vertices it connects must already be present in the DAG. It is an error to connect an edge to a vertex at the same ordinal as another existing edge. However, inbound and outbound ordinals are independent, so there can be two edges at the same ordinal, one inbound and one outbound.

      Jet supports multigraphs, that is, you can add two edges between the same two vertices. However, they must have different ordinals.
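
The ordinal rule described above can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not Jet's implementation: `OrdinalRuleSketch` and its `addEdge` method are hypothetical names, and each edge is reduced to a source vertex with an outbound ordinal and a destination vertex with an inbound ordinal.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the ordinal rule; not part of the Jet API.
final class OrdinalRuleSketch {
    // Ordinals already taken, tracked separately for each direction.
    private final Map<String, Set<Integer>> outboundOrdinals = new HashMap<>();
    private final Map<String, Set<Integer>> inboundOrdinals = new HashMap<>();

    // Registers an edge; rejects it if either end reuses an ordinal
    // already occupied in the same direction on the same vertex.
    void addEdge(String source, int sourceOrdinal, String dest, int destOrdinal) {
        Set<Integer> out = outboundOrdinals.computeIfAbsent(source, k -> new HashSet<>());
        Set<Integer> in = inboundOrdinals.computeIfAbsent(dest, k -> new HashSet<>());
        if (out.contains(sourceOrdinal)) {
            throw new IllegalArgumentException(
                    "Vertex " + source + " already has an outbound edge at ordinal " + sourceOrdinal);
        }
        if (in.contains(destOrdinal)) {
            throw new IllegalArgumentException(
                    "Vertex " + dest + " already has an inbound edge at ordinal " + destOrdinal);
        }
        // Both checks passed, so record the edge's two end points.
        out.add(sourceOrdinal);
        in.add(destOrdinal);
    }

    public static void main(String[] args) {
        OrdinalRuleSketch graph = new OrdinalRuleSketch();
        graph.addEdge("a", 0, "b", 0);
        graph.addEdge("a", 1, "b", 1); // second edge between the same vertices, different ordinals
        graph.addEdge("b", 0, "c", 0); // "b" now has inbound 0 and outbound 0, which is allowed
        try {
            graph.addEdge("a", 0, "c", 1); // outbound ordinal 0 of "a" is already taken
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this model the inbound and outbound ordinal sets never interact, which is why a vertex can have one inbound and one outbound edge at the same ordinal.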

    • getInboundEdges

      @Nonnull public List<Edge> getInboundEdges(@Nonnull String vertexName)
      Returns the inbound edges connected to the vertex with the given name.
    • getOutboundEdges

      @Nonnull public List<Edge> getOutboundEdges(@Nonnull String vertexName)
      Returns the outbound edges connected to the vertex with the given name.
    • getVertex

      @Nullable public Vertex getVertex(@Nonnull String vertexName)
      Returns the vertex with the given name, or null if there is no vertex with that name.
    • vertices

      @Nonnull public Set<Vertex> vertices()
      Returns a copy of the DAG's vertices. Adding a vertex to or removing a vertex from the returned Set will not be reflected in the DAG, and vice-versa.
    • iterator

      @Nonnull public Iterator<Vertex> iterator()
      Validates the DAG and returns an iterator over the DAG's vertices in topological order.

      Note that this method must be invoked at least once in order to validate the DAG and check against cycles.

      Specified by:
      iterator in interface Iterable<Vertex>
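
To illustrate what validating a graph against cycles while producing a topological order can look like, here is a minimal plain-Java sketch based on Kahn's algorithm. It is not taken from Jet, and whether Jet uses this particular algorithm is not stated here; `TopologicalOrderSketch`, its method name and the `{from, to}` edge representation are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; not Jet's implementation.
final class TopologicalOrderSketch {
    // Assumes vertex names are unique and every edge {from, to} connects listed vertices.
    static List<String> topologicalOrder(List<String> vertices, List<String[]> edges) {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        Map<String, List<String>> successors = new LinkedHashMap<>();
        for (String vertex : vertices) {
            inDegree.put(vertex, 0);
            successors.put(vertex, new ArrayList<>());
        }
        for (String[] edge : edges) {
            successors.get(edge[0]).add(edge[1]);
            inDegree.merge(edge[1], 1, Integer::sum);
        }

        // Start with the vertices that have no inbound edges.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.remove();
            order.add(vertex);
            // Emitting a vertex releases its successors once all their inputs are emitted.
            for (String next : successors.get(vertex)) {
                if (inDegree.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }

        // Vertices on a cycle never reach in-degree zero, so they are missing from the order.
        if (order.size() != vertices.size()) {
            throw new IllegalArgumentException("The graph contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> vertices = List.of("a", "b", "c");
        List<String[]> edges = List.of(
                new String[] {"a", "b"},
                new String[] {"b", "c"},
                new String[] {"a", "c"});
        System.out.println(topologicalOrder(vertices, edges));
    }
}
```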
    • edgeIterator

      @Nonnull public Iterator<Edge> edgeIterator()
      Returns an iterator over the DAG's edges in unspecified order.
      Since:
      5.4
    • toString

      @Nonnull public String toString()
      Overrides:
      toString in class Object
    • toString

      @Nonnull public String toString(int defaultLocalParallelism)
      Returns a string representation of the DAG.
      Parameters:
      defaultLocalParallelism - the local parallelism that will be shown for a vertex if it is neither overridden on the vertex nor defined as the preferred parallelism by the meta-supplier
    • toJson

      @Nonnull public com.hazelcast.internal.json.JsonObject toJson(int defaultLocalParallelism)
      Returns a JSON representation of the DAG.

      Note: the exact structure of the JSON is unspecified.

      Parameters:
      defaultLocalParallelism - the local parallelism that will be shown for a vertex if it is neither overridden on the vertex nor defined as the preferred parallelism by the meta-supplier
    • toDotString

      @Nonnull public String toDotString()
      Returns a DOT format (graphviz) representation of the DAG.
    • toDotString

      @Nonnull public String toDotString(int defaultLocalParallelism, int defaultQueueSize)
      Returns a DOT format (graphviz) representation of the DAG, annotating the vertices with the supplied default parallelism value and the edges with the supplied default queue size value.
      Parameters:
      defaultLocalParallelism - the local parallelism that will be shown for a vertex if it is neither overridden on the vertex nor defined as the preferred parallelism by the meta-supplier
      defaultQueueSize - the queue size that will be shown if not overridden on the edge
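
For readers unfamiliar with DOT, the sketch below shows the general shape of a DOT digraph built from vertex names and edges. It is an illustration only and does not reproduce the exact output or annotations of toDotString(); `DotSketch`, `toDot` and the graph name `DAG` are hypothetical choices.

```java
import java.util.List;

// Hypothetical stand-alone sketch; the real toDotString() output may differ.
final class DotSketch {
    // Each edge is modelled as a two-element array: {fromVertexName, toVertexName}.
    static String toDot(List<String> vertices, List<String[]> edges) {
        StringBuilder sb = new StringBuilder("digraph DAG {\n");
        for (String vertex : vertices) {
            sb.append("\t\"").append(escape(vertex)).append("\";\n");
        }
        for (String[] edge : edges) {
            sb.append("\t\"").append(escape(edge[0]))
              .append("\" -> \"").append(escape(edge[1])).append("\";\n");
        }
        return sb.append("}").toString();
    }

    // Escapes double quotes so that a name stays a valid quoted DOT identifier.
    private static String escape(String name) {
        return name.replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        System.out.println(toDot(
                List.of("read", "write"),
                List.<String[]>of(new String[] {"read", "write"})));
    }
}
```

The resulting text can be passed to a Graphviz tool such as `dot` to render the graph.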
    • writeData

      public void writeData(ObjectDataOutput out) throws IOException
      Description copied from interface: DataSerializable
      Writes object fields to output stream
      Specified by:
      writeData in interface DataSerializable
      Parameters:
      out - output
      Throws:
      IOException - if an I/O error occurs. In particular, an IOException may be thrown if the output stream has been closed.
    • readData

      public void readData(ObjectDataInput in) throws IOException
      Description copied from interface: DataSerializable
      Reads fields from the input stream
      Specified by:
      readData in interface DataSerializable
      Parameters:
      in - input
      Throws:
      IOException - if an I/O error occurs. In particular, an IOException may be thrown if the input stream has been closed.
    • getFactoryId

      public int getFactoryId()
      Description copied from interface: IdentifiedDataSerializable
      Returns DataSerializableFactory factory ID for this class.
      Specified by:
      getFactoryId in interface IdentifiedDataSerializable
      Returns:
      factory ID
    • getClassId

      public int getClassId()
      Description copied from interface: IdentifiedDataSerializable
      Returns type identifier for this class. It should be unique per DataSerializableFactory.
      Specified by:
      getClassId in interface IdentifiedDataSerializable
      Returns:
      type ID
    • lock

      @PrivateApi public void lock()
      Used to prevent further mutations to the DAG after submitting it for execution.

      This method is not part of the public API and may be removed in the future.