Class FileSources

java.lang.Object
com.hazelcast.jet.pipeline.file.FileSources

public final class FileSources extends Object
Contains factory methods for the Unified File Connector.
Since:
Jet 4.4
  • Method Details

    • files

      public static FileSourceBuilder<String> files(String path)
      The main entry point to the Unified File Connector.

      Returns a FileSourceBuilder configured with default values; see its documentation for more options.

      The path specifies the file system type (for example s3a://, hdfs://) and the path to the files. If the path doesn't specify a file system, the local file system is used, in which case the path must be absolute. "Local" means local to each Jet cluster member, not to the client submitting the job.

      The following file systems are supported:

      • s3a:// (Amazon S3)
      • hdfs:// (HDFS)
      • wasbs:// (Azure Cloud Storage)
      • adl:// (Azure Data Lake Gen 1)
      • abfs:// (Azure Data Lake Gen 2)
      • gs:// (Google Cloud Storage)
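
      The path rules above can be sketched as a pre-check run before calling files(path). This is only an illustration using the JDK: the class PathCheck and its method fileSystemOf are hypothetical names, not part of the Hazelcast API, and the set of schemes is simply copied from the list above. The connector performs its own resolution and may behave differently.

```java
import java.nio.file.Paths;
import java.util.Set;

/** Hypothetical helper, not part of Hazelcast: mirrors the path rules described above. */
public final class PathCheck {
    // Schemes taken from the list in this documentation (without the "://" suffix).
    private static final Set<String> SUPPORTED =
            Set.of("s3a", "hdfs", "wasbs", "adl", "abfs", "gs");

    private PathCheck() { }

    /**
     * Returns the scheme of a listed remote path, or "local" for an
     * absolute path without a scheme. Throws IllegalArgumentException otherwise.
     */
    public static String fileSystemOf(String path) {
        int idx = path.indexOf("://");
        if (idx > 0) {
            String scheme = path.substring(0, idx);
            if (!SUPPORTED.contains(scheme)) {
                throw new IllegalArgumentException("Unsupported file system: " + scheme);
            }
            return scheme;
        }
        // No scheme: the path refers to each cluster member's local file system
        // and therefore must be absolute.
        if (!Paths.get(path).isAbsolute()) {
            throw new IllegalArgumentException("Local path must be absolute: " + path);
        }
        return "local";
    }
}
```

      Failing fast on the submitting side gives a clearer error than a job that fails later on a cluster member, although it cannot verify that the directory actually exists on every member.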

      The path must point to a directory. All files in the directory are processed. Subdirectories are not processed recursively. The path must not contain any wildcard characters.
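
      To illustrate what "all files in the directory, without recursion" means for a local path, the following JDK-only sketch lists the files such a source would be expected to pick up. The class DirectoryScan and its methods are hypothetical and not part of the Hazelcast API, and the set of characters treated as wildcards here is an assumption, not the connector's actual check.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of Hazelcast: previews a non-recursive directory read. */
public final class DirectoryScan {
    private DirectoryScan() { }

    /** Returns true if the path contains a character commonly used in glob patterns (assumed set). */
    public static boolean hasWildcard(String path) {
        return path.chars().anyMatch(c -> "*?[]{}".indexOf(c) >= 0);
    }

    /**
     * Returns the names of regular files directly inside dir, sorted by name.
     * Subdirectories are skipped and never descended into.
     */
    public static List<String> directFiles(Path dir) {
        if (!Files.isDirectory(dir)) {
            throw new IllegalArgumentException("Not a directory: " + dir);
        }
        // Files.list is not recursive: it yields only the direct children of dir.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      Such a preview only reflects the machine it runs on; for a local path, each cluster member reads its own copy of the directory.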

      Example usage:

      
       // LogParser and the parsed log type are application-defined, not part of Jet.
       Pipeline p = Pipeline.create();
       p.readFrom(FileSources.files("/path/to/directory").build())
        .map(line -> LogParser.parse(line))
        .filter(log -> log.level().equals("ERROR"))
        .writeTo(Sinks.logger());
       
      Parameters:
      path - the path to the directory
      Returns:
      the builder object with fluent API