Enable fault-tolerant execution for queries in SEP #

You can configure SEP to be more resilient against query failure by enabling fault-tolerant execution. This allows SEP to handle larger queries such as batch operations without worker node interruptions causing the query to fail.

When configured, the SEP cluster buffers data used by workers during query processing. If a worker node fails for any reason, such as a network outage or running out of available resources, the coordinator issues the data to an available worker that can pick up query processing, allowing the query to continue.

Architecture #

The coordinator node uses a configured exchange manager service that buffers data for query processing in an external location such as an S3 bucket. Worker nodes send data to the buffer as they execute their query tasks.

Fault-tolerant cluster exchange architecture

Best practices and considerations #

A fault-tolerant cluster is best suited for large batch queries. Fault-tolerant execution of a high volume of short-running may experience increased latency. As such, it is recommended to run a dedicated fault-tolerant SEP cluster for handling batch operations, separate from a cluster that is designated for a higher query volume.

Catalogs for the following data sources support fault-tolerant execution of read and write operations:

  • Delta Lake
  • Hive
  • Iceberg
  • MySQL
  • PostgreSQL
  • SQL Server

Catalogs for other data sources only support fault-tolerant execution of read operations. When fault-tolerant execution is enabled on a cluster, write operations are disabled for any catalogs that do not support fault-tolerant execution of those operations.

The exchange manager may send a large amount of data to the exchange storage, resulting in high I/O load on that storage. You can configure multiple storage locations for use by the exchange manager to help balance the I/O load between them.

Configuration #

The following steps describe how to configure a SEP cluster for fault-tolerant execution with an S3 bucket-based exchange:

  1. Set up an S3 bucket to use as the exchange storage. For this example we are using an AWS S3 bucket, but other storage options are described in the reference documentation as well. You can use multiple S3 buckets for exchange storage.

    For each bucket in AWS, collect the following information:

    • S3 URI location for the bucket, such as s3://exchange-spooling-bucket

    • Region that the bucket is located in, such as us-west-1

    • AWS access and secret keys for the bucket

  2. Add the following exchange manager configuration in both the coordinator.etcFiles.properties and worker.etcFiles.properties sections of the Helm chart in a new exchange-manager.properties: entry, using the gathered S3 bucket information:

     coordinator:
       etcFiles:
         properties:
         ...
           exchange-manager.properties: |
             exchange-manager.name=filesystem
             exchange.base-directories=s3://exchange-spooling-bucket-1,s3://exchange-spooling-bucket-2
             exchange.s3.region=us-west-1
             exchange.s3.aws-access-key=example-access-key
             exchange.s3.aws-secret-key=example-secret-key
     ...
     worker:
       etcFiles:
         properties:
         ...
           exchange-manager.properties: |
             exchange-manager.name=filesystem
             exchange.base-directories=s3://exchange-spooling-bucket-1,s3://exchange-spooling-bucket-2
             exchange.s3.region=us-west-1
             exchange.s3.aws-access-key=example-access-key
             exchange.s3.aws-secret-key=example-secret-key
    

    In non-Kubernetes installations, the same properties must be defined in the etc/exchange-manager.properties configuration file on the coordinator and all worker nodes.

  3. Add the following configuration for fault-tolerant execution in both the coordinator.additionalProperties: and worker.additionalProperties: sections of the Helm chart:

     coordinator:
       additionalProperties:
         ...
         retry-policy=TASK
         exchange.compression-enabled=true
         query.low-memory-killer.delay=0s
     ...
     worker:
       additionalProperties:
         ...
         retry-policy=TASK
         exchange.compression-enabled=true
         query.low-memory-killer.delay=0s
    

    In non-Kubernetes installations, the same properties must be defined in the config.properties file on the coordinator and all worker nodes.

  4. Re-deploy your instance of SEP or, for non-Kubernetes installations, restart the cluster.

Your SEP cluster is now configured with fault-tolerant query execution. If a query run on the cluster would normally fail due to an interruption of query processing, fault-tolerant execution now resumes the query processing to ensure successful execution of the query.

Next steps #

For more information about fault-tolerant execution, including simple query retries that do not require an exchange manager and advanced configuration operations, see the reference documentation.