Starburst Galaxy

Fault-tolerant execution

    Fault-tolerant execution (FTE) allows a cluster to retry a query, or parts of its processing, after a failure without restarting the whole query from the beginning. This is especially useful for the long-running queries typical of batch processing and extract, transform, load (ETL) workloads.

    In fault-tolerant execution mode, intermediate exchange data is spooled and can be reused by another worker. Queries that require more memory than is currently available in the cluster are still able to succeed. Multiple queries share resources fairly and make steady progress.
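    The idea of spooling intermediate data and retrying only the failed part can be illustrated with a small, self-contained sketch. This is a conceptual model only, not how Starburst Galaxy implements FTE: the names run_stages, WorkerFailure, and QueryError are hypothetical, and an in-memory list stands in for the spooled exchange data.

```python
class WorkerFailure(Exception):
    """Hypothetical transient failure, such as a worker being lost mid-task."""


class QueryError(Exception):
    """Hypothetical non-retryable failure, such as incorrect SQL."""


def run_stages(stages, rows, max_attempts=3):
    """Run stages in order, spooling each stage's output before moving on.

    Each stage is a callable (rows, attempt) -> rows. When a stage raises
    WorkerFailure, only that stage is retried, reading its input from the
    spool instead of re-running earlier stages. QueryError is never retried.
    Returns (final_rows, attempts_per_stage).
    """
    spool = [list(rows)]  # spool[-1] is the input to the next stage
    attempts_per_stage = []
    for stage in stages:
        for attempt in range(1, max_attempts + 1):
            try:
                output = stage(spool[-1], attempt)
                break
            except WorkerFailure:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt
        spool.append(list(output))
        attempts_per_stage.append(attempt)
    return spool[-1], attempts_per_stage


def double(rows, attempt):
    return [r * 2 for r in rows]


def flaky_sum(rows, attempt):
    # Simulate a worker that fails on the first attempt only.
    if attempt == 1:
        raise WorkerFailure("worker lost")
    return [sum(rows)]


result, attempts = run_stages([double, flaky_sum], [1, 2, 3])
print(result, attempts)  # [12] [1, 2]
```

    In this sketch the first stage runs once, and only the second stage is repeated after its simulated failure. A QueryError propagates immediately, since retrying cannot fix a query that is itself incorrect.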

    Do not use fault-tolerant execution if most queries in the cluster are short-running, typically completing in less than one minute, and require relatively little memory. Query processing in fault-tolerant execution mode can be slightly slower than normal operation.
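    As a rough illustration of this guidance, the following sketch checks a sample of past queries against the rule of thumb. The function name fte_recommended and the small_memory_gb threshold are hypothetical: the guidance above gives no specific memory figure, so the caller must supply one. Only the one-minute default comes from the text.

```python
def fte_recommended(durations_s, peak_memory_gb, small_memory_gb, short_query_s=60.0):
    """Return False when most sampled queries are both short and small.

    durations_s and peak_memory_gb are parallel lists, one entry per query.
    small_memory_gb is an assumed, workload-specific threshold.
    """
    if not durations_s or len(durations_s) != len(peak_memory_gb):
        raise ValueError("provide one duration and one memory value per query")
    short_and_small = sum(
        1
        for duration, memory in zip(durations_s, peak_memory_gb)
        if duration < short_query_s and memory <= small_memory_gb
    )
    # "Most" is interpreted here as a strict majority of the sample.
    return short_and_small <= len(durations_s) / 2


# Mostly interactive queries: standard execution is likely the better fit.
print(fte_recommended([5, 12, 30, 900], [1, 2, 1, 64], small_memory_gb=4))  # False
# Mostly long-running batch queries: FTE is worth considering.
print(fte_recommended([1800, 3600, 20], [48, 96, 1], small_memory_gb=4))  # True
```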

    In Starburst Galaxy, enable fault-tolerant execution when creating or editing a cluster by selecting Fault tolerant in the Execution mode drop-down menu. No other configuration is required. To return a cluster to standard processing, select Standard from the same menu and restart the cluster.

    Fault-tolerant execution selection

    Fault-tolerant execution is not available for the Free cluster size. Not all catalogs support FTE; enabling FTE on a cluster applies it to every catalog in that cluster that supports it.

    Fault-tolerant execution is not designed to recover from broken queries or incorrect SQL.