Starburst Galaxy

  •  Get started

  •  Working with data

  •  Data engineering

  •  Developer tools

  •  Cluster administration

  •  Security and compliance

  •  Troubleshooting

  • Galaxy status

  •  Reference

  • Optimize queries #

    The Starburst Galaxy SQL engine relies on table statistics to make decisions on optimizations. Optimizing queries is about getting the best performance possible out of your cluster. With Starburst Galaxy, this is handled for you.

    However, there are still optimizations that you can make to improve query performance.

    General optimizations #

    While most optimizations are completed for you in Galaxy, check out the session properties that can be used to further optimize your queries.

    Cost-based optimizations #

    Cost-based optimizations aim to ensure your query operations are as efficient and effective as possible. Some of the ways Galaxy does this are discussed in the following sections.

    The EXPLAIN statement #

    In Starburst Galaxy, there are two SQL statements you can use to further optimize, analyze, and refine your queries.

    • EXPLAIN: Show plan structure and cost estimates.
    • EXPLAIN ANALYZE: Show plan structure, cost estimates and execution statistics.

    Use the EXPLAIN statement to view the execution plan and understand how the optimizer is interpreting your query.

    EXPLAIN (TYPE LOGICAL) SELECT regionkey, count(*) FROM nation GROUP BY 1;
    

    Prepend EXPLAIN to your SQL statement to show you the execution plan and cost estimates without running the query.

    Prepend EXPLAIN ANALYZE to your SQL statement to show you the execution plan and cost for each stage after the query runs.

    EXPLAIN ANALYZE SELECT count(*), clerk FROM orders
    WHERE orderdate > date '1995-01-01' GROUP BY clerk;
    

    Watch the following video that describes the EXPLAIN statement:

    Other useful resources #