Optimizing query performance #

Starburst Enterprise platform (SEP) is fast. But did you know that there are still many opportunities to make it even faster depending on how you write your queries?

Learn how to use EXPLAIN and ANALYZE to improve your query performance in this training video presented by one of our founders, Martin Traverso. For your convenience, we’ve divided the video training course up into topic sections, and provided links to the relevant parts of our documentation below.

The query lifecycle #

Knowing what’s happening under the hood in SQL can help you to write queries that capitalize on possible optimizations and avoid approaches that will cost you performance. This section provides an overview of what happens as a query is executed.

Topics:
  • Parsing
  • Analysis
  • Planning
  • Optimization
  • Scheduling and execution

Running time: ~12 min.


The EXPLAIN statement in detail #

If you want to understand what the SEP engine is basing its decisions on as it executes a query, you need to use the EXPLAIN statement. This section walks you through this very informative tool in detail.

Topics: Click the links to read more on that topic in our reference manual.

Running time: ~20 min.


General optimizations #

The content in this section is more technique-oriented, and is a complex subject. We strongly suggest watching it all the way through thoroughly first to gain a broad awareness of how you write a query can affect its performance before trying these on your own. For further reading, we recommend our pushdown documentation.

The SQL engine relies on table statistics to make decisions on optimizations. Enabling dynamic filtering can take optimizations even further. We recommend reading about these powerful features to ensure you are getting the best performance possible out of your cluster:

Topics:
  • Constant folding
  • Predicate pushdown
  • Predicate pushdown into the Hive connector
  • Hive partition pruning
  • Hive bucket pruning
  • Row group skipping for ORC and Parquet
  • Limit, partial limit, and aggregation pushdown
  • Skew

Running time: ~58 min.

SEP offers several properties to control how the optimizer handles certain operations.


Cost-based optimizations #

This section presents on overview of how cost-based optimizations work in SEP, and provides great context for the following recommended reading:

Topics:
  • Partitioned and broadcast joins
  • Disabling cost-based optimizations
  • Join reordering
  • Table statistics
  • Computing statistics with ANALYZE

Running time: ~13 min.