Understanding query planning and execution

    Starburst Galaxy is fast and efficient. The following section explains the SQL processing behind Galaxy.

    Query lifecycle

    Understanding how your queries are transformed from SQL commands to actionable insights is an important step toward writing better queries.

    This section covers how your query is handled from the moment you submit it through execution.

    • Query parsing: When you submit a SQL statement, the parser checks it against the SQL grammar and turns it into a syntax tree.
    • Analysis: The analysis step takes the syntax tree created by the parser and validates the semantics. For example, if you compare two values, the analyzer checks that their types are compatible for the given operator.
    • Planning: The planning phase converts the validated syntax tree into an executable plan. The result is a detailed intermediate representation that describes the data flow, such as a scan of the orders table, row filters, aggregation, and output.
    • Optimization: Optimization is the process of applying a set of semantics-preserving transformations to the plan to produce an optimal physical plan that can be executed.
    • Scheduling and execution: The scheduler distributes the tasks across the worker nodes. It ensures that the query execution is balanced and non-disruptive to other system operations. Then the query executes, and the coordinator monitors its progress for display in the user interface.
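
    The analysis, planning, and optimization steps above can be illustrated with a small toy model. The following Python sketch is not Galaxy's implementation: the node classes (Scan, Filter, SumBy), the schema, the sample rows, and the single filter-merging rule are all hypothetical and chosen only to show the ideas. It builds a plan tree, checks comparison types, applies one transformation that preserves semantics, and confirms that the result does not change.

```python
from dataclasses import dataclass

# Toy schema and data; hypothetical, not a real Galaxy catalog.
SCHEMA = {"status": str, "price": float}
TABLES = {"orders": [
    {"status": "O", "price": 120.0},
    {"status": "F", "price": 80.0},
    {"status": "O", "price": 40.0},
    {"status": "F", "price": 300.0},
]}

@dataclass(frozen=True)
class Scan:            # reads every row of a table
    table: str

@dataclass(frozen=True)
class Filter:          # keeps rows where row[column] > minimum
    source: object
    column: str
    minimum: object

@dataclass(frozen=True)
class SumBy:           # SUM(value) ... GROUP BY key
    source: object
    key: str
    value: str

def analyze(node):
    """Analysis: check that each comparison uses compatible types."""
    if isinstance(node, Filter):
        if not isinstance(node.minimum, SCHEMA[node.column]):
            raise TypeError(f"cannot compare {node.column} with {node.minimum!r}")
    if not isinstance(node, Scan):
        analyze(node.source)

def optimize(node):
    """Optimization: merge stacked filters on the same column into one."""
    if isinstance(node, Scan):
        return node
    source = optimize(node.source)
    if isinstance(node, Filter):
        if isinstance(source, Filter) and source.column == node.column:
            merged = max(node.minimum, source.minimum)
            return Filter(source.source, node.column, merged)
        return Filter(source, node.column, node.minimum)
    return SumBy(source, node.key, node.value)

def execute(node):
    """Execution: evaluate the plan bottom-up against the toy tables."""
    if isinstance(node, Scan):
        return list(TABLES[node.table])
    rows = execute(node.source)
    if isinstance(node, Filter):
        return [r for r in rows if r[node.column] > node.minimum]
    totals = {}
    for r in rows:
        totals[r[node.key]] = totals.get(r[node.key], 0.0) + r[node.value]
    return totals

# Plan for: SELECT status, SUM(price) FROM orders
#           WHERE price > 50 AND price > 100 GROUP BY status
plan = SumBy(
    Filter(Filter(Scan("orders"), "price", 50.0), "price", 100.0),
    "status",
    "price",
)
analyze(plan)                 # raises TypeError on an invalid comparison
optimized = optimize(plan)    # two filters collapse into one
assert execute(plan) == execute(optimized)  # same result, fewer operators
```

    A real optimizer applies many such rules and also uses table statistics and data distribution, which this sketch leaves out. The point is only that each rewrite must return the same rows as the plan it replaces.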

    For a detailed explanation of the query lifecycle and optimization techniques, see the video lesson introduced on the Optimizing query performance page.