Load balancing
Starburst Portal increases query performance and workload capacity by distributing queries across multiple Starburst Enterprise platform (SEP) clusters.
Overview
When a routing group contains two or more active and healthy SEP clusters, Starburst Portal routes each query to the cluster that can best serve the query based on current conditions. This routing happens automatically to optimize query performance, and it does not require changes to your applications.
By default, Starburst Portal routes all queries to a routing group named
adhoc. If a routing group named adhoc does not exist or if it contains no
active and healthy clusters, queries fail unless you define routing rules that
direct them to a different routing group.
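The default-group behavior described above can be modeled with a minimal sketch. This is an illustration only, not Starburst Portal's implementation: the `Cluster` record, the `eligible_clusters` function, and the cluster names are hypothetical, and the sketch assumes that a group selected by a routing rule also needs at least one active and healthy cluster. How the Portal then picks the best cluster within the group is not modeled.

```python
from dataclasses import dataclass

DEFAULT_GROUP = "adhoc"  # default routing group name, as stated in the text


@dataclass
class Cluster:
    # Hypothetical record for illustration; not a Starburst Portal type.
    name: str
    active: bool
    healthy: bool


def eligible_clusters(groups, rule_target=None):
    """Return (group name, names of clusters that can receive queries)."""
    # A routing rule, if one matched, overrides the default group.
    group = rule_target or DEFAULT_GROUP
    candidates = [c.name for c in groups.get(group, []) if c.active and c.healthy]
    if not candidates:
        # Covers both a missing group and a group with no usable clusters.
        raise LookupError(f"no active and healthy cluster in routing group {group!r}")
    return group, candidates


groups = {
    "adhoc": [Cluster("sep-a", True, True), Cluster("sep-b", True, False)],
    "etl": [Cluster("sep-c", True, True)],
}
print(eligible_clusters(groups))         # default group
print(eligible_clusters(groups, "etl"))  # group chosen by a routing rule
```

In this model, a query fails only when the selected group is missing or has no usable cluster, which mirrors the failure condition described for the adhoc group.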
Important
For Starburst Portal to make optimal routing decisions, all clusters in a routing group must have the same configuration, including version, processing power, and worker node count.
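One way to check this requirement before adding a cluster is to compare a small inventory of cluster settings. The following is a sketch under stated assumptions: the inventory dictionary, its keys (`version`, `worker_count`, `instance_type`), and the function name are hypothetical, and you would need to collect the values yourself because the sketch does not call any Starburst API.

```python
def find_config_mismatches(clusters):
    """Return {setting: {cluster: value}} for settings that differ across clusters."""
    mismatches = {}
    # Consider every setting that appears on at least one cluster.
    settings = sorted({key for config in clusters.values() for key in config})
    for setting in settings:
        values = {name: config.get(setting) for name, config in clusters.items()}
        # More than one distinct value means the clusters disagree.
        if len(set(values.values())) > 1:
            mismatches[setting] = values
    return mismatches


# Hypothetical inventory, gathered by hand or from your own tooling.
inventory = {
    "sep-a": {"version": "1.0", "worker_count": 8, "instance_type": "large"},
    "sep-b": {"version": "1.0", "worker_count": 6, "instance_type": "large"},
}
for setting, values in find_config_mismatches(inventory).items():
    print(f"{setting} differs: {values}")
```

An empty result means the listed settings match. It does not prove that the clusters are identical in settings you did not record.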
Add or remove capacity
To add capacity, add a cluster to the routing group on the Starburst Portal Clusters page. After Starburst Portal detects that the cluster is active and healthy, it includes the cluster in subsequent routing decisions.
To remove capacity, delete a cluster from the routing group or set the cluster’s status to inactive.
Verify query distribution
Check the RoutedTo column on the Starburst Portal Query insights page to see which cluster and routing group handled each query.
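If you copy the RoutedTo values for a batch of queries out of the page, a short script can summarize how the queries were spread. This is a minimal sketch: it assumes you already have the values as a list of strings, treats each value as an opaque label, and makes no assumption about how the page formats cluster and routing group names. The sample values are invented.

```python
from collections import Counter


def routing_shares(routed_to_values):
    """Return the fraction of queries handled by each distinct RoutedTo value."""
    counts = Counter(routed_to_values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {target: count / total for target, count in counts.items()}


# Invented sample; replace with values copied from the Query insights page.
sample = ["sep-a", "sep-a", "sep-b", "sep-a"]
for target, share in sorted(routing_shares(sample).items()):
    print(f"{target}: {share:.0%}")
```

A cluster that never appears in the result received no queries in the sampled period.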
Troubleshooting
If you experience unexpected query behavior, check the following common issues.
All queries route to one cluster
If all queries route to one cluster:
Verify the routing group contains more than one cluster.
Check if the queries share a transaction ID. Starburst Portal routes all queries in a transaction to the same cluster.
One cluster receives no queries
If one cluster receives no queries, verify the cluster is active and healthy.
Queries are not evenly distributed
If queries are not evenly distributed across clusters:
Allow more time for distribution to balance out. A small sample of queries can look uneven.
Check if one cluster is servicing workloads directly, bypassing Starburst Portal.
Tail latencies are higher than expected
If tail latencies are higher than expected, consider configuring resource groups or session properties on your clusters.
Queries fail with an object not found error
If queries fail with object not found or similar errors, confirm that all
catalog configurations, database objects, authorizations, and privileges are
identical across all clusters in the routing group.
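To find which cluster is missing an object, you can compare per-cluster listings. The sketch below assumes you have already collected a set of names from each cluster, for example catalog names from running `SHOW CATALOGS` against each cluster directly. The function name and sample data are hypothetical, and the same comparison works for schema or table names.

```python
def missing_per_cluster(listings):
    """Return {cluster: sorted names present on another cluster but not this one}."""
    # The union of all listings is the set every cluster is expected to have.
    expected = set().union(*listings.values())
    report = {}
    for cluster, names in listings.items():
        missing = expected - set(names)
        if missing:
            report[cluster] = sorted(missing)
    return report


# Invented sample; replace with listings collected from your clusters.
catalogs = {
    "sep-a": {"hive", "postgres", "tpch"},
    "sep-b": {"hive", "tpch"},
}
print(missing_per_cluster(catalogs))
```

Matching names are necessary but not sufficient: catalogs with the same name can still differ in configuration or privileges, so those need a separate check.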