Installation#

As of version 476-e, Starburst Gateway and Starburst data catalog are deployed together using a unified Helm chart called starburst-portal. This new deployment artifact simplifies installation and ensures compatibility with the latest features and fixes. The previous individual Helm chart for starburst-gateway is no longer supported.

This page provides instructions for installing and configuring Starburst Gateway.

Install Starburst Gateway with Helm.

Configuration#

The following sections describe the configuration requirements for Starburst Gateway.

Each component of Starburst Gateway has a corresponding node in the configuration YAML file.

Backend database#

Starburst Gateway requires a MySQL, PostgreSQL, or Oracle database.

The Gateway automatically initializes your database when you start it.

Starburst clusters#

Starburst Galaxy and the current and previous Starburst Enterprise LTS versions are compatible with Starburst Gateway.

Note

Starburst Gateway is not compatible with Trino or Amazon Athena.

SEP Configuration#

Starburst Gateway acts as a transparent proxy for one or more clusters.

Configure all clusters behind Starburst Gateway with the following settings.

Enable forward processing#

If you route all client and server communication through Starburst Gateway, enable processing of forwarded HTTP headers on each cluster:

http-server.process-forwarded=true

Without this setting, initial requests may route correctly from your client to Starburst Gateway and then to Starburst Enterprise. However, subsequent requests for query results use the cluster’s local URL instead of the Gateway URL, bypassing the Gateway entirely. If the cluster’s local URL is private to the network, these subsequent calls fail.

This setting is also required for SEP to authenticate when TLS terminates at the Gateway. Normally, SEP refuses to authenticate plain HTTP requests, but with http-server.process-forwarded=true, SEP authenticates over HTTP when the request includes X-Forwarded-Proto: HTTPS.
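
To illustrate why this setting matters, the following Python sketch models how a server might build the base URL it advertises to clients, with and without honoring forwarded headers. It is a simplified model, not SEP's implementation. The function name, the host names, and the use of X-Forwarded-Host are assumptions made for illustration.

```python
def advertised_base_url(local_scheme, local_host, headers, process_forwarded):
    """Return the base URL a server would put in links it sends back to clients."""
    if process_forwarded:
        # Trust the proxy-supplied headers; fall back to local values if absent.
        scheme = headers.get("X-Forwarded-Proto", local_scheme).lower()
        host = headers.get("X-Forwarded-Host", local_host)
    else:
        # Ignore forwarded headers and advertise the cluster's own address.
        scheme, host = local_scheme, local_host
    return f"{scheme}://{host}"


# Hypothetical request that arrived through a TLS-terminating gateway.
headers = {"X-Forwarded-Proto": "https", "X-Forwarded-Host": "gateway.example.com"}

print(advertised_base_url("http", "sep.internal:8080", headers, True))
print(advertised_base_url("http", "sep.internal:8080", headers, False))
```

In this model, the first call yields the Gateway's public URL, while the second yields the cluster's private address, which a client outside the network cannot reach.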

Configure header forwarding#

To prevent Starburst Gateway from sending X-Forwarded-* headers, set the following configuration property:

routing:
  addXForwardedHeaders: false

When proxying clusters that are behind firewalls or otherwise not directly routable from your client, set this property to true.

Environment variables#

Set environment variables in the command line or in your configuration file.

To manually set an environment variable in the command line:

export DB_PASSWORD=my-super-secret-pwd

To use an environment variable in your configuration file, reference it using the syntax ${ENV:VARIABLE}.
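
As a rough illustration of how this kind of placeholder resolves, the following Python sketch replaces each ${ENV:VARIABLE} reference with the value of the named variable. It is not the Gateway's own code. The function name, the `password` key in the sample line, and the choice to fail on unset variables are assumptions.

```python
import os
import re

# Matches placeholders of the form ${ENV:NAME}.
_ENV_PATTERN = re.compile(r"\$\{ENV:([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_placeholders(text, env=None):
    """Replace each ${ENV:NAME} placeholder in text with the value of NAME."""
    env = os.environ if env is None else env

    def lookup(match):
        name = match.group(1)
        if name not in env:
            # Assumption: an unset variable is treated as a configuration error.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_PATTERN.sub(lookup, text)


# Hypothetical configuration line that references DB_PASSWORD.
config_line = "password: ${ENV:DB_PASSWORD}"
print(resolve_env_placeholders(config_line, {"DB_PASSWORD": "my-super-secret-pwd"}))
```

Keeping the secret in the environment means the configuration file itself can be stored in version control without exposing the password.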

Routing rules#

Find more information in the routing rules documentation.

QueryCountBasedRouter#

QueryCountBasedRouter is the main routing algorithm Starburst Gateway uses. It routes queries to a backend within a routing group based on live cluster statistics and per-user queue pressure.

The algorithm:

  1. Filters to healthy backends in the requested routing group

  2. Chooses the best backend by comparing:

    • Your queued query count on each backend (the lowest count wins)

    • If two or more backends tie on your queued query count, each backend’s total queued query count (the lowest count wins)

    • If the total queued query counts also tie, each backend’s running query count (the lowest count wins)

This makes routing user-aware. When you submit a query, the algorithm routes it to the backend where it’s likely to start executing soonest. This balances query load across backends and optimizes for your individual query latency.

After selecting a backend, the router updates a local view of stats:

  • If there are already queued queries on the selected backend, the router increments your queued query count for that backend

  • Otherwise, the router increments the backend’s running query count.

The router’s local adjustments to the usage statistics help it make informed, balanced decisions in real-time between the periodic stats refreshes it receives from the configured monitor.
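
The selection and local update steps described above can be sketched in Python as follows. This is a minimal model of the described behavior, not the Gateway's source code. The class, field, and function names are hypothetical, and the sketch updates only the counters that the text names.

```python
from dataclasses import dataclass, field


@dataclass
class BackendStats:
    name: str
    routing_group: str
    healthy: bool = True
    queued: int = 0   # total queued queries on the backend
    running: int = 0  # total running queries on the backend
    user_queued: dict = field(default_factory=dict)  # user -> queued count


def choose_backend(backends, routing_group, user):
    """Pick a backend for user and update the local view of its stats."""
    candidates = [b for b in backends if b.healthy and b.routing_group == routing_group]
    if not candidates:
        return None
    # Compare in order: the user's queued count, total queued, total running.
    best = min(candidates, key=lambda b: (b.user_queued.get(user, 0), b.queued, b.running))
    if best.queued > 0:
        # The backend already has a queue, so count the new query as queued for this user.
        best.user_queued[user] = best.user_queued.get(user, 0) + 1
    else:
        # No queue, so the new query is expected to start running.
        best.running += 1
    return best


backends = [
    BackendStats("a", "adhoc", queued=2, running=5, user_queued={"alice": 2}),
    BackendStats("b", "adhoc", queued=2, running=3),
    BackendStats("c", "adhoc", healthy=False),
]
print(choose_backend(backends, "adhoc", "alice").name)
```

Here backend c is filtered out as unhealthy, and b wins over a because alice has no queued queries on b. When no per-user counts are available, every backend ties on the first key and the comparison falls through to the cluster-level counts.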

User queue awareness and monitor support#

The UI_API monitor is the only monitor type that provides per-user queue information. When you set clusterStatsConfiguration.monitorType to UI_API, the router receives a userQueuedCount map for each backend. This map associates each user with their number of queued queries on that backend. The QueryCountBasedRouter algorithm uses this information to prioritize the backend where the requesting user has the fewest queued queries, minimizing their wait time.

If you configure a different monitor type (INFO_API, JDBC, JMX, METRICS, or NOOP), the router does not receive per-user queue information. In this case, the QueryCountBasedRouter falls back to cluster-level metrics and selects the backend with:

  1. The lowest total number of queued queries across all users

  2. If two or more queued query counts are equal, the lowest total number of running queries across all users

Logging#

To configure the logging level for classes, specify the path to the log.properties file by using log.levels-file in serverConfig.

For additional configurations, use the log.* properties from the logging properties documentation and specify the properties in serverConfig.

Additional statement paths#

The SEP client protocol initiates queries with a POST to /v1/statement. Starburst Gateway incorporates this into its routing logic by extracting and recording the query ID from responses to these requests.

If you use a build of SEP that supports additional endpoints, configure Starburst Gateway to treat them like /v1/statement by adding them under additionalStatementPaths. The standard /v1/statement path is always included:

additionalStatementPaths:
  - '/ui/api/insights/ide/statement'
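
The following Python sketch shows one way to model this behavior: decide whether a request targets a statement path, then read the query ID from the response. It is illustrative only. The function names and the exact-match comparison are assumptions, and the sketch assumes the response is a JSON object with an `id` field, as in the standard client protocol.

```python
import json

DEFAULT_STATEMENT_PATH = "/v1/statement"


def is_statement_path(request_path, additional_paths=()):
    """Return True if the path is the default or an additional statement path."""
    paths = (DEFAULT_STATEMENT_PATH, *additional_paths)
    return request_path.rstrip("/") in paths


def extract_query_id(method, request_path, response_body, additional_paths=()):
    """Return the query ID from the response to a query-initiating POST, else None."""
    if method != "POST" or not is_statement_path(request_path, additional_paths):
        return None
    return json.loads(response_body).get("id")


extra = ["/ui/api/insights/ide/statement"]
body = '{"id": "20240101_000000_00001_abcde"}'  # hypothetical response body
print(extract_query_id("POST", "/ui/api/insights/ide/statement", body, extra))
```

Without the additional path configured, the same request is not recognized as a query submission in this model, so no query ID is recorded for it.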

Load balancer#

To deploy multiple instances of Starburst Gateway behind a generic load balancer, you must set serverConfig to enable processing of forwarded HTTP headers:

serverConfig:
  http-server.process-forwarded: true

Proxy response size#

Starburst Gateway reads responses from SEP in bytes (up to 32MB by default). To configure a larger size, set the following configuration property:

proxyResponseConfiguration:
  responseSize: 50MB

Deploy Gateway using Starburst Portal#

Use the following steps to deploy Starburst Gateway on Kubernetes with Helm.

Access the Helm Chart#

The Starburst Portal Helm chart is available in the Starburst Helm chart project. It contains both Starburst Gateway and Starburst data catalog.

Use the following commands to access the chart:

helm registry login harbor.starburstdata.net/starburstdata
# Enter your credentials
helm pull oci://harbor.starburstdata.net/starburstdata/charts/starburst-portal --version <version>

Create a secret#

Use the following command to create a secret for Harbor registry authentication:

kubectl create secret docker-registry harbor-auth \
  --docker-server=harbor.starburstdata.net \
  --docker-username=<your-username> \
  --docker-password=<your-password> \
  --docker-email=<your-email>

Deploy a PostgreSQL database#

Use the following command to create a PostgreSQL pod:

kubectl apply -f val-posgres.yaml

Install Starburst Gateway#

Use the following command to install Starburst Gateway via the Portal chart:

helm upgrade --install starburst-portal starburst-portal-<version>.tgz -f gateway-config.yaml

Enable Starburst Gateway#

By default, Starburst Gateway is disabled in the Portal deployment.

To enable it, set the following values:

gateway.enabled=true
gateway.config.file=config.yaml

The values.yaml file defines how Starburst Portal is configured during Helm deployment. The following example shows a basic configuration:

etcFiles:
  config: |
    node.environment=starburst_portal
    http-server.http.port=8080
    http-server.http.enabled=true
    http-server.https.port=8443
    http-server.https.enabled=false
    credentials-provider.type=file
    credentials-provider.credentials-file-path=/opt/starburst/etc/catalog-credentials.json
    persistence.database.url=jdbc:postgresql://postgresql:5432/<db/schema name>
    persistence.database.user=<db username>
    persistence.database.password=<db password>

Access the Gateway UI#

Use the following command to access the Starburst Gateway UI:

kubectl port-forward service/starburst-gateway 8080:8080

Access the UI at http://localhost:8080.

Configure routing rules#

To implement static routing rules, create a configMap from your routing rules YAML definition:

kubectl create cm routing-rules --from-file your-routing-rules.yaml

Then mount it to your container:

volumes:
    - name: routing-rules
      configMap:
          name: routing-rules
          items:
              - key: your-routing-rules.yaml
                path: your-routing-rules.yaml

volumeMounts:
    - name: routing-rules
      mountPath: "/etc/routing-rules/your-routing-rules.yaml"
      subPath: your-routing-rules.yaml

Ensure the mountPath matches the rulesConfigPath you specify in your configuration. The subPath is optional. Without the subPath, the file mounts at mountPath/<configMap key>.

Standard Helm options such as replicaCount, image, imagePullSecrets, service, ingress and resources are supported in helm/values.yaml.

Health checks#

Starburst Gateway periodically performs health checks and maintains an in-memory health status for each backend. When a backend fails a health check, the Gateway marks it as UNHEALTHY and stops routing requests to it.

The health status differs from the active/inactive state of a backend. The active/inactive state indicates whether a backend is on or off, while health status is determined programmatically by the health check process. Health checks only run on active backends.

Starburst recommends using either INFO_API or METRICS for your health check. Other options may be deprecated in the future.

For more details, see the health status section.

To select the type of health check, set the following configuration property:

clusterStatsConfiguration:
  monitorType: ""

Choose from the following health check types.

INFO_API (default)#

By default, Starburst Gateway uses the v1/info REST endpoint. A successful check requires a 200 response with starting: false. Configure connection timeout parameters through the monitor node:

monitor:
  connectTimeoutSeconds: 5
  requestTimeoutSeconds: 10
  idleTimeoutSeconds: 1
  retries: 1

All timeout parameters are optional.
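
The success criterion for this check can be sketched in Python as follows. This is an illustrative model rather than the Gateway's implementation. The function name is hypothetical, and treating an unparseable body as unhealthy is an assumption.

```python
import json


def is_healthy(status_code, body):
    """Return True for a 200 response whose JSON body reports starting: false."""
    if status_code != 200:
        return False
    try:
        info = json.loads(body)
    except ValueError:
        # Assumption: a body that is not valid JSON counts as a failed check.
        return False
    return isinstance(info, dict) and info.get("starting") is False


# Hypothetical response bodies from the v1/info endpoint.
print(is_healthy(200, '{"starting": false}'))
print(is_healthy(200, '{"starting": true}'))
print(is_healthy(503, ""))
```

A cluster that responds with 200 but still reports starting: true is not yet ready to accept queries, so this model treats it as unhealthy.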

METRICS#

This method pulls statistics from the OpenMetrics endpoint and retrieves the number of running and queued queries for use with QueryCountBasedRouter.

You must enable either METRICS or JDBC if you use QueryCountBasedRouter.

By default, METRICS uses trino_execution_name_QueryManager_RunningQueries and trino_execution_name_QueryManager_QueuedQueries to track the number of running and queued queries. To use different metrics, set the metric names explicitly, as in the following example:

monitor:
    runningQueriesMetricName: io_starburst_galaxy_name_GalaxyMetrics_RunningQueries
    queuedQueriesMetricName: io_starburst_galaxy_name_GalaxyMetrics_QueuedQueries

By default, the monitor pulls metrics from the /metrics endpoint. To configure an alternative endpoint, use the metricsEndpoint property:

monitor:
    metricsEndpoint: /v1/metrics

This monitor supports custom health definitions by comparing metrics using two maps: metricMinimumValues and metricMaximumValues. The keys are metric names, and the values are the minimum or maximum values (inclusive) considered healthy. By default, the only metric populated is trino_metadata_name_DiscoveryNodeManager_ActiveNodeCount:

monitor:
    metricMinimumValues:
        trino_metadata_name_DiscoveryNodeManager_ActiveNodeCount: 1

With this default, at least one worker node must be active for the cluster to be considered healthy. To increase the minimum worker count to 10 and disqualify clusters experiencing frequent major garbage collections, set the following configuration properties:

monitor:
    metricMinimumValues:
        trino_metadata_name_DiscoveryNodeManager_ActiveNodeCount: 10
    metricMaximumValues:
        io_airlift_stats_name_GcMonitor_MajorGc_FiveMinutes_count: 2
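
The inclusive minimum and maximum comparison can be sketched in Python as follows. This is a simplified model, not the monitor's actual code. The function name and the sample metric values are hypothetical, and treating a missing metric as unhealthy is an assumption.

```python
def metrics_healthy(metrics, minimum_values, maximum_values):
    """Return True if every configured metric is within its inclusive bounds."""
    for name, minimum in minimum_values.items():
        # Assumption: a configured metric that was not scraped fails the check.
        if name not in metrics or metrics[name] < minimum:
            return False
    for name, maximum in maximum_values.items():
        if name not in metrics or metrics[name] > maximum:
            return False
    return True


minimums = {"trino_metadata_name_DiscoveryNodeManager_ActiveNodeCount": 10}
maximums = {"io_airlift_stats_name_GcMonitor_MajorGc_FiveMinutes_count": 2}

# Hypothetical values scraped from a cluster's metrics endpoint.
scraped = {
    "trino_metadata_name_DiscoveryNodeManager_ActiveNodeCount": 12,
    "io_airlift_stats_name_GcMonitor_MajorGc_FiveMinutes_count": 3,
}
print(metrics_healthy(scraped, minimums, maximums))
```

In this example the cluster has enough workers but exceeds the garbage collection limit, so the check fails. Because the bounds are inclusive, a count of exactly 2 would pass.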

JDBC#

This method uses a JDBC connection to query system.runtime tables for cluster information. It is required for the query count-based routing strategy.

This method is recommended over UI_API since it does not restrict the Starburst Enterprise web UI authentication method of backend clusters.

Configure a username and password by adding backendState to your configuration:

backendState:
  username: "user"
  password: "password"

The credentials must be valid across all backends.

Starburst Gateway uses explicitPrepare=false by default. This property uses a single query for prepared statements instead of a PREPARE/EXECUTE pair. If you are using the JDBC health check option with older versions of Trino, set the following configuration property:

monitor:
   explicitPrepare: true

To set the query timeout, set the following configuration property:

monitor:
    queryTimeout: 10

Other timeout parameters do not apply to the JDBC connection.

JMX#

This method collects cluster information, which is required for QueryCountBasedRouterProvider, using the v1/jmx/mbean endpoint on clusters.

To enable JMX monitoring, complete the following steps:

  1. Activate JMX monitoring on all Trino clusters:

jmx.rmiregistry.port=<port>
jmx.rmiserver.port=<port>

  2. Allow JMX endpoint access by adding rules to your file-based access control configuration:

{
  "catalogs": [
    {
      "user": "user",
      "catalog": "system",
      "allow": "read-only"
    }
  ],
  "system_information": [
    {
      "user": "user",
      "allow": ["read"]
    }
  ]
}

  3. Configure a username and password in the backendState section:

backendState:
  username: "user"
  password: "password"

The credentials must be consistent across all backend clusters and have read rights on system_information. The JMX monitor uses these credentials to authenticate against the JMX endpoint of each cluster and collect metrics such as running queries, queued queries, and worker node information.

UI_API#

This method pulls cluster information from the ui/api/stats REST endpoint. It only works with backend clusters using web-ui.authentication.type=FORM. Set a username and password using backendState as with the JDBC option.

NOOP#

This option disables health checks.