Starburst smart indexing and caching#

Starburst smart indexing and caching enables higher performance for catalogs using the Hive, Iceberg, or Delta Lake connectors by transparently adding an indexing and caching layer.

Note

Learn more about the high-level architecture and characteristics of smart indexing and caching before using it.

Note

Smart indexing and caching and the connector are public preview features. Contact Starburst support with questions or feedback.

Requirements#

To use smart indexing and caching, you need:

  • A cluster running on the Kubernetes-based platform with the node sizes described in Cluster configuration.

  • Catalogs using the Hive, Iceberg, or Delta Lake connectors, as described in Catalog configuration.

Cluster configuration#

Smart indexing and caching requires your cluster to operate on the Kubernetes-based platform with the recommended node sizes. These sizes provide the necessary CPU, memory, and storage. In particular, the included SSD storage is used heavily; ensure that these devices are not used for any other purpose.

EKS node sizes:

  • r5d.4xlarge or larger

  • r5dn.4xlarge or larger

  • r6gd.4xlarge or larger

  • i3.4xlarge or larger

Create an EKS Managed Node Group with a specific size, and use it for all nodes in the cluster.
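For example, a minimal eksctl invocation to create such a managed node group might look like the following sketch; the cluster name, node group name, and node count are placeholders to adapt to your environment:

# create a managed node group of r5d.4xlarge nodes (placeholder names)
eksctl create nodegroup \
  --cluster my-sep-cluster \
  --name siac-nodes \
  --node-type r5d.4xlarge \
  --nodes 4 \
  --managed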

AKS node size:

  • Standard_L16s_v2 or larger with SSDs attached
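To verify that every node in the cluster uses a suitable size, you can list the instance types with kubectl, using the well-known node.kubernetes.io/instance-type label:

# show each node with its instance type as an extra column
kubectl get nodes -L node.kubernetes.io/instance-type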

Warning

Smart indexing and caching is not supported on a cluster running in Fault-tolerant execution mode.

Deployment and management are performed with the SEP Helm charts detailed in Deploying with Kubernetes and Configuring Starburst Enterprise in Kubernetes.

Add the following section in your values file for the specific cluster to enable smart indexing and caching:

siac:
  enabled: true

Smart indexing and caching is disabled by default. The recommended memory allocation is configured automatically, but you can optionally override it:

siac:
  enabled: true
  additionalHeapSizePercentage: 15

Smart indexing and caching requires privileged access to the attached storage on the nodes. Configure this for the coordinator and all workers.

Coordinator configuration for privileged access:

coordinator:
  securityContext:
    privileged: true

Worker configuration for privileged access:

worker:
  securityContext:
    privileged: true

As a last step, ensure that you use a dedicated coordinator that is not scheduled for query processing, and adjust the query processing configuration to allow for more splits:

coordinator:
  additionalProperties: |
    ...
    node-scheduler.include-coordinator=false
    node-scheduler.max-splits-per-node=4096
    node-scheduler.max-unacknowledged-splits-per-task=1024
    node-scheduler.max-pending-splits-per-task=1024

Use Helm to update the values and restart the cluster nodes. Confirm that the cluster operates correctly with the new configuration before adjusting any catalogs, and then proceed with the catalog configuration.
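For example, assuming a Helm release named sep deployed from the starburstdata/starburst-enterprise chart, the update might look like this; the release, chart, and file names are placeholders for your deployment:

# apply the updated values file to the running release
helm upgrade sep starburstdata/starburst-enterprise --values ./values.yaml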

For catalog usage, the cluster must allow internal communication between all workers, as well as with the coordinator, on all the HTTP ports configured with the different http-rest-port values in all catalogs.

When starting the cluster, smart indexing and caching parses all configuration parameters and may emit spurious warnings such as Configuration property 'cache-service.password' was not used. You can safely ignore these warnings.

Catalog configuration#

After a successful Cluster configuration, you can configure the desired catalog to use smart indexing and caching.

Only catalogs using the Hive, Iceberg, or Delta Lake connectors can be accelerated:

  • connector.name=hive

  • connector.name=iceberg

  • connector.name=delta-lake

Table redirection from a catalog using the Hive connector to Delta Lake or Iceberg is not supported.

For example, consider the following example catalog in the values file, which uses the Hive connector with AWS Glue as the metastore:

catalogs:
  example: |
    connector.name=hive
    hive.metastore=glue
    ...

Enable smart indexing and caching on the catalog by updating the connector name to siac and adding the required configuration properties:

catalogs:
  example: |
    connector.name=siac
    siac.proxied-connector=HIVE
    siac.store.path=s3://example/path/
    siac.cluster-uuid=example-cluster-567891234567
    siac.workerdb.hibernate.connection.url=jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db
    http-rest-port=8088
    hive.metastore=glue
    ...

For testing purposes, or for permanent use of a new catalog name (for example, faster) in parallel to the existing catalog, you can copy the catalog configuration and update the copy:

catalogs:
  example: |
    connector.name=hive
    hive.metastore=glue
    ...
  faster: |
    connector.name=siac
    siac.proxied-connector=HIVE
    siac.store.path=s3://example/path/
    siac.cluster-uuid=example-cluster-567891234567
    siac.workerdb.hibernate.connection.url=jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db
    http-rest-port=8088
    hive.metastore=glue
    ...

This allows you to query the same data with or without smart indexing and caching using different catalog names. However, existing scripts and statements that include the old catalog name are not accelerated.
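For example, you can compare both catalogs with the Trino CLI; the server URL, schema, and table names are placeholders:

# same data, first through the original catalog, then through the accelerated one
trino --server https://sep.example.com --execute "SELECT count(*) FROM example.myschema.mytable"
trino --server https://sep.example.com --execute "SELECT count(*) FROM faster.myschema.mytable"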

The following table provides more information about the available catalog configuration properties:

Catalog configuration properties#

  • connector.name: Required. Must be set to siac.

  • siac.proxied-connector: Required. The type of embedded connector used to access cold data through the smart indexing and caching connector. Defaults to HIVE. Valid values are HIVE, ICEBERG, or DELTA_LAKE. All properties supported by these connectors can be used to configure the catalog.

  • siac.cluster-uuid: Required. Unique identifier of the cluster, used as the folder name in the store path. When you create a new cluster with the same siac.store.path and siac.cluster-uuid, the cluster warmup rules for index and cache creation are imported into the newly created cluster.

  • siac.store.path: Required. The path of the object store where call-home data, metadata, and import-export data are managed. Must use s3://. Write access privileges are necessary.

  • siac.call-home.enable: Enable pushing logs and metadata to the object store path configured in siac.store.path. Defaults to true.

  • siac.workerdb.hibernate.connection.url: Required. JDBC connection string for the internal database. Must be set to jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db.

  • http-rest-port: Required. The port for the REST server used for smart indexing and caching configuration. This port must be open across the cluster. Each catalog needs a different value. Defaults to 8088.

Indexing and caching management#

Smart indexing and caching automatically creates and manages index and cache data based on processed queries. This behavior is called the default acceleration.

In addition, you can further configure it with the REST API exposed on the coordinator at the port configured with http-rest-port for each catalog. This port on the coordinator must be reachable from the machine where you call the REST API, which is typically outside the cluster.
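If the port is not already exposed, a simple way to reach it from your machine is kubectl port-forward; the pod name is a placeholder:

# forward local port 8088 to port 8088 on the coordinator pod
kubectl port-forward <coordinator-pod> 8088:8088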

The following example call accesses the row-group/row-group-count endpoint for the catalog configured on port 8088, on the cluster at sep.example.com, over HTTP with a GET request:

curl -X GET "http://sep.example.com:8088/row-group/row-group-count" -H "accept: application/json"

Alternatively, you can use the curl installed on the coordinator and call the row-group/row-group-count endpoint of the API directly on the coordinator container, using localhost and the HTTP port for the specific catalog:

kubectl exec <coordinator-pod> -c coordinator --  \
curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json"

The JSON-formatted response is returned to the machine where you run kubectl. You can pipe the result into a locally installed jq command to make it easier to read:

kubectl exec <coordinator-pod> -c coordinator --  \
curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json" \
| jq

Alternatively, you can pipe the output into a file and view it with an editor or another tool that improves the JSON formatting:

kubectl exec <coordinator-pod> -c coordinator -- \
curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json" \
> result.json

The following sections detail the REST API and the available endpoints. The example calls use plain curl calls over HTTP to the endpoints on localhost. Adapt them to your environment by adding the kubectl invocation or updating the coordinator URL.

Warming status#

You can determine the status of the warmup for smart indexing and caching with a GET operation on the /warming/status endpoint. It reports the warmup progress for splits across workers, and whether warming is currently in progress.

curl -X GET "http://localhost:8088/warming/status" -H "accept: application/json"

Example response:

{"nodesStatus":
  {"172.31.16.98": {"started":22136,"finished":22136},
   "172.31.25.207":{"started":20702,"finished":20702},
   "172.31.19.167":{"started":21116,"finished":21116},
   "172.31.22.28":{"started":20678,"finished":20678}},
   "warming":false}

The response shows that warmup started and finished on four workers, and is currently not in progress.
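You can also poll the endpoint until warming completes, for example with a small shell loop; this sketch assumes curl and jq are available where you run it:

# poll the warming status every 10 seconds until warming completes
while [ "$(curl -s http://localhost:8088/warming/status | jq .warming)" = "true" ]; do
  sleep 10
done
echo "warmup complete"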

Debug tools#

The debug-tools endpoint requires an HTTP POST with a JSON payload that specifies the detailed command for the desired data. You can use it to return the storage utilization:

curl -X POST "http://localhost:8088/debug-tools" -H "Content-Type: application/json" \
-d '{"commandName" : "all","@class" : "io.trino.plugin.siac.execution.debugtools.DebugToolData"}'

Example response:

{"coordinator-container":
  {"result":
    {"Storage_capacity":15000000,
     "Allocated 8k pages":1000000,
     "Num used stripes":0
    }
  }
}

Calculate the storage utilization percentage with (Allocated 8k pages / Storage_capacity) * 100. For the example response above, that is (1,000,000 / 15,000,000) * 100, or approximately 6.7%.
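You can also script this calculation with curl and jq; this sketch assumes the single coordinator-container response shape shown above:

# compute the storage utilization percentage from the debug-tools response
curl -s -X POST "http://localhost:8088/debug-tools" -H "Content-Type: application/json" \
-d '{"commandName" : "all","@class" : "io.trino.plugin.siac.execution.debugtools.DebugToolData"}' \
| jq '."coordinator-container".result | ."Allocated 8k pages" / .Storage_capacity * 100'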

Row group count#

A row group in smart indexing and caching is a collection of index and cache elements that are used to accelerate processing of Trino splits from the SSD storage.

A row group in smart indexing and caching is not equivalent to a Parquet row group or an ORC stripe, but is a higher-level artifact specific to smart indexing and caching. It can correspond to a specific Parquet row group or ORC stripe, but can also represent data from a whole file or more.

The row-group/row-group-count endpoint exposes all currently warmed-up columns via an HTTP GET:

curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json"

The result is a list of columns keyed by schema.table.column.warmuptype. The value is the count of accelerated row groups for that column. Warmup types:

  • COL_WARM_UP_TYPE_DATA: data cache acceleration

  • COL_WARM_UP_TYPE_BASIC: index acceleration

  • COL_WARM_UP_TYPE_LUCENE: text search acceleration

In the following example, 20 row groups of the tripid column of the trips_data table in the trips schema are accelerated with a data cache and an index.

{
  "trips.trips_data.tripid.COL_WARM_UP_TYPE_DATA": 20,
  "trips.trips_data.tripid.COL_WARM_UP_TYPE_BASIC": 20
}

Create a warmup rule#

Use the warmup/warmup-rule-set endpoint with an HTTP POST and a JSON payload to create a new rule for a column of a table in a schema. This forces the creation of index and caching data for the column, independent of processed queries and the default acceleration.

curl -X POST -H "Content-Type: application/json" http://localhost:8088/warmup/warmup-rule-set \
-d '[ { "colNameId": "int_1", "schema": "tmp", "table": "aaa", "colWarmUpType": "COL_WARM_UP_TYPE_BASIC", "priority": 8, "ttl": "PT720H", "predicates": [ ] } ]'

Warmup rule properties#

  • colNameId: Name of the column to which the warmup rule is attached.

  • schema: Name of the schema that contains the specified table.

  • table: Name of the table that contains the specified column.

  • predicates: Defaults to all partitions. Use the JSON array syntax ["example1", "example2"] to limit the rule to specific partitions.

  • colWarmUpType: The materialization type performed on the specified column in the specified table. Valid values are COL_WARM_UP_TYPE_DATA for data cache acceleration, COL_WARM_UP_TYPE_BASIC for index acceleration, and COL_WARM_UP_TYPE_LUCENE for text search acceleration.

  • priority: Priority for the warmup rule. To ensure a column is accelerated even if storage capacity is exceeded, set the priority as high as 10. To ensure a column is never accelerated, set a negative number such as -1. A negative value prevents data cache or index creation.

  • ttl: Duration for which the warmup rule remains active. Specify the duration in ISO-8601 duration format (PnDTnHnMnS). Use PT0M to prevent expiration of the rule.
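For example, to ensure a column is never accelerated, create a rule with a negative priority; the schema, table, and column names are placeholders:

# a negative priority prevents data cache and index creation for the column
curl -X POST -H "Content-Type: application/json" http://localhost:8088/warmup/warmup-rule-set \
-d '[ { "colNameId": "company", "schema": "tmp", "table": "aaa", "colWarmUpType": "COL_WARM_UP_TYPE_DATA", "priority": -1, "ttl": "PT0M", "predicates": [ ] } ]'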

Get all warmup rules#

The warmup/warmup-rule-get endpoint exposes all defined warmup rules via an HTTP GET:

curl -X GET "http://localhost:8088/warmup/warmup-rule-get" -H "accept: application/json"

Response:

{
  "id":186229827,
  "schema":"ride_sharing_dataset",
  "table":"trips_data_big",
  "colNameId":"d_date",
  "column":
    {
      "classType":"RegularColumn",
      "key":"d_date"
     },
  "colWarmUpType":"COL_WARM_UP_TYPE_BASIC",
  "priority":8.0,
  "ttl":2592000.000000000,
  "predicates":[]
}

Delete a warmup rule#

The warmup/warmup-rule-delete endpoint allows you to delete a warmup rule via an HTTP DELETE. The identifier of the rule is a required parameter; you can find it in the id value returned by warmup/warmup-rule-get.

curl -X DELETE "http://localhost:8088/warmup/warmup-rule-delete" -H "accept: application/json" -H "Content-Type: application/json" -d "[186229827]"

When you delete a warmup rule, the index and cache data for the column is de-prioritized to the level of default acceleration data, and is therefore subject to earlier deletion.

SQL support#

All SQL statements and functions supported by the connector used in the accelerated catalog are supported.

When functions are applied to columns in a filtering operation, the filtering is not accelerated for some functions. For example, the filtering operation in the following query is not accelerated:

SELECT count(*)
FROM catalog.schema.table
WHERE lower(company) = 'starburst';

In addition, the following functions are accelerated by indexing when used on the left or the right side of the predicate (see the example after this list):

  • ceil(x) with real or double data type

  • is_nan(x) with real or double data type

  • cast(x as type) with double cast to real, or any type cast to varchar

  • day(d) and day_of_month(d) with date and timestamp data types

  • day_of_year(d) and doy(d) with date and timestamp data types

  • day_of_week(d) and dow(d) with date and timestamp data types

  • year(d) with date and timestamp data types

  • year_of_week(d) and yow(d) with date and timestamp data types

  • week(d) and week_of_year(d) with date and timestamp data types

  • LIKE, NOT LIKE with varchar data type

  • contains(arr_varchar, value) with array of varchar data type
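In contrast to the lower(company) example above, a filter using one of these functions, such as year(d) on a date column, can be accelerated by indexing. A hypothetical check with the Trino CLI; the server URL, catalog, schema, table, and column names are placeholders:

# a predicate on year(order_date) qualifies for index acceleration
trino --server https://sep.example.com --execute \
  "SELECT count(*) FROM faster.myschema.mytable WHERE year(order_date) = 2023"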

The maximum supported string length for any data type used for caching and indexing is 50,000 characters.