Telemetry#

Starburst Enterprise platform (SEP) collects data about product performance and usage. This information is sent to Starburst and informs our development efforts. Starburst can share the data with you on request. The following data is collected:

Environmental data
Configuration log data
Metrics
Optional anonymized query logs

Security#

The collected data is sent to the following endpoints in a compressed, binary Protobuf format:

https://telemetry.eng.starburstdata.net/v1/metrics
https://telemetry.eng.starburstdata.net/v1/logs

All connections use TLS and are therefore secure, and the data is end-to-end encrypted.

telemetry.eng.starburstdata.net uses the following static IP addresses:

99.83.235.49
75.2.114.217
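
If outbound traffic from the cluster is restricted, the endpoints and static addresses above are the ones that need to be reachable. As a minimal sketch, the following Python helper checks whether an address is one of the two documented static IPs. The name `is_telemetry_address` is hypothetical and is not part of SEP.

```python
import ipaddress

# Static IP addresses documented for telemetry.eng.starburstdata.net.
TELEMETRY_ADDRESSES = frozenset(
    ipaddress.ip_address(a) for a in ("99.83.235.49", "75.2.114.217")
)


def is_telemetry_address(address: str) -> bool:
    """Return True if the address is one of the documented telemetry IPs."""
    return ipaddress.ip_address(address) in TELEMETRY_ADDRESSES


print(is_telemetry_address("99.83.235.49"))  # True
print(is_telemetry_address("10.0.0.1"))      # False
```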

Configuration#

Telemetry is enabled by default, with the exception of query logging. You must opt in to query logging by configuring it. All configuration properties listed here are set in the Config properties file.

Telemetry configuration properties#
Property name	Description
`telemetry.enabled`	Set to `false` to completely disable the telemetry module. Defaults to `true`.
`telemetry.metrics-export-enabled`	Set to `false` to disable collecting and exporting metrics. Defaults to `true`.
`telemetry.metrics-export-interval`	Frequency at which metrics are sent to Starburst. Defaults to `1h`, sending approximately 100 KB of data each time.
`telemetry.log-created`	Set to `true` to log queries when they are created. Defaults to `false`.
`telemetry.log-completed`	Set to `true` to log queries when they are completed. Defaults to `false`.
`telemetry.log-split`	Set to `true` to log splits from query processing. Defaults to `false`.
`telemetry.log-query-types`	A comma-separated list of query types to be logged. Possible query types are `SELECT`, `EXPLAIN`, `DESCRIBE`, `INSERT`, `UPDATE`, `DELETE`, `ANALYZE`, `DATA_DEFINITION`, and `ALTER_TABLE_EXECUTE`.
`telemetry.log-query-plan`	Set to `ORIGINAL` to log the full query plan. Defaults to `DISABLED`.
`telemetry.log-query-text`	Set to `ORIGINAL` to log the full query text. Defaults to `DISABLED`.
`telemetry.log-query-statistics`	Set to `ANONYMIZED` to log anonymized query statistics. Use `ORIGINAL` to log all query statistics information. Defaults to `DISABLED`.
`telemetry.log-query-io-metadata`	Set to `ANONYMIZED` to log anonymized query input-output metadata. Use `ORIGINAL` to log all query input-output metadata. Defaults to `DISABLED`.
`telemetry.log-query-failure-info`	Set to `ORIGINAL` to log the full query failure info. Defaults to `DISABLED`.
`telemetry.log-query-warnings`	Set to `ORIGINAL` to log full query warnings. Defaults to `DISABLED`.
`telemetry.config-export-enabled`	Set to `false` to disable collecting and exporting configuration. Defaults to `true`.
`telemetry.logs-export-interval`	Duration to wait and batch more logs, before sending them to Starburst. Defaults to `15m`.
`telemetry.logs-batch-size`	Maximum number of log entries collected before sending them to Starburst. Defaults to `1000`.
`telemetry.logs-batch-size`	Maximum size of the logs batch, before sending it to Starburst. Defaults to `1MB`.
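
To see which telemetry settings are in effect for a given properties file, the defaults from the table can be overlaid with the values that are explicitly set. The following Python sketch does this for a subset of the properties. The function name `effective_telemetry_settings` and the sample file content are hypothetical, and the parser is deliberately simplified: it only handles `key=value` lines and `#` comments.

```python
# Defaults copied from the configuration table (subset).
TELEMETRY_DEFAULTS = {
    "telemetry.enabled": "true",
    "telemetry.metrics-export-enabled": "true",
    "telemetry.metrics-export-interval": "1h",
    "telemetry.log-created": "false",
    "telemetry.log-completed": "false",
    "telemetry.log-split": "false",
    "telemetry.log-query-plan": "DISABLED",
    "telemetry.log-query-text": "DISABLED",
    "telemetry.log-query-statistics": "DISABLED",
    "telemetry.log-query-io-metadata": "DISABLED",
    "telemetry.log-query-failure-info": "DISABLED",
    "telemetry.log-query-warnings": "DISABLED",
    "telemetry.config-export-enabled": "true",
    "telemetry.logs-export-interval": "15m",
}


def effective_telemetry_settings(properties_text: str) -> dict:
    """Overlay telemetry.* lines from a properties file on the documented defaults."""
    settings = dict(TELEMETRY_DEFAULTS)
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("telemetry."):
            settings[key] = value
    return settings


# Hypothetical excerpt of a config properties file.
example = """
# telemetry overrides
telemetry.metrics-export-interval=30m
telemetry.log-completed=true
"""
settings = effective_telemetry_settings(example)
print(settings["telemetry.metrics-export-interval"])  # 30m
print(settings["telemetry.log-query-text"])           # DISABLED
```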

All the data collected is annotated with the environmental data described in the next section.

Enable query logging#

SEP can be configured to collect the following information for every executed query:

Anonymized query statistics
Anonymized query input-output metadata
Query text, query plan, and failure information

Query statistics and input-output metadata do not contain any sensitive information and only expose performance-related metrics. This data helps Starburst improve the user experience as well as query performance. Therefore, we generally recommend enabling this feature in all your deployments.

To enable these logs, use the following configuration:

telemetry.log-completed=true
telemetry.log-query-statistics=ANONYMIZED
telemetry.log-query-io-metadata=ANONYMIZED

Any query text and query plan may contain sensitive information, such as in WHERE and other predicate clauses. Anonymizing the data is an opt-in feature that must be specifically configured.

To configure SEP to collect the query statistics, input-output metadata, query text, query plan, and failure info of completed `SELECT` queries, use the following configuration:

telemetry.log-completed=true
telemetry.log-query-types=select
telemetry.log-query-statistics=ORIGINAL
telemetry.log-query-io-metadata=ORIGINAL
telemetry.log-query-plan=ORIGINAL
telemetry.log-query-text=ORIGINAL
telemetry.log-query-failure-info=ORIGINAL
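
Because `ORIGINAL` values can expose query text and plans, it can be useful to review a configuration before deploying it. The following Python sketch lists the `telemetry.log-query-*` properties that are set to `ORIGINAL`. The function name `unanonymized_properties` is hypothetical, and the check is a plain string comparison rather than anything SEP performs itself.

```python
def unanonymized_properties(settings: dict) -> list:
    """Return the sorted names of telemetry.log-query-* properties set to ORIGINAL."""
    return sorted(
        key
        for key, value in settings.items()
        if key.startswith("telemetry.log-query-") and value == "ORIGINAL"
    )


# Anonymized-only configuration, as recommended for all deployments.
anonymized = {
    "telemetry.log-completed": "true",
    "telemetry.log-query-statistics": "ANONYMIZED",
    "telemetry.log-query-io-metadata": "ANONYMIZED",
}
# The same configuration with full query text and plan logging added.
full = {
    **anonymized,
    "telemetry.log-query-text": "ORIGINAL",
    "telemetry.log-query-plan": "ORIGINAL",
}

print(unanonymized_properties(anonymized))  # []
print(unanonymized_properties(full))
# ['telemetry.log-query-plan', 'telemetry.log-query-text']
```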

Collected data#

With the exception of environmental and configuration log data, the collected data is based on completed queries, whether successful or not. Examples of this data are provided in the following sections.

Environmental data#

The environmental data describes the ownership, licensing and service information of every SEP cluster.

SEP environment information#
Key	Value
`deployment.environment`	Environment name defined in SEP.
`license.hash`	The hash of the license file, if present.
`license.owner`	Defined by `owner` in the SEP license file, or from the account owner of the instance if deployed through a marketplace.
`license.type`	`JSON` for Kubernetes or manual deployments, or a string indicating the marketplace it was deployed through, such as `AWS`.
`service.instance.id`	A random UUID generated each time SEP starts.
`service.name`	Always set to `starburst-enterprise`.
`service.version`	The SEP version number of the cluster.
`telemetry.sdk.language`	Always set to `java`.
`telemetry.sdk.name`	Always set to `opentelemetry`.
`telemetry.sdk.version`	The version of the OpenTelemetry library used.
`service.start_time`	The ISO8601 date and time when the coordinator last started.

The following is an example of the data collected that describes the SEP environment:

"resource":{
    "attributes":[
      {
          "key":"deployment.environment",
          "value":{
            "string_value":"prod"
          }
      },
      {
          "key":"license.hash",
          "value":{
            "string_value":"5000eRAND0M967d0004a4eLICENSEa97b00006023dedeSTRING82460c8500055"
          }
      },
      {
          "key":"license.owner",
          "value":{
            "string_value":"Example Company"
          }
      },
      {
          "key":"license.type",
          "value":{
            "string_value":"JSON"
          }
      },
      {
          "key":"service.instance.id",
          "value":{
            "string_value":"6d35zzzz-2000-4628-zzzz-120000zzzzed"
          }
      },
      {
          "key":"service.name",
          "value":{
            "string_value":"starburst-enterprise"
          }
      },
      {
          "key":"service.version",
          "value":{
            "string_value":"prod"
          }
      },
      {
          "key":"telemetry.sdk.language",
          "value":{
            "string_value":"java"
          }
      },
      {
          "key":"telemetry.sdk.name",
          "value":{
            "string_value":"opentelemetry"
          }
      },
      {
          "key":"telemetry.sdk.version",
          "value":{
            "string_value":"1.6.0"
          }
      }
    ]
}

Configuration log data#

SEP collects configuration property names and a representation of each value. Boolean values are recorded as-is. Binary values are rounded to the nearest base-two magnitude. For example, 72 GB is recorded as 64 GB. Other numeric values, such as INTEGER and DOUBLE, are rounded to the nearest order of magnitude. For example, 54,321 is rounded to 100,000. For most configuration properties, SEP does not record text values, only that they are set.
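
The two rounding rules can be illustrated with a short Python sketch. It assumes that "nearest" is measured on a logarithmic scale, which reproduces both examples in the text. The exact algorithm SEP uses is not documented here, and the function names are hypothetical.

```python
import math


def round_binary_magnitude(value: float) -> float:
    """Round a positive value to the nearest power of two (log scale)."""
    if value <= 0:
        raise ValueError("value must be positive")
    return 2 ** math.floor(math.log2(value) + 0.5)


def round_decimal_magnitude(value: float) -> float:
    """Round a positive value to the nearest power of ten (log scale)."""
    if value <= 0:
        raise ValueError("value must be positive")
    return 10 ** math.floor(math.log10(value) + 0.5)


print(round_binary_magnitude(72))       # 64
print(round_decimal_magnitude(54321))   # 100000
```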

Text values are recorded for the following configuration properties:

access-control.name
connector.name
delta.security
hive.security
http-server.authentication.oauth2.issuer
http-server.authentication.type
iceberg.security
password-authenticator.name
retry-policy
warp-speed.proxied-connector
web-ui.authentication.type

The following JSON snippet is an example of the data collected that describes SEP configuration properties:

"logs": [
  {
    "time_unix_nano": "1637193575209705000",
    "severity_number": "SEVERITY_NUMBER_INFO",
    "name": "bootstrap",
    "body": {
      "string_value": ""
    },
    "attributes": [
      {
        "key": "propertyName",
        "value": {
          "string_value": "cache-service.cache-ttl"
        }
      },
      {
        "key": "propertyValue",
        "value": {
          "string_value": "0.00ns"
        }
      }
    ]
  },
  {
    "time_unix_nano": "1637193575223536000",
    "severity_number": "SEVERITY_NUMBER_INFO",
    "name": "bootstrap",
    "body": {
      "string_value": ""
    },
    "attributes": [
      {
        "key": "propertyName",
        "value": {
          "string_value": "cache-service.uri"
        }
      }
    ]
  },
  {
    "time_unix_nano": "1637193575224097000",
    "severity_number": "SEVERITY_NUMBER_INFO",
    "name": "bootstrap",
    "body": {
      "string_value": ""
    },
    "attributes": [
      {
        "key": "propertyName",
        "value": {
          "string_value": "materialized-views.namespace"
        }
      }
    ]
  }
]
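
As a sketch of how such log entries can be read, the following Python snippet maps each `propertyName` to its `propertyValue`, using `None` where only the presence of the property was recorded. The sample is a trimmed copy of the entries above, and `recorded_properties` is a hypothetical helper name.

```python
import json

# Trimmed copy of the example entries, wrapped in an object so it parses as JSON.
sample = json.loads("""
{"logs": [
  {"name": "bootstrap", "attributes": [
    {"key": "propertyName", "value": {"string_value": "cache-service.cache-ttl"}},
    {"key": "propertyValue", "value": {"string_value": "0.00ns"}}]},
  {"name": "bootstrap", "attributes": [
    {"key": "propertyName", "value": {"string_value": "cache-service.uri"}}]}
]}
""")


def recorded_properties(logs: list) -> dict:
    """Map each propertyName to its recorded value, or None if only presence was logged."""
    result = {}
    for entry in logs:
        attrs = {a["key"]: a["value"]["string_value"] for a in entry["attributes"]}
        result[attrs["propertyName"]] = attrs.get("propertyValue")
    return result


print(recorded_properties(sample["logs"]))
# {'cache-service.cache-ttl': '0.00ns', 'cache-service.uri': None}
```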

Metrics#

All metrics collected by SEP are aggregated over the time period that starts at `start_time_unix_nano` and ends at `time_unix_nano`. Most metrics in the same export carry the same values for these two timestamps.
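
To make the aggregation window concrete, the following Python sketch converts the two nanosecond timestamps into a duration. The timestamp values are the ones used in the metric examples on this page, and `aggregation_window` is a hypothetical helper name.

```python
from datetime import timedelta


def aggregation_window(start_time_unix_nano: str, time_unix_nano: str) -> timedelta:
    """Return the length of the aggregation period, truncated to microseconds."""
    nanos = int(time_unix_nano) - int(start_time_unix_nano)
    return timedelta(microseconds=nanos // 1000)


window = aggregation_window("1635164762424772000", "1635172027851773000")
print(window)  # 2:01:05.427001
```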

Metrics are based on completed queries, whether successful or not. Examples of this data are provided below.

`queries_executed`#

SEP collects aggregated counts of specific query dimensions as described in the following table.

Query execution count dimensions#
Dimension	Description
`columnType`	Total queries per column type, across all sources.
`connector`	Total queries per connector.
`connector`, `queryType`	Total queries by connector and query type. Possible query types are `SELECT`, `EXPLAIN`, `DESCRIBE`, `INSERT`, `UPDATE`, `DELETE`, `ANALYZE`, `DATA_DEFINITION`, and `ALTER_TABLE_EXECUTE`.
`function`	Total queries by named function or UDF.
`sessionProperty`, `value`	Total queries using a named session property or catalog session property, and a representation of the value. Boolean values are recorded as-is. Binary values are rounded to the nearest base-two magnitude. For example, 72 GB is recorded as 64 GB. Other numeric values are rounded to the nearest order of magnitude. For example, 54,321 is rounded to 100,000. Text values are not recorded, only the fact that they were set.
`source`	Total queries per named client, as supplied by client, such as “trino-cli”.

The following is an example of the collected dimensional query execution data:

"name":"queries_executed",
"unit":"1",
"sum":{
    "data_points":[
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"3",
          "attributes":[
            {
                "key":"source",
                "value":{
                  "string_value":"trino-cli"
                }
            }
          ]
      },
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"1",
          "attributes":[
            {
                "key":"function",
                "value":{
                  "string_value":"max"
                }
            }
          ]
      },
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"2",
          "attributes":[
            {
                "key":"connector",
                "value":{
                  "string_value":"postgresql"
                }
            },
            {
                "key":"queryType",
                "value":{
                  "string_value":"SELECT"
                }
            }
          ]
      }
    ]
}
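
Each data point carries one combination of dimensions in its `attributes` list. As a minimal sketch, the following Python function indexes the counts by that combination. The input is a trimmed copy of the example above, and `counts_by_dimension` is a hypothetical helper name.

```python
def counts_by_dimension(data_points: list) -> dict:
    """Map each attribute combination to the sum of its integer counts."""
    counts = {}
    for point in data_points:
        dims = tuple(
            (a["key"], a["value"]["string_value"])
            for a in point.get("attributes", [])
        )
        counts[dims] = counts.get(dims, 0) + int(point["as_int"])
    return counts


data_points = [
    {"as_int": "3", "attributes": [
        {"key": "source", "value": {"string_value": "trino-cli"}}]},
    {"as_int": "2", "attributes": [
        {"key": "connector", "value": {"string_value": "postgresql"}},
        {"key": "queryType", "value": {"string_value": "SELECT"}}]},
]
counts = counts_by_dimension(data_points)
print(counts[(("source", "trino-cli"),)])                                # 3
print(counts[(("connector", "postgresql"), ("queryType", "SELECT"))])    # 2
```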

`queries_failed`#

SEP collects aggregated counts of query failures.

Query failure count dimensions#
Dimension	Description
`errorCode`, `failureType`	Total failed queries by failure type. Error codes are numeric code values. Failure types are exception class names such as `io.trino.plugin.hive.ViewAlreadyExistsException` or a generic `io.trino.spi.TrinoException`, but also `java.lang.NullPointerException`.

The following is an example of the collected data:

"name":"queries_failed",
"unit":"1",
"sum":{
    "data_points":[
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"3",
          "attributes":[
            {
                "key":"error_code",
                "value":{
                  "int_value":"400"
                }
            },
            {
                "key":"failure_type",
                "value":{
                  "string_value":"Can't create database 'foo'; database exists"
                }
            }
          ]
      }
    ]
}

`physical_input_bytes`#

SEP collects the aggregated byte count of data in all processed queries.

Physical input bytes dimension counts#
Dimension	Description
`connector`	Total input bytes by connector.

The following is an example of the collected data:

"name":"physical_input_bytes",
"unit":"byte",
"sum":{
    "data_points":[
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"300",
          "attributes":[
            {
                "key":"connector",
                "value":{
                  "string_value":"postgresql"
                }
            }
          ]
      },
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"300"
      }
    ]
}

`physical_input_rows`#

SEP collects the aggregated count of input rows of data in all processed queries.

Physical input rows dimension counts#
Dimension	Description
`connector`	Total input rows by connector.

The following is an example of the collected data:

"name":"physical_input_rows",
"unit":"1",
"sum":{
    "data_points":[
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"300",
          "attributes":[
            {
                "key":"connector",
                "value":{
                  "string_value":"postgresql"
                }
            }
          ]
      },
      {
          "start_time_unix_nano":"1635164762424772000",
          "time_unix_nano":"1635172027851773000",
          "as_int":"300"
      }
    ]
}

Query performance and complexity metrics#

SEP collects aggregations of key performance and complexity measures of the queries it processes.

Query performance and complexity metrics#
Metric	Data type	Description
`analysis_time`	Histogram	Binned query analysis times for all queries in the collection time period.
`catalogs`	Histogram	Binned number of distinct catalogs used in a query for all queries in the collection time period.
`connectors`	Histogram	Binned number of distinct connectors used in a query for all queries in the collection time period.
`cpu_time`	Histogram	Binned total CPU time spent processing a query, for all queries in the collection time period.
`cumulative_memory`	Single value	Binned cumulative memory for a single query throughout its processing, for all queries in the collection time period. This is different from peak memory; not all of the cumulative memory may have been in use at the same time.
`cumulative_system_memory`	Single value	Cumulative memory used by queries in the collection period.
`execution_time`	Histogram	Binned query execution times for all queries in the collection time period.
`input_columns`	Histogram	Binned number of input columns used in a query for all queries in the collection time period.
`output_columns`	Histogram	Binned number of output columns resulting from a query for all queries in the collection time period.
`peak_task_total_memory`	Single value	Highest measured memory used by a task in the collection period.
`peak_task_user_memory`	Single value	Highest measured user memory used by a task in the collection period.
`planning_time`	Histogram	Binned query planning times for all queries in the collection time period.
`queued_time`	Histogram	Binned query queued times for all queries in the collection time period.
`resource_waiting_time`	Histogram	Binned resource waiting times for all queries in the collection time period.
`scheduled_time`	Histogram	Binned scheduled times for all queries in the collection time period.
`schemas`	Histogram	Binned number of distinct schemas used in a query for all queries in the collection time period.
`splits`	Single value	Total number of splits across all queries in the collection time period.
`stages`	Single value	Binned number of stages for a single query, for all queries in the collection time period.
`stage_max_tasks`	Histogram	Binned number of tasks in any given stage for a single query, for all queries in the collection time period.
`tables`	Histogram	Binned number of distinct tables used in a query for all queries in the collection time period.
`table_max_columns`	Histogram	Binned number of columns in a single table for all tables used in a query for all queries in the collection time period.
`wall_time`	Histogram	Binned query wall times for all queries in the collection time period. Wall time does not include queued time.

The following is an example of a single-value metric:

{
  "name":"peak_task_total_memory",
  "unit":"byte",
  "sum":{
      "data_points":[
        {
            "start_time_unix_nano":"1635164762424772000",
            "time_unix_nano":"1635172027851773000",
            "as_int":"66609"
        }
      ],
      "aggregation_temporality":"AGGREGATION_TEMPORALITY_CUMULATIVE",
      "is_monotonic":true
  }
}

Performance data presented as a histogram also includes count and sum values. The count is the number of instances represented in the histogram, and the sum is the metric aggregated across all instances. In the following example, there were three queries with an aggregated analysis time of 1396.0 ms:

{
  "name":"analysis_time",
  "unit":"millisecond",
  "histogram":{
      "data_points":[
        {
            "start_time_unix_nano":"1635164762424772000",
            "time_unix_nano":"1635172027851773000",
            "count":"3",
            "sum":1396.0,
            "bucket_counts":[
              "0",
              "0",
              "2",
              "1",
              "0",
              "0",
              "0",
              "0",
              "0",
              "0",
              "0"
            ],
            "explicit_bounds":[
              10.0,
              100.0,
              500.0,
              1000.0,
              2000.0,
              10000.0,
              60000.0,
              300000.0,
              3600000.0,
              86400000.0
            ]
        }
      ],
      "aggregation_temporality":"AGGREGATION_TEMPORALITY_CUMULATIVE"
  }
}
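
The bucket layout can be read as follows, assuming the usual OpenTelemetry convention that the list of counts has one more entry than the list of bounds and that bucket `i` covers values greater than the previous bound and up to and including bound `i`. The Python sketch below pairs each non-empty bucket from the example with its range and derives the mean from `sum` and `count`. The name `bucket_ranges` is hypothetical.

```python
import math


def bucket_ranges(explicit_bounds: list, bucket_counts: list) -> list:
    """Pair each non-empty bucket with its (lower, upper] range."""
    edges = [-math.inf, *explicit_bounds, math.inf]
    return [
        ((edges[i], edges[i + 1]), int(count))
        for i, count in enumerate(bucket_counts)
        if int(count) > 0
    ]


bounds = [10.0, 100.0, 500.0, 1000.0, 2000.0, 10000.0,
          60000.0, 300000.0, 3600000.0, 86400000.0]
counts = ["0", "0", "2", "1", "0", "0", "0", "0", "0", "0", "0"]

print(bucket_ranges(bounds, counts))
# [((100.0, 500.0), 2), ((500.0, 1000.0), 1)]
print(round(1396.0 / 3, 1))  # mean analysis time in ms: 465.3
```

In this example, two queries had an analysis time between 100 ms and 500 ms, and one query fell between 500 ms and 1000 ms.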

Optional query log data#

If query log collection is enabled, each query processed results in one or more associated log entries. The following is an example of a query log entry:

"logs": [
  {
    "time_unix_nano": "1635515535751000000",
    "severity_number": "SEVERITY_NUMBER_INFO",
    "name": "queryCompletedEvent",
    "body": {
      "string_value": ""
    },
    "attributes": [
      {
        "key": "createTime",
        "value": {
          "string_value": "2021-10-29T13:52:13.288Z"
        }
      },
      {
        "key": "endTime",
        "value": {
          "string_value": "2021-10-29T13:52:15.654Z"
        }
      },
      {
        "key": "executionStartTime",
        "value": {
          "string_value": "2021-10-29T13:52:13.501Z"
        }
      },
      {
        "key": "failureInfo",
        "value": {
          "string_value": "null"
        }
      },
      {
        "key": "metadata.plan",
        "value": {
          "string_value": "Fragment 0 [SINGLE]\n    CPU: 18.33ms, Scheduled: 24.11ms, Input: 598 rows (65.56kB); per task: avg.: 598.00 std.dev.: 0.00, Output: 598 rows (57.21kB)\n    Output layout: [field, field_0, field_1, field_2, field_3, field_4]\n    ..."
        }
      },
      {
        "key": "metadata.query",
        "value": {
          "string_value": "SHOW FUNCTIONS"
        }
      },
      {
        "key": "statistics",
        "value": {
          "string_value": "{\"cpuTime\":0.097000000,...}"
        }
      },
      {
        "key": "inputIOMetadata",
        "value": {
          "string_value": "[{\"connectorMetrics\":{\"Physical input read time\":...}}]"
        }
      },
      {
        "key": "warnings",
        "value": {
          "string_value": "[]"
        }
      }
    ]
  }
]
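
All attribute values in a query log entry are strings, including the timestamps. As a minimal sketch, the following Python snippet derives the end-to-end duration of the example query from its `createTime` and `endTime` attributes. The helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z as UTC."""
    # Replacing "Z" keeps this compatible with Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def query_duration_seconds(attributes: list) -> float:
    """Return the seconds elapsed between createTime and endTime."""
    attrs = {a["key"]: a["value"]["string_value"] for a in attributes}
    delta = parse_timestamp(attrs["endTime"]) - parse_timestamp(attrs["createTime"])
    return delta.total_seconds()


# Timestamps copied from the example entry.
attributes = [
    {"key": "createTime", "value": {"string_value": "2021-10-29T13:52:13.288Z"}},
    {"key": "endTime", "value": {"string_value": "2021-10-29T13:52:15.654Z"}},
]
print(query_duration_seconds(attributes))  # 2.366
```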

Anonymized query statistics#

Anonymized query statistics collect metrics related to query execution without exposing any sensitive information.

Note

We generally recommend enabling this feature in all your deployments, to assist Starburst in improving query performance by using this data.

To enable it, use the following properties:

telemetry.log-completed=true
telemetry.log-query-statistics=ANONYMIZED

The following is an example of the collected anonymized query statistics:

{
  "cpuTime": 0.005,
  "failedCpuTime": 0,
  "wallTime": 0.113,
  "queuedTime": 0,
  "scheduledTime": 0.019,
  "failedScheduledTime": 0,
  "analysisTime": 0.01,
  "planningTime": 0.026,
  "executionTime": 0.103,
  "inputBlockedTime": 0,
  "failedInputBlockedTime": 0,
  "outputBlockedTime": 0,
  "failedOutputBlockedTime": 0,
  "peakUserMemoryBytes": 117,
  "peakTaskUserMemory": 117,
  "peakTaskTotalMemory": 117,
  "physicalInputBytes": 1512,
  "physicalInputRows": 4,
  "processedInputBytes": 31,
  "processedInputRows": 3,
  "internalNetworkBytes": 0,
  "internalNetworkRows": 0,
  "totalBytes": 1512,
  "totalRows": 3,
  "outputBytes": 31,
  "outputRows": 3,
  "writtenBytes": 0,
  "writtenRows": 0,
  "cumulativeMemory": 0,
  "failedCumulativeMemory": 0,
  "stageGcStatistics": [
    {
      "stageId": 0,
      "tasks": 1,
      "fullGcTasks": 0,
      "minFullGcSec": 0,
      "maxFullGcSec": 0,
      "totalFullGcSec": 0,
      "averageFullGcSec": 0
    }
  ],
  "completedSplits": 2,
  "complete": true,
  "cpuTimeDistribution": [
    {
      "stageId": 0,
      "tasks": 1,
      "p25": 5,
      "p50": 5,
      "p75": 5,
      "p90": 5,
      "p95": 5,
      "p99": 5,
      "min": 5,
      "max": 5,
      "total": 5,
      "average": 5
    }
  ],
  "operatorSummaries": [
    "{\n  \"stageId\" : 0,\n  \"pipelineId\" : 0,\n  \"operatorId\" : 0,\n  \"planNodeId\" : \"0\",\n  \"operatorType\" : \"TableScanOperator\",...}",
    "{\n  \"stageId\" : 0,\n  \"pipelineId\" : 0,\n  \"operatorId\" : 1,\n  \"planNodeId\" : \"6\",\n  \"operatorType\" : \"TaskOutputOperator\",...}"
  ],
  "planNodeStatsAndCosts": "{\n  \"stats\" : { },\n  \"costs\" : { }\n}",
  "resourceWaitingTime": 0.01
}
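
Some fields in the statistics, such as `planNodeStatsAndCosts`, are themselves JSON documents encoded as strings, so they need a second decoding step. The following Python sketch shows this for the value from the example. The entries in `operatorSummaries` are elided in the example and are therefore not decoded here.

```python
import json

# Value copied from the example; the string contains an embedded JSON document.
statistics = {
    "planNodeStatsAndCosts": "{\n  \"stats\" : { },\n  \"costs\" : { }\n}",
}

plan_stats = json.loads(statistics["planNodeStatsAndCosts"])
print(plan_stats)  # {'stats': {}, 'costs': {}}
```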

Anonymized query input-output metadata#

Anonymized query input-output metadata consists of connector metrics along with anonymized table-level metadata. Hence, it doesn’t expose any sensitive information. For instance, the catalog name tpch is anonymized to catalog_1.

Note

We generally recommend enabling this feature in all your deployments, to assist Starburst in improving query performance by using this data.

To enable it, use the following properties:

telemetry.log-completed=true
telemetry.log-query-io-metadata=ANONYMIZED

The following is an example of the collected anonymized input-output metadata:

[
  {
    "catalogName": "catalog_dd42e04c",
    "schema": "schema_8fcc1516",
    "table": "table_dd3bdbd2",
    "columns": [
      "column_144ef030",
      "column_6c2bbea9"
    ],
    "connectorMetrics": {
      "Physical input read time": {
        "@class": "io.trino.plugin.base.metrics.DurationTiming",
        "duration": "1266330.00ns"
      },
      "OrcReaderCompressionFormat_ZLIB": {
        "@class": "io.trino.plugin.base.metrics.LongCount",
        "total": 112
      }
    },
    "physicalInputBytes": 1512,
    "physicalInputRows": 4
  }
]
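
As a sketch of how this metadata can be summarized, the following Python function totals the physical input bytes and rows per anonymized table. The input is a trimmed copy of the example above, and `input_totals` is a hypothetical helper name.

```python
def input_totals(io_metadata: list) -> dict:
    """Sum physical input bytes and rows per anonymized catalog.schema.table name."""
    totals = {}
    for entry in io_metadata:
        name = ".".join((entry["catalogName"], entry["schema"], entry["table"]))
        total_bytes, total_rows = totals.get(name, (0, 0))
        totals[name] = (
            total_bytes + entry["physicalInputBytes"],
            total_rows + entry["physicalInputRows"],
        )
    return totals


io_metadata = [
    {
        "catalogName": "catalog_dd42e04c",
        "schema": "schema_8fcc1516",
        "table": "table_dd3bdbd2",
        "physicalInputBytes": 1512,
        "physicalInputRows": 4,
    }
]
print(input_totals(io_metadata))
# {'catalog_dd42e04c.schema_8fcc1516.table_dd3bdbd2': (1512, 4)}
```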
