Insights configuration#

Insights requires some additional configuration before you can access its features in the Starburst Enterprise web UI.

Requirements#

Insights requires:

Note

Usage metrics are surfaced and documented in Insights. However, usage metrics continue to be collected even when Insights is disabled.

Insights is configured in the config.properties file on the coordinator only. To avoid startup failures, do not add Insights properties to worker configuration files.

General configuration properties#

To use the persisted data in the query logger database, you must explicitly enable it by setting insights.persistence-enabled=true. Persisted data provides the information needed for the query and cluster history features. Additionally, you can configure Insights to persist cluster metrics to the same database with the insights.metrics-persistence-enabled property.

For deployments using built-in access control, usage metrics and unrestricted access to query history are controlled through roles and privileges in the built-in access control system.

Warning

Do not use the deprecated insights.authorized-users and insights.authorized-groups properties with SEP’s built-in access control, because they grant administrative access outside of built-in access control. Use the built-in access control configuration properties instead.

Deployments using supported third-party access control integrations such as Ranger and Privacera must use the UI access controls and configuration properties available in SEP’s built-in access control to control which users can access specific UI elements if that level of control is desired.

For legacy deployments, usage metrics and unrestricted access to query history are controlled by the deprecated insights.authorized-users and insights.authorized-groups configuration properties. You must assign specific authorized users, groups, or both, or grant access to everyone with a wildcard. Users who have not been granted access can only review their own queries. See example authorization configurations in Authorization examples.

Cluster and query history-related configuration properties#

| Property name | Description |
| --- | --- |
| insights.persistence-enabled | Enable the query history functionality. Defaults to false. |
| insights.metrics-persistence-enabled | Enable the usage metrics functionality. Defaults to false. |
| insights.authorized-users | Regular expression to match user names granted unrestricted access to the query history and to the usage metrics feature. See the examples for more details. Do not use if implementing SEP’s built-in access control. Use the built-in access control configuration properties instead. |
| insights.authorized-groups | Regular expression to match user groups granted unrestricted access to the query history and to the usage metrics feature. See the examples for more details. Do not use if implementing SEP’s built-in access control. Use the built-in access control configuration properties instead. |

The following example config.properties configuration enables Insights on a coordinator that already has a query logger configured:

insights.persistence-enabled=true
insights.metrics-persistence-enabled=true

Data retention#

The Insights data retention configuration properties let you define when SEP runs a sweep to purge query history, and how old data must be before it is purged. Sweeps purge data from the following tables:

  • completed_queries

  • query_user_group

  • query_tables

  • query_views

  • query_plan

Note

Usage metrics, cluster metrics, query editor tab information, node anomalies, and recommendations are not purged.

Data retention policies are disabled by default, ensuring that no data is purged automatically. If data retention policies are enabled, SEP logs all deletions.

Insights data retention configuration properties#

| Property name | Description | Example |
| --- | --- | --- |
| insights.data-retention.max-age | Data older than this threshold is purged. Can be any SEP duration value, but is always rounded up to a whole number of days. If omitted, no data is deleted. | insights.data-retention.max-age=180d |
| insights.data-retention.sweep-schedule | Schedule on which sweeps start, in Unix cron format. If omitted, 0 0 * * * is used (each day at midnight). | insights.data-retention.sweep-schedule=0 3 * * * |
| insights.data-retention.sweep-schedule-timezone | Time zone, in TZDB format, used by the sweep schedule. If omitted, the system time zone is used. | insights.data-retention.sweep-schedule-timezone=America/New_York |
| insights.data-retention.sweep-max-duration | Maximum sweep duration. Can be used along with the sweep schedule to limit the time window in which data deletion occurs. Can be any SEP duration value. If omitted, or if the configured value exceeds the time until the next scheduled sweep, the sweep runs until the next scheduled sweep. | insights.data-retention.sweep-max-duration=3h |
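
Combining the example values from the table gives a retention policy along the lines of the following sketch for the coordinator's config.properties. It purges history older than 180 days, starting each day at 03:00 New York time and running for at most three hours. The values are illustrative only, not recommendations.

```properties
# Purge query history older than 180 days (rounded up to whole days)
insights.data-retention.max-age=180d
# Start the sweep every day at 03:00 instead of the default midnight
insights.data-retention.sweep-schedule=0 3 * * *
# Interpret the schedule in this time zone rather than the system time zone
insights.data-retention.sweep-schedule-timezone=America/New_York
# Limit each sweep to three hours
insights.data-retention.sweep-max-duration=3h
```

Because omitting insights.data-retention.max-age means no data is deleted, the other three properties only take effect once it is set.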

Miscellaneous configuration properties#

The following optional properties allow you to fine-tune the functionality and behavior of Insights.

Miscellaneous Insights configuration properties#

| Property name | Description |
| --- | --- |
| insights.jdbc.connection-pool.max-size | Maximum number of connections to the query logger database in the connection pool in SEP. Default is 10. |
| insights.jdbc.connection-pool.min-size | Minimum number of connections to the query logger database in the connection pool in SEP. Default is 1. |
| insights.metrics-collection-interval | How often query and cluster metrics are sampled for the overview page. Default is 15s. |
| insights.metrics-persistence-interval | How often query and cluster resource metrics are persisted for the cluster history page. Default is 60s. |
| insights.max-samples | The number of sample data points to store in memory for the graphs on the overview page. Default is 120. |
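
As a starting point for tuning, the following sketch writes out the documented defaults explicitly, so it should behave the same as omitting these properties altogether. If one sample is kept per collection interval (an assumption, not something stated above), 120 samples at 15s would cover roughly the last 30 minutes on the overview page.

```properties
# Connection pool for the query logger database (documented defaults)
insights.jdbc.connection-pool.max-size=10
insights.jdbc.connection-pool.min-size=1
# Sampling for the overview page (documented defaults)
insights.metrics-collection-interval=15s
insights.max-samples=120
# Persistence for the cluster history page (documented default)
insights.metrics-persistence-interval=60s
```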

Instructions for configuring Insights for Kubernetes deployments are available in the Kubernetes documentation.

Authorization examples#

You can grant unrestricted access to the Insights query overview and query history features by using regular expressions that match authorized users and groups.

Note

The insights.* properties described on this page only manage access to the following panes in the Insights section of the Starburst Enterprise web UI:

  • Query overview (and the Query details sub-pages for each query)

  • Usage metrics

The properties that manage access to the built-in access control features are independent. They are managed with the starburst.access-control.* properties described in Access control users and groups.

To grant everyone unrestricted access to the Insights features noted previously, use the expression .*, which matches any user name.

insights.authorized-users=.*

To grant unrestricted access to users alice, bob, and charlie, use a regular expression with the pipe (|) separator:

insights.authorized-users=alice|bob|charlie

User names are provided from the cluster’s configured authentication system. Only PASSWORD-based authentication types are supported.
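
Because the value is a regular expression, a single pattern can in principle cover a family of user names. The following sketch assumes a hypothetical naming convention in which analyst accounts start with analyst_; that prefix is invented for illustration. This page does not state whether the expression must match the entire user name or only part of it, so verify the behavior on your cluster before relying on prefix patterns.

```properties
# Hypothetical: alice, plus any user whose name starts with "analyst_"
insights.authorized-users=alice|analyst_.*
```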

To grant unrestricted access to all users in the admin and super groups, include the following:

insights.authorized-groups=admin|super

Groups must be configured with File group provider or LDAP group provider in etc/group-provider.properties.

You can also configure specific users at the same time you configure groups. For example, the following setup grants unrestricted access to any user in the admin group, and also to the user alice, even if she is not a member of the admin group:

insights.authorized-users=alice
insights.authorized-groups=admin

If both insights.authorized-users and insights.authorized-groups are empty, no user has access to the Insights usage metrics and query history features.
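
Pulling the pieces of this page together, a legacy deployment's coordinator config.properties might contain something like the following sketch. It assumes a query logger is already configured and that built-in access control is not in use; the user and group names are the illustrative ones from the examples above.

```properties
# Enable query history and metrics persistence (both default to false)
insights.persistence-enabled=true
insights.metrics-persistence-enabled=true
# Deprecated legacy authorization: do not combine with built-in access control
insights.authorized-users=alice
insights.authorized-groups=admin
```

Add these properties on the coordinator only, since Insights properties in worker configuration files can cause startup failures.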