Security and Starburst Galaxy #

Starburst Galaxy provides the benefits of Trino on an easy-to-use, fully managed, enterprise-ready SaaS platform.

Data sources, catalogs, and clusters #

Your data sources for Starburst Galaxy are managed by you in a cloud provider infrastructure. The data sources remain under your control. Starburst Galaxy accesses only the data that is queried.

Data source access is configured in catalogs in Starburst Galaxy. Catalogs use authentication and authorization configured by you in the data source of your cloud provider to access the data.

These catalogs can be used in one or more clusters. The clusters run within cloud platform regions of your choice. To connect publicly to a customer’s resource, the Elastic IP addresses of Starburst’s NAT gateways are allowlisted. All access to data sources originates from these clusters.

Control plane #

The control plane of Starburst Galaxy manages the overall application, provides configuration storage and all other aspects of managing the system for all users. The control plane is deployed and managed by Starburst in our cloud environments. All storage is encrypted and separated per customer. Only a limited number of privileged users at Starburst are granted access to the control plane.

Authentication and authorization system #

Starburst Galaxy includes a role-based access control (RBAC) system that governs each user’s access to Starburst Galaxy itself, to the clusters, and to the configured catalogs with the data from the data sources.

Starburst Galaxy provides a hosted login experience allowing users to sign in with standard username and password credentials. You can manage all users for your organization with the Starburst Galaxy user interface.

Users are assigned one or more roles. A role has a name and an optional description, and can be assigned privileges on entities, such as cluster management, user creation, audit log viewing, and others. You can manage users, roles, and privileges in the Starburst Galaxy user interface.
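
The relationship between users, roles, and privileges described above can be illustrated with a minimal sketch. This is not Starburst Galaxy's implementation or API. The class names, privilege actions, and entity names are all hypothetical and chosen only to mirror the wording of this section.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Privilege:
    """Hypothetical privilege: an action permitted on a kind of entity."""
    action: str  # for example "manage", "create", "view"
    entity: str  # for example "cluster", "user", "audit_log"


@dataclass
class Role:
    """A role has a name, an optional description, and assigned privileges."""
    name: str
    description: str = ""
    privileges: set = field(default_factory=set)


@dataclass
class User:
    """A user is assigned one or more roles."""
    email: str
    roles: list = field(default_factory=list)

    def is_allowed(self, action: str, entity: str) -> bool:
        # Access is allowed if any assigned role carries the privilege.
        wanted = Privilege(action, entity)
        return any(wanted in role.privileges for role in self.roles)


# Illustrative roles and users (assumed names, not real Galaxy roles).
auditor = Role("auditor", "Read-only access to audit logs",
               {Privilege("view", "audit_log")})
cluster_admin = Role("cluster_admin",
                     privileges={Privilege("manage", "cluster")})

alice = User("alice@example.com", roles=[auditor])
bob = User("bob@example.com", roles=[auditor, cluster_admin])
```

In this sketch, `alice` can view audit logs but cannot manage clusters, while `bob` holds both roles and can do both.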

Starburst Galaxy includes an attribute-based access control (ABAC) system that uses policies and attributes, such as tags, to help further manage role access to entities like catalogs, schemas, tables, and views. You can manage policies and tags in the Starburst Galaxy user interface.
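
As a rough illustration of how tags and policies can refine role access, the following sketch evaluates tag-based policies against an entity. It is an assumption-laden example, not Starburst Galaxy's policy engine. The `Entity` and `Policy` types, the "deny wins, otherwise default deny" rule, and the tag names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity such as a catalog, schema, table, or view."""
    name: str
    kind: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: allow or deny a role on entities with a tag."""
    role: str
    tag: str
    effect: str  # "allow" or "deny"


def is_permitted(role: str, entity: Entity, policies: list) -> bool:
    # Collect the policies that apply to this role and match one of the
    # entity's tags.
    matching = [p for p in policies
                if p.role == role and p.tag in entity.tags]
    # Assumed semantics: an explicit deny overrides any allow, and
    # access is denied when no policy matches at all.
    if any(p.effect == "deny" for p in matching):
        return False
    return any(p.effect == "allow" for p in matching)


policies = [
    Policy(role="analyst", tag="public", effect="allow"),
    Policy(role="analyst", tag="pii", effect="deny"),
]

orders = Entity("orders", "table", frozenset({"public"}))
customers = Entity("customers", "table", frozenset({"public", "pii"}))
```

Here the hypothetical `analyst` role can read `orders`, but the `pii` tag on `customers` triggers the deny policy even though the table is also tagged `public`.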

Access to the Starburst Galaxy user interface, and directly to clusters with clients, is secured with Transport Layer Security (TLS) and globally trusted certificates.
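
From a client's point of view, TLS with globally trusted certificates means the connection can be verified against the operating system's default trust store without installing custom certificates. The sketch below uses Python's standard `ssl` module to show that kind of verification. The helper names are hypothetical, the TLS 1.2 minimum is an assumption for the example and not a documented Galaxy requirement, and the host name you pass would be your own cluster's.

```python
import socket
import ssl


def build_client_context() -> ssl.SSLContext:
    # The default context loads the system's trusted CA certificates,
    # requires a valid server certificate, and checks the host name.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host: str, port: int = 443) -> str:
    # Opens a verified TLS connection and reports the protocol version.
    # The handshake raises ssl.SSLCertVerificationError if the server
    # certificate is not trusted or does not match the host name.
    context = build_client_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_tls_version()` with a cluster host name returns the negotiated protocol version when the handshake succeeds, and fails with a certificate verification error otherwise.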

Starburst Galaxy follows the recommendations and guidelines from the National Institute of Standards and Technology (NIST), specifically the digital identity guidelines in NIST Special Publication 800-63.

More information is available in the Starburst Galaxy security documentation.

Logging and monitoring #

Starburst Galaxy includes comprehensive logging of events and end-to-end user activities. It automates health and performance monitoring, providing the observability needed to verify that services are functioning optimally.

Audit and compliance #

Starburst audits all actions that are taken on your account. Audit logs are maintained within the user interface and are available to you.

Usage information #

Starburst strives to access and collect only the minimum amount of information needed to provide our products and services. In some instances, Starburst staff may be required to access customer information via the Starburst Galaxy user interface to provide customer support, to fulfill legal requirements, or for other legitimate business purposes. Employees with data access undergo regular appropriate use training and our environment is protected with robust security measures and controls.

Starburst Galaxy subprocessors #

Starburst Galaxy uses third-party subprocessors to assist in providing services. For details, see Starburst Galaxy subprocessors.

Catalog explorer #

Starburst follows the security best practice of data minimization and has made specific provisions so that only metadata is accessed by the catalog explorer feature. No personal information (PI) is involved, and no data is cached by Starburst.

Starburst Warp Speed #

Clusters with Starburst Warp Speed acceleration use caches that reside on solid-state drive (SSD) storage attached to cluster nodes that are a part of Starburst Galaxy infrastructure. These caches can contain personal information (PI). No additional security is needed, however, as information is encrypted at rest and there is no means of direct access for end users. All access to cached data is subject to all applicable, existing access control policies. When nodes are destroyed, any data residing in the attached SSDs is also destroyed.

Cloudflare integration #

Starburst adds Cloudflare integration, providing robust protection for Starburst Galaxy, ensuring consistent speed, availability, and security. With Cloudflare’s global threat protection network, Starburst Galaxy can handle traffic spikes, fend off attacks, and stay online for a smooth user experience.

Customer data privacy FAQ #

Does Starburst Galaxy have access to my personal information?

In some instances, Starburst Galaxy staff may be required to access customer information via the Starburst Galaxy user interface to provide customer support, fulfill legal requirements, or for other legitimate business purposes. Your data sources for Starburst Galaxy are managed by you in a cloud provider infrastructure.

What data in customer data sources is visible to Starburst?

Starburst may observe metadata and statistics about query submission and completion. This information includes query text, session properties, user agent, start and end times, errors encountered, and high-level query statistics such as overall memory consumption and total I/O bytes scanned. It is used to power the query history product feature, and a reduced subset is kept for product analytics. However, the data in customer data sources is never observed or recorded by Starburst unless one of the opt-in features (FTE, accelerated clusters, or result set caching) is enabled.

Access to customer-supplied Galaxy application configuration is approval-gated and permitted only for emergency situations or customer-initiated troubleshooting. Customer credentials are not accessible.

What customer data sources does Starburst have access to when the FTE,
accelerated clusters, and result set caching opt-in features are enabled?

Starburst never has access to a customer’s data sources. However, there are three opt-in features where customer table data may be buffered temporarily in Galaxy on a persistent storage medium. This data consists of whatever the customer is querying at the moment. Starburst places no restrictions on the content of this buffered data, so it may contain PII at the direction of the customer. No Starburst staff see the buffered data. All data is stored in an encrypted format.

This data is stored for the following amounts of time:

  • FTE: For the duration of the query.
  • Warp Speed: For as long as the index exists, dependent upon customer cluster usage.
  • Result set caching: Up to 1 day.

Do any Starburst staff have access to my business’s catalog data within
my production environment?

Only a limited number of privileged users at Starburst are granted access to the production environment. Access to confidential data is granted on a need-to-know basis. Access to catalog data is only permitted for troubleshooting or to resolve any emergency situations.

What data, specifically, does the staff of Starburst access?

An employee of Starburst cannot view queried data. Metadata and statistics about query submission and completion are captured, but the processed data is not observable to employees or retained. There are a few additional opt-in features (such as accelerated clusters or result set caching) that may temporarily store queried data by design, but this data is stored in an encrypted format.

Is there anything sensitive, besides queried data, that an employee of
Starburst could view when a customer purchases Galaxy?

Metadata and statistics about query submission and completion are used to power the query history product feature, and a reduced subset of that is kept for product analytics.

Approximately how many Starburst employees have access to the production
environment?

Starburst utilizes Entitle for just-in-time access provisioning, a security practice that grants users an appropriate level of access, for a limited amount of time, as needed to complete tasks. Entitle will reduce the risk posed by accounts that are “always on”, and we expect completion by the end of 2023. This ensures that employees who need additional access above their base level for work are able to provide production-critical support if necessary. At this time (August 2023), only 13 employees have access to the database that houses the submitted query context; however, even these staff have no access to the customer data being queried. No data is being accessed.
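
The idea behind just-in-time access, a grant that carries its own expiry instead of being “always on”, can be sketched in a few lines. This example does not represent Entitle's API or Starburst's internal tooling. The `AccessGrant` type, the `grant_access` helper, and the resource name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical time-boxed grant of access to one resource."""
    user: str
    resource: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # The grant is valid only until its expiry time.
        return now < self.expires_at


def grant_access(user: str, resource: str, duration: timedelta,
                 now: datetime) -> AccessGrant:
    # Access is issued for a limited window rather than permanently.
    return AccessGrant(user, resource, now + duration)


issued_at = datetime(2023, 8, 1, 9, 0, tzinfo=timezone.utc)
grant = grant_access("engineer@example.com", "query-context-db",
                     timedelta(hours=2), now=issued_at)
```

With a two-hour window, the grant in this sketch is active one hour after it is issued and inactive three hours after, with no separate revocation step.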

Do support engineers have access to the Galaxy production environment
(AWS)?

Support engineers do not currently have access to the Galaxy production environment (AWS).

When a user enters a password to use Galaxy, do we store that
anywhere? If so, where?

All user credentials are stored in an encrypted format in a database. Starburst staff do not have access to a user’s plain text credentials.

Can Starburst access a user’s catalog credentials?

Starburst does not have access to credentials for those catalogs within a customer account. While Starburst may conduct debugging sessions, debugging is only done by screen sharing with customers when absolutely necessary.

Does Starburst share my personal data with third-party vendors?

In order to provide the services to you, Starburst Galaxy utilizes third-party vendors (subprocessors) for functions such as platform analytics, marketing services, and so forth. Starburst does not allow these third-party service providers to use your personal data for their own purposes.

What third-party vendors does Starburst share my personal information
with?

Visit Starburst subprocessors.

Does Starburst Galaxy have access to my payment card information (PCI)?

Starburst Galaxy does not collect or store credit card information. Any credit card payments you make for Starburst products are made through Stripe, although Starburst Galaxy also supports other payment methods, such as through AWS Marketplace.

Does Starburst Galaxy store any customer data that contains PI (personal
information)?

Starburst is considered a data processor. In unique situations, data may be stored temporarily in Starburst Galaxy. Within the Starburst Galaxy UI, if a customer selects the “I have no metastore” option, Starburst Galaxy creates a metastore for the customer, and all of their metadata is therefore hosted by Starburst. This metadata does not contain personal information (PI) unless the customer configures it to do so. With batch-optimized clusters, upon running a query, Starburst may temporarily store the data with our cloud provider. All data is encrypted, and the data is deleted immediately after the query finishes. PI can be stored in clusters with Starburst Warp Speed enabled. Learn about PI in clusters configured to use Starburst Warp Speed.

Does Starburst sell data?

Starburst will not sell your personal data or allow a third party to use your personal data for its own commercial purpose.

If the Starburst control plane is compromised, could customer data be
exposed or exfiltrated?

No. Starburst has no data access, nor is any data stored on the Starburst side. Except where the FTE, Warp Speed, or result set caching features are opted in, only metadata and data about queries is stored on the Starburst side. In practice all data passes through the control plane, but the data that is stored by the control plane is all application data such as query history, configurations, and billing information. The Trino plane is where the actual query data is stored, along with Warp Speed, FTE, and result set caching.

How does Starburst secure access in Galaxy?

  • Only a limited number of privileged users at Starburst are granted access to the infrastructure hosting the production environment. Access to confidential PII is granted on a need-to-know basis. In this case, PII is limited to account information such as names and email addresses of account holders, and account numbers.
  • Access to cluster and catalog configuration is only permitted for emergency situations or customer-initiated troubleshooting. Access to this information requires explicit approval from management. Note that Starburst employees do NOT have access to customer credentials. If necessary, Starburst will request to do a live debugging session with the customer; however, no catalog credentials are ever accessed by Starburst administrators.
  • MFA via Okta is in place for production infrastructure access.
  • User access requests to the production environment are documented and authorized by management prior to granting access. Note that the Starburst production environment detailed here is our own infrastructure and not the customer’s account.

Do you provide the following network security?

  • Ensure Compute clusters are secured within a private network? Yes.
  • Support for Security Groups/equivalent at individual node level? No. The security group is shared in the data plane, but each request to the Galaxy cluster, which has its own unique encryption context, is authenticated. Within the data plane, the clusters are namespaced, and clusters and workers cannot communicate with any cluster or worker outside of their namespace.
  • Ensure Data shuffles do not traverse through internet at any stage? Data shuffle within the Galaxy cluster does not traverse the internet. However, the connection to the data source could go over the public network if AWS PrivateLink is not set up.
  • Support VPC private links or the equivalent within cloud service provider? Yes. We support AWS PrivateLink for data connectivity. Support for Google Cloud and Azure is in development, and we are working on client-side PrivateLink support.

Does Starburst offer a shared VPC or a dedicated VPC?

Starburst is multi-tenant and operates a shared VPC. To prevent unauthorized data source access, a tenant can only create connectors for their individual PrivateLink instance. This control ensures that no tenant can access the data sources of other customers. In addition, Starburst has implemented shared VPC best practices, including cloud security posture management via Wiz, DDoS and firewall protection using Cloudflare, and robust access controls for both staff and customers.