Starburst Enterprise cluster basics #

Starburst Enterprise platform (SEP) is powerful and highly configurable. We have extensive documentation to help you ensure that SEP works as efficiently and securely as possible in your environment.

Architecture #

A SEP cluster consists of a coordinator and many workers. Users connect to the coordinator with their SQL query tool. The coordinator collaborates with the workers to process queries. The coordinator and all workers access the connected data sources, and this access is configured in catalogs.

You can learn more in the concepts section of the reference documentation.

Processing each query is a stateful operation. The coordinator orchestrates the workload and spreads it in parallel across all workers in the cluster. Each node runs SEP in a single JVM instance, and processing is further parallelized using threads.
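
The scatter and gather pattern described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not how SEP is implemented: the function names (`split_work`, `worker_partial_sum`, `run_query`) are hypothetical, threads stand in for worker nodes, and the "query" is a simple sum.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(rows, worker_count):
    """Coordinator role: divide the input into one chunk per worker."""
    return [rows[i::worker_count] for i in range(worker_count)]


def worker_partial_sum(chunk):
    """Worker role: process one chunk independently of the others."""
    return sum(chunk)


def run_query(rows, worker_count=4):
    """Scatter chunks to the workers, then gather and combine the partial results."""
    chunks = split_work(rows, worker_count)
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partials = list(pool.map(worker_partial_sum, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_query(list(range(1, 101))))  # prints 5050
```

The point of the sketch is the division of roles: one component plans and combines, while many components do the work side by side.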

Memory and CPU resource considerations #

Typical SEP workloads require large amounts of memory and CPU for processing. For optimal scheduling, all workers need the same large amount of memory allocated.
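
As a rough illustration of the uniform-memory requirement, the following Python sketch flags workers whose allocation differs from the most common value. The worker names and sizes are made-up sample data, and `find_memory_outliers` is a hypothetical helper, not part of SEP.

```python
from collections import Counter


def find_memory_outliers(worker_memory_gb):
    """Return the workers whose memory differs from the most common allocation."""
    if not worker_memory_gb:
        return {}
    # The most frequent value is treated as the intended allocation.
    expected, _ = Counter(worker_memory_gb.values()).most_common(1)[0]
    return {name: mem for name, mem in worker_memory_gb.items() if mem != expected}


if __name__ == "__main__":
    # Hypothetical inventory: worker name -> allocated memory in GB.
    inventory = {"worker-1": 256, "worker-2": 256, "worker-3": 128}
    print(find_memory_outliers(inventory))  # prints {'worker-3': 128}
```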

OS and software requirements #

SEP requires Linux, Java, and Python. Specific details vary for each version and deployment platform.
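
A quick presence check for these three prerequisites might look like the Python sketch below. It deliberately checks only that each item exists, because the required versions vary by SEP release and deployment platform. The function name `check_prerequisites` is hypothetical, and the sketch assumes the Java and Python executables are on the `PATH` as `java` and `python3` or `python`.

```python
import platform
import shutil


def check_prerequisites():
    """Report whether the host runs Linux and has Java and Python on the PATH."""
    return {
        "linux": platform.system() == "Linux",
        "java": shutil.which("java") is not None,
        "python": any(shutil.which(name) is not None for name in ("python3", "python")),
    }


if __name__ == "__main__":
    for name, present in check_prerequisites().items():
        print(f"{name}: {'found' if present else 'missing'}")
```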

Networking #

SEP has a few networking aspects you need to consider:

  • Access for users to the coordinator over HTTP/HTTPS
  • Access from the cluster to any external authentication system, such as LDAP
  • Access from all cluster nodes to any queried data sources
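
The network paths listed above can be spot-checked with a plain TCP connection test. The following Python sketch is a minimal example under stated assumptions: every host name is a placeholder, and the ports (8080 for the coordinator, 636 for LDAP over TLS, 5432 for a PostgreSQL data source) are assumed values that need to be replaced with the ones used in your environment.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical endpoints; replace hosts and ports with your own.
CHECKS = {
    "coordinator (HTTP)": ("sep-coordinator.example.com", 8080),
    "LDAP server": ("ldap.example.com", 636),
    "data source (PostgreSQL)": ("postgres.example.com", 5432),
}


def run_checks(checks):
    """Run every check and map each label to its reachability result."""
    return {label: can_connect(host, port) for label, (host, port) in checks.items()}


if __name__ == "__main__":
    for label, reachable in run_checks(CHECKS).items():
        print(f"{label}: {'reachable' if reachable else 'unreachable'}")
```

Because the data source check applies to all cluster nodes, a script like this would need to run on the coordinator and on every worker. A successful TCP connection only shows that the port is reachable, not that authentication or queries will succeed.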

Securing #

Starburst has an array of powerful, comprehensive security features to ensure that your data governance and security are top-notch. We strongly suggest you begin by watching the training video below. After that, our extensive security documentation will help you get started securing your data with SEP.

Configuration basics #

When you are ready to install, we’ve got you covered with detailed reference documentation covering everything from deployment to setting up data source connections:

Deployment options #

SEP can be deployed in several ways, depending on your organization’s infrastructure, skills, and existing tooling. Follow the link most appropriate to your environment to learn more about deploying SEP:

For production clusters, we strongly recommend using a cloud provider marketplace offering or a self-managed Kubernetes deployment in the cloud:

The following options are available for bare-metal servers or virtual machines in your own data center or in a cloud environment: