Clusters

A cluster in Starburst Galaxy provides the resources to run queries against numerous data sources. A cluster defines the number of workers, the JVM runtime configuration, the configured data sources, and other settings.

The Starburst Galaxy platform (SGP) allows you to create, edit, and delete clusters from the interface. Access your clusters at any time by clicking Clusters on the left-hand menu.

Add a new cluster

Before you can create a cluster in the SGP, you need to add one or more data sources.

  1. On the Clusters page, select + New. If this is your first cluster, you can select + New cluster from the Dashboard or Clusters pages.
  2. Enter a unique Cluster name. Names can use lowercase letters, numbers, and hyphens.

  3. Select your cluster profile (review cluster profile details below):
    • Cost optimized
    • General purpose
    • Memory optimized
    • Compute optimized
  4. From the Add data source(s) drop-down menu, select the data sources for your cluster. If you don’t have a data source, click Connect a new data source to add one.

  5. Select your Availability zone.

  6. Set your number of workers. Starburst Galaxy recommends a minimum of two workers. To let the cluster scale, check the box next to Use autoscaling and provide a minimum and maximum number of workers for your cluster.
  7. Select Create cluster to finish.

You can start, stop, and edit your cluster at any time from the Clusters page.
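
The rules stated in the steps above can be collected into a quick pre-flight check before you fill in the form. The following Python sketch is illustrative only and is not part of Starburst Galaxy: the function name `check_cluster_settings`, its parameters, and the message wording are hypothetical, and the name pattern reads "lowercase letters, numbers, and hyphens" literally, so the interface may enforce further limits that are not modeled here.

```python
import re

# Assumption: the naming rule is taken literally from step 2; any length
# limit or leading-character rule enforced by the interface is not modeled.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")

# The four cluster profiles listed in step 3.
PROFILES = {
    "Cost optimized",
    "General purpose",
    "Memory optimized",
    "Compute optimized",
}


def check_cluster_settings(name, existing_names, profile, data_sources,
                           workers, max_workers=None):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper: it only mirrors the rules written in the steps.
    """
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name may only use lowercase letters, numbers, and hyphens")
    if name in existing_names:
        problems.append("name is already in use")
    if profile not in PROFILES:
        problems.append("unknown cluster profile")
    if not data_sources:
        problems.append("at least one data source is required")
    if workers < 2:
        problems.append("fewer than the recommended minimum of two workers")
    # max_workers is only set when Use autoscaling is checked.
    if max_workers is not None and max_workers < workers:
        problems.append("autoscaling maximum is below the minimum")
    return problems
```

For example, `check_cluster_settings("etl-cluster-1", set(), "General purpose", ["my-data-source"], 2, 4)` returns an empty list, while a name such as `ETL_Cluster` is reported because it contains uppercase letters and an underscore.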

Cluster profile overview

Starburst Galaxy provides four cluster profiles to allow you to create a cluster that is right for your purposes. Review the cluster profile differences:

  • Cost optimized: Balance your workloads with total cost. Ideal for performing the most work for the price.
    • Instance type: r5a.large
    • vCPUs: 2
    • Memory (GiB): 16
    • Instance Storage (GiB): EBS Only
    • Network Bandwidth (Gbps): Up to 10
    • EBS Bandwidth (Mbps): Up to 650
  • General purpose: Balance your compute, memory, and networking resources. Suitable for a variety of diverse workloads.
    • Instance type: r5a.4xlarge
    • vCPUs: 16
    • Memory (GiB): 128
    • Instance Storage (GiB): EBS Only
    • Network Bandwidth (Gbps): Up to 10
    • EBS Bandwidth (Mbps): Up to 2,880
  • Memory optimized: Deliver fast performance for workloads that process large data sets in memory.
    • Instance type: m5a.12xlarge
    • vCPUs: 48
    • Memory (GiB): 192
    • Instance Storage (GiB): EBS Only
    • Network Bandwidth (Gbps): 10
    • EBS Bandwidth (Mbps): 6,780
  • Compute optimized: Ideal for compute-bound applications that benefit from high-performance processors.
    • Instance type: r5a.xlarge
    • vCPUs: 4
    • Memory (GiB): 32
    • Instance Storage (GiB): EBS Only
    • Network Bandwidth (Gbps): Up to 10
    • EBS Bandwidth (Mbps): Up to 1,085
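
To compare profiles for a given worker count, you can multiply the per-instance figures listed above by the number of workers. The Python sketch below does that arithmetic. It assumes each worker runs on exactly one instance of the listed type, which this page does not state explicitly; it leaves the coordinator out and ignores memory reserved for the operating system and the JVM. The names `PROFILE_SPECS` and `worker_capacity` are hypothetical.

```python
# Per-instance figures copied from the profile list above.
PROFILE_SPECS = {
    "Cost optimized": {"instance_type": "r5a.large", "vcpus": 2, "memory_gib": 16},
    "General purpose": {"instance_type": "r5a.4xlarge", "vcpus": 16, "memory_gib": 128},
    "Memory optimized": {"instance_type": "m5a.12xlarge", "vcpus": 48, "memory_gib": 192},
    "Compute optimized": {"instance_type": "r5a.xlarge", "vcpus": 4, "memory_gib": 32},
}


def worker_capacity(profile, workers):
    """Return the raw vCPU and memory totals across all workers.

    Assumption: one worker per instance; coordinator resources are excluded.
    """
    spec = PROFILE_SPECS[profile]
    return {
        "vcpus": spec["vcpus"] * workers,
        "memory_gib": spec["memory_gib"] * workers,
    }


if __name__ == "__main__":
    # With autoscaling, evaluate both ends of the worker range.
    for count in (2, 4):
        print(count, worker_capacity("General purpose", count))
```

Under these assumptions, a General purpose cluster with four workers totals 64 vCPUs and 512 GiB of memory, and the same cluster scaled down to two workers totals 32 vCPUs and 256 GiB.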