Starburst Enterprise in GCP Marketplace #

Starburst Enterprise platform (SEP) is available directly through the Google Cloud Platform (GCP) Marketplace and runs on a variety of instance types. The GCP Marketplace offering lets you set up a monthly contract, after which you can deploy SEP on Google Kubernetes Engine (GKE) using either the command line or Google’s Click to Deploy.

Using command line deployment vs. Click to Deploy #

We strongly recommend that you deploy using the command line. Google’s Click to Deploy option is best suited for small proofs-of-concept, and its customization options may be limited.

A command line deployment of SEP is customizable, including connections to services such as:

  • Hive Metastore (HMS)
  • Apache Ranger
  • Starburst Cache Service

Support for marketplace offerings #

Starburst offers the following support for our marketplace subscribers:

Set up your subscription #

Before you begin, you must have a GCP login with the ability to subscribe to services.

To subscribe to SEP through the GCP Marketplace:

  1. Log in with your billable subscriber account and access the Starburst Enterprise offering directly, or enter “Starburst Enterprise” in the marketplace search field and select Starburst Enterprise - Distributed SQL Query Engine.
  2. Click Configure.
  3. On the resulting screen, select either Deploy via command line (recommended), or Click to Deploy on GKE.

Deploy via command line #

  1. Reach out to Starburst Support to have your service account added to the Starburst Google Container Registry (GCR) with the Storage Object Viewer role.
  2. Create a GKE Standard cluster with two nodepools. One nodepool is used for SEP while the other is used for Ranger and HMS. The following are the recommended minimal specifications to use for a proof-of-concept deployment:
    • Cluster name: my-sep-cluster
    • Location type: zonal (lower latency)
    • K8s version: 1.20.9-gke.1001
    • Primary nodepool name: default-nodepool
      • Number of nodes: 3
      • Machine configuration: e2-standard-16 (16 CPU and 64 GB RAM)
    • Supplementary nodepool name: nonsep
      • Number of nodes: 1
      • Machine configuration: e2-standard-8 (8 CPU and 32 GB RAM)
  3. Set the following environment variables for use with the commands in later steps:
    $ export TAG=2.2.0
    $ export APP_INSTANCE_NAME=<APP_INSTANCE_NAME>
    
  4. Select the service account that you added to the Starburst GCR in Step 1, and click Generate license key. Download the resulting file.
  5. Apply the downloaded license file with the following command:
    $ kubectl apply -f license.yaml
    
  6. Confirm that the <APP_INSTANCE_NAME>-license secret has been created with the following command:
    $ kubectl describe secret $APP_INSTANCE_NAME-license
    
  7. Use the following commands to retrieve and unpack the GCP umbrella Helm chart:
    $ wget https://storage.googleapis.com/starburst-enterprise/helmCharts/sep-gcp/starburst-enterprise-platform-charts-$TAG.tgz
    $ tar -zxvf starburst-enterprise-platform-charts-$TAG.tgz
    
  8. Use kubectl to apply the Application CRD to avoid harmless errors:
    $ kubectl apply -f "https://raw.githubusercontent.com/GoogleCloudPlatform/marketplace-k8s-a
    
  9. Generate the complete Kubernetes deployment manifest as follows. The command as shown assumes the nodepool names described in step 2:
    $ helm template "$APP_INSTANCE_NAME" . \
         --set deployerHelm.image="gcr.io/starburst-public/starburstdata/deployer:$TAG" \
         --set reportingSecret="$APP_INSTANCE_NAME-license" \
         --set starburst-enterprise.image.repository="gcr.io/starburst-public/starburstdata" \
         --set starburst-enterprise.image.tag="$TAG" \
         --set starburst-enterprise.initImage.repository="gcr.io/starburst-public/starburstdata/starburst-enterprise-init" \
         --set starburst-enterprise.initImage.tag="$TAG" \
         --set metricsReporter.image="gcr.io/starburst-public/starburstdata/metrics_reporter:$TAG" \
         --set imageUbbagent="gcr.io/cloud-marketplace-tools/metering/ubbagent:latest" \
         --set starburst-enterprise.coordinator.resources.limits.cpu=15 \
         --set starburst-enterprise.coordinator.resources.requests.cpu=15 \
         --set starburst-enterprise.coordinator.resources.memory="56Gi" \
         --set starburst-enterprise.worker.replicas=2 \
         --set starburst-enterprise.worker.resources.limits.cpu=15 \
         --set starburst-enterprise.worker.resources.requests.cpu=15 \
         --set starburst-enterprise.worker.resources.memory="56Gi" \
         --set starburst-ranger.admin.image.repository="gcr.io/starburst-public/starburstdata/starburst-ranger-admin" \
         --set starburst-ranger.admin.image.tag="$TAG" \
         --set starburst-ranger.usersync.image.repository="gcr.io/starburst-public/starburstdata/ranger-usersync" \
         --set starburst-ranger.usersync.image.tag="$TAG" \
         --set starburst-ranger.gcpExtraNodePool="nonsep" \
         --set starburst-ranger.enabled=true \
         --set starburst-hive.image.repository="gcr.io/starburst-public/starburstdata/hive" \
         --set starburst-hive.image.tag="$TAG" \
         --set starburst-hive.gcpExtraNodePool=nonsep \
         --set starburst-hive.enabled=true  > sep_manifest.yaml
    
  10. Apply the newly generated K8s manifest with the following command:
    $ kubectl apply -f sep_manifest.yaml
    
  11. Check whether the cluster is created and available:
    $ kubectl get pods -o wide
     NAME                                                     READY   STATUS      RESTARTS   AGE    IP           NODE                                             NOMINATED NODE   READINESS GATES
     coordinator-64cfdb94fd-v6bxv                             2/2     Running     0          4h8m   10.28.3.8    gke-test-mp-cluster-default-pool-5b89684f-6g3b   <none>           <none>
     hive-7c8b5b5495-v9gwz                                    2/2     Running     0          4h8m   10.28.0.27   gke-test-mp-cluster-nonsep-0be06378-b718         <none>           <none>
     ranger-7c6b59bdd5-b9v8s                                  2/2     Running     0          4h8m   10.28.0.28   gke-test-mp-cluster-nonsep-0be06378-b718         <none>           <none>
     starburst-enterprise-1-lic-secret-job-nfmqg              0/1     Completed   0          4h8m   10.28.2.31   gke-test-mp-cluster-default-pool-5b89684f-qfvp   <none>           <none>
     starburst-enterprise-1-metrics-reporter-9f9f5f77-7nvd2   2/2     Running     0          4h8m   10.28.2.30   gke-test-mp-cluster-default-pool-5b89684f-qfvp   <none>           <none>
     worker-76ff548b96-c2rwz                                  1/1     Running     0          4h8m   10.28.2.32   gke-test-mp-cluster-default-pool-5b89684f-qfvp   <none>           <none>
     worker-76ff548b96-wj996                                  1/1     Running     0          4h8m   10.28.1.12   gke-test-mp-cluster-default-pool-5b89684f-kfz2   <none>           <none>
    
  12. Once you have confirmed that the cluster is running, ensure that the metrics reporter is able to submit metrics to the GCP metering service. The following command should return a "Report submission status: 200 : OK" message:
    $ kubectl logs deployment/$APP_INSTANCE_NAME-metrics-reporter -c metrics-reporter
    
  13. Verify that there are no errors reported by ubbagent:
    $ kubectl logs deployment/$APP_INSTANCE_NAME-metrics-reporter -c ubbagent
    
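The proof-of-concept cluster described in step 2 could also be created with gcloud rather than through the console. The following is a minimal sketch, not part of the official procedure: it only prints the commands so that you can review them before running anything. The zone us-central1-a, the variable names, and the function name are assumptions made for illustration. Note that gcloud container clusters create names the initial node pool default-pool.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) gcloud commands for the proof-of-concept cluster.
# ZONE is an assumption; pick the zone that suits your project.
CLUSTER="my-sep-cluster"
ZONE="us-central1-a"
K8S_VERSION="1.20.9-gke.1001"

print_cluster_commands() {
  # Primary node pool for SEP: 3 x e2-standard-16.
  echo "gcloud container clusters create $CLUSTER --zone $ZONE --cluster-version $K8S_VERSION --num-nodes 3 --machine-type e2-standard-16"
  # Supplementary node pool for Ranger and HMS: 1 x e2-standard-8.
  echo "gcloud container node-pools create nonsep --cluster $CLUSTER --zone $ZONE --num-nodes 1 --machine-type e2-standard-8"
}

print_cluster_commands
```

Review the printed commands, adjust the zone and version to what your project supports, and then run them yourself.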

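The checks in steps 11 and 12 can be scripted. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and both helpers read command output on standard input. The pod check looks only at the STATUS column, so it does not detect a Running pod whose containers are not all ready.

```shell
#!/usr/bin/env bash
# Sketch: helpers that read command output on stdin.
# Function names are hypothetical, chosen for this example.

# Succeeds only if at least one pod row is present and every pod row
# shows STATUS Running or Completed.
# Expects `kubectl get pods` output, including the header row.
all_pods_healthy() {
  awk 'NR > 1 && NF > 0 { seen = 1; if ($3 != "Running" && $3 != "Completed") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

# Succeeds if the metrics reporter log contains the success message.
metrics_report_ok() {
  grep -q 'Report submission status: 200 : OK'
}

# Usage against a live cluster (requires kubectl and $APP_INSTANCE_NAME):
#   kubectl get pods | all_pods_healthy && echo "pods healthy"
#   kubectl logs "deployment/$APP_INSTANCE_NAME-metrics-reporter" -c metrics-reporter \
#     | metrics_report_ok && echo "metering OK"
```

Reading from standard input keeps the helpers independent of the cluster, so they can be tried against saved output before being wired into a deployment script.
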
Next steps #

Review our guides to Catalogs and Connectors, and take advantage of our free comprehensive training videos that cover SEP security and performance:

Click to Deploy #

Google’s Click to Deploy option is best suited for small proof-of-concept deployments. Customization options may be limited. We strongly recommend that you begin with the Deploy via command line option.

  1. Select an existing cluster or create a GKE Standard cluster. SEP requires two nodepools. One nodepool is used for SEP while the other is used for Ranger and HMS. The following are the recommended minimal specifications to use for proofs-of-concept:
    • Cluster name: my-sep-cluster
    • Location type: zonal (lower latency)
    • K8s version: 1.20.9-gke.1001
    • Primary nodepool name: default-nodepool
      • Number of nodes: 3
      • Machine configuration: e2-standard-16 (16 CPU and 64 GB RAM)
    • Supplementary nodepool name: nonsep
      • Number of nodes: 1
      • Machine configuration: e2-standard-8 (8 CPU and 32 GB RAM)
  2. The default values displayed in the configuration screen are absolute minimums, and are not considered performant. You must change them to represent the resources available in the selected cluster.
  3. Select a namespace. Do not use “default” if another SEP instance is already running in it.
  4. To enable global security with Apache Ranger:
    1. Check the Enable Global Security box. This creates a new Ranger instance that is configured to communicate with your SEP cluster. Note that you cannot connect to an existing Ranger instance with Click to Deploy.
    2. Provide an admin password and a service user password for Ranger. Each must be at least 8 alphanumeric characters long.
  5. Select Enable Hive Metastore if you are connecting any Hive, Iceberg, or Delta Lake data sources.
  6. Provide a reporting service account.
  7. Click Deploy. Once the application is deployed, it is available in the GCP Console.
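
The Ranger password rule in step 4 can be checked before you fill in the form. The following is a minimal sketch that assumes the rule means "only letters and digits, 8 or more of them"; the form itself may accept other characters. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pre-check a candidate Ranger password.
# Assumption: valid means 8 or more characters, letters and digits only.
valid_ranger_password() {
  printf '%s' "$1" | grep -Eq '^[[:alnum:]]{8,}$'
}

# Usage:
#   valid_ranger_password "$CANDIDATE" || echo "password does not meet the rule"
```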

Next steps #

The following pages introduce key concepts and features in SEP: