Starburst Enterprise in Google Cloud Marketplace#
Starburst Enterprise platform (SEP) is available directly through the Google Cloud Platform Marketplace to run on a variety of instance types. Our Google Cloud Marketplace offering allows you to easily set up a monthly contract, after which you can deploy SEP using the command line or Google’s Click to Deploy on Google Kubernetes Engine (GKE).
Deployment options#
We strongly recommend that you deploy using the command line. Google’s Click to Deploy option is best suited for small proofs-of-concept and limits your customization options.
After you have deployed using the command line, you can customize SEP, including adding catalogs and services such as:
Hive Metastore (HMS)
Apache Ranger
Starburst Cache Service
Marketplace support#
Starburst Enterprise offers the following support for our marketplace subscribers:
Email-only support
Five email issues to gcpsupport@starburstdata.com per month
First response SLA of one business day
Support hours are 9 AM to 6 PM US Eastern Time
Set up your subscription#
Before you begin, you must have a Google Cloud login with the ability to subscribe to services.
Note
In the following steps, if the blue button says Purchase instead of Configure, the Google Cloud account you are using is not set up for subscription billing.
To subscribe to SEP through the Google Cloud Marketplace:
Log in with your billable subscriber account and access the Starburst Enterprise offering directly, or enter “Starburst Enterprise” in the marketplace search field and select Starburst Enterprise - Distributed SQL Query Engine.
Click Configure.
On the resulting screen, select Deploy via command line. Click to Deploy on GKE is not supported.
Set up the GKE cluster#
Create a GKE Standard cluster with two nodepools. One nodepool is for SEP while the other is for Ranger and HMS, so if you don’t need these additional services then the default nodepool for SEP is sufficient. The following are the recommended minimal specifications to use for a proof-of-concept deployment:
Cluster name: my-sep-cluster
Location type: zonal (lower latency)
K8s version: 1.20.9-gke.1001
Primary nodepool name: default-nodepool
Number of nodes: 3
Machine configuration: e2-standard-16 (16 CPU and 64 GB RAM)
Supplementary nodepool name: nonsep
Number of nodes: 1
Machine configuration: e2-standard-8 (8 CPU and 32 GB RAM)
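The specifications above could be provisioned with gcloud roughly as follows. This is a sketch under assumptions, not part of the official steps: the zone is a placeholder, the Kubernetes version is left to the GKE default rather than pinned, and gcloud names the initial node pool default-pool, so the primary pool name differs from the one listed above. The run wrapper and create_poc_cluster function are hypothetical helpers. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: prints the gcloud commands unless DRY_RUN=0 is set.
CLUSTER_NAME="my-sep-cluster"
ZONE="${ZONE:-us-east1-b}"   # assumption: example zone, choose your own

run() {
  # Echo the command in dry-run mode, otherwise execute it.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_poc_cluster() {
  # Zonal Standard cluster; the initial pool hosts SEP (3 x e2-standard-16).
  run gcloud container clusters create "$CLUSTER_NAME" \
    --zone "$ZONE" \
    --num-nodes 3 \
    --machine-type e2-standard-16

  # Supplementary pool for Ranger and HMS (1 x e2-standard-8).
  run gcloud container node-pools create nonsep \
    --cluster "$CLUSTER_NAME" \
    --zone "$ZONE" \
    --num-nodes 1 \
    --machine-type e2-standard-8
}

create_poc_cluster
```

If you do not need Ranger or HMS, the second command can be dropped, since the default node pool is sufficient for SEP alone.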
SEP license requirements#
SEP on Google Cloud is licensed with a pay-as-you-go (PAYGO) model.
PAYGO requires the metering agent and the billing reporter agent, both of which are preconfigured. You must pass the reference to your unique reporting secret. No license file is embedded. Instead, the license must be created first, then the secret must be referenced in the values.yaml using the starburst-enterprise.starburstPlatformLicense configuration property.
Deploy from Google Cloud Marketplace#
The following CLI deployment sections describe how to deploy SEP with the Google Cloud Marketplace.
Get Marketplace license file#
Navigate to the Google Cloud Marketplace SEP offering and select Configure.
Set App instance name and switch to the Deploy via command line tab.
Select the appropriate reporting/cluster service account and click Generate license key.
Apply the downloaded license file with the following command:
$ kubectl apply -f license.yaml
Confirm the starburst-enterprise-license-<unique_suffix> reporting secret has been created:
$ kubectl get secrets | grep starburst-enterprise-license
$ kubectl describe secret starburst-enterprise-license-<unique_suffix>
Record the license name that was loaded into your cluster for later use. For example:
Name: starburst-enterprise-license-121212
Namespace: default
Labels: <none>
Annotations: <none>
Type: Opaque
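To avoid copying the secret name by hand, it can be captured from the kubectl output. The following is a minimal sketch: license_secret_name is a hypothetical helper that reads the output of kubectl get secrets on standard input, and the sample rows (including the DATA and AGE values) are made up for the demo.

```shell
#!/bin/sh
# Print the first secret whose name starts with the Marketplace license prefix.
# Hypothetical helper; not part of kubectl or the SEP charts.
license_secret_name() {
  awk '$1 ~ /^starburst-enterprise-license-/ { print $1; exit }'
}

# Demo on sample text shaped like `kubectl get secrets` output.
sample='NAME                                  TYPE     DATA   AGE
starburst-enterprise-license-121212   Opaque   3      5m'
printf '%s\n' "$sample" | license_secret_name
```

Against a live cluster, the same helper would be used as LICENSE_SECRET="$(kubectl get secrets | license_secret_name)", and the resulting value passed wherever the license name is needed later.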
Get the umbrella Helm Chart#
The version of the Helm Chart is 3.7.3. The associated Starburst Enterprise version is 443.3.0.
Note
Do not extract the contents of the archive downloaded in these steps.
Download the chart as shown:
$ wget https://storage.googleapis.com/starburst-enterprise/helmCharts/sep-gcp/starburst-enterprise-platform-charts-3.7.3.tgz
Apply the Application CRD to avoid errors:
$ kubectl apply -f "https://raw.githubusercontent.com/GoogleCloudPlatform/marketplace-k8s-app-tools/master/crd/app-crd.yaml"
Check the deployment files#
Check the content of the default values distributed with the umbrella Helm chart:
$ helm show values starburst-enterprise-platform-charts-3.7.3.tgz
Check the default values.yaml for the subcharts to ensure that they use the correct Helm chart version. The correct Helm chart version matches the number found in the Tag field on the marketplace listing page. (Note: Tag numbers may omit the revision number.) For example, if version 3.7.3 uses Helm charts in version 443.3.0, then for the Starburst Enterprise Helm chart, run:
$ helm show values starburstdata/starburst-enterprise --version 443.3.0
The same command can be used to print the values.yaml for Starburst Hive and Ranger Helm charts.
Build the values.yaml file#
The values.yaml file contains the configuration for your SEP cluster. You can either use the default content for a basic configuration to start, or build the content yourself for a more custom initial deployment.
Deploy a basic SEP cluster#
Deploy the default values.yaml content for a basic SEP cluster configuration. This YAML requires the starburst-enterprise.reportingSecret value to deploy an SEP cluster:
$ helm install starburst-enterprise starburst-enterprise-platform-charts-3.7.3.tgz --set starburst-enterprise.reportingSecret=starburst-enterprise-license-<unique_suffix>
To deploy with support for Starburst Warp Speed, you must append a secret with an SEP license key to the command: --set starburst-enterprise.starburstPlatformLicense=<secret_with_sep_license>
Deploy a custom SEP cluster#
Create a values.yaml file with the configuration you wish to deploy, following the license requirements for your license type. You can include catalog configuration for your data sources at this time, or add them later.
Example
Copy the below example template and overwrite the defaults with values specific to Google Cloud marketplace:
# Starburst Enterprise chart
starburst-enterprise:
  reportingSecret: ENTERPRISE_LICENSE_NAME
  catalogs:
    bigquery: |
      connector.name=bigquery
      bigquery.project-id=GOOGLE_PROJECT_ID
    hive: |
      connector.name=hive
      hive.metastore.uri=thrift://hive:9083
    starburst-insights: |
      connector.name=postgresql
      connection-url=jdbc:postgresql://INSIGHTS_DATABASE_INSTANCE:5432/insights
      connection-user=postgres
      connection-password=INSIGHTS_DATABASE_PASSWORD
  coordinator:
    additionalProperties: |
      insights.persistence-enabled=true
      insights.metrics-persistence-enabled=true
      insights.jdbc.url=jdbc:postgresql://INSIGHTS_DATABASE_INSTANCE:5432/insights
      insights.jdbc.user=postgres
      insights.jdbc.password=INSIGHTS_DATABASE_PASSWORD
      insights.authorized-users=.*
    etcFiles:
      properties:
        config.properties: |
          http-server.authentication.allow-insecure-over-http=true
          http-server.process-forwarded=true
        password-authenticator.properties: |
          password-authenticator.name=file
    nodeSelector:
      starburstpool: STARBURST_COORDINATOR_NODE_POOL
    resources:
      limits:
        memory: 56Gi
      requests:
        cpu: 15
        memory: 56Gi
  expose:
    type: clusterIp
    ingress:
      serviceName: starburst
      servicePort: 8080
      host: STARBURST_URL
      path: "/"
      pathType: Prefix
      tls:
        enabled: true
        secretName: tls-secret-starburst
      annotations:
        kubernetes.io/ingress.class: nginx
        cert-manager.io/cluster-issuer: letsencrypt
  userDatabase:
    enabled: true
    users:
      - password: ADMIN_PASSWORD
        username: ADMIN_USERNAME
  worker:
    autoscaling:
      enabled: true
      maxReplicas: 10
      minReplicas: 1
      targetCPUUtilizationPercentage: 80
    deploymentTerminationGracePeriodSeconds: 30
    nodeSelector:
      starburstpool: STARBURST_WORKER_NODE_POOL
    resources:
      limits:
        memory: 56Gi
      requests:
        cpu: 15
        memory: 56Gi
    starburstWorkerShutdownGracePeriodSeconds: 120
# Hive Chart
starburst-hive:
  enabled: true
  gcpExtraNodePool: EXTRA_NODE_POOL
  database:
    external:
      driver: org.postgresql.Driver
      jdbcUrl: jdbc:postgresql://HIVE_DATABASE_INSTANCE:5432/hive
      user: postgres
      password: HIVE_DATABASE_PASSWORD
    type: external
  objectStorage:
    gs:
      cloudKeyFileSecret: service-account-key
  expose:
    type: clusterIp
# Ranger Chart
starburst-ranger:
  enabled: true
  admin:
    resources:
      limits:
        cpu: 1
        memory: 1Gi
      requests:
        cpu: 1
        memory: 1Gi
    serviceUser: ADMIN_USERNAME
  gcpExtraNodePool: EXTRA_NODE_POOL
  usersync:
    enabled: true
  database:
    external:
      databaseName: ranger
      databasePassword: RANGER_DATABASE_PASSWORD
      databaseRootPassword: RANGER_DATABASE_ROOT_PASSWORD
      databaseRootUser: postgres
      databaseUser: ranger
      host: RANGER_DATABASE_INSTANCE
      port: 5432
    type: external
  datasources:
    - host: coordinator
      name: starburst-enterprise
      password: ADMIN_PASSWORD
      port: 8080
      username: ADMIN_USERNAME
  expose:
    type: clusterIp
    loadBalancer:
      name: ranger
      ports:
        http:
          port: 6080
    ingress:
      serviceName: ranger
      servicePort: 6080
      host: RANGER_URL
      path: "/"
      pathType: Prefix
      tls:
        enabled: true
        secretName: tls-secret-ranger
      annotations:
        kubernetes.io/ingress.class: nginx
        cert-manager.io/cluster-issuer: letsencrypt
  initFile: files/initFile.sh
Confirm that you have set the following placeholder entries in the above YAML to match your environment and configuration. You can also add any other required configuration to this file, such as SSO, LDAP, Ingress, and custom catalogs:
ENTERPRISE_LICENSE_NAME - The name of the license in license.yaml that was uploaded using kubectl.
GOOGLE_PROJECT_ID - The Google Project you are deploying to.
INSIGHTS_DATABASE_INSTANCE - Hostname/IP for the ‘Starburst Insights’ database instance.
INSIGHTS_DATABASE_PASSWORD - postgres user password for the Insights database.
STARBURST_COORDINATOR_NODE_POOL - Node Pool for the Coordinator (optional).
STARBURST_WORKER_NODE_POOL - Node Pool for the worker nodes. Can be the same as Coordinator (optional).
EXTRA_NODE_POOL - Node pool for Ranger and Hive (optional).
ADMIN_USERNAME - Starburst Enterprise web UI login user.
ADMIN_PASSWORD - Starburst Enterprise web UI login password.
HIVE_DATABASE_INSTANCE - Hostname/IP for the Hive database instance.
HIVE_DATABASE_PASSWORD - postgres user password for the Hive database.
RANGER_DATABASE_INSTANCE - Hostname/IP for the Ranger database instance.
RANGER_DATABASE_ROOT_PASSWORD - postgres user password for the Ranger database.
RANGER_DATABASE_PASSWORD - ranger user password for the Ranger database.
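A quick way to catch a placeholder that was missed is to search the values file for the template names before deploying. The following is a sketch only: check_placeholders is a hypothetical helper, and the pattern also includes STARBURST_URL and RANGER_URL, which appear in the template but not in the list above. The demo writes a throwaway file rather than touching a real values.yaml.

```shell
#!/bin/sh
# Placeholder names taken from the example template in this guide.
PLACEHOLDERS='ENTERPRISE_LICENSE_NAME|GOOGLE_PROJECT_ID|INSIGHTS_DATABASE_INSTANCE|INSIGHTS_DATABASE_PASSWORD|STARBURST_COORDINATOR_NODE_POOL|STARBURST_WORKER_NODE_POOL|EXTRA_NODE_POOL|ADMIN_USERNAME|ADMIN_PASSWORD|HIVE_DATABASE_INSTANCE|HIVE_DATABASE_PASSWORD|RANGER_DATABASE_INSTANCE|RANGER_DATABASE_ROOT_PASSWORD|RANGER_DATABASE_PASSWORD|STARBURST_URL|RANGER_URL'

# Hypothetical helper: print every line of file $1 that still contains a
# placeholder (with its line number). Returns 0 if the file is clean,
# 1 if at least one placeholder remains.
check_placeholders() {
  if grep -nE "$PLACEHOLDERS" "$1"; then
    return 1
  fi
  return 0
}

# Demo on a temporary file with one unreplaced placeholder.
tmp="$(mktemp)"
printf 'starburst-enterprise:\n  reportingSecret: ENTERPRISE_LICENSE_NAME\n' > "$tmp"
check_placeholders "$tmp" || echo "placeholders remain"
rm -f "$tmp"
```

For a real deployment the call would be check_placeholders values.yaml, run before the Helm install.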
Run the Helm deployment#
After you have configured the values file for your environment, run the following:
$ helm install starburst-enterprise starburst-enterprise-platform-charts-3.7.3.tgz --values values.yaml
Validate your deployment#
You can verify that all pods are in a running state or a completed state:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coordinator-6f646d7996-dttk4 4/4 Running 0 25m 10.104.0.19 gke-test-mp-cluster-default-pool-5b89684f-6g3b <none> <none>
hive-7c8b5b5495-v9gwz 2/2 Running 0 25m 10.28.0.27 gke-test-mp-cluster-nonsep-0be06378-b718 <none> <none>
ranger-7c6b59bdd5-b9v8s 2/2 Running 0 25m 10.28.0.28 gke-test-mp-cluster-nonsep-0be06378-b718 <none> <none>
worker-76598b766c-dvgd7 3/3 Running 0 25m 10.104.0.18 gke-test-mp-cluster-default-pool-5b89684f-6g3b <none> <none>
worker-76598b766c-pqg9p 3/3 Running 0 25m 10.104.0.17 gke-test-mp-cluster-default-pool-5b89684f-6g3b <none> <none>
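The status check can be automated by filtering the STATUS column of the pod listing. This is a minimal sketch: not_ready_pods is a hypothetical helper that reads kubectl get pods output on standard input and assumes STATUS is the third column, as in the listing above. The sample rows are invented for the demo.

```shell
#!/bin/sh
# Hypothetical helper: skip the header row and print the name and status of
# every pod whose STATUS is neither Running nor Completed.
not_ready_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Demo on made-up sample output (pod names and states are illustrative only).
sample='NAME              READY   STATUS             RESTARTS   AGE
coordinator-abc   4/4     Running            0          25m
worker-def        0/3     CrashLoopBackOff   4          25m'
printf '%s\n' "$sample" | not_ready_pods
```

Against a live cluster this would be run as kubectl get pods | not_ready_pods; empty output means every pod is in one of the two expected states.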
After deployment, confirm that the metrics reporter is able to submit metrics to Google Cloud metering service:
$ kubectl logs deployment/coordinator -c metrics-reporter
2022-11-17 08:57:01 INFO Starting a new job with Billing Handler...
2022-11-17 08:57:01 INFO Trying to get usage metrics from coordinator...
2022-11-17 08:57:01 ERROR Failure to get usage metrics from the coordinator because of metrics usage unavailability: Expecting value: line 1 column 1 (char 0)
2022-11-17 08:57:01 WARNING The coordinator failed to respond. Make sure it is up and running! Exiting...
2022-11-17 08:57:01 INFO Done
2022-11-17 08:58:02 INFO Starting a new job with Billing Handler...
2022-11-17 08:58:02 INFO Trying to get usage metrics from coordinator...
2022-11-17 08:58:02 INFO Number of cores in this 60 second cycle: 45
2022-11-17 08:58:02 INFO Report submission status: 200 : OK :
2022-11-17 08:58:02 INFO Trying to get status from the Billing agent...
2022-11-17 08:58:02 INFO Status response from the Billing agent: {"lastReportSuccess":"2022-11-17T08:56:48.771289323Z","currentFailureCount":0,"totalFailureCount":0}
2022-11-17 08:58:02 INFO Done
Every minute you should see:
Report submission status: 200 : OK
Number of cores in this 60 second cycle: 45:
1 coordinator * 15 vCPUs + 2 workers * 15 vCPUs = 45 vCPUs in total
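The arithmetic can be reproduced for other cluster sizes. The sketch below assumes the example values from this guide (15 vCPUs requested per pod, one coordinator, two workers, a 60 second cycle); the variable names are illustrative. The resulting figure of 2700 core-seconds appears to correspond to the int64Value reported for cpu_usage_in_seconds_pricing in the sample ubbagent log.

```shell
#!/bin/sh
# Example values from this guide; adjust to your own deployment.
CORES_PER_POD=15
COORDINATORS=1
WORKERS=2
CYCLE_SECONDS=60

# Cores counted in one reporting cycle: (1 + 2) * 15 = 45
total_cores=$(( (COORDINATORS + WORKERS) * CORES_PER_POD ))

# Core-seconds accumulated over the cycle: 45 * 60 = 2700
core_seconds=$(( total_cores * CYCLE_SECONDS ))

echo "cores per cycle: $total_cores"
echo "core-seconds per cycle: $core_seconds"
```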
Please also verify that there are no errors reported by ubbagent:
$ kubectl logs deployment/coordinator -c ubbagent
Listening locally on port 6080
I1117 08:56:48.616602 1 servicecontrol.go:88] ServiceControlEndpoint:Send(): serviceName: starburst-presto.mp-starburst-public.appspot.com body: {"operations":[{"consumerId":"project:pr-a19b90ff70ab666","endTime":"2022-11-17T08:56:48Z","metricValueSets":[{"metricName":"starburst-presto.mp-starburst-public.appspot.com/cpu_usage_in_seconds_pricing","metricValues":[{"endTime":"2022-11-17T08:56:48Z","int64Value":"0","startTime":"2022-11-17T08:56:48Z"}]}],"operationId":"71f2590e-27b6-4be0-9720-eee5918e4c00","operationName":"starburst-presto.mp-starburst-public.appspot.com/report","startTime":"2022-11-17T08:56:48Z","userLabels":{"goog-ubb-agent-id":"b33d9a75-56c8-4965-92ec-644346960142"}}]}
I1117 08:56:48.771219 1 servicecontrol.go:112] ServiceControlEndpoint:Send(): success
I1117 08:58:02.425575 1 servicecontrol.go:88] ServiceControlEndpoint:Send(): serviceName: starburst-presto.mp-starburst-public.appspot.com body: {"operations":[{"consumerId":"project:pr-a19b90ff70ab666","endTime":"2022-11-17T08:58:02Z","metricValueSets":[{"metricName":"starburst-presto.mp-starburst-public.appspot.com/cpu_usage_in_seconds_pricing","metricValues":[{"endTime":"2022-11-17T08:58:02Z","int64Value":"2700","startTime":"2022-11-17T08:58:02Z"}]}],"operationId":"e61490ef-6a1c-4aa3-8c2f-788de8b1a900","operationName":"starburst-presto.mp-starburst-public.appspot.com/report","startTime":"2022-11-17T08:58:02Z","userLabels":{"goog-ubb-agent-id":"b69d9a75-56c8-4965-92ec-644346960142"}}]}
I1117 08:58:02.499010 1 servicecontrol.go:112] ServiceControlEndpoint:Send(): success
Next steps#
Review our Kubernetes configuration documentation:
The following pages introduce key concepts and features in SEP: