Installing Starburst Enterprise in a Kubernetes cluster

Now that you have satisfied all the requirements, you are ready to install Starburst Enterprise platform (SEP).

Overview

Installing any of the charts with Helm on your cluster follows the same steps:

  • Establish access to the Helm chart repository

  • Configure credentials to access the Docker registry in a registry access YAML file for use in all of your Starburst k8s deployments.

  • Create a YAML file specific to each chart and cluster, for example sep-prod-setup.yaml for your production cluster, as in this example.

  • Ensure your Helm/kubectl configuration points at the correct cluster with kubectl cluster-info

  • Run Helm to install the chart

  • Access the cluster and check for success
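As a rough sketch, the steps above could be wrapped in a small shell function. The function name install_sep_sketch and its three arguments are hypothetical, the repository URL and credentials are placeholders for the values you receive from Starburst Support, and the repository alias starburstdata is assumed to match the chart references used on this page.

```shell
# Hypothetical helper that walks through the installation steps in order.
# Arguments: <helm-repo-url> <username> <password> (all placeholders).
install_sep_sketch() {
    repo_url="$1"
    repo_user="$2"
    repo_pass="$3"

    # Establish access to the Helm chart repository under the alias
    # "starburstdata" and refresh the local chart index.
    helm repo add starburstdata "$repo_url" \
        --username "$repo_user" \
        --password "$repo_pass" || return 1
    helm repo update || return 1

    # Confirm that kubectl (and therefore Helm) points at the intended cluster.
    kubectl cluster-info || return 1

    # Install the chart, or upgrade it if the release already exists.
    helm upgrade my-sep starburstdata/starburst-enterprise \
        --install \
        --version 360.15.0 \
        --values ./registry-access.yaml \
        --values ./sep-prod-setup.yaml || return 1

    # Check for success by listing the pods in the current namespace.
    kubectl get pods
}
```

Stopping at the first failing command keeps a wrong cluster context or a missing values file from going unnoticed before the chart is installed.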

Each Helm chart includes a values.yaml file that sets a reasonable set of default values. This default setup does not include catalog definitions and other settings that are necessary for your cluster.

To add or update any configuration, change your values file and run a Helm upgrade to apply the changes to the cluster.

Iterate on the configuration in the values YAML file with a minimal setup until you have a working system. Depending on your cluster, you have to adjust the memory requirements for the worker and coordinator as well as other settings. If a pod does not start, inspect the messages with kubectl or Octant to determine details.

After you have a running cluster, store the values YAML file as a reference, and then add more details as required for the specific needs of the cluster.

SEP installation

The SEP installation is managed with the starburst-enterprise Helm chart. Installation and upgrades are done with the helm upgrade command, using at a minimum the --install, --version, and --values options.

The following example command assumes the registry access file, the production cluster configuration file, and the catalog configuration file are located in the current directory:

$ helm upgrade my-sep starburstdata/starburst-enterprise \
    --install \
    --version 360.15.0 \
    --values ./registry-access.yaml \
    --values ./sep-prod-setup.yaml \
    --values ./sep-prod-catalogs.yaml

The version value is available from the Helm repository.

The default values result in one coordinator and two worker nodes in the cluster, with very specific memory and CPU resources that likely will not match your cluster’s specifications. We strongly suggest that you initially install SEP with the minimal changes needed to reflect your cluster’s memory and CPU resources, then make small, focused customizations to suit your organization’s needs.
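A minimal sizing override might look like the following sketch, which writes a small sep-prod-setup.yaml. The top-level coordinator: and worker: nodes come from this page, but the nested key names and all sizes are assumptions for illustration. Running helm show values starburstdata/starburst-enterprise --version 360.15.0 prints the chart defaults, which is the place to confirm the exact key names before using a file like this.

```shell
# Write a minimal sep-prod-setup.yaml that only overrides resource sizing.
# Assumption: the nested keys (resources, memory, cpu, replicas) are
# illustrative; confirm the exact names in the chart's values.yaml.
# The sizes are hypothetical and must be adapted to your nodes.
cat > sep-prod-setup.yaml <<'EOF'
coordinator:
  resources:
    memory: "16Gi"
    cpu: 4
worker:
  replicas: 2
  resources:
    memory: "16Gi"
    cpu: 4
EOF
```

Keeping this file limited to sizing makes it easy to see which change caused a pod to stay unscheduled while you iterate.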

The following sections describe the initial installation and how to begin customizing SEP.

Create this one file before you begin

No matter what other configurations you need for your deployments, create a registry-access.yaml file for reuse across all clusters and charts. This helps to ensure a smooth install and deployment experience.

In your registry-access.yaml file, add the following to configure access for Starburst’s Harbor registry:

registryCredentials:
  enabled: true
  registry: harbor.starburstdata.net/starburstdata
  username: <yourusername>
  password: <yourpassword>

The contents of this file override the default, empty values and ensure that the Helm charts can download the required Docker containers.

If you have multiple clusters, this same file is used for all of them. You can also use the same file for the optional Ranger and Hive Metastore Service charts. Other configurations should be managed in separate files.
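Because registry-access.yaml contains a password, it can help to generate it with restrictive permissions. The following is a minimal sketch, assuming a hypothetical helper named write_registry_access that takes the username and password as arguments; the YAML keys are the ones shown above.

```shell
# Hypothetical helper: write registry-access.yaml from two arguments so the
# password does not have to be typed into an editor.
# Arguments: <username> <password>
# Note: values containing a double quote or backslash need YAML escaping.
write_registry_access() {
    (
        # Restrict a newly created file to the current user, since it holds
        # a password. An existing file keeps its current permissions.
        umask 077
        cat > registry-access.yaml <<EOF
registryCredentials:
  enabled: true
  registry: harbor.starburstdata.net/starburstdata
  username: "$1"
  password: "$2"
EOF
    )
}
```

The subshell keeps the umask change from leaking into the rest of your session.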

As an alternative to using a username and password directly, you can use a Kubernetes secret:

  • Creating a secret containing the access token for your registry

  • Configuring your pod to use the secret
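The two steps above might be sketched as follows. kubectl create secret docker-registry is the standard kubectl command for registry credentials. The helper name create_pull_secret and the secret name starburst-pull-secret are hypothetical, and the list-of-names layout under imagePullSecrets: mirrors the Kubernetes pod specification, so treat it as an assumption to verify against the chart's values.yaml.

```shell
# Step 1: create a docker-registry secret holding the registry credentials.
# Arguments: <secret-name> <username> <password> (placeholders).
create_pull_secret() {
    kubectl create secret docker-registry "$1" \
        --docker-server=harbor.starburstdata.net \
        --docker-username="$2" \
        --docker-password="$3"
}

# Step 2: reference the secret instead of a username and password.
# Assumption: the chart accepts a list of secret names under imagePullSecrets,
# mirroring the Kubernetes pod spec; "starburst-pull-secret" is hypothetical.
# This replaces any existing registry-access.yaml in the current directory.
cat > registry-access.yaml <<'EOF'
imagePullSecrets:
  - name: starburst-pull-secret
EOF
```

Calling create_pull_secret starburst-pull-secret <yourusername> <yourpassword> creates the secret that the file refers to. Use either registryCredentials: or imagePullSecrets: in the file, not both.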

Initial installation checklist

The following checklist describes the initial installation process:

  1. Gather repository credentials for the Helm chart repository and the Docker registry provided by Starburst Support.

  2. Create the registry-access.yaml file to override the default, empty values.

  3. Create your correctly-sized Kubernetes cluster.

  4. Ensure your Helm/kubectl configuration points at the correct cluster with kubectl cluster-info.

  5. Add your license file. We strongly suggest using a shared secret to add the license file. The following command assumes you are running it from the directory where your starburstdata.license file resides:

$ kubectl create secret generic starburstdata --from-file starburstdata.license

  6. Create a minimal YAML configuration file with the memory and CPU resource configurations for coordinator: and worker: that reflect your cluster’s available resources, for example sep-prod-setup.yaml.

Warning

Do not skip this step. The default values for memory and CPU resources likely vary significantly from your cluster’s available resources. If you attempt to run SEP with the defaults, SEP may not start.

  7. Run Helm to install the default chart, applying any override YAML files with the --values argument, as in the following example:

$ helm upgrade sep-prod-cluster starburstdata/starburst-enterprise \
    --install \
    --version 360.15.0 \
    --values ./registry-access.yaml \
    --values ./sep-prod-setup.yaml

  8. Determine the IP address or the DNS hostname of the coordinator by running the kubectl get pods command.

  9. Use the IP address or hostname to verify the coordinator is running by accessing the Web UI. You can use the same information to connect with the CLI or the JDBC driver.
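The last two checklist steps could be sketched with the following helpers. The function names are hypothetical, and the service name starburst and port 8080 used for the port forward are assumptions; the output of kubectl get services shows the actual names and ports in your cluster.

```shell
# List pods and services to find the coordinator and how it is exposed.
show_sep_endpoints() {
    kubectl get pods --output wide || return 1
    kubectl get services
}

# Forward a local port to the coordinator service for a quick Web UI check.
# Assumption: the service is named "starburst" and listens on port 8080;
# pass a different service name as the first argument if your cluster differs.
forward_coordinator() {
    kubectl port-forward "service/${1:-starburst}" 8080:8080
}
```

While forward_coordinator is running, the Web UI should be reachable at http://localhost:8080 from the machine where you ran it.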

Updating to a new release

If you have created focused, well-managed override files following our best practices guide, the upgrade process is a straightforward Helm-based process. As with any enterprise-scale application, we do recommend that you test upgrades from one release to another using a test cluster. This allows you to catch any configuration changes and update Helm charts before deploying into production:

  1. Review the Helm charts release notes for any relevant changes that affect your override files. For instance, there may be new configuration options to add, or deprecated properties to remove.

  2. Review the SEP release notes for new capabilities and breaking changes.

  3. Run Helm with the updated Helm chart version and with the updated YAML configuration files, as in the following example:

$ helm upgrade my-sep-staging-cluster starburstdata/starburst-enterprise \
    --install \
    --version 360.15.0 \
    --values ./registry-access.yaml \
    --values ./sep-stage-setup.yaml
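For a test cluster, the upgrade could be wrapped in a helper like the following sketch. The function name upgrade_sep is hypothetical. It uses helm search repo with --versions to list the chart versions available in the repository, and helm history to show the resulting revisions.

```shell
# Hypothetical helper: upgrade a release to a given chart version.
# Arguments: <release> <chart-version> [extra helm options, such as --values].
upgrade_sep() {
    release="$1"
    version="$2"
    shift 2

    # Refresh the chart index and list the chart versions that are available.
    helm repo update || return 1
    helm search repo starburstdata/starburst-enterprise --versions || return 1

    # Apply the upgrade with the override files passed as extra arguments.
    helm upgrade "$release" starburstdata/starburst-enterprise \
        --install \
        --version "$version" \
        "$@" || return 1

    # Show the revision history; helm rollback can return to an earlier one.
    helm history "$release"
}
```

A call such as upgrade_sep my-sep-staging-cluster 360.15.0 --values ./registry-access.yaml --values ./sep-stage-setup.yaml matches the example above. If the test cluster misbehaves after the upgrade, helm rollback with the release name and a revision number from the history returns to that revision.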

Hive Metastore Service installation

The Hive Metastore Service installation is managed with the starburst-hive Helm chart. Installation follows the same approach as the SEP chart, using a minimal values YAML file with the registry credentials and a custom values file for your HMS configuration, for example hms-prod.yaml, in the current directory:

$ helm upgrade my-hms starburstdata/starburst-hive \
    --install \
    --version 360.15.0 \
    --values ./registry-access.yaml \
    --values ./hms-prod.yaml

The Configuring the Hive Metastore Service in Kubernetes page provides details for the configuration of the Hive Metastore Service.

Ranger installation

The Ranger installation is managed with the starburst-ranger Helm chart. Installation follows the same approach as the SEP chart, using a minimal values YAML file with the registry credentials and a custom values file for your Ranger configuration, for example ranger-prod.yaml, in the current directory:

$ helm upgrade my-ranger starburstdata/starburst-ranger \
    --install \
    --version 360.15.0 \
    --values ./registry-access.yaml \
    --values ./ranger-prod.yaml

The Configuring Starburst Enterprise with Ranger in Kubernetes page provides details for the configuration of Apache Ranger.

Starburst Cache Service installation

The Starburst Cache Service installation is managed with the starburst-cache-service Helm chart. The installation follows the same approach as the SEP chart, using a minimal values YAML file with the registry credentials and a custom values file for your cache service configuration. The following example uses the cache-service-prod.yaml file in the current directory:

$ helm upgrade my-caching-service starburstdata/starburst-cache-service \
    --install \
    --version 360.15.0 \
    --values ./registry-access.yaml \
    --values ./cache-service-prod.yaml

The Configure the cache service in Kubernetes page provides details for the configuration.

We strongly suggest that you follow best practices for customization by creating a series of focused configuration files. The file set described below accomplishes this. If you have more than one cluster, such as a test cluster and a production cluster, name the files accordingly.

Warning

Do not copy over entire sections of the default values.yaml file. Only include nodes that you are changing or adding. Review how to create these files before you begin.

We have provided examples of these files with content.
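To illustrate what "only include nodes that you are changing" means, the following sketch writes a focused override file for a hypothetical dev cluster. Both keys are assumptions to confirm against the chart's values.yaml: starburstPlatformLicense is assumed to name the license secret (starburstdata in the checklist on this page), and worker.replicas is assumed to set the number of workers.

```shell
# Write a focused override file for a hypothetical dev cluster.
# Only the nodes that differ from the chart defaults are listed.
# Assumptions: starburstPlatformLicense names the license secret, and
# worker.replicas sets the worker count; confirm both in values.yaml.
cat > sep-dev-setup.yaml <<'EOF'
starburstPlatformLicense: starburstdata
worker:
  replicas: 1
EOF
```

Everything not listed keeps the chart default, so a later chart release can change those defaults without your file silently pinning old values. After installation, helm get values with the release name prints the user-supplied values of the deployed release, which is a quick way to confirm what was applied.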

Recommended customization file set

File name: registry-access.yaml
Content: Docker registry access credentials file, typically to access the Docker registry on the Starburst Harbor instance. Include the registryCredentials: or imagePullSecrets: top level node in this file to configure access to the Docker registry. This file can be used for all SEP, HMS, and Ranger configuration for all clusters you operate.

File name: sep-prod-catalogs.yaml
Content: Catalog configuration for all catalogs configured for SEP on the prod cluster. It is typically useful to separate catalog configurations out into a separate file to allow reuse across clusters, as well as to separate the large amount of configuration of the catalogs from all the cluster configuration.

File name: sep-prod-setup.yaml
Content: Main configuration file for the prod cluster. Include any configuration for all other top level nodes that configure the coordinator, workers, and all other aspects of the cluster.

If you operate multiple clusters, create and manage additional configuration files while reusing the credentials file. For example, if you run a dev and a stage cluster, use the following additional files:

  • sep-dev-catalogs.yaml

  • sep-dev-setup.yaml

  • sep-stage-catalogs.yaml

  • sep-stage-setup.yaml
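With this naming scheme, one command shape covers every cluster. The following is a minimal sketch, assuming a hypothetical helper named install_sep_for_env, a release name pattern modeled on sep-prod-cluster, and a kubectl context that already points at the matching cluster.

```shell
# Hypothetical helper: install or upgrade SEP for one environment.
# Argument: <env> such as prod, stage, or dev.
install_sep_for_env() {
    env_name="$1"

    # Helm merges the files in order; on conflicting keys, a later
    # --values file takes precedence over an earlier one.
    helm upgrade "sep-${env_name}-cluster" starburstdata/starburst-enterprise \
        --install \
        --version 360.15.0 \
        --values ./registry-access.yaml \
        --values "./sep-${env_name}-catalogs.yaml" \
        --values "./sep-${env_name}-setup.yaml"
}
```

For example, install_sep_for_env stage applies registry-access.yaml, sep-stage-catalogs.yaml, and sep-stage-setup.yaml to a release named sep-stage-cluster.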

If you are optionally implementing one or both of the Hive Metastore Service and Apache Ranger, you need to create a configuration file for each of these per cluster as well:

Production prod cluster:

  • hms-prod.yaml

  • ranger-prod.yaml

Development dev cluster:

  • hms-dev.yaml

  • ranger-dev.yaml

Staging stage cluster:

  • hms-stage.yaml

  • ranger-stage.yaml

Private registries and repositories instead of Starburst Harbor

Typically, you must use your username and password for accessing the Helm chart repositories and the Docker registry on the Starburst Harbor instance. You can instead use private Docker and Helm chart repositories.