Red Hat OpenShift #

Prerequisites #

Before you get started, here are some things you need:

  • Access to an OpenShift cluster with correctly-sized nodes, using IAM credentials, and with sufficient Elastic IPs
  • Previously installed and configured Kubernetes, including access to kubectl
  • An editor suitable for editing YAML files
  • Your SEP license file

Before you get started installing SEP, we suggest that you read our reference documentation and our helpful customization guide.

Quick start #

After you have signed up through RHM, download the latest OpenShift Container Platform (OCP) client for your platform from the OpenShift mirror site, and copy the oc executable into your path, usually /usr/local/bin. Once this is done, you are ready to install the operator in OCP4.
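For example, on a Linux workstation the download and installation might look like the following. The archive name and URL path are assumptions; use the file you actually downloaded for your platform from the mirror site:

```shell
# Download the OCP client archive (file name and version vary by platform)
curl -LO https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable/openshift-client-linux.tar.gz

# Extract the oc executable and copy it into your path
tar xzf openshift-client-linux.tar.gz oc
sudo cp oc /usr/local/bin/

# Verify the client is found and working
oc version --client
```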

Using your administrator login for Red Hat OCP, log in to the OCP web console and click Operators > OperatorHub in the left-hand menu.

Once there, select “Presto” from the Project: drop-down menu, and navigate through the projects to All Items > Big Data > Starburst until you see Starburst Enterprise. Click on that tile, then click the Install button.

When the Create Operator Subscription page appears, select the Starburst project as the specific namespace on the cluster, leave all other options as default, and click Subscribe.

When the operation is complete, you are subscribed to the SEP operator, and it is installed and accessible to you in OCP4.
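You can also confirm the subscription and operator status from the command line. The namespace name here is an assumption based on the Starburst project selected above:

```shell
# List the ClusterServiceVersions in the Starburst project to confirm
# the operator reached the "Succeeded" phase
oc get csv -n starburst

# Inspect the subscription itself
oc get subscription -n starburst
```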

Getting up and running #

Installation #

Once you have your operator subscription in place, it’s time to install. There are several steps to getting SEP installed and deployed:

  • Installing the SEP cluster
  • Installing the Hive Metastore Service (HMS)

You must install the HMS to connect to and query any object storage with the Hive connector. This is typically a core use case for SEP, and therefore a required step. SEP uses the HMS to manage the metadata of any object storage.

Configuration #

When the operator installation is complete, you can proceed to deploy two custom resources.

Just like with installation, there are several steps to configuring Starburst Enterprise:

Each of these steps uses a specific Helm chart values.yaml configuration. Click on the links for detailed instructions on configuring each of the custom resources.

The following setup steps are required:

  • Configure the resource requirements based on your cluster and node sizes
  • Update the image repository and tags to use the Red Hat registry:
    • Presto: registry.connect.redhat.com/starburst/presto:350-e.1-ubi
    • Presto-init: registry.connect.redhat.com/starburst/presto-init:350.1.1-ubi
    • HMS: registry.connect.redhat.com/starburst/metastore:350.1.1-ubi
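In the Helm chart's values.yaml, the image overrides might look like the following sketch. The exact key names depend on the chart version, so treat this as an illustration rather than a drop-in block:

```yaml
# Hypothetical values.yaml fragment -- key names vary by chart version
image:
  repository: registry.connect.redhat.com/starburst/presto
  tag: 350-e.1-ubi
initImage:
  repository: registry.connect.redhat.com/starburst/presto-init
  tag: 350.1.1-ubi
hive:
  image:
    repository: registry.connect.redhat.com/starburst/metastore
    tag: 350.1.1-ubi
```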

Next steps #

Your cluster is now operational! You can now connect to it with your client tools and start querying your data sources.

We’ve created an operations guide to get you started with common first steps in cluster operations.

It includes some great advice about starting with a small, initial configuration that is built upon in our cluster sizing and performance video training.

Troubleshooting #

SEP is powerful, enterprise-grade software with many moving parts. As such, if you find you need help troubleshooting, here are some helpful resources:

FAQ #

Q: Once it’s deployed, how do I access my cluster?

A: You can use the CLI on a terminal or the Web UI to access your cluster. For example:

  • Presto CLI command: ./presto --server example-presto-presto.apps.demo.rht-sbu.io --catalog hive

  • Web UI URL: http://example-presto-presto.apps.demo.rht-sbu.io

  • Many other client applications can be connected and used to run queries, create dashboards, and more.
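If you are unsure of your cluster's external hostname, you can look up the route that was created for the coordinator. The route and namespace names here are assumptions based on the example deployment above:

```shell
# List the routes in the project; the HOST/PORT column shows the
# externally reachable hostname for the coordinator
oc get route -n starburst

# Or print just the hostname of a specific route
oc get route example-presto -o jsonpath='{.spec.host}'
```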


Q: I need to make administrative changes that require a shell prompt. How do I get a command line shell prompt in a container within my cluster?

A: On OCP, you get a shell prompt for a pod, so you need the name of the pod you want to work from. First, log in to your cluster as described in your RHM documentation. For example:

oc login -u kubeadmin -p XXXXX-XXXXX-XXXXX-XXXX https://api.demo.rht-sbu.io:6443

Get the list of running pods:

❯ oc get pod -o wide
NAME                                                READY   STATUS    RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
hive-metastore-example-presto-XXXXXXXXX-lhj7l       1/1     Running   0          27m   10.131.2.XX   ip-10-0-139-XXX.us-west-2.compute.internal   <none>           <none>
presto-coordinator-example-presto-XXXXXXXXX-4bzrv   1/1     Running   0          27m   10.129.2.XX   ip-10-0-153-XXX.us-west-2.compute.internal   <none>           <none>
presto-operator-7c4ff6dd8f-2xxrr                    1/1     Running   0          41m   10.131.2.XX   ip-10-0-139-XXX.us-west-2.compute.internal   <none>           <none>
presto-worker-example-presto-XXXXXXXXX-522j8        1/1     Running   0          27m   10.131.2.XX   ip-10-0-139-XXX.us-west-2.compute.internal   <none>           <none>
presto-worker-example-presto-XXXXXXXXX-kwxhr        1/1     Running   0          27m   10.130.2.XX   ip-10-0-162-XXX.us-west-2.compute.internal   <none>           <none>
presto-worker-example-presto-XXXXXXXXX-phlqq        1/1     Running   0          27m   10.129.2.XX   ip-10-0-153-XXX.us-west-2.compute.internal   <none>           <none>

The pod name is the first value in a record. Use the pod name to open a shell:

❯ oc rsh presto-coordinator-example-presto-XXXXXXXXX-4bzrv

A shell prompt appears. For example:

sh-4.4$

Q: Is there a way to get a shell prompt through the OCP web console?

A: Yes. Log in to your OCP web console and navigate to Workloads > Pods. Select the pod you want a terminal for, and click the Terminal tab.


Q: I’ve added a new data source. How do I update the configuration to recognize it?

A: Follow the making configuration changes section to edit your YAML configuration: find additionalCatalogs and add an entry for your new data source. For example, to add a PostgreSQL data source called mydatabase:

    mydatabase: |
      connector.name=postgresql
      connection-url=jdbc:postgresql://172.30.XX.64:5432/pgbench
      connection-user=pgbench
      connection-password=postgres123

Once your changes are complete, click Save and then Reload to deploy your changes. Note that this restarts the coordinator and all workers in the cluster, and can take several minutes.
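You can watch the restart progress from a terminal; once all pods report Running and Ready, the new catalog is available. The server hostname below reuses the example deployment from the FAQ above:

```shell
# Watch the pods cycle as the coordinator and workers restart
oc get pods -w

# Once the coordinator is back, confirm the new catalog is visible
./presto --server example-presto-presto.apps.demo.rht-sbu.io \
    --execute 'SHOW CATALOGS'
```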