Starburst Admin overview and getting started #

Starburst Admin is a collection of Ansible playbooks for installing and managing Starburst Enterprise platform (SEP) or Trino clusters.

The following features are available:

  • Installation and upgrade of Starburst Enterprise platform (SEP) or Trino using the RPM or tar.gz archives
  • Addition and update of coordinator and worker node configuration files, including catalog properties files for data source configuration
  • Service management for the cluster as a whole and for individual nodes (start/stop/restart/status)
  • Collection of logs
  • Addition of custom binary files, such as custom connectors or UDFs
  • Addition of custom configuration files

Starburst Admin does not manage the creation of the servers, the installation and configuration of the operating system, or the installation of Python and Java. It is also not designed to manage related tools such as Apache Ranger, a Hive Metastore Service, or any data source.

Starburst Admin is most suitable for managing clusters installed on bare metal servers or virtual machines. If you use containers and Kubernetes, use the Kubernetes with Helm support instead of Starburst Admin.

Requirements #

You do not need deep knowledge of Ansible to use Starburst Admin, but familiarity with Ansible is helpful. At a minimum, it is assumed that you have Ansible installed and are familiar with running Ansible playbooks.

Requirements for the control node #

The control node is used to run Starburst Admin, and therefore Ansible playbooks. Standard Ansible requirements apply:

  • Ansible 2.10 or higher
  • Linux/Unix operating system
  • Python 2.7, or Python 3.5 or higher

In addition, the following resources are needed:

  • SSH connectivity to the cluster nodes
  • Downloaded Starburst Enterprise platform (SEP) or Trino tar.gz or RPM archive files on the control node, or alternatively a URL to the files that is accessible from all cluster nodes

The control node can be any machine that is configured to fulfill these requirements. For initial testing, you can use your workstation or even a node in the cluster directly. Production usage should follow Ansible best practices and use a dedicated workflow or Ansible orchestration and automation tools such as Ansible Tower or Concord.
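
The Ansible version requirement above can be verified with a small script before you continue. The following is a minimal sketch, not part of Starburst Admin: the function names version_ge, first_version, and check_ansible are hypothetical, and it assumes a sort command that supports the -V option (GNU coreutils or compatible). The first line of the ansible --version output differs between releases, so the sketch only extracts the first dotted version number it finds.

```shell
#!/bin/sh
# Return success when version $1 is greater than or equal to version $2.
# Assumes sort supports -V (version sort), as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the first dotted version number from text on standard input.
first_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Hypothetical helper: report whether Ansible 2.10 or higher is installed.
check_ansible() {
    if ! command -v ansible >/dev/null 2>&1; then
        echo "ansible: not found"
        return 1
    fi
    found=$(ansible --version | head -n 1 | first_version)
    if version_ge "$found" 2.10; then
        echo "ansible $found: ok"
    else
        echo "ansible $found: older than 2.10"
        return 1
    fi
}
```

Source the file on the control node and call check_ansible. A non-zero return status indicates that Ansible is missing or too old.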

Requirements for managed cluster nodes #

Starburst Admin does not manage the cluster hardware, operating system, or package installation. It relies on all nodes in the cluster already existing and fulfilling the requirements detailed in this section.

Typically, provisioning systems such as Puppet, Chef, Terraform, and others are used to prepare the cluster nodes.

All cluster nodes need to fulfill the normal Starburst Enterprise platform (SEP) or Trino requirements:

  • Linux operating system
  • Java runtime environment
  • Python

Memory and hardware resource requirements depend on the planned capacity of the cluster. A few high-level guidelines:

  • Use identical machines for all workers.
  • Start with at least two workers, and scale up as needed.
  • Prefer fewer, more powerful worker nodes over many smaller ones.
  • All nodes need to be located in the same network and be able to communicate via TCP/IP.

Refer to the requirements for the specific SEP or Trino version you install for detailed version information.

Additional requirements:

When using Starburst Admin with an RPM archive:

  • RPM-based Linux distribution
  • rpm command; package managers such as yum, dnf, or others are not required

When using Starburst Admin with a tar.gz archive:

  • GNU tar command
  • unzip command
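
The command requirements listed above can be checked on each node before the first installation. The following is a minimal sketch, not part of Starburst Admin: the function names are hypothetical, and the command names passed in the usage comments (java, python3) are assumptions, since the Python executable may be named python on some systems.

```shell
#!/bin/sh
# Print each given command that is NOT found on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Hypothetical helper: check a list of required commands on this node.
# Usage for a tar.gz installation: check_node_requirements java python3 tar unzip
# Usage for an RPM installation:   check_node_requirements java python3 rpm
check_node_requirements() {
    missing=$(missing_commands "$@")
    if [ -n "$missing" ]; then
        # Unquoted on purpose so the names are printed on a single line.
        echo "missing commands:" $missing
        return 1
    fi
    echo "all required commands found"
}
```

The check only confirms that the commands exist on the node. It does not verify versions, so the version requirements of the SEP or Trino release still need to be checked separately.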

Installing Starburst Admin on the control node #

Starburst Admin is a collection of Ansible playbooks that you install on the control node:

  • Contact Starburst Support for the Starburst Admin tar.gz binary package.
  • Download it onto the control node into any directory, for example ~/tmp.
  • Access the directory in a command line interface.
  • Install the collection with the following command:
    ansible-galaxy collection install starburst-admin-*.tar.gz
  • Confirm the command finishes successfully:
    Starting galaxy collection install process
    Process install dependency map
    Starting collection install process
    Installing 'starburst.admin:1.0.0' to '....'
    starburst.admin:1.0.0 was installed successfully

The collection is installed into ~/.ansible/collections by default. The installation path ~/.ansible/collections/ansible_collections/starburst/admin/files is used for the binaries and all the configuration files for a cluster. Make sure you manage the files in this directory with a version control system.

You can override the installation path with the option -p <installation-path>.
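
Because the files directory moves with the installation path, it can help to compute its location instead of typing it by hand. The following is a minimal sketch under stated assumptions: the function name files_dir and the path /opt/ansible/collections are hypothetical, and the sketch assumes the path given with -p does not itself end in ansible_collections.

```shell
#!/bin/sh
# Print the directory that holds the binaries and configuration files for a
# given collection installation path. Defaults to ~/.ansible/collections.
files_dir() {
    base=${1:-$HOME/.ansible/collections}
    printf '%s\n' "$base/ansible_collections/starburst/admin/files"
}

# Example usage with a hypothetical custom path:
#   ansible-galaxy collection install starburst-admin-*.tar.gz -p /opt/ansible/collections
#   cd "$(files_dir /opt/ansible/collections)"
#   git init    # place the cluster files under version control
```

Git is used here only as one example of a version control system; any other system works the same way.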

If you need to install the collection on multiple control nodes, you can make the binary available at a remote URL:

  • Make the binary available on a server via HTTP, for example,
  • Create a file requirements.yml that includes a link to the binary.
        # Example link to tar.gz package
  • Use the YAML file for the installation:
    ansible-galaxy collection install -r requirements.yml
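
The requirements file for the steps above can be generated with a short script. The following is a minimal sketch: the function name write_requirements and the URL in the usage comment are hypothetical placeholders, and the file content follows the general ansible-galaxy requirements format for a collection referenced by URL. Adjust it if your Ansible version expects a different layout.

```shell
#!/bin/sh
# Write a requirements file that points ansible-galaxy at a collection tarball.
# $1: URL of the tar.gz package
# $2: output file, defaults to requirements.yml
write_requirements() {
    url=$1
    out=${2:-requirements.yml}
    cat > "$out" <<EOF
---
collections:
  - name: $url
    type: url
EOF
}

# Example usage with a hypothetical URL:
#   write_requirements "https://repo.example.com/starburst-admin-1.0.0.tar.gz"
#   ansible-galaxy collection install -r requirements.yml
```

Keeping the requirements file in version control lets every control node install the same package version.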

Next steps #

Now that you have set up the control node and the managed cluster nodes, you can proceed with the initial installation on the cluster.