Migrating to Starburst Enterprise 354-e or higher#

The Starburst Enterprise platform (SEP) 354-e release is the first release using the upstream project after the rename to Trino (formerly PrestoSQL). As a result a number of binaries, paths, properties, and other aspects now use the new name.

Users of SEP versions 350-e and lower must review this migration guide, and take the suggested measures to upgrade to release 354-e or a later release.

You can find more Trino information in the migration blog post on the Trino website. SEP-specific help and details about the relevant changes in SEP are available in the following sections.

General advice#

Upgrading software is always an important step to gain access to new features, bug fixes, performance improvements, and other enhancements of the software.

When choosing your upgrade path and strategy, you must take into account how business critical your deployment of SEP is to your company. It is generally advisable to perform a test upgrade in a non-critical environment, and especially advised when making the complex update from SEP <= 350-e to SEP >= 354-e.

The following is a rough guide to use as a blueprint for your specific upgrade plan:

Step 1: Understand your current usage

  • To understand the scope of the changes, review the release notes for every version between your current release and the target release. This includes both the Trino and the SEP release notes, and matters most if you are upgrading from an older release. Pay special attention to any breaking changes, and to changes in the connectors and other features you actually use.

  • Make sure that your users are using recent client versions. Ideally, move all clients to version 350 before upgrading the cluster. You can check the HTTP request logs for the coordinator to see what client versions are in use.

  • Make sure you understand all usage of JMX-based monitoring, and create an upgrade plan to adjust to the new names.

  • Create an inventory of all custom plugins, including authentication, UDFs, and connector plugins, and plan to update them all to adapt to the new SPI.

Step 2: Prepare

  • Download all required binaries

  • Implement all necessary custom plugin updates

Step 3: Upgrade

  • Upgrade your clusters one at a time, ideally beginning with a test upgrade on a development or staging cluster first.

  • Upgrade all clients including the CLI, JDBC driver, Python, etc.

In the next sections you can learn more details about the changes affecting different users of SEP, and the required steps to adjust:

Platform administrator#

As a platform administrator, you are responsible for installing, maintaining, and updating the SEP clusters for your organization.

The upgrade to 354-e or higher is a larger effort than a typical SEP upgrade. Some of the necessary changes affect all SEP installations, and others depend on your deployment method and the related tooling in use.

The following sections include details for all the required changes. Be sure to work with your data engineers to adapt to any changes required for catalog files.

Finally, your users, the data consumers, are also affected by a number of changes, so you must work with them and prepare them for your rollout of the upgrade to prevent any problems.

Linux operating system and Python#

Before upgrading, review the operating system and Python requirements, to assess whether you need to update.

Java runtime#

SEP has a strict requirement for the Java runtime version, so be sure to include an update of your Java runtime in your upgrade plan. SEP 354-e requires Java 11, patch version 11.0.7 or higher.
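A quick check of the reported runtime version can catch an outdated Java installation before the upgrade. The following is a minimal Python sketch, assuming you pass it the text printed by your Java runtime when asked for its version. The function names are hypothetical, and the check assumes that only Java 11 is acceptable, as stated above.

```python
import re

# Minimum Java 11 patch level stated in this guide for SEP 354-e.
MINIMUM = (11, 0, 7)


def parse_java_version(text):
    """Extract the first dotted version number, such as 11.0.9.1, as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_requirement(text, minimum=MINIMUM):
    """Return True for a Java 11 runtime at or above the minimum patch version."""
    version = parse_java_version(text)
    # Pad short versions such as "11" to three components before comparing.
    padded = version + (0,) * max(0, 3 - len(version))
    return padded[0] == minimum[0] and padded >= minimum
```

For example, `meets_requirement("11.0.6")` returns `False`, so that runtime would need an update first.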

Client tool incompatibility#

The update changes the HTTP client protocol headers to start with X-Trino-, compared to the old X-Presto-. The SEP version 350-e, as well as 354-e and all newer versions, accepts and supports client processes that use either header. As a result you can use any older client tool that supports Presto, or a newer one that supports Trino.

For example, you can use the Trino JDBC driver 354-e, which uses the X-Trino- headers, with SEP 350-e, 354-e, and newer versions. Note that this does not work with older SEP LTS releases such as 345. Therefore an upgrade to 350-e can serve as an intermediate step that allows time for the client tool updates.
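If you maintain a custom HTTP client or a proxy that sets the protocol headers itself, the rename amounts to swapping the header prefix. The following Python sketch illustrates this under the assumption that your headers are held in a plain dictionary. The function name is hypothetical and not part of any SEP or Trino client library.

```python
OLD_PREFIX = "X-Presto-"
NEW_PREFIX = "X-Trino-"


def to_trino_headers(headers):
    """Return a copy of headers with the X-Presto- prefix replaced by X-Trino-."""
    renamed = {}
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower().startswith(OLD_PREFIX.lower()):
            name = NEW_PREFIX + name[len(OLD_PREFIX):]
        renamed[name] = value
    return renamed
```

Headers without the old prefix, such as Accept, pass through unchanged.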

Ideally, you work with your data consumers to update all clients together with the release, or immediately afterwards. This ensures the best compatibility with regard to data types (for example, high precision timestamps) and newer authentication, such as OAuth 2.0 authentication.

Make sure you use the protocol.v1.alternate-header-name property only as a temporary measure to assist in your migration efforts, such as when continuing to operate clusters with an older version of SEP. In addition, use at least the 350-e version of the relevant client tools for best compatibility.

tar.gz archive#

The name of the tar.gz file has changed. The internals, such as the launcher script and folders, remain identical to earlier releases.

RPM archive#

The name of the RPM file has changed. This means that your Linux distribution treats the archive as a completely new and separate application, and no automated upgrade process is possible.

Your manual upgrade and installation process needs to include the following steps:

  • gather all existing configuration files

  • adapt them to the new folder structure

  • remove the old RPM installation

  • install the new RPM

It is best to plan out this approach in detail, and test it on a non-production system to ensure minimal downtime for the eventual production upgrade.

The following RPM specific aspects have changed:

  • new configuration file location /etc/starburst

  • new logging directory /var/log/starburst

  • new service name starburst

See the RPM page for details.
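When adapting configuration files to the new location, it can help to write down the planned moves before touching a system. The following Python sketch only computes the target paths and copies nothing. The old configuration directory is passed in as a parameter because this guide does not name it. The sketch assumes the relative layout below the configuration directory stays the same, which you need to verify against the RPM documentation, and the function name is hypothetical.

```python
from pathlib import PurePosixPath

# New configuration file location named in this guide.
NEW_CONFIG_DIR = PurePosixPath("/etc/starburst")


def plan_config_moves(old_files, old_config_dir):
    """Map each old configuration file path to its place under /etc/starburst."""
    old_root = PurePosixPath(old_config_dir)
    moves = {}
    for path in old_files:
        # relative_to raises ValueError for files outside the old directory.
        relative = PurePosixPath(path).relative_to(old_root)
        moves[str(path)] = str(NEW_CONFIG_DIR / relative)
    return moves
```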

presto-admin#

The Python script collection presto-admin to manage clusters with the RPM archive is not compatible with Trino and SEP 354-e+, and has been deprecated.

Starburst is providing Starburst Admin as a replacement tool for Trino and SEP deployments.

Perform an upgrade with the following high-level steps:

  • Gather all configuration files and settings

  • Adapt them to Starburst Admin configuration files

  • Use Starburst Admin to install a new SEP cluster

  • Upgrade all users and migrate them to the new cluster

  • Decommission the old cluster with presto-admin

More information is available in our dedicated documentation for Starburst Admin.

Docker image#

The identifier of the Docker image changed to starburstdata/starburst-enterprise.

The following container-specific aspects have also changed:

  • new configuration file location /etc/starburst

More information is available in our dedicated documentation for the Docker container.

Kubernetes with Helm#

The wide-ranging changes in the underlying systems and artifacts resulted in a new set of Helm charts that are not fully compatible with older versions. Perform the upgrade with the following high-level steps:

  • Gather all configurations

  • Adapt them to the new setup

  • Create a new configuration set and install on a new SEP cluster

  • Upgrade all users and migrate them to the new cluster

  • Decommission the old cluster

All relevant information about the new setup is available in the Kubernetes with Helm chapter.

Amazon AMI and CFT#

The wide-ranging changes in the underlying systems and artifacts resulted in a new AMI image and set of CFT configurations that are not fully compatible with older versions. Perform the upgrade with the following high-level steps:

  • Gather all configurations, including catalog files

  • Adapt them to the new setup

  • Create a new stack with updated configuration to create a new SEP cluster

  • Upgrade all users and migrate them to the new cluster

  • Decommission the old cluster

All relevant information about the new setup is available in the Amazon Web Services chapter.

Starburst Ranger plugin#

Users of the Starburst Ranger plugin, needed for global access control, are required to run a special upgrade procedure to adapt to the new file names and service names.

Logging#

All packages move from the package name io.prestosql to io.trino. As a result you must update all your logging configuration of relevant nested loggers.
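The logger rename can be applied mechanically to an existing logging configuration. The following is a minimal Python sketch, assuming the configuration uses one logger=LEVEL entry per line, as in a typical log.properties file. The function name is hypothetical.

```python
import re

# Matches the old package prefix only when it starts a logger name.
OLD_PACKAGE = re.compile(r"^(\s*)io\.prestosql(?=[.=\s]|$)")


def migrate_log_properties(text):
    """Rewrite logger names in logging configuration text from io.prestosql to io.trino."""
    migrated = []
    for line in text.splitlines():
        # Leave comments untouched so notes about the old name stay readable.
        if not line.lstrip().startswith("#"):
            line = OLD_PACKAGE.sub(r"\1io.trino", line)
        migrated.append(line)
    return "\n".join(migrated)
```

An entry such as io.prestosql.server=DEBUG becomes io.trino.server=DEBUG, while loggers for other packages are left as they are.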

JMX diagnostics and monitoring#

SEP exposes many metrics for monitoring and diagnostics with JMX. With the upgrade the metric names changed to start with trino instead of presto. Make sure you update all uses of metrics names.

As a temporary workaround, you can configure the base name of the metrics to use the old name with the following property setting in config.properties:

jmx.base-name=presto

Similarly, the metrics for connectors now start with trino.plugin instead of presto.plugin. Again, you might need to update these names in your monitoring system.

Alternatively, you can configure SEP to use the old name. This must be done for each plugin usage individually, since they are using separate class loaders. For example, for the Hive connector, you must add the following configuration to each catalog properties file that uses the Hive connector:

jmx.base-name=presto.plugin.hive
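If you prefer to update your monitoring system rather than keep the old base names, the rename only affects the domain part of each JMX object name. The following Python sketch shows one way to convert a list of monitored names. The function name is hypothetical, and the object names in the usage note are illustrative.

```python
def migrate_mbean_name(object_name):
    """Rename the domain of a JMX object name from presto... to trino..."""
    # The domain is everything before the first colon of an object name.
    domain, separator, properties = object_name.partition(":")
    if domain == "presto" or domain.startswith("presto."):
        domain = "trino" + domain[len("presto"):]
    return domain + separator + properties
```

With this sketch, a name such as presto.execution:name=QueryManager becomes trino.execution:name=QueryManager, and names in unrelated domains such as java.lang are returned unchanged.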

Custom plugins using the SPI#

If you have any custom plugins for SEP, such as connectors or functions, these must be updated by changing the source code of your plugin to adapt to the new SPI. All changes are an easy refactor with an IDE.

The package name is now io.trino.spi, and a few classes are renamed:

  • PrestoException to TrinoException

  • PrestoPrincipal to TrinoPrincipal

  • PrestoWarning to TrinoWarning

There are no functional changes, so you only need to perform the following steps:

  • Update all import statements.

  • Rename the references to the above class names.

  • Rebuild your plugin into a new binary.

  • Replace the old plugin in your deployment with the updated binary.
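The first two steps are usually done with an IDE refactoring. For a quick scripted pass over many source files, the same renames can be sketched in Python as follows. This assumes the old SPI package is io.prestosql.spi, in line with the package rename described in the logging section, and the function name is hypothetical. Review the result and rebuild the plugin afterwards, since a text substitution cannot check that the code compiles.

```python
import re

# Class renames listed in this guide.
CLASS_RENAMES = {
    "PrestoException": "TrinoException",
    "PrestoPrincipal": "TrinoPrincipal",
    "PrestoWarning": "TrinoWarning",
}
CLASS_PATTERN = re.compile(r"\b(" + "|".join(CLASS_RENAMES) + r")\b")


def migrate_plugin_source(java_source):
    """Apply the package and class renames from this guide to Java source text."""
    # Update imports and fully qualified references to the SPI package.
    migrated = re.sub(r"\bio\.prestosql\.spi\b", "io.trino.spi", java_source)
    # Rename the three classes, matching whole identifiers only.
    return CLASS_PATTERN.sub(lambda m: CLASS_RENAMES[m.group(1)], migrated)
```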

Custom authenticators using the SPI#

If you are using any custom authenticators, you must adapt them to SPI changes.

Work with your data engineers#

A number of changes also affect access to the data sources from SEP, and the related catalog files and connectors. Closely collaborate with your data engineers to assess the changes and proceed with the necessary updates.

Work with your data consumers#

The typical requirement for data consumers with this migration is to upgrade client tools, libraries, and drivers.

If this is hard to roll out in your organization, you can support legacy client usage with the protocol.v1.alternate-header-name client protocol header property discussed in a preceding section.

Data engineer#

As a data engineer, you are responsible for the data sources connected to SEP, and any related configuration in catalog files and elsewhere in SEP, security tools, and general aspects such as network performance between SEP and data sources.

The upgrade to 354-e or higher is similar to other SEP upgrades, since most relevant configuration remained unchanged.

The specific changes necessary depend on your data sources and your specific usage, and are detailed in the following sections.

Connectors#

Connectors change regularly with each release, and you must make sure you understand the changes for all connectors you are actively using in your catalogs and clusters.

For example, the Redshift connector changed the JDBC driver it uses, and you therefore must update the connection URL in the relevant catalog files.

The following connectors specifically changed as part of changing to Trino:

Hive connector#

The following properties are renamed and must be updated in your catalog files:

  • hive.hdfs.presto.principal to hive.hdfs.trino.principal

  • hive.hdfs.presto.keytab to hive.hdfs.trino.keytab

Additionally, the JMX MBean name of PrestoS3FileSystem changed to TrinoS3FileSystem.
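The two property renames can be applied to the content of a catalog file with a small script. The following is a minimal Python sketch, assuming one key=value entry per line. The function name is hypothetical.

```python
# Property renames listed in this guide.
HIVE_RENAMES = {
    "hive.hdfs.presto.principal": "hive.hdfs.trino.principal",
    "hive.hdfs.presto.keytab": "hive.hdfs.trino.keytab",
}


def migrate_hive_catalog(text):
    """Rename the two Hive properties in the content of a catalog properties file."""
    migrated = []
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        new_key = HIVE_RENAMES.get(key.strip())
        # Only rewrite lines whose key is one of the renamed properties.
        if separator and new_key is not None:
            line = new_key + separator + value
        migrated.append(line)
    return "\n".join(migrated)
```

All other lines, including comments and unrelated properties, are passed through unchanged.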

Thrift connector#

A number of changes affect users of the Thrift connector, and you must be sure to adapt your configuration files and source code as applicable:

  • Thrift service method names now start with trino.

  • All classes in the Thrift IDL now start with Trino.

  • All configuration properties now start with trino.
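For the configuration properties, the change is a prefix swap on the property keys. The following Python sketch assumes that the old keys started with presto followed by a dot or a hyphen. Confirm this against your own catalog files before relying on it. The function name is hypothetical, and the sketch touches property keys only, so values are left exactly as they are.

```python
import re

# Assumed old key prefix: "presto" followed by "." or "-".
OLD_PREFIX = re.compile(r"^presto(?=[.-])")


def migrate_thrift_catalog(text):
    """Rename property keys that start with presto so that they start with trino."""
    migrated = []
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            line = OLD_PREFIX.sub("trino", key.strip()) + separator + value
        migrated.append(line)
    return "\n".join(migrated)
```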

Catalog properties files#

Apart from the connector-specific changes discussed in the preceding section, catalog properties files can remain unchanged for the upgrade.

Custom connectors using the SPI#

If you are using any custom connectors, you must adapt them to SPI changes.

Custom UDFs using the SPI#

If you are using any custom UDFs, you must adapt them to SPI changes.

Data consumer#

As a data consumer, you are working with the SQL supported by SEP, and use client tools, such as the CLI, to connect to SEP to execute these queries.

The upgrade to 354-e or higher has a limited impact on you, and necessary steps are detailed in the following sections:

SQL#

The 354-e upgrade brings no incompatible changes to the SQL language, SQL functions, and session properties. As a result, all your queries continue to work unchanged.

CLI#

If you use the CLI, you must upgrade to a new version of the CLI to gain access to new features. It uses the new client protocol header, and is only compatible with releases 350-e, 354-e, and newer releases.

With the update to the new CLI, the recommended command changes from presto to trino. The prompt value and the returned version also change to use trino/Trino. Otherwise all behavior remains unchanged.

If you adopt the new command name, you must also update any scripts and other workflows that invoke the CLI.

JDBC driver and dependent applications#

The new JDBC driver uses the new client protocol header, and is only compatible with releases 350-e, 354-e, and newer releases. It is not compatible with older releases.

As a user of the JDBC driver in your custom application, or as the driver in an open source or commercial client application, you must upgrade to the new JDBC driver:

  • Download and install the new JDBC driver. The specific steps vary depending on your client application.

  • Make sure your cluster uses the new client protocol headers by confirming the SEP version with your platform administrator.

  • Update any JDBC connection URL configuration from using jdbc:presto: to jdbc:trino:.

  • Update the classname for the main driver class to io.trino.jdbc.TrinoDriver.
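If connection settings for many applications are kept in configuration files, the last two steps can be scripted. The following Python sketch assumes the settings are available as a dictionary with the hypothetical keys url and driver_class. Adjust both to whatever your application actually uses.

```python
OLD_SCHEME = "jdbc:presto:"
NEW_SCHEME = "jdbc:trino:"
# Main driver class named in this guide.
NEW_DRIVER_CLASS = "io.trino.jdbc.TrinoDriver"


def migrate_jdbc_url(url):
    """Switch a connection URL from the jdbc:presto: scheme to jdbc:trino:."""
    if url.startswith(OLD_SCHEME):
        return NEW_SCHEME + url[len(OLD_SCHEME):]
    return url


def migrate_connection(settings):
    """Return a copy of the settings with the new URL scheme and driver class."""
    migrated = dict(settings)
    migrated["url"] = migrate_jdbc_url(settings["url"])
    migrated["driver_class"] = NEW_DRIVER_CLASS
    return migrated
```

URLs for other drivers do not start with jdbc:presto: and are returned unchanged by migrate_jdbc_url, so only pass settings for SEP connections to migrate_connection.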

The JDBC driver versions for 350-e and earlier are not compatible with 354-e and newer releases. At the same time, the newer driver cannot be used to query older clusters before 350-e.

If you are using the JDBC driver in an application’s source code, you must adjust to the new driver with these additional steps:

  • Rename the Java package for all driver classes to io.trino.jdbc

  • Rename references to the driver classes so that they start with Trino, such as TrinoConnection.

ODBC driver and dependent applications#

Unlike the JDBC driver, the Starburst ODBC driver continues to work with older SEP releases, as well as with the 354-e and later releases. There is no need to upgrade your ODBC driver version.

Other client tools and applications#

If your application uses any other client, such as the open source Go or Python clients, you also must upgrade to a new client and update the configuration accordingly.

The following are a few guidelines:

The following tools are known to explicitly support Trino, and therefore SEP 354-e and newer:

Many other tools work with upgraded drivers and connection details.