Release 423-e LTS (31 Aug 2023)#

Starburst Enterprise platform (SEP) 423-e LTS is the follow-up release to the 423-e STS release and the 413-e LTS release.

This release is a promotion of the original 423-e STS release in August 2023 into a long-term support (LTS) release.

The 423-e release includes all improvements from the following Trino project releases:

It contains all improvements from the Starburst Enterprise releases since 413-e LTS:

Highlights since 413-e#

Added a new Snowflake parallel connector.
Added support for Ceph storage, Dell ECS, and ObjectScale as a storage backend for object storage connectors.
Added support for SCIM user synchronization with Azure Active Directory and Okta.
Added managed statistics support to the Greenplum, DB2, Netezza, Redshift, Synapse, and Stargate connectors.
Added support for AWS Lake Formation resource links.

Breaking changes since 413-e#

Removed legacy Teradata query pass-through in favor of the query table function. The deprecated.teradata.query-pass-through.enabled property must be removed from the cluster configuration or the cluster does not start. Instead of legacy pass-through, use the query function.
The Trino S3 file system class was renamed from io.trino.plugin.hive.s3.TrinoS3FileSystem to io.trino.hdfs.s3.TrinoS3FileSystem. If you are using Ranger Audit to S3, update the property value in ranger-audit.xml.
On a cluster with built-in access control, the sysadmin role is no longer assumed automatically when querying views in DEFINER mode whose owner has the sysadmin role. For views that are no longer accessible due to this change, the privileges of the view owner must be adjusted to grant access to the underlying tables.
The preferredWritePartitioningMinNumberOfPartitions cache service configuration property and its underlying functionality have been removed. You must remove this configuration property or table scan redirection fails.
The following configuration properties and command line options have been removed from the cache service and the cache service CLI:
- The unpartitioned.writer-count, unpartitioned.scale-writers, unpartitioned.writer-min-size, partitioned.use-preferred-write-partitioning, partitioned.writer-count, partitioned.scale-writers, and partitioned.writer-min-size configuration properties have been removed from the cache service. You must remove these configuration properties or the cluster fails to start.
- The --use-server-import-config, --use-preferred-write-partitioning, --writer-count, --scale-writers, and --writer-min-size command line options have been removed from the cache service CLI.
- The defaultUnpartitionedImportConfig, defaultPartitionedImportConfig, importConfig in elements of rules, and incrementalImportConfig in elements of rules properties are no longer recognized in the rules.json file.
The following changes have been made to the LDAP group provider configuration properties related to caching:
- The ldap.cache-ttl property has been renamed to ldap.cache.ttl. You must update your configuration settings to reflect this change.
- The ldap.negative-cache-ttl property has been removed. You must remove this property or the cluster fails to start. Negative entries are now cached as normal entries.
URL extraction functions have been updated to reject invalid URIs rather than returning NULL. Queries that rely on these functions returning NULL must be updated accordingly.
The optimizer.use-mark-distinct configuration property has been removed. You must remove this property from the cluster configuration and use the optimizer.mark-distinct-strategy property instead, or the cluster fails to start.
The parquet.optimized-writer.enabled and parquet.optimized-writer.validation-percentage configuration properties have been removed. You must remove these properties or the cluster fails to start.
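Before upgrading, it can help to scan existing configuration files for the removed and renamed properties listed above. The following is a minimal sketch, not an official Starburst tool. The function name `find_stale_properties` and the sample input are hypothetical, and the sketch assumes the files use plain `key=value` properties syntax. It does not cover preferredWritePartitioningMinNumberOfPartitions or the rules.json keys, which must be checked separately.

```python
import re

# Properties removed in this release, with the advice given in the notes above.
REMOVED = {
    "deprecated.teradata.query-pass-through.enabled": "use the query table function",
    "ldap.negative-cache-ttl": "remove; negative entries are cached as normal entries",
    "optimizer.use-mark-distinct": "use optimizer.mark-distinct-strategy",
    "parquet.optimized-writer.enabled": "remove",
    "parquet.optimized-writer.validation-percentage": "remove",
}

# Cache service properties removed in this release.
CACHE_SERVICE_REMOVED = (
    "unpartitioned.writer-count",
    "unpartitioned.scale-writers",
    "unpartitioned.writer-min-size",
    "partitioned.use-preferred-write-partitioning",
    "partitioned.writer-count",
    "partitioned.scale-writers",
    "partitioned.writer-min-size",
)
for name in CACHE_SERVICE_REMOVED:
    REMOVED[name] = "remove from the cache service configuration"

# Properties renamed in this release: old name -> new name.
RENAMED = {"ldap.cache-ttl": "ldap.cache.ttl"}


def find_stale_properties(text):
    """Return (line_number, key, advice) for each removed or renamed key in text."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped[0] in "#!":
            continue
        # The key ends at the first '=', ':' or whitespace character.
        key = re.split(r"[=:\s]", stripped, maxsplit=1)[0]
        if key in REMOVED:
            findings.append((number, key, REMOVED[key]))
        elif key in RENAMED:
            findings.append((number, key, "rename to " + RENAMED[key]))
    return findings


# Hypothetical configuration content, for illustration only.
sample = """# example properties
ldap.cache-ttl=5m
optimizer.use-mark-distinct = true
http-server.http.port=8080
"""

for number, key, advice in find_stale_properties(sample):
    print(f"line {number}: {key}: {advice}")
```

To check a real file, read its content into a string and pass it to the function. Each reported line should be handled before the upgrade, because most of these properties prevent the cluster from starting.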

423-e initial changes#

General#

Added support for Starburst Warp Speed to run on M6gd instances. Node sizes must be m6gd.4xlarge or larger.
Added read-only support for Lake Formation cross-account resource sharing.
Removed the troubleshooting.max-capture-duration configuration property.
Fixed Starburst Warp Speed loading issue in GKE and AKS deployments.
Fixed issue that prevented the cluster from starting with Starburst Warp Speed catalogs in Azure or Google Cloud.
Fixed issues when querying Hive legacy views in Starburst Warp Speed.
Fixed run and troubleshoot failing when recordings are larger than 16MB.

Security#

Added support for SCIM provisioning with Okta.
Added a native Starburst group provider caching service for SCIM.

Cosmos DB connector#

Added support for case-insensitive name matching for schemas and tables.

Iceberg connector#

Added support for materialized views on Starburst Warp Speed catalogs.

Snowflake connector#

Updated connectors to use version 3.13.33 of the Snowflake JDBC driver.

423-e.1 changes (31 Aug 2023)#

Changed LDAP and Starburst user and group caching to be enabled by default.
Upgraded Prometheus JMX Exporter to 0.20.0.
Fixed failure when reading Decimal128 in MongoDB connector.
Fixed possible failures when a query is cancelled. Applies to the snowflake-parallel connector.
Fixed parallel reads of SQL Server tables with non-clustered index.

423-e.2 changes (19 Sep 2023)#

Fixed issue with Query Editor not running queries to completion, which may have prevented results from being visible.
Fixed failure when performing an OUTER JOIN involving geospatial functions in the JOIN clause.
Fixed potential unnecessary autoscaling triggers when Warp Speed is enabled.
Fixed download and troubleshoot options in the query editor when BIAC is enabled.

423-e.3 changes (2 Oct 2023)#

Fixed license issue in DynamoDB connector.
Fixed resolving Lake Formation table metadata when using a database resource link.
Fixed run and download when browser policies disallow downloading larger result sets.
Fixed possible query failure when joins are pushed down to Oracle.
Fixed issue where the catalog name was included twice when creating an access policy for a catalog session property (BIAC).
Added access control check to redirected tables when getting comments.
Fixed an edge case that might result in a correctness issue when some data is indexed and cached by Warp Speed.

423-e.4 changes (18 Oct 2023)#

Fixed issue for Avro native readers and writers in situations where the provided schema contained camelCase in fields.
Fixed error in Starburst Warp Speed when reading Delta tables.
Fixed performance issue when reading with the native CSV reader on Hive.
Fixed instant file name path retrieval in Hudi Active Timeline.
Improved performance for filtering catalogs, schemas, tables, and columns in BIAC.
Fixed MongoDB mixed case schemas with custom roles.
Fixed performance issue with HMS calls by disabling batch fetch tables and views.
Fixed inferring type of decimals with leading zeros in MongoDB.
Fixed JavaScript policy evaluation in Ranger and Privacera.
Remediated CVE-2018-20839.

423-e.5 changes (31 Oct 2023)#

Fixed potential Starburst Warp Speed crash that can happen when more than a single Starburst Warp Speed catalog is used.
Fixed incorrect column statistics for Parquet file format in manifest files.
Cast char fields, if necessary, to varchar type in Hive view translations.
Fixed incorrect results for queries involving an aggregation in a correlated subquery.

423-e.6 changes (13 Nov 2023)#

Fixed incorrect results for queries involving ORDER BY and window functions with ordered frames.
Masked additional sensitive values in log files.
Fixed incorrect results in MongoDB when a query contains several != or NOT IN predicates.
Fixed JavaScript policy evaluation in Ranger and Privacera.
Improved support for concurrent updates of table statistics in Glue.
Added support for RENAME SCHEMA and RENAME TABLE when snowflake.database-prefix-for-schema.enabled=true.
Remediated CVE-2023-39410.

423-e.7 changes (27 Nov 2023)#

Fixed possible JVM crash when reading short decimal columns in Parquet files created by Impala (Hive, Hudi, Delta, Iceberg).
Remediated CVE-2023-41900.
Granting execution rights on a non-qualified function no longer makes all catalogs visible.
Fixed inability to create Data Domains and Data Products when Global Ranger Access Control is enabled.

423-e.8 changes (21 Dec 2023)#

Warning

This release upgrades Ranger to 2.4. To avoid a breaking change, follow these steps. Before upgrading to this version of SEP, record your current version of Ranger. For Kubernetes deployments, the version is found in your starburst-ranger values.yaml file as admin.image.tag and usersync.image.tag. While upgrading SEP, before deploying the update to your cluster nodes, you must revert the Ranger 2.4 version tag back to the previous version.
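Recording the current Ranger tags can be scripted. The following is a minimal sketch that assumes the starburst-ranger values.yaml file has already been parsed into a nested dictionary, and that admin.image.tag and usersync.image.tag correspond to the nested keys admin, image, and tag. The function name `ranger_image_tags` and the tag values are hypothetical placeholders.

```python
def ranger_image_tags(values):
    """Return the admin and usersync image tags from parsed starburst-ranger values."""
    tags = {}
    for component in ("admin", "usersync"):
        # A missing section or key yields None instead of raising an error.
        tag = values.get(component, {}).get("image", {}).get("tag")
        tags[component + ".image.tag"] = tag
    return tags


# Hypothetical parsed values; the tag strings are placeholders, not real versions.
current_values = {
    "admin": {"image": {"tag": "example-previous-tag"}},
    "usersync": {"image": {"tag": "example-previous-tag"}},
}

for key, tag in ranger_image_tags(current_values).items():
    print(f"{key}: {tag}")
```

Keep the recorded tags until the upgrade is complete, so that they can be restored in the values file before the update is deployed to the cluster nodes.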

Improved query planning time on Hive tables without statistics generated.
Upgraded JDK to 17.0.8 to fix worker JVM crashes.
Fixed long query planning times for queries with many local exchanges.
Fixed query failure when reading Parquet column index for timestamped columns in Hive, Delta, Iceberg, and Hudi tables.
Fixed a potential Starburst Warp Speed crash when creating Lucene text indexes.
Fixed reading JSON columns with more than 128 keys.
Fixed incorrect results for LIKE with some strings containing repeated substrings.
Fixed deletes in Delta tables with partitions with special characters.
Fixed coordinator memory leak.

423-e.9 changes (18 Jan 2024)#

Fixed a potential issue with SEP inadvertently changing users’ passwords in Ranger when used with Ranger Admin 2.4.0.
Fixed incorrect results on Parquet files containing page indexes when the query has filters on multiple columns in Hive, Delta, and Hudi tables.

423-e.10 changes (14 Feb 2024)#

Fixed query failure when reading array columns.
Fixed a bug where an entire directory is skipped from schema discovery if at least one file matched the excludePatterns option.
Fixed out-of-bound (OOB) telemetry null pointer exception in parallel Snowflake connector.
Fixed complex expression pushdown in the Redshift connector.
Fixed a bug where query history displayed queries of another user.

423-e.11 changes (11 Mar 2024)#

Updated Kubernetes external secret operator.
Fixed UI authentication for large authentication tokens.
Fixed incorrect results for DATETIMEOFFSET values before the year 1400.
Fixed query failure when using char types with the reverse() function.
Fixed schema, table, and function visibility in BIAC filtering.
Fixed a bug where column statistics created in SEP would not be visible in Hive when using CDP 7.

423-e.12 changes (28 Mar 2024)#

Fixed an issue which caused the sync_partition_metadata operation to fail when partition paths had case changes.
Restored support for SymlinkTextInputFormat for text formats.
Fixed reading Delta Lake files with encoded characters on Azure.
Fixed failure when reading certain Avro data with UNION data types.
Fixed reading large SequenceFile, RCFile, or Avro files.

423-e.13 changes (17 Apr 2024)#

Fixed possible worker crashes when running aggregation queries due to an out-of-memory error.
Fixed incorrect results when querying a table being modified concurrently.
Fixed handling of union options in Hive and Avro to allow coercion to a single type.

423-e.14 changes (20 May 2024)#

Fixed potential query failure due to worker nodes running out of memory in concurrent scenarios.
Fixed correctness bug in constant literal distinct aggregation.
Fixed Prometheus whiteListObjectNames being overwritten when KEDA is enabled.

423-e.15 changes (14 Jun 2024)#

Fixed potential failure when reading ORC files larger than 2GB.
Fixed startup failure when fault-tolerant execution is enabled with Google Cloud Storage exchange.
Fixed potential loss of a query completion event when multiple queries fail at the same time.
Fixed underestimation of memory usage when writing strings to Parquet files.
Fixed potential correctness issue on receivers refresh that could cause queries to hang.
Backported IMDSv2 service metadata access.

423-e.16 changes (28 Jun 2024)#

Fixed incorrect results when specifying a value for the cassandra.partition-size-for-batch-select configuration property.
Fixed failure when writing to tables with Iceberg VARBINARY values.

423-e.17 changes (11 Jul 2024)#

Added encoding to error code in OAuth2 callback handler.
Fixed reading empty files from S3 and GCS.
Fixed issue syncing partition metadata which could cause data deletion.