Release 423-e LTS (31 Aug 2023)#
Starburst Enterprise platform (SEP) 423-e LTS is the follow-up release to the 423-e STS release and the 413-e LTS release.
This release is a promotion of the original 423-e STS release from August 2023 to a long-term support (LTS) release.
The 423-e release includes all improvements from the following Trino project releases:
It contains all improvements from the Starburst Enterprise releases since 413-e LTS:
Highlights since 413-e#
Added a new Snowflake parallel connector.
Added support for Ceph storage, Dell ECS, and ObjectScale as a storage backend for object storage connectors.
Added support for SCIM user synchronization with Azure Active Directory and Okta.
Added managed statistics support to the Greenplum, DB2, Netezza, Redshift, Synapse, and Stargate connectors.
Added support for AWS Lake Formation resource links.
Breaking changes since 413-e#
Removed legacy Teradata query pass-through in favor of the query table function. The deprecated.teradata.query-pass-through.enabled property must be removed from the cluster configuration or the cluster does not start. Instead of legacy pass-through, use the query function.
The Trino S3 file system class was renamed from io.trino.plugin.hive.s3.TrinoS3FileSystem to io.trino.hdfs.s3.TrinoS3FileSystem. If you are using Ranger Audit to S3, update the property value in ranger-audit.xml.
On a cluster with built-in access control, the sysadmin role is no longer assumed automatically when querying views in DEFINER mode whose owner has the sysadmin role. For views that are no longer accessible due to this change, the privileges of the view owner must be adjusted to grant access to the underlying tables.
The preferredWritePartitioningMinNumberOfPartitions cache service configuration property and its underlying functionality have been removed. You must remove this configuration property or table scan redirection fails.
The following configuration properties and command line options have been removed from the cache service and the cache service CLI:
The unpartitioned.writer-count, unpartitioned.scale-writers, unpartitioned.writer-min-size, partitioned.use-preferred-write-partitioning, partitioned.writer-count, partitioned.scale-writers, and partitioned.writer-min-size configuration properties have been removed from the cache service. You must remove these configuration properties or the cluster fails to start.
The --use-server-import-config, --use-preferred-write-partitioning, --writer-count, --scale-writers, and --writer-min-size command line options have been removed from the cache service CLI.
The defaultUnpartitionedImportConfig, defaultPartitionedImportConfig, importConfig in elements of rules, and incrementalImportConfig in elements of rules properties are no longer recognized in the rules.json file.
The following changes have been made to the LDAP group provider configuration properties related to caching:
The ldap.cache-ttl property has been renamed to ldap.cache.ttl. You must update your configuration settings to reflect this change.
The ldap.negative-cache-ttl property has been removed. You must remove this property or the cluster fails to start. Negative entries are now cached as normal entries.
URL extraction functions have been updated to reject invalid URIs rather than returning NULL. Queries that rely on these functions returning NULL must be updated accordingly.
The optimizer.use-mark-distinct configuration property has been removed. You must remove this property from cluster configuration in favor of the optimizer.mark-distinct-strategy property or the cluster fails to start.
The parquet.optimized-writer.enabled and parquet.optimized-writer.validation-percentage configuration properties have been removed. You must remove these properties or the cluster fails to start.
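Several of the breaking changes above stop the cluster from starting if a removed property is still present. Before upgrading, it may help to scan configuration files for those names. The following is a minimal sketch, not a Starburst tool: the helper names `parse_properties` and `find_breaking_properties` are hypothetical, the parser handles only plain `key=value` lines, and the property names are copied from the list above. The cache service properties are left out here and could be added to the same set when checking cache service configuration.

```python
# Hypothetical pre-upgrade check: flags properties that the 423-e breaking
# changes describe as removed or renamed. Not part of SEP.

# Properties the release notes say must be removed.
REMOVED = {
    "deprecated.teradata.query-pass-through.enabled",
    "ldap.negative-cache-ttl",
    "optimizer.use-mark-distinct",
    "parquet.optimized-writer.enabled",
    "parquet.optimized-writer.validation-percentage",
}

# Properties the release notes say were renamed (old name -> new name).
RENAMED = {"ldap.cache-ttl": "ldap.cache.ttl"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def find_breaking_properties(text):
    """Return (property, advice) pairs for entries that need attention."""
    findings = []
    for key in parse_properties(text):
        if key in REMOVED:
            findings.append((key, "remove"))
        elif key in RENAMED:
            findings.append((key, "rename to " + RENAMED[key]))
    return findings
```

Passing the text of a properties file to `find_breaking_properties` returns an empty list when none of the listed names are present. A file that sets `optimizer.use-mark-distinct` is flagged for removal only; choosing a value for `optimizer.mark-distinct-strategy` remains a manual decision.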
423-e initial changes#
General#
Added support for Starburst Warp Speed to run on M6gd instances. Node sizes must be m6gd.4xlarge or larger.
Added read-only support for Lake Formation cross-account resource sharing.
Removed the troubleshooting.max-capture-duration configuration property.
Fixed Starburst Warp Speed loading issue in GKE and AKS deployments.
Fixed issue that prevented the cluster from starting with Starburst Warp Speed catalogs in Azure or Google Cloud.
Fixed issues when querying Hive legacy views in Starburst Warp Speed.
Fixed run and troubleshoot failing when recordings are larger than 16MB.
Security#
Added support for SCIM provisioning with Okta.
Added a native Starburst group provider caching service for SCIM.
Cosmos DB connector#
Added support for case-insensitive name matching for schemas and tables.
Iceberg connector#
Added support for materialized views on Starburst Warp Speed catalogs.
Snowflake connector#
Updated connectors to use version 3.13.33 of the Snowflake JDBC driver.
423-e.1 changes (31 Aug 2023)#
Changed LDAP and Starburst user and group caching to be enabled by default.
Upgraded Prometheus JMX Exporter to 0.20.0.
Fixed failure when reading Decimal128 in MongoDB connector.
Fixed possible failures when a query is cancelled. Applies to the snowflake-parallel connector.
Fixed parallel reads of SQL Server tables with non-clustered index.
423-e.2 changes (19 Sep 2023)#
Fixed issue with the query editor not running queries to completion, which may have prevented results from being visible.
Fixed failure when performing an OUTER JOIN involving geospatial functions in the JOIN clause.
Fixed potential unnecessary autoscaling triggers when Warp Speed is enabled.
Fixed download and troubleshoot options in the query editor when BIAC is enabled.
423-e.3 changes (2 Oct 2023)#
Fixed license issue in DynamoDB connector.
Fixed resolving LF table metadata when using database resource link.
Fixed run and download when browser policies disallow downloading larger result sets.
Fixed possible query failure when joins are pushed down to Oracle.
Fixed issue where the catalog name was included twice when creating an access policy for a catalog session property (BIAC).
Added access control check to redirected tables when getting comments.
Fixed an edge case that might result in a correctness issue when some data is indexed and cached by Warp Speed.
423-e.4 changes (18 Oct 2023)#
Fixed issue for Avro native readers and writers in situations where the provided schema contained camelCase in fields.
Fixed error in Starburst Warp Speed when reading Delta tables.
Fixed performance issue when reading with the native CSV reader on Hive.
Fixed instant file name path retrieval in Hudi Active Timeline.
Improved performance for filtering catalogs, schemas, tables, and columns in BIAC.
Fixed MongoDB mixed case schemas with custom roles.
Fixed performance issue with HMS calls by disabling batch fetch tables and views.
Fixed inferring type of decimals with leading zeros in MongoDB.
Fixed JavaScript policy evaluation in Ranger and Privacera.
Remediated CVE-2018-20839.
423-e.5 changes (31 Oct 2023)#
Fixed potential Starburst Warp Speed crash that can happen when more than a single Starburst Warp Speed catalog is used.
Fixed incorrect column statistics for Parquet file format in manifest files.
Cast char fields, if necessary, to varchar type in Hive view translations.
Fixed incorrect results for queries involving an aggregation in a correlated subquery.
423-e.6 changes (13 Nov 2023)#
Fixed incorrect results for queries involving ORDER BY and window functions with ordered frames.
Masked additional sensitive values in log files.
Fixed incorrect results in MongoDB when a query contains several != or NOT IN predicates.
Fixed JavaScript policy evaluation in Ranger and Privacera.
Improved support for concurrent updates of table statistics in Glue.
Support RENAME SCHEMA and RENAME TABLE when snowflake.database-prefix-for-schema.enabled=true.
Remediated CVE-2023-39410.
423-e.7 changes (27 Nov 2023)#
Fixed possible JVM crash when reading short decimal columns in Parquet files created by Impala (Hive, Hudi, Delta, Iceberg).
Remediated CVE-2023-41900.
Granting execution rights on a non-qualified function no longer makes all catalogs visible.
Fixed inability to create Data Domains and Data Products when Global Ranger Access Control is enabled.
423-e.8 changes (21 Dec 2023)#
Warning
This release upgrades Ranger to 2.4. To avoid a breaking change, follow these
steps. Before upgrading to this version of SEP, record your current version of
Ranger (for k8s deployments, found in your starburst-ranger values.yaml file as
admin.image.tag and usersync.image.tag). While upgrading SEP, before deploying
the update to your cluster nodes, you must revert the Ranger 2.4 version tag to
the previous version.
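Recording the current Ranger tags can be scripted. The sketch below is an illustration under stated assumptions, not a supported procedure: `record_ranger_tags` is a hypothetical helper, and it expects the values.yaml content to be already loaded into a nested Python mapping (for example with a YAML library of your choice). It reads only the two keys named in the warning.

```python
# Hypothetical helper: collects the Ranger image tags named in the warning
# (admin.image.tag and usersync.image.tag) from an already-parsed values.yaml.

def record_ranger_tags(values):
    """Return the two Ranger image tags, or None where a tag is not set."""
    tags = {}
    for component in ("admin", "usersync"):
        # Missing sections fall back to empty mappings, so absent keys give None.
        image = (values.get(component) or {}).get("image") or {}
        tags[component + ".image.tag"] = image.get("tag")
    return tags
```

Keeping the returned mapping alongside your upgrade notes gives you the values to restore when you revert the version tag during the upgrade.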
Improved query planning time on Hive tables without statistics generated.
Upgraded JDK to 17.0.8 to fix worker JVM crashes.
Fixed long query planning times for queries with many local exchanges.
Fixed query failure when reading the Parquet column index for timestamp columns in Hive, Delta, Iceberg, and Hudi tables.
Fixed a potential Starburst Warp Speed crash when creating Lucene text indexes.
Fixed reading JSON columns with more than 128 keys.
Fixed incorrect results for LIKE with some strings containing repeated substrings.
Fixed deletes in Delta tables with partitions with special characters.
Fixed coordinator memory leak.
423-e.9 changes (18 Jan 2024)#
Fixed a potential issue with SEP inadvertently changing users’ passwords in Ranger when used with Ranger Admin 2.4.0.
Fixed incorrect results on Parquet files containing page indexes when the query has filters on multiple columns in Hive, Delta, and Hudi tables.
423-e.10 changes (14 Feb 2024)#
Fixed query failure when reading array columns.
Fixed a bug where an entire directory is skipped from schema discovery if at least one file matched the excludePatterns option.
Fixed out-of-bound (OOB) telemetry null pointer exception in parallel Snowflake connector.
Fixed complex expression pushdown in the Redshift connector.
Fixed a bug where query history displayed queries of another user.
423-e.11 changes (11 Mar 2024)#
Updated Kubernetes external secret operator.
Fixed UI authentication for large authentication tokens.
Fixed incorrect results for DATETIMEOFFSET values before the year 1400.
Fixed query failure when using char types with the reverse() function.
Fixed schema, table, and function visibility in BIAC filtering.
Fixed a bug where column statistics created in SEP would not be visible in Hive when using CDP 7.
423-e.12 changes (28 Mar 2024)#
Fixed an issue which caused the sync_partition_metadata operation to fail when partition paths had case changes.
Restored support for SymlinkTextInputFormat for text formats.
Fixed reading Delta Lake files with encoded characters on Azure.
Fixed failure when reading certain Avro data with UNION data types.
Fixed reading large SequenceFile, RCFile, or Avro files.
423-e.13 changes (17 Apr 2024)#
Fixed possible worker crashes when running aggregation queries due to out-of-memory error.
Fixed incorrect results when querying a table being modified concurrently.
Fixed handling of union options in Hive and Avro to allow coercion to a single type.
423-e.14 changes (20 May 2024)#
Fixed potential query failure due to worker nodes running out of memory in concurrent scenarios.
Fixed correctness bug in constant literal distinct aggregation.
Fixed Prometheus whiteListObjectNames being overwritten when KEDA is enabled.
423-e.15 changes (14 Jun 2024)#
Fixed potential failure when reading ORC files larger than 2GB.
Fixed startup failure when fault-tolerant execution is enabled with Google Cloud Storage exchange.
Fixed potential loss of a query completion event when multiple queries fail at the same time.
Fixed underestimation of memory usage when writing strings to Parquet files.
Fixed potential correctness issue on receivers refresh that could cause query hanging. Applies to the Teradata Direct connector.
Backported IMDSv2 service metadata access.
423-e.16 changes (28 Jun 2024)#
Fixed incorrect results when specifying a value for the cassandra.partition-size-for-batch-select configuration property.
Fixed failure when writing to tables with Iceberg VARBINARY values.
423-e.17 changes (11 Jul 2024)#
Added encoding to error code in OAuth2 callback handler.
Fixed reading empty files from S3 and GCS.
Fixed issue syncing partition metadata which could cause data deletion.
423-e.18 was skipped.
423-e.19 changes (14 Aug 2024)#
Fixed optimizer timeout for certain queries involving aggregations and CASE expressions.
Fixed failure when adding new columns with a decimal type.
Fixed failure to read Hive tables migrated to Iceberg with Apache Spark.
Fixed issue that caused the error ‘Multiple masks on a single column are not supported’ to occur unintentionally.