Release 429-e LTS (29 Nov 2023)#
Starburst Enterprise platform (SEP) 429-e LTS is the follow-up release to the 429-e STS release and the 423-e LTS release.
This release is a promotion of the original 429-e STS release in November 2023 into a long-term support (LTS) release.
The 429-e release includes all improvements from the following Trino releases:
It contains all improvements from the Starburst Enterprise releases since 423-e LTS:
Highlights since 423-e#
Added support for PyStarburst.
Added read-only public preview support for Unity Catalog as a metastore.
Added support for credential vending for reading AWS Lake Formation tables.
Added support for CREATE OR REPLACE TABLE statements in Delta Lake.
Added predicate pushdown support to the MongoDB connector.
Breaking changes#
The SEP backend service has been updated to require PostgreSQL 12.0+ when using PostgreSQL as the underlying RDBMS.
TIMESTAMP type mapping between MySQL and Trino is no longer TIMESTAMP to TIMESTAMP. The new conversion is MySQL TIMESTAMP to Trino TIMESTAMP WITH TIME ZONE. Depending on the query, mapping from MySQL TIMESTAMP to Trino TIMESTAMP may result in an error message.
Privileged access to the attached storage on nodes is no longer required for Starburst Warp Speed cluster configuration in Helm deployments. Existing cluster configurations must be updated, with EKS deployments requiring the addition of a bootstrap script. Review and follow the considerations for your platform in the Starburst Warp Speed documentation.
The deprecated.hive.metastore.glue-read-properties-based-column-statistics Hive Metastore configuration property and underlying functionality has been removed. You must remove this configuration property or the cluster fails to start.
The updated base Docker image for SEP no longer includes curl, vi, nano, sed, awk, grep, and other popular command line tools. Starburst recommends using an init container with a base image that includes your needed command line tools. Guidance on using init containers and selecting suitable base images can be found in our init container documentation.
A new autoConfigure property was added to the Starburst Warp Speed Helm chart which defaults to false. Starburst Warp Speed deployments on AKS and GKE upgrading from 426-e that have already been reconfigured to use the filesystem instead of privileged mode must set this property to true or the filesystem is not created. Review and follow the migration guide for detailed instructions for your cloud platform.
SEP 427-e uses a base system image that does not contain a system-wide trust store. Trusted, self-signed certificates must now be added to the Java distribution CA certificates located under $JAVA_HOME/lib/security/cacerts.
The legacy parse-decimal-literals-as-double configuration property has been removed. Clusters that use this property must have it removed from configuration or the cluster does not start.
The following deprecated task writer configuration properties have been removed:
task.writer-count, replaced by prop-task-min-writer-count.
task.partitioned-writer-count, replaced by prop-task-max-writer-count.
task.scale-writers.max-writer-count, replaced by prop-task-max-writer-count.
writer-min-size, replaced by writer-scaling-min-data-processed.
You must remove these properties from the cluster configuration and replace them with these replacement properties, or the cluster does not start.
The Snowflake distributed connector is now deprecated and is planned to be removed in a future SEP release, in favor of the improved Snowflake parallel connector. Existing catalogs that use the Snowflake distributed connector must be migrated to the Snowflake parallel connector.
The RPM package service daemon script is now deprecated and is planned to be removed in a future SEP release. Configurations that rely on this script must be updated to use the systemctl daemon script instead.
As of the 429-e release, privileges allowing execution of table functions such as query must be qualified with a schema. Privileges for table function execution that are still not qualified with a schema result in an Access Denied error.
The legacy Parquet reader has been removed from the Hive, Hudi, Delta Lake, and Iceberg connectors. The parquet.optimized-reader.enabled and parquet.optimized-nested-reader.enabled catalog configuration properties must be removed from your catalog configurations or the cluster does not start.
The legacy Hive readers and writers are removed in Trino, as well as other deprecated features. The following catalog configuration properties and their respective session properties have been removed:
*.native-reader.enabled and *_native_reader_enabled
*.native-writer.enabled and *_native_writer_enabled
hive.s3select* and s3_select_pushdown_enabled
hive.optimize-symlink-listing and optimize_symlink_listing
You must remove these properties from your catalog configurations or the cluster does not start.
Trino 429 removed differentiation between function types in file-based access control rules. Any rules that rely on function type must be updated accordingly.
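Several of the breaking changes above prevent the cluster from starting if a removed property is still present. The following is a minimal sketch, not an official Starburst tool, of a pre-upgrade check that scans the text of a properties file for those names. The function name find_removed_properties and the sample input are hypothetical; the property names are copied from the list above, and treating the * entries as shell-style wildcards is an assumption. It also assumes simple key=value lines.

```python
from fnmatch import fnmatchcase

# Exact property names copied from the breaking changes above.
REMOVED_EXACT = {
    "deprecated.hive.metastore.glue-read-properties-based-column-statistics",
    "parse-decimal-literals-as-double",
    "task.writer-count",
    "task.partitioned-writer-count",
    "task.scale-writers.max-writer-count",
    "writer-min-size",
    "parquet.optimized-reader.enabled",
    "parquet.optimized-nested-reader.enabled",
    "hive.optimize-symlink-listing",
}

# Wildcard entries from the list above, interpreted as shell-style patterns
# (an assumption made for this sketch).
REMOVED_PATTERNS = (
    "*.native-reader.enabled",
    "*.native-writer.enabled",
    "hive.s3select*",
)


def find_removed_properties(text):
    """Return (line_number, key) pairs for removed properties found in text."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, comments, and lines that are not key=value pairs.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key = stripped.split("=", 1)[0].strip()
        if key in REMOVED_EXACT or any(
            fnmatchcase(key, pattern) for pattern in REMOVED_PATTERNS
        ):
            findings.append((number, key))
    return findings


# Hypothetical file content used only to illustrate the check.
sample = """\
task.writer-count=4
# writer-min-size=32MB
hive.s3select-pushdown.enabled=true
query.max-memory=50GB
"""

for number, key in find_removed_properties(sample):
    print(f"line {number}: remove or replace {key}")
```

The check only reports names; which replacement to use, if any, still has to be taken from the list above.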
429-e initial changes#
General#
Added support for publishing data products that contain decimal literals.
Updated usage metrics to upload data collected between previous upload and the coordinator shutdown or restart.
Fixed issue that prevented the Run and troubleshoot option in the query editor from working when built-in access control is enabled.
Security#
Added session logout to OAuth 2.0 providers when logging out from the SEP web UI.
Changed built-in functions to be qualified under the system.builtin schema. No access control privileges are necessary to grant access to these basic, non user-defined functions.
Fixed issue that prevented tables and columns inside information_schema from being displayed when built-in access control is used.
Fixed JavaScript policy evaluation in Ranger and Privacera.
Hive connector#
Added support for flushing the filesystem cache for tables with the flush_filesystem_cache system procedure.
Delta Lake connector#
Added support for CREATE OR REPLACE TABLE statements.
MongoDB connector#
Added predicate pushdown support.
Snowflake connector#
Added support for RENAME SCHEMA and RENAME TABLE when the snowflake.database-prefix-for-schema.enabled configuration property is set to true.
Updated connectors to use fully parallel mode by default for more query shapes.
SQL Server connector#
Added the sqlserver.database-prefix-for-schema.enabled catalog configuration property that allows SQL Server catalogs to access multiple databases.
429-e.0 changes (29 Nov 2023)#
Improved support for concurrent updates of table statistics in Glue.
Added masking for additional sensitive values in log files.
Added casting of char fields, if necessary, to varchar type in Hive view translations.
Added support for RENAME SCHEMA and RENAME TABLE when the snowflake.database-prefix-for-schema.enabled property is set to true.
Remediated CVE-2023-41900.
Fixed incorrect results for queries involving an aggregation in a correlated subquery.
Fixed incorrect results for queries involving ORDER BY and window functions with ordered frames.
Fixed launcher start command not working with default directories.
Fixed possible JVM crash when reading short decimal columns in Parquet files created by Impala. Applies to the Hive, Hudi, Delta Lake, and Iceberg connectors.
Fixed incorrect results when a query contains several != or NOT IN predicates in MongoDB catalogs.
429-e.1 changes (21 Dec 2023)#
Warning
This release upgrades Ranger to 2.4, and to avoid a breaking change, adhere to
the following steps. Before upgrading to this version of SEP, record your
current version of Ranger (for k8s deployments, found in your starburst-ranger
values.yaml file as admin.image.tag and usersync.image.tag). While
upgrading SEP, before deploying the update to your cluster nodes, you must
revert the Ranger 2.4 version tag back to the previous version.
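As one way to record the current Ranger image tags before upgrading, the sketch below reads the two keys named in the warning from values.yaml content that has already been parsed. It assumes the file was loaded into nested dictionaries by a YAML parser of your choice; the helper name ranger_image_tags and the sample tag value are hypothetical.

```python
def ranger_image_tags(values):
    """Return the tags stored under admin.image.tag and usersync.image.tag."""
    tags = {}
    for component in ("admin", "usersync"):
        # Missing or empty sections yield None instead of raising an error.
        image = (values.get(component) or {}).get("image") or {}
        tags[f"{component}.image.tag"] = image.get("tag")
    return tags


# Hypothetical parsed values.yaml content; the tag value is a placeholder.
values = {
    "admin": {"image": {"tag": "previous-tag"}},
    "usersync": {"image": {"tag": "previous-tag"}},
}

print(ranger_image_tags(values))
```

Keeping the recorded tags alongside the upgrade plan makes it straightforward to set them back during the upgrade, as the warning requires.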
Improved query planning time on Hive tables without statistics generated.
Fixed long query planning times for queries with many local exchanges.
Fixed query failure when reading Parquet column index for timestamped columns in Hive, Delta, Iceberg, and Hudi tables.
Fixed incorrect results for LIKE with some strings containing repeated substrings.
Fixed coordinator memory leak.
429-e.2 changes (18 Jan 2024)#
Fixed a potential issue with SEP inadvertently changing users’ passwords in Ranger when used with Ranger Admin 2.4.0.
Fixed incorrect results on Parquet files containing page indexes when the query has filters on multiple columns in Hive, Delta, and Hudi tables.
Fixed an issue with the Run and troubleshoot Run button option writing to empty directories without the option being selected.
429-e.3 changes (14 Feb 2024)#
Fixed Teradata custom dates format.
Fixed query failure when reading array columns.
Fixed a bug where an entire directory is skipped from schema discovery if at least one file matched the excludePatterns option.
Fixed out-of-bound (OOB) telemetry null pointer exception in parallel Snowflake connector.
Fixed complex expression pushdown in the Redshift connector.
Fixed a bug where query history displayed queries of another user.
429-e.4 changes (11 Mar 2024)#
Updated Kubernetes external secret operator.
Fixed UI authentication for large authentication tokens.
Fixed incorrect results for DATETIMEOFFSET values before the year 1400.
Fixed query failure when using char types with the reverse() function.
Fixed potential incorrect results when using the ST_Centroid() and ST_Buffer() functions for tiny geometries.
Fixed schema, table, and function visibility in BIAC filtering.
Fixed a bug where column statistics created in SEP would not be visible in Hive when using CDP 7.
429-e.5 changes (28 Mar 2024)#
Added support for setting endpoint and region in STS clients in Lake Formation.
Added AWS endpoint configuration for Lake Formation client.
Fixed an issue which caused the sync_partition_metadata operation to fail when partition paths had case changes.
Restored support for SymlinkTextInputFormat for text formats.
Fixed reading Delta Lake files with encoded characters on Azure.
Fixed failure when reading certain Avro data with UNION data types.
429-e.6 changes (17 Apr 2024)#
Enabled PyStarburst dataframe API by default.
Fixed possible worker crashes when running aggregation queries due to out-of-memory error.
Fixed incorrect results when querying a table being modified concurrently.
Fixed handling of union options in Hive and Avro to allow coercion to a single type.
Fixed a bug that caused the creation of materialized views to fail when using MySQL as the cache service backend database if materialized_view_definitions is longer than 64K characters.
429-e.7 changes (20 May 2024)#
Fixed potential query failure due to worker nodes running out of memory in concurrent scenarios.
Fixed incorrect result with deletion vector on Delta partitioned table.
Fixed correctness bug in constant literal distinct aggregation.
Fixed Prometheus whiteListObjectNames being overwritten when KEDA is enabled.
429-e.8 changes (14 Jun 2024)#
Fixed potential failure when reading ORC files larger than 2GB.
Fixed startup failure when fault-tolerant execution is enabled with Google Cloud Storage exchange.
Fixed potential loss of a query completion event when multiple queries fail at the same time.
Backported IMDSv2 service metadata access.
429-e.9 changes (28 Jun 2024)#
Fixed incorrect results when specifying a value for the cassandra.partition-size-for-batch-select configuration property.
Fixed failure when writing to tables with Iceberg VARBINARY values.
Fixed correctness issue on receivers refresh that could cause query hanging.
429-e.10 changes (11 Jul 2024)#
Added encoding to error code in OAuth2 callback handler.
Fixed reading empty files from S3 and GCS.
Fixed issue syncing partition metadata which could cause data deletion.
429-e.11 changes (29 Jul 2024)#
Fixed bug preventing use of Starburst security in Delta Lake connector.
429-e.12 changes (14 Aug 2024)#
Fixed optimizer timeout for certain queries involving aggregations and CASE expressions.
Fixed failure when adding new columns with a decimal type.
Fixed failure to read Hive tables migrated to Iceberg with Apache Spark.
Fixed issue that caused the error ‘Multiple masks on a single column are not supported’ to occur unintentionally.
429-e.13 changes (30 Aug 2024)#
Fixed query failure when file-based network topology is configured with the node-scheduler.network-topology.file configuration property.
429-e.14 changes (13 Sep 2024)#
Fixed a bug that caused cluster metrics to be created with incorrect intervals and subsequently led to loss of cluster metrics data.
Fixed Run and troubleshoot feature when insights.authorized-groups configuration property contains authorized groups.
Fixed numeric overflow during managed statistics computation for large tables in Teradata mode session.
429-e.15 was skipped.
429-e.16 changes (18 Oct 2024)#
Fixed OpenX JSON decoding a JSON array line that resulted in data being written to the wrong output column.
Fixed reading large Prometheus responses.
Fixed failures for count(*) queries with predicates containing non-ASCII strings. Applies to the Elasticsearch connector.
429-e.17 was skipped.
429-e.18 changes (4 Nov 2024)#
The sync_partition_metadata procedure now uses the value of the hive.metastore.partition-batch-size.max configuration property. The default batch size is changed from 1000 to 100.
429-e.19 changes (14 Nov 2024)#
Fixed memory leak in InMemoryEventClient within cache service.