Starburst SQL Server connector#

The Starburst SQL Server Connector is an extended version of the SQL Server connector with configuration and usage identical.

The following improvements are included:

Requirements#

Configuration#

The connector configuration is identical to the configuration for the base SQL Server connector.

A minimal configuration, with the connector-name sqlserver and the JDBC connections details, is shown in the following snippet:

connector.name=sqlserver
connection-url=jdbc:sqlserver://<host>:<port>;database=<database>
connection-user=sqlserver_username
connection-password=sqlserver_password

The connection-url parameter can use any valid JDBC connection string supported by the SQL Server JDBC driver. This allows you to specify a SQL Server instance and other parameters as desired in your specific use case.

Dynamic catalog selection#

The default configuration, e.g. in a file etc/oursqlserver.properties, enables the connection to one database running on a SQL Server instance.

connector.name=sqlserver
connection-url=jdbc:sqlserver://dbserver.example.com:1443/exampledb

The connector supports connecting to multiple SQL Server databases using a single catalog by setting a override-catalog session property. This support has to be enabled in the catalog properties file with the sqlserver.override-catalog.enabled property:

connector.name=sqlserver
connection-url=jdbc:sqlserver://dbserver.example.com:1443/exampledb
sqlserver.override-catalog.enabled=true

With sufficient access rights, available to the user specified in connection-user, the preceding example allows you to query any table in any schema in the database exampledb on the SQL Server dbserver:

SELECT * FROM oursqlserver.exampleschema.exampletable;

In order to query another database, such as testdb, you have override the database configured in the catalog. From then on you can query that database in the current user session:

SET SESSION oursqlserver.override-catalog=testdb;
SELECT * FROM oursqlserver.testdbschema.testdbtable;

Note

The access rights to the databases, schemas, tables and actual rows, continue to be determined by the configured user for the connection, and any other security setup like impersonation or Apache Ranger integration. Make sure these access rights are as restrictive as required.

Type mapping#

Because SEP and SQL Server each support types that the other does not, this connector modifies some types when reading or writing data.

SQL Server to SEP read type mapping#

The following read type mapping applies when data is read from existing tables in SQL Server, or inserted into existing tables in SQL Server from SEP.

SQL Server to SEP read type mapping#

SQL Server type

SEP type

Notes

BIT

BOOLEAN

BIGINT

BIGINT

SMALLINT

SMALLINT

TINYINT

TINYINT

INT

INTEGER

DECIMAL(p, s), ``NUMERIC(p, s)

DECIMAL(p, s)

for p <= 38

DOUBLE PRECISION

DOUBLE

FLOAT

DOUBLE

BINARY

VARBINARY

CHAR(n)

CHAR(n)

VARCHAR(n)

VARCHAR(n)

NCHAR(n)

CHAR(n)

NVARCHAR(n)

VARCHAR(n)

DATE

DATE

DATETIME2(n)

TIMESTAMP(n)

TIME

TIME

No other types are supported.

SEP to SQL Server write type mapping#

The following write type mapping applies when tables are created in SQL Server from SEP.

SEP to SQL Server write type mapping#

SEP type

SQL Server type

Notes

BOOLEAN

BIT

BIGINT

BIGINT

INTEGER

INT

SMALLINT

SMALLINT

DOUBLE

DOUBLE PRECISION

CHAR(n <= 4000)

NCHAR(n)

CHAR(n > 4000)

NVARCHAR(max)

VARCHAR(n <= 4000)

NVARCHAR(n)

VARCHAR, VARCHAR(n > 4000)

NVARCHAR(max)

VARBINARY

VARBINARY(max)

DATE

DATE

TIME

TIME

TIMESTAMP(n <= 7) without timezone

DATETIME2(n)

TIMESTAMP(n > 7) without timezone

DATETIME2(7)

SQL support#

The connector supports all of the SQL statements listed in the SQL Server connector documentation.

The following section describes additional SQL operations that are supported by SEP enhancements to the Trino connector.

Insert#

The connector supports adding rows using INSERT statements. By default, data insertion is performed by writing data to a temporary table. You can skip this step to improve performance and write directly to the target table. Set the catalog property sqlserver.non-transactional-insert.enabled or the corresponding catalog session property non_transactional_insert to true.

In rare cases when exceptions occur during the insert operation, data in the target table can be corrupted. Since transactions have been disabled no rollback can be performed.

You can use bulk insert to further improve performance.

Performance#

The connector includes a number of performance improvements, detailed in the following sections.

Table statistics#

This feature is available for free, and does not require a valid license.

The SQL Server connector supports table and column statistics to improve query processing performance based on the actual data in the data source.

The statistics are collected by SQL Server and retrieved by the connector.

The connector can use information stored in single-column statistics. SQL Server Database can automatically create column statistics for certain columns. If column statistics were not created automatically for a certain column, you can create them by executing the following statement in SQL Server Database.

CREATE STATISTICS my_statistics_name ON table_schema.table_name (column_name);

SQL Server Database routinely updates the statistics. In some cases, you may want to force statistics update (e.g. after defining new column statistics or after changing data in the table). You can do that by executing the following statement in SQL Server Database.

UPDATE STATISTICS table_schema.table_name;

Refer to SQL Server documentation for information about options, limitations and additional considerations.

Dynamic filtering#

Dynamic filtering is enabled by default. It causes the connector to wait for dynamic filtering to complete before starting a JDBC query.

You can disable dynamic filtering by setting the property dynamic-filtering.enabled in your catalog properties file to false.

Pushdown#

The connector supports pushdown for a number of operations:

Aggregate pushdown for the following functions:

  • avg(bigint)()

  • avg(float)()

  • avg(decimal)()

  • count(column)()

Starburst Cached Views#

The connectors supports table scan redirection to improve performance and reduce load on the data source.

Bulk insert#

You can optionally use the bulk copy API to drastically speed up write operations.

Enable bulk copying and a lock on the destination table to meet minimal logging requirements.

The following table shows the relevant catalog configuration properties and their default values:

Bulk load properties#

Property name

Description

Default

sqlserver.bulk-copy-for-write.enabled

Use the SQL Server bulk copy API for writes. The corresponding catalog session property is bulk_copy_for_write.

false

sqlserver.bulk-copy-for-write.lock-destination-table

Obtain a bulk update lock on the destination table for write operations. The corresponding catalog session property is bulk_copy_for_write_lock_destination_table. Setting is only used when bulk-copy-for-write.enabled=true.

false

Limitations:

  • timestamp data type is not supported.

  • Column names with leading and trailing spaces are not supported.

You can further improve insert performance by disabling transaction usage for inserts.

Security#

The connector includes a number of security-related features, detailed in the following sections.

User impersonation#

The SQL Server connector supports user impersonation.

User impersonation can be enabled in the catalog file:

sqlserver.impersonation.enabled=true

User impersonation in SQL Server connector is based on EXECUTE AS USER. For more details visit: docs.microsoft.com.

Kerberos authentication#

The connector supports Kerberos authentication. Use the following properties in the catalog properties file to configure it.

sqlserver.authentication.type=KERBEROS
kerberos.client.principal=example@example.com
kerberos.client.keytab=etc/kerberos/example.keytab
kerberos.config=etc/kerberos/krb5.conf

In this example the user example@example.com, as defined in the kerberos.client.principal property, is used to connect to the database. The related Kerberos service ticket is located in the file defined in the kerberos.client.keytab property.

Kerberos credential pass-through#

The connector can be configured to pass through Kerberos credentials, received by SEP, to the SQL server database. This allows you to apply Kerberos-defined permissions to SQL Server connections through SEP.

To configure credential pass-through in Kerberos and SEP, see Kerberos credential pass-through.

After you configure Kerberos and SEP, edit the catalog properties file to enable the connector to pass the credentials to the SQL Server database. Configure the following Kerberos client configuration properties in the catalog properties file:

sqlserver.authentication.type=KERBEROS_PASS_THROUGH
http.authentication.krb5.config=/etc/krb5.conf
http-server.authentication.krb5.service-name=exampleServiceName
http-server.authentication.krb5.keytab=/path/to/Keytab/File

Any SQL server database accessed using SEP is now subject to the Kerberos-defined data access restrictions and permissions.

Password credential pass-through#

The connector supports password credential pass-through. To enable it, edit the catalog properties file to include the authentication type:

sqlserver.authentication.type=PASSWORD_PASS_THROUGH

For more information about configurations and limitations, see Password credential pass-through.