9.17. Starburst Teradata Parallel Direct Connector
The Teradata Parallel Direct connector is an alternative connector for Teradata databases.
Compared to the Starburst Teradata connector, it requires more advanced configuration and
is capable of achieving higher performance. It supports all use cases of the
Teradata connector, and adds greatly improved performance for reading large amounts of data.
The connector is composed of components running on Presto, and others running on Teradata. All components participate in every transmission. A transmission includes all data transfer related to a single table scan.
The connector uses JDBC as a control channel to schedule the query in Teradata.
The SELECT statement run with JDBC is passed to a Teradata table operator,
which then uses custom transmitters to pass the data over HTTP to the Presto
controller and receivers:
- The controller starts and manages all transmissions. It runs on the Presto coordinator, distributes work to the receivers, and provides status information to the transmitters.
- The receivers connect to the transmitters and receive the data from Teradata. They run on the Presto workers and are managed by the controller.
- The table operator and JDBC driver read the data from the Teradata storage.
- The transmitters send the data for each table operator invocation on Teradata to the receivers on Presto.
Follow the documentation below to start using the connector:
The Teradata Parallel Direct connector needs to be configured in a catalog
properties file, e.g. myteradatadb.properties, using the connector name
teradata-parallel-direct:

    connector.name=teradata-parallel-direct
    connection-url=jdbc:teradata://HOST
    connection-user=USERNAME
    connection-password=PASSWORD
    teradata-parallel-direct.table-operator.name=some_schema.presto_table_operator
    teradata-parallel-direct.http.port=9000
    teradata-parallel-direct.http.host=hostname.accessible.from.teradata.instance
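Once the catalog is configured, tables are queried through Presto like any other catalog. A minimal sketch, assuming the catalog file is named myteradatadb.properties and that a table some_schema.orders exists in Teradata (both names are illustrative):

```sql
-- List the schemas exposed by the new catalog
SHOW SCHEMAS FROM myteradatadb;

-- Scan a table; the bulk data transfer happens over the parallel
-- HTTP transmitters rather than the JDBC control channel
SELECT count(*) FROM myteradatadb.some_schema.orders;
```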
Besides the connector name, you need to provide the JDBC connection parameters
and the relevant JAR files in the teradata plugin directory, as discussed in
Starburst Teradata Connector. Additionally, you need to configure the
connector-specific parameters: the table operator name, and the HTTP port and
host of the Presto coordinator node.
Further tweaking can be done with the following parameters:
- teradata-parallel-direct.receiver.buffer.size in the catalog properties file, receiver_buffer_size in catalog session properties: Determines the buffer size per Presto worker. This buffer is available per table scan, so a single query joining three tables uses three buffers. Default value is 32MB.
- parcel_size in catalog session properties: The size of the data packets sent to the Presto receivers. The last packet may be smaller. Defaults to 2MB. The parcel size should not be larger than the buffer size.
- compression_enabled in catalog session properties: Determines if transmission data is compressed. Defaults to false.
- splits_per_worker in catalog session properties: Determines how many splits each Presto worker receives, which determines the maximum concurrency level on Presto. Defaults to 16, identical to the task.concurrency of the Presto server.
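The catalog session properties above can be changed per session with the standard Presto SET SESSION syntax. A sketch, assuming the catalog is named myteradatadb (the values shown are illustrative, not recommendations):

```sql
-- Raise the per-worker concurrency and enable compression
-- for the current session only
SET SESSION myteradatadb.splits_per_worker = 32;
SET SESSION myteradatadb.compression_enabled = true;
```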
Aside from Teradata-specific options, it is recommended to use the JDBC connection configuration, set in the catalog properties file, to cache the Teradata metadata on Presto and improve query performance:

    metadata.cache-ttl=10m
    metadata.cache-missing=true
    statistics.cache-ttl=10m
    statistics.cache-missing=true
The connector can encrypt all data transmissions and use HTTPS. Add the details for the Java keystore file to use in the Presto server config.
Presto automatically distributes the keystore to the controller, the receivers, the transmitters and the table operator.
The connector supports Kerberos-based authentication with the following configuration:

    teradata.authentication.type=KERBEROS
    teradata.kerberos.client.principal=test@TESTING-KRB.STARBURSTDATA.COM
    teradata.kerberos.client.keytab=krb/client/test.keytab
With this configuration the user test@TESTING-KRB.STARBURSTDATA.COM, defined in the
teradata.kerberos.client.principal property, is used to connect to the database, and the related Kerberos
service ticket is located in the keytab file specified by teradata.kerberos.client.keytab.
The connector can be configured to pass through Kerberos credentials, received by Presto, to the Teradata database.
Configure Kerberos and Presto, following the instructions in Kerberos Credential Passthrough.
Then configure the connector to pass through the credentials from the server to the database in your catalog properties file, and ensure the Kerberos client configuration properties are in place:

    teradata.authentication.type=KERBEROS_PASS_THROUGH
    http.authentication.krb5.config=/etc/krb5.conf
    http-server.authentication.krb5.service-name=exampleServiceName
    http-server.authentication.krb5.keytab=/path/to/Keytab/File
Now any database access via Presto is subject to the data access restrictions and permissions of the user supplied via Kerberos.
Contact Starburst Support to receive the required JAR file. The installation requires:

- A Teradata instance that is up and running.
- A Teradata user with sufficient access rights to install the JAR and create user defined functions (UDFs):

    GRANT CREATE EXTERNAL PROCEDURE ON syslib TO your_user WITH GRANT OPTION;
    GRANT EXECUTE PROCEDURE ON sqlj.install_jar TO your_user WITH GRANT OPTION;
    GRANT EXECUTE PROCEDURE ON sqlj.remove_jar TO your_user WITH GRANT OPTION;
    GRANT EXECUTE PROCEDURE ON sqlj.replace_jar TO your_user WITH GRANT OPTION;
Install the JAR located in JARPATH on the Teradata server, and create the
table operator UDF using BTEQ or a JDBC-connected tool. Ensure a short path to
the JAR file is used, since only up to 64 characters are supported for the path:

    CALL sqlj.install_jar('SJ!/presto/table/operator/jar/JARPATH','presto',0);

    REPLACE FUNCTION presto_table_operator()
    RETURNS TABLE VARYING USING FUNCTION session_contract
    LANGUAGE JAVA
    NO SQL
    PARAMETER STYLE SQLTABLE
    EXTERNAL NAME 'presto:com.starburstdata.teradata.PrestoTableOperator.execute';
No Teradata restart is needed for the changes to take effect.
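You can confirm that the UDF was created by querying the Teradata data dictionary. A sketch, assuming the function was created in a database named test:

```sql
-- DBC.FunctionsV is the Teradata dictionary view listing UDFs
SELECT DatabaseName, FunctionName
FROM DBC.FunctionsV
WHERE FunctionName = 'presto_table_operator';
```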
Each user that connects to Teradata from Presto needs to be granted sufficient access:
GRANT EXECUTE ON SPECIFIC FUNCTION test.presto_table_operator TO your_user
Alternatively, access to the table operator UDF can be granted via a ROLE:

    CREATE ROLE presto_table_operator_users;
    GRANT EXECUTE ON SPECIFIC FUNCTION test.presto_table_operator TO presto_table_operator_users;
    GRANT presto_table_operator_users TO your_user;

This also requires setting the role for any impersonated user, which can be achieved in the catalog properties file.
To uninstall, drop the table operator UDF and remove the JAR:

    DROP FUNCTION presto_table_operator;
    CALL sqlj.remove_jar('presto',0);
For upgrades, it is recommended to install the new Presto table operator JAR under a different name, and to use a different name for the UDF as well. That way, different Presto table operator versions can be installed at the same time, and the upgrade process does not affect the currently working system.