Python clients #
Starburst Enterprise and Starburst Galaxy fully support client access based on the
trino Python package from the open source Trino project.
The source is available in the project's GitHub repository.
The Python client package supports the following Trino authentication methods:
- No authentication
- Basic authentication using passwords, which includes LDAP password authentication
The client supports running queries within transactions, as described in the GitHub project’s README.
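As a rough illustration of the commit-or-roll-back pattern that DBAPI connections expose, the following sketch wraps a batch of statements in a single transaction. The helper name run_in_transaction is hypothetical. The standard library's sqlite3 module stands in for a Trino connection so that the example runs without a cluster. The comment about isolation_level is an assumption based on the project's README and should be checked there.

```python
import sqlite3  # stand-in DBAPI driver; not the Trino client


def run_in_transaction(conn, statements):
    """Run all statements, then commit; roll back if any statement fails.

    `conn` can be any DBAPI connection. Assumption: with the trino package,
    a connection created with an isolation_level argument (see the README)
    would be passed in here instead of the sqlite3 connection used below.
    """
    cur = conn.cursor()
    try:
        for sql in statements:
            cur.execute(sql)
            cur.fetchall()  # drain any result so the statement completes
        conn.commit()
        return True
    except Exception:
        conn.rollback()
        return False


# Demonstration against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()

ok = run_in_transaction(conn, [
    "INSERT INTO t VALUES (1)",
    "INSERT INTO t VALUES (2)",
])
failed = run_in_transaction(conn, [
    "INSERT INTO t VALUES (3)",
    "INSERT INTO missing_table VALUES (1)",  # fails, so the batch is rolled back
])
count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(ok, failed, count)
```

The second batch leaves no rows behind because the rollback discards the insert that preceded the failing statement.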
The Python client package requires Python 3.6 or later, or PyPy 3.
To use the package directly in your Python code, install it locally with
pip install trino (or use
pip3 if your system is so configured). Then
import trino into your code.
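Before importing the package, it can help to confirm that the interpreter meets the version requirement and that the package is actually installed. The following is a minimal sketch using only the standard library. The function name check_environment is hypothetical.

```python
import importlib.util
import sys


def check_environment(package="trino", minimum=(3, 6)):
    """Report whether the interpreter is new enough and the package is importable."""
    return {
        "python_ok": sys.version_info >= minimum,
        # find_spec returns None when a top-level package is not installed
        "package_installed": importlib.util.find_spec(package) is not None,
    }


status = check_environment()
print(status)
```

If package_installed is False, installing the package with pip as described above is the likely fix.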
To use one of the Python-based clients, follow the
setup instructions for that client, which incorporates the Trino Python client package.
Package comparison #
The Python Database API Specification (DBAPI) defines a standard way for Python clients to access databases. The Trino Python client is a direct implementation of the DBAPI specification.
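To make the DBAPI pattern concrete, the following sketch runs a query through the standard connection and cursor interface. It uses sqlite3 from the standard library purely as a stand-in driver so that it runs without a cluster. The helper name run_query and the sample table are hypothetical. Because the Trino client implements the same specification, the same helper should in principle accept a connection from trino.dbapi.connect.

```python
import sqlite3  # stand-in DBAPI driver from the standard library


def run_query(conn, sql):
    """Run one query through the standard DBAPI cursor interface."""
    cur = conn.cursor()
    cur.execute(sql)
    # cursor.description holds one sequence per column; item 0 is the name
    columns = [col[0] for col in cur.description]
    rows = cur.fetchall()
    return columns, rows


# Hypothetical sample data, only so that the query has something to return.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id TEXT, state TEXT)")
conn.execute("INSERT INTO nodes VALUES ('worker-1', 'active')")

columns, rows = run_query(conn, "SELECT node_id, state FROM nodes")
print(columns)
print(rows)
```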
SQLAlchemy is a toolkit whose core
component provides a SQL abstraction layer over many DBAPI implementations.
Several Python clients use SQLAlchemy along with the
Trino Python client package to provide SQL access to Trino clusters.
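SQLAlchemy identifies a database by a connection URL. The following sketch assembles such a URL for a cluster, assuming the Trino dialect is installed and registered under the trino:// scheme. The helper names trino_sqlalchemy_url and make_engine are hypothetical, and SQLAlchemy is imported lazily so that the URL helper works without it.

```python
from urllib.parse import quote


def trino_sqlalchemy_url(user, host, port=8080, catalog=None, schema=None):
    """Build a URL of the form trino://user@host:port/catalog/schema."""
    url = f"trino://{quote(user, safe='')}@{host}:{port}"
    if catalog:
        url += f"/{catalog}"
        if schema:
            url += f"/{schema}"
    return url


def make_engine(url):
    """Create a SQLAlchemy engine; requires SQLAlchemy and a Trino dialect."""
    from sqlalchemy import create_engine  # imported lazily, optional dependency
    return create_engine(url)


url = trino_sqlalchemy_url("sep-user", "localhost", 8080, "system", "runtime")
print(url)
```

Passing the resulting URL to make_engine is only expected to work in an environment where both SQLAlchemy and the Trino dialect are installed.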
Python clients that use the Trino DBAPI implementation directly, or that use SQLAlchemy along with the Trino DBAPI package, are the most direct path to querying Trino, Starburst Enterprise, and Starburst Galaxy clusters.
Several alternative Python access methods are not as direct, and are not recommended:
- PySpark requires Spark JARs as well as a JDBC driver. This leaves your SQL query two layers removed from a direct DBAPI implementation.
- PyJDBC does implement DBAPI, but also inserts the requirement of a JDBC driver in the path of your query.
- PyHive implements DBAPI, can support use with SQLAlchemy, and has support for the Trino client package. However, it is designed to use the Hive query language, and not SQL. While both languages are similar, they are not identical and using the PyHive library can therefore result in unexpected query results or failures.
The following example shows how to use the Python API to connect to a local cluster running without security to submit a single query and return the results.
```python
import trino

conn = trino.dbapi.connect(
    host='localhost',
    port=8080,
    user='sep-user',
    catalog='system',
    schema='runtime',
)
cur = conn.cursor()
cur.execute('SELECT * FROM nodes')
rows = cur.fetchall()
for row in rows:
    print(row)
```
The next example runs the same query on a remote cluster secured with LDAP.
The user parameter is not needed for LDAP because you
specify the username in the
auth parameter. The catalog and schema
parameters are not required for this query format, which specifies the entire
path to the table as catalog.schema.table.
```python
import trino

conn = trino.dbapi.connect(
    host='cluster.example.com',
    port=8443,
    http_scheme='https',
    auth=trino.auth.BasicAuthentication("ldap-username", "ldap-password"),
)
cur = conn.cursor()
cur.execute('SELECT * FROM system.runtime.nodes')
rows = cur.fetchall()
for row in rows:
    print(row)
```
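Hard-coding an LDAP password in source code is rarely desirable. One possible variation on the LDAP example, sketched below, reads the connection settings from environment variables. The variable names TRINO_HOST, TRINO_PORT, TRINO_USER, and TRINO_PASSWORD and both helper names are hypothetical. The trino package is imported lazily, so the settings helper can be exercised without it.

```python
import os


def ldap_connect_kwargs(env=os.environ):
    """Collect non-secret connect() arguments from (hypothetical) env variables."""
    return {
        "host": env.get("TRINO_HOST", "cluster.example.com"),
        "port": int(env.get("TRINO_PORT", "8443")),
        "http_scheme": "https",
    }


def connect_with_ldap(env=os.environ):
    """Open a connection using basic authentication; requires the trino package."""
    from trino.auth import BasicAuthentication  # imported lazily
    from trino.dbapi import connect

    auth = BasicAuthentication(env["TRINO_USER"], env["TRINO_PASSWORD"])
    return connect(auth=auth, **ldap_connect_kwargs(env))


# The settings helper works on any mapping, which keeps it easy to test.
settings = ldap_connect_kwargs({"TRINO_HOST": "trino.internal", "TRINO_PORT": "443"})
print(settings)
```

Calling connect_with_ldap() raises KeyError if the user or password variable is missing, which surfaces a misconfiguration before any query is sent.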
Python-based clients #
The following data query clients take advantage of the Trino Python client package.
Apache Superset is a data exploration and visualization platform. Connections to clusters use the SQLAlchemy-Trino package in conjunction with the Trino Python client package.
The Superset client page describes the steps to use Superset with Trino, Starburst Enterprise, or Starburst Galaxy.
dbt is a data transformation workflow development framework that lets teams quickly and collaboratively deploy analytics code. Starburst provides a supported adapter.
The dbt client page describes the steps to use the adapter and dbt with Trino, Starburst Enterprise, or Starburst Galaxy.
Querybook is a browser-based data analysis tool that turns SQL queries into natural language reports and graphs called DataDocs.
The Querybook client page describes the steps to use Querybook with Trino, Starburst Enterprise, or Starburst Galaxy.