Glossary #

Terms A-E #
Amazon AWS marketplace #

A provider for all aspects of the required infrastructure. This includes AWS CloudFormation for provisioning, Amazon Simple Storage Service (S3) for storage, Amazon Machine Images (AMI) and Amazon Elastic Compute Cloud (EC2) for compute, AWS Glue as the metadata catalog, and others. For more information, see Amazon AWS Marketplace.

Bare-metal server #

A physical computer server dedicated to a single tenant. See bare-metal server.

Catalog #

Learn the role of a catalog in Trino.

Certificate Authority (CA) #

A trusted organization that examines and validates organizations and their proposed server URIs, and issues digital certificates verified as valid for the requesting organization.

Certificate #

A public key certificate issued by a CA, sometimes abbreviated as cert, that verifies the ownership of a server’s keys. Certificate format is specified in the X.509 standard.
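
As a minimal sketch of how a client relies on certificates and CAs, the following Python example builds a TLS context from the standard library `ssl` module. It assumes the default CA list of the local Python installation and is not specific to any Starburst product.

```python
import ssl

# Build a client-side TLS context that trusts the default CA list
# available to this Python installation.
context = ssl.create_default_context()

# With these defaults, a server certificate must chain to a trusted CA
# and must match the host name the client asked for.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

A connection made with this context fails if the server presents a certificate that no trusted CA has issued.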

Clock time #

See the definition for wall time.

Cluster #

A cluster provides the resources to run queries against numerous data sources. Clusters define the number of workers, the configuration for the JVM runtime, configured data sources, and other aspects. For more information, see cluster basics.

Connector #

Learn how a connector works with a data source.

Container #

A lightweight virtual package of software that contains libraries, binaries, code, configuration files, and other dependencies needed to deploy an application. A running container does not include an operating system. It uses the operating system of the host machine, typically Linux. Learn more at Container concept in the Kubernetes documentation.

Coordinator #

Learn the role of a coordinator.

COTS #

Commercial off-the-shelf. Refers to commodity hardware components.

Data consumer persona #

Owns data products such as reports, dashboards, models, and the quality of analysis. For more information, see Starburst personas.

Data engineer persona #

Owns schemas and is responsible for the source data quality and ETL SLA. For more information, see Starburst personas.

Data source #

A data source is a system from which data is retrieved. In Starburst products, you query that source by using a catalog. For more information, see Data sources and catalogs.

Driver #

Learn how a driver operates in Trino.

Exchange #

Learn how an exchange works in Trino.

External ID (AWS) #

An external ID is an identifier in AWS that is required for using Starburst Galaxy. It is used to ensure that only trusted AWS accounts are given permission to operate the Starburst Galaxy clusters based on their assigned role and trust policy. For more information on AWS Identity and Access Management, see How to use an external ID when granting access to your AWS resources to a third party.
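
As an illustration, an AWS role trust policy typically enforces the external ID through an `sts:ExternalId` condition. The following Python sketch builds such a policy document. The account ID and external ID are hypothetical placeholders, and the function name `build_trust_policy` is invented for this example.

```python
import json

# Hypothetical placeholder values, not real identifiers.
TRUSTED_ACCOUNT_ID = "111122223333"
EXTERNAL_ID = "example-external-id"


def build_trust_policy(account_id: str, external_id: str) -> dict:
    """Return a trust policy that allows role assumption only when
    the caller supplies the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }


policy = build_trust_policy(TRUSTED_ACCOUNT_ID, EXTERNAL_ID)
print(json.dumps(policy, indent=2))
```

A request from the trusted account that omits the external ID, or supplies a different one, does not satisfy the condition and cannot assume the role.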

Terms F-J #
Google Cloud Marketplace #

Deploy in the Google Cloud Marketplace or by using the Starburst Kubernetes solution on the Google Kubernetes Engine (GKE). GKE is a secure, production-ready, managed Kubernetes service in Google Cloud for managing containerized applications. For more information, see Google Cloud Marketplace.

Hive Metastore Service (HMS) #

Manages metadata for data stores that do not necessarily have a catalog, such as HDFS and object stores (S3, ADLS, GCS, MinIO, and others). The metadata is stored in an RDBMS such as PostgreSQL.

Java KeyStore (JKS) #

Java KeyStore, the system of public key cryptography supported as one part of the Java security APIs. The legacy JKS system recognizes keys and certificates stored in keystore files, typically with the .jks extension, and relies on a system-level list of CAs in truststore files installed as part of the current Java installation.

Terms K-O #
Key #

A cryptographic key specified as a pair of public and private keys.

Lake data (non-Delta) #

Data stored in the traditional data lake manner, as files in an object store.

Load Balancer (LB) #

Software or a hardware device that sits on a network’s outer edge or firewall and accepts network connections on behalf of servers behind that wall. Load balancers carefully manage network traffic, and can accept TLS connections from incoming clients and pass those connections transparently to servers behind the wall.

Marketplace #

Purchase a preconfigured set of machine images, containers, and other needed resources to run SEP on a cloud provider's hosts under your control. See Marketplace deployments.

Microsoft Azure marketplace #

Deploy in the Azure Marketplace or by using the Starburst Kubernetes solution on the Azure Kubernetes Service (AKS). AKS is a secure, production-ready, managed Kubernetes service on Azure for managing containerized applications. For more information, see Microsoft Azure Marketplace.

Operator #

Learn how an operator handles data.

Terms P-T #
Parser #

Analyzes the Service Provider Interface (SPI) metadata for information about tables, columns, and types to validate SQL semantics, and to perform security checks and type checking of expressions in the original query.

PKCS #12 #

A binary archive used to store keys and certificates or certificate chains that validate a key. PKCS #12 files have .p12 or .pfx extensions.

Planner #

Uses the Statistics SPI to obtain information about row counts and table sizes to perform cost-based query optimizations during planning.

Platform administrator persona #

Owns platforms and services (ITIL-style). Has service SLA responsibility for the infrastructure supporting the cluster. For more information, see Starburst personas.

Presto and PrestoSQL #

Former name for Trino.

Privacy-Enhanced Mail (PEM) #

A syntax for private key information, and a content type used to store and send cryptographic keys and certificates. The PEM format can contain both a key and its certificate, plus the chain of certificates from authorities back to the root CA, or back to a CA vendor’s intermediate CA.
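
To illustrate how one PEM file can hold a key followed by a certificate chain, the following Python sketch splits a bundle into its labeled blocks. The payloads are placeholder bytes rather than real keys or certificates, and the helper names `make_block` and `split_pem` are invented for this example.

```python
import base64
import re


def make_block(label: str, payload: bytes) -> str:
    """Wrap placeholder bytes in PEM-style BEGIN/END markers."""
    body = base64.b64encode(payload).decode("ascii")
    return f"-----BEGIN {label}-----\n{body}\n-----END {label}-----\n"


# A synthetic bundle: key first, then its certificate, then the chain.
bundle = (
    make_block("PRIVATE KEY", b"placeholder key bytes")
    + make_block("CERTIFICATE", b"placeholder server certificate")
    + make_block("CERTIFICATE", b"placeholder intermediate CA certificate")
)

# Match each block and require the END label to equal the BEGIN label.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s+(.*?)\s+-----END \1-----", re.DOTALL
)


def split_pem(text: str) -> list:
    """Return (label, decoded bytes) for every block, in file order."""
    return [
        (m.group(1), base64.b64decode(m.group(2)))
        for m in PEM_BLOCK.finditer(text)
    ]


blocks = split_pem(bundle)
labels = [label for label, _ in blocks]
print(labels)
```

The order of the blocks is preserved, which matters when a tool expects the server certificate before the intermediate certificates.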

Query #

Learn how Trino handles a query.

Queue #

A sequence in which statements enter the coordinator to be executed.
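
As a simplified sketch, a queue can be modeled as a first-in, first-out structure. The example below assumes strict arrival order. The actual order in which a cluster admits statements may depend on its configuration, and the statements shown are arbitrary examples.

```python
from collections import deque

# Statements wait in arrival order.
queue = deque()
for statement in ["SELECT 1", "SHOW CATALOGS", "SELECT 2"]:
    queue.append(statement)

# Under the first-in, first-out assumption, the oldest statement
# leaves the queue first.
executed = []
while queue:
    executed.append(queue.popleft())

print(executed)
```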

Red Hat OpenShift marketplace #

A container platform using Kubernetes operators that automates the provisioning, management, and scaling of applications on any cloud platform or on-premises. Starburst Enterprise is available on Red Hat Marketplace as of OpenShift version

  1. For more information, see Red Hat Marketplace.
Role-Based Access Control (RBAC) #

A custom integration of Apache Ranger that enables global policy-based and role-based security, and can be integrated with your existing identity provider.

Scheduler #

Uses the Data Location SPI in the creation of the distributed query plan to distribute plan stages to workers.

Schema #

Learn how a schema works in a database.

Secure Sockets Layer (SSL) #

Secure Sockets Layer, now superseded by TLS. The term is still widely used to refer to what TLS does now.

Split #

Learn how a split operates in Trino.

SQL #

Structured Query Language. The standard language used with relational databases. For more information, see SQL.

SQL client #

A tool or application used to connect to Starburst and query the underlying data sources. SQL clients include BI tools, command line tools, SQL workbenches, and others.

Stage #

Learn how a stage works during querying.

Starburst Enterprise platform (SEP) #

A fully supported, enterprise-grade distribution of Trino. It adds integrations, improves performance, provides security, and makes it easy to deploy, configure, and manage your clusters. For more information, see Starburst Enterprise.

Starburst Galaxy #

An easy-to-use, fully managed, and enterprise-ready SaaS offering of Trino. Configure your data sources, and query your data wherever it lives. Starburst takes care of the rest so you can concentrate on the analytics. For more information, see Starburst Galaxy.

Statement #

Learn what comprises a statement.

Table #

Learn what a table is inside a database.

Task #

Learn how a task operates.

Transport Layer Security (TLS) #

The successor to SSL. These security topics use the term TLS to refer to both TLS and SSL.

Trino #

The fastest open source, massively parallel processing SQL query engine designed for analytics of large datasets distributed over one or more data sources in object storage, databases, and other systems. Formerly PrestoSQL. For more information, see Trino.

Virtual machine (VM) #

An emulation of the hardware of a computer system on a physical host machine, so any operating system suitable for that hardware can run in the emulator. A typical example is a Linux virtual machine running on a Windows-based host machine. See virtual machine.

Wall time #

The elapsed real time from start to finish. For more information, see elapsed time.

For example, wall time for query processing is the elapsed time between a user submitting a query and receiving results.

Real-world time, clock time, and wall-clock time refer to the same amount of time.
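
The following Python sketch contrasts wall time with CPU time for a single operation. The function `run_query_stub` is a hypothetical stand-in that only waits, in place of submitting a real query and receiving its results.

```python
import time


def run_query_stub() -> None:
    """Hypothetical stand-in for a query; it only waits."""
    time.sleep(0.05)


wall_start = time.perf_counter()  # wall-clock reference point
cpu_start = time.process_time()   # CPU time used by this process

run_query_stub()

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

# Wall time includes the waiting period; CPU time does not.
print(f"wall time: {wall_elapsed:.3f} s")
print(f"CPU time:  {cpu_elapsed:.3f} s")
```

Because the stub spends its time waiting, the measured wall time is much larger than the CPU time.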

Worker #

Learn how a worker handles data.