Concepts #

Starburst Enterprise and Starburst Galaxy both embed Trino and therefore share numerous concepts. Understanding these basics helps you use our products efficiently.

The following sections detail some core concepts.

Want to learn even more? Try these sources:

Trino #

Trino (formerly Presto® SQL) is a fast, open source, massively parallel processing SQL query engine. It is designed for analytics of large datasets distributed over one or more data sources, including object storage, databases, and other systems.

Learn more about Trino.

Starburst Enterprise #

Starburst Enterprise platform (SEP) is a fully supported, enterprise-grade distribution of Trino. It adds integrations, improves performance, provides security, and makes it easy to deploy, configure and manage your clusters.

Get started with Starburst Enterprise.

Starburst Galaxy #

Starburst Galaxy is an easy-to-use, fully managed, and enterprise-ready SaaS offering of Trino. Configure your data sources, and query your data wherever it lives. Starburst takes care of the rest so you can concentrate on the analytics.

Get started with Starburst Galaxy.

Open source software #

Trino is distributed as open source software under the Apache License 2.0 and is maintained by a community of contributors from across the globe. The founders and many core contributors of Trino are with Starburst, leading the project and helping it grow.

Analytics #

Analytics is the process of systematically inspecting and manipulating data or statistics to better understand patterns and characteristics of the data and its origins. Trino is designed for analytics processing with SQL.

Massively parallel processing #

Massively parallel processing (MPP) is an architecture for distributed workload processing. Multiple server nodes collaborate in a cluster.

The coordinator node receives a query written in SQL from a user. The coordinator analyzes and plans the query execution. It then adapts the plan to the number of worker nodes in the cluster and distributes the workload for parallel processing across all workers.

These workers all load and process data at the same time, so query processing completes much faster than a single-node architecture can achieve. The workers collaborate and provide the processing results back to the coordinator, which returns them to the user.
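
The coordinator and worker roles described above can be sketched in a few lines of Python. This is only an illustration of the general MPP pattern, not Trino's implementation: the function names `plan_splits`, `worker_partial_sum`, and `run_query` are hypothetical, threads stand in for separate server nodes, and the "query" is a simple sum.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_splits(rows, worker_count):
    """Coordinator step: adapt the plan to the cluster size, one split per worker."""
    return [rows[i::worker_count] for i in range(worker_count)]


def worker_partial_sum(split):
    """Worker step: process only the assigned split and return a partial result."""
    return sum(split)


def run_query(rows, worker_count):
    """Coordinator: plan, distribute the splits, then merge the partial results."""
    splits = plan_splits(rows, worker_count)
    # Threads are an assumption standing in for worker nodes in a cluster.
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partials = list(pool.map(worker_partial_sum, splits))
    return sum(partials)


data = list(range(1, 101))
total = run_query(data, worker_count=4)
print(total)  # 5050
```

Each worker only touches its own split, which is why adding workers can shorten the overall processing time in a real cluster.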

SQL #

Structured Query Language (SQL) is a domain-specific language for data access and data manipulation. It is an industry standard with a long history and wide adoption by users and tools alike.

Trino uses SQL as its query language for analytics of the data in any connected data source.
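
To give a feel for the declarative style of SQL, here is a minimal sketch that runs a typical analytics query. It uses Python's built-in `sqlite3` module purely as a stand-in so the example is self-contained; it does not involve Trino, and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5)],
)

# The query states what result is wanted, not how to compute it.
query = """
    SELECT region, SUM(amount) AS total
    FROM orders
    GROUP BY region
    ORDER BY region
"""
totals = conn.execute(query).fetchall()
print(totals)  # [('east', 12.5), ('west', 5.0)]
conn.close()
```

The query text uses only basic standard SQL constructs (`SELECT`, `SUM`, `GROUP BY`, `ORDER BY`), which is the kind of statement that carries over between SQL-based systems.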

Query engine #

A query engine is a system designed to receive queries, process them, and return results to the users.

Unlike a database system, it does not include a storage engine for managing the actual data in files, objects, or in memory. Instead, a query engine integrates with many storage engines and can therefore be used to query multiple systems at the same time.

These systems can be relational databases, data warehouses, object storage systems implementing a data lake, or even very different systems that simply expose an API to retrieve data.
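
As a rough sketch of this idea, the following Python example combines rows from two differently shaped sources without storing any data itself. Everything in it is an assumption made for illustration: `CUSTOMERS` stands in for a relational database, `ORDERS_CSV` for a file in object storage, and the `scan_*` functions play the role of connectors. It does not reflect how Trino is implemented.

```python
import csv
import io

# Hypothetical "relational database" source: rows already available as dicts.
CUSTOMERS = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]

# Hypothetical "object storage" source: the text content of a CSV file.
ORDERS_CSV = "customer_id,amount\n1,10.0\n2,5.0\n1,2.5\n"


def scan_customers():
    """Connector for the database-like source."""
    return iter(CUSTOMERS)


def scan_orders():
    """Connector for the file-like source; parses CSV text into typed rows."""
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        yield {"customer_id": int(row["customer_id"]), "amount": float(row["amount"])}


def total_per_customer():
    """Engine step: join rows from both sources and aggregate the amounts."""
    names = {c["customer_id"]: c["name"] for c in scan_customers()}
    totals = {}
    for order in scan_orders():
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0.0) + order["amount"]
    return totals


print(total_per_customer())  # {'Ada': 12.5, 'Grace': 5.0}
```

The data stays in its original systems; the engine-like function only reads through the connectors, processes the rows, and returns a result.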

Trino is a query engine, not a database system.