Using SQL #

Starburst Enterprise and Starburst Galaxy are built on Trino. Trino’s open source distributed SQL engine runs fast analytic queries against various data sources ranging in size from gigabytes to petabytes. Data sources are exposed as catalogs. Because Trino’s SQL is ANSI-compliant and supports most of the SQL language features you depend on, you can hit the ground running.

Business intelligence users and data scientists can continue to use their favorite client tools such as Tableau, Qlik and Apache Superset to access and analyze virtually any data source, or multiple data sources in a single query.

General SQL features #

We know you want to jump right in, and we know you already have awesome analytics skills. It’s just a matter of harnessing the power of SQL to take your analytics even further:

  • SQL language (Galaxy or SEP )
  • SQL statement syntax (Galaxy or SEP )
  • Functions and operators (Galaxy or SEP )

Just in case you’d like a more structured walkthrough, here are of some specifically interesting SQL features in Starburst, presented by one of our founders, David Phillips:

Topics:
  • Formatting (Galaxy or SEP )
  • CASE and searched expressions (Galaxy or SEP )
  • IF expressions (Galaxy or SEP )
  • TRY expressions (Galaxy or SEP )
  • Lambda expressions (Galaxy or SEP )
Click on the links to read more on that topic in our reference manual.

Running time: ~8 min.


Advanced SQL #

Ready to move past the basics? For your convenience, we’ve divided the Advanced SQL for Starburst video training course up into topic sections, and provided links to the relevant parts of our documentation below.

Advanced aggregation techniques #

Topics:
  • count() with DISTINCT (Galaxy or SEP )
  • Approximations, including counting and percentiles (Galaxy or SEP )
  • max_by() values (Galaxy or SEP )
  • Pivoting with count_if() and FILTER (Galaxy or SEP )
  • Complex aggregations
  • Checksums (Galaxy or SEP )
  • ROLLUP (Galaxy or SEP )
  • CUBE (Galaxy or SEP )
  • GROUPING SETS (Galaxy or SEP )

Running time: ~28 min.


Window functions #

Topics:
  • Row numbering (Galaxy or SEP )
  • Ranking (Galaxy or SEP )
  • Ranking and numbering without ordering
  • Bucketing (Galaxy or SEP ) and percentage ranking (Galaxy or SEP )
  • Partitioning (Galaxy or SEP )
  • Accessing leading and training rows with lead() (Galaxy or SEP ) and lag() (Galaxy or SEP )
  • Window frames (Galaxy or SEP )
  • Accessing first (Galaxy or SEP ), last (Galaxy or SEP ), and Nth values (Galaxy or SEP )
  • ROWS vs RANGE using array_agg() (Galaxy or SEP )
  • Using aggregations in window functions

Running time: ~25 min.


Array and map functions #

Starburst allows you to easily create arrays and maps with your data:

trino> SELECT ARRAY[4, 5, 6] AS integers,
       ARRAY['hello', 'world'] AS varchars;

 integers  |   varchars
-----------+----------------
 [4, 5, 6] | [hello, world]

SQL array indexes are 1-based. Learn more about how to use them in this in-depth video.

Topics:
  • Accessing array and map elements with element_at() (Galaxy or SEP )
  • Sorting arrays with array_sort() (Galaxy or SEP )
  • Matching elements with any_match() (Galaxy or SEP ), all_match() (Galaxy or SEP ), and none_match() (Galaxy or SEP )
  • Filtering elements (Galaxy or SEP )
  • Transforming elements (Galaxy or SEP )
  • Converting arrays to strings (Galaxy or SEP )
  • Computing array products (Galaxy or SEP )
  • Unnesting arrays and maps (Galaxy or SEP )
  • Creating maps from keys and values (Galaxy or SEP ), and an array of entry rows (Galaxy or SEP )

Running time: ~19 min.


Using JSON #

Topics:
  • Creating maps from keys and values (Galaxy or SEP )
  • Extraction using json_extract() (Galaxy or SEP ), and json_extract_scalar() (Galaxy or SEP )
  • Casting and partial casting from JSON (Galaxy or SEP )
  • Formatting as JSON (Galaxy or SEP )

Running time: ~14 min.