Starburst Galaxy

  •  Get started

  •  Working with data

  •  Developer tools

  • Starburst Galaxy UI
  •  Catalogs
  •  Clusters
  •  Admin
  •  Access control
  •  Cloud settings

  • Administration
  •  Security
  •  Single sign-on

  •  Troubleshooting

  • Galaxy status

  •  Reference
  • Great Lakes connectivity #

    Great Lakes connectivity abstracts the details of using different table formats and file types when using certain write access statements for object storage systems. This connectivity is built into Starburst Galaxy, and is available to all users.

    The following object storage catalogs support Great Lakes connectivity:

    This feature supports the following table formats described on separate pages:

    Any access to existing tables works transparently for all table formats. Starburst Galaxy recognizes the format of your tables by reading the metastore associated with your object storage.

    Great Lakes SQL support #

    The SQL support with a catalog using Great Lakes connectivity depends partially on the table format in use, and set as default per catalog. The following additional details apply:

    More specifics are available in the following sections about the different table formats, and their SQL support.

    Specify default table format #

    For data consumers who want to read from any object storage file in any supported table format, there are no configuration steps or commands. Starburst Galaxy reads each file’s type and format from the object store’s associated metadata. To check the format of a table, use the SHOW CREATE TABLE SQL statement and refer to the type property in the query result.

    For data engineers who want to create new tables in an object storage catalog, Starburst Galaxy provides a simple way to specify the default format for all newly created tables. When creating a new catalog, select from one of the options in the Default table format dialog. Your selection applies only to newly created tables, and does not convert any existing tables.

    Select default table format

    You can also use a SQL statement to specify an individual table’s format when creating that table. With a statement such as CREATE TABLE or CREATE TABLE AS, specify the type parameter with iceberg, delta, or hive as its argument. A table created this way can override the catalog’s overall default table format. Examples are shown in the individual table format pages listed above.

    Table properties #

    Each table format has its own set of properties that are supported by Great Lakes connectivity as part of CREATE TABLE statements.

    The following query lists the Starburst Galaxy table properties supported by Great Lakes connectivity:

    SELECT * FROM system.metadata.table_properties;

    Read more about the available table properties for each table format:


    The CREATE VIEW statement creates a new view based on a SELECT query. The view is a logical table that can be referenced by future queries. Views do not contain any data. Instead, the query stored by the view is executed every time the view is referenced by another query.

    Views in Starburst Galaxy can be used to pull together federated data across any catalog. Read more about federating multiple data sources in Starburst Galaxy.

    The use of an OR REPLACE clause is optional and causes the view to be replaced if it already exists, rather than raising an error.

    The following examples show how to create a view:

    Create a view:

    CREATE VIEW view_name AS
    SELECT col1, col2, col3
    FROM table_name;

    Create a view with a view comment:

    CREATE VIEW view_with_comment
    COMMENT 'This is a comment'
    SELECT col1, col2, col3
    FROM table_name;

    Create a view that summarizes the table_name table:

    CREATE VIEW view_summary AS
    SELECT col2, sum(col3) AS col_total
    FROM table_name
    GROUP BY col2;

    Create a view replacing an existing view:

    SELECT col1, col2, col3
    FROM table_name

    In addition, Great Lakes connectivity also supports the following SQL statements for views:

    Security for views #

    There are two different security modes for table views, DEFINER and INVOKER. The default security mode is DEFINER.

    CREATE VIEW view_name
    COMMENT 'this is a comment'
    SELECT col1, col2
    FROM table_name;

    The DEFINER security mode allows the ability to restrict access to the underlying tables that the user may not be allowed to access. Tables referenced in the view are accessed using the permissions of the view owner (the creator or definer of the view) rather than the user executing the query.

    A view created in INVOKER security mode is a stored query. Tables referenced in the view are accessed using the permissions of the user executing the query (the invoker of the view).


    Create and validate the definition of a new materialized view view_name of a SELECT query. Run the REFRESH MATERIALIZED VIEW statement after the creation to repopulate the materialized view with data. This materialized view is a physical representation of the query results at the time of refresh. The data is stored and can be referenced by future queries.

    If the materialized view is made from an Iceberg table, it only gets refreshed if there are new snapshots in the underlying tables.

    Create a simple materialized view view_name over the table_name that only includes column_data_1:

    CREATE MATERIALIZED VIEW catalog_name.schema_name.view_name
    SELECT column_data_1 FROM catalog_name.schema_name.table_name;

    Read about the optional clauses and view additional usage examples for the CREATE MATERIALIZED VIEW SQL statement.

    In addition, Great Lakes connectivity also supports the following SQL statements for materialized views:


    The CREATE SCHEMA SQL statement creates a new, empty schema. A schema is a container that holds tables, views, and other database objects.

    Create a new schema by using the default directory for the catalog or an explicit LOCATION. The location of the schema must be retrievable through the SHOW CREATE SCHEMA statement. In addition, your active role set must have the Location privilege for the intended location.

    Assign location privileges by following these steps:

    1. Click Access control > Roles and privileges.
    2. Click the name of the role you want to change.
    3. Click Privileges > Add privilege > Location.
    4. In the Storage location name text field, enter the path of the location you want to use.

    The following SQL statements are examples of creating a schema using the default directory location and using an explicit LOCATION:

    Create a new schema schema_name in the default location:

    CREATE SCHEMA schema_name;

    Create a new schema schema_name_location, with a set LOCATION to s3://bucket-name/lakehouse/data/schema_name_location:

    CREATE SCHEMA schema_name_location WITH (LOCATION = 's3://bucket-name/lakehouse/data/schema_name_location');

    Create a new schema if it does not already exist.


    Create a new schema schema_name, set the LOCATION property to s3://bucket-name/lakehouse/data/schema_name_location and set the owner to user_name:

    CREATE SCHEMA schema_name AUTHORIZATION user_name WITH (LOCATION = 's3://bucket-name/lakehouse/data/schema_name_location');

    View additional usage and clause examples for the CREATE SCHEMA SQL statement. In addition, the Great Lakes connectivity also supports the following SQL statements for schemas:


    The DROP SCHEMA SQL statement drops an existing schema that is empty.

    Drop an empty schema:

    DROP SCHEMA schema_name;

    Use the CASCADE clause to drop all objects within a schema. For example, this will drop tables, views, and materialized views.

    Drop a non empty schema:

    DROP SCHEMA schema_name CASCADE;

    Session properties #

    A session property temporarily modifies the runtime environment for the duration of the current connection session to the cluster. To modify the property, use the SET SESSION statement followed by the property name and a property-specific expression argument.

    SET SESSION property_name = expression;

    Session properties supported by a catalog can be set on a per-catalog basis for the current session. A catalog session property supported by more than one catalog can be set differently for each such catalog in the same session. To restrict a session property for use by a single property, prepend the catalog name to the property name:

    SET SESSION catalog_name.property_name = expression;

    Once set, session properties are in place only to the current connection session. You can have multiple connections to a cluster that each have a different combination of session properties. Once a session ends, either by disconnecting or creating a new session, any changes made to session properties during the previous session are lost.

    Use the RESET SESSION statement to clear the current session back to Starburst Galaxy defaults. Use the SHOW SESSION statement to view all currently available session properties. For additional information, read about the SET SESSION statement.

    Shared and unique session properties #

    With Great Lakes connectivity, object storage table formats share many session properties across Iceberg, Delta Lake, and Hive formats.

    In addition, each table format has a set of session properties uniquely supported that format. Read more about the available session properties for individual table formats: