Sample dataset #

The sample dataset provides data in several small tables that represent an organization, its employees, and related data. The following tables are available in the demo schema of the catalog:

  • current_dept_emp
  • departments
  • dept_emp
  • dept_emp_latest_date
  • dept_manager
  • employees
  • salaries
  • titles
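
As a minimal sketch of how to explore these tables once the catalog is available on a cluster, the following statements list and inspect them. The catalog name sample is hypothetical; substitute the name you give your catalog. Only the demo schema and the table names come from the list above.

```sql
-- Assumption: the catalog was named "sample"; replace with your catalog name.
-- List the tables in the demo schema.
SHOW TABLES FROM sample.demo;

-- Inspect the columns of one table before querying it.
DESCRIBE sample.demo.employees;

-- Preview a few rows.
SELECT * FROM sample.demo.employees LIMIT 10;
```

Running DESCRIBE first avoids guessing at column names, which are not documented here.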

Configure a catalog #

To create a catalog with the sample dataset, select Catalogs in the main navigation and click Configure a catalog. In the Select a dataset section, click the Sample dataset button.

Select a cloud provider #

The Cloud provider configuration is necessary to allow Starburst Galaxy to correctly match catalogs and clusters.

The data source configured in a catalog and the cluster must operate in the same cloud provider and region for performance and cost reasons.

Define catalog name and description #

The Name of the catalog identifies it in Query editor and other client applications, where it appears alongside its nested schemas and tables, and in the output of SHOW CATALOGS.

The name is also used to fully qualify any table in SQL queries, following the catalogname.schemaname.tablename syntax. For example, you can run the following query in the demo cluster without first setting a catalog or schema context:

SELECT * FROM tpch.sf1.nation;
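
For comparison, the sketch below shows how the catalog name surfaces in SHOW CATALOGS and how the same table can be queried with a shorter name after setting a context. It assumes your client supports the USE statement and keeps session state between statements, which may not hold for every client.

```sql
-- List every catalog visible to the current cluster and role.
SHOW CATALOGS;

-- Set the catalog and schema context for the session.
USE tpch.sf1;

-- With the context set, the table name alone is sufficient.
SELECT * FROM nation;
```

Fully qualified names remain the safer choice in saved queries, because they do not depend on the session context.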

The Description is a short, optional paragraph that provides more details about the catalog than the name alone. It appears in the user interface and can help other users to determine what data can be accessed with the catalog.

Region #

The Region of the dataset catalog determines in which cloud region of a specific cloud provider the data is stored. From the drop-down, choose the same region as the cluster in which you want to use the dataset.

Save the catalog #

Click the Save catalog button, then add the catalog to a cluster so that you can query the data source.
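
Once the catalog is attached to a running cluster, a quick check along the following lines can confirm that it is reachable. This is a sketch only: the catalog name sample is hypothetical and stands in for the name you entered.

```sql
-- Assumption: the catalog was saved under the name "sample".
-- The demo schema should appear in the result.
SHOW SCHEMAS FROM sample;

-- A simple row count confirms that the data itself can be read.
SELECT count(*) AS department_count
FROM sample.demo.departments;
```

If the catalog does not appear or the query fails, check that the catalog was added to the cluster you are connected to and that both use the same cloud provider and region.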