Starburst Galaxy

Federating queries from different sources

    A single Starburst Galaxy cluster can include multiple data sources. This lets you write queries that join data from data sources of different types in different locations.

    How to write federation queries

    To federate data in Starburst Galaxy, connect the data sources you want to federate to the same cluster. You can then join tables from different data sources the same way you join tables from a single data source.

    The following example combines data from two separate sources in a single query. One data source is an Amazon S3 catalog named glue. The other is a MySQL catalog named mysql. The query joins the two tables on their shared custkey column.

    SELECT c.first_name, c.last_name, c.estimated_income,
           a.products, a.cc_number, a.mortgage_id
    FROM glue.burst_bank.customer c
    JOIN mysql.burst_bank.account a ON a.custkey = c.custkey;
    

    Using fully qualified object names is critical when querying multiple sources. That is, specify the full path to each table or view, including its containing catalog and schema names, in the FROM and JOIN clauses of your statement.

    <catalog>.<schema>.<object>;
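
    The following sketch illustrates why the full path matters. It reuses the glue and mysql catalogs and the burst_bank schema from the earlier example, and it assumes your client accepts the Trino USE statement for setting a session default catalog and schema. An unqualified name resolves only against that default, so a table in any other catalog still needs its full path.

    ```sql
    -- Assumption: the client supports USE to set the session default
    -- catalog and schema. Names are taken from the example above.
    USE glue.burst_bank;

    -- customer resolves to glue.burst_bank.customer through the session default.
    -- account is in a different catalog, so it must be fully qualified.
    SELECT c.custkey, a.products
    FROM customer c
    JOIN mysql.burst_bank.account a ON a.custkey = c.custkey;

    -- Equivalent query that does not depend on any session default.
    SELECT c.custkey, a.products
    FROM glue.burst_bank.customer c
    JOIN mysql.burst_bank.account a ON a.custkey = c.custkey;
    ```

    The second form is usually the safer choice for federated queries, because it returns the same result regardless of which catalog and schema are currently selected.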
    

    Further resources on federating queries

    Explore the following links to learn more about federating queries: