AWS Glue #
AWS Glue data catalogs are a supported metadata catalog for Starburst Galaxy and Starburst Enterprise platform (SEP), and can be used as an alternative to the Hive Metastore to query your S3 data.

Starburst Galaxy #
Use an S3 catalog.
Starburst Enterprise #
AWS Glue usage in Starburst Enterprise is supported with the following connectors:
Ensure the requirements for the connector are fulfilled.
Requirements #
Before you configure the Glue metastore, verify the following prerequisites:
- Your SEP instance must have permissions to access both S3 and Glue AWS services.
- For CFT deployments, review the IAM role permissions requirements if you are providing your own IAM Instance.
- When using the AMI and launching it manually, make sure you choose an IAM Role that satisfies the requirements.
Configuration #
- Configure Glue as the metastore in the catalog properties file:
connector.name=hive
hive.metastore=glue
- Add other desired Glue properties, such as the AWS region or the credentials to use.
- Restart the cluster to apply the changes.
AWS Glue with SEP AMI #
You can use the SEP AMI from the AWS Marketplace with the Hive connector configured to use Glue.
After completing the configuration described in the preceding section, restart the Starburst service on the instance:
sudo service starburst restart
AWS Glue with CloudFormation template #
When using the CloudFormation template in AWS, you can leverage Glue by navigating to the Stack Creation form and choosing AWS Glue Data Catalog in the MetastoreType field in the Starburst Enterprise Configuration section.
SEP with AWS Glue usage #
When configured, the Glue data catalog is available through the catalog from within the CLI or any other connection. You must specify the location of the data on S3 either for the entire schema or at the table level. For example, to create a schema myschema in the Glue data catalog, with the S3 base directory (root folder for per-table subdirectories) pointing to the root of the my-bucket S3 bucket, run the following SQL command:
CREATE SCHEMA mycatalog.myschema
WITH (location = 's3://my-bucket/')
You can also create and edit the schema and tables directly from Glue. In Glue terminology, a schema is referred to as a “database”.
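As a minimal sketch of setting the location at the table level instead, the following statement creates a table with its own S3 location. The table name orders, its columns, and the S3 path are hypothetical. The format and external_location table properties are assumed to be the standard Hive connector properties available in your SEP version.

```sql
-- Hypothetical table: names, columns, and the S3 path are illustrative only.
-- external_location points the table at its own S3 directory (assumed Hive
-- connector table property); format selects the file format for the table.
CREATE TABLE mycatalog.myschema.orders (
    order_id bigint,
    owner varchar,
    order_date date
)
WITH (
    format = 'PARQUET',
    external_location = 's3://my-bucket/orders/'
);

-- Confirm that the table is visible through the Glue-backed catalog.
SHOW TABLES FROM mycatalog.myschema;
```

A table created this way should also appear in the corresponding Glue database, since Glue is the metastore behind the catalog.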
Table and column statistics support #
SEP supports standard AWS Glue table and column statistics via the AWS Glue API. You can create and manage the statistics with the ANALYZE statement.
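For illustration, collecting and then inspecting statistics might look like the following sketch. The table mycatalog.myschema.orders is a hypothetical name. SHOW STATS is the standard statement for viewing collected statistics and is not specific to Glue.

```sql
-- Collect table and column statistics for a hypothetical table.
ANALYZE mycatalog.myschema.orders;

-- Inspect the statistics that were collected.
SHOW STATS FOR mycatalog.myschema.orders;
```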
Known limitations of AWS Glue support #
The following limitations apply when using SEP with the Glue data catalog:
- Statistics are not preserved when a column is renamed. Tables with renamed columns must be re-analyzed.
- Renaming tables from within AWS Glue is not supported.
- Partition values containing quotes and apostrophes are not supported (for example, PARTITION (owner="Doe's")).
- Using Hive authorization is not supported.
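The first limitation above means that a column rename should be followed by a fresh ANALYZE run. A minimal sketch, assuming a hypothetical table orders with a column owner that is renamed to customer:

```sql
-- Hypothetical table and column names, for illustration only.
-- Statistics for the renamed column are not preserved.
ALTER TABLE mycatalog.myschema.orders RENAME COLUMN owner TO customer;

-- Re-analyze the table to rebuild its statistics after the rename.
ANALYZE mycatalog.myschema.orders;
```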