Metastore options
Metastores map files in distributed object stores such as S3 and HDFS into tables, and provide metadata such as column names and type mappings. Metastores also provide data to the cost-based optimizer. A metastore service is required for Starburst products when querying object storage systems with the Hive connector and other connectors. Your choice of metastore service depends on which Starburst product you are using:
- AWS Glue - Starburst Enterprise platform (SEP) on AWS
- Hive Metastore Service (HMS) - SEP on any platform
- Starburst built-in metastore service - Starburst Galaxy only
The Starburst Galaxy metastore
Starburst Galaxy provides a built-in metastore that requires no additional installation.
AWS Glue
Hive Metastore Service
You can deploy HMS using the Starburst Kubernetes (K8s) Helm chart for use with SEP on supported Kubernetes services.
This section describes the process for deploying HMS into your environment. Our reference documentation contains a complete listing of configuration properties and additional customization options in the Helm chart.
Get the Helm chart
Get the latest starburst-hive Helm chart from Starburst. The link is available in the installation checklist. Review our documentation to ensure your customizations are easy to apply when you upgrade to later versions.
Provide your registry credentials
We recommend that you use a separate registry-access.yaml file across all Helm charts, as described in our SEP K8s installation documentation. As an alternative, you can edit the registryCredentials: node of the HMS Helm chart to include your credentials.
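A minimal sketch of what a shared registry-access.yaml file might contain is shown below. The sub-keys under registryCredentials: (enabled, registry, username, password) and every value are assumptions made for illustration and are not taken from this page; confirm the exact property names against the reference documentation for the chart.

```yaml
# registry-access.yaml: hypothetical shared credentials file.
# The sub-key names below are assumed; check the chart reference before use.
registryCredentials:
  enabled: true
  registry: "registry.example.com/starburst"   # placeholder registry address
  username: "your-username"                    # placeholder
  password: "your-password"                    # placeholder
```

Keeping the credentials in one file means each chart can be deployed with an additional -f registry-access.yaml argument instead of repeating them in every values file.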
Configure the server
Ensure that the following values in the Helm chart reflect your environment:
- serviceAccountName: - We strongly recommend using a service account for the pod.
- expose: section - Set the value for the correct type: for your environment, and configure the required key-value pairs for the type.
- resources: - Ensure that the CPU and memory sizes are appropriate for your instance type. We recommend leaving heapSizePercentage: at the default value.
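As a rough illustration, the server settings above might look like the following excerpt of hms-values.yaml. The node names serviceAccountName:, expose:, type:, resources:, and heapSizePercentage: come from this page; the service account name, the clusterIp type with its nested keys, and the resource sizes are assumed placeholder values, so treat this as a sketch rather than a definitive configuration.

```yaml
# hms-values.yaml (excerpt): illustrative values only.
serviceAccountName: "hms-service-account"   # hypothetical service account name

expose:
  type: "clusterIp"        # assumed type value; choose the type that fits your environment
  clusterIp:               # assumed layout of the key-value pairs for the selected type
    name: "hive"
    ports:
      http:
        port: 9083         # conventional HMS Thrift port

resources:                 # CPU and memory for the HMS server pod; sizes are placeholders
  requests:
    memory: "1Gi"
    cpu: 1
  limits:
    memory: "1Gi"
    cpu: 1

# heapSizePercentage: is intentionally not set here, so the chart default applies.
```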
Configure the PostgreSQL backing database
The configuration properties for the PostgreSQL database are found in the database: top-level node. As a minimal customization, you must ensure that the following are set correctly for your environment:
database:
  type: "internal"
  internal:
    port: 5432
    databaseName: "hive"
    databaseUser: "Hive"
    databasePassword: "HivePassw0rd1234"
You must also configure volume: persistence and resources, as well as the resources: for the backing database itself, in the database: node. For a complete list of available backing database properties, see our reference documentation.
The database.resources: node is separate from the top-level resources: node. It defines the resources available to the backing database itself, not the HMS server.
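To make the distinction between the two resources: nodes concrete, the following sketch shows both in one values file. This page refers to the database node as database.resources:; the sketch nests volume: and resources: under internal: as an assumption, and the emptyDir volume and all sizes are placeholders. The reference documentation is authoritative on the exact nesting.

```yaml
# Top-level node: resources for the HMS server itself.
resources:
  requests:
    memory: "1Gi"
    cpu: 1

# database node: settings for the backing PostgreSQL database.
database:
  type: "internal"
  internal:
    volume:                # assumed placement; storage for the database
      emptyDir: {}         # assumed; non-persistent, data is lost when the pod is removed
    resources:             # assumed placement; applies to the database, not the HMS server
      requests:
        memory: "1Gi"
        cpu: 2
```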
Set up authentication
In the hdfs: section of the Helm chart, provide authentication for the storage account used to query and create objects. Secrets are specified directly in the HMS chart.
Run the Helm command
When the HMS is configured as desired for your organization, run the following command to deploy it:
$ helm upgrade -i hms starburst/starburst-hive -f hms-values.yaml
Once the pod is deployed, other services can use this HMS if needed.