Hive and Delta Lake access control with Apache Ranger#

You can use the Apache Ranger integration with SEP to control access to Hive and Delta Lake data sources configured in any catalog using the SEP Hive or Delta Lake connectors.

This is specifically useful in the following scenarios:

  • You already use Apache Ranger to control access for these data sources

  • You use SQL standard-based authorization

If you want to want to control access for other catalogs and connectors with Ranger, you need to adopt global access control.

Note

Hive and Delta Lake access control with Apache Ranger requires a valid Starburst Enterprise license.

Installation#

Before you begin, verify you fulfill the Ranger requirements.

If you do not yet have a configured and running Apache Ranger, you can take advantage of the supported installation methods for global access control with Apache Ranger.

Configuration#

Each Hive and Delta Lake catalog that needs to be controlled with Ranger must have the catalog properties file configured to use ranger Hive security.

For Hive catalogs:

hive.security=ranger

For Delta Lake catalogs:

delta.security=ranger

The following is a more complete example of a Hive catalog properties file that is configured to use Apache Ranger for authorization. It utilizes Kerberos for authentication.

connector.name=hive
hive.metastore.uri=thrift://hive-metastore-node:9083

hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/hive-metastore-node@EXAMPLE.COM
hive.metastore.client.principal=hive/sep-server-node@EXAMPLE.COM
hive.metastore.client.keytab=/etc/hive/conf/hive.keytab

hive.hdfs.authentication.type=KERBEROS
hive.hdfs.impersonation.enabled=false
hive.hdfs.trino.principal=hdfs/sep-server-node@EXAMPLE.COM
hive.hdfs.trino.keytab=/etc/hadoop/conf/hdfs.keytab

hive.security=ranger

ranger.policy-rest-url=https://ranger-host:6182
ranger.service-name=hive

ranger.authentication-type=KERBEROS
ranger.kerberos-principal=sep-server/sep-server-node@EXAMPLE.COM
ranger.kerberos-keytab=/etc/sep/conf/sep-server.keytab
ranger.plugin-policy-ssl-config-file=/etc/hive/conf/ranger-policymgr-ssl.xml

More details about the supported configuration properties is available in the Ranger overview.

Access views created in SEP#

To access views created in SEP, you must assign hive.* catalog access to the Public role. If there is a connector-level access control system enabled, such as Ranger, SEP cannot determine that a user with a given username from another access control system is the same as the user in SEP. Granting catalog access to a Public role is synonymous with excluding the catalog from access control checks in SEP’s built-in access control system.

Policy management with SQL#

You can manage your Hive And Delta Lake access control with Ranger using the SQL GRANT privilege and REVOKE privilege statements. Policies managed this way must have the Delegate Admin option selected in the policy.

Warning

If you do not select the Delegate Admin option, you receive access denied errors.

Ranger Hive and Delta Lake service definitions use a limited access-type model compared to SEP. This limits the granularity of grant and revoke operations.

You can manage privileges on schema and table access with GRANT and REVOKE.

SEP defines the following mapping from the Hive and Delta Lake access type model to the SEP privileges model:

Hive access type

SEP privilege

SELECT

SELECT

UPDATE

INSERT and UPDATE and DELETE

and the other way around:

SEP privilege

Hive access type

SELECT

SELECT

INSERT

Unsupported, use ALL PRIVILEGES instead

UPDATE

Unsupported, use ALL PRIVILEGES instead

DELETE

Unsupported, use ALL PRIVILEGES instead

ALL PRIVILEGES

SELECT and UPDATE

Resource and privilege relationship limitations#

The integration of Ranger with the Hive and Delta Lake connectors does not support any kind of relation between a resource, such as a table, and the granted access. This is particularly evident when newly created resources are treated with the access rights of the existing policies.

For example, as user without full administrative access (ALL PRIVILEGES) you can end up in a situation where you create a new table, but can not grant access to that table to other users. As a workaround you can create global policies that use matching patterns and define the desired default access.

If you want all tables ending with -reporting to be accessible by everyone, you can create a policy that uses a value for *-reporting for the table. If you use * for database and * for column, you can grant a specific access to such a reporting table and all its columns in any database.

Alternatively you can grant all privileges to anyone creating new objects and requiring privileges to grant access to others. This policy might be too open for your use case, depending on your security policies, the data, the number of users and other aspects.

Role management limitations#

SQL supports the SET ROLE [role] statement to enable or disable a role. This feature is not supported in Ranger. All user roles are always enabled. Querying information_schema.applicable_roles and information_schema.enabled_roles yields the same results.