Apache Ranger overview#

Apache Ranger is a tool to manage access control policies for Hadoop/Hive and related object storage systems such as Delta Lake. It provides a simple and intuitive web-based console for creating and managing policies controlling access to the data.

The Privacera Platform, powered by Apache Ranger, is an extended commercial distribution of Apache Ranger that can also be used.

Starburst Enterprise platform (SEP) can be integrated with Ranger as an access control system. When a query is submitted to SEP, SEP parses and analyzes the query to understand the privileges required by the user to access objects such as schemas and tables. Once a list of these objects is created, SEP communicates with the Ranger service to determine if the request is valid. If the request is valid, the query continues to execute. If the request is invalid, because the user does not have the necessary privileges to query an object, an error is returned. Ranger policies are cached in SEP to improve performance.
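The flow described above can be illustrated with a small Python sketch. This is not how SEP or the Ranger plugin is implemented. The function `authorize`, the exception `AccessDenied`, the shape of `policy_cache`, and the object name `hdfs.sales.orders` are all hypothetical, and real Ranger evaluation also considers groups, roles, deny rules, and wildcards.

```python
class AccessDenied(Exception):
    """Raised when a required privilege is missing (hypothetical error type)."""


def authorize(user, required, policy_cache):
    """Check every (object, privilege) pair a query needs against cached policies.

    required: list of (object_name, privilege) tuples derived from the query.
    policy_cache: dict mapping (user, object_name) to a set of granted privileges.
    """
    for object_name, privilege in required:
        granted = policy_cache.get((user, object_name), set())
        if privilege not in granted:
            raise AccessDenied(f"{user} lacks {privilege} on {object_name}")
    return True


# Locally cached policies, so that not every query needs a call to the Ranger service.
policy_cache = {("alice", "hdfs.sales.orders"): {"SELECT"}}

# A SELECT query on the table is allowed to continue.
print(authorize("alice", [("hdfs.sales.orders", "SELECT")], policy_cache))

# An INSERT on the same table is rejected with an error.
try:
    authorize("alice", [("hdfs.sales.orders", "INSERT")], policy_cache)
except AccessDenied as error:
    print(f"Access denied: {error}")
```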

Authentication is handled outside of Ranger, for example using LDAP. Ranger matches the authenticated user and the user's groups against the policy definitions.


SEP integration with Ranger requires a valid Starburst Enterprise license.


Before you configure SEP for any integration with Apache Ranger or Privacera Platform, verify the following prerequisites:

  • The SEP coordinator and workers have the appropriate network access to communicate with the Ranger service. Typically this is port 6080, or 6182 if SSL is used.

  • Apache Ranger 2.0.0 or higher must be used

  • Privacera Platform should be used

Ranger usage options#

SEP offers the following different integrations with Ranger:

We highly recommend implementing Ranger for global access control. This allows you to use Ranger policies for all configured catalogs.


When used for global access control, the Starburst Ranger integration extends the basic functionality of Ranger with the Starburst Ranger plugin. It allows Ranger to provide access control for all data sources defined by a catalog in Starburst Enterprise, and all other data sources supported by SEP.

Key concepts#

The concepts and features described in the following section apply to all Ranger usage.


A policy is a combination of a set of resources and the associated privileges. Ranger provides a user interface, or optionally a REST API, to create and manage these access control policies.

Resource sets#

A resource set includes one or more resources of different resource types. Wildcard characters are supported to select a number of resources based on a pattern. The following resource types and hierarchies are available:

  • catalog

  • catalog - schema

  • catalog - schema - table

  • catalog - schema - table - column

  • catalog - schema - procedure

  • catalog - session property

  • function

  • system session property

  • query

  • user

As the list above shows, some resources are organized hierarchically within a catalog. This allows you, for example, to restrict access to a complete catalog, a specific schema or table, or even a single column or a procedure within a schema.

For example, you can define a set of resources that restricts access to the two tables credit-info and cards-info in all schemas in the hdfs catalog:

  • Catalog: hdfs

  • Schema: *

  • Table: credit-info, cards-info

A resource set works as the primary key for a policy, so it must be unique. Multiple policies may still cover a single resource because of wildcards.

It is best to create fine-grained resource sets, especially when using column masking and row filtering. Policies with wildcards can cause hard-to-understand or even unpredictable behavior when multiple policies apply to the same resource. For example, both *-schema-table-column and catalog-*-table-column apply to the resource catalog-schema-table-column. The second definition is more specific and therefore preferred, because it keeps your configuration easier to understand.
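To see how wildcard resource sets can overlap, consider the following minimal Python sketch. It uses the standard library `fnmatch` module as a stand-in for Ranger's own matcher, which may behave differently in detail. The function `matches`, the dictionary layout, and the names `finance`, `sales`, `orders`, and `price` are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def matches(resource_set, resource):
    """Return True if the resource matches at least one pattern on every level."""
    return all(
        any(fnmatchcase(resource[level], pattern) for pattern in patterns)
        for level, patterns in resource_set.items()
    )


# The resource set from the example above: two tables in all schemas of hdfs.
credit_policy = {
    "catalog": ["hdfs"],
    "schema": ["*"],
    "table": ["credit-info", "cards-info"],
}

print(matches(credit_policy, {"catalog": "hdfs", "schema": "finance", "table": "credit-info"}))  # True
print(matches(credit_policy, {"catalog": "hdfs", "schema": "finance", "table": "orders"}))  # False

# Two wildcard policies with different resource sets that cover the same column.
policies = {
    "any-catalog": {"catalog": ["*"], "schema": ["sales"], "table": ["orders"], "column": ["price"]},
    "any-schema": {"catalog": ["hdfs"], "schema": ["*"], "table": ["orders"], "column": ["price"]},
}
target = {"catalog": "hdfs", "schema": "sales", "table": "orders", "column": "price"}

overlapping = [name for name, resource_set in policies.items() if matches(resource_set, target)]
print(overlapping)  # ['any-catalog', 'any-schema']
```

Both resource sets are unique, yet both match the same column. This is the situation in which the effective policy becomes hard to predict.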

Privilege sets#

A set of privileges consists of one or more user groups, roles and users, and a set of access types for the specified resource set. Privileges can allow or deny operations.

The catalog, schema, table and column resources, which grant access to resources for queries, have the following access types.

  • SELECT to read data from the resource

  • INSERT to add data to the resource

  • UPDATE to change data in the resource

  • DELETE to remove data from the resource

  • CREATE to create a resource

  • ALTER to alter a resource

  • DROP to remove a resource

  • OWNERSHIP to claim ownership of the resource, which provides complete access

  • IMPERSONATE to impersonate another user, and therefore use the privileges of that user

In addition there are privileges that determine access to queries and their usage, and are therefore of a more general nature.

  • SELECT to list queries.

  • EXECUTE to initiate processing of any query. Without this privilege user action is extremely limited.

  • KILL to stop processing of any query.

Users, groups, and roles#

Users, groups, and roles are sourced from your configured authentication system, ideally a connected LDAP directory, and are used as the target users for each policy.

Column-level authorization#

SEP enforces column-level privileges granted to roles. For example, if a user is only granted access to a subset of table columns, they are only able to query from these columns. If they execute an SQL statement that refers to other columns, the query fails with an error.

Column masking#

SEP’s Apache Ranger integration supports most of the column masking methods that are supported in Hive with Ranger. SEP does not distinguish between uppercase letters, lowercase letters, and digits when masking; x is used for all of these character types.


If an unsupported column masking method is used, MASK_NULL is applied instead.
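The masking behavior described above can be sketched in Python as follows. This is an illustration only, not the SEP implementation. The function names and the lookup table are hypothetical, and leaving characters other than letters and digits unchanged is an assumption carried over from the usual Hive masking behavior.

```python
def mask_all(value):
    """Replace every letter and digit with 'x', regardless of case or type."""
    if value is None:
        return None
    # Assumption: punctuation and whitespace are passed through unchanged.
    return "".join("x" if ch.isalnum() else ch for ch in value)


def mask_null(value):
    """Replace the whole value with NULL, represented here as None."""
    return None


# Hypothetical lookup of the masking methods this sketch knows about.
SUPPORTED_MASKS = {"MASK": mask_all, "MASK_NULL": mask_null}


def apply_mask(mask_type, value):
    """Apply the requested mask, falling back to MASK_NULL when it is unsupported."""
    return SUPPORTED_MASKS.get(mask_type, mask_null)(value)


print(apply_mask("MASK", "Card-1234"))                   # xxxx-xxxx
print(apply_mask("SOME_UNSUPPORTED_MASK", "Card-1234"))  # None
```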

Service and catalog integrations#

In addition to enforcing the policies in Apache Ranger, SEP integrates with the Apache Ranger Key Management Service, and has support for AWS Glue Data Catalog, row level filtering and tag-based policies.

Features and use cases#

The following features and use cases are applicable with all Ranger usage.

Hive and other catalog authorization set up#

The Ranger integrations replace any other authorization setup for the data source.

For example, you have to treat it as a replacement for authorization by the user configured for the connection to the data source, or for any restrictions in the data source that rely on user impersonation or credential pass-through. It is important to avoid these other configurations and let Ranger manage all access, to keep the overall setup simple and manageable.

When catalogs use the Hive connector, disable the other Hive authorization checks in each catalog properties file. Edit the catalog properties file with the following configuration:


Controlling access to User Defined Functions with Ranger#

You can use the Ranger system access control to enforce User Defined Function (UDF) policies. A UDF in SEP is deployed as a plugin (Functions) and stored in the SEP global namespace. This global namespace is managed at the system access control level.

This is independent of the global and Hive access control with Ranger and the Privacera Platform.

The Ranger resource hierarchy for all UDF policies requires an associated database (or schema) namespace when creating the policy. Because the global namespace is independent of any connector namespace, this poses a slight challenge for controlling access to UDFs with Ranger. To overcome this, you must specify $presto as the database name in Ranger. This keeps all SEP functions under the $presto database in the Ranger resource hierarchy.

To configure Ranger system access control for UDFs, add the following to a system access control properties file, for example named etc/access-control-ranger-udf.properties:



All Ranger properties supported for Hive access control with Ranger are supported in the system access control file. However, Ranger properties related to row filtering or column masking are ignored. This additional configuration is needed because the Ranger system access control uses a Ranger client that is independent of the Hive access control. Only one Ranger system access control can be defined, while Hive access control can be configured separately for each Hive catalog. In a scenario with multiple Hive catalogs and multiple Ranger services, only one of those Ranger services can be used to manage the UDF policies.


When Ranger audit is implemented, whenever access is granted or denied through Ranger, an audit event is logged if auditing is enabled in a given resource policy.

Ranger audit is configured in the Ranger-specific file /etc/hive/conf/ranger-hive-audit.xml. Configuring Ranger audit is complex, and outside the scope of Starburst documentation; please refer to your Ranger documentation to learn how to set up audit optimally for your environment.

For Audit to work with SEP, the location of the file must be specified in your catalog properties file:


Caveat regarding performance

Ranger audits are performed by accessing the internal table system.runtime.queries. Any access to the table is logged.

The Web UI makes heavy use of the queries table. The property ranger.audit.system-runtime-queries.enabled is set to true by default and controls this logging behavior. Using the web interface causes a flood of audit events. Setting the property to false disables this audit logging.


Caching is used to improve performance and reduce the number of requests to the Ranger service. Caching is enabled through configuration properties, which can be found in the Ranger installation and configuration page.

Authorization limitations#

Authorization information cannot be accessed by querying tables such as information_schema.roles, information_schema.applicable_roles, information_schema.enabled_roles, and information_schema.table_privileges.

Configuration properties#

The properties listed in this table apply to all Ranger-related configurations in system access control properties files, as well as to catalog properties files using the Hive connector for Hive access control with Apache Ranger or the Privacera Platform.

Ranger properties#

Property name




URL address of the Ranger REST service. HTTPS is required when Kerberos authentication is used.


SEP Ranger plugin service name



Authentication type for SEP connecting to Ranger, BASIC or KERBEROS.


SEP Ranger plugin user name. This property is used when ranger.authentication-type=BASIC is set.


SEP Ranger plugin user password. This property is used when ranger.authentication-type=BASIC is set.


Ranger service Kerberos principal


Path to the Ranger service Kerberos keytab file


Path to Ranger plugin SSL configuration


Ranger’s client persistent cache for policies



Interval determining how often authorization policies are refreshed. This is the maximum latency before changes in Ranger authorization policies become visible in SEP.



Ranger service connection timeout.



Ranger service read timeout.


Path to the Ranger cache directory for policies. It allows SEP to load policies from the cache on startup, even if Ranger Policy Admin is not available at that moment.



Duration for which group mapping information is cached in SEP. 0ms disables the cache.


Disabled, 0ms

Interval at which group mapping information is refreshed in SEP. Any value greater than ranger.cache-ttl disables the refresh.



To enable row filtering set this flag to true. Note that there are semantic differences between the SEP and HiveQL SQL variants.



To enable resource wildcard matching for row filtering, set this flag to true. When two policies match a single resource, the one without wildcards is used. When multiple wildcard policies match, it is undetermined which one is used.



To enable resource wildcard matching for column masking, set this flag to true. When two policies match a single resource, the one without wildcards is used. When multiple wildcard policies match, it is undetermined which one is used.


Additional XML configuration files that are read before applying your SEP Ranger configuration. Useful for reusing an existing Hive-level Ranger configuration, such as a Ranger audit configuration.



Enable Ranger policy management with SQL as supported for Hive access control only.

Ensuring Ranger works with SSL#

If your organization implements SSL, you must ensure that Ranger is correctly configured for it, as connectors also use the configuration via the ranger.plugin-policy-ssl-config-file property. The following is a sample Ranger SSL configuration file:

    <!--  The following properties are used for 2-way SSL client server validation -->

Ensuring Ranger works with your authorization service#

You need to configure SEP to work with the authentication service used by Ranger. While Starburst does offer Kerberos support, Starburst encourages the use of LDAP. The following configuration property is provided:


If your organization uses an LDAP system for user and group information, Ranger can use that information to define role-based access to catalogs using any connector, as well as to a number of other system resources. Policies in Ranger define access and authorization, and are created with the Ranger user interface. Users, groups, and roles are sourced from your connected LDAP directory and are used to target users for a Ranger policy. Each policy combines user and group information with a resource and access rights to the resource.

Ranger needs to access the information about your users, groups and roles in your LDAP system. With the K8s and AWS installation methods, all details are already configured. For existing Ranger usage or manual installation, you must ensure that Ranger is connected to your LDAP directory provider, and that a synchronization process is in place.

The process of connecting your existing Ranger installation depends on your particular LDAP implementation as well as your Ranger configuration. Learn more about that in the LDAP Authentication page.


SEP can use Kerberos authentication, and the Ranger integration also supports Kerberos.


Most organizations that use Kerberos also use LDAP. We strongly encourage you to use LDAP instead of Kerberos, due to the relative unreliability of Kerberos servers, their lack of clear error messaging, and their rigid OS and JVM dependencies.

A Ranger system administrator user (a user with the role ROLE_SYS_ADMIN) must exist that matches the SEP Kerberos principal ranger.kerberos-principal when Kerberos authentication is used, or the SEP Ranger plugin user name ranger.username and password ranger.password if BASIC authentication is used.

The SEP Kerberos principal is translated to a Ranger user name via the auth-to-local Hadoop rules from core-site.xml.


Ranger version 2.1.0 removes the possibility to connect to a Kerberized Ranger using basic user and password authentication. You have to add the following configuration to your Ranger core-site.xml file to restore this possibility by allowing unauthenticated access:


Alternatively, you can configure SEP to authenticate to Ranger using Kerberos.

Starburst Ranger CLI#

You can use the Starburst Ranger CLI to manage integration of SEP with Apache Ranger or the Privacera Platform for the following tasks:

The command line application is an executable Java archive that requires Java 11 or higher to be available on the system path. You can download it from Starburst and install it with the following steps on Linux or macOS.

  • Ensure the computer is able to reach the Ranger server via HTTP, since the CLI interacts with the REST API. This can be the coordinator, a worker in the cluster, or any other computer.

  • Verify Java with java -version

  • Move the binary to a directory in your path, such as ~/bin and rename it.

    mv starburst-ranger-cli-*-executable.jar ~/bin/starburst-ranger-cli
  • Verify the folder is on the path.

    echo $PATH
  • If necessary, add the folder.

    export PATH=~/bin:$PATH
  • Now you can run the help command to verify the CLI works.

    starburst-ranger-cli help
  • The resulting output is similar to the following:

    Starburst Ranger command line interface
    starburst-ranger-cli [--properties=<configFile>] [-p=<String=String>]... [COMMAND]

The help command can also provide details about the other commands and their specific options, if you append help to the desired command, with a few examples shown in the following block:

starburst-ranger-cli help
starburst-ranger-cli user help
starburst-ranger-cli service-definition help
starburst-ranger-cli user create help

Windows installation is supported as well and requires similar commands. You can also run the application directly with Java on Linux, macOS or Windows.

java -jar starburst-ranger-cli-*-executable.jar

You have to supply the connection details from SEP to Ranger in a properties file. Typically you can simply use the Ranger access control properties file by copying it to the computer running the CLI. Alternatively you can use individual properties as command line options.

  • Use the --properties option to specify the full path to a .properties file that contains one key=value pair per line.

  • Use the -p option for each property separately with the format -p=key=value.
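The two ways of supplying properties carry the same information. The following Python sketch parses both forms into a dictionary to make the formats concrete. It is not part of the CLI. The helper names are hypothetical, the property values are made up, and only a simple subset of the Java properties format is handled.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def parse_p_options(args):
    """Collect properties passed on the command line as -p=key=value."""
    props = {}
    for arg in args:
        if arg.startswith("-p="):
            key, _, value = arg[len("-p="):].partition("=")
            props[key] = value
    return props


# Contents of a hypothetical properties file passed with --properties.
file_text = """
# Ranger connection details
ranger.authentication-type=BASIC
ranger.username=starburst
"""

# The same details supplied as individual command line options.
args = ["-p=ranger.authentication-type=BASIC", "-p=ranger.username=starburst"]

print(parse_properties(file_text))
print(parse_properties(file_text) == parse_p_options(args))  # True
```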

Ranger user management#

You can manage users in Ranger with the CLI. Properties are used to provide the details for Ranger access.

The following operations are available:

  • create a user

  • get user details

  • delete a user

All operations follow the same syntax, and use properties and the --name option:

starburst-ranger-cli user get
starburst-ranger-cli user create
starburst-ranger-cli user delete

A full example to get a user can look like this:

starburst-ranger-cli user get --name=username --properties=ranger-access-control.properties

Creating a user relies on a JSON file, such as alice.json, with the following syntax:

{
  "name": "alice",
  "firstName": "Alice",
  "lastName": "Wonderland",
  "emailAddress": "alice@example.com",
  "password": "not@trivialP225w0rd",
  "description": "She went down the rabbit hole.",
  "groupIdList": [],
  "groupNameList": [],
  "status": 1,
  "isVisible": 1,
  "userSource": 0,
  "userRoleList": []
}

The file is passed with the -f or --from-file option:

starburst-ranger-cli user create -f=alice.json
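If you need to create several users, the JSON file can be generated with a script. The following Python sketch builds a file with the same fields as the example above. The function names are hypothetical, and the values for status, isVisible, and userSource are simply copied from the example rather than derived from the Ranger API documentation.

```python
import json


def build_user(name, first_name, last_name, email, password, description=""):
    """Return a dictionary with the same fields as the example user file."""
    return {
        "name": name,
        "firstName": first_name,
        "lastName": last_name,
        "emailAddress": email,
        "password": password,
        "description": description,
        "groupIdList": [],
        "groupNameList": [],
        # The following three values are copied unchanged from the example.
        "status": 1,
        "isVisible": 1,
        "userSource": 0,
        "userRoleList": [],
    }


def write_user_file(user, path):
    """Write the user definition as JSON to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(user, handle, indent=2)
        handle.write("\n")


alice = build_user(
    "alice",
    "Alice",
    "Wonderland",
    "alice@example.com",
    "not@trivialP225w0rd",
    "She went down the rabbit hole.",
)
print(json.dumps(alice, indent=2))
```

Calling `write_user_file(alice, "alice.json")` produces a file that can be passed to the CLI with the -f option.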

Service definition management#

You can find information about creating and overriding the service definition in the sections about installing and upgrading the SEP Ranger plugin.

Ranger REST API#

Apache Ranger includes a REST API that can be used for automating and troubleshooting your configuration and setup. Use it with caution and reference the API documentation as needed.