Configuring Starburst Enterprise with Ranger in Kubernetes#

The starburst-ranger Helm chart configures the use of Apache Ranger 2.1.0 in the cluster through the values.yaml file detailed in the following sections. It allows you to implement global access control, or only Hive access control, with Ranger for Starburst Enterprise platform (SEP).

It creates the following setup:

Use your registry credentials, and follow best practice by keeping any changes to the default values in an override file.

Docker registry access#

The configuration is the same as in the Docker image and registry section of the Helm chart for SEP.

registryCredentials:
  enabled: false
  registry:
  username:
  password:

imagePullSecrets:
 - name:
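
As a rough sketch of the override-file approach, the following fragment enables the registry credentials in a separate file. The file name ranger-overrides.yaml, the registry host, and the placeholder credentials are assumptions for illustration and need to be replaced with your own values:

```yaml
# ranger-overrides.yaml (hypothetical file name)
# Only values that differ from the chart defaults are listed here.
registryCredentials:
  enabled: true
  # Assumed registry host, taken from the image repository paths used in this chart
  registry: "harbor.starburstdata.net"
  username: "<your-registry-username>"
  password: "<your-registry-password>"
```

Keeping these values in their own file leaves the chart's default values.yaml untouched.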

Ranger server#

The admin section configures the Ranger server and the included user interface for policy management.

admin:
  image:
    repository: "harbor.starburstdata.net/starburstdata/starburst-ranger-admin"
    tag: "2.1.0-e.7"
    pullPolicy: "IfNotPresent"
  port: 6080
  resources:
    requests:
      memory: "1Gi"
      cpu: 2
    limits:
      memory: "1Gi"
      cpu: 2
  # serviceUser is used by SEP to access Ranger
  serviceUser: "starburst_service"
  passwords:
    admin: "RangerPassword1"
    tagsync: "TagSyncPassword1"
    usersync: "UserSyncPassword1"
    keyadmin: "KeyAdminPassword1"
    service: "StarburstServicePassword1"
  # optional truststore containing CA certificates to use instead of default one
  truststore:
    # existing secret containing truststore.jks key
    secret:
    # password to truststore
    password:
  # Enable the propagation of environment variables from Secrets and Configmaps
  envFrom: []
  env:
    # Additional env variables to pass to Ranger Admin.
    # To pass a Ranger install property, use a variable named RANGER__<property_name>,
    # for example RANGER__authentication_method.
  securityContext: {}
    # Optionally configure a security context for the ranger admin container

admin.serviceUser

The operating system user that is used to run the Ranger application.

admin.passwords

Set each of these passwords to a value of your choice. They are used for administrative and Ranger-internal purposes and do not need to be changed or used elsewhere.
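
The env comment in the admin block can be illustrated with a short sketch. It assumes that admin.env is a map of variable names to values, in the same form as usersync.env, and it uses NONE purely as an example value for the authentication_method install property:

```yaml
admin:
  env:
    # Assumption: admin.env is a map, in the same form as usersync.env.
    # RANGER__authentication_method sets the Ranger install property
    # authentication_method; NONE is an example value only.
    RANGER__authentication_method: "NONE"
```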

LDAP user synchronization server#

You can use the usersync block to configure the details of the synchronization of users and groups between Ranger and your LDAP system. It runs on a separate pod when deployed.

The default configuration enables user synchronization:

usersync:
  enabled: true
  image:
    repository: "harbor.starburstdata.net/starburstdata/ranger-usersync"
    tag: "2.1.0-e.7"
    pullPolicy: "IfNotPresent"
  name: "ranger-usersync"
  resources:
    requests:
      memory: "1Gi"
      cpu: 1
    limits:
      memory: "1Gi"
      cpu: 1
  tls:
    # optional truststore containing CA certificate for ldap server
    truststore:
      # existing secret containing truststore.jks key
      secret:
      # password to truststore
      password:
  # Enable the propagation of environment variables from Secrets and Configmaps
  envFrom: []
  # env is a map of ranger config variables
  env:
    # Use RANGER__<property_name> variables to set Ranger install properties.
    RANGER__SYNC_LDAP_URL: "ldap://ranger-ldap:389"
    RANGER__SYNC_LDAP_BIND_DN: "cn=admin,dc=ldap,dc=example,dc=org"
    RANGER__SYNC_LDAP_BIND_PASSWORD: "cieX7moong3u"
    RANGER__SYNC_LDAP_SEARCH_BASE: "dc=ldap,dc=example,dc=org"
    RANGER__SYNC_LDAP_USER_SEARCH_BASE: "ou=users,dc=ldap,dc=example,dc=org"
    RANGER__SYNC_LDAP_USER_OBJECT_CLASS: "person"
    RANGER__SYNC_GROUP_SEARCH_ENABLED: "true"
    RANGER__SYNC_GROUP_USER_MAP_SYNC_ENABLED: "true"
    RANGER__SYNC_GROUP_SEARCH_BASE: "ou=groups,dc=ldap,dc=example,dc=org"
    RANGER__SYNC_GROUP_OBJECT_CLASS: "groupOfNames"
  securityContext:
    # Optionally configure a security context for the ranger usersync container

User synchronization configuration properties#

| Node name | Description |
| --- | --- |
| usersync.enabled | Enables or disables the user synchronization feature |
| usersync.name | Name of the pod |
| usersync.tls.truststore.secret | Name of the secret created from the truststore. Required if you use TLS for usersync. |
| usersync.tls.truststore.password | Password for the truststore. Required if you use TLS for usersync. |
| usersync.env | A map of Ranger configuration variables related to user synchronization |
| usersync.env.RANGER__SYNC_LDAP_URL | URL of the LDAP server |
| usersync.env.RANGER__SYNC_LDAP_BIND_DN | Distinguished name (DN) string used to bind for the LDAP connection |
| usersync.env.RANGER__SYNC_LDAP_BIND_PASSWORD | Password for the bind DN used for the LDAP connection |
| usersync.env.RANGER__SYNC_LDAP_SEARCH_BASE | Search base in the LDAP directory |
| usersync.env.RANGER__SYNC_LDAP_USER_SEARCH_BASE | User information search base in the LDAP directory |
| usersync.env.RANGER__SYNC_LDAP_USER_OBJECT_CLASS | Object class for users |
| usersync.env.RANGER__SYNC_GROUP_SEARCH_ENABLED | Enables or disables group search |
| usersync.env.RANGER__SYNC_GROUP_USER_MAP_SYNC_ENABLED | Enables or disables synchronization of the group-user mapping |
| usersync.env.RANGER__SYNC_GROUP_SEARCH_BASE | Group information search base in the LDAP directory |
| usersync.env.RANGER__SYNC_GROUP_OBJECT_CLASS | Object class for groups, typically groupOfNames for OpenLDAP or group for Active Directory |

Use the following steps to enable TLS with the LDAP server:

  • Create a truststore file named truststore.jks that contains the CA certificate of the LDAP server

  • Create a Kubernetes secret ldap-cert from the truststore file

    kubectl create secret generic ldap-cert --from-file truststore.jks
    
  • Update values to reflect the secret name in the tls section

  • Update truststore password in the tls section

    tls:
      enabled: true
      truststore:
        secret: ldap-cert
        password: "truststore password"
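
Put together, the TLS settings sit inside the usersync block. The following sketch mirrors the snippet in the steps above and additionally assumes that the LDAP server accepts LDAPS connections on the standard port 636. The URL and the password placeholder are illustrative only:

```yaml
usersync:
  enabled: true
  tls:
    enabled: true
    truststore:
      # Secret created with kubectl from truststore.jks
      secret: ldap-cert
      password: "<truststore-password>"
  env:
    # Assumption: the LDAP server from the default values, reached over LDAPS
    RANGER__SYNC_LDAP_URL: "ldaps://ranger-ldap:636"
```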
    

Internal backing database server#

For testing, you can use the database block to configure a PostgreSQL database located within the cluster, created by the chart, as the backend for Ranger policy storage.

Note

Alternatively, for production usage, you can use an external PostgreSQL database, which you must manage yourself.

The following YAML nodes are provided by default for configuring the internal backing database:

database:
  type: "internal"
  internal:
    image:
      repository: "library/postgres"
      tag: "10.6"
      pullPolicy: "IfNotPresent"
    volume:
      persistentVolumeClaim:
        storageClassName:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "2Gi"
    resources:
      requests:
        memory: "1Gi"
        cpu: 2
      limits:
        memory: "1Gi"
        cpu: 2
    port: 5432
    databaseName: "ranger"
    databaseUser: "ranger"
    databasePassword: "RangerPass123"
    databaseRootUser: "rangeradmin"
    databaseRootPassword: "RangerAdminPass123"
    securityContext: {}
    envFrom: []
    env: []

Internal backing database server configuration properties#

| Node name | Description |
| --- | --- |
| database.type | Set to internal to use a database in the k8s cluster, managed by the chart |
| database.internal.image | Docker container image used for the PostgreSQL server |
| database.internal.volume | Storage volume to persist the database. The default configuration requests a new persistent volume (PV). |
| database.internal.volume.persistentVolumeClaim | The default configuration, which requests a new persistent volume (PV). |
| database.internal.volume.existingVolumeClaim | Alternative volume configuration, which uses an existing volume claim by referencing its name as the value in quotes, for example "my_claim". |
| database.internal.volume.emptyDir | Alternative volume configuration, which configures an empty directory on the pod. Keep in mind that a pod replacement loses the database content. |
| database.internal.resources | CPU and memory requests and limits for the internal PostgreSQL container |
| database.internal.databaseName | Name of the internal database |
| database.internal.databaseUser | User to connect to the internal database |
| database.internal.databasePassword | Password to connect to the internal database |
| database.internal.databaseRootUser | User to administrate the internal database for creating and updating tables and similar operations. |
| database.internal.databaseRootPassword | Password for the administrator to connect to the internal database |
| database.internal.envFrom | YAML sequence of mappings to define a Secret or ConfigMap as a source of environment variables for the internal PostgreSQL container. |
| database.internal.env | YAML sequence of mappings, each with the two keys name and value, to define environment variables for the internal PostgreSQL container. |
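
The alternative volume settings described above could look like the following sketch. The claim name my_claim is the example value from the description and is assumed to refer to an existing PersistentVolumeClaim in the same namespace:

```yaml
database:
  type: "internal"
  internal:
    volume:
      # Reuse an existing claim instead of requesting a new persistent volume
      existingVolumeClaim: "my_claim"
      # Or, for throwaway test setups, use an empty directory instead.
      # The database content is lost when the pod is replaced.
      # emptyDir: {}
```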

Examples#

OpenShift deployments often do not have access to pull from the default Docker registry library/postgres. You can replace it with an image from the Red Hat registry, which requires additional environment variables set with the parameter database.internal.env:

database:
  type: internal
  internal:
    image:
      repository: "registry.redhat.io/rhscl/postgresql-96-rhel7"
      tag: "latest"
    env:
      - name: POSTGRESQL_DATABASE
        value: "hive"
      - name: POSTGRESQL_USER
        value: "hive"
      - name: POSTGRESQL_PASSWORD
        value: "HivePass1234"

Another option is to create a Secret, for example postgresql-secret, that contains the variables needed by PostgreSQL from the previous code block, and pass it to the container with the envFrom parameter:

database:
  type: internal
  internal:
    image:
      repository: "registry.redhat.io/rhscl/postgresql-96-rhel7"
      tag: "latest"
    envFrom:
      - secretRef:
          name: postgresql-secret
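
The Secret referenced by envFrom is not created by the snippet above. A minimal sketch of such a Secret, reusing the variable names and example values from the env example, might look like this. It is a standard Kubernetes manifest that you would create in the namespace you deploy to:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgresql-secret
type: Opaque
# stringData accepts plain-text values; Kubernetes stores them base64-encoded
stringData:
  POSTGRESQL_DATABASE: "hive"
  POSTGRESQL_USER: "hive"
  POSTGRESQL_PASSWORD: "HivePass1234"
```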

External backing database server#

This section shows the empty default setup for using an external PostgreSQL database. You must provide the necessary details for the external server and ensure that it can be reached from the pods in the k8s cluster.

database:
  type: "external"
  external:
    port:
    host:
    databaseName:
    databaseUser:
    databasePassword:
    databaseRootUser:
    databaseRootPassword:

External backing database server configuration properties#

| Node name | Description |
| --- | --- |
| database.type | Set to external to use a database managed externally |
| database.external.port | Port to access the external database |
| database.external.host | Host of the external database |
| database.external.databaseName | Name of the database |
| database.external.databaseUser | User to connect to the database. If the user does not already exist, it is automatically created during installation. |
| database.external.databasePassword | Password to connect to the database |
| database.external.databaseRootUser | The existing root user to administrate the external database. It is used to create and update tables and similar operations. |
| database.external.databaseRootPassword | Password for the administrator to connect to the external database |
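
For orientation, a filled-in external configuration might look like the following sketch. The host name, the user names, and the password placeholders are hypothetical and only illustrate which value goes where:

```yaml
database:
  type: "external"
  external:
    # Hypothetical host; it must be reachable from the cluster
    host: "postgresql.example.com"
    port: 5432
    databaseName: "ranger"
    # Created during installation if it does not exist yet
    databaseUser: "ranger"
    databasePassword: "<ranger-user-password>"
    # Must already exist on the external server
    databaseRootUser: "<existing-root-user>"
    databaseRootPassword: "<root-user-password>"
```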

Additional volumes#

Additional volumes can be necessary for storing and accessing persisted files. They can be defined in the additionalVolumes section. None are defined by default:

additionalVolumes: []

You can add one or more volumes of any type supported by k8s to all nodes in the cluster.

If you specify only path, a directory with the name given in path is created. When mounting a ConfigMap or Secret, a file is created in this directory for each key.

An optional subPath parameter takes the name of a key in the ConfigMap or Secret volume you create. If you specify subPath, the key named in subPath is mounted from the ConfigMap or Secret as a file with the name provided by path.

The following example snippet shows both use cases:

additionalVolumes:
  - path: /mnt/InContainer
    volume:
      emptyDir: {}
  - path: /tmp/config.txt
    subPath: config.txt
    volume:
      configMap:
        name: "configmap-in-volume"
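
The second volume in the example expects a ConfigMap named configmap-in-volume with a key config.txt. A minimal sketch of such a ConfigMap follows. The file content is a made-up placeholder:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: configmap-in-volume
data:
  # The key name matches the subPath value in additionalVolumes
  config.txt: |
    example.setting=value
```

With this ConfigMap in place, the key config.txt is mounted as the file /tmp/config.txt in the container.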

Exposing the cluster to outside network#

The expose section for Ranger works identically to the expose section for SEP. It exposes the Ranger user interface for configuring and managing policies outside the cluster.

Differences are isolated to the configured default values. The default type is clusterIp:

expose:
  type: "clusterIp"
  clusterIp:
    name: "ranger"
    ports:
      http:
        port: 6080

The following section shows the default values with an activated nodePort type:

expose:
  type: "nodePort"
  nodePort:
    name: "ranger"
    ports:
      http:
        port: 6080
        nodePort: 30680

The following section shows the default values with an activated loadBalancer type:

expose:
  type: "loadBalancer"
  loadBalancer:
    name: "ranger"
    IP: ""
    ports:
      http:
        port: 6080
    annotations: {}
    sourceRanges: []

The following section shows the default values with an activated ingress type:

expose:
  type: "ingress"
  ingress:
    tls:
      enabled: true
      secretName:
    host:
    path: "/"
    annotations: {}
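
The ingress defaults leave secretName and host empty. A filled-in sketch could look like the following, where the host name and the TLS secret name are hypothetical and the secret is assumed to already exist in the namespace:

```yaml
expose:
  type: "ingress"
  ingress:
    tls:
      enabled: true
      # Hypothetical name of an existing TLS secret
      secretName: "ranger-tls"
    # Hypothetical external host name for the Ranger user interface
    host: "ranger.example.com"
    path: "/"
    annotations: {}
```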

Datasources#

The datasources section lists the SEP datasources used to configure Ranger services. It is mounted as the file /config/datasources.yaml inside the container and processed by the init script.

datasources:
  - name: "fake-starburst-1"
    host: "starburst.fake-starburst-1-namespace"
    port: 8080
    username: "starburst_service1"
    password: "Password123"
  - name: "fake-starburst-2"
    host: "starburst.fake-starburst-2-namespace"
    port: 8080
    username: "starburst_service2"
    password: "Password123"

Server start up configuration#

You can create a startup shell script to customize how Ranger is started, and pass additional arguments to it.

The script receives the container name as an input parameter. Possible values are ranger-admin and ranger-usersync. Additional arguments can be configured with extraArguments.

initFile:
extraArguments:

Extra secrets#

You can configure additional secrets that are mounted in the /extra-secret/ path on each container.

extraSecret:
  # Replace this with secret name that should be used from namespace you are deploying to
  name:
  # Optionally 'file' may be provided which will be deployed as secret with given 'name' in used namespace.
  file:

Node assignment#

You can configure your cluster to determine the node and pod to use for the Ranger server:

nodeSelector: {}
tolerations: []
affinity: {}

Our SEP configuration documentation contains examples and resources to help you configure these YAML nodes.
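
As an illustration of these three nodes, the following sketch schedules Ranger only on nodes that carry a particular label and lets it tolerate a matching taint. The label key workload and the taint key dedicated, along with their values, are hypothetical and depend on how your nodes are labeled and tainted:

```yaml
# Schedule only on nodes labeled workload=ranger (hypothetical label)
nodeSelector:
  workload: "ranger"
# Allow scheduling on nodes tainted with dedicated=ranger:NoSchedule (hypothetical taint)
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "ranger"
    effect: "NoSchedule"
affinity: {}
```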

Annotations#

You can add configuration to annotate the deployment and pod:

deploymentAnnotations: {}
podAnnotations: {}

Security context#

You can optionally configure security contexts to define privilege and access control settings for the Ranger containers. You can separately configure the security context for Ranger admin, usersync and database containers.

securityContext:

If you do not want to set the securityContext for the default service account, you can restrict it by configuring the service account for the Ranger pod.

For a restricted environment like OpenShift, you may need to set the AUDIT_WRITE capability for Ranger usersync:

securityContext:
  capabilities:
    add:
      - AUDIT_WRITE
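
Because the security context is configured separately per container, the capability above belongs to the usersync container. A sketch of the placement, assuming the securityContext node inside the usersync block shown earlier:

```yaml
usersync:
  securityContext:
    capabilities:
      add:
        # Capability needed by Ranger usersync in restricted environments
        - AUDIT_WRITE
```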

Additionally, OpenShift clusters need the anyuid and privileged security context constraints set for the service account used by Ranger. For example:

oc create serviceaccount <k8s-service-account>
oc adm policy add-scc-to-user anyuid system:serviceaccount:<k8s-namespace>:<k8s-service-account>

Service account#

You can configure a service account for the Ranger pod using:

serviceAccountName:

Environment variables#

You can pass environment variables to the Ranger container using the same mechanism used for the internal database:

envFrom: []
env: []

Both are specified as sequences of mappings, for example:

envFrom:
  - secretRef:
      name: my-secret-with-vars
env:
  - name: MY_VARIABLE
    value: some-value