AWS Lake Formation access control support#
Note
AWS Lake Formation access control and security mapping support is a public preview feature. Contact Starburst support with questions or feedback.
Starburst Enterprise platform (SEP) provides support for using an existing AWS Lake Formation access control system.
Requirements#
In order to use AWS Lake Formation integration with Starburst Enterprise, you need:
An existing AWS Lake Formation configuration and AWS credentials that allow interacting with its API.
A valid Starburst Enterprise license.
Overview#
AWS Lake Formation provides a single place to manage access control policies. You can define security policies that restrict access to data at the database, table, column, row, and cell levels. These policies apply to AWS Identity and Access Management (IAM) users and roles, and to users and groups when federating through an external identity provider.
Starburst Enterprise platform (SEP) integration with AWS Lake Formation enforces AWS Lake Formation access control policies when accessing registered Amazon S3 data lake locations.
AWS Lake Formation access control support is only available for catalogs that use the Hive connector, because it relies on the security system of the Hive connector.
Configure AWS Lake Formation#
Each catalog that needs to be controlled with AWS Lake Formation must have its catalog properties file configured to use the lake-formation Hive security:
hive.security=lake-formation
The following is a more complex example of a catalog properties file that is configured to use AWS Lake Formation for authorization with the Hive connector.
connector.name=hive
hive.security=lake-formation
hive.metastore=glue
hive.metastore.glue.region=us-east-2
hive.metastore.glue.default-warehouse-dir=s3://data-lake-bucket
hive.metastore.glue.iam-role=arn:aws:iam::<account_id>:role/role_for_glue
hive.s3.iam-role=arn:aws:iam::<account_id>:role/role_for_s3
lake-formation.authorized-caller-tag=starburst-enterprise
lake-formation.security-mapping.config-file=etc/lakeformation-security-mapping.json
More information on lake formation security mapping can be found later in this topic.
Configuration properties#
Property | Description |
---|---|
lake-formation.authorized-caller-tag | The value of |
Lake formation security mapping#
SEP supports flexible security mapping for AWS Lake Formation, which associates SEP users or groups with AWS security entities such as IAM roles according to a JSON mapping file. The IAM role for a specific query can be selected from a list of allowed roles using the SHOW ROLE GRANTS FROM <catalog> and SET ROLE "..." IN <catalog> SQL statements.
Each security mapping entry may specify one or more match criteria. If multiple criteria are specified, all criteria must match. The following match criteria are available:
"user":
- Regular expression to match against the username. For example, alice|bob matches either of the SEP users “alice” and “bob”.
"group":
- Regular expression to match against any of the groups that the user belongs to. For example, finance|sales matches either the finance or sales groups in SEP.
Each set of SEP match criteria can be mapped to one or more of the following AWS security entities:
"iamRole":
- IAM role to use if no user-provided role is specified. This overrides any globally configured IAM role.
"roleSessionName":
- (Optional) Only valid when iamRole is specified. If roleSessionName includes the string ${USER}, then the ${USER} portion of the string is replaced with the current session’s username. If roleSessionName is not specified, it defaults to trino-session.
"allowedIamRoles":
- Comma-separated list of IAM roles that the specified AWS account users are limited to.
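The roleSessionName rule can be illustrated with a short Python sketch. The function name resolve_role_session_name is hypothetical and not part of SEP; only the ${USER} placeholder and the trino-session default come from the description above. Replacing every occurrence of the placeholder is an assumption.

```python
from typing import Optional

# Default session name stated in the roleSessionName description.
DEFAULT_SESSION_NAME = "trino-session"


def resolve_role_session_name(template: Optional[str], username: str) -> str:
    """Hypothetical helper that mimics the documented ${USER} substitution."""
    if template is None:
        # No roleSessionName configured: fall back to the documented default.
        return DEFAULT_SESSION_NAME
    # Assumption: every literal ${USER} occurrence is replaced.
    return template.replace("${USER}", username)


print(resolve_role_session_name("sep-${USER}", "alice"))
print(resolve_role_session_name(None, "alice"))
```

A per-user session name of this kind can make it easier to attribute AWS-side activity to an individual SEP user.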
The security mapping entries are processed in the order listed in the JSON mapping, so specify more specific mapping entries before less specific ones. For example, the mapping list might have a "group": entry for “salesnorth” followed by an entry for “sales”. This applies a more specific Lake Formation security mapping to the north sales team before a broader security mapping is applied to the whole sales department.
You can set a default mapping by adding an entry to the end of the file that does not specify any SEP match criteria. If no mapping entry matches and no default is configured, access is denied with a “Cannot set role NONE” error.
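The ordering and default rules can be sketched in Python as follows. This is an illustration only, not SEP's implementation: the function names are hypothetical, and two behaviors are assumptions, namely that each regular expression must match the whole username or group name, and that the first matching entry is the one used.

```python
import re
from typing import Iterable, Optional


def entry_matches(entry: dict, user: str, groups: Iterable[str]) -> bool:
    """Return True when every criterion present in the entry matches."""
    user_pattern = entry.get("user")
    if user_pattern is not None and not re.fullmatch(user_pattern, user):
        return False
    group_pattern = entry.get("group")
    if group_pattern is not None and not any(
        re.fullmatch(group_pattern, group) for group in groups
    ):
        return False
    # An entry without any criteria acts as the default and matches everyone.
    return True


def select_mapping(mappings: list, user: str, groups: Iterable[str]) -> Optional[dict]:
    """Walk the entries in file order and return the first one that matches."""
    group_list = list(groups)
    for entry in mappings:
        if entry_matches(entry, user, group_list):
            return entry
    return None  # no match and no default entry: access would be denied


# Specific entry first, broader entry second, default entry last.
MAPPINGS = [
    {"group": "salesnorth", "iamRole": "arn:aws:iam::123456789101:role/sales_north_users"},
    {"group": "sales", "iamRole": "arn:aws:iam::123456789101:role/sales_all_users"},
    {"iamRole": "arn:aws:iam::123456789101:role/default"},
]

print(select_mapping(MAPPINGS, "dana", ["salesnorth", "sales"])["iamRole"])
print(select_mapping(MAPPINGS, "erin", ["engineering"])["iamRole"])
```

Under these assumptions, swapping the first two entries would send members of both groups to the broader sales role, which is why the specific entry is listed first.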
The JSON mapping can be retrieved either from a file or from a REST endpoint specified via the lake-formation.security-mapping.config-file configuration property.
The following example JSON mapping applies SEP user and group mappings to security entities in AWS Lake Formation:
{
  "mappings": [
    {
      "user": "bob|charlie",
      "iamRole": "arn:aws:iam::123456789101:role/test_default",
      "allowedIamRoles": [
        "arn:aws:iam::123456789101:role/test_default",
        "arn:aws:iam::123456789101:role/test1",
        "arn:aws:iam::123456789101:role/test2",
        "arn:aws:iam::123456789101:role/test3"
      ]
    },
    {
      "user": "salesnorth",
      "iamRole": "arn:aws:iam::123456789101:role/sales_north_users"
    },
    {
      "group": "sales*",
      "iamRole": "arn:aws:iam::123456789101:role/sales_all_users"
    },
    {
      "iamRole": "arn:aws:iam::123456789101:role/default"
    }
  ]
}
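Because a single missing comma makes the whole mapping unreadable, it can help to check a mapping file before deploying it. The following Python sketch is a hypothetical pre-deployment check, not an SEP tool. It only tests rules stated in this topic: the document must be valid JSON, roleSessionName is only valid together with iamRole, and a default entry without match criteria belongs at the end.

```python
import json

# Keys that count as SEP match criteria in this topic.
CRITERIA_KEYS = ("user", "group")


def lint_mappings(text: str) -> list:
    """Return a list of human-readable problems found in a mapping document."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as error:
        return [f"invalid JSON: {error.msg} at line {error.lineno}"]
    if not isinstance(document, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    entries = document.get("mappings", [])
    for index, entry in enumerate(entries):
        if "roleSessionName" in entry and "iamRole" not in entry:
            problems.append(f"entry {index}: roleSessionName requires iamRole")
        is_default = not any(key in entry for key in CRITERIA_KEYS)
        if is_default and index != len(entries) - 1:
            problems.append(f"entry {index}: default entry is not last")
    return problems


sample = (
    '{"mappings": ['
    '{"iamRole": "arn:aws:iam::123456789101:role/default"}, '
    '{"group": "sales", "roleSessionName": "sep-${USER}"}'
    "]}"
)
for problem in lint_mappings(sample):
    print(problem)
```

The sample deliberately breaks both rules, so the check reports one problem for each entry.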
Security mapping configuration properties#
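Property | Description |
---|---|
lake-formation.security-mapping.config-file | Path and filename of the JSON mapping file, or REST-endpoint URI containing security mappings. |
lake-formation.security-mapping.refresh-period | How often to refresh the security mapping configuration. The example after this table uses 5m. |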
The following example shows the lake formation security mapping configuration properties:
lake-formation.authorized-caller-tag=starburst-enterprise
lake-formation.security-mapping.config-file=etc/example-lake-formation-security-mapping.json
lake-formation.security-mapping.refresh-period=5m
Security mapping role requirements#
AWS Lake Formation permissions are read from AWS using two different sets of impersonated role credentials when executing queries against catalogs protected by AWS Lake Formation security policies:
admin
- Identified by the hive.metastore.glue.iam-role configuration property.
user
- Selected according to the security mapping rules.
You must configure the following in AWS IAM:
The admin role must be configured to impersonate the user role in AWS Trust Relationships. The sts:TagSession and sts:AssumeRole actions must both be allowed.
The glue:GetDatabases, glue:GetDatabase, glue:GetTables, glue:GetTable, glue:GetPartition, and glue:BatchGetPartition AWS Glue API permissions must be granted for the user role.
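The IAM requirements above can be pictured as two policy documents attached to the user role. The Python sketch below builds them as dictionaries and prints them as JSON. It is an illustration under stated assumptions: the helper names are hypothetical, the admin role ARN reuses the placeholder from the catalog example, and the "*" resource is a simplification that you would normally narrow to specific Glue resources.

```python
import json

# Placeholder ARN taken from the catalog properties example in this topic.
ADMIN_ROLE_ARN = "arn:aws:iam::<account_id>:role/role_for_glue"

# Glue API actions listed in the requirements for the user role.
GLUE_READ_ACTIONS = [
    "glue:GetDatabases",
    "glue:GetDatabase",
    "glue:GetTables",
    "glue:GetTable",
    "glue:GetPartition",
    "glue:BatchGetPartition",
]


def user_role_trust_policy(admin_role_arn: str) -> dict:
    """Trust relationship on the user role that lets the admin role assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": admin_role_arn},
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }


def user_role_glue_policy(resource: str = "*") -> dict:
    """Permissions policy granting the listed Glue read actions.

    Assumption: "*" is used only to keep the sketch short.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": GLUE_READ_ACTIONS, "Resource": resource}
        ],
    }


print(json.dumps(user_role_trust_policy(ADMIN_ROLE_ARN), indent=2))
print(json.dumps(user_role_glue_policy(), indent=2))
```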
Listing and selecting available user roles#
The following SQL statements are available to list and set roles:
SHOW ROLE GRANTS FROM <catalog_name>
- Lists AWS roles available to the user in a catalog protected by Lake Formation.
SHOW CURRENT ROLES IN <catalog_name>
- Returns currently enabled roles.
SET ROLE "arn:iam::..." IN <catalog_name>
- Selects a specific role.
SET ROLE NONE IN <catalog_name>
- Causes the role to default to the role defined by iamRole in the security mapping configuration, if it is set.
Use the roles=<catalog_name>:arn:iam::... connection property to select a specific role for a JDBC connection.
Caching#
To make permission checks run as fast as possible, Lake Formation access control caches permission data for each user. By default, permissions are stored for a maximum of 1000 users and expire after 10 minutes. In the following example, the permissions cache size is reduced to 100 users, and the permissions are set to expire after two hours:
lake-formation.cache-ttl=2h
lake-formation.cache-size=100
Property | Description |
---|---|
lake-formation.cache-ttl | Time duration for which to store Lake Formation permission data for each user. |
lake-formation.cache-size | Maximum number of users for which to store Lake Formation permission data. |
If needed, the cache can be manually cleared using a SQL procedure call in the catalog:
CALL system.flush_access_control_cache()
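To show how the two cache settings and the flush procedure interact, here is a conceptual Python sketch of a per-user cache with a time-to-live and a size limit. The PermissionCache class is hypothetical and does not reflect SEP internals. The defaults mirror the documented 10 minutes and 1000 users, while evicting the oldest entry once the limit is exceeded is an assumption, because the eviction policy is not described here.

```python
import time
from collections import OrderedDict


class PermissionCache:
    """Hypothetical per-user cache bounded by entry age and entry count."""

    def __init__(self, ttl_seconds=600.0, max_users=1000, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max_users = max_users
        self._clock = clock
        self._entries = OrderedDict()  # user -> (expires_at, permissions)

    def get(self, user):
        item = self._entries.get(user)
        if item is None:
            return None
        expires_at, permissions = item
        if self._clock() >= expires_at:
            # Entry outlived the TTL: drop it so the caller reloads from AWS.
            del self._entries[user]
            return None
        return permissions

    def put(self, user, permissions):
        self._entries.pop(user, None)
        self._entries[user] = (self._clock() + self._ttl, permissions)
        while len(self._entries) > self._max_users:
            # Assumption: evict the entry that was stored longest ago.
            self._entries.popitem(last=False)

    def flush(self):
        """Conceptual counterpart of the manual cache flush."""
        self._entries.clear()


# Two-hour TTL as in the example above, with a tiny size limit for the demo.
cache = PermissionCache(ttl_seconds=2 * 60 * 60, max_users=2)
cache.put("alice", {"SELECT"})
cache.put("bob", {"SELECT"})
cache.put("carol", {"SELECT"})
print(cache.get("alice"))
print(cache.get("carol"))
```

A longer TTL reduces calls to AWS but delays the effect of permission changes, which is when a manual flush is useful.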
Limitations#
The following are not supported:
DML or DDL operations in AWS Lake Formation.