EKS cluster creation #

AWS requires EKS clusters to span a minimum of two availability zones (AZs), but Starburst Enterprise platform (SEP) best practices require that the coordinator and workers reside in a single AZ, in a single subnet.

This document covers how to configure your cluster to ensure that all SEP resources are co-located and follow best practices.

Prerequisites #

The following tools and policies are required to create an SEP cluster in EKS:

  • kubectl
  • eksctl version 0.54.0 or later
  • IAM policies for Glue and S3, as desired
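
Before going further, it can help to confirm that the installed tools meet the requirements above. The following is a minimal sketch, not an official check: the version_ge helper is a hypothetical name introduced here, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Succeeds when version $2 is greater than or equal to the required version $1.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Check eksctl only when it is installed.
if command -v eksctl >/dev/null 2>&1; then
  installed="$(eksctl version)"
  if version_ge "0.54.0" "$installed"; then
    echo "eksctl $installed meets the 0.54.0 minimum"
  else
    echo "eksctl $installed is older than 0.54.0; upgrade before continuing"
  fi
else
  echo "eksctl not found on PATH"
fi

# Print the kubectl client version; no cluster connection is needed for this.
if command -v kubectl >/dev/null 2>&1; then
  kubectl version --client
else
  echo "kubectl not found on PATH"
fi
```

The helper compares version strings numerically by component, so 0.100.0 is correctly treated as newer than 0.54.0, which a plain string comparison would get wrong.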

Create your sep_eks_cluster.yaml file #

Your YAML file should start with the following two lines:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

Next, add the metadata: section to describe your cluster. The following example shows the minimum required fields, as well as suggested tags for a staging cluster running in us-east-2:

metadata:
  name: my-sep-cluster
  region: us-east-2
  version: "1.20"
  tags:
    cloud: aws
    environment: staging
    info: "EKS cluster for Starburst Enterprise staging environment"
    user: your.username

Specify your networking #

Add a vpc: section that lists the two required AZs and the ID of the existing private subnet in each:

vpc:
  subnets:
    private:
      us-east-2a:
        id: subnet-0Subnet1ID8String2
      us-east-2b:
        id: subnet-0Subnet0ID2String3

For the purposes of this example, SEP is set up in the us-east-2a AZ.

Create your EKS managedNodeGroups: #

Starburst recommends using managedNodeGroups: to create the pools of instances available to SEP and its associated services. In EKS, managedNodeGroups: have the additional benefit of automatically delivering SIGTERM to the SEP coordinator and workers when a Spot instance is removed, which enables graceful shutdown. With nodeGroups:, additional development must be done outside of SEP to allow for graceful shutdown.

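In this example, a managed node group called sep_support_services is created alongside the sep managed node group. The sep group is placed in us-east-2a and hosts the SEP coordinator and workers. The support services group is placed in us-east-2b and runs the Hive Metastore Service (HMS) and Ranger, if those services are required in your environment.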

The first three IAM policy ARNs, shown below for both groups, are required by EKS. The fourth ARN, for the EKS-S3-Glue policy, illustrates attaching an additional policy that allows access to EKS, S3, and Glue in the account used for the EKS cluster without supplying credentials. This fourth policy is optional.

managedNodeGroups:

  - name: sep
    tags:
      cloud: aws
      environment: staging
      info: "EKS cluster for Starburst Enterprise staging environment"
      user: your.username
    availabilityZones: [us-east-2a]
    labels:
      allow-workers: workers
    instanceTypes: ["m5.xlarge", "m5a.xlarge", "m5ad.xlarge"]
    desiredCapacity: 2
    minSize: 2
    maxSize: 4
    privateNetworking: true
    ssh:
      allow: true
      publicKeyName: en-field-key
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::47ID9String78:policy/EKS-S3-Glue

  - name: sep_support_services
    tags:
      cloud: aws
      environment: staging
      info: "EKS cluster for Starburst Enterprise staging environment"
      user: your.username
    availabilityZones: [us-east-2b]
    spot: true
    instanceTypes: ["m5.large", "m5a.large", "m5ad.large"]
    desiredCapacity: 1
    minSize: 1
    maxSize: 1
    privateNetworking: true
    ssh:
      allow: true
      publicKeyName: en-field-key
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::47ID9String78:policy/EKS-S3-Glue

Save your file and create your cluster #

When you are finished adding the required content to your sep_eks_cluster.yaml file, save it and use it to create your cluster with eksctl:

$ eksctl create cluster -f sep_eks_cluster.yaml
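
Before creating real resources, you may want to preview what eksctl would do with the file. The sketch below assumes that your eksctl version supports the --dry-run flag on create cluster, which prints the expanded configuration instead of creating the cluster. The looks_like_cluster_config helper is a hypothetical name introduced here for a basic sanity check on the file.

```shell
# Succeeds when the file exists and declares the eksctl ClusterConfig kind.
looks_like_cluster_config() {
  [ -f "$1" ] && grep -q '^kind: ClusterConfig$' "$1"
}

config="sep_eks_cluster.yaml"

if looks_like_cluster_config "$config" && command -v eksctl >/dev/null 2>&1; then
  # Assumption: --dry-run is available in your eksctl version. It skips
  # cluster and node group creation and prints the resulting configuration.
  eksctl create cluster -f "$config" --dry-run
else
  echo "skipping dry run: $config or eksctl is missing"
fi
```

Reviewing the dry-run output is a convenient point to confirm that the sep node group lists only one AZ under availabilityZones.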

When the command completes successfully, the following message appears in your terminal:

2021-07-14 14:28:56 [✓] EKS cluster "my-sep-cluster" in "us-east-2" region is ready
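
Once the cluster is ready, you can check that the nodes in the sep node group all landed in the intended AZ. This is a sketch under a few assumptions that are not part of the original example: that kubectl is pointed at the new cluster, that nodes carry the standard topology.kubernetes.io/zone label, and that EKS labels managed nodes with eks.amazonaws.com/nodegroup. The all_in_zone helper is a hypothetical name introduced here.

```shell
# Reads "NAME ZONE" lines on stdin. Succeeds only when at least one line is
# present and every ZONE equals the expected zone passed as $1.
all_in_zone() {
  awk -v zone="$1" '$2 != zone { bad = 1 } END { exit (bad || NR == 0) }'
}

if command -v kubectl >/dev/null 2>&1; then
  # Human-readable overview: every node with its AZ and node group.
  kubectl get nodes \
    -L topology.kubernetes.io/zone \
    -L eks.amazonaws.com/nodegroup || echo "could not list nodes"

  # Scripted check for the sep node group only.
  if kubectl get nodes -l eks.amazonaws.com/nodegroup=sep --no-headers \
       -o custom-columns='NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone' \
     | all_in_zone us-east-2a; then
    echo "all sep nodes are in us-east-2a"
  else
    echo "sep nodes are missing or not all in us-east-2a"
  fi
else
  echo "kubectl not found on PATH"
fi
```

The helper treats an empty node list as a failure, so a misconfigured kubectl context is not mistaken for a correctly placed node group.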