EKS L2 Re-write

December 10, 2024 · View on GitHub

Original Author(s): @xazhao
Tracking Issue: #605
API Bar Raiser: @iliapolo

The eks-v2-alpha module is a rewrite of the existing aws-eks module. This new iteration leverages native L1 CFN resources, replacing the previous custom resource approach for creating EKS clusters and Fargate Profiles. Beyond the resource types change, the module introduces several other breaking changes designed to better align the L2 construct with its updated implementation. These modifications collectively enhance the module's functionality and integration within the AWS ecosystem, providing a more robust and streamlined solution for managing Elastic Kubernetes Service (EKS) resources.

This RFC primarily focus on API changes in the new module. The project plan about the new module e.g. how does it co-exists with the current EKS module is out of scope and will be decided later.

Comprehensive use cases and examples will be available in the README file of the forthcoming eks-v2-alpha module.

Compared to the original EKS module, it has the following major changes:

Use native L1 AWS::EKS::Cluster resource to replace custom resource Custom::AWSCDK-EKS-Cluster
Use native L1 AWS::EKS::FargateProfile resource to replace custom resource Custom::AWSCDK-EKS-FargateProfile
Kubectl Handler will not be created by default. It will only be created if users specify it.
Remove AwsAuth construct. Permissions to the cluster will be managed by Access Entry.
Remove the limit of 1 cluster per stack
Remove nested stacks
API changes to make them more ergonomic.

With the new EKS module, customers can deploy an EKS cluster without custom resources.

Working Backwards

Readme

Note: Full readme is too long for this RFC. This readme is simplified version that only focus on use cases that are different from the original EKS module. Full readme will be published in the alpha module.

This library is a rewrite of existing EKS module including breaking changes to address some pain points on the existing EKS module including:

Can't deploy EKS cluster without custom resources
The stack uses nested stacks
Can't create multiple cluster per stack
Can't use escape hatches

It allows you to define Amazon Elastic Container Service for Kubernetes (EKS) clusters. In addition, the library also supports defining Kubernetes resource manifests within EKS clusters.

Quick start

Here is the minimal example of defining an AWS EKS cluster

import * as eks from '@aws-cdk/aws-eks-v2-alpha';

const cluster = new eks.Cluster(this, 'hello-eks', {
  version: eks.KubernetesVersion.V1_31,
});

Architecture

 +-----------------------------------------------+
 | EKS Cluster      | kubectl |  |
 | -----------------|<--------+| Kubectl Handler |
 | AWS::EKS::Cluster             (Optional)      |
 | +--------------------+    +-----------------+ |
 | |                    |    |                 | |
 | | Managed Node Group |    | Fargate Profile | |
 | |                    |    |                 | |
 | +--------------------+    +-----------------+ |
 +-----------------------------------------------+
    ^
    | connect self managed capacity
    +
 +--------------------+
 | Auto Scaling Group |
 +--------------------+

In a nutshell:

EKS Cluster - The cluster endpoint created by EKS.
Managed Node Group - EC2 worker nodes managed by EKS.
Fargate Profile - Fargate worker nodes managed by EKS.
Auto Scaling Group - EC2 worker nodes managed by the user.
Kubectl Handler (Optional) - Custom resource (i.e Lambda Function) for invoking kubectl commands on the cluster - created by CDK

Provisioning cluster

Creating a new cluster is done using the Cluster constructs. The only required property is the kubernetes version.

new eks.Cluster(this, 'HelloEKS', {
  version: eks.KubernetesVersion.V1_31,
});

You can also use FargateCluster to provision a cluster that uses only fargate workers.

new eks.FargateCluster(this, 'HelloEKS', {
  version: eks.KubernetesVersion.V1_31,
});

Note: Unlike the previous EKS cluster, Kubectl Handler will not be created by default. It will only be deployed when kubectlProviderOptions property is used.

new eks.Cluster(this, 'hello-eks', {
  version: eks.KubernetesVersion.V1_31,
  # Using this property will create `Kubectl Handler` as custom resource handler
  kubectlProviderOptions: {
    kubectlLayer: new KubectlV31Layer(this, 'kubectl'),
  }
});

VPC Support

You can specify the VPC of the cluster using the vpc and vpcSubnets properties:

declare const vpc: ec2.Vpc;
new eks.Cluster(this, 'HelloEKS', {
  version: eks.KubernetesVersion.V1_31,
  vpc,
  vpcSubnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
});

If you do not specify a VPC, one will be created on your behalf, which you can then access via cluster.vpc. The cluster VPC will be associated to any EKS managed capacity (i.e Managed Node Groups and Fargate Profiles).

The cluster can be placed inside an isolated VPC. The cluster’s VPC subnets must have a VPC interface endpoint for any AWS services that your Pods need access to. See https://docs.aws.amazon.com/eks/latest/userguide/private-clusters.html

const vpc = new ec2.Vpc(this, 'vpc', {
    subnetConfiguration: [
        {
            cidrMask: 24,
            name: 'Private',
            subnetType: ec2.SubnetType.PRIVATE_ISOLATED,
        },
    ]
});

const cluster = new eks.Cluster(this, 'Mycluster', {
    version: eks.KubernetesVersion.V1_31,
    vpc,
    vpcSubnets: [{ subnetType: ec2.SubnetType.PRIVATE_ISOLATED }]
});

Kubectl Support

You can choose to have CDK create a Kubectl Handler - a Python Lambda Function to apply k8s manifests using kubectl apply. This handler will not be created by default.

To create a Kubectl Handler, use kubectlProviderOptions when creating the cluster. kubectlLayer is the only required property in kubectlProviderOptions.

new eks.Cluster(this, 'hello-eks', {
  version: eks.KubernetesVersion.V1_31,
  # Using this property will create `Kubectl Handler` as custom resource handler
  kubectlProviderOptions: {
    kubectlLayer: new KubectlV31Layer(this, 'kubectl'),
  }
});

`Kubectl Handler` created along with the cluster will be granted admin permissions to the cluster.

Permissions and Security

Amazon EKS supports three modes of authentication: CONFIG_MAP, API_AND_CONFIG_MAP, and API. ConfigMap authentication mode has been deprecated by EKS and the recommended mode is API. The new EKS L2 will go a step further and only support API authentication mode. All grant functions in EKS will use Access Entry to grant permissions to an IAM role/user.

As a result, AwsAuth construct in the previous EKS module will not be provided in the new module. grant() functions are introduced to replace the awsAuth. It’s implemented using Access Entry.

Grant Cluster Admin Access to an IAM role

cluster.grantClusterAdmin('adminAccess', role);

You can also use general grant() to attach a policy to an IAM role/user. See https://docs.aws.amazon.com/eks/latest/userguide/access-policies.html for all access policies

For example, grant Admin Access to a namespace using general grant() method

cluster.grant('namespaceAdmin', role, [
  eks.AccessPolicy.fromAccessPolicyName('AmazonEKSAdminPolicy', {
    accessScopeType: eks.AccessScopeType.NAMESPACE,
    namespaces: ['foo', 'bar'],
  }),
])

Use existing Cluster/Kubectl Handler

This module allows defining Kubernetes resources such as Kubernetes manifests and Helm charts on clusters that are not defined as part of your CDK app.

There are 3 scenarios here:

Import the cluster without creating a new kubectl Handler

const cluster = eks.Cluster.fromClusterAttributes(this, 'MyCluster', {
  clusterName: 'my-cluster-name',
});

This imported cluster is not associated with a Kubectl Handler. It means we won't be able to invoke addManifest() function on the cluster.

Import the cluster and the kubectl Handler

const kubectlProvider = eks.KubectlProvider.fromKubectlProviderArn(this, 'KubectlProvider', 'function-arn');

const cluster = eks.Cluster.fromClusterAttributes(this, 'MyCluster', {
  clusterName: 'my-cluster',
  kubectlProvider
});

Import the cluster and create a new kubectl Handler

const kubectlProvider = new eks.KubectlProvider(this, 'KubectlProvier', {
  clusterName: 'my-cluster',
});

const cluster = eks.Cluster.fromClusterAttributes(this, 'MyCluster', {
  clusterName: 'my-cluster',
  kubectlProvider
});

Migration Guide

This is a general guideline. After migrating to the new construct, run cdk diff to make sure no unexpected changes. If you encounter any issues during the migration, please open a Github issue at CDK repo

Prerequisite: Exposed removal policy in the current EKS Cluster construct so that customers can remove the cluster definition from CDK app without actual deleting the cluster.

Tracking issue: https://github.com/aws/aws-cdk/issues/25544

Due to the fact that switching from a custom resource (Custom::AWSCDK-EKS-Cluster) to a native L1 (AWS::EKS::Cluster) resource requires cluster replacement, CDK users who need to preserve their cluster will have to take additional actions.

Set removal policy to RETAIN on the existing cluster and deploy (this feature will be added later).

To make sure the cluster is not being deleted, set the removal policy to RETAIN on the cluster. It will keep EKS related resources from being deleted when we clean up previous EKS constructs in the stack.
```
 new eks.Cluster(this, 'hello-eks', {
   ...
   removalPolicy: RemovalPolicy.RETAIN,
 });
```
Change the authentication mode of cluster from CONFIG_MAP to API.

Since the new EKS module will only support API authentication mode, you will need to migrate your cluster to API mode.

This is a two steps change. First you need to change CONFIG_MAP to API_AND_CONFIG_MAP to enable access entry. Then for all mappings in aws-auth ConfigMap, you can migrate to access entries. After this migration is done, change API_AND_CONFIG_MAP to API to disable ConfigMap.
Remove cluster definition from their CDK app and deploy.
Add new cluster definition using the new constructs(eks-v2-alpha).
User cdk import to import the existing cluster as the new definition. cdk import will ask you for id/arn/name for EKS related resources. It may include following:
- AWS::EKS::Cluster
- AWS::EKS::FargateProfile
- AWS::EKS::Nodegroup
- AWS::EKS::AccessEntry
After cdk import, running cdk diff to see if there's any unexpected changes.

Public FAQ

What are we launching today?

We’re launching a new EKS alpha module aws-eks-v2-alpha. It's a rewrite of existing aws-eks module with some breaking changes to address pain points in aws-eks module.

Why should I use this feature?

The new EKS alpha module has following benefits:

faster deployment
option to not use custom resource
remove limitations on the previous EKS module (isolated VPC, 1 cluster limit per stack etc)

Internal FAQ

Why are we doing this?

This feature has been highly requested by the community since Feb 2023. The current implementation using custom resource has some limitations and is harder to maintain. We can also use this chance to solve some major pain points in the current EKS L2.

Issues will be solved with the new module:

https://github.com/aws/aws-cdk/issues/24059 (Custom Resource)
https://github.com/aws/aws-cdk/issues/25544 (Custom Resource related)
https://github.com/aws/aws-cdk/issues/24174 (Custom Resource related)
https://github.com/aws/aws-cdk/issues/19753 (ConfigMap)
https://github.com/aws/aws-cdk/issues/19218 (ConfigMap)
https://github.com/aws/aws-cdk/issues/31942 (One cluster per stack limit)

readonly version: KubernetesVersion;
readonly vpc: ec2.IVpc;
readonly vpcSubnets: ec2.SubnetSelection[];
readonly albController?: AlbController;
readonly clusterName: string;
readonly coreDnsComputeType?: CoreDnsComputeType;
readonly defaultCapacity?: autoscaling.AutoScalingGroup;
readonly defaultCapacityInstance?: ec2.InstanceType;
readonly defaultCapacityType?: DefaultCapacityType;
readonly endpointAccess: EndpointAccess;
readonly ipFamily?: IpFamily;
readonly prune?: boolean;
readonly role?: iam.IRole;
readonly secretsEncryptionKey?: kms.IKey;
readonly securityGroup?: ec2.ISecurityGroup;
readonly serviceIpv4Cidr?: string;
readonly tags?: { [key: string]: string };
readonly mastersRole?: iam.IRole;
readonly bootstrapClusterCreatorAdminPermissions?: boolean;
readonly clusterLogging?: ClusterLoggingTypes[];

readonly kubectlProviderOptions?: KubectlProviderOptions; # new property

readonly outputMastersRoleArn?: boolean; # will be removed
readonly outputClusterName?: boolean; # will be removed
readonly outputConfigCommand?: boolean; # will be removed
readonly authenticationMode?: AuthenticationMode; # will be removed
readonly clusterHandlerEnvironment?: { [key: string]: string }; # will be removed
readonly clusterHandlerSecurityGroup?: ec2.ISecurityGroup; # will be removed
readonly onEventLayer?: lambda.ILayerVersion; # will be removed
readonly clusterHandlerSecurityGroup?: ec2.ISecurityGroup; # will be removed

readonly awscliLayer?: lambda.ILayerVersion; # move to kubectlProviderOptions
readonly kubectlEnvironment?: { [key: string]: string }; # move to kubectlProviderOptions
readonly kubectlLambdaRole?: iam.IRole; # move to kubectlProviderOptions
readonly kubectlLayer?: lambda.ILayerVersion; # move to kubectlProviderOptions
readonly kubectlMemory?: Size; # move to kubectlProviderOptions

KubectlProviderOptions Definition

export interface KubectlProviderOptions {
  readonly role?: iam.IRole;
  readonly awscliLayer?: lambda.ILayerVersion;
  readonly kubectlLayer?: lambda.ILayerVersion;
  readonly memory?: Size;
  readonly environment?: { [key: string]: string };
  /**
   * Which subnets should the provider functions be placed in.
   */
  readonly vpcSubnets?: ec2.SubnetSelection;
}

Note: fromKubectlProviderAttributes() is renamed to fromKubectlProviderFunctionArn(). Before

const kubectlProvider = eks.KubectlProvider.fromKubectlProviderAttributes(this, 'KubectlProvider', {
  functionArn,
  kubectlRoleArn: 'arn:aws:iam::123456789012:role/kubectl-role',
  handlerRole,
});

After

const kubectlProvider = eks.KubectlProvider.fromKubectlProviderArn(this, 'KubectlProvider', {
  functionArn: // Required. ARN of the original kubectl function
});

Following parameters are removed:

kubectlRoleArn
handlerRole

Reason: when the KubectlProvider was created in another stack, the lambda execution role already has permissions to access the cluster.