How to create a cluster node autoscaler in AWS EKS?

Krishnendu Bhowmick
Mar 8, 2023

Amazon Elastic Kubernetes Service (EKS) lets you run Kubernetes workloads on AWS without managing the Kubernetes control plane yourself. One essential capability in Kubernetes is autoscaling, and node-level autoscaling is handled by the Cluster Autoscaler: when pods can't be scheduled on the existing worker nodes it adds nodes, and when nodes sit underutilized it removes them. On EKS it does this by resizing the AWS Auto Scaling group that backs your node group.

In this tutorial, we’ll walk through the steps to create an EKS cluster autoscaler using the Kubernetes cluster autoscaler project.

Prerequisites

Before getting started, you’ll need to have the following:

  • An EKS cluster with worker nodes.
  • A Kubernetes cluster version 1.14 or later.
  • kubectl command-line tool installed.
  • IAM permissions to create an IAM policy, create a Kubernetes service account, and deploy a Kubernetes manifest.
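
Before diving in, a quick sanity check helps: confirm that your AWS credentials work and that kubectl can reach the cluster. The cluster name EKS_Demo and region ap-south-1 below match the examples used later in this article; substitute your own values.

aws sts get-caller-identity
kubectl get nodes
aws eks describe-cluster --name EKS_Demo --region ap-south-1 --query cluster.version --output text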

Step 1: Create an IAM policy for the autoscaler

First, you need to create an IAM policy that allows the Kubernetes cluster autoscaler to access the AWS Auto Scaling API. Create a file named autoscaler-policy.json with the following content:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}

Save the file and then create the IAM policy using the AWS CLI:

aws iam create-policy --policy-name eks-autoscaler-policy --policy-document file://autoscaler-policy.json

Note down the ARN of the created policy. You’ll need it later.
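
If you are scripting the setup, you can capture the ARN at creation time instead of copying it by hand, using the AWS CLI's --query flag (a minimal sketch; the POLICY_ARN variable name is arbitrary):

POLICY_ARN=$(aws iam create-policy \
  --policy-name eks-autoscaler-policy \
  --policy-document file://autoscaler-policy.json \
  --query 'Policy.Arn' --output text)
echo "$POLICY_ARN"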

Step 2: Attach the IAM policy to the EKS worker node IAM role

Next, attach the IAM policy created in the previous step to the EKS worker node IAM role. Note that attach-role-policy takes a role name, not an instance profile name: replace your-node-role-name with the name of the IAM role attached to your worker nodes' instance profile, and your-eks-autoscaler-policy-arn with the ARN of the IAM policy created in step 1.

aws iam attach-role-policy --role-name your-node-role-name --policy-arn your-eks-autoscaler-policy-arn
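
To confirm the attachment worked before moving on, list the policies on the role (same placeholder role name as above):

aws iam list-attached-role-policies --role-name your-node-role-name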

Step 3: Create a Kubernetes service account and RBAC

To run the cluster autoscaler on EKS, you’ll need to create a Kubernetes service account and RBAC to give it the required permissions. Create a file named cluster-autoscaler.yaml with the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/cluster-autoscaler:v1.14.7
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/EKS_Demo # Update your cluster name here (EKS_Demo in this example)
          env:
            - name: AWS_REGION
              value: ap-south-1 # Update with the AWS region of your EKS cluster
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"

Note: Make sure the image tag matches your cluster's Kubernetes minor version (for example, a 1.24.x autoscaler image for a 1.24 cluster). Newer releases are published under registry.k8s.io/autoscaling/cluster-autoscaler rather than the deprecated k8s.gcr.io registry.
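
Also note that the --node-group-auto-discovery flag only discovers Auto Scaling groups carrying the two tags it names. EKS managed node groups add these tags to their ASGs automatically; for a self-managed node group you can add them yourself (a sketch, with your-asg-name as a placeholder for your actual ASG name):

aws autoscaling create-or-update-tags --tags \
  "ResourceId=your-asg-name,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
  "ResourceId=your-asg-name,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/EKS_Demo,Value=owned,PropagateAtLaunch=true"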

Once the YAML is updated, save the file and deploy it to the cluster with the following command:

kubectl apply -f cluster-autoscaler.yaml

Step 4: Verify the autoscaler

After the deployment, you can verify that the cluster autoscaler is working by looking at the logs of the autoscaler pod. It runs in the kube-system namespace, so pass -n kube-system:

kubectl logs -n kube-system -f cluster-autoscaler-<pod-hash>
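
The <pod-hash> suffix is generated by the Deployment, so look up the actual pod name first using the app=cluster-autoscaler label from the manifest:

kubectl get pods -n kube-system -l app=cluster-autoscaler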

If the autoscaler is working correctly, you should see log messages indicating that it has discovered the node groups and scaled up or down the number of nodes.

You can also check the status of the autoscaling groups in the AWS console or with the following command:

aws autoscaling describe-auto-scaling-groups
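
The default JSON output is verbose; if you just want the scaling limits at a glance, a --query filter over the response fields trims it to a table:

aws autoscaling describe-auto-scaling-groups \
  --query 'AutoScalingGroups[].{Name:AutoScalingGroupName,Min:MinSize,Max:MaxSize,Desired:DesiredCapacity}' \
  --output table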

And that’s it! Your EKS cluster now has a fully functional autoscaler that can scale the worker nodes up or down based on the cluster’s workload.
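
If you want to watch a scale-up happen end to end, one simple test is to request more capacity than your current nodes can supply. The sketch below is a throwaway example: the deployment name autoscaler-demo, the 500m CPU request, and the replica count are arbitrary, so pick values that exceed your spare capacity, and delete the deployment when you're done.

kubectl create deployment autoscaler-demo --image=nginx
kubectl set resources deployment autoscaler-demo --requests=cpu=500m
kubectl scale deployment autoscaler-demo --replicas=30
kubectl get nodes --watch
kubectl delete deployment autoscaler-demo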

Thank you for taking the time to read this article. If you found it helpful or enjoyable, please consider following, sharing, and subscribing. Your support means a lot and helps me keep creating content like this.
