IMDSv2 Configuration Impact on Amazon Linux 2023 in EKS Managed Node Groups when Configuring MSDP Cloud
Problem
MSDP Cloud and Snapshot Manager configuration fails when an Amazon Linux 2023 instance is launched as part of an EKS Managed Node Group.
Error Message
The OCSD log shows communication errors from cloud services.
Cause
When an Amazon Linux 2023 EC2 instance is launched standalone (not as part of an EKS node group), AWS automatically configures it with the following IMDSv2 setting:
HttpPutResponseHopLimit = 2
This allows containers or applications running on the instance to access instance metadata over multiple network hops, enabling smooth interaction with AWS services such as S3, even from inside containers or pods.
When the same Amazon Linux 2023 image is launched as part of an EKS Managed Node Group (without a custom launch template), AWS overrides the IMDS setting:
HttpPutResponseHopLimit = 1
This configuration limits metadata access to only the instance itself, which causes containers/pods running inside the VM to fail when attempting to access the metadata service.
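To confirm the cause on an affected worker node, you can inspect the instance's IMDS settings; INSTANCE_ID below is a placeholder for the node's EC2 instance ID:

```shell
# Placeholder instance ID; substitute the affected worker node's ID.
INSTANCE_ID=i-0123456789abcdef0

# Inspect the IMDS settings; look for HttpPutResponseHopLimit in the output.
aws ec2 describe-instances \
  --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].MetadataOptions' \
  --output json

# The hop limit can also be raised on a running instance as a temporary
# check, but this does not survive node replacement; the launch template
# approach in the Solution section is still required.
aws ec2 modify-instance-metadata-options \
  --instance-id "$INSTANCE_ID" \
  --http-tokens required \
  --http-put-response-hop-limit 2
```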
Solution
1. Create a launch template. The template can be reused across node groups. Do not specify a subnet, IAM instance profile, or user data here; this launch template only enforces the IMDS settings and the root volume configuration. EKS will pick the correct AL2023 AMI.
aws ec2 create-launch-template \
--region us-east-1 \
--cli-input-json '{
"LaunchTemplateName": "lt-eks-cs-AL2023-hop2",
"LaunchTemplateData": {
"BlockDeviceMappings": [
{
"DeviceName": "/dev/xvda",
"Ebs": { "VolumeSize": 100, "VolumeType": "gp3", "DeleteOnTermination": true }
}
],
"MetadataOptions": { "HttpTokens": "required", "HttpPutResponseHopLimit": 2 }
}
}'
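You can verify the metadata options on the new template before using it:

```shell
# Confirm that HttpPutResponseHopLimit is 2 on the launch template.
aws ec2 describe-launch-template-versions \
  --region us-east-1 \
  --launch-template-name "lt-eks-cs-AL2023-hop2" \
  --query 'LaunchTemplateVersions[].LaunchTemplateData.MetadataOptions' \
  --output json
```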
2. Get the existing node group’s configuration
Run this command for the node group being migrated and record the output. Set REGION, CLUSTER, and NODEGROUP to match your environment.
REGION=us-east-1
CLUSTER=eks-nbucs
NODEGROUP=eks-nodegroup-storage-nbucs
# Document the node group spec (labels, taints, subnets, scalingConfig, nodeRole, instanceTypes)
aws eks describe-nodegroup \
--region "$REGION" \
--cluster-name "$CLUSTER" \
--nodegroup-name "$NODEGROUP" \
--query 'nodegroup.{nodeRole:nodeRole,subnets:subnets,scalingConfig:scalingConfig,instanceTypes:instanceTypes,labels:labels,taints:taints}' \
--output json
The output should look something like this.
{
"nodeRole": "arn:aws:iam::999999999999:role/iam-role-cluster-role-nbucs",
"subnets": [
"subnet-XXXXXXXXXXXXXXXXX"
],
"scalingConfig": {
"minSize": 1,
"maxSize": 20,
"desiredSize": 1
},
"instanceTypes": [
"r5.2xlarge"
],
"labels": {
"storage-pool": "storagepool"
},
"taints": [
{
"key": "storage-pool",
"value": "storagepool",
"effect": "NO_SCHEDULE"
}
]
}
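As a convenience, if jq is available, the values needed for step 3 can be captured into shell variables instead of copying them by hand (an optional sketch, not required):

```shell
# Capture the old node group's spec once, then pull out individual fields.
SPEC=$(aws eks describe-nodegroup \
  --region "$REGION" \
  --cluster-name "$CLUSTER" \
  --nodegroup-name "$NODEGROUP" \
  --query 'nodegroup' --output json)

# Extract the fields to reuse when creating the replacement node group.
NODE_ROLE=$(echo "$SPEC" | jq -r '.nodeRole')
SUBNETS=$(echo "$SPEC" | jq -r '.subnets | join(",")')
echo "nodeRole=$NODE_ROLE subnets=$SUBNETS"
```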
3. Create the new AL2023 node group using the launch template. Update the fields to match the output from step 2, and set "amiType": "AL2023_x86_64_STANDARD".
aws eks create-nodegroup \
--region us-east-1 \
--cli-input-json '{
"clusterName": "eks-nbucs",
"nodegroupName": "msdppool-AL2023",
"subnets": ["subnet-XXXXXXXXXXXXXXXXX"],
"nodeRole": "arn:aws:iam::999999999999:role/iam-role-cluster-role-nbucs",
"scalingConfig": { "minSize": 1, "maxSize": 20, "desiredSize": 1 },
"labels": { "storage-pool": "storagepool" },
"taints": [{ "key": "storage-pool", "value": "storagepool", "effect": "NO_SCHEDULE" }],
"capacityType": "ON_DEMAND",
"amiType": "AL2023_x86_64_STANDARD",
"instanceTypes": ["r5.2xlarge"],
"launchTemplate": { "name": "lt-eks-cs-AL2023-hop2", "version": "$Latest" }
}'
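Creation takes a few minutes; you can block until the new node group is ready before migrating pods:

```shell
# Wait until the new node group reaches ACTIVE status.
aws eks wait nodegroup-active \
  --region us-east-1 \
  --cluster-name eks-nbucs \
  --nodegroup-name msdppool-AL2023
```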
You should see output similar to the following.
{
"nodegroup": {
"nodegroupName": "msdppool-AL2023",
"nodegroupArn": "arn:aws:eks:us-east-1:999999999999:nodegroup/eks-nbucs/msdppool-AL2023/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
"clusterName": "eks-nbucs",
"version": "1.32",
"releaseVersion": "1.32.7-20250819",
"createdAt": "2025-08-23T02:46:39.414000+00:00",
"modifiedAt": "2025-08-23T02:46:39.414000+00:00",
"status": "CREATING",
"capacityType": "ON_DEMAND",
"scalingConfig": {
"minSize": 1,
"maxSize": 20,
"desiredSize": 1
},
"instanceTypes": [
"r5.2xlarge"
],
"subnets": [
"subnet-XXXXXXXXXXXXXXXXX"
],
"amiType": "AL2023_x86_64_STANDARD",
"nodeRole": "arn:aws:iam::999999999999:role/iam-role-cluster-role-nbucs",
"labels": {
"storage-pool": "storagepool"
},
"taints": [
{
"key": "storage-pool",
"value": "storagepool",
"effect": "NO_SCHEDULE"
}
],
"health": {
"issues": []
},
"updateConfig": {
"maxUnavailable": 1
},
"launchTemplate": {
"name": "lt-eks-cs-AL2023-hop2",
"version": "1",
"id": "lt-XXXXXXXXXXXXXXXXX"
},
"tags": {}
}
}
4. Migrate the pods gracefully by cordoning and draining the old nodes.
# Find nodes in the nodegroup
kubectl get nodes -l "eks.amazonaws.com/nodegroup=${NODEGROUP}"
# Cordon them
for n in $(kubectl get nodes -l "eks.amazonaws.com/nodegroup=${NODEGROUP}" -o name); do
kubectl cordon "${n##*/}"
done
# Drain them (may need to relax PDBs or stop stateful workloads)
for n in $(kubectl get nodes -l "eks.amazonaws.com/nodegroup=${NODEGROUP}" -o name); do
kubectl drain "${n##*/}" --ignore-daemonsets --delete-emptydir-data --grace-period=60
done
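The `${n##*/}` expansion in the loops above strips the `node/` prefix that `kubectl get -o name` emits; for example:

```shell
# kubectl get nodes -o name returns names like "node/ip-10-0-1-23.ec2.internal";
# ${n##*/} removes everything up to and including the last "/".
n="node/ip-10-0-1-23.ec2.internal"
echo "${n##*/}"   # prints: ip-10-0-1-23.ec2.internal
```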
5. Once the pods are running on the new nodes, delete the old node group
# Delete the nodegroup
aws eks delete-nodegroup \
--region "$REGION" \
--cluster-name "$CLUSTER" \
--nodegroup-name "$NODEGROUP"
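Deletion is asynchronous; you can wait for it to finish and confirm the old nodes are gone:

```shell
# Block until the old node group is fully deleted.
aws eks wait nodegroup-deleted \
  --region "$REGION" \
  --cluster-name "$CLUSTER" \
  --nodegroup-name "$NODEGROUP"

# The old nodes should no longer appear in the cluster.
kubectl get nodes -l "eks.amazonaws.com/nodegroup=${NODEGROUP}"
```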
Repeat steps 2 through 5 for each node group being migrated. The same launch template can be reused for every node group.