We recently had an issue with one of our AWS autoscaling workers. Our builds are set up so that CodeDeploy runs a deployment on each new EC2 instance when the autoscaling group launches it. I think what happened is that we changed the CodeDeploy application via CloudFormation while a deployment was running, which left lifecycle hooks stranded on the autoscaling group. As a result, every new instance launched by the autoscaling group failed with a Heartbeat Timeout error.
These docs and this post helped me figure out and fix the issue. The short version is that I disconnected the CodeDeploy application from the autoscaling group so that deployments would stop failing, then found the orphaned hook with this command:
aws autoscaling describe-lifecycle-hooks --auto-scaling-group-name $MyAutoScalingGroupName
and deleted it with this command:
aws autoscaling delete-lifecycle-hook --auto-scaling-group-name $MyAutoScalingGroupName --lifecycle-hook-name $LifeCycleHookNameFromAboveCommandResult
We no longer receive the Heartbeat Timeout error, and workers are scaling and deploying properly.
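If you need to do this more than once, the two commands above can be wrapped in small shell functions. This is only a sketch of how that might look, not what we actually ran: the function names are made up for illustration, and it assumes the AWS CLI is installed and configured with permission to manage the autoscaling group. The --query and --output options are the CLI's standard output filters, used here to print only the hook names.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; requires a configured AWS CLI.

# Print the names of all lifecycle hooks attached to an autoscaling group.
# $1 = autoscaling group name
list_lifecycle_hooks() {
  aws autoscaling describe-lifecycle-hooks \
    --auto-scaling-group-name "$1" \
    --query 'LifecycleHooks[].LifecycleHookName' \
    --output text
}

# Delete a single lifecycle hook from an autoscaling group.
# $1 = autoscaling group name, $2 = lifecycle hook name
delete_lifecycle_hook() {
  aws autoscaling delete-lifecycle-hook \
    --auto-scaling-group-name "$1" \
    --lifecycle-hook-name "$2"
}
```

Run list_lifecycle_hooks "$MyAutoScalingGroupName" first and check the output by hand, then pass the orphaned hook's name to delete_lifecycle_hook. Deleting is kept as a separate, explicit step so that a hook that is still in use is not removed by accident.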