Title: (Part III of III) - ECS Rolling Update Deployment

Nagender Singh
5 min read · Jan 23, 2021

This is the last part of a three-part series.

Part1: https://nsrblogs1387.medium.com/part-i-of-iii-aws-ecs-autoscaling-deployment-rollback-demystified-2c6a0d25486d

Part2: https://nsrblogs1387.medium.com/part-ii-of-iii-ecs-blue-green-deployment-aa3db5b5a993

In this part we will go through the concepts of ECS Rolling Update Deployment.

First off, here is why I prefer Rolling Update Deployment over Blue/Green Deployment:

Over the past few months, AWS has been adding a lot of improvements to Rolling Update Deployment.

1) Capacity Providers: For me, this is the most important feature available with Rolling Update deployments. With it, I can leave the scaling and instance termination worries to ECS and focus on my actual deployments.

ECS Cluster Auto Scaling (CAS) uses Capacity Providers, configured on the ECS cluster or its services, to manage the scaling.

Each Capacity Provider is associated with an ASG and coordinates scaling activities closely with that ASG.

Why use Capacity Providers:

Before understanding Capacity Providers, you need to understand that the EC2 ASG is a separate entity from ECS.

So if the ASG is solely responsible for ECS autoscaling, based only on ECS metrics, it won’t be as smooth as we would like it to be.

As a matter of fact, the ASG is not aware of some of the things ECS is doing, and so it can’t make decisions on behalf of ECS.

Let me simplify this further. Suppose the load is below the set target and a scale-in is required. Can the ASG look into the instances and identify which ones it should not terminate because they still have ECS tasks running on them?

No, it can’t, because the ASG does not know which ECS tasks are running or where they are running.

I know, you feel that to handle this, ECS needs to be more involved in the autoscaling and should have some kind of control over the ASG.

And that’s exactly what Capacity Providers were designed to do.

Below are the additional features and benefits of using ECS with Capacity Providers:

  • Managed Scaling:

With CAS, you don’t have to attach the scaling policies manually (like we did in the previous section); instead, ECS creates the required policy in the ASG by itself.

You just need to set the target capacity in the Capacity Provider. For example, if you set it to 100, ECS will try to use 100% of the capacity of the existing instances before launching new ones.

From my testing, I can vouch that managed scaling works better than manual scaling.
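To make the target a little more concrete: as far as I understand it from the AWS documentation, managed scaling tracks a CloudWatch metric called CapacityProviderReservation, which is the number of instances needed divided by the number of instances running, times 100. The sketch below only illustrates that ratio. The function names are made up for this post, and it assumes the ASG already has at least one instance.

```python
def capacity_provider_reservation(instances_needed, instances_running):
    """Return the reservation as a percentage: needed / running * 100."""
    if instances_running <= 0:
        # Assumption: the empty-ASG case is left out of this sketch.
        raise ValueError("this sketch only covers an ASG with running instances")
    return instances_needed / instances_running * 100


def scaling_hint(reservation, target_capacity=100):
    """Compare the metric with the target set on the Capacity Provider."""
    if reservation > target_capacity:
        return "scale out"
    if reservation < target_capacity:
        return "scale in"
    return "steady"


# 4 instances are needed to place every task, but only 3 are running.
print(scaling_hint(capacity_provider_reservation(4, 3)))  # scale out

# 2 instances would be enough, but 4 are running.
print(scaling_hint(capacity_provider_reservation(2, 4)))  # scale in
```

With a target below 100 (say 80), the same ratio would trigger a scale-out earlier, which leaves some spare room on the instances.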

  • Managed Instance Protection:

This feature exists specifically to avoid terminating, during a scale-in, instances that still have tasks running on them.

So, if you go to the ASG in the Console and check ‘Actions’ under ‘Instance Management’, it will show options like ‘Set Scale-in Protection’ and ‘Remove Scale-in Protection’.

If an instance has tasks running on it, ECS CAS will set Scale-in Protection on it automatically. This makes the instance immune to scale-in activities.
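Both features are switched on when the Capacity Provider is created. Here is a minimal sketch of the request, built as a plain dictionary so it runs without AWS access. The provider name and the ASG ARN are placeholders, and the helper function is hypothetical; the keys follow the ECS CreateCapacityProvider API as I know it. Note that managed termination protection also expects scale-in protection for new instances to be enabled on the ASG itself.

```python
def capacity_provider_request(name, asg_arn, target_capacity=100):
    """Build the arguments for creating a Capacity Provider (hypothetical helper)."""
    if not 1 <= target_capacity <= 100:
        raise ValueError("target_capacity must be between 1 and 100")
    return {
        "name": name,
        "autoScalingGroupProvider": {
            "autoScalingGroupArn": asg_arn,
            # Managed Scaling: ECS creates and manages the scaling policy.
            "managedScaling": {
                "status": "ENABLED",
                "targetCapacity": target_capacity,
            },
            # Managed Instance Protection: instances with tasks are protected.
            "managedTerminationProtection": "ENABLED",
        },
    }


# Placeholder values; replace them with your own provider name and ASG ARN.
request = capacity_provider_request("demo-capacity-provider", "<your-asg-arn>")
print(request["autoScalingGroupProvider"]["managedScaling"])

# With boto3 installed and credentials configured, it could be sent like this:
#   import boto3
#   boto3.client("ecs").create_capacity_provider(**request)
```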

  • Attachment of multiple ASGs with the ECS Cluster:

With Capacity Providers, you can associate more than one ASG with your ECS cluster.

This can help you run your ECS cluster workload on different types of compute environments. For example, critical tasks on an on-demand ASG and trivial ones on a spot ASG.

This is an advanced use-case and I plan to do a blog on it.
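Until then, here is a rough idea of how a split across two ASGs can be expressed. A capacity provider strategy gives each provider an optional `base` (a minimum number of tasks) and a `weight` (its share of the remaining tasks). The sketch below is my own simplified model of that split, not the actual ECS placement logic, and the provider names are made up.

```python
def split_tasks(desired, strategy):
    """Roughly split `desired` tasks across providers using base and weight."""
    counts = {}
    remaining = desired
    # First satisfy each provider's base.
    for item in strategy:
        base = min(item.get("base", 0), remaining)
        counts[item["capacityProvider"]] = base
        remaining -= base

    weights = {item["capacityProvider"]: item.get("weight", 0) for item in strategy}
    total_weight = sum(weights.values())
    if remaining and total_weight == 0:
        raise ValueError("tasks left over but no provider has a weight")

    # Then share the rest in proportion to the weights (rounded down).
    handed_out = 0
    for name, weight in weights.items():
        share = remaining * weight // total_weight if total_weight else 0
        counts[name] += share
        handed_out += share

    # Any rounding leftovers go to the providers with the highest weight.
    leftover = remaining - handed_out
    for name in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical strategy: at least 1 task on on-demand, the rest split 1:3.
strategy = [
    {"capacityProvider": "on-demand-cp", "base": 1, "weight": 1},
    {"capacityProvider": "spot-cp", "weight": 3},
]
print(split_tasks(9, strategy))  # {'on-demand-cp': 3, 'spot-cp': 6}
```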

2) CircuitBreaker: Previously, Rolling Update deployments had no rollback mechanism in place. That means if a task from the new deployment failed, ECS would keep creating new tasks until one became healthy, and this could go on indefinitely without taking the failed tasks into account.

This has been fixed recently with CircuitBreaker. CircuitBreaker detects failure behaviours and rolls back automatically.
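CircuitBreaker is part of the service's deployment configuration. Below is a minimal sketch of an update request with it enabled, built as a plain dictionary so it runs without AWS access. The cluster, service and task definition names are placeholders, and the helper function is hypothetical; the keys follow the ECS UpdateService API as I know it.

```python
def rolling_update_request(cluster, service, task_definition, rollback=True):
    """Build the arguments for a rolling update with CircuitBreaker (hypothetical helper)."""
    return {
        "cluster": cluster,
        "service": service,
        "taskDefinition": task_definition,
        "deploymentConfiguration": {
            "deploymentCircuitBreaker": {
                "enable": True,        # stop a deployment that keeps failing
                "rollback": rollback,  # go back to the last completed deployment
            },
        },
    }


# Placeholder names; replace them with your own.
request = rolling_update_request("demo-cluster", "demo-service", "demo-task:2")
print(request["deploymentConfiguration"])

# With boto3 installed and credentials configured, it could be sent like this:
#   import boto3
#   boto3.client("ecs").update_service(**request)
```

Setting `rollback` to False would still stop the failing deployment, but without returning the service to the previous version.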

So the trade-off is a better GUI (Blue/Green) versus better features (Rolling Update). Choose wisely!!

Let us see how Rolling Update deployment works.

Stage 1: Start of deployment:

Rolling Update Deployment Started

So there are 3 existing tasks, and the new deployment starts a single task (not 3 as in B/G). In total, there are 4 tasks running now.

Stage 2 (In case of success): Task Registers healthy:

One of the tasks is healthy

The task started by the new deployment registers healthy. ECS then terminates one of the old tasks and starts another new task. In total, there are still 4 tasks running.

Stage 3 (In case of success):

Another task registers healthy

Deployment still in progress…

Last one registers healthy as well
New deployment fully in place now

This process goes on until all the tasks are replaced.
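The stages above can be sketched as a small simulation. It assumes a desired count of 3, a minimum healthy percent of 100 and a maximum percent of 150 (my assumption, picked so that at most 4 tasks run at once, as in the pictures). As far as I understand, ECS rounds the lower limit up and the upper limit down. The function is a toy model where every new task becomes healthy, not what ECS actually runs.

```python
import math


def simulate_rolling_update(desired, minimum_healthy_percent=100, maximum_percent=150):
    """Return a list of (old_tasks, new_healthy_tasks, new_starting_tasks) per stage."""
    lower = math.ceil(desired * minimum_healthy_percent / 100)   # healthy tasks to keep
    upper = math.floor(desired * maximum_percent / 100)          # total tasks allowed
    old, new_healthy = desired, 0
    timeline = []
    while old > 0 or new_healthy < desired:
        # Start as many new tasks as the upper limit allows.
        launched = max(min(upper - (old + new_healthy), desired - new_healthy), 0)
        timeline.append((old, new_healthy, launched))
        # Assumption: every started task registers healthy.
        new_healthy += launched
        # Stop old tasks while staying at or above the lower limit.
        stopped = max(min(old, old + new_healthy - lower), 0)
        old -= stopped
        if launched == 0 and stopped == 0:
            raise ValueError("no headroom: the limits do not allow any progress")
    timeline.append((old, new_healthy, 0))
    return timeline


for old, new_healthy, starting in simulate_rolling_update(3):
    print(f"old={old} new_healthy={new_healthy} new_starting={starting}")
# old=3 new_healthy=0 new_starting=1
# old=2 new_healthy=1 new_starting=1
# old=1 new_healthy=2 new_starting=1
# old=0 new_healthy=3 new_starting=0
```

With a minimum healthy percent of 100 and a maximum percent of 100 the simulation raises an error, because there is no room to start a new task or stop an old one.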

Stage 2 (In case of failure without CircuitBreaker):

Bad News!

If new deployment tasks fail and there is no CircuitBreaker configuration, ECS will keep launching new tasks indefinitely.

Stage 2 (In case of failure with CircuitBreaker):

No worries..can handle it…

With CircuitBreaker in place, ECS detects the failing tasks of the new deployment. No further tasks are launched and the deployment is stopped.
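The difference between the two failure cases can be shown with a toy loop. ECS works out its own failure threshold from the desired count; the `failure_threshold` and `max_launches` parameters below are hypothetical stand-ins for this sketch, and the returned labels are only for illustration.

```python
from itertools import cycle, islice


def run_deployment(task_results, desired, failure_threshold=None, max_launches=20):
    """Launch tasks one by one; return (state, number_of_launches).

    task_results: iterable of booleans, True if the launched task becomes healthy.
    failure_threshold: None means no CircuitBreaker (hypothetical parameter).
    max_launches: cut-off so this sketch terminates; ECS has no such limit.
    """
    healthy = failures = launches = 0
    for ok in islice(task_results, max_launches):
        launches += 1
        if ok:
            healthy += 1
            if healthy == desired:
                return "COMPLETED", launches
        else:
            failures += 1
            if failure_threshold is not None and failures >= failure_threshold:
                return "FAILED", launches
    return "IN_PROGRESS", launches


always_failing = cycle([False])

# Without CircuitBreaker: still launching when the sketch gives up.
print(run_deployment(always_failing, desired=3))  # ('IN_PROGRESS', 20)

# With CircuitBreaker: stopped after the third failed task.
print(run_deployment(always_failing, desired=3, failure_threshold=3))  # ('FAILED', 3)
```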

So this is the end of this series. I hope it helps you make the right decision when using ECS.

Feel free to share your thoughts, suggestions and queries in the comment box.

Cheers!!!
