This is the first part of a two-part tutorial. You can find Pt.2 here.
Organizations that host a large portion of their infrastructure on Amazon Web Services (AWS) may eventually consider migrating to containers for a variety of reasons. If you’re considering such a migration, you’ve probably read about interesting open source frameworks like Rancher, Docker Swarm, DC/OS, Mesos and Marathon, Kubernetes, and Nomad, as well as the native AWS service, EC2 Container Service (ECS).
In this post, we’ll compare the lesser-known ECS service of AWS with the more popular Kubernetes orchestrator. We’ll also use the Kublr platform to avoid manually installing and configuring Kubernetes.
After reading this post, you’ll understand the strengths and weaknesses of ECS. You’ll also see how containerized applications are supported natively on AWS without the need for external tools and clusters. (For example, the service discovery mechanism, based on Application Load Balancer, allows us to avoid using Consul/Zookeeper or a similar service discovery tool.)
EC2 Container Service Overview
ECS became generally available in April 2015 as a very basic solution with limited features. At that time it had no dynamic port mapping from containers to the host: if you had 10 containers exposing port 80, for example, you could run only one of them per EC2 instance. An attempt to schedule a second container with port 80 on the same machine would throw a “this port already in use on that instance” error. That made ECS rather useless, because it was impossible to place many containers of the same type on the same host.
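
The static port limitation can be illustrated with a toy model. This is only a sketch, not ECS code: `Instance` and `place` are hypothetical names invented here to show why a fixed host port allows one such container per instance.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Toy model of an EC2 instance that tracks which host ports are taken."""
    name: str
    used_ports: set = field(default_factory=set)


def place(instance: Instance, host_port: int) -> bool:
    """Try to bind a container to a fixed host port; fail if it is taken."""
    if host_port in instance.used_ports:
        return False  # analogous to the "port already in use" error
    instance.used_ports.add(host_port)
    return True


node = Instance("instance-1")
first = place(node, 80)   # port 80 is still free on this host
second = place(node, 80)  # port 80 is already bound on this host
print(first, second)
```

With static mapping, the only way to run the second container is to add another instance, which is exactly the constraint described above.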
Although ECS was not feature rich at introduction, the Amazon team continued development, and it is now a viable solution for containerized workloads. ECS has a very strong selling point: you don’t have to maintain a complex, highly available Kubernetes cluster yourself, and maintaining and upgrading a large multi-master cluster is significant work.
Instead, AWS takes care of everything related to ECS agent logic and scheduling decisions, with no need to install your own “master” nodes. ECS also manages auto scaling based on CloudWatch alarms, lets you restrict access to Docker images with IAM policies, and lets you assign an IAM role to a container.
Aside from not having to worry about a multi-master setup for the “brain” of your orchestration system (the Kubernetes masters), ECS offers few other benefits over Kubernetes. The workload types ECS can handle are very basic, and features like using an Application Load Balancer to route traffic to containers are also available for Kubernetes (Kubernetes has good AWS integration, developed by many of its contributors). There are very few things you can do with ECS but not with Kubernetes, and even those have community-built workarounds, such as the ability to use an IAM role in a Kubernetes pod. In summary, the main reason to choose ECS over Kubernetes is a DevOps team that lacks the capacity to maintain its own highly available Kubernetes cluster. Even then, you may choose a service like Kublr to remove that complexity, benefit from everything the Kubernetes ecosystem offers, and avoid complex manual cluster maintenance.
To start using ECS, you simply create a “cluster”. In ECS terms this is a logical clustering unit, created with a few clicks through the AWS console.
Next, install the ECS agent on every instance that needs to connect to that cluster. Alternatively, you can use a predefined AMI that includes the ECS agent; this AMI also has an EBS data disk partition optimized for Docker. Upload your Docker images to the ECR image registry, and create a new Task Definition. The Task Definition defines your container settings, such as volumes, image name, and environment variables. If you’re familiar with Kubernetes, think of this as a “pod template”. You can read more about all the options available in a Task Definition; you’ll see these are the bare minimum to launch a Docker container.
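
To make the Task Definition more concrete, here is a minimal sketch of one built as a Python dictionary and serialized to JSON. The family name, container name, image URI, and environment variable are placeholders invented for this example. Setting `hostPort` to `0` asks ECS to pick a free host port when the task uses bridge networking.

```python
import json

# Hypothetical values: the family, container name, image URI, and
# environment variable are placeholders, not from a real deployment.
task_definition = {
    "family": "web-app",
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
            "memory": 256,
            "essential": True,
            "portMappings": [
                # hostPort 0 requests a dynamically assigned host port
                {"containerPort": 80, "hostPort": 0}
            ],
            "environment": [{"name": "APP_ENV", "value": "production"}],
        }
    ],
}

# Serialize to JSON, the format in which task definitions are usually stored
task_definition_json = json.dumps(task_definition, indent=2)
print(task_definition_json)
```

The resulting JSON could then be registered through the AWS console, CLI, or an SDK. That step is left out here because it requires AWS credentials.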
Here is an example of a small ECS-based deployment:
Here are a few important details about the main components of every ECS deployment:
- ECS Cluster: This is a logical cluster of instances. You connect any chosen EC2 instance to a particular cluster to schedule the cluster tasks on the instance. There can be many logical “clusters” in your AWS account (current default soft limit is 1000 per region). These are simple metadata objects and do not cost anything; you pay only the regular EC2 price of used instances. But the container scheduling mechanism is based on clusters, so if you add one instance to cluster “A” and five instances to cluster “B”, any task definition launched into cluster “A” will only be able to run on that one machine in cluster “A” and will never scale or move to machines of cluster “B”. Autoscaling only runs tasks within a single ECS cluster.
- ECS instances: These are EC2 instances you connect to an ECS cluster by defining the cluster name in the “/etc/ecs/ecs.config” file. The ECS agent will connect to a cluster and communicate about available resources of the instance, receiving signals back from the cluster about the containers it has to run.
- Task Definitions: These are container definitions with few features. One feature to note: you can constrain a task to run only on instances that have a particular “attribute”. There are a few built-in attributes such as “availability zone” and “instance type”, which automatically exist for each ECS instance, but you can add “Custom attributes” to your instances (for example, “production”, “QA”, “development”). You can read more about constraints.
- Task: This is a running job created from a Task Definition. It may contain one or more running containers.
- Scheduled Task: This is the same as a standalone task, but will run repeatedly on the schedule specified in CloudWatch Events.
- Services: These are similar to a “replica set” or “deployment” in Kubernetes. A Service keeps a given number of replicas running and restarts failed tasks when needed. It can be connected to a load balancer like ELB or ALB. It then automatically writes the rules into the load balancer settings when tasks are added or removed, so the load balancer always knows which containers to point to and at which ports on the instances. In the next section, we’ll do a quick tutorial to introduce ECS Services and other components of ECS.
- ECR registry: This is similar to any other Docker registry, where you can upload and store your images. You have an option to restrict access to images and repositories based on IAM users and roles. You can learn more about related IAM policies.
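
The cluster scoping and attribute constraints described in the list above can be sketched with a toy model. This is not the ECS scheduler: `EcsInstance` and `eligible_instances` are hypothetical names, and the `env` attribute stands in for a custom attribute such as “production” or “QA”.

```python
from dataclasses import dataclass, field


@dataclass
class EcsInstance:
    """Toy stand-in for an EC2 instance registered to one ECS cluster."""
    instance_id: str
    cluster: str
    attributes: dict = field(default_factory=dict)


def eligible_instances(instances, cluster, required_attributes=None):
    """Return IDs of instances in `cluster` whose attributes match all requirements."""
    required = required_attributes or {}
    return [
        inst.instance_id
        for inst in instances
        if inst.cluster == cluster
        and all(inst.attributes.get(k) == v for k, v in required.items())
    ]


fleet = [
    EcsInstance("i-a1", "A", {"env": "production"}),
    EcsInstance("i-b1", "B", {"env": "production"}),
    EcsInstance("i-b2", "B", {"env": "QA"}),
]

# A task launched into cluster "A" can only land on that cluster's one instance
print(eligible_instances(fleet, "A"))
# Within cluster "B", an attribute constraint narrows placement further
print(eligible_instances(fleet, "B", {"env": "production"}))
```

The cluster filter is applied first and unconditionally, which mirrors the point that tasks never move between clusters. Attributes only narrow the choice within a cluster.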
Now that you have a basic understanding of ECS terminology, we can compare ECS to Kubernetes:
| Kubernetes | ECS |
|---|---|
| Kubernetes cluster and its “Namespaces”. A Namespace is a logical separation into “virtual clusters” that allows access control for users and quota limits. It is also used in service discovery as part of the DNS name that leads to a deployed service. Cluster worker nodes are shared by all Namespaces by default, but network separation is also possible to tighten security within the same cluster (in addition to RBAC role security). | ECS Cluster. Both a logical and physical separation of deployments and services. All instances assigned to a cluster can be used only by that cluster. |
| Kubernetes “Master” nodes contain the components that handle the logic of scheduling and replacing pods, storing and updating cluster state, exposing the API to other components like the dashboard, interacting with the underlying cloud API to provision resources, and more. Here is a list of master node components. | No counterpart |
| Kubernetes worker “Node”, a server or cloud instance that runs the workloads. | ECS Instance. An EC2 instance connected to a cluster. |
| Kubernetes “Pod definition”, a manifest describing the containers to run. It holds all details of the container configuration and is used as a template to “spawn” one or more replicas of the described containers on the worker nodes. | ECS Task Definition. The template for Tasks. |
| Kubernetes “Pod”, a set of one or more running containers. | ECS Task. A running workload that can have one or more containers. |
| Kubernetes “Cron Job”, which runs any pod on a schedule. | ECS Scheduled Task. Runs selected tasks on a schedule. |
| Kubernetes “Deployment”, a resource that keeps a particular set of containers running at all times. It auto scales based on metrics and performs rolling updates when a new version of a pod is deployed. A Kubernetes Deployment is much more feature rich than an ECS Service. | ECS Service. Contains the deployment settings (which Task Definition to use) and the auto scaling policy, and determines which load balancer the running tasks are connected to. |
| Kubernetes “Service”, an internal load balancer that features easy-to-use service discovery with DNS names of the form “service-name.namespace.svc.cluster.local” and “pod-ip-address.my-namespace.pod.cluster.local”. | No counterpart |
| Kubernetes “Stateful Set”, a deployment model with statically named pods that start in strict order, allowing for bootstrap and auto-heal of a stateful service like a database. | No counterpart |
| Kubernetes “Daemon Set”, a type of task that runs a single instance of a pod on every node in a cluster. Useful for log collectors, network plugins, and any “sidecars” that have to run on each node. | No counterpart |
| Kubernetes “Persistent volume claim”, a definition of the disk resources needed for a pod. Pods can use it to claim particular disk space on a storage type of choice. | No counterpart |
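
The DNS naming scheme mentioned for the Kubernetes “Service” can be sketched as two small helpers. The function names are hypothetical, and `cluster.local` is assumed as the cluster domain, as in the names quoted above. In pod records the IP address is written with dashes instead of dots.

```python
def service_dns_name(service: str, namespace: str,
                     cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name under which a Service is reachable in the cluster."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def pod_dns_name(pod_ip: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Build a pod DNS name; the IP's dots are replaced with dashes."""
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"


# "web" and "default" are placeholder service and namespace names
print(service_dns_name("web", "default"))
print(pod_dns_name("10.0.0.5", "default"))
```

Because the namespace is part of the name, two services with the same name in different Namespaces do not collide.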