Encryption All The Way To The Container In ECS With Envoy

Introduction

Setting up SSL for microservices is hard. You could terminate the TLS session at a load balancer at the edge of your network, but that may not be secure enough. You could set up each of your Docker images to get certificates and private keys at runtime. But that gets tricky if your services use a lot of different languages and frameworks.

Fortunately, there's a simple, unified way to set up TLS for all your microservices. In this post, I'll walk you through an example project that sets up Envoy to encrypt internal traffic in an AWS ECS service. All it takes is a config file and a few extra lines in your task definitions.

First, let's consider a different approach: SSL termination at the edge. A lot of architectures don’t encrypt internal traffic. These systems usually terminate SSL sessions at a load balancer at the edge of the network, then forward the requests as plain HTTP to the backend services.

This approach provides security up to the edge of the network, and no further. For some enterprises, this is good enough. And it makes deployment a lot easier. Since the load balancer has decrypted the traffic, your apps don't have to worry about SSL.

But what if you can't send your data around in plaintext, even inside the walls of your private network? You may be required to use transport layer security all the way to the hosts where your containers are running. This is where Envoy comes in handy. Envoy helps with service discovery, tracing, and SSL. It only requires pushing a container image to ECR and adding a few extra lines to your task definitions.

This post will introduce Envoy, and then walk you through the steps to set it up in ECS. Or if you want to skip straight to the example project, you can clone it here.

What is Envoy?

If you aren’t familiar with Envoy, I recommend that you check out this blog post on the subject before reading any further.

You are probably familiar with front proxies that take inbound traffic and send it to backing services according to the path or port number. Envoy is different. It’s a sidecar proxy, a lightweight process that runs in a container alongside your app and proxies all traffic in and out.

Instead of sending your requests to a load balancer running somewhere else, you can send them to a proxy running on localhost. In other words, Envoy abstracts the network away from your services. It can also apply filters to inbound and outbound traffic to add encryption or tracing.

How do we set up Envoy to handle SSL traffic?

Envoy reads its configuration from a YAML file at runtime. In this YAML file you can specify listeners to handle traffic. In our walkthrough, we’ll tell Envoy to use an http_connection_manager filter to handle TLS traffic and proxy it to a Flask server listening on a different port on the same host.
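To make the shape of that configuration concrete, here is a minimal sketch that builds a listener and cluster as a Python dict and prints it as JSON (as far as I know, Envoy picks the config format from the file extension, so a .json file works as well as YAML). The field names follow Envoy's v3 API as I understand it. The example repo may target an older Envoy release with different names, and the ports, file paths, and the "local_app" cluster name are all assumptions made for illustration.

```python
import json

TYPE_PREFIX = "type.googleapis.com/envoy.extensions."


def build_envoy_config(listen_port=443, app_port=8080,
                       cert_path="/etc/envoy/cert.pem",
                       key_path="/etc/envoy/key.pem"):
    """Return a static Envoy config: TLS listener -> local plaintext app."""
    # Terminates TLS on the listener using a cert/key pair on disk
    # (paths are hypothetical).
    tls_socket = {
        "name": "envoy.transport_sockets.tls",
        "typed_config": {
            "@type": TYPE_PREFIX + "transport_sockets.tls.v3.DownstreamTlsContext",
            "common_tls_context": {
                "tls_certificates": [{
                    "certificate_chain": {"filename": cert_path},
                    "private_key": {"filename": key_path},
                }],
            },
        },
    }
    # Routes every decrypted HTTP request to the local app cluster.
    http_manager = {
        "name": "envoy.filters.network.http_connection_manager",
        "typed_config": {
            "@type": TYPE_PREFIX
            + "filters.network.http_connection_manager.v3.HttpConnectionManager",
            "stat_prefix": "ingress_http",
            "route_config": {
                "virtual_hosts": [{
                    "name": "backend",
                    "domains": ["*"],
                    "routes": [{"match": {"prefix": "/"},
                                "route": {"cluster": "local_app"}}],
                }],
            },
            "http_filters": [{
                "name": "envoy.filters.http.router",
                "typed_config": {
                    "@type": TYPE_PREFIX + "filters.http.router.v3.Router",
                },
            }],
        },
    }
    listener = {
        "name": "https_listener",
        "address": {"socket_address": {"address": "0.0.0.0",
                                       "port_value": listen_port}},
        "filter_chains": [{"transport_socket": tls_socket,
                           "filters": [http_manager]}],
    }
    # The app (the Flask server in this post) listens on the same host.
    cluster = {
        "name": "local_app",
        "connect_timeout": "0.25s",
        "type": "STATIC",
        "load_assignment": {
            "cluster_name": "local_app",
            "endpoints": [{"lb_endpoints": [{"endpoint": {"address": {
                "socket_address": {"address": "127.0.0.1",
                                   "port_value": app_port}}}}]}],
        },
    }
    return {"static_resources": {"listeners": [listener],
                                 "clusters": [cluster]}}


print(json.dumps(build_envoy_config(), indent=2))
```

The key idea is that TLS ends at the listener and the hop to the app stays on 127.0.0.1, so the Flask code never has to touch a certificate.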

In this example everything will run in ECS using self-signed certs, but in production you’ll want to use certificates from ACM.
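For a sense of what generating a self-signed cert at container startup might involve, here is a rough sketch. It assumes the openssl CLI is available in the image and that the service's DNS name arrives in an environment variable; the variable name SERVICE_DNS_NAME and the output file names are hypothetical, and the example repo's actual script may work differently.

```python
import os
import shlex


def self_signed_cert_command(dns_name, cert_path="cert.pem",
                             key_path="key.pem", days=365):
    """Build an openssl argv that creates a self-signed cert for dns_name."""
    return [
        "openssl", "req",
        "-x509",               # emit a self-signed certificate, not a CSR
        "-newkey", "rsa:2048",  # generate a fresh 2048-bit RSA key
        "-nodes",              # leave the private key unencrypted
        "-keyout", key_path,
        "-out", cert_path,
        "-days", str(days),
        "-subj", f"/CN={dns_name}",  # put the DNS name in the subject CN
    ]


# SERVICE_DNS_NAME is a made-up variable name for this sketch.
dns_name = os.environ.get("SERVICE_DNS_NAME", "localhost")
print(shlex.join(self_signed_cert_command(dns_name)))
```

This only prints the command. A real startup script would run it (for example with subprocess.run(..., check=True)) before starting Envoy, so the cert and key exist by the time the listener loads them.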

In this exercise, we will use Terraform to:

  1. Build Docker images for Envoy and a small Flask service. The Envoy image contains a startup script that retrieves the service’s DNS name from an environment variable and uses it to create a self-signed SSL cert.
  2. Create an ECS cluster, service, and task definition, and a Route 53 DNS name pointing to it. By setting up an ECS service registry when we create our ECS service, Cloud Map will automatically create the DNS name and map it to the Fargate tasks that are running our containers.
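The "few extra lines" in the task definition amount to a second container entry for Envoy. Here is a sketch of what that could look like, built as a Python dict in the shape ECS expects for a Fargate task. The image URIs, family name, ports, and CPU and memory sizes are placeholders I made up; the Terraform in the example repo is the real source of truth.

```python
import json


def build_task_definition(envoy_image, app_image, role_arn,
                          tls_port=443):
    """Return an ECS task definition with an app container and an Envoy sidecar."""
    return {
        "family": "envoy-ssl-tutorial",       # assumed name
        "networkMode": "awsvpc",              # required for Fargate
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "256",
        "memory": "512",
        # The walkthrough uses one role as both task and execution role.
        "taskRoleArn": role_arn,
        "executionRoleArn": role_arn,
        "containerDefinitions": [
            {
                # The app exposes no port mapping: only Envoy reaches it,
                # over localhost inside the task.
                "name": "app",
                "image": app_image,
                "essential": True,
            },
            {
                # The sidecar is the only container that accepts traffic
                # from outside the task, and it only speaks TLS.
                "name": "envoy",
                "image": envoy_image,
                "essential": True,
                "portMappings": [{"containerPort": tls_port,
                                  "protocol": "tcp"}],
            },
        ],
    }


task_def = build_task_definition(
    envoy_image="example.invalid/envoy:latest",   # placeholder URI
    app_image="example.invalid/flask-app:latest",  # placeholder URI
    role_arn="arn:aws:iam::123456789012:role/example-task-role",
)
print(json.dumps(task_def, indent=2))
```

Because containers in an awsvpc task share a network namespace, Envoy can reach the app on 127.0.0.1 without the app ever being exposed outside the task.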

Prerequisites

To follow this walkthrough you’ll need an AWS account with access to ECS. The resources created during this tutorial should not cost more than $5. Remember to call "terraform destroy" when you’re finished.

Walkthrough

  1. Clone the example repo from GitHub.

  2. Provision the ECS cluster, service, and task definition with Terraform. You must use Terraform v0.12.

    You’ll be prompted for a VPC and subnet ID, as well as a role ARN for the ECS task, so have those handy. The role is used as both the task role and the task execution role. It needs (at least) permission to read from ECR, create log streams, and put log events in CloudWatch Logs.

$ terraform init
$ terraform apply

  3. Go to the ECS console. You should see a new cluster called envoy-ssl-tutorial.

  4. You’ll need to SSH into an EC2 instance in the same VPC to test the HTTPS connection, since we’re using a private DNS namespace. From your instance, run the command below. The "-k" flag tells curl to operate in “insecure mode,” so that it accepts the self-signed cert.

$ curl -k https://envoy-ssl-tutorial.dev/service

  5. If the connection works, you should see a message that begins "Hello from behind Envoy SSL proxy!" If the connection doesn’t work, check the VPC and subnet IDs that you passed to Terraform, and make sure you passed the correct role ARN.

  6. Run "terraform destroy" to tear down the stack.
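If you'd rather test from a script than from curl, the same check can be sketched with Python's standard library. Disabling certificate verification here mirrors curl's "-k" flag and is only reasonable because the tutorial uses a self-signed cert. The URL is the one from the walkthrough, and the helper names are my own.

```python
import ssl
import urllib.request


def insecure_context():
    """TLS context that skips certificate checks, like curl -k."""
    ctx = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch(url, timeout=5):
    """Return (status, body) for an HTTPS GET without verifying the cert."""
    with urllib.request.urlopen(url, context=insecure_context(),
                                timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


# Example call, to be run from an instance inside the VPC:
# status, body = fetch("https://envoy-ssl-tutorial.dev/service")
```

The traffic is still encrypted in this mode; what you give up is proof that you are talking to the right server, which is why production setups should use certificates the client can actually verify.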

Recommended Reading

Hopefully this blog and the connected example repo have inspired you to learn more. The following blog posts from TurbineLabs provide an excellent introduction.
