Bringing Squid Proxy Into The 21st Century With AWS Fargate
When talking about building and deploying applications in the AWS ecosystem, one topic that comes up without fail is how to securely manage outbound internet traffic from private subnets. How can you operate a controlled environment that prevents data exfiltration or possible data leaks with a minimum amount of management overhead?
There are commercial tools available, such as next-generation firewalls or web proxies, that can filter or block outbound web traffic, but these tools require a license as well as ongoing maintenance of the software and the related AWS EC2 infrastructure.
AWS has published an excellent article, "How to Add DNS Filtering to Your NAT Instance with Squid", which covers the reasons for choosing a Squid-based solution to solve this problem.
Inspired by this solution, I want to take the architecture and apply modern AWS technologies like AWS Fargate and the Network Load Balancer to bring the solution into the cloud-native realm.
The solution is based on the following principles:
- Provides a secure internet connection to a wide AWS landscape (multi-account/multi-region)
- No servers to maintain, update, or upgrade
- Needs to support high bandwidth throughput
- Highly available solution
- Flexible cost based on usage
Why did I use these AWS services?
AWS Fargate is a compute engine for Amazon ECS that allows you to run containers without having to manage servers or clusters. With AWS Fargate, you no longer have to provision, configure, and scale clusters of virtual machines to run containers. This removes the need to choose server types, decide when to scale your clusters, or optimize cluster packing. AWS Application Auto Scaling enables you to configure automatic scaling for AWS Fargate in a matter of minutes.
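To make the scaling part concrete, here is a minimal sketch of the two requests Application Auto Scaling expects for an ECS service running on Fargate. The cluster name, service name, task limits, and CPU target are hypothetical values chosen for illustration, not the settings used in this solution. The block only builds the request parameters and does not call AWS.

```python
def fargate_scaling_requests(cluster, service, min_tasks, max_tasks, cpu_target):
    """Build the parameters for registering an ECS service as a scalable
    target and attaching a CPU-based target-tracking policy to it."""
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    # Bounds for the number of running Fargate tasks.
    target = dict(common, MinCapacity=min_tasks, MaxCapacity=max_tasks)
    # Keep the average CPU of the service close to cpu_target percent.
    policy = dict(
        common,
        PolicyName=f"{service}-cpu-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": float(cpu_target),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    )
    return target, policy


# Hypothetical names and limits, for illustration only.
target, policy = fargate_scaling_requests("proxy-cluster", "squid", 2, 10, 60)
```

With boto3 installed and credentials configured, these dictionaries could be passed to the `application-autoscaling` client as `register_scalable_target(**target)` and `put_scaling_policy(**policy)`.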
AWS Network Load Balancer operates at the connection level (Layer 4), routing connections to targets (Amazon EC2 instances, microservices, and containers) within Amazon Virtual Private Cloud (Amazon VPC) based on IP protocol data. Ideal for load balancing TCP traffic, Network Load Balancer is capable of handling millions of requests per second while maintaining ultra-low latencies.
Amazon CloudWatch Logs allows you to collect and store logs from your resources, applications, and services in near real-time. Using the AWS ECS awslogs log driver makes it possible to publish the Docker container's output log without any additional tooling.
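As an illustration of the awslogs driver, the sketch below builds an ECS container definition whose output is sent to CloudWatch Logs. The image name, log group, region, and stream prefix are hypothetical placeholders; the actual values live in the Terraform code of the repository. Port 3128 is Squid's default listening port.

```python
import json


def squid_container_definition(image, log_group, region, stream_prefix="squid"):
    """Return an ECS container definition that uses the awslogs log driver."""
    return {
        "name": "squid",
        "image": image,
        "essential": True,
        # 3128 is the default Squid port.
        "portMappings": [{"containerPort": 3128, "protocol": "tcp"}],
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": stream_prefix,
            },
        },
    }


# Hypothetical image and log group, for illustration only.
definition = squid_container_definition(
    image="example/squid:latest",
    log_group="/ecs/squid",
    region="eu-west-1",
)
container_definitions_json = json.dumps([definition], indent=2)
print(container_definitions_json)
```

The resulting JSON has the shape of the `containerDefinitions` list of an ECS task definition, which is also what Terraform's `aws_ecs_task_definition` resource accepts in its `container_definitions` argument.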
Why use AWS Network Load Balancer?
AWS Network Load Balancer offers flexible configuration and high-performance connections, and it also makes it possible to configure a "Service Endpoint". A Service Endpoint lets you publish the Squid UTM Service securely across multiple accounts and multiple regions, using VPC PrivateLink Inter-Region, while controlling allowed and blocked traffic in a single location.
This solution combines Infrastructure as Code, using Terraform, with the AWS ECS deployment strategy to update the configuration of the Squid farm with zero downtime.
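From the point of view of a workload in a consumer account, the published service is simply an HTTP proxy reachable at the endpoint's DNS name. A minimal sketch of a client using it, assuming a hypothetical endpoint hostname and Squid's default port 3128:

```python
import urllib.request

# Hypothetical DNS name of the endpoint in the consumer VPC.
PROXY_URL = "http://squid.proxy.example.internal:3128"


def build_proxied_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxied_opener(PROXY_URL)
# A request would then go through Squid, for example:
# opener.open("https://aws.amazon.com/", timeout=10)
```

Many tools can be pointed at the same address through the conventional `HTTP_PROXY` and `HTTPS_PROXY` environment variables instead of code changes.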
This solution enables:
- Internet access through a proxy with a controlled whitelist/blacklist
- Avoiding AWS VPC peering and its complex routing by using an AWS Service Endpoint
- High availability, with ECS maintaining the required number of Fargate tasks
- No more patches or updates to maintain the base OS
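To illustrate what whitelist-based filtering means in practice, here is a simplified model of domain matching. It loosely follows the convention of Squid's `dstdomain` ACL, where an entry with a leading dot matches the domain and all of its subdomains. This is a sketch for explanation only, not Squid's implementation, and the domains listed are hypothetical examples.

```python
def is_allowed(host, allowlist):
    """Return True if host matches an allowlist entry.

    An entry such as ".example.com" matches example.com and any subdomain;
    an entry without a leading dot matches that exact host only.
    """
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("."):
            # Match the bare domain or anything ending in ".domain".
            if host == entry[1:] or host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False


# Hypothetical allowlist, for illustration only.
ALLOWLIST = [".amazonaws.com", "github.com"]

for candidate in ["s3.amazonaws.com", "github.com", "api.github.com", "example.org"]:
    print(candidate, "allowed" if is_allowed(candidate, ALLOWLIST) else "blocked")
```

Requests to hosts that match no entry would be denied by the proxy, which is what keeps outbound traffic limited to known destinations.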
The final solution satisfies all of these principles, enabling the use of Squid in a highly dynamic environment:
- AWS ECS handles zero-downtime deployment on every configuration change and also ensures the high availability and load scaling of AWS Fargate.
- AWS Network Load Balancer provides high-throughput, ultra-low-latency connectivity across regions and accounts.
Combined, these services transform and modernise URL filtering with Squid into a cloud-friendly design.
All Terraform code and Docker configurations are available on GitHub. Please help us improve this solution.
If you have any implementation or troubleshooting questions, please open an issue in our repository.