How To Simplify Your Terraform Code Structure

Cloudreach Cloud Systems Developer, Manuel Belsak, discusses how to approach organizing and simplifying your Terraform code.

Terraform is an open source tool that allows you to create, change and improve your infrastructure as code. You can share the code with colleagues so they can reuse and improve it. The state of your infrastructure is saved in a local or remote location, adding flexibility to deployment.

My interaction with Terraform started just over a year ago. In that time, the structure and organization of my Terraform code have continuously developed, and I have discovered new ways to improve and simplify my code. The biggest problem at the start was how to organize Terraform code in a way that is clean and reusable. In this blog post, I will describe how you could organize your Terraform code.

This blog post assumes that you have basic knowledge of Terraform, and it will focus on how to organize your code. (You can find more information about Terraform here; the documentation is well structured, clear and full of examples.) All examples in this article are based on the AWS provider, but the same logic can be applied to any Terraform provider.

 

The goal

  1. Build a central place from which everyone can pull Terraform code, contribute and reuse it
  2. Build independent project components
  3. Have a centralized remote place for storing Terraform state files
  4. Allow independent development of project components so developers are not blocking each other
  5. Use your preferred hosting service for version control. In this case, I will use Bitbucket

With these guidelines there are multiple possible solutions, but the most important thing is that the solution is flexible and easy to use and understand.

Centralized Terraform modules

Terraform comes with functionality that allows you to group resources into a meaningful unit. This is called a Terraform module. You can imagine a module as a container for multiple resources, and each container is parametrized. This means you can deploy multiple groups of resources by reusing the same module and just passing different parameters.

For example, take creating a VPC in AWS. Usually, when you create a VPC, you also want to create DHCP options, or an Internet gateway if you want public access. In Terraform, you can build a module that contains the creation of these AWS resources. After defining it, you reference the module with the desired parameters and all the resources will be deployed. With this approach, you are creating standardized Terraform code and you have full control of what will be deployed and how.
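A minimal sketch of such a module and its invocation might look like this (the module path, variable names and CIDR values are illustrative, not from the original post):

```hcl
# modules/network/vpc/main.tf -- a hypothetical centralized VPC module
resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
}

resource "aws_vpc_dhcp_options" "this" {
  domain_name_servers = var.dns_servers
}

resource "aws_vpc_dhcp_options_association" "this" {
  vpc_id          = aws_vpc.this.id
  dhcp_options_id = aws_vpc_dhcp_options.this.id
}
```

The same module can then be reused anywhere by passing different parameters:

```hcl
module "vpc" {
  source      = "./tf_modules/network/vpc"
  vpc_cidr    = "10.0.0.0/16"
  dns_servers = ["AmazonProvidedDNS"]
}
```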

There are multiple ways to group Terraform modules, but the two main ones are:

  1. Have all Terraform modules in one centralized place – for example, a git repository
  2. Have decentralized Terraform modules – each module in a separate git repository

Both of the approaches have pros and cons, and, in the end, the structure of modules depends on the requirements of your project and your preferences.

AWS has covered the implementation of decentralized Terraform modules, so I will focus on a centralized organisational example. 

The idea is that you need one Bitbucket (or any other source control service) repository that will contain all Terraform modules. Each of the modules is controlled by you, and everyone can contribute to this repository. When a colleague adds a new module to the repository, they create a Pull Request; the Pull Request is reviewed and then merged into the repository. With that, you have quality control over the modules.

Example repo name: “aws-central-Terraform-modules”

The picture above is an example structure of Terraform modules. The root level of the Bitbucket repository (aws-central-Terraform-modules) contains folders, and each folder represents a logical group of project components. For example, the “audit” folder (group) will contain module definitions for AWS CloudTrail and AWS Config.
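For readers without the screenshot, the layout described here might look roughly like this (subfolders beyond those mentioned in the text are illustrative):

```
aws-central-Terraform-modules/
├── audit/
│   ├── cloudtrail/
│   └── config/
├── iam/
└── network/
    ├── nat_gateway/
    ├── route53/
    ├── subnet/
    ├── vpc/
    └── vpn/
```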

The “aws-central-Terraform-modules/network” folder contains all AWS networking related Terraform modules, like nat_gateway, route53, subnet, vpc and vpn. Every one of these modules has the standardized file structure (main.tf, output.tf, variables.tf and README.md), which is nicely covered in the official Terraform documentation. This file structure is not mandatory, but it is recommended and is the standard in the community. The extra file is naming.tf, which contains Terraform code to enforce the naming convention of deployed resources.
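The contents of naming.tf are not shown in the post; one way such a file could enforce a convention is with a local value that the module's resources then reference (the variable names and name pattern below are assumptions):

```hcl
# naming.tf -- hypothetical naming convention helper inside a module
locals {
  # e.g. "myproject-prod-eu-central-1-vpc"
  resource_name = "${var.project}-${var.environment}-${var.aws_region}-vpc"
}
```

Resources in main.tf would then use `local.resource_name` (for example in `tags`) instead of free-form names, so every deployment of the module follows the same pattern.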

As another example, let’s take a look into folder and file structure of the instance profile module:

The picture is really similar to the previous one. The logical folder group for the instance profile, from an AWS perspective, is IAM. In the root folder of “aws-central-Terraform-modules” we have the “iam” folder, and that folder contains three IAM modules: instance_profile, policy and role. Again, each of the modules has the same file structure; only the Terraform code inside the files is adjusted to a specific purpose.

Implementation of Terraform modules

What we have defined so far is a centralized place for Terraform modules and a logical structure for each module, backed by a standardized naming convention (which is optional). To use Terraform modules you need to build “Terraform solutions”; by solution, I mean Terraform code that calls the pre-defined centralized Terraform modules with specific parameters. What you want to achieve is standardisation of solutions, so that each solution follows the same implementation standard. For example, a solution can be a web application, a monitoring solution, a logging solution, AWS account creation and management, etc.

When designing solutions, you need to make sure that every solution is as decentralized and isolated as possible, because then it is easier to develop in parallel and managing state files is simplified.

For a really simple example let’s take deployment of WordPress site to AWS, and let’s assume that we want to separate deployment into two solutions (logical groups):

  1. Deployment of VPC, Subnets, NACLs, Route53 and IGW. This solution will contain all VPC and connectivity related components.
  2. Deployment of Instance profile, AutoScaling group, Security group, etc. This solution will contain all compute components and security groups.

Each solution has a distinct purpose and can be developed on its own. Responsibilities are clearly divided, and people with more specialized knowledge can control the separate solutions. The WordPress developers can fully focus on website development while networking/security specialists work on hardening the underlying infrastructure, and in the end, both sides are aware of each other’s work.

Every solution has a separate Bitbucket repository with the following structure:

  1. parameters – Contains the Terraform parameters for each deployment environment
  2. pipelines – If you are deploying the solution using CI/CD, you can put your pipelines here
  3. solutions – Contains the implementation code of the defined centralized modules. Here you call the defined modules and pass parameters
  4. tf_modules – This folder contains the pointer to the centralized Terraform modules. If you are using git, this folder can be a submodule pointing at the centralized Terraform modules repository
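Put together, a solution repository following this structure might look like (the repository name is a placeholder):

```
<solution-repository>/
├── parameters/
│   ├── dev/
│   ├── prod/
│   └── global/
├── pipelines/
├── solutions/
└── tf_modules/        # git submodule -> aws-central-Terraform-modules
```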

 

The repository (“aws-compute-wordpress”) for the Terraform solution that contains all VPC related components looks like this:

The content of solutions folder:

  1. lambda_sources – This is the recommended default folder for all Terraform solutions. It contains the source code for Lambda functions
  2. user-data – This is the recommended folder for all Terraform solutions. It contains user-data scripts or templates
  3. outputs.tf – This is the recommended file for all Terraform solutions. It contains the Terraform outputs of the solution
  4. variables.tf – This is the recommended file for all Terraform solutions. It contains the Terraform variables of the solution
  5. vpn.tf, vpc.tf, route53.tf – These are solution-specific Terraform files that contain the actual calls and parameterization of the centralized Terraform modules
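As a sketch of what one of those files might contain, vpc.tf could call the centralized VPC module through the tf_modules submodule (the module path and variable names are assumptions for illustration):

```hcl
# solutions/vpc.tf -- calls the centralized module with solution parameters
module "wordpress_vpc" {
  source      = "../tf_modules/network/vpc"
  vpc_cidr    = var.vpc_cidr
  dns_servers = var.dns_servers
}
```

The concrete values for `var.vpc_cidr` and `var.dns_servers` come from the tfvars files in the parameters folder, so the same file deploys every environment.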

The parameters folder contains the Terraform parameters for each environment in a specific region. The folder structure is:

  1. [environment name] – This folder contains environment and region-specific Terraform parameter files. You can have multiple environment folders for one solution.
  2. global – This folder contains common parameters for all environments and regions.

The idea is to deploy the same solution to multiple environments (dev, stage, prod), optionally in different AWS regions. For each environment, you need a separate folder under the parameters folder. Each folder represents an individual environment and contains regional Terraform parameter files as well as a Terraform config file. The naming for regional parameter files is ‘aws-region-name.tfvars’: for Frankfurt it will be ‘eu-central-1.tfvars’, for Paris it will be ‘eu-west-3.tfvars’, and so on.
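A concrete parameters folder for this scheme might look like the following, with a small tfvars sketch (the values are illustrative):

```
parameters/
├── dev/
│   ├── config.tf
│   ├── eu-central-1.tfvars
│   └── eu-west-3.tfvars
├── prod/
│   ├── config.tf
│   └── eu-central-1.tfvars
└── global/
    └── common.tfvars
```

```hcl
# parameters/dev/eu-central-1.tfvars -- hypothetical values for one
# environment/region combination
environment = "dev"
vpc_cidr    = "10.10.0.0/16"
dns_servers = ["AmazonProvidedDNS"]
```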

The config.tf file will contain the Terraform backend specification, such as:

  1. Where is your Terraform state file located
  2. Any remote state files you are using
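For example, with an S3 backend, config.tf could look like this (the bucket, key and region are placeholders, not values from the original post):

```hcl
# parameters/dev/config.tf -- hypothetical S3 backend specification
terraform {
  backend "s3" {
    bucket = "my-terraform-state-bucket"
    key    = "wordpress/vpc/dev/terraform.tfstate"
    region = "eu-central-1"
  }
}
```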

It is recommended to get possible input values from remote Terraform state files. For example, if you are deploying an EC2 instance, you may want to get the vpc_id from a remote state file rather than asking for it as a Terraform input.
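A sketch of that pattern, using the `terraform_remote_state` data source (Terraform 0.12+ syntax; the bucket, key and output names are assumptions):

```hcl
# Read the VPC solution's state instead of asking for vpc_id as an input
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state-bucket"
    key    = "wordpress/vpc/dev/terraform.tfstate"
    region = "eu-central-1"
  }
}

resource "aws_instance" "wordpress" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  # subnet_id comes from an output exposed by the VPC solution
  subnet_id     = data.terraform_remote_state.vpc.outputs.public_subnet_id
}
```

This only works if the VPC solution declares the corresponding value in its outputs.tf, which is another reason to keep that file in every solution.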

In this case, when running ‘terraform init’, you have to specify the path to config.tf so Terraform will know where to get state files. More about loading backend configuration can be found here.

You can have different strategies about where to save state file, how to lock it, how to use workspaces and so on. I don’t want to go too deep into that topic in this article. 

If you are using git to manage your code, I would suggest that you use git submodules for the tf_modules folder. It works perfectly with both public and private git servers.

Terraform solutions Overview

The described approach is just one way to make your Terraform code more flexible, with centralized, reusable modules and decentralized solution implementations. Try to avoid making your Terraform modules too granular; build them with a specific purpose and manage the behaviour of a module with parameters. For example, don’t build a Terraform module that deploys just a plain subnet, because the deployment of subnets always comes with NACLs, route tables, route table associations and route table rules. So why not merge all of these resources into one module and define, with parameters, what you want to deploy and how?
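One way to express that with parameters is a conditional `count` on the optional resources (Terraform 0.12+ syntax; the variable names are illustrative):

```hcl
# A hypothetical subnet module that deploys the subnet plus its routing,
# with the extra resources toggled by a parameter instead of split into
# separate tiny modules
resource "aws_subnet" "this" {
  vpc_id     = var.vpc_id
  cidr_block = var.subnet_cidr
}

resource "aws_route_table" "this" {
  count  = var.create_route_table ? 1 : 0
  vpc_id = var.vpc_id
}

resource "aws_route_table_association" "this" {
  count          = var.create_route_table ? 1 : 0
  subnet_id      = aws_subnet.this.id
  route_table_id = aws_route_table.this[0].id
}
```

Callers that only need a bare subnet set `create_route_table = false`; everyone else gets the full, standardized group of resources from one module call.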

The WordPress solution would look like this in the end:

You will have:

  1. A repository that will contain the implementation of VPC Terraform components
  2. A repository that will contain WordPress compute Terraform components
  3. A repository that will contain centralized Terraform modules
  4. A remote location where you can save your Terraform state files
  5. The WordPress compute solution Terraform configuration will contain a reference to remote Terraform state file of VPC Terraform component solution