In this post, Cloud Architect Cyon John dives into the inner workings of Terraform, providing guidance on developing new resources for a Terraform provider.

Before we begin, this article is written under the assumption that the reader is already familiar with Terraform and how it works from an end-user perspective. In this post, we are looking into part of the inner workings of Terraform, with a focus on developing new resources for a Terraform provider. It does not cover creating a new provider, only adding resources to an existing one.

High-Level View

Terraform has a simple and highly decoupled model that makes it easy for third parties to develop and extend its features.


Terraform is split into a core and a number of plugins with which it interacts. The core and each of the plugins are independent, statically linked, binary executables. The core loads the plugins and communicates with them via RPC.

The core provides common functionalities like:

  • Language support – CLI, HCL, interpolation, functions, modules etc.
  • State management
  • Dependency graph and plan execution
  • Loading plugins and delegating the work to plugins

Terraform depends on plugins for provisioning/managing resources and performing any actions on them. There are two types of plugins:

  • Provider plugins – Create, update or delete actual resources, e.g. AWS, GCP, Azure, Domino's Pizza etc. (yes, there is a provider for ordering Domino's pizza via Terraform!)
  • Provisioner plugins – Take actions on resources post-creation, e.g. Ansible, remote-exec, file etc.

Our focus here is provider plugins and adding resources to an existing provider. Terraform has defined a mini SDK (earlier part of the core, recently separated into an independent Go module) that defines an interface for the resources and a few helper functions to aid development.

Terraform Workflow

Each provider plugin registers a schema, a configuration function and a list of resources and data sources with the core. Each resource, in turn, consists of a schema and a set of CRUD functions. Terraform core uses the schema to parse the resource attributes from the code and then invokes the CRUD functions as required to create, update, read or delete the resource. The SDK provides many utility functions for quickly defining reliable resources.

Workflow

Before starting the development, make sure that Golang 1.13 or newer is installed and configured properly.

The high-level flow is listed below:

  1. Fork the GitHub repo for the provider plugin. Each plugin is maintained in its own dedicated SCM repository
  2. Clone and check out the source code into your workspace
  3. Add new resource
  4. Register the new resource in the provider
  5. Add acceptance tests and make sure all the tests succeed
  6. Add documentation
  7. Push changes to your forked repo
  8. Create a PR to the main repository following the guidelines.

Here we will focus on steps 3, 4, 5 and 6.

Source repositories

Terraform core – https://github.com/hashicorp/terraform

Providers – https://github.com/terraform-providers

For developing a new resource we only need the code repository of the specific provider which we are enhancing; everything else is vendored in. In this article, it is the AWS provider, which is available at https://github.com/terraform-providers/terraform-provider-aws. All source code references in this article are relative to Terraform core or the AWS provider plugin.

Resource

Typically, each provider repository contains a directory (Golang package) named after the provider itself. Each resource is defined in its own source file named using the convention:

resource_<provider>_<resource name>.go

We will use the AWS traffic mirror filter as an example, and our first task is to create a new file

aws/resource_aws_traffic_mirror_filter.go

Next, we create a private initialization function (in Golang this means the function name starts with a lowercase letter), named appropriately –

func resourceAwsTrafficMirrorFilter() *schema.Resource

This function simply returns a pointer to a Golang struct of type `schema.Resource`. Here `schema` is a package defined in the Terraform plugin SDK and `Resource` is the data structure representing a resource in Terraform. Here is a quick peek at the `Resource` data type (showing only the minimum required fields; please refer to the source code for the full definition. Although there is not much SDK documentation, the source code is very well commented).

type Resource struct {
  Schema map[string]*Schema
  ...
  Create CreateFunc
  Read   ReadFunc
  Update UpdateFunc
  Delete DeleteFunc
  ...
}

NOTE: The word `schema` occurs here with multiple meanings – the package name in the SDK, a type and the name of a field.

Schema

The variable `Schema` inside `Resource` is a map whose keys are the names of the attributes supported by the resource and its values describe the nature of those attributes. The type `Schema` in the SDK is used to describe the nature of attributes of resources in a consistent way across providers.

Here is a quick look at the type `Schema` with its most common fields:

type Schema struct {
  Type ValueType // one of the ValueType constants, e.g. TypeString or TypeSet
  ...
  Optional bool // whether the attribute is optional
  Required bool // whether the attribute is mandatory
  ...
  Elem interface{} // when Type is a list, set or map, describes the type of its elements
  ...
  Description string
  ...
  ForceNew bool // whether a change to this attribute forces the resource to be recreated
  ...
  ConflictsWith []string // attributes that cannot be set together with this one
  ExactlyOneOf  []string // exactly one of the attributes in the list must be specified
  AtLeastOneOf  []string // at least one of the attributes in the list must be specified
  ...
  ValidateFunc SchemaValidateFunc
  ...
}

Here is what the schema for the traffic mirror filter resource looks like:

 

    Schema: map[string]*schema.Schema{
      "description": {
        Type:     schema.TypeString,
        Optional: true,
        ForceNew: true,
      },
      "network_services": {
        Type:     schema.TypeSet,
        Optional: true,
        Elem: &schema.Schema{
          Type: schema.TypeString,
          ValidateFunc: validation.StringInSlice([]string{
            "amazon-dns",
          }, false),
        },
      },
    },

And here is how the corresponding resource definition would look in Terraform:

 

resource "aws_traffic_mirror_filter" "filter" {
  description = "test filter"

  network_services = ["amazon-dns"]
}
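The `ValidateFunc` field used in the schema above is worth a closer look. Below is a minimal sketch of a hand-written validator, assuming the SDK's `SchemaValidateFunc` shape of `func(interface{}, string) ([]string, []error)`. The name `validateNetworkService` is hypothetical, and in practice `validation.StringInSlice` already covers this case; the sketch only illustrates what such a function does with the value it receives.

```go
package main

import "fmt"

// validateNetworkService is a hypothetical validator with the same shape as
// the SDK's SchemaValidateFunc: it receives the raw attribute value and the
// attribute name, and returns warnings and errors.
func validateNetworkService(v interface{}, k string) (warns []string, errs []error) {
	value, ok := v.(string)
	if !ok {
		errs = append(errs, fmt.Errorf("expected %q to be a string", k))
		return warns, errs
	}
	// "amazon-dns" is the only value allowed by the schema shown above.
	if value != "amazon-dns" {
		errs = append(errs, fmt.Errorf("%q must be \"amazon-dns\", got %q", k, value))
	}
	return warns, errs
}

func main() {
	_, errs := validateNetworkService("amazon-dns", "network_services")
	fmt.Println("valid value, errors:", len(errs))

	_, errs = validateNetworkService("amazon-ntp", "network_services")
	fmt.Println("invalid value, errors:", len(errs))
}
```

Because the validator is a plain function, it can be unit tested without touching any cloud API.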

CRUD Functions

Once we are ready with schema for the resource we need to provide the functions for creating, updating, reading and deleting (CRUD) the resource. The functions are defined in the same source file as private functions and the pointer/reference is stored in the `Resource` structure in appropriate fields.

Another common field in the `Resource` structure is the field `Importer`. Setting this field makes the resource importable.

Finally, the initialization function for traffic mirror filter would look like below:

 

func resourceAwsTrafficMirrorFilter() *schema.Resource {
  return &schema.Resource{
    Create: resourceAwsTrafficMirrorinFilterCreate,
    Read:   resourceAwsTrafficMirrorFilterRead,
    Update: resourceAwsTrafficMirrorFilterUpdate,
    Delete: resourceAwsTrafficMirrorFilterDelete,
    Importer: &schema.ResourceImporter{
      State: schema.ImportStatePassthrough,
    },
    Schema: map[string]*schema.Schema{
      ... // contents of the schema are shown earlier
    },
  }
}

Anatomy of a CRUD function

All CRUD functions should have the following signature:

func (d *schema.ResourceData, meta interface{}) error

A CRUD function receives two arguments when invoked by the core and returns an error, which should be `nil` on success. The two arguments are explained in detail below:

  • ResourceData – This holds the details of the resource, including the parsed config and the current state. It also has a set of utility functions, some of which we examine shortly.
  • meta – The second argument is a provider-specific interface value that must be type-asserted to the appropriate type before use. It holds the connection object for the provider. In the case of the AWS provider, this is a pointer to `AWSClient` and contains an established connection to AWS as defined in the provider configuration.

Note: The AWSClient has a member field for each client type supported by AWS SDK. If the new resource being added is managed by a new client type then there are additional tasks to be done in `aws/config.go` to define a new field for the client and initialize it as part of provider config.
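To illustrate the type assertion on the second argument, here is a small sketch. The `AWSClient` struct below is only a stand-in with a made-up `region` field so the example compiles on its own, and `clientFromMeta` is a hypothetical helper. It uses the two-value form of the assertion, which reports a mismatch instead of panicking.

```go
package main

import (
	"errors"
	"fmt"
)

// AWSClient is a stand-in for the provider's client struct; the real one
// holds one connection field per AWS service. The region field is made up.
type AWSClient struct {
	region string
}

// clientFromMeta converts the opaque meta argument into *AWSClient using the
// two-value form of a type assertion, which reports failure instead of panicking.
func clientFromMeta(meta interface{}) (*AWSClient, error) {
	client, ok := meta.(*AWSClient)
	if !ok || client == nil {
		return nil, errors.New("meta is not a usable *AWSClient")
	}
	return client, nil
}

func main() {
	var meta interface{} = &AWSClient{region: "eu-west-1"}
	client, err := clientFromMeta(meta)
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("region:", client.region)

	_, err = clientFromMeta("not a client")
	fmt.Println("error:", err)
}
```

The single-value form `meta.(*AWSClient)` is shorter and works as long as the provider always passes the expected type; it panics otherwise.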

Here is a short description of the utility functions provided by `ResourceData`:

  • SetId(id string) – Sets the ID of the resource in the Terraform state. If the value of `id` is blank, the resource is treated as destroyed and removed from the state.
  • Id() – Returns the ID of the resource.
  • GetOk(attribute string) (interface{}, bool) – Returns the value of the attribute and whether it has been set to a non-zero value. The first value is valid only if the second return value is true.
  • Set(attribute string, value interface{}) – Sets the value of an attribute in the state. The value should match the type for the attribute defined in the schema.
  • HasChange(attribute string) bool – Returns whether the value of the attribute has changed in the configuration. This is convenient in the update function to identify the changed attributes.
  • GetChange(attribute string) (old, new interface{}) – Should be called only if HasChange() returned true, in which case it returns the old and new values of the attribute.
  • Partial(bool) – Enables or disables partial state mode. When enabled, only the keys specified via the SetPartial method are saved in the final state. This is helpful when a resource needs multiple API calls and is created in steps, because it avoids destroying and recreating the resource when a later step fails.
  • SetPartial(attribute string) – Makes an attribute part of the partial state that is preserved. This has an effect only when partial mode is enabled.
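In an update function, the old and new values returned by GetChange usually have to be turned into "what to add" and "what to remove" before calling the provider API. The sketch below shows that step with plain string slices standing in for the SDK's set values; `diffServices` is a hypothetical helper name, not something from the SDK or the provider.

```go
package main

import "fmt"

// diffServices compares the old and new values of a set-like attribute and
// returns the entries to add and the entries to remove. Plain slices stand in
// for the set values an update function would obtain from GetChange, and are
// assumed to contain no duplicates.
func diffServices(oldVals, newVals []string) (add, remove []string) {
	oldSet := make(map[string]bool, len(oldVals))
	for _, v := range oldVals {
		oldSet[v] = true
	}
	newSet := make(map[string]bool, len(newVals))
	for _, v := range newVals {
		newSet[v] = true
		if !oldSet[v] {
			add = append(add, v) // present only in the new configuration
		}
	}
	for _, v := range oldVals {
		if !newSet[v] {
			remove = append(remove, v) // present only in the old state
		}
	}
	return add, remove
}

func main() {
	// Simulates removing network_services from the configuration.
	add, remove := diffServices([]string{"amazon-dns"}, nil)
	fmt.Println("add:", add, "remove:", remove)
}
```

The two result lists map naturally onto API inputs that take separate "add" and "remove" parameters.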

At a high level, the CRUD functions follow the pattern below –

  1. Retrieve the provider connection object
  2. Fetch config/state details from the ResourceData
  3. Initialize the input structures for provider API call and make the call
  4. Update the state of the resource
  5. Return success/error.

Create function – Reads the attributes from the ResourceData, invokes the provider API to provision the resource and updates the attributes of the created resource in the ResourceData. Special care should be taken to set a unique value (from the provider's perspective) as the ID via SetId(), because the read function has to retrieve the resource details using this value alone. If no single value identifies the resource, it might be necessary to create a composite ID.

Update function – Identifies the configuration changes and updates the resource with minimal impact. If the only way to update a resource is to destroy and recreate it, the attributes can be marked `ForceNew` and the update function omitted; Terraform core then uses destroy and create to achieve the update.

Read function – Retrieves the Id() from ResourceData, fetches the latest details of the resource from the provider and updates the ResourceData.

Delete function – Retrieves the Id() from ResourceData and deletes the resource in the provider. If successful, sets the ID to blank.
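The composite ID mentioned for the create function can be sketched as a pair of helpers: one builds the ID that is passed to SetId(), the other splits the value returned by Id() in the read, update and delete functions. The helper names, the ":" separator and the sample IDs are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// buildCompositeID joins two identifiers into a single resource ID.
// The ":" separator is an arbitrary choice and must not occur in parentID.
func buildCompositeID(parentID, childID string) string {
	return parentID + ":" + childID
}

// parseCompositeID splits an ID produced by buildCompositeID and rejects
// values that do not have both parts.
func parseCompositeID(id string) (parentID, childID string, err error) {
	parts := strings.SplitN(id, ":", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("unexpected ID format %q, expected PARENT:CHILD", id)
	}
	return parts[0], parts[1], nil
}

func main() {
	// The IDs below are made-up placeholders.
	id := buildCompositeID("tmf-0123", "tmfr-4567")
	parent, child, err := parseCompositeID(id)
	fmt.Println(id, parent, child, err)
}
```

Validating the format in the parse helper also gives users a clear error when they pass a malformed ID to `terraform import`.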

As an example, here is the create function for the traffic mirror filter:

 

func resourceAwsTrafficMirrorinFilterCreate(d *schema.ResourceData, meta interface{}) error {
  // Retrieve the EC2 connection from the provider-specific client.
  conn := meta.(*AWSClient).ec2conn

  // Build the API input from the attributes set in the configuration.
  input := &ec2.CreateTrafficMirrorFilterInput{}
  if description, ok := d.GetOk("description"); ok {
    input.Description = aws.String(description.(string))
  }

  out, err := conn.CreateTrafficMirrorFilter(input)
  if err != nil {
    return fmt.Errorf("error creating traffic mirror filter: %s", err)
  }

  // Record that "description" has been applied successfully.
  d.Partial(true)
  d.SetPartial("description")
  d.Partial(false)

  // Store the ID returned by AWS, then let the update function apply the remaining attributes.
  d.SetId(*out.TrafficMirrorFilter.TrafficMirrorFilterId)
  return resourceAwsTrafficMirrorFilterUpdate(d, meta)
}

Registering new resource

This is easy. Open the source file `aws/provider.go` and look for the field `ResourcesMap` in the `Provider` data structure. This is a map whose keys are the resource names (as they appear in the configuration) and whose values are `Resource` data structures, which by convention are set by calling the initialization function we defined.
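The shape of that registration can be sketched as follows. `Resource` and `Provider` here are simplified stand-ins for `schema.Resource` and `schema.Provider` so that the example compiles on its own, and `newProvider` is a hypothetical name; in the provider itself you would only add one line to the existing map.

```go
package main

import "fmt"

// Resource and Provider are simplified stand-ins for schema.Resource and
// schema.Provider from the plugin SDK, reduced to what this sketch needs.
type Resource struct{}

type Provider struct {
	ResourcesMap map[string]*Resource
}

// resourceAwsTrafficMirrorFilter plays the role of the initialization
// function defined in aws/resource_aws_traffic_mirror_filter.go.
func resourceAwsTrafficMirrorFilter() *Resource {
	return &Resource{}
}

// newProvider mirrors the shape of the provider definition: the key is the
// resource name used in configuration, the value comes from calling the
// initialization function.
func newProvider() *Provider {
	return &Provider{
		ResourcesMap: map[string]*Resource{
			"aws_traffic_mirror_filter": resourceAwsTrafficMirrorFilter(),
		},
	}
}

func main() {
	p := newProvider()
	_, registered := p.ResourcesMap["aws_traffic_mirror_filter"]
	fmt.Println("registered:", registered)
}
```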


Building and testing

Terraform provides a Makefile as a wrapper to run the go build commands. Though the plugin could be built directly using go commands, it is better to use the Makefile as it reduces the chance of build failures in CI when we create the pull request. Run the following commands in the root directory of the source repository to build the plugin:

$ make lint
$ make build

 

The plugin is placed in the `$GOPATH/bin/` directory with the name of the provider. In the case of AWS it is `terraform-provider-aws`.

Now for testing the new resource, I prefer writing the Terraform configuration and running Terraform commands. I find this faster than the acceptance test framework, where each step has to create and destroy resources every time. For this to work, the plugin must be made visible to Terraform. The detailed steps for plugin discovery are documented here – https://www.terraform.io/docs/extend/how-terraform-works.html. What I found easiest is to copy the binary we built into the directory where the Terraform code resides, following a naming convention. For example:

$ cd <Terraform config directory>
$ cp $GOPATH/bin/terraform-provider-aws ./terraform-provider-aws_v3.35.0_x4
$ terraform init
$ terraform plan -out apply.out
$ terraform apply apply.out

Note: Terraform needs to be re-initialized to use the new plugin, so running `terraform init` is important.

 

Acceptance Tests

HashiCorp does not accept pull requests that lack appropriate acceptance tests for the new resources or bug fixes. Terraform uses the standard testing framework of Golang and adds a set of helper functions to make writing test cases easier. For each resource there should be corresponding tests defined in a source file in the same directory with the following naming convention:

resource_<provider>_<resource name>_test.go

For example, the tests for the traffic mirror filter resource defined in resource_aws_traffic_mirror_filter.go should be in the file resource_aws_traffic_mirror_filter_test.go.

The `TestCase` data structure in the `resource` package of the SDK should be used to define a single acceptance test case that tests the create/update/delete lifecycle of a resource. Read gets tested implicitly in the process. A single acceptance test can simulate multiple Terraform applies in sequence. The SDK also provides a function `ParallelTest()` to run the test. Here are some of the important fields of `TestCase`:

  • PreCheck – A callback function to verify any prerequisites for the test case. At the bare minimum, the `testAccPreCheck()` function from aws/provider_test.go can be reused to verify the connectivity to the provider. If possible, an additional custom function can be implemented to quickly verify that the provided credentials have the permissions needed for the resources.
  • Providers – The providers to be used for testing. Most of the time it is sufficient to set this to the global variable `testAccProviders`. If the test requires multiple providers then provider factories need to be used, which is not covered here.
  • CheckDestroy – A callback function that is called at the end of the test case. The function is expected to iterate through all resources in the state and make sure they are actually deleted from the provider.
  • Steps – The list of TestSteps that make up the test case. TestSteps are described in more detail below.

Each TestCase can have multiple steps, with each step corresponding to a Terraform apply action. Since the Terraform state is passed from one test step to the next, this can be used to simulate updates/changes to resources. At the end of the test case the framework deletes all resources that are present in the state. So there is no need to define a test step for delete; instead, the `CheckDestroy` callback verifies that all resources are indeed deleted.

Each test step typically sets two fields:

  1. Config – The Terraform config that defines the resource under test. The framework performs the apply action on this config and, when ready, invokes the `Check` function to verify the result.
  2. Check – A callback function that verifies that the resource has been created as desired. If the resource is not in the desired state, the function should return an error.

There are many utility functions provided to verify the desired state of the resources and here are some of the common ones –

  • testAccCheckAwsTrafficMirrorFilterExists(resource string) – A custom check written alongside the tests (not an SDK helper) that verifies that the resource is present in the Terraform state.
  • TestCheckResourceAttr(resource string, attribute string, expected string) – Verifies that the named resource has the expected value for the specified attribute in the Terraform state.
  • TestCheckNoResourceAttr(resource string, attribute string) – Verifies that the resource doesn't have the attribute set in the Terraform state.
  • ComposeTestCheckFunc(fs ...TestCheckFunc) – Helper to combine multiple assertions into a single check function. The function accepts a list of TestCheck* functions and returns a function that returns an error if any one of them fails.
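The idea behind composing checks can be sketched without the SDK. In the example below, `checkFunc` is a stand-in for the SDK's `TestCheckFunc`, a plain map stands in for the Terraform state, and `compose` and `attrEquals` are hypothetical names that only mimic the spirit of ComposeTestCheckFunc and TestCheckResourceAttr. This sketch stops at the first failing check.

```go
package main

import "fmt"

// checkFunc is a stand-in for the SDK's TestCheckFunc; a map of attribute
// names to values stands in for the Terraform state.
type checkFunc func(state map[string]string) error

// compose returns a check that runs each given check in order and stops at
// the first failure.
func compose(checks ...checkFunc) checkFunc {
	return func(state map[string]string) error {
		for i, check := range checks {
			if err := check(state); err != nil {
				return fmt.Errorf("check %d/%d failed: %w", i+1, len(checks), err)
			}
		}
		return nil
	}
}

// attrEquals is a hypothetical check that compares one attribute value.
func attrEquals(key, want string) checkFunc {
	return func(state map[string]string) error {
		if got := state[key]; got != want {
			return fmt.Errorf("%s: expected %q, got %q", key, want, got)
		}
		return nil
	}
}

func main() {
	state := map[string]string{"description": "test filter"}
	check := compose(
		attrEquals("description", "test filter"),
		attrEquals("description", "something else"),
	)
	fmt.Println(check(state))
}
```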

The basic test flow is to define a Terraform configuration for the resource and pass it as the `Config` of a TestStep, together with a `Check` function that verifies the result.

As noted above, the state is passed from one TestStep to the next, which makes it possible to test update and delete scenarios.

Here is the test case for traffic mirror filter:

 

func TestAccAWSTrafficMirrorFilter_basic(t *testing.T) {
  resourceName := "aws_traffic_mirror_filter.filter"
  description := "test filter"

  resource.ParallelTest(t, resource.TestCase{
    PreCheck: func() {
      testAccPreCheck(t)
      testAccPreCheckAWSTrafficMirrorFilter(t)
    },
    Providers:    testAccProviders,
    CheckDestroy: testAccCheckAwsTrafficMirrorFilterDestroy,
    Steps: []resource.TestStep{
      //create
      {
        Config: testAccTrafficMirrorFilterConfig(description),
        Check: resource.ComposeTestCheckFunc(
          testAccCheckAwsTrafficMirrorFilterExists(resourceName),
          resource.TestCheckResourceAttr(resourceName, "description", description),
          resource.TestCheckResourceAttr(resourceName, "network_services.#", "1"),
        ),
      },
      // Test Disable DNS service
      {
        Config: testAccTrafficMirrorFilterConfigWithoutDNS(description),
        Check: resource.ComposeTestCheckFunc(
          testAccCheckAwsTrafficMirrorFilterExists(resourceName),
          resource.TestCheckNoResourceAttr(resourceName, "network_services"),
        ),
      },
      // Test Enable DNS service
      {
        Config: testAccTrafficMirrorFilterConfig(description),
        Check: resource.ComposeTestCheckFunc(
          testAccCheckAwsTrafficMirrorFilterExists(resourceName),
          resource.TestCheckResourceAttr(resourceName, "description", description),
          resource.TestCheckResourceAttr(resourceName, "network_services.#", "1"),
        ),
      },
      {
        ResourceName:      resourceName,
        ImportState:       true,
        ImportStateVerify: true,
      },
    },
  })
}

Running the tests

Running Terraform acceptance tests creates real resources and can cost money, so please make sure that the correct account is configured by setting AWS_PROFILE properly.

NOTE: If your profile is in ~/.aws/config instead of ~/.aws/credentials, be aware that the Golang AWS SDK doesn't load the config file by default. You need to explicitly set AWS_SDK_LOAD_CONFIG=true.

The test can be invoked from the root directory using the following command:

$ make testacc TEST=./aws TESTARGS="-run=TestAccAWSTrafficMirrorFilter"

There is a community on Gitter where you can ask for help if you run into issues.

Documentation

HashiCorp also requires the pull request to include appropriate documentation for the resources. For a new resource, we first need to add the appropriate HTML to the file `website/aws.erb`. The easiest way is to copy the lines corresponding to an existing resource, insert them in the appropriate place (the entries are alphabetically ordered) and update the text. This is used by CI tools to generate the sidebar which we see on the Terraform documentation page. Here is an example of changes made for traffic mirroring resources:

Documentation for Terraform

Now we also need to create a new file (one per resource) that corresponds to the hyperlink we added to the sidebar. These are markdown documents named `website/docs/r/<resource name>.html.markdown`. Again, it is easier to copy an existing document and update it, as that makes it easy to conform to the requirements. You can also run the following command to perform some sanity checks on the documentation:

$ make docscheck

Pull Request

Once all the tests are passing and the documents are ready, you can create a pull request from your forked repository to the original repository. There is a set of guidelines to follow when creating pull requests, described here:

https://github.com/terraform-providers/terraform-provider-aws/blob/master/.github/CONTRIBUTING.md#pull-requests
