038: Infrastructure provisioning with Terraform

Terraform is an Infrastructure-as-Code framework that is used to provision network, storage and computing resources on (nearly) any Cloud environment.


The weekly mood

I am back from vacation while many colleagues are still on leave. This gives me the opportunity to review my past achievements, go back to some topics I had not yet fully understood or remembered, and think about further plans. I realize how difficult it is to build my network and catch up with projects when working from home. Naturally, I spend more time reading than chatting, with the consequence that I don't get to know anybody or learn anything by chance, only on purpose. Also, I wonder a bit about my goals and areas of focus, which are still split between concrete use-cases and general practices.

I still have to get more confident with DevOps, so I am now looking at Terraform.


What is Terraform

Terraform is an open-source infrastructure-as-code (IaC, aka "InfraCode") software tool created by HashiCorp. In case you have never heard of HashiCorp, they are also the maintainers of Vagrant, Consul, Packer, Nomad and Vault.
Terraform helps with defining, versioning, provisioning (day 1) and maintaining (day n+) operational infrastructure.
As opposed to Ansible, which is procedural InfraCode and therefore based on conditional instructions (i.e. "how to set up and repair"), Terraform is descriptive (declarative) and therefore based on desired state (i.e. "what to set up and repair").
The configuration is written in a high-level language called HashiCorp Configuration Language (HCL), which also has a JSON-compatible syntax. In the version used here (v0.11), the HashiCorp Interpolation Language (HIL) complements it with "${...}" expressions, since HCL itself does not support templating or dynamic expressions.
Terraform configuration parameters are always specific to an infrastructure provider, since Terraform has no concept of platform-independent resource types.
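To make the "desired state" idea concrete, here is a minimal sketch in the v0.11-style syntax used throughout this post. It assumes the HashiCorp "local" provider and its local_file resource; the variable name, resource name and file name are purely illustrative and not part of the examples further below.

```hcl
# Hypothetical input variable with a default value.
variable "greeting" {
  default = "hello"
}

# Desired state: a file with this content exists at this path.
# There are no "if missing then create" steps; Terraform works out the actions.
resource "local_file" "note" {
  content  = "${var.greeting} from Terraform"
  filename = "${path.module}/note.txt"
}
```

Running terraform apply twice in a row should report no changes the second time, because the real file already matches the description.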

Terraform components
  • (Free) Terraform CLI is the core application available for Mac, Linux, Windows, FreeBSD and Solaris.
  • (Free) Terraform Providers are infrastructure-specific plugins that connect to IaaS/PaaS, ex. AWS
  • (Commercial) Terraform Cloud is the paid platform that includes:
    • either access to the managed service or the self-hosted distribution called "Terraform Enterprise"
    • developer collaboration (registry)
    • workflow orchestration (control plane)
    • support
The value of the Commercial vs. Free offerings is discussed here.


Further utilities
  • HCL/HIL support is available as a JetBrains IntelliJ/PyCharm plugin and a Microsoft VSCode extension
  • Terragrunt is an open-source thin wrapper for Terraform CLI that provides extra tools for keeping your configurations DRY, working with multiple Terraform modules, and managing remote state.
  • Kubestack is an open-source framework that builds on top of Terraform for DevOps teams that want to implement GitOps as their operational model.
  • Terraform visual is an open-source lean editor for Terraform configuration
  • Codeherent is a general purpose visual development platform which integrates with Terraform

Terraform configurations
  • Data source is the input of a configuration, ex. HTTP
  • Resource is the molecular object ex. AWS Security group, Subnet etc.
  • Resource module is the atomic asset or collection of connected resources ex. AWS VPC
  • Infrastructure module is the grouping object or collection of resource modules
  • Infrastructure composition is the combining object or collection of infrastructure modules
As already mentioned above, the configuration format is standardized but objects (ex. data sources and resources) are specific to the provider, so you need to refer to the provider reference page.
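The layering above can be sketched as follows. This is only an illustration: the ./modules/network path, its cidr_block input and its vpc_id and subnet_id outputs are hypothetical and would have to exist in your own module.

```hcl
# Infrastructure composition: the root configuration of one directory.

# Resource module: a reusable collection of connected resources
# (hypothetical local module that is assumed to create a VPC and a subnet).
module "network" {
  source     = "./modules/network"
  cidr_block = "10.0.0.0/16"
}

# Resource: a single provider-specific object that consumes a module output.
resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = "${module.network.vpc_id}"
}

# Output: exposes a value to the operator or to a parent composition.
output "subnet_id" {
  value = "${module.network.subnet_id}"
}
```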


Terraform setup
$ snap install terraform
$ terraform -v
Terraform v0.11.11
$ terraform -install-autocomplete
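
Since the examples in this post were written against v0.11, it can help to pin the CLI version inside the configuration itself. A minimal sketch, assuming you want to stay on the 0.11 series (the constraint value is my choice, not something the setup above prescribes):

```hcl
# Refuse to run with a Terraform CLI outside the 0.11.x series.
terraform {
  required_version = "~> 0.11.0"
}
```

With this block in place, a different CLI series should stop with a version error instead of trying to interpret the configuration.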

Terraform commands
  • Day 1
    • init: Download providers required by all .tf files of current path (1 directory = 1 composition)
    • plan: Define execution steps to reach the desired state (-out for persisting)
    • apply [dir-or-plan]: Provision infrastructure as per configuration directory or execution plan
  • Day n+
    • refresh: Update the state to match the real-world infrastructure
    • plan
    • apply
    • destroy: Decommission infrastructure as per the rollback plan
  • Optional
    • version: List currently used client + plugins and version
    • validate: Verify configuration syntax
    • fmt: Make configuration format pretty
    • graph: Generate a graph definition in Graphviz dot format
    • providers: List currently used providers and versions
    • state list: List resources tracked in the state
    • output: Extract output values from current state

Docker provider

In this example, we will provision a simple Docker container based on a given public image (nginx).
You need to have docker installed on the Terraform client/operator machine.

We'll start by creating a project directory with two configuration files: 1 for variables and 1 for objects.
$ mkdir tf-docker && cd tf-docker
$ cat << EOF > variables.tf
variable "nginx_version" {
  default = "1.7.8"
}
EOF

$ cat << 'EOF' > main.tf
# provider with remote API enabled
provider "docker" {
  host = "http://127.0.0.1:2375"
}

# image resource
resource "docker_image" "nginx" {
  name         = "nginx:${var.nginx_version}"
  keep_locally = false
}

# container resource
resource "docker_container" "nginx" {
  image = "${docker_image.nginx.latest}"
  name  = "tutorial"
  ports {
    internal = 80
    external = 8000
  }
}

EOF
As we can see, not only variables (ex. var.nginx_version) but also resource attributes (ex. docker_image.nginx.latest) can be re-used as part of the configuration. Note that the second one does not refer to the "latest" image tag but to the ID of the image actually pulled, so the container is deployed in the version defined in variables.tf.
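The same reference syntax also works in output blocks. As a hedged sketch, a third file (the name outputs.tf is only a convention I am assuming here) could expose values of the two resources defined in main.tf:

```hcl
# outputs.tf (hypothetical file name), relies on the resources from main.tf.

# ID of the pulled image, the same attribute the container resource uses.
output "nginx_image_id" {
  value = "${docker_image.nginx.latest}"
}

# Name of the running container ("tutorial" in this example).
output "nginx_container_name" {
  value = "${docker_container.nginx.name}"
}
```

After an apply, terraform output nginx_container_name would then print the container name.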
$ terraform init
$ terraform -v
Terraform v0.11.11
+ provider.docker
For the beauty of it, let us generate and preview the graph for our configuration.
$ terraform graph > main.dot
$ dotty main.dot  



Now it is time for execution, which will generate a tfstate file, an image pull and a container instance.
$ terraform plan
$ terraform apply -auto-approve
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

$ ls *.tfstate
terraform.tfstate
$ docker images | grep nginx
nginx               1.7.8               a343d51dff65        5 years ago         91.7MB
$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
94ae6b9827a5        8cf1bfb43ff5        "/docker-entrypoint.…"   2 seconds ago       Up 2 seconds        0.0.0.0:8000->80/tcp   tutorial 
Finally, let us clean and check our environment.
$ terraform destroy -auto-approve
Apply complete! Resources: 0 added, 0 changed, 1 destroyed.

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
$ docker images | grep nginx 

AWS provider

In this example, we will provision a VM instance on AWS (EC2) based on a given image (AMI).
You need to have awscli installed on the Terraform client/operator machine, as well as valid access and credentials to an AWS account.
$ mkdir tf-aws && cd tf-aws
$ cat << 'EOF' > main.tf
provider "aws" {
  profile = "default"
  region  = "us-east-1"
}
resource "aws_instance" "example" {
  ami           = "ami-2757f631"
  instance_type = "t2.micro"
}
output "instance_ip_addr" {
  value       = "${aws_instance.example.*.public_ip}"
  description = "Public IP address of the AWS EC2 instance."
}
EOF
As compared to the previous example, we added an output value that can be very handy for later operational monitoring.
$ terraform init
$ terraform -v
Terraform v0.11.11
+ provider.aws
$ terraform plan
$ terraform apply -auto-approve
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

instance_ip_addr = [
    35.171.188.144
]

$ aws ec2 describe-instances --filters "Name=image-id,Values=ami-2757f631" --query "Reservations[].Instances[].InstanceId"
[
    "i-0ab668f067e787408"
]
$ terraform output instance_ip_addr 
35.171.188.144
$ terraform destroy -auto-approve
Apply complete! Resources: 0 added, 0 changed, 1 destroyed.

$ aws ec2 describe-instances --filters "Name=image-id,Values=ami-2757f631" --query "Reservations[].Instances[].State" 
[
    {
        "Code": 48,
        "Name": "terminated"
    }
]
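
The AMI ID above is hard-coded and only valid in one region. A possible variation, sketched here under assumptions, looks the image up through a data source instead. The owner ID and name filter follow the pattern commonly shown for Canonical's Ubuntu images; treat them as placeholders to adapt, and note that the result will generally be a different image than ami-2757f631.

```hcl
provider "aws" {
  profile = "default"
  region  = "us-east-1"
}

# Data source: read-only lookup of the most recent matching image.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # assumed: Canonical's account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
  }
}

# Same instance as before, but the AMI now comes from the lookup.
resource "aws_instance" "example" {
  ami           = "${data.aws_ami.ubuntu.id}"
  instance_type = "t2.micro"
}
```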

Kubernetes provider

In this example, we will create a Kubernetes deployment resource and a pod operating an nginx Docker container.
The Terraform client/operator machine may run inside or outside the cluster. You need to have access to a Kubernetes cluster, and kubectl installed.
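The configuration below does not declare a provider block, so the Kubernetes provider falls back to its defaults. If you prefer to be explicit, a provider block along these lines could be added to main.tf; the context name is hypothetical, and whether a kubeconfig path is picked up by default depends on the provider version.

```hcl
# Explicit connection settings for a client running outside the cluster.
provider "kubernetes" {
  config_path    = "~/.kube/config"
  config_context = "my-cluster" # hypothetical kubeconfig context name
}
```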
$ mkdir tf-k8s && cd tf-k8s
$ cat << EOF > main.tf
resource "kubernetes_deployment" "example" {
  metadata {
    name = "terraform-example"
    labels = {
      test = "MyExampleApp"
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        test = "MyExampleApp"
      }
    }

    template {
      metadata {
        labels = {
          test = "MyExampleApp"
        }
      }

      spec {
        container {
          image = "nginx:1.7.8"
          name  = "example"
        }
      }
    }
  }
}
EOF

$ terraform init
$ terraform -v
Terraform v0.11.11
+ provider.kubernetes
$ terraform plan
$ terraform apply -auto-approve
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

$ kubectl get pods
NAME                                 READY   STATUS    RESTARTS   AGE
terraform-example-7877b6cf45-nl2q5   1/1     Running   0          26s
$ terraform destroy -auto-approve
Apply complete! Resources: 0 added, 0 changed, 1 destroyed.

$ kubectl get pods
NAME                                 READY   STATUS    RESTARTS   AGE

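The deployment alone is not reachable from outside the cluster. As a possible next step, a service resource could select the pods by the same label. This is a sketch only: the service name and the NodePort type are my assumptions, and it needs to be applied together with the deployment (i.e. before the destroy step above).

```hcl
resource "kubernetes_service" "example" {
  metadata {
    name = "terraform-example-svc" # hypothetical name
  }

  spec {
    # Must match the pod labels set in the deployment template.
    selector = {
      test = "MyExampleApp"
    }

    port {
      port        = 80 # port exposed by the service
      target_port = 80 # port nginx listens on inside the container
    }

    type = "NodePort"
  }
}
```
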
Take-away

Terraform is...
With 100+ infrastructure and platform providers supported, Terraform probably offers one of the largest sets of configuration options for modern provisioning. Descriptive InfraCode can be centralized for version control (i.e. shared repository) and for pipeline automation (i.e. operator pattern). The core client builds on a pluggable architecture in order to translate high-level configurations into low-level native commands. Platform providers may offer their own plugin, allowing DevOps and customers to deploy to different Clouds and on-premise infrastructures.

Terraform is not...
Terraform does not check or notify about any architecture design mistake, duplicated development, wrong sizing, operational issue etc. Although the configuration structure is standardized, individual features and parameters unfortunately do not allow for provider-agnostic configuration (ex. for multi-provider or migration scenarios). Also, it is important to note that Terraform plugins often require a native client (ex. awscli) to be set up and maintained by you on the operator machine, and that they might introduce a configuration bottleneck with regard to actual API capabilities and updates.

Where to go from here...
It is worth checking what other InfraCode languages and platforms like Red Hat Ansible, AWS CloudFormation templates, AWS CDK, Azure Resource Manager and GCP Deployment Manager are all about, because I believe that there are pros and cons to each of them. This evaluation was limited to basic examples for the purpose of learning but could extend to a real-world use-case based on a larger infrastructure in the future, involving more concepts left aside until now, for example data sources.

