Terraform: multiple environments
Why Terraform?
When you start a new project, there is a high probability that it will live in the cloud, and that brings a few challenges. One of them is that the project grows over time: new modules, services, and databases are added, and after a while it becomes hard to visualize the whole project from a higher perspective and to know why a particular component exists. It is also standard nowadays to have multiple environments for the same application; the days when we deployed our code straight to production are gone. This is where Terraform helps. It expresses the infrastructure as code and allows deploying the same infrastructure multiple times to various environments.
Application setup
Terraform is just a tool, and there are many ways to use it. In our example, let’s imagine we have to deploy our application to two environments, DEV and PROD, and we use GCP as the cloud provider.
We want to end up with the same stack in both environments. Our application consists of:
- Cloud Run
- Cloud Storage bucket
We have the same components in both environments, but their properties differ. In our example, the GCS bucket in the DEV environment has a short retention policy, because we don’t have to keep data as long as in production. The Cloud Run difference is that PROD always has at least one instance available to handle traffic.
DEV and PROD are two separate GCP projects.
Ways to handle multiple environments
- Git branches
- Folders for each environment
- Terraform variable files
- Terraform workspaces
Git branches
We can create as many branches in Git as we want. In that example, we have two environments and two branches.
> git branch
* dev
prod
On the dev branch, we create the Terraform files corresponding to the DEV environment; on the prod branch, we create the Terraform files corresponding to the PROD environment. Simple as that. This is probably the easiest way to handle multiple environments, but the solution has many disadvantages.
As an example, let’s check how the infrastructure looks in each branch.
- DEV branch:
resource "google_storage_bucket" "example-bucket" {
  name     = "example-bucket"
  location = "US"
  project  = "terraform-example-dev"

  lifecycle_rule {
    condition {
      age = 1
    }
    action {
      type = "Delete"
    }
  }
}

resource "google_cloud_run_service" "default" {
  name     = "cloudrun-service"
  location = "us-central1"
  project  = "terraform-example-dev"

  template {
    metadata {
      annotations = {
        "autoscaling.knative.dev/minScale" = 0
      }
    }
    spec {
      containers {
        image = "us-docker.pkg.dev/cloudrun/container/hello"
      }
    }
  }
}
- PROD branch:
resource "google_storage_bucket" "example-bucket" {
  name     = "example-bucket"
  location = "US"
  project  = "terraform-example-prod"

  lifecycle_rule {
    condition {
      age = 30
    }
    action {
      type = "Delete"
    }
  }
}

resource "google_cloud_run_service" "default" {
  name     = "cloudrun-service"
  location = "us-central1"
  project  = "terraform-example-prod"

  template {
    metadata {
      annotations = {
        "autoscaling.knative.dev/minScale" = 1
      }
    }
    spec {
      containers {
        image = "us-docker.pkg.dev/cloudrun/container/hello"
      }
    }
  }
}
To apply the configuration, we run
terraform apply
on each branch.
PROS:
- Everyone knows how to use Git – low entry barrier.
CONS:
- As you can see, there is considerable code duplication. We have to copy everything from one branch to another and, on top of that, change some field values. This is error-prone; it’s very easy to make a mistake.
- We don’t have a high-level overview of the differences between the two environments.
- It’s not easy to scale – if we want to add an additional environment, we must go file by file and check which properties we should change.
- If we want to add a new component as a developer, we have to create multiple pull requests – one for each environment.
Folders for each environment
Instead of using Git branches, we can use folders to structure our components. We achieve almost the same result, but we also get an overall view of everything. This is what it would look like:
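A possible layout (folder and file names are illustrative) could be:

```
.
├── dev
│   ├── bucket.tf
│   ├── cloud-run.tf
│   └── main.tf
└── prod
    ├── bucket.tf
    ├── cloud-run.tf
    └── main.tf
```

Each environment folder holds a full copy of the configuration, and we run terraform apply separately inside each folder.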
PROS:
- We have a good overview of the entire infrastructure
- It’s easy to add a new component which affects only the environment we want
- We can create one pull request that affects multiple environments
CONS:
- We still have vast code duplication.
- It’s not easy to scale
Terraform variable files
Terraform gives us a way to externalize differences by using variables. This is what the files look like in the repository.
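A possible repository layout for this approach (names illustrative, matching the files and commands shown below):

```
.
├── bucket.tf
├── cloud-run.tf
├── main.tf
├── variables.tf
└── environments
    ├── backend-dev.tfvars
    ├── backend-prod.tfvars
    ├── dev.tfvars
    └── prod.tfvars
```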
Each component has its own file, and we also have variables.tf and *.tfvars files. The content of the variables.tf file is below; it contains the variable definitions.
variable "project_id" {
  type    = string
  default = ""
}

variable "retention_policy_in_days" {
  type        = number
  description = "How long should we keep files before deletion"
  default     = 1
}

variable "min_instance_count" {
  type        = number
  description = "Minimum number of instances which are always available"
  default     = 0
}
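If we want to guard against bad values, Terraform also supports validation blocks on variables (since Terraform 0.13). An optional sketch, not part of the original setup:

```
variable "retention_policy_in_days" {
  type        = number
  description = "How long should we keep files before deletion"
  default     = 1

  # Hypothetical guard: reject zero or negative retention values at plan time
  validation {
    condition     = var.retention_policy_in_days > 0
    error_message = "Retention must be at least one day."
  }
}
```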
Content of the dev.tfvars file; here are the values for the definitions above:
project_id               = "terraform-example-dev"
retention_policy_in_days = 1
min_instance_count       = 0
Content of the prod.tfvars file:
project_id               = "terraform-example-prod"
retention_policy_in_days = 30
min_instance_count       = 1
Content of the bucket.tf file:
resource "google_storage_bucket" "example-bucket" {
  name     = "example-bucket"
  location = "US"
  project  = var.project_id

  lifecycle_rule {
    condition {
      age = var.retention_policy_in_days
    }
    action {
      type = "Delete"
    }
  }
}
Content of the cloud-run.tf file:
resource "google_cloud_run_service" "default" {
  name     = "cloudrun-service"
  location = "us-central1"
  project  = var.project_id

  template {
    metadata {
      annotations = {
        "autoscaling.knative.dev/minScale" = var.min_instance_count
      }
    }
    spec {
      containers {
        image = "us-docker.pkg.dev/cloudrun/container/hello"
      }
    }
  }
}
The Terraform state in this example is stored in a GCS bucket. Each project will have its own state bucket. We accomplish that with the following configuration:
# main.tf file
terraform {
  backend "gcs" {
  }

  required_providers {
    google = {
      version = "~> 4.65.2"
    }
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
}

# backend-dev.tfvars file
bucket = "terraform-example-dev-bucket"
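The PROD counterpart follows the same pattern; a sketch, assuming the prod state bucket is named analogously:

```
# backend-prod.tfvars file (bucket name is an assumption)
bucket = "terraform-example-prod-bucket"
```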
To apply the configuration, we run the following commands:
# DEV environment
terraform init -backend-config="environments/backend-dev.tfvars"
terraform apply -var-file="environments/dev.tfvars"

# PROD environment
terraform init -backend-config="environments/backend-prod.tfvars"
terraform apply -var-file="environments/prod.tfvars"
As you can see, this approach is much cleaner than the previous one. We have only one file per component. Each resource property that differs between environments is extracted to a *.tfvars file. We can also isolate backends with the -backend-config switch and the appropriate backend-*.tfvars file.
PROS:
- It’s scalable – we can easily add a component to each environment
- No code duplication
CONS:
- You have to learn a little more about Terraform and the usage of variable files
Terraform workspaces
Terraform has to know which resources already exist and which are new. In other words, Terraform has to manage the state of the resources. To do that, it stores data in a backend. The backend can be a local folder or a remote location. In our case, we use a GCS bucket to store the Terraform state. In the previous example, we had two backend buckets, each in its own project.
What if we have only one GCP project but still want multiple environments? We can deploy the DEV and PROD Cloud Run services in the same project, and to do it efficiently, Terraform workspaces come to the rescue. Workspaces let us create multiple copies of the same resources. By default, resources are created in the default workspace.
This is what our bucket is going to look like:
resource "google_storage_bucket" "example-bucket" {
  name     = "example-bucket-${terraform.workspace}"
  location = "US"
  project  = var.project_id

  lifecycle_rule {
    condition {
      age = var.retention_policy_in_days
    }
    action {
      type = "Delete"
    }
  }
}
In this scenario, we have variable files as well. In addition, we use the built-in terraform.workspace variable, so in the end we get two buckets, example-bucket-dev and example-bucket-prod, in the same project.
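As a side note, the per-environment values could also be derived directly from terraform.workspace with a lookup map instead of separate *.tfvars files. A sketch, under the assumption that workspace names match the environment names:

```
# Illustrative alternative (not the approach used above):
# derive settings from the current workspace name
locals {
  settings = {
    dev  = { retention_days = 1, min_instances = 0 }
    prod = { retention_days = 30, min_instances = 1 }
  }

  # Fails at plan time if the workspace has no entry in the map
  env = local.settings[terraform.workspace]
}

# Usage in resources: local.env.retention_days, local.env.min_instances
```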
How do we apply our changes?
First, we create new workspaces for the dev and prod environments:
# DEV
terraform workspace new dev

# PROD
terraform workspace new prod
We can list all our workspaces with the command:
terraform workspace list
Or we can display current workspace:
terraform workspace show
To switch between workspaces we use the following command:
terraform workspace select <workspace_name>
So when we apply our changes we issue these commands:
# DEV
terraform workspace select dev
terraform apply -var-file="environments/dev.tfvars"

# PROD
terraform workspace select prod
terraform apply -var-file="environments/prod.tfvars"
PROS:
- No code duplication
- It’s scalable
- Native Terraform feature
CONS:
- One backend for all environments – there is no isolation between them
- Some extra knowledge of Terraform is required
Conclusion
Which approach to use depends on the use case and what we want to achieve. In our case, we have multiple environments, each in its own project, so the Terraform variable files approach was the most convenient for us. We want the same components in each environment and to avoid code duplication as much as possible. Here is a short table with the differences between the four approaches:
| | Git branches | Folders for each environment | Terraform variable files | Terraform workspaces |
|---|---|---|---|---|
| Code duplication | X | X | | |
| Error prone | X | X | | |
| No extra learning required | X | X | | |
| Easy to add new component on one environment only | X | X | | |
| Scalable – easy to add new component on all environments | | | X | X |
| Scalable – easy to add new environment | | | X | X |
| Different backend (state) locations for each environment | X | X | X | |