Learning Path & Prerequisites
This comprehensive guide is designed to take you from a complete beginner to an expert in HashiCorp Terraform. Each chapter builds upon the previous ones, so we recommend following the sequence for optimal learning.
Prerequisites
- Basic understanding of cloud computing concepts
- Familiarity with command-line interfaces
- Basic knowledge of any programming language (helpful but not required)
- AWS/Azure/GCP account for hands-on practice
Learning Roadmap
graph LR
A[Beginner] --> B[Core Concepts]
B --> C[Intermediate]
C --> D[Advanced]
D --> E[Expert]
A --> F[Chapters 1-4]
B --> G[Chapters 5-8]
C --> H[Chapters 9-12]
D --> I[Chapters 13-16]
E --> J[Chapters 17-19]
Table of Contents
Part I: Foundation (Beginner)
- Introduction to Infrastructure as Code
- Getting Started with Terraform
- Terraform Core Concepts
- Configuration Language (HCL)
Part II: Core Skills (Intermediate)
Part III: Advanced Implementation (Advanced)
- Advanced Terraform Concepts
- Testing and Validation
- Best Practices and Patterns
- Security and Compliance
Part IV: Production & Operations (Expert)
Part V: Real-World Applications
1. Introduction to Infrastructure as Code
What is Infrastructure as Code (IaC)?
Infrastructure as Code (IaC) is the practice of managing and provisioning computing infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.
Think of it this way: instead of manually clicking through cloud provider consoles to create servers, databases, and networks, you write code that describes what you want, and the tool creates it for you.
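As a taste of what that looks like, here is a minimal sketch of such a definition file (Terraform HCL; the AMI ID and tag values are illustrative):

```hcl
# Declare the server you want; the tool works out the API calls.
resource "aws_instance" "app" {
  ami           = "ami-0c02fb55956c7d316" # illustrative AMI ID
  instance_type = "t3.micro"

  tags = {
    Name = "app-server"
  }
}
```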
graph TB
A[Traditional Infrastructure] --> B[Manual Configuration]
A --> C[GUI-based Management]
A --> D[Scripts and Tools]
A --> E[Documentation Issues]
A --> F[Configuration Drift]
G[Infrastructure as Code] --> H[Declarative Configuration]
G --> I[Version Control]
G --> J[Automation]
G --> K[Reproducibility]
G --> L[Consistency]
G --> M[Self-Documenting]
H --> N[Terraform]
H --> O[CloudFormation]
H --> P[ARM Templates]
H --> Q[Pulumi]
The Problems IaC Solves
Manual Infrastructure Problems:
- Human Error: Clicking wrong buttons, missing configurations
- Inconsistency: Different environments configured differently
- Lack of Documentation: “Who changed what and when?”
- Slow Deployment: Hours or days to provision resources
- No Version Control: Can’t track changes or rollback
- Scalability Issues: Manual processes don’t scale
IaC Solutions:
- Consistency: Same code creates identical environments
- Speed: Deploy complex infrastructure in minutes
- Version Control: Track every change with Git
- Collaboration: Team can review and approve changes
- Rollback Capability: Easily revert to previous versions
- Documentation: Code is the documentation
Why Terraform?
HashiCorp Terraform is an open-source IaC tool that allows you to define and provision infrastructure using a declarative configuration language.
Key Benefits:
- Multi-cloud support: Works with AWS, Azure, GCP, and thousands of other providers
- Declarative syntax: Describe what you want, not how to get there
- State management: Tracks infrastructure changes and current state
- Plan and apply: Preview changes before applying them
- Modularity: Create reusable infrastructure components
- Large ecosystem: Thousands of providers and modules available
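The multi-cloud point deserves a concrete sketch: a single configuration can declare resources in several clouds side by side. The provider sources below are real registry addresses; the bucket names and GCP project ID are illustrative:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

provider "google" {
  project = "my-gcp-project" # illustrative
  region  = "us-central1"
}

# Equivalent object storage in two clouds, managed in one plan/apply cycle
resource "aws_s3_bucket" "assets" {
  bucket = "my-app-assets-aws"
}

resource "google_storage_bucket" "assets" {
  name     = "my-app-assets-gcp"
  location = "US"
}
```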
graph LR
A[Write Code] --> B[terraform plan]
B --> C[Preview Changes]
C --> D[Review & Approve]
D --> E[terraform apply]
E --> F[Infrastructure Created]
F --> G[State Updated]
H[terraform destroy] --> I[Infrastructure Removed]
G --> H
Terraform vs. Other IaC Tools
graph TB
A[IaC Tools Comparison] --> B[Terraform]
A --> C[CloudFormation]
A --> D[Azure ARM]
A --> E[Pulumi]
A --> F[Ansible]
B --> B1[Multi-cloud]
B --> B2[HCL Language]
B --> B3[Large Community]
B --> B4[State Management]
C --> C1[AWS Only]
C --> C2[JSON/YAML]
C --> C3[AWS Native]
D --> D1[Azure Only]
D --> D2[JSON]
E --> E1[Multi-cloud]
E --> E2[Real Languages]
E --> E3[Newer Tool]
F --> F1[Configuration Mgmt]
F --> F2[YAML Playbooks]
Real-World Example: Manual vs. Terraform
Manual Process (Traditional):
- Log into AWS Console
- Navigate to EC2 → Launch Instance
- Choose AMI, instance type, configure network
- Create security group with rules
- Launch instance
- Navigate to RDS → Create Database
- Configure database settings, security groups
- Set up monitoring, backups manually
- Document everything (often forgotten)
- Repeat for staging/prod (with variations)
Terraform Process:
# One file describes everything
resource "aws_instance" "web" {
ami = "ami-0c02fb55956c7d316"
instance_type = "t3.micro"
vpc_security_group_ids = [aws_security_group.web.id]
tags = {
Name = "web-server"
Environment = var.environment
}
}
resource "aws_db_instance" "database" {
engine = "mysql"
db_name = "myapp"
# ... configuration
}
Then run:
terraform plan # Preview changes
terraform apply # Create infrastructure
Result: Identical infrastructure every time, in any environment!
2. Getting Started with Terraform
Installation
Windows Installation
# Using Chocolatey (Recommended)
choco install terraform
# Using Scoop
scoop install terraform
# Using Winget
winget install Hashicorp.Terraform
# Manual installation
# 1. Download from https://www.terraform.io/downloads.html
# 2. Extract to a folder (e.g., C:\terraform)
# 3. Add to PATH environment variable
macOS Installation
# Using Homebrew (Recommended)
brew install terraform
# Using tfenv for version management
brew install tfenv
tfenv install latest
tfenv use latest
Linux Installation
# Ubuntu/Debian
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
# CentOS/RHEL/Fedora
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
sudo yum -y install terraform
Verification
terraform version
# Output: Terraform v1.5.x
Setting Up Your Development Environment
Essential Tools
# Install additional helpful tools
# VS Code with Terraform extension
code --install-extension hashicorp.terraform
# Pre-commit hooks for code quality
pip install pre-commit
pre-commit install
# TFLint for additional validation
curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash
Project Structure (Best Practice)
my-terraform-project/
├── main.tf # Main configuration
├── variables.tf # Input variables
├── outputs.tf # Output values
├── versions.tf # Provider versions
├── terraform.tfvars # Variable values (don't commit secrets)
├── .gitignore # Git ignore file
├── README.md # Project documentation
└── modules/ # Custom modules
├── vpc/
├── security-groups/
└── ec2/
Your First Terraform Configuration
Let’s create a simple AWS S3 bucket to understand the basics:
Step 1: Create Configuration Files
# versions.tf - Always specify provider versions
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
# main.tf - Main infrastructure definition
provider "aws" {
region = var.aws_region
default_tags {
tags = {
Environment = var.environment
Project = var.project_name
ManagedBy = "Terraform"
}
}
}
resource "aws_s3_bucket" "example" {
bucket = "${var.project_name}-${var.environment}-${random_id.bucket_suffix.hex}"
}
resource "aws_s3_bucket_versioning" "example" {
bucket = aws_s3_bucket.example.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "example" {
bucket = aws_s3_bucket.example.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
resource "random_id" "bucket_suffix" {
byte_length = 4
}
# variables.tf - Input variables
variable "aws_region" {
description = "AWS region for resources"
type = string
default = "us-west-2"
}
variable "environment" {
description = "Environment name"
type = string
default = "dev"
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
variable "project_name" {
description = "Name of the project"
type = string
default = "my-first-terraform-project"
}
# outputs.tf - Output values
output "bucket_name" {
description = "Name of the S3 bucket"
value = aws_s3_bucket.example.id
}
output "bucket_arn" {
description = "ARN of the S3 bucket"
value = aws_s3_bucket.example.arn
}
output "bucket_region" {
description = "Region of the S3 bucket"
value = aws_s3_bucket.example.region
}
Step 2: Configure AWS Credentials
# Option 1: AWS CLI (Recommended)
aws configure
# Enter your AWS Access Key ID, Secret Access Key, and region
# Option 2: Environment Variables
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-west-2"
# Option 3: IAM Roles (for EC2 instances)
# Attach IAM role to your EC2 instance
Terraform Workflow
flowchart TD
A[Write Configuration] --> B[terraform init]
B --> C[terraform validate]
C --> D[terraform plan]
D --> E{Review Plan}
E -->|Approve| F[terraform apply]
E -->|Reject| G[Modify Configuration]
G --> D
F --> H[Infrastructure Created]
H --> I[Test & Verify]
I --> J[terraform destroy]
J --> K[Infrastructure Removed]
L[terraform fmt] --> A
M[terraform state] --> H
Step-by-Step Execution
# Step 1: Initialize the working directory
terraform init
# Downloads provider plugins and sets up backend
# Step 2: Format code (optional but recommended)
terraform fmt
# Formats your configuration files
# Step 3: Validate configuration
terraform validate
# Checks for syntax errors and validates configuration
# Step 4: Create execution plan
terraform plan
# Shows what will be created, modified, or destroyed
# Step 5: Apply changes
terraform apply
# Creates the actual infrastructure
# Type 'yes' when prompted
# Step 6: Verify outputs
terraform output
# Shows the output values
# Step 7: Destroy when done (optional)
terraform destroy
# Removes all created infrastructure
# Type 'yes' when prompted
Understanding the Output
When you run terraform plan, you’ll see output like this:
Terraform will perform the following actions:
# aws_s3_bucket.example will be created
+ resource "aws_s3_bucket" "example" {
+ bucket = "my-first-terraform-project-dev-12345678"
+ id = (known after apply)
+ region = (known after apply)
# ... more attributes
}
Plan: 1 to add, 0 to change, 0 to destroy.
Legend:
- "+" = Resource will be created
- "~" = Resource will be modified in place
- "-" = Resource will be destroyed
- "(known after apply)" = Value will be determined during apply
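For comparison, a later change to the same bucket would be shown as an in-place update. A hypothetical example:

```
# aws_s3_bucket.example will be updated in-place
~ resource "aws_s3_bucket" "example" {
    id   = "my-first-terraform-project-dev-12345678"
  ~ tags = {
      + "Team" = "DevOps"
    }
  }

Plan: 0 to add, 1 to change, 0 to destroy.
```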
Basic Commands Reference
# Essential commands
terraform init # Initialize working directory
terraform validate # Validate configuration syntax
terraform plan # Create execution plan
terraform apply # Apply changes
terraform destroy # Destroy infrastructure
# Formatting and maintenance
terraform fmt # Format configuration files
terraform fmt -check # Check if files are formatted
# State management
terraform show # Show current state
terraform state list # List resources in state
terraform refresh # Update state from real infrastructure (deprecated; prefer terraform plan -refresh-only)
# Getting help
terraform --help # General help
terraform plan --help # Help for specific command
Common Beginner Mistakes to Avoid
- Not using version constraints – Always specify provider versions
- Hardcoding values – Use variables instead
- Ignoring state files – Never manually edit state files
- No backup strategy – Use remote state with locking
- Not reviewing plans – Always review before applying
- Committing secrets – Use .gitignore and secret management
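A typical Terraform .gitignore (a common baseline; adapt to your repository's conventions, since some teams do commit non-sensitive .tfvars files) looks like:

```
# Local state and backups -- may contain secrets in plain text
*.tfstate
*.tfstate.*

# Provider plugins and module cache
.terraform/

# Variable files that often hold credentials
*.tfvars
*.tfvars.json

# Crash logs and saved plan files
crash.log
*.tfplan
```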
3. Terraform Core Concepts
The Terraform Language
Terraform uses HashiCorp Configuration Language (HCL), which is designed to be human-readable and machine-friendly.
graph TB
A[Terraform Configuration] --> B[Resources]
A --> C[Data Sources]
A --> D[Variables]
A --> E[Outputs]
A --> F[Locals]
A --> G[Modules]
B --> H[aws_instance]
B --> I[aws_vpc]
B --> J[azurerm_vm]
C --> K[aws_ami]
C --> L[aws_availability_zones]
Configuration Blocks
# Terraform block
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
# Provider block
provider "aws" {
region = var.aws_region
}
# Resource block
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "main-vpc"
Environment = var.environment
}
}
# Data source block
data "aws_availability_zones" "available" {
state = "available"
}
# Local values
locals {
common_tags = {
Project = "terraform-guide"
Environment = var.environment
Owner = "DevOps Team"
}
}
Resource Dependencies
graph TD
A[VPC] --> B[Internet Gateway]
A --> C[Subnet]
C --> D[Route Table]
B --> D
D --> E[Security Group]
C --> F[EC2 Instance]
E --> F
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
}
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id # Implicit dependency
}
resource "aws_subnet" "public" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
availability_zone = data.aws_availability_zones.available.names[0]
}
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
depends_on = [aws_internet_gateway.main] # Explicit dependency
}
4. Configuration Language (HCL)
Data Types
# String
variable "instance_type" {
type = string
default = "t2.micro"
}
# Number
variable "instance_count" {
type = number
default = 2
}
# Boolean
variable "enable_monitoring" {
type = bool
default = true
}
# List
variable "availability_zones" {
type = list(string)
default = ["us-west-2a", "us-west-2b", "us-west-2c"]
}
# Map
variable "instance_tags" {
type = map(string)
default = {
Environment = "dev"
Project = "example"
}
}
# Object
variable "server_config" {
type = object({
instance_type = string
disk_size = number
monitoring = bool
})
default = {
instance_type = "t2.micro"
disk_size = 20
monitoring = true
}
}
Functions
locals {
# String functions
upper_env = upper(var.environment)
formatted_name = format("%s-%s", var.project, var.environment)
# Numeric functions
max_instances = max(2, 4, 6)
min_disk_size = min(20, 50, 100)
# Collection functions
zone_count = length(var.availability_zones)
first_zone = element(var.availability_zones, 0)
# Date/time functions
timestamp = timestamp()
# Encoding functions
encoded_data = base64encode("hello world")
# File functions
user_data = file("${path.module}/user-data.sh")
# IP network functions
subnet_cidrs = cidrsubnets("10.0.0.0/16", 8, 8, 8)
}
Conditional Expressions
resource "aws_instance" "example" {
ami = var.environment == "prod" ? var.prod_ami : var.dev_ami
instance_type = var.environment == "prod" ? "t3.large" : "t2.micro"
# Conditional block
dynamic "ebs_block_device" {
for_each = var.environment == "prod" ? [1] : []
content {
device_name = "/dev/sdf"
volume_size = 100
volume_type = "gp3"
}
}
}
Loops and Iteration
# for_each with map
resource "aws_instance" "servers" {
for_each = var.servers
ami = each.value.ami
instance_type = each.value.instance_type
tags = {
Name = each.key
}
}
# for_each with set
resource "aws_subnet" "private" {
for_each = toset(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, index(var.availability_zones, each.value))
availability_zone = each.value
}
# count
resource "aws_instance" "web" {
count = var.instance_count
ami = data.aws_ami.ubuntu.id
instance_type = "t2.micro"
tags = {
Name = "web-${count.index + 1}"
}
}
# For expressions
locals {
instance_ips = [for instance in aws_instance.web : instance.private_ip]
instance_map = {for i, instance in aws_instance.web : instance.tags.Name => instance.private_ip}
}
5. Providers and Resources
Understanding Providers
Providers are plugins that allow Terraform to interact with cloud providers, SaaS providers, and other APIs.
graph TB
A[Terraform Core] --> B[AWS Provider]
A --> C[Azure Provider]
A --> D[GCP Provider]
A --> E[Kubernetes Provider]
A --> F[GitHub Provider]
B --> G[EC2 Instances]
B --> H[S3 Buckets]
B --> I[RDS Databases]
C --> J[Virtual Machines]
C --> K[Storage Accounts]
C --> L[SQL Databases]
Provider Configuration
# AWS Provider
provider "aws" {
region = "us-west-2"
default_tags {
tags = {
Environment = var.environment
Project = var.project_name
}
}
}
# Multiple AWS Provider Instances
provider "aws" {
alias = "us-east-1"
region = "us-east-1"
}
provider "aws" {
alias = "eu-west-1"
region = "eu-west-1"
}
# Using aliased providers
resource "aws_s3_bucket" "us_bucket" {
provider = aws.us-east-1
bucket = "my-us-bucket"
}
resource "aws_s3_bucket" "eu_bucket" {
provider = aws.eu-west-1
bucket = "my-eu-bucket"
}
Common AWS Resources
# VPC and Networking
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "${var.project_name}-vpc"
}
}
resource "aws_subnet" "public" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = {
Name = "${var.project_name}-public-${count.index + 1}"
Type = "public"
}
}
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.project_name}-igw"
}
}
# Security Groups
resource "aws_security_group" "web" {
name_prefix = "${var.project_name}-web-"
vpc_id = aws_vpc.main.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [aws_vpc.main.cidr_block]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.project_name}-web-sg"
}
}
# EC2 Instance
resource "aws_instance" "web" {
count = var.instance_count
ami = data.aws_ami.ubuntu.id
instance_type = var.instance_type
subnet_id = aws_subnet.public[count.index % length(aws_subnet.public)].id
vpc_security_group_ids = [aws_security_group.web.id]
key_name = aws_key_pair.deployer.key_name
user_data = base64encode(templatefile("${path.module}/user-data.sh.tpl", {
db_host = aws_db_instance.main.endpoint
}))
root_block_device {
volume_type = "gp3"
volume_size = 20
encrypted = true
}
tags = {
Name = "${var.project_name}-web-${count.index + 1}"
}
}
# RDS Database
resource "aws_db_subnet_group" "main" {
name = "${var.project_name}-db-subnet-group"
subnet_ids = aws_subnet.private[*].id
tags = {
Name = "${var.project_name}-db-subnet-group"
}
}
resource "aws_db_instance" "main" {
identifier = "${var.project_name}-database"
engine = "mysql"
engine_version = "8.0"
instance_class = "db.t3.micro"
allocated_storage = 20
max_allocated_storage = 100
storage_type = "gp2"
storage_encrypted = true
db_name = var.db_name
username = var.db_username
password = var.db_password
vpc_security_group_ids = [aws_security_group.db.id]
db_subnet_group_name = aws_db_subnet_group.main.name
backup_retention_period = 7
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
skip_final_snapshot = true
tags = {
Name = "${var.project_name}-database"
}
}
Data Sources
# Get latest Ubuntu AMI
data "aws_ami" "ubuntu" {
most_recent = true
owners = ["099720109477"] # Canonical
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
# Get availability zones
data "aws_availability_zones" "available" {
state = "available"
}
# Get current AWS region
data "aws_region" "current" {}
# Get current AWS account ID
data "aws_caller_identity" "current" {}
# Use data sources
locals {
account_id = data.aws_caller_identity.current.account_id
region = data.aws_region.current.name
azs = data.aws_availability_zones.available.names
}
6. State Management
Understanding Terraform State
Terraform state is a file that maps your configuration to real-world resources and tracks metadata.
graph TB
A[Terraform Configuration] --> B[terraform plan]
B --> C[Compare with State]
C --> D[Calculate Changes]
D --> E[terraform apply]
E --> F[Update Infrastructure]
F --> G[Update State File]
H[terraform.tfstate] --> C
G --> H
Local State
By default, Terraform stores state locally in a file called terraform.tfstate.
# terraform.tfstate (example structure)
{
"version": 4,
"terraform_version": "1.5.0",
"serial": 1,
"lineage": "abc123-def456-ghi789",
"outputs": {},
"resources": [
{
"mode": "managed",
"type": "aws_instance",
"name": "example",
"provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
"instances": [
{
"schema_version": 1,
"attributes": {
"id": "i-1234567890abcdef0",
"ami": "ami-0c02fb55956c7d316",
"instance_type": "t2.micro"
}
}
]
}
]
}
Remote State
For team collaboration, store state remotely:
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "terraform/state"
region = "us-west-2"
dynamodb_table = "terraform-state-lock"
encrypt = true
}
}
State Backends Comparison
graph TB
A[State Backends] --> B[Local]
A --> C[S3]
A --> D[Azure Blob]
A --> E[GCS]
A --> F[Terraform Cloud]
C --> G[State Locking with DynamoDB]
C --> H[Encryption]
C --> I[Versioning]
F --> J[Team Collaboration]
F --> K[Policy Enforcement]
F --> L[Cost Estimation]
State Management Commands
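Most state changes are made with the terraform state commands below, but one frequent operation, renaming a resource, can also be recorded declaratively: since Terraform 1.1, a moved block makes terraform apply update the state address without touching real infrastructure (addresses are illustrative):

```hcl
# Equivalent to: terraform state mv aws_instance.old aws_instance.new
moved {
  from = aws_instance.old
  to   = aws_instance.new
}
```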
# List resources in state
terraform state list
# Show specific resource
terraform state show aws_instance.example
# Remove resource from state (doesn't destroy)
terraform state rm aws_instance.example
# Move resource in state
terraform state mv aws_instance.old aws_instance.new
# Import existing resource
terraform import aws_instance.example i-1234567890abcdef0
# Refresh state
terraform refresh
# Show current state
terraform show
# Pull remote state
terraform state pull
# Push local state to remote
terraform state push terraform.tfstate
State File Security
# Backend configuration with encryption
terraform {
backend "s3" {
bucket = "secure-terraform-state"
key = "prod/terraform.tfstate"
region = "us-west-2"
dynamodb_table = "terraform-locks"
encrypt = true
kms_key_id = "arn:aws:kms:us-west-2:123456789012:key/12345678-1234-1234-1234-123456789012"
}
}
Workspace Management
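Workspaces give one configuration several independent state files. Inside the configuration, the active workspace is available as terraform.workspace, which is handy for per-environment sizing (a sketch; the resource, counts, and the data "aws_ami" "ubuntu" source assumed here are illustrative):

```hcl
locals {
  # Map the "default" workspace to dev; otherwise use the workspace name
  env = terraform.workspace == "default" ? "dev" : terraform.workspace
}

resource "aws_instance" "web" {
  count         = terraform.workspace == "production" ? 3 : 1
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  tags = {
    Name        = "web-${local.env}-${count.index + 1}"
    Environment = local.env
  }
}
```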
# List workspaces
terraform workspace list
# Create new workspace
terraform workspace new development
# Switch workspace
terraform workspace select production
# Show current workspace
terraform workspace show
# Delete workspace
terraform workspace delete development
7. Variables and Outputs
Input Variables
# variables.tf
variable "aws_region" {
description = "AWS region for resources"
type = string
default = "us-west-2"
}
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t2.micro"
validation {
condition = contains([
"t2.micro", "t2.small", "t2.medium",
"t3.micro", "t3.small", "t3.medium"
], var.instance_type)
error_message = "Instance type must be a valid t2 or t3 type."
}
}
variable "environment" {
description = "Environment name"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
variable "enable_monitoring" {
description = "Enable detailed monitoring"
type = bool
default = false
}
variable "availability_zones" {
description = "List of availability zones"
type = list(string)
default = ["us-west-2a", "us-west-2b"]
}
variable "tags" {
description = "Resource tags"
type = map(string)
default = {}
}
variable "server_config" {
description = "Server configuration"
type = object({
instance_type = string
disk_size = number
monitoring = bool
})
default = {
instance_type = "t2.micro"
disk_size = 20
monitoring = false
}
}
# Sensitive variables
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
Variable Definition Files
# terraform.tfvars
aws_region = "us-east-1"
instance_type = "t3.small"
environment = "prod"
tags = {
Project = "web-app"
Owner = "DevOps Team"
}
server_config = {
instance_type = "t3.medium"
disk_size = 50
monitoring = true
}
# dev.tfvars
environment = "dev"
instance_type = "t2.micro"
server_config = {
instance_type = "t2.micro"
disk_size = 20
monitoring = false
}
Environment-Specific Variables
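Besides .tfvars files, Terraform reads any environment variable named TF_VAR_&lt;name&gt; as the value of variable &lt;name&gt;. A small shell sketch (the terraform invocation itself is commented out; variable names are illustrative):

```shell
# Terraform maps TF_VAR_environment -> var.environment, and so on
export TF_VAR_environment="staging"
export TF_VAR_instance_type="t3.large"

# terraform plan   # would now pick up both values without -var flags
echo "environment=$TF_VAR_environment instance_type=$TF_VAR_instance_type"
```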
# Apply with specific variable file
terraform apply -var-file="dev.tfvars"
terraform apply -var-file="prod.tfvars"
# Override individual variables
terraform apply -var="instance_type=t3.large" -var="environment=staging"
Local Values
locals {
# Common tags for all resources
common_tags = {
Environment = var.environment
Project = "web-application"
ManagedBy = "Terraform"
CreatedAt = timestamp()
}
# Resource naming
name_prefix = "${var.environment}-${var.project_name}"
# Computed values
is_production = var.environment == "prod"
instance_count = local.is_production ? 3 : 1
# Network configuration
vpc_cidr = var.environment == "prod" ? "10.0.0.0/16" : "10.1.0.0/16"
# Subnet CIDRs
public_subnet_cidrs = [
for i in range(length(var.availability_zones)) :
cidrsubnet(local.vpc_cidr, 8, i)
]
private_subnet_cidrs = [
for i in range(length(var.availability_zones)) :
cidrsubnet(local.vpc_cidr, 8, i + 10)
]
}
Output Values
# outputs.tf
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.main.id
}
output "public_subnet_ids" {
description = "IDs of the public subnets"
value = aws_subnet.public[*].id
}
output "instance_ids" {
description = "IDs of the EC2 instances"
value = aws_instance.web[*].id
}
output "instance_public_ips" {
description = "Public IP addresses of instances"
value = aws_instance.web[*].public_ip
}
output "load_balancer_dns" {
description = "DNS name of the load balancer"
value = aws_lb.main.dns_name
}
output "database_endpoint" {
description = "RDS instance endpoint"
value = aws_db_instance.main.endpoint
sensitive = true
}
# Conditional outputs
output "database_port" {
description = "Database port"
value = var.create_database ? aws_db_instance.main[0].port : null
}
# Complex outputs
output "instance_details" {
description = "Detailed information about instances"
value = {
for instance in aws_instance.web :
instance.tags.Name => {
id = instance.id
public_ip = instance.public_ip
az = instance.availability_zone
}
}
}
Using Outputs
# Show all outputs
terraform output
# Show specific output
terraform output vpc_id
# Get output in JSON format
terraform output -json
# Use output in scripts
VPC_ID=$(terraform output -raw vpc_id)
echo "VPC ID: $VPC_ID"
Variable Hierarchy
graph TB
A[-var and -var-file flags] --> B[*.auto.tfvars files]
B --> C[terraform.tfvars.json]
C --> D[terraform.tfvars]
D --> E[TF_VAR_ environment variables]
E --> F[Variable defaults]
G[Highest Priority] --> A
F --> H[Lowest Priority]
8. Modules
What are Modules?
Modules are containers for multiple resources that are used together. They help organize and reuse Terraform configurations.
graph TB
A[Root Module] --> B[VPC Module]
A --> C[EC2 Module]
A --> D[RDS Module]
B --> E[VPC Resource]
B --> F[Subnet Resources]
B --> G[Gateway Resources]
C --> H[Instance Resources]
C --> I[Security Group Resources]
D --> J[RDS Instance]
D --> K[Subnet Group]
Creating a VPC Module
# modules/vpc/variables.tf
variable "name" {
description = "Name prefix for resources"
type = string
}
variable "cidr_block" {
description = "CIDR block for VPC"
type = string
default = "10.0.0.0/16"
}
variable "availability_zones" {
description = "List of availability zones"
type = list(string)
}
variable "public_subnet_cidrs" {
description = "CIDR blocks for public subnets"
type = list(string)
}
variable "private_subnet_cidrs" {
description = "CIDR blocks for private subnets"
type = list(string)
}
variable "enable_nat_gateway" {
description = "Enable NAT Gateway for private subnets"
type = bool
default = true
}
variable "tags" {
description = "Tags to apply to resources"
type = map(string)
default = {}
}
# modules/vpc/main.tf
# VPC
resource "aws_vpc" "main" {
cidr_block = var.cidr_block
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(var.tags, {
Name = "${var.name}-vpc"
})
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-igw"
})
}
# Public Subnets
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.main.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = merge(var.tags, {
Name = "${var.name}-public-${count.index + 1}"
Type = "public"
})
}
# Private Subnets
resource "aws_subnet" "private" {
count = length(var.private_subnet_cidrs)
vpc_id = aws_vpc.main.id
cidr_block = var.private_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
tags = merge(var.tags, {
Name = "${var.name}-private-${count.index + 1}"
Type = "private"
})
}
# Elastic IPs for NAT Gateways
resource "aws_eip" "nat" {
count = var.enable_nat_gateway ? length(var.public_subnet_cidrs) : 0
domain = "vpc"
tags = merge(var.tags, {
Name = "${var.name}-nat-eip-${count.index + 1}"
})
depends_on = [aws_internet_gateway.main]
}
# NAT Gateways
resource "aws_nat_gateway" "main" {
count = var.enable_nat_gateway ? length(var.public_subnet_cidrs) : 0
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
tags = merge(var.tags, {
Name = "${var.name}-nat-${count.index + 1}"
})
depends_on = [aws_internet_gateway.main]
}
# Route Table for Public Subnets
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = merge(var.tags, {
Name = "${var.name}-public-rt"
})
}
# Route Table Associations for Public Subnets
resource "aws_route_table_association" "public" {
count = length(aws_subnet.public)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
# Route Tables for Private Subnets
resource "aws_route_table" "private" {
count = var.enable_nat_gateway ? length(var.private_subnet_cidrs) : 1
vpc_id = aws_vpc.main.id
dynamic "route" {
for_each = var.enable_nat_gateway ? [1] : []
content {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[min(count.index, length(aws_nat_gateway.main) - 1)].id
}
}
tags = merge(var.tags, {
Name = "${var.name}-private-rt-${count.index + 1}"
})
}
# Route Table Associations for Private Subnets
resource "aws_route_table_association" "private" {
count = length(aws_subnet.private)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[var.enable_nat_gateway ? count.index : 0].id
}
# modules/vpc/outputs.tf
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.main.id
}
output "vpc_cidr_block" {
description = "CIDR block of the VPC"
value = aws_vpc.main.cidr_block
}
output "public_subnet_ids" {
description = "IDs of the public subnets"
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
description = "IDs of the private subnets"
value = aws_subnet.private[*].id
}
output "internet_gateway_id" {
description = "ID of the Internet Gateway"
value = aws_internet_gateway.main.id
}
output "nat_gateway_ids" {
description = "IDs of the NAT Gateways"
value = aws_nat_gateway.main[*].id
}
output "public_route_table_id" {
description = "ID of the public route table"
value = aws_route_table.public.id
}
output "private_route_table_ids" {
description = "IDs of the private route tables"
value = aws_route_table.private[*].id
}
Using the VPC Module
# main.tf
module "vpc" {
source = "./modules/vpc"
name = "my-application"
cidr_block = "10.0.0.0/16"
availability_zones = ["us-west-2a", "us-west-2b", "us-west-2c"]
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
private_subnet_cidrs = ["10.0.10.0/24", "10.0.20.0/24", "10.0.30.0/24"]
enable_nat_gateway = true
tags = {
Environment = "production"
Project = "web-app"
}
}
# Use module outputs
resource "aws_security_group" "web" {
name_prefix = "web-"
vpc_id = module.vpc.vpc_id
# ... security group rules
}
Module Sources
# Local module
module "vpc" {
source = "./modules/vpc"
}
# Git repository
module "vpc" {
source = "git::https://github.com/your-org/terraform-vpc-module.git"
}
# Git repository with tag
module "vpc" {
source = "git::https://github.com/your-org/terraform-vpc-module.git?ref=v1.2.0"
}
# Terraform Registry
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 3.0"
}
# HTTP URL
module "vpc" {
source = "https://example.com/modules/vpc.zip"
}HCLModule Composition
# modules/web-app/main.tf
module "vpc" {
source = "../vpc"
name = var.name
cidr_block = var.vpc_cidr
availability_zones = var.availability_zones
public_subnet_cidrs = var.public_subnet_cidrs
private_subnet_cidrs = var.private_subnet_cidrs
tags = var.tags
}
module "security_groups" {
source = "../security-groups"
name = var.name
vpc_id = module.vpc.vpc_id
tags = var.tags
}
module "load_balancer" {
source = "../load-balancer"
name = var.name
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.public_subnet_ids
security_groups = [module.security_groups.alb_security_group_id]
tags = var.tags
}
module "auto_scaling" {
source = "../auto-scaling"
name = var.name
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnet_ids
security_groups = [module.security_groups.web_security_group_id]
target_group_arn = module.load_balancer.target_group_arn
instance_type = var.instance_type
min_size = var.min_instances
max_size = var.max_instances
desired_size = var.desired_instances
tags = var.tags
}HCLModule Best Practices
graph TB
A[Module Best Practices] --> B[Single Responsibility]
A --> C[Semantic Versioning]
A --> D[Input Validation]
A --> E[Comprehensive Outputs]
A --> F[Documentation]
A --> G[Testing]
B --> H[One Purpose per Module]
C --> I[v1.0.0, v1.1.0, v2.0.0]
D --> J[Variable Validation Rules]
E --> K[All Useful Values]
F --> L[README.md]
G --> M[Terraform Validate/Plan]
9. Advanced Terraform Concepts
Dynamic Blocks
# Dynamic ingress rules for security group
variable "ingress_rules" {
type = list(object({
from_port = number
to_port = number
protocol = string
cidr_blocks = list(string)
description = string
}))
default = [
{
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
description = "HTTP traffic"
},
{
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
description = "HTTPS traffic"
}
]
}
resource "aws_security_group" "web" {
name_prefix = "web-"
vpc_id = var.vpc_id
dynamic "ingress" {
for_each = var.ingress_rules
content {
from_port = ingress.value.from_port
to_port = ingress.value.to_port
protocol = ingress.value.protocol
cidr_blocks = ingress.value.cidr_blocks
description = ingress.value.description
}
}
# Dynamic tags
dynamic "tag" {
for_each = var.additional_tags
content {
key = tag.key
value = tag.value
}
}
}
Terraform Functions Deep Dive
locals {
# String manipulation
environment_upper = upper(var.environment)
formatted_name = format("%s-%s-%03d", var.project, var.environment, var.instance_number)
joined_azs = join(",", var.availability_zones)
# Collection functions
subnet_count = length(var.subnet_cidrs)
first_az = element(var.availability_zones, 0)
unique_azs = distinct(var.availability_zones)
# Map/object functions
merged_tags = merge(var.default_tags, var.additional_tags)
tag_keys = keys(var.tags)
tag_values = values(var.tags)
# Conditional logic
instance_type = var.environment == "prod" ? "t3.large" : "t2.micro"
# Network functions
vpc_cidr = "10.0.0.0/16"
subnet_cidrs = [
for i in range(3) : cidrsubnet(local.vpc_cidr, 8, i)
]
# File functions
user_data_script = file("${path.module}/scripts/user-data.sh")
config_template = templatefile("${path.module}/templates/config.tpl", {
database_host = var.database_host
api_key = var.api_key
})
# Date/time functions
current_time = timestamp()
expiry_date = timeadd(timestamp(), "8760h") # 1 year from now
# Encoding functions
encoded_data = base64encode(jsonencode(var.configuration))
decoded_data = jsondecode(base64decode(var.encoded_config))
# Type conversion
port_string = tostring(var.port)
zone_set = toset(var.availability_zones)
# Complex transformations
instance_map = {
for idx, instance in var.instances :
instance.name => {
type = instance.type
az = var.availability_zones[idx % length(var.availability_zones)]
}
}
# Conditional collections
production_instances = [
for instance in var.instances :
instance if instance.environment == "prod"
]
}
Terraform Expressions
# Conditional expressions
resource "aws_instance" "web" {
count = var.create_instance ? 1 : 0
ami = var.environment == "prod" ? var.prod_ami : var.dev_ami
instance_type = var.high_performance ? "c5.xlarge" : "t3.micro"
# Nested conditionals
monitoring = var.environment == "prod" ? true : (var.environment == "staging" ? true : false)
# Complex conditional
user_data = var.environment == "prod" ? (
var.enable_monitoring ?
base64encode(templatefile("${path.module}/user-data-prod-monitored.sh", {})) :
base64encode(templatefile("${path.module}/user-data-prod.sh", {}))
) : base64encode(templatefile("${path.module}/user-data-dev.sh", {}))
}
# Splat expressions
locals {
instance_ids = aws_instance.web[*].id
instance_azs = aws_instance.web[*].availability_zone
# Splat over another resource (aws_subnet exposes no route_table_id attribute)
private_subnet_arns = aws_subnet.private[*].arn
# Conditional splat
public_ips = var.assign_public_ip ? aws_instance.web[*].public_ip : []
}
# For expressions
locals {
# List comprehension
subnet_cidrs = [
for i in range(var.subnet_count) :
cidrsubnet(var.vpc_cidr, 8, i)
]
# Map comprehension
instance_tags = {
for idx, instance in aws_instance.web :
instance.id => {
Name = "web-${idx + 1}"
AZ = instance.availability_zone
}
}
# Filtering
production_subnets = [
for subnet in aws_subnet.all :
subnet.id if subnet.tags.Environment == "prod"
]
# Conditional mapping
instance_configs = {
for name, config in var.instances :
name => merge(config, {
instance_type = config.environment == "prod" ? "t3.large" : "t2.micro"
})
}
}
Resource Lifecycle Management
resource "aws_instance" "web" {
ami = var.ami_id
instance_type = var.instance_type
lifecycle {
# Prevent accidental deletion
prevent_destroy = true
# Create new resource before destroying old one
create_before_destroy = true
# Ignore changes to specific attributes
ignore_changes = [
ami,
user_data,
tags["LastModified"]
]
# Replace resource when certain attributes change
replace_triggered_by = [
aws_launch_template.web.latest_version
]
}
tags = {
Name = "web-server"
LastModified = timestamp()
}
}
# Null resource for custom provisioning
resource "null_resource" "app_deployment" {
# Triggers recreate when any instance changes
triggers = {
instance_ids = join(",", aws_instance.web[*].id)
app_version = var.app_version
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = "ubuntu"
private_key = file(var.private_key_path)
host = aws_instance.web[0].public_ip
}
inline = [
"sudo apt-get update",
"sudo apt-get install -y docker.io",
"sudo docker pull myapp:${var.app_version}",
"sudo docker run -d -p 80:80 myapp:${var.app_version}"
]
}
lifecycle {
create_before_destroy = true
}
}
Error Handling and Validation
variable "instance_type" {
description = "EC2 instance type"
type = string
validation {
condition = can(regex("^[tm][2-5]\\.", var.instance_type))
error_message = "Instance type must be a valid t2, t3, t4, t5, m2, m3, m4, or m5 type."
}
}
variable "environment" {
description = "Environment name"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be one of: dev, staging, prod."
}
}
variable "vpc_cidr" {
description = "CIDR block for VPC"
type = string
validation {
condition = can(cidrhost(var.vpc_cidr, 0))
error_message = "VPC CIDR must be a valid IPv4 CIDR block."
}
}
variable "tags" {
description = "Resource tags"
type = map(string)
validation {
condition = alltrue([
for tag_key in keys(var.tags) : can(regex("^[A-Za-z][A-Za-z0-9_-]*$", tag_key))
])
error_message = "All tag keys must start with a letter and contain only letters, numbers, underscores, and hyphens."
}
}
# Precondition and postcondition
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = var.instance_type
subnet_id = var.subnet_id
lifecycle {
precondition {
condition = data.aws_ami.ubuntu.architecture == "x86_64"
error_message = "The selected AMI must be for the x86_64 architecture."
}
postcondition {
condition = self.public_ip != null
error_message = "Instance must have a public IP address."
}
}
}
Custom Providers and Resources
terraform {
required_providers {
custom = {
source = "example.com/custom/provider"
version = "~> 1.0"
}
github = {
source = "integrations/github"
version = "~> 5.0"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.0"
}
}
}
# Custom provider configuration
provider "custom" {
api_endpoint = var.custom_api_endpoint
api_key = var.custom_api_key
}
# GitHub provider for repository management
provider "github" {
token = var.github_token
owner = var.github_organization
}
resource "github_repository" "app" {
name = "${var.project_name}-app"
description = "Application repository"
private = true
template {
owner = var.github_organization
repository = "app-template"
}
}
# Kubernetes provider for application deployment
provider "kubernetes" {
config_path = "~/.kube/config"
}
resource "kubernetes_deployment" "app" {
metadata {
name = var.app_name
namespace = var.namespace
}
spec {
replicas = var.replica_count
selector {
match_labels = {
app = var.app_name
}
}
template {
metadata {
labels = {
app = var.app_name
}
}
spec {
container {
image = "${var.container_registry}/${var.app_name}:${var.app_version}"
name = var.app_name
port {
container_port = var.container_port
}
}
}
}
}
}
10. Testing and Validation
Why Test Infrastructure Code?
Testing infrastructure code is crucial for maintaining reliability, preventing costly mistakes, and ensuring compliance. Unlike a bad application deploy, which can usually be rolled back, a destroyed database or misconfigured network can be expensive and hard to recover from.
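Before reaching for external frameworks, note that Terraform 1.6+ ships a native test framework: `.tftest.hcl` files with `run` and `assert` blocks. A minimal sketch, assuming the configuration under test declares an `aws_vpc.main` resource and an `environment` variable:

```hcl
# tests/plan.tftest.hcl (requires Terraform >= 1.6)
variables {
  environment = "test"
}

run "vpc_cidr_is_expected" {
  # "plan" evaluates assertions without creating real infrastructure
  command = plan

  assert {
    condition     = aws_vpc.main.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR must be 10.0.0.0/16"
  }
}
```

Running `terraform test` discovers these files automatically; `command = apply` runs the same assertions against real resources.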
graph TB
A[Infrastructure Testing] --> B[Static Analysis]
A --> C[Unit Testing]
A --> D[Integration Testing]
A --> E[End-to-End Testing]
A --> F[Policy Testing]
B --> G[terraform validate]
B --> H[TFLint]
B --> I[Checkov]
C --> J[Terratest]
C --> K[Kitchen-Terraform]
D --> L[Real Environment Tests]
D --> M[Service Integration]
E --> N[Full Stack Validation]
E --> O[User Journey Testing]
F --> P[Open Policy Agent]
F --> Q[Sentinel]
F --> R[CloudFormation Guard]
Static Analysis and Linting
TFLint Configuration
# .tflint.hcl
plugin "terraform" {
enabled = true
preset = "recommended"
}
plugin "aws" {
enabled = true
version = "0.21.0"
source = "github.com/terraform-linters/tflint-ruleset-aws"
}
rule "terraform_deprecated_interpolation" {
enabled = true
}
rule "terraform_unused_declarations" {
enabled = true
}
rule "terraform_comment_syntax" {
enabled = true
}
rule "terraform_documented_outputs" {
enabled = true
}
rule "terraform_documented_variables" {
enabled = true
}
rule "terraform_typed_variables" {
enabled = true
}
rule "terraform_module_pinned_source" {
enabled = true
}
rule "terraform_naming_convention" {
enabled = true
format = "snake_case"
}
rule "terraform_standard_module_structure" {
enabled = true
}
Security Scanning with Checkov
# Install Checkov
pip install checkov
# Scan Terraform files
checkov -f main.tf
checkov -d ./terraform/
# Generate reports
checkov -d . --framework terraform --output json --output-file checkov-report.json
checkov -d . --framework terraform --output sarif --output-file checkov-report.sarif
# .checkov.yml
framework:
- terraform
- secrets
directory:
- ./
check:
- CKV_AWS_8 # Ensure S3 bucket has server side encryption enabled
- CKV_AWS_18 # Ensure S3 bucket has access logging configured
skip_check:
- CKV_AWS_144 # Skip specific checks if needed
output: cli
Unit Testing with Terratest
Go-based Testing Framework
// test/terraform_test.go
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/gruntwork-io/terratest/modules/aws"
"github.com/stretchr/testify/assert"
)
func TestTerraformExample(t *testing.T) {
t.Parallel()
// Configure Terraform options
terraformOptions := &terraform.Options{
TerraformDir: "../examples/complete",
Vars: map[string]interface{}{
"environment": "test",
"instance_type": "t3.micro",
},
EnvVars: map[string]string{
"AWS_DEFAULT_REGION": "us-west-2",
},
}
// Clean up resources with "terraform destroy" at the end of the test
defer terraform.Destroy(t, terraformOptions)
// Run "terraform init" and "terraform apply"
terraform.InitAndApply(t, terraformOptions)
// Validate the results
instanceId := terraform.Output(t, terraformOptions, "instance_id")
vpcId := terraform.Output(t, terraformOptions, "vpc_id")
assert.NotEmpty(t, instanceId)
// Verify the VPC exists and matches the module output
vpc := aws.GetVpcById(t, vpcId, "us-west-2")
assert.Equal(t, vpcId, vpc.Id)
}
func TestTerraformModuleVPC(t *testing.T) {
t.Parallel()
terraformOptions := &terraform.Options{
TerraformDir: "../modules/vpc",
Vars: map[string]interface{}{
"name": "test-vpc",
"cidr_block": "10.0.0.0/16",
"availability_zones": []string{"us-west-2a", "us-west-2b"},
"public_subnet_cidrs": []string{"10.0.1.0/24", "10.0.2.0/24"},
"private_subnet_cidrs": []string{"10.0.11.0/24", "10.0.12.0/24"},
},
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
// Test VPC creation via the module outputs
vpcId := terraform.Output(t, terraformOptions, "vpc_id")
assert.NotEmpty(t, vpcId)
vpcCidr := terraform.Output(t, terraformOptions, "vpc_cidr_block")
assert.Equal(t, "10.0.0.0/16", vpcCidr)
// Test subnet creation
publicSubnetIds := terraform.OutputList(t, terraformOptions, "public_subnet_ids")
assert.Len(t, publicSubnetIds, 2)
privateSubnetIds := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
assert.Len(t, privateSubnetIds, 2)
}
Running Tests
# Install Go and dependencies
go mod init terraform-tests
go get github.com/gruntwork-io/terratest/modules/terraform
go get github.com/stretchr/testify/assert
# Run all tests
go test -v ./test/
# Run specific test
go test -v ./test/ -run TestTerraformExample
# Run tests in parallel
go test -v ./test/ -parallel 10
# Run tests with timeout
go test -v ./test/ -timeout 30m
Integration Testing
Kitchen-Terraform Setup
# .kitchen.yml
---
driver:
name: terraform
variable_files:
- testing.tfvars
provisioner:
name: terraform
verifier:
name: terraform
systems:
- name: local
backend: local
platforms:
- name: terraform
suites:
- name: default
driver:
root_module_directory: test/fixtures/complete
verifier:
color: false
systems:
- name: local
backend: local
controls:
- operating_system
- reachable_other_host
# test/integration/default/controls/default.rb
describe aws_vpc('vpc-12345678') do
it { should exist }
it { should be_available }
its('cidr_block') { should eq '10.0.0.0/16' }
its('dhcp_options_id') { should_not be_empty }
end
describe aws_security_group(group_id: 'sg-12345678') do
it { should exist }
it { should allow_in(port: 80, ipv4_range: '0.0.0.0/0') }
it { should allow_in(port: 443, ipv4_range: '0.0.0.0/0') }
end
describe aws_ec2_instance('i-12345678') do
it { should exist }
it { should be_running }
its('instance_type') { should eq 't3.micro' }
its('image_id') { should eq 'ami-0c02fb55956c7d316' }
end
Policy as Code Testing
Open Policy Agent (OPA) with Rego
# policies/security.rego
package terraform.security
import future.keywords.in
import input as tfplan
# Deny if S3 buckets don't have encryption
deny[msg] {
resource := tfplan.resource_changes[_]
resource.type == "aws_s3_bucket"
not has_encryption(resource)
msg := sprintf("S3 bucket '%s' must have encryption enabled", [resource.address])
}
# Check if S3 bucket has encryption configuration
has_encryption(resource) {
encryption := resource.change.after.server_side_encryption_configuration[_]
encryption.rule[_].apply_server_side_encryption_by_default.sse_algorithm
}
# Deny if security groups allow unrestricted access
deny[msg] {
resource := tfplan.resource_changes[_]
resource.type == "aws_security_group"
rule := resource.change.after.ingress[_]
rule.cidr_blocks[_] == "0.0.0.0/0"
rule.from_port == 22
msg := sprintf("Security group '%s' allows SSH access from anywhere", [resource.address])
}
# Ensure all resources are tagged
deny[msg] {
resource := tfplan.resource_changes[_]
resource.type in ["aws_instance", "aws_s3_bucket", "aws_rds_instance"]
not resource.change.after.tags.Environment
msg := sprintf("Resource '%s' must have Environment tag", [resource.address])
}
Testing OPA Policies
# Install OPA
curl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64
chmod +x opa
# Generate plan in JSON format
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
# Test policy
opa eval -d policies/ -i plan.json "data.terraform.security.deny[x]"
# Test with formatted output
opa test policies/ -v
Automated Testing Pipeline
GitHub Actions Testing Workflow
# .github/workflows/terraform-test.yml
name: Terraform Tests
on:
pull_request:
branches: [ main ]
push:
branches: [ main ]
jobs:
static-analysis:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.5.0
- name: Terraform Format Check
run: terraform fmt -check -recursive
- name: Terraform Init
run: terraform init -backend=false
- name: Terraform Validate
run: terraform validate
- name: Run TFLint
uses: terraform-linters/setup-tflint@v3
with:
tflint_version: latest
- run: tflint --init
- run: tflint -f compact
- name: Run Checkov
uses: bridgecrewio/checkov-action@master
with:
directory: .
framework: terraform
output_format: sarif
output_file_path: checkov-report.sarif
- name: Upload Checkov results
uses: github/codeql-action/upload-sarif@v2
if: always()
with:
sarif_file: checkov-report.sarif
unit-tests:
runs-on: ubuntu-latest
needs: static-analysis
steps:
- uses: actions/checkout@v4
- name: Setup Go
uses: actions/setup-go@v4
with:
go-version: 1.19
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.5.0
terraform_wrapper: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-west-2
- name: Download Go modules
run: go mod download
- name: Run Terratest
run: |
cd test
go test -v -timeout 30m
policy-tests:
runs-on: ubuntu-latest
needs: static-analysis
steps:
- uses: actions/checkout@v4
- name: Setup OPA
uses: open-policy-agent/setup-opa@v2
with:
version: latest
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.5.0
- name: Generate Terraform plan
run: |
terraform init -backend=false
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
- name: Test OPA policies
run: |
opa fmt --diff policies/
opa test policies/ -v
opa eval -d policies/ -i plan.json "data.terraform.security.deny[x]"
Test Data Management
Using Test Fixtures
# test/fixtures/complete/main.tf
module "test" {
source = "../../../"
# Test-specific variables
name = "terratest-${random_id.test.hex}"
environment = "test"
# Override defaults for testing
instance_type = var.instance_type
enable_monitoring = false
tags = {
TestId = random_id.test.hex
Purpose = "automated-testing"
}
}
resource "random_id" "test" {
byte_length = 4
}
# test/fixtures/complete/variables.tf
variable "instance_type" {
description = "EC2 instance type for testing"
type = string
default = "t3.micro"
}
Test Data Cleanup
// test/helpers/cleanup.go
package helpers
import (
"context"
"testing"
"time"
"github.com/aws/aws-sdk-go-v2/aws"
"github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/service/ec2"
"github.com/aws/aws-sdk-go-v2/service/ec2/types"
)
func CleanupTestResources(t *testing.T, region string, testId string) {
cfg, err := config.LoadDefaultConfig(context.TODO(),
config.WithRegion(region),
)
if err != nil {
t.Fatalf("Failed to load AWS config: %v", err)
}
ec2Client := ec2.NewFromConfig(cfg)
// Find and terminate test instances older than 2 hours
input := &ec2.DescribeInstancesInput{
Filters: []types.Filter{
{
Name: aws.String("tag:Purpose"),
Values: []string{"automated-testing"},
},
{
Name: aws.String("instance-state-name"),
Values: []string{"running", "stopped"},
},
},
}
result, err := ec2Client.DescribeInstances(context.TODO(), input)
if err != nil {
t.Logf("Failed to describe instances: %v", err)
return
}
for _, reservation := range result.Reservations {
for _, instance := range reservation.Instances {
// Check if instance is older than 2 hours
if time.Since(*instance.LaunchTime) > 2*time.Hour {
t.Logf("Terminating old test instance: %s", *instance.InstanceId)
// Terminate instance logic here
}
}
}
}
Testing Best Practices
Test Organization
test/
├── fixtures/
│ ├── minimal/ # Minimal configuration for basic tests
│ ├── complete/ # Full configuration for integration tests
│ └── custom/ # Custom test scenarios
├── integration/
│ ├── complete_test.go
│ └── minimal_test.go
├── unit/
│ ├── vpc_test.go
│ ├── security_groups_test.go
│ └── ec2_test.go
├── policies/
│ ├── security.rego
│ ├── compliance.rego
│ └── cost.rego
└── helpers/
├── cleanup.go
├── aws.go
└── terraform.go
Testing Guidelines
- Isolation: Each test should be independent and not affect others
- Cleanup: Always clean up resources after tests
- Parallel Execution: Use t.Parallel() for faster test execution
- Meaningful Names: Use descriptive test names
- Documentation: Document test scenarios and expected outcomes
- Environment Separation: Use separate AWS accounts/regions for testing
- Resource Limits: Implement quotas and limits for test resources
- Cost Control: Monitor and alert on test resource costs
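Several of these guidelines can be baked into the test fixture itself. A minimal sketch, assuming a hypothetical root module that accepts `name` and `tags` inputs (the fixture path and variable names are illustrative, not from the source):

```hcl
# test/fixtures/isolated/main.tf
# Unique names per run keep parallel tests from colliding; the tags give a
# scheduled cleanup job something to query when a test leaks resources.
resource "random_id" "suffix" {
  byte_length = 4
}

module "under_test" {
  source = "../../.."

  # Isolation: every run gets a distinct name
  name = "test-${random_id.suffix.hex}"

  tags = {
    Purpose = "automated-testing"
    # Cleanup/cost control: anything past this timestamp is fair game to delete
    ExpiresAt = timeadd(timestamp(), "2h")
  }
}
```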
11. Best Practices and Patterns
Project Structure
terraform-project/
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── outputs.tf
│ │ └── terraform.tfvars
│ ├── staging/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── outputs.tf
│ │ └── terraform.tfvars
│ └── prod/
│ ├── main.tf
│ ├── variables.tf
│ ├── outputs.tf
│ └── terraform.tfvars
├── modules/
│ ├── vpc/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── outputs.tf
│ │ └── README.md
│ ├── security-groups/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── outputs.tf
│ │ └── README.md
│ └── ec2/
│ ├── main.tf
│ ├── variables.tf
│ ├── outputs.tf
│ └── README.md
├── scripts/
│ ├── user-data.sh
│ └── deploy.sh
├── templates/
│ ├── user-data.sh.tpl
│ └── config.json.tpl
├── .gitignore
├── README.md
└── Makefile
Configuration Standards
# Standard file headers and organization
# File: modules/vpc/main.tf
# Description: VPC module for creating AWS VPC infrastructure
# Author: DevOps Team
# Version: 1.0.0
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
# Local values for reusability
locals {
# Naming conventions
name_prefix = "${var.environment}-${var.project_name}"
# Common tags applied to all resources
common_tags = merge(var.additional_tags, {
Environment = var.environment
Project = var.project_name
ManagedBy = "Terraform"
Module = "vpc"
CreatedAt = formatdate("YYYY-MM-DD", timestamp())
})
# Network calculations
availability_zone_count = length(var.availability_zones)
public_subnet_count = var.create_public_subnets ? local.availability_zone_count : 0
private_subnet_count = var.create_private_subnets ? local.availability_zone_count : 0
}
# Data sources at the top
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}
# Main VPC resource
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = var.enable_dns_hostnames
enable_dns_support = var.enable_dns_support
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-vpc"
})
}
Naming Conventions
locals {
# Resource naming patterns
naming_convention = {
# Format: {environment}-{project}-{resource_type}-{identifier}
vpc_name = "${var.environment}-${var.project_name}-vpc"
subnet_name = "${var.environment}-${var.project_name}-subnet"
sg_name = "${var.environment}-${var.project_name}-sg"
instance_name = "${var.environment}-${var.project_name}-instance"
# Alternative formats for different use cases
short_name = "${substr(var.environment, 0, 1)}${substr(var.project_name, 0, 8)}"
dns_friendly = replace("${var.environment}-${var.project_name}", "_", "-")
}
# Tag standardization
standard_tags = {
Environment = var.environment
Project = var.project_name
Owner = var.team_name
CostCenter = var.cost_center
ManagedBy = "Terraform"
BackupRequired = var.backup_required
MonitoringLevel = var.monitoring_level
}
}
# Resource naming examples
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
tags = merge(local.standard_tags, {
Name = local.naming_convention.vpc_name
Type = "network"
})
}
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.main.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
tags = merge(local.standard_tags, {
Name = "${local.naming_convention.subnet_name}-public-${count.index + 1}"
Type = "public"
Tier = "public"
})
}

Environment Configuration Pattern
# environments/dev/main.tf
module "infrastructure" {
source = "../../modules/infrastructure"
# Environment-specific values
environment = "dev"
# Resource sizing for development
instance_type = "t3.micro"
min_instances = 1
max_instances = 2
desired_instances = 1
# Database configuration
db_instance_class = "db.t3.micro"
db_storage_size = 20
# Feature flags
enable_monitoring = false
enable_backup = false
enable_multi_az = false
enable_encryption = false
# Network configuration
vpc_cidr = "10.1.0.0/16"
tags = {
Environment = "development"
CostCenter = "engineering"
AutoShutdown = "true"
}
}

# environments/prod/main.tf
module "infrastructure" {
source = "../../modules/infrastructure"
# Environment-specific values
environment = "prod"
# Resource sizing for production
instance_type = "t3.large"
min_instances = 3
max_instances = 10
desired_instances = 5
# Database configuration
db_instance_class = "db.r5.xlarge"
db_storage_size = 500
db_backup_retention = 30
db_maintenance_window = "sun:05:00-sun:06:00"
# Feature flags
enable_monitoring = true
enable_backup = true
enable_multi_az = true
enable_encryption = true
enable_deletion_protection = true
# Network configuration
vpc_cidr = "10.0.0.0/16"
tags = {
Environment = "production"
CostCenter = "operations"
Compliance = "required"
}
}HCLState Management Patterns
graph TB
A[State Management Strategies] --> B[Separate States]
A --> C[Workspaces]
A --> D[State Backends]
B --> E[Per Environment]
B --> F[Per Component]
B --> G[Per Team]
C --> H[Single Backend]
C --> I[Environment Isolation]
D --> J[S3 + DynamoDB]
D --> K[Terraform Cloud]
D --> L[Azure Blob]

Secure Configuration
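Remote state should be encrypted at rest and protected by locking. A minimal sketch of the S3 + DynamoDB strategy from the diagram above (bucket, key, and table names are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"             # placeholder bucket name
    key            = "prod/network/terraform.tfstate" # one key per environment/component
    region         = "us-west-2"
    encrypt        = true                             # server-side encryption at rest
    dynamodb_table = "terraform-locks"                # placeholder lock table
  }
}
```

The DynamoDB table needs a string partition key named LockID; with it in place, concurrent applies against the same state key are serialized automatically.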
# Secure variable handling
variable "database_password" {
description = "Database master password"
type = string
sensitive = true
validation {
condition = length(var.database_password) >= 8
error_message = "Database password must be at least 8 characters long."
}
}
variable "api_keys" {
description = "API keys for external services"
type = map(string)
sensitive = true
default = {}
}
# Secrets management
data "aws_secretsmanager_secret_version" "database_password" {
secret_id = "prod/database/password"
}
data "aws_secretsmanager_secret_version" "api_keys" {
secret_id = "prod/api/keys"
}
locals {
database_password = jsondecode(data.aws_secretsmanager_secret_version.database_password.secret_string)["password"]
api_keys = jsondecode(data.aws_secretsmanager_secret_version.api_keys.secret_string)
}
# Secure resource configuration
resource "aws_db_instance" "main" {
identifier = "${var.project_name}-${var.environment}"
engine = "mysql"
engine_version = "8.0.35"
instance_class = var.db_instance_class
allocated_storage = var.db_storage_size
max_allocated_storage = var.db_max_storage_size
storage_type = "gp3"
storage_encrypted = true
kms_key_id = aws_kms_key.database.arn
db_name = var.database_name
username = var.database_username
password = local.database_password
vpc_security_group_ids = [aws_security_group.database.id]
db_subnet_group_name = aws_db_subnet_group.main.name
backup_retention_period = var.backup_retention_days
backup_window = var.backup_window
maintenance_window = var.maintenance_window
auto_minor_version_upgrade = false
deletion_protection = var.enable_deletion_protection
skip_final_snapshot = false
final_snapshot_identifier = "${var.project_name}-${var.environment}-final-snapshot-${formatdate("YYYY-MM-DD-hhmm", timestamp())}"
enabled_cloudwatch_logs_exports = ["error", "general", "slowquery"]
tags = merge(local.common_tags, {
Name = "${var.project_name}-${var.environment}-database"
Backup = "required"
Monitoring = "enhanced"
})
}
# KMS key for encryption
resource "aws_kms_key" "database" {
description = "KMS key for ${var.project_name} database encryption"
deletion_window_in_days = var.environment == "prod" ? 30 : 7
tags = merge(local.common_tags, {
Name = "${var.project_name}-${var.environment}-db-key"
Purpose = "database-encryption"
})
}
resource "aws_kms_alias" "database" {
name = "alias/${var.project_name}-${var.environment}-db"
target_key_id = aws_kms_key.database.key_id
}

Code Quality and Linting
# .tflint.hcl
plugin "terraform" {
enabled = true
preset = "recommended"
}
plugin "aws" {
enabled = true
version = "0.21.0"
source = "github.com/terraform-linters/tflint-ruleset-aws"
}
rule "terraform_deprecated_interpolation" {
enabled = true
}
rule "terraform_unused_declarations" {
enabled = true
}
rule "terraform_comment_syntax" {
enabled = true
}
rule "terraform_documented_outputs" {
enabled = true
}
rule "terraform_documented_variables" {
enabled = true
}
rule "terraform_typed_variables" {
enabled = true
}
rule "terraform_module_pinned_source" {
enabled = true
}
rule "terraform_naming_convention" {
enabled = true
format = "snake_case"
}
rule "terraform_standard_module_structure" {
enabled = true
}

12. Security and Compliance
Security Best Practices
graph TB
A[Terraform Security] --> B[State Security]
A --> C[Credential Management]
A --> D[Resource Security]
A --> E[Network Security]
A --> F[Compliance]
B --> G[Encrypted Backend]
B --> H[Access Control]
B --> I[State Locking]
C --> J[Environment Variables]
C --> K[Secret Managers]
C --> L[IAM Roles]
D --> M[Encryption at Rest]
D --> N[Encryption in Transit]
D --> O[Access Policies]

Secrets Management
# Using AWS Secrets Manager
data "aws_secretsmanager_secret" "database_credentials" {
name = "${var.environment}/database/credentials"
}
data "aws_secretsmanager_secret_version" "database_credentials" {
secret_id = data.aws_secretsmanager_secret.database_credentials.id
}
locals {
db_creds = jsondecode(data.aws_secretsmanager_secret_version.database_credentials.secret_string)
}
# Using HashiCorp Vault
data "vault_generic_secret" "database_credentials" {
path = "secret/${var.environment}/database"
}
# Secure variable definitions
variable "sensitive_data" {
description = "Sensitive configuration data"
type = object({
api_key = string
secret_key = string
})
sensitive = true
}
# Environment-based secret retrieval
locals {
secret_name = "${var.environment}/app/secrets"
secrets = jsondecode(data.aws_secretsmanager_secret_version.app_secrets.secret_string)
}

IAM Security Patterns
# Least privilege IAM policy
data "aws_iam_policy_document" "ec2_assume_role" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
# Note: sts:ExternalId conditions apply to cross-account AssumeRole calls,
# not to service principals such as EC2, so no condition is added here.
}
}
resource "aws_iam_role" "ec2_role" {
name = "${var.project_name}-${var.environment}-ec2-role"
assume_role_policy = data.aws_iam_policy_document.ec2_assume_role.json
tags = local.common_tags
}
data "aws_iam_policy_document" "ec2_permissions" {
# S3 access for specific bucket only
statement {
sid = "S3Access"
effect = "Allow"
actions = [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
]
resources = [
"${aws_s3_bucket.app_bucket.arn}/*"
]
}
# CloudWatch logs
statement {
sid = "CloudWatchLogs"
effect = "Allow"
actions = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
]
resources = [
"arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:/aws/ec2/${var.project_name}/*"
]
}
# Systems Manager for patching
statement {
sid = "SystemsManager"
effect = "Allow"
actions = [
"ssm:UpdateInstanceInformation",
"ssm:SendCommand",
"ssm:GetCommandInvocation"
]
resources = ["*"]
condition {
test = "StringEquals"
variable = "ssm:ResourceTag/Project"
values = [var.project_name]
}
}
}
resource "aws_iam_policy" "ec2_policy" {
name = "${var.project_name}-${var.environment}-ec2-policy"
policy = data.aws_iam_policy_document.ec2_permissions.json
}
resource "aws_iam_role_policy_attachment" "ec2_policy_attachment" {
role = aws_iam_role.ec2_role.name
policy_arn = aws_iam_policy.ec2_policy.arn
}

Network Security
# Security groups with principle of least privilege
resource "aws_security_group" "web_tier" {
name_prefix = "${var.project_name}-web-"
vpc_id = var.vpc_id
description = "Security group for web tier"
# Inbound rules
ingress {
description = "HTTPS from ALB"
from_port = 443
to_port = 443
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
ingress {
description = "HTTP from ALB"
from_port = 80
to_port = 80
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
# Outbound rules - restrictive
egress {
description = "HTTPS to internet"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "MySQL to database"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_groups = [aws_security_group.database.id]
}
tags = merge(local.common_tags, {
Name = "${var.project_name}-web-sg"
Tier = "web"
})
}
# Database security group
resource "aws_security_group" "database" {
name_prefix = "${var.project_name}-db-"
vpc_id = var.vpc_id
description = "Security group for database tier"
ingress {
description = "MySQL from web tier"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_groups = [aws_security_group.web_tier.id]
}
# No outbound internet access for the database; only intra-group traffic
egress {
description = "Intra-group traffic only"
from_port = 0
to_port = 0
protocol = "-1"
self = true
}
tags = merge(local.common_tags, {
Name = "${var.project_name}-db-sg"
Tier = "database"
})
}
# Network ACLs for additional security
resource "aws_network_acl" "private" {
vpc_id = var.vpc_id
subnet_ids = var.private_subnet_ids
# Allow inbound from VPC
ingress {
rule_no = 100
protocol = "-1"
from_port = 0
to_port = 0
cidr_block = var.vpc_cidr
action = "allow"
}
# Allow outbound to VPC
egress {
rule_no = 100
protocol = "-1"
from_port = 0
to_port = 0
cidr_block = var.vpc_cidr
action = "allow"
}
# Allow HTTPS outbound for updates
egress {
rule_no = 200
protocol = "tcp"
from_port = 443
to_port = 443
cidr_block = "0.0.0.0/0"
action = "allow"
}
tags = merge(local.common_tags, {
Name = "${var.project_name}-private-nacl"
})
}

Compliance and Governance
# Resource tagging for compliance
locals {
compliance_tags = {
# Required tags for compliance
DataClassification = var.data_classification
Compliance = var.compliance_requirements
Owner = var.resource_owner
CostCenter = var.cost_center
Project = var.project_name
Environment = var.environment
# Operational tags
BackupRequired = var.backup_required
MonitoringLevel = var.monitoring_level
PatchGroup = var.patch_group
# Security tags
SecurityZone = var.security_zone
EncryptionRequired = "true"
# Audit tags
CreatedBy = "Terraform"
# Caution: timestamp() is re-evaluated on every run, so these two tags
# cause perpetual plan drift; consider time_static or CI-injected values.
CreatedDate = formatdate("YYYY-MM-DD", timestamp())
LastModified = formatdate("YYYY-MM-DD", timestamp())
}
}
# Compliance validation
variable "data_classification" {
description = "Data classification level"
type = string
validation {
condition = contains([
"public", "internal", "confidential", "restricted"
], var.data_classification)
error_message = "Data classification must be one of: public, internal, confidential, restricted."
}
}
# Resource compliance checks
resource "aws_s3_bucket" "secure_bucket" {
bucket = "${var.project_name}-${var.environment}-secure-${random_id.bucket_suffix.hex}"
tags = local.compliance_tags
lifecycle {
precondition {
condition = var.data_classification != "public"
error_message = "Secure buckets cannot be used for public data."
}
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "secure_bucket" {
bucket = aws_s3_bucket.secure_bucket.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.s3_key.arn
}
bucket_key_enabled = true
}
}
resource "aws_s3_bucket_public_access_block" "secure_bucket" {
bucket = aws_s3_bucket.secure_bucket.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
# CloudTrail for audit logging
resource "aws_cloudtrail" "audit_trail" {
name = "${var.project_name}-${var.environment}-audit-trail"
s3_bucket_name = aws_s3_bucket.audit_logs.bucket
include_global_service_events = true
is_multi_region_trail = true
enable_logging = true
event_selector {
read_write_type = "All"
include_management_events = true
data_resource {
type = "AWS::S3::Object"
values = ["${aws_s3_bucket.secure_bucket.arn}/*"]
}
}
tags = local.compliance_tags
}

14. Cost Optimization
Understanding Cloud Costs with Terraform
Cost optimization in Terraform involves designing infrastructure that meets performance requirements while minimizing expenses. This includes right-sizing resources, using cost-effective services, and implementing automated cost controls.
graph TB
A[Cost Optimization Strategy] --> B[Resource Right-Sizing]
A --> C[Reserved Instances]
A --> D[Spot Instances]
A --> E[Auto Scaling]
A --> F[Storage Optimization]
A --> G[Cost Monitoring]
A --> H[Lifecycle Management]
B --> I[CPU/Memory Analysis]
B --> J[Performance Monitoring]
C --> K[1-3 Year Commitments]
C --> L[Significant Savings]
D --> M[Up to 90% Savings]
D --> N[Fault-Tolerant Workloads]
E --> O[Scale Based on Demand]
E --> P[Automatic Optimization]
F --> Q[Storage Classes]
F --> R[Data Lifecycle]
G --> S[Cost Alerts]
G --> T[Budget Controls]
H --> U[Automated Cleanup]
H --> V[Resource Scheduling]

Cost-Aware Resource Configuration
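The cost alerts and budget controls from the diagram can themselves be managed as code. A hedged sketch using aws_budgets_budget; the limit, threshold, and e-mail address are illustrative:

```hcl
resource "aws_budgets_budget" "monthly" {
  name         = "${var.project_name}-${var.environment}-monthly"
  budget_type  = "COST"
  limit_amount = "1000" # illustrative monthly limit in USD
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80 # alert when forecast exceeds 80% of the budget
    threshold_type             = "PERCENTAGE"
    notification_type          = "FORECASTED"
    subscriber_email_addresses = ["ops@example.com"] # illustrative address
  }
}
```

Forecast-based notifications warn before the money is actually spent; add a second notification block with notification_type = "ACTUAL" for after-the-fact alerts.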
Right-Sizing EC2 Instances
# Cost-optimized instance selection
locals {
# Define cost-optimized instance types per environment
instance_types = {
dev = {
web = "t3.micro" # $0.0104/hour
app = "t3.small" # $0.0208/hour
db = "t3.micro" # For RDS
}
staging = {
web = "t3.small" # $0.0208/hour
app = "t3.medium" # $0.0416/hour
db = "t3.small" # For RDS
}
prod = {
web = "c5.large" # $0.085/hour - CPU optimized
app = "m5.xlarge" # $0.192/hour - Balanced
db = "r5.large" # Memory optimized for database
}
}
# Calculate optimal instance type based on requirements
web_instance_type = local.instance_types[var.environment].web
app_instance_type = local.instance_types[var.environment].app
}
# Cost-optimized Auto Scaling configuration
resource "aws_autoscaling_group" "web" {
name = "${var.project_name}-${var.environment}-web-asg"
target_group_arns = [aws_lb_target_group.web.arn]
health_check_type = "ELB"
# Cost optimization: Scale based on actual demand
min_size = var.environment == "prod" ? 2 : 1
max_size = var.environment == "prod" ? 10 : 3
desired_capacity = var.environment == "prod" ? 3 : 1
# Use multiple instance types for cost optimization
mixed_instances_policy {
instances_distribution {
on_demand_base_capacity = var.environment == "prod" ? 2 : 1
on_demand_percentage_above_base_capacity = var.environment == "prod" ? 20 : 100
spot_allocation_strategy = "diversified"
}
launch_template {
launch_template_specification {
launch_template_id = aws_launch_template.web.id
version = "$Latest"
}
# Define multiple instance types for flexibility and cost savings
override {
instance_type = "t3.medium"
weighted_capacity = "1"
}
override {
instance_type = "t3.large"
weighted_capacity = "2"
}
override {
instance_type = "m5.large"
weighted_capacity = "2"
}
}
}
# Cost optimization: Spread across AZs for better spot pricing
vpc_zone_identifier = var.private_subnet_ids
tag {
key = "Name"
value = "${var.project_name}-${var.environment}-web"
propagate_at_launch = true
}
tag {
key = "Environment"
value = var.environment
propagate_at_launch = true
}
tag {
key = "CostCenter"
value = var.cost_center
propagate_at_launch = true
}
}

13. CI/CD Integration
GitHub Actions Integration
# .github/workflows/terraform.yml
name: 'Terraform'
on:
push:
branches: [ "main" ]
paths: [ "terraform/**" ]
pull_request:
branches: [ "main" ]
paths: [ "terraform/**" ]
permissions:
contents: read
pull-requests: write
jobs:
terraform:
name: 'Terraform'
runs-on: ubuntu-latest
environment: production
defaults:
run:
shell: bash
working-directory: ./terraform
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: ~1.0
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
aws-region: us-west-2
- name: Terraform Format
id: fmt
run: terraform fmt -check
continue-on-error: true
- name: Terraform Init
id: init
run: terraform init
- name: Terraform Validate
id: validate
run: terraform validate -no-color
- name: Terraform Plan
id: plan
if: github.event_name == 'pull_request'
run: terraform plan -no-color -input=false
continue-on-error: true
- name: Update Pull Request
uses: actions/github-script@v7
if: github.event_name == 'pull_request'
env:
PLAN: "terraform\n${{ steps.plan.outputs.stdout }}"
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const output = `#### Terraform Format and Style 🖌\`${{ steps.fmt.outcome }}\`
#### Terraform Initialization ⚙️\`${{ steps.init.outcome }}\`
#### Terraform Validation 🤖\`${{ steps.validate.outcome }}\`
#### Terraform Plan 📖\`${{ steps.plan.outcome }}\`
<details><summary>Show Plan</summary>
\`\`\`\n
${process.env.PLAN}
\`\`\`
</details>
*Pushed by: @${{ github.actor }}, Action: \`${{ github.event_name }}\`*`;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: output
})
- name: Terraform Plan Status
if: steps.plan.outcome == 'failure'
run: exit 1
- name: Terraform Apply
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
run: terraform apply -auto-approve -input=false

Azure DevOps Pipeline
# azure-pipelines.yml
trigger:
branches:
include:
- main
paths:
include:
- terraform/*
pool:
vmImage: 'ubuntu-latest'
variables:
terraformVersion: '1.5.0'
workingDirectory: '$(System.DefaultWorkingDirectory)/terraform'
stages:
- stage: validate
jobs:
- job: validate
displayName: 'Terraform Validate'
steps:
- task: TerraformInstaller@0
displayName: 'Install Terraform'
inputs:
terraformVersion: $(terraformVersion)
- task: TerraformTaskV2@2
displayName: 'Terraform Init'
inputs:
provider: 'aws'
command: 'init'
workingDirectory: $(workingDirectory)
backendServiceAWS: 'aws-service-connection'
backendAWSBucketName: 'terraform-state-bucket'
backendAWSKey: 'terraform.tfstate'
- task: TerraformTaskV2@2
displayName: 'Terraform Validate'
inputs:
provider: 'aws'
command: 'validate'
workingDirectory: $(workingDirectory)
- task: TerraformTaskV2@2
displayName: 'Terraform Plan'
inputs:
provider: 'aws'
command: 'plan'
workingDirectory: $(workingDirectory)
environmentServiceNameAWS: 'aws-service-connection'
- stage: deploy
condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
jobs:
- deployment: deploy_infrastructure
displayName: 'Deploy Infrastructure'
environment: 'production'
strategy:
runOnce:
deploy:
steps:
- task: TerraformTaskV2@2
displayName: 'Terraform Apply'
inputs:
provider: 'aws'
command: 'apply'
workingDirectory: $(workingDirectory)
environmentServiceNameAWS: 'aws-service-connection'

GitLab CI/CD
# .gitlab-ci.yml
# The terraform image's entrypoint is the terraform binary, so override it
# to get a plain shell for the script sections
image:
  name: hashicorp/terraform:1.5
  entrypoint: [""]
variables:
TF_ROOT: ${CI_PROJECT_DIR}/terraform
TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_COMMIT_REF_NAME}
cache:
key: "${CI_COMMIT_REF_NAME}"
paths:
- ${TF_ROOT}/.terraform
before_script:
- cd ${TF_ROOT}
stages:
- validate
- plan
- apply
validate:
stage: validate
script:
- terraform --version
- terraform init
- terraform validate
- terraform fmt -check
plan:
stage: plan
script:
- terraform init
- terraform plan -out="planfile"
artifacts:
paths:
- ${TF_ROOT}/planfile
expire_in: 1 week
only:
- merge_requests
apply:
stage: apply
script:
- terraform init
- terraform apply -input=false "planfile"
dependencies:
- plan
only:
- main
when: manual

Multi-Environment Pipeline
graph LR
A[Code Commit] --> B[Validate & Lint]
B --> C[Plan Dev]
C --> D[Apply Dev]
D --> E[Test Dev]
E --> F[Plan Staging]
F --> G[Apply Staging]
G --> H[Test Staging]
H --> I[Plan Prod]
I --> J[Manual Approval]
J --> K[Apply Prod]

# multi-environment-pipeline.yml
name: Multi-Environment Deployment
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- name: Terraform Format Check
run: terraform fmt -check -recursive
- name: Terraform Validate
run: |
for env in dev staging prod; do
cd environments/$env
terraform init -backend=false
terraform validate
cd ../..
done
plan-dev:
needs: validate
runs-on: ubuntu-latest
environment: development
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_DEV }}
aws-region: us-west-2
- name: Terraform Plan Dev
working-directory: ./environments/dev
run: |
terraform init
terraform plan -out=tfplan
deploy-dev:
needs: plan-dev
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
environment: development
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_DEV }}
aws-region: us-west-2
- name: Terraform Apply Dev
working-directory: ./environments/dev
run: |
terraform init
terraform apply -auto-approve
plan-staging:
needs: deploy-dev
runs-on: ubuntu-latest
environment: staging
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_STAGING }}
aws-region: us-west-2
- name: Terraform Plan Staging
working-directory: ./environments/staging
run: |
terraform init
terraform plan -out=tfplan
deploy-staging:
needs: plan-staging
runs-on: ubuntu-latest
environment: staging
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_STAGING }}
aws-region: us-west-2
- name: Terraform Apply Staging
working-directory: ./environments/staging
run: |
terraform init
terraform apply -auto-approve
plan-prod:
needs: deploy-staging
runs-on: ubuntu-latest
environment: production
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_PROD }}
aws-region: us-west-2
- name: Terraform Plan Production
working-directory: ./environments/prod
run: |
terraform init
terraform plan -out=tfplan
deploy-prod:
needs: plan-prod
runs-on: ubuntu-latest
environment: production
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_PROD }}
aws-region: us-west-2
- name: Terraform Apply Production
working-directory: ./environments/prod
run: |
terraform init
terraform apply -auto-approve

15. Troubleshooting and Debugging
Common Error Patterns
graph TB
A[Terraform Errors] --> B[State Issues]
A --> C[Provider Issues]
A --> D[Resource Conflicts]
A --> E[Permission Issues]
A --> F[Configuration Errors]
B --> G[State Lock]
B --> H[State Drift]
B --> I[Corrupted State]
C --> J[Version Conflicts]
C --> K[Authentication]
C --> L[Rate Limiting]
D --> M[Resource Already Exists]
D --> N[Dependency Cycles]
D --> O[Name Conflicts]

Debugging Techniques
# Enable debug logging
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform.log
terraform apply
# Trace specific operations
export TF_LOG=TRACE
terraform plan
# Provider-specific logging
export TF_LOG_PROVIDER=DEBUG
# Core Terraform logging
export TF_LOG_CORE=DEBUG
# State debugging
terraform state list
terraform state show aws_instance.example
terraform show -json | jq '.values.root_module.resources[]'
# Import existing resources
terraform import aws_instance.example i-1234567890abcdef0
# Force unlock state
terraform force-unlock LOCK_ID
# Refresh state manually
terraform refresh
# Target specific resources
terraform plan -target=aws_instance.example
terraform apply -target=module.vpc
# Validate configuration
terraform validate
terraform fmt -check -diff

State Recovery
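Besides the CLI state surgery shown in this section, Terraform 1.1 and later supports declarative refactoring with moved blocks, which avoids running terraform state mv by hand; the resource addresses are illustrative:

```hcl
# Tells Terraform that the resource formerly addressed as aws_instance.old
# now lives at aws_instance.new, so the next plan records a rename in state
# instead of a destroy-and-create.
moved {
  from = aws_instance.old
  to   = aws_instance.new
}
```

Because the rename is recorded in configuration, every collaborator and CI pipeline picks it up automatically; the block can be deleted once all states have been migrated.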
# Backup current state
terraform state pull > terraform.tfstate.backup
# List all resources in state
terraform state list
# Remove corrupted resource from state
terraform state rm aws_instance.corrupted
# Move resource to new address
terraform state mv aws_instance.old aws_instance.new
# Replace provider configuration
terraform state replace-provider registry.terraform.io/hashicorp/aws hashicorp/aws
# Import missing resources
terraform import aws_vpc.main vpc-12345678
terraform import aws_subnet.public subnet-12345678
# Manually edit state (dangerous!)
terraform state pull > state.json
# Edit state.json carefully
terraform state push state.json

Error Resolution Examples
# Fix: Resource already exists error
resource "aws_s3_bucket" "example" {
bucket = "my-unique-bucket-${random_id.bucket_suffix.hex}"
lifecycle {
# Prevent destruction of existing bucket
prevent_destroy = true
}
}
resource "random_id" "bucket_suffix" {
byte_length = 4
}
# Fix: Dependency cycle
# Before (creates cycle):
resource "aws_security_group" "web" {
ingress {
security_groups = [aws_security_group.db.id]
}
}
resource "aws_security_group" "db" {
ingress {
security_groups = [aws_security_group.web.id]
}
}
# After (using security group rules):
resource "aws_security_group" "web" {
# Base security group without rules
}
resource "aws_security_group" "db" {
# Base security group without rules
}
resource "aws_security_group_rule" "web_to_db" {
type = "egress"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_group_id = aws_security_group.web.id
source_security_group_id = aws_security_group.db.id
}
resource "aws_security_group_rule" "db_from_web" {
type = "ingress"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_group_id = aws_security_group.db.id
source_security_group_id = aws_security_group.web.id
}
# Fix: Provider authentication
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = var.aws_region
# Use assume role for cross-account access
assume_role {
role_arn = "arn:aws:iam::123456789012:role/TerraformRole"
session_name = "terraform-session"
external_id = var.external_id
}
# Default tags for all resources
default_tags {
tags = {
ManagedBy = "Terraform"
Environment = var.environment
}
}
}

Performance Optimization
# Parallel execution
terraform plan -parallelism=20
terraform apply -parallelism=20
# Reduce plan time with targeted operations
terraform plan -target=module.vpc
terraform plan -refresh=false
# Use partial configuration for backends
terraform init \
-backend-config="bucket=my-terraform-state" \
-backend-config="key=terraform.tfstate" \
-backend-config="region=us-west-2"
# Optimize provider configuration
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
# (Older guides set experiments = [module_variable_optional_attrs] here; the
# experiment was stabilized in Terraform 1.3 and must not be declared now.)
}
provider "aws" {
region = var.aws_region
# Skip metadata API check for faster execution
skip_metadata_api_check = true
# Skip region validation
skip_region_validation = true
# Skip credentials validation
skip_credentials_validation = true
}

16. Multi-Cloud Strategies
Understanding Multi-Cloud Architecture
Multi-cloud strategies involve using multiple cloud providers to avoid vendor lock-in, improve resilience, and leverage best-of-breed services from different providers.
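One small illustration of cross-cloud resilience is DNS-level failover: Route 53 health-checks the primary AWS endpoint and fails over to a secondary endpoint hosted in Azure. A hedged sketch; the zone ID and DNS names are placeholders:

```hcl
# Health check against the primary (AWS-hosted) endpoint
resource "aws_route53_health_check" "primary" {
  fqdn              = "app-aws.example.com" # placeholder primary endpoint
  type              = "HTTPS"
  port              = 443
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 30
}

# Primary answer: served while the health check passes
resource "aws_route53_record" "primary" {
  zone_id         = "Z123EXAMPLE" # placeholder hosted zone
  name            = "app.example.com"
  type            = "CNAME"
  ttl             = 60
  set_identifier  = "primary-aws"
  records         = ["app-aws.example.com"]
  health_check_id = aws_route53_health_check.primary.id

  failover_routing_policy {
    type = "PRIMARY"
  }
}

# Secondary answer: returned only when the primary is unhealthy
resource "aws_route53_record" "secondary" {
  zone_id        = "Z123EXAMPLE"
  name           = "app.example.com"
  type           = "CNAME"
  ttl            = 60
  set_identifier = "secondary-azure"
  records        = ["app-azure.example.com"] # placeholder Azure endpoint

  failover_routing_policy {
    type = "SECONDARY"
  }
}
```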
graph TB
A[Multi-Cloud Strategy] --> B[Hybrid Deployment]
A --> C[Disaster Recovery]
A --> D[Cost Optimization]
A --> E[Compliance Requirements]
A --> F[Best-of-Breed Services]
B --> G[Primary: AWS]
B --> H[Secondary: Azure]
B --> I[Tertiary: GCP]
C --> J[Cross-Region Backup]
C --> K[Failover Mechanisms]
D --> L[Provider Comparison]
D --> M[Spot Pricing]
E --> N[Data Sovereignty]
E --> O[Regional Compliance]
F --> P[AWS: EC2/S3]
F --> Q[Azure: AD/Functions]
F --> R[GCP: BigQuery/AI]

Provider Configuration
Multi-Provider Setup
# providers.tf
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.0"
}
google = {
source = "hashicorp/google"
version = "~> 4.0"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.0"
}
}
}
# AWS Provider Configuration
provider "aws" {
region = var.aws_region
default_tags {
tags = {
Environment = var.environment
Project = var.project_name
ManagedBy = "Terraform"
CloudProvider = "AWS"
}
}
}
# Azure Provider Configuration
provider "azurerm" {
features {
resource_group {
prevent_deletion_if_contains_resources = false
}
key_vault {
purge_soft_delete_on_destroy = true
recover_soft_deleted_key_vaults = true
}
}
subscription_id = var.azure_subscription_id
tenant_id = var.azure_tenant_id
client_id = var.azure_client_id
client_secret = var.azure_client_secret
}
# Google Cloud Provider Configuration
provider "google" {
project = var.gcp_project_id
region = var.gcp_region
zone = var.gcp_zone
credentials = var.gcp_credentials_file
}
# Kubernetes Provider (can work with any cloud)
provider "kubernetes" {
config_path = var.kubeconfig_path
config_context = var.kubernetes_context
}

Cloud-Agnostic Infrastructure Patterns
Abstract Infrastructure Module
# modules/cloud-agnostic-compute/main.tf
variable "cloud_provider" {
description = "Target cloud provider"
type = string
validation {
condition = contains(["aws", "azure", "gcp"], var.cloud_provider)
error_message = "Cloud provider must be aws, azure, or gcp."
}
}
variable "instance_config" {
description = "Instance configuration"
type = object({
name = string
instance_type = string
image_id = string
subnet_id = string
key_name = string
})
}
# AWS Implementation
resource "aws_instance" "compute" {
count = var.cloud_provider == "aws" ? 1 : 0
ami = var.instance_config.image_id
instance_type = var.instance_config.instance_type
subnet_id = var.instance_config.subnet_id
key_name = var.instance_config.key_name
tags = {
Name = var.instance_config.name
}
}
# Azure Implementation
resource "azurerm_linux_virtual_machine" "compute" {
count = var.cloud_provider == "azure" ? 1 : 0
name = var.instance_config.name
resource_group_name = var.azure_resource_group
location = var.azure_location
size = var.instance_config.instance_type
admin_username = "azureuser"
disable_password_authentication = true
network_interface_ids = [
azurerm_network_interface.compute[0].id,
]
admin_ssh_key {
username = "azureuser"
public_key = file(var.instance_config.key_name)
}
os_disk {
caching = "ReadWrite"
storage_account_type = "Premium_LRS"
}
source_image_reference {
publisher = "Canonical"
offer = "0001-com-ubuntu-server-focal"
sku = "20_04-lts-gen2"
version = "latest"
}
}
# GCP Implementation
resource "google_compute_instance" "compute" {
count = var.cloud_provider == "gcp" ? 1 : 0
name = var.instance_config.name
machine_type = var.instance_config.instance_type
zone = var.gcp_zone
boot_disk {
initialize_params {
image = var.instance_config.image_id
}
}
network_interface {
subnetwork = var.instance_config.subnet_id
access_config {
// Ephemeral public IP
}
}
metadata = {
ssh-keys = "ubuntu:${file(var.instance_config.key_name)}"
}
service_account {
scopes = ["cloud-platform"]
}
}
Cross-Cloud Networking
VPN Connections Between Clouds
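Before wiring a site-to-site VPN between clouds, the two address spaces must not overlap; the configuration below deliberately uses 10.0.0.0/16 for AWS and 10.1.0.0/16 for Azure. A quick pre-flight check can be scripted with Python's standard library:

```python
import ipaddress

def cidrs_overlap(a: str, b: str) -> bool:
    """Return True if two CIDR blocks share any addresses."""
    net_a = ipaddress.ip_network(a)
    net_b = ipaddress.ip_network(b)
    return net_a.overlaps(net_b)

# The address spaces used in this section do not overlap, so routing
# between the two VPN gateways is unambiguous.
print(cidrs_overlap("10.0.0.0/16", "10.1.0.0/16"))   # False
print(cidrs_overlap("10.0.0.0/16", "10.0.128.0/17"))  # True
```

Running a check like this in CI before `terraform apply` catches overlapping CIDRs early, when they are still cheap to change.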
# AWS VPC
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "${var.project_name}-aws-vpc"
}
}
# AWS VPN Gateway
resource "aws_vpn_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.project_name}-aws-vpn-gateway"
}
}
# Azure Virtual Network
resource "azurerm_virtual_network" "main" {
name = "${var.project_name}-azure-vnet"
address_space = ["10.1.0.0/16"]
location = var.azure_location
resource_group_name = azurerm_resource_group.main.name
}
# Azure VPN Gateway
resource "azurerm_virtual_network_gateway" "main" {
name = "${var.project_name}-azure-vpn-gateway"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
type = "Vpn"
vpn_type = "RouteBased"
active_active = false
enable_bgp = false
sku = "VpnGw1"
ip_configuration {
name = "vnetGatewayConfig"
public_ip_address_id = azurerm_public_ip.vpn_gateway.id
private_ip_address_allocation = "Dynamic"
subnet_id = azurerm_subnet.gateway_subnet.id
}
}
# Site-to-Site VPN Connection
resource "aws_vpn_connection" "aws_to_azure" {
vpn_gateway_id = aws_vpn_gateway.main.id
customer_gateway_id = aws_customer_gateway.azure.id
type = "ipsec.1"
static_routes_only = true
tags = {
Name = "${var.project_name}-aws-to-azure-vpn"
}
}
resource "aws_customer_gateway" "azure" {
bgp_asn = 65000
ip_address = azurerm_public_ip.vpn_gateway.ip_address
type = "ipsec.1"
tags = {
Name = "${var.project_name}-azure-customer-gateway"
}
}
Multi-Cloud Data Strategy
Cross-Cloud Data Replication
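The replication Lambda wired up in this section ships as `data_replication.zip`; its source is not part of the listing. The sketch below is one hypothetical shape for the handler: a pure helper that parses the S3 event, plus a handler that copies each new object to Azure Blob Storage and GCS. The `replicated` container name is an assumption, and the handler presumes the `azure-storage-blob` and `google-cloud-storage` SDKs are bundled with the function.

```python
import os
import urllib.parse

def objects_from_s3_event(event: dict) -> list:
    """Extract (bucket, key) pairs from an S3 ObjectCreated event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # S3 URL-encodes keys in event payloads (spaces arrive as '+').
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        pairs.append((s3["bucket"]["name"], key))
    return pairs

def handler(event, context):
    # SDKs are imported lazily so the pure helper above stays dependency-free.
    import boto3
    from azure.storage.blob import BlobServiceClient
    from google.cloud import storage as gcs

    s3 = boto3.client("s3")
    azure = BlobServiceClient(
        account_url=f"https://{os.environ['AZURE_STORAGE_ACCOUNT']}.blob.core.windows.net",
        credential=os.environ["AZURE_STORAGE_KEY"],
    )
    gcs_bucket = gcs.Client(project=os.environ["GCP_PROJECT_ID"]).bucket(
        os.environ["GCP_BUCKET_NAME"]
    )
    for bucket, key in objects_from_s3_event(event):
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        # Hypothetical "replicated" container/prefix on the Azure side.
        azure.get_blob_client(container="replicated", blob=key).upload_blob(
            body, overwrite=True
        )
        gcs_bucket.blob(key).upload_from_string(body)
```

Keeping the event-parsing logic separate from the SDK calls makes the function unit-testable without cloud credentials.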
# AWS S3 Bucket
resource "aws_s3_bucket" "primary" {
bucket = "${var.project_name}-primary-${random_id.bucket_suffix.hex}"
tags = {
Name = "${var.project_name}-primary-storage"
Provider = "AWS"
Purpose = "primary-data"
}
}
# Azure Storage Account
resource "azurerm_storage_account" "secondary" {
name = "${replace(var.project_name, "-", "")}secondary${random_id.bucket_suffix.hex}"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
account_tier = "Standard"
account_replication_type = "LRS"
tags = {
Name = "${var.project_name}-secondary-storage"
Provider = "Azure"
Purpose = "secondary-data"
}
}
# GCP Storage Bucket
resource "google_storage_bucket" "tertiary" {
name = "${var.project_name}-tertiary-${random_id.bucket_suffix.hex}"
location = var.gcp_region
versioning {
enabled = true
}
labels = {
name = "${replace(var.project_name, "-", "_")}_tertiary_storage"
provider = "gcp"
purpose = "tertiary_data"
}
}
# Cross-cloud replication using Lambda/Functions
resource "aws_lambda_function" "data_replication" {
filename = "data_replication.zip"
function_name = "${var.project_name}-data-replication"
role = aws_iam_role.lambda_replication.arn
handler = "index.handler"
source_code_hash = filebase64sha256("data_replication.zip")
runtime = "python3.9"
timeout = 300
environment {
variables = {
AZURE_STORAGE_ACCOUNT = azurerm_storage_account.secondary.name
AZURE_STORAGE_KEY = azurerm_storage_account.secondary.primary_access_key
GCP_BUCKET_NAME = google_storage_bucket.tertiary.name
GCP_PROJECT_ID = var.gcp_project_id
}
}
tags = {
Name = "${var.project_name}-data-replication"
Purpose = "cross-cloud-sync"
}
}
# S3 Event trigger for replication
resource "aws_s3_bucket_notification" "replication_trigger" {
bucket = aws_s3_bucket.primary.id
lambda_function {
lambda_function_arn = aws_lambda_function.data_replication.arn
events = ["s3:ObjectCreated:*"]
}
depends_on = [aws_lambda_permission.s3_invoke_lambda]
}
Disaster Recovery Across Clouds
Multi-Cloud Backup Strategy
# Disaster Recovery Configuration
locals {
dr_config = {
primary_cloud = "aws"
secondary_cloud = "azure"
tertiary_cloud = "gcp"
rpo_minutes = 60 # Recovery Point Objective
rto_minutes = 30 # Recovery Time Objective
}
}
# AWS Primary Database
resource "aws_db_instance" "primary_db" {
identifier = "${var.project_name}-primary-db"
engine = "mysql"
engine_version = "8.0.35"
instance_class = "db.t3.medium"
allocated_storage = 100
max_allocated_storage = 1000
storage_encrypted = true
db_name = var.database_name
username = var.db_username
password = random_password.db_password.result
backup_retention_period = 35
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
# Enable automated backups to S3
copy_tags_to_snapshot = true
tags = {
Name = "${var.project_name}-primary-database"
Provider = "AWS"
Purpose = "primary-database"
}
}
# Azure Secondary Database (the Single Server service shown here is retired;
# prefer azurerm_mysql_flexible_server for new deployments)
resource "azurerm_mysql_server" "secondary_db" {
name = "${var.project_name}-secondary-db"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
administrator_login = var.db_username
administrator_login_password = random_password.db_password.result
sku_name = "B_Gen5_2"
storage_mb = 102400
version = "8.0"
auto_grow_enabled = true
backup_retention_days = 35
geo_redundant_backup_enabled = true
infrastructure_encryption_enabled = true
public_network_access_enabled = false
ssl_enforcement_enabled = true
ssl_minimal_tls_version_enforced = "TLS1_2"
tags = {
Name = "${var.project_name}-secondary-database"
Provider = "Azure"
Purpose = "secondary-database"
}
}
# Database replication monitoring
resource "aws_cloudwatch_metric_alarm" "database_replication_lag" {
alarm_name = "${var.project_name}-db-replication-lag"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = "2"
metric_name = "ReplicaLag" # emitted by read replicas, in seconds
namespace = "AWS/RDS"
period = "300"
statistic = "Average"
threshold = local.dr_config.rpo_minutes * 60 # alarm when lag exceeds the RPO
alarm_description = "Alerts when database replica lag exceeds the recovery point objective"
dimensions = {
DBInstanceIdentifier = aws_db_instance.primary_db.id
}
alarm_actions = [aws_sns_topic.dr_alerts.arn]
tags = {
Name = "${var.project_name}-db-monitoring"
Purpose = "disaster-recovery"
}
}
Cloud-Agnostic Deployment Patterns
Kubernetes Multi-Cloud Deployment
# Kubernetes cluster definitions for different clouds
resource "aws_eks_cluster" "aws_cluster" {
count = var.deploy_to_aws ? 1 : 0
name = "${var.project_name}-aws-cluster"
role_arn = aws_iam_role.eks_cluster[0].arn
version = var.kubernetes_version
vpc_config {
subnet_ids = var.aws_subnet_ids
endpoint_private_access = true
endpoint_public_access = true
public_access_cidrs = var.allowed_cidr_blocks
}
encryption_config {
provider {
key_arn = aws_kms_key.eks[0].arn
}
resources = ["secrets"]
}
tags = {
Name = "${var.project_name}-aws-eks"
Provider = "AWS"
}
}
resource "azurerm_kubernetes_cluster" "azure_cluster" {
count = var.deploy_to_azure ? 1 : 0
name = "${var.project_name}-azure-cluster"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
dns_prefix = "${var.project_name}-azure"
kubernetes_version = var.kubernetes_version
default_node_pool {
name = "default"
node_count = 3
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
}
network_profile {
network_plugin = "azure"
}
tags = {
Name = "${var.project_name}-azure-aks"
Provider = "Azure"
}
}
resource "google_container_cluster" "gcp_cluster" {
count = var.deploy_to_gcp ? 1 : 0
name = "${var.project_name}-gcp-cluster"
location = var.gcp_region
# We can't create a cluster with no node pool defined, but we want to only use
# separately managed node pools. So we create the smallest possible default
# node pool and immediately delete it.
remove_default_node_pool = true
initial_node_count = 1
master_version = var.kubernetes_version
network = var.gcp_network
subnetwork = var.gcp_subnetwork
# Enable network policy
network_policy {
enabled = true
}
# Enable IP alias
ip_allocation_policy {}
# Enable Workload Identity
workload_identity_config {
workload_pool = "${var.gcp_project_id}.svc.id.goog"
}
}
# Application deployment across clusters
resource "kubernetes_deployment" "app" {
depends_on = [
aws_eks_cluster.aws_cluster,
azurerm_kubernetes_cluster.azure_cluster,
google_container_cluster.gcp_cluster
]
metadata {
name = var.app_name
labels = {
app = var.app_name
}
}
spec {
replicas = var.replica_count
selector {
match_labels = {
app = var.app_name
}
}
template {
metadata {
labels = {
app = var.app_name
}
}
spec {
container {
image = var.app_image
name = var.app_name
port {
container_port = var.app_port
}
env {
name = "CLOUD_PROVIDER"
value = var.current_cloud_provider
}
env {
name = "ENVIRONMENT"
value = var.environment
}
resources {
limits = {
cpu = "500m"
memory = "512Mi"
}
requests = {
cpu = "250m"
memory = "256Mi"
}
}
}
}
}
}
}
Cost Optimization Across Clouds
Multi-Cloud Cost Comparison
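Terraform's `external` data source exchanges JSON with the program over stdin and stdout, and every value in both directions must be a string. The sketch below is a hypothetical skeleton for `scripts/multi_cloud_pricing.py`; the prices are placeholders, and a real script would query each provider's pricing API for the requested instance type and region.

```python
import json
import sys

# Placeholder monthly prices; a real implementation would look these up
# per instance type and region via each provider's pricing API.
PRICES = {"aws": 61.10, "azure": 58.40, "gcp": 55.80}

def cheapest(costs: dict) -> tuple:
    """Return (provider, savings percentage vs the most expensive option)."""
    provider = min(costs, key=costs.get)
    most_expensive = max(costs.values())
    savings = (most_expensive - costs[provider]) / most_expensive * 100
    return provider, round(savings, 2)

def respond() -> None:
    """External data source protocol: JSON query in on stdin,
    a flat map of *string* values out on stdout."""
    _query = json.load(sys.stdin)  # instance_type, storage_gb, regions...
    provider, savings = cheapest(PRICES)
    json.dump({
        "costs": json.dumps(PRICES),          # nested data must be serialized
        "cheapest_provider": provider,
        "savings_percentage": str(savings),   # numbers become strings
    }, sys.stdout)
```

The Terraform side then calls `jsondecode()` on `result.costs` to recover the nested map, exactly as the locals block below does.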
# Cost optimization data sources
data "external" "cloud_pricing" {
program = ["python3", "${path.module}/scripts/multi_cloud_pricing.py"]
query = {
instance_type = var.instance_type
storage_gb = tostring(var.storage_size) # external query values must be strings
region_aws = var.aws_region
region_azure = var.azure_location
region_gcp = var.gcp_region
}
}
# Dynamic cloud selection based on cost
locals {
cloud_costs = jsondecode(data.external.cloud_pricing.result.costs)
cheapest_cloud = data.external.cloud_pricing.result.cheapest_provider
# Deploy to the most cost-effective cloud
deploy_to_aws = local.cheapest_cloud == "aws"
deploy_to_azure = local.cheapest_cloud == "azure"
deploy_to_gcp = local.cheapest_cloud == "gcp"
}
# Output cost comparison
output "cloud_cost_comparison" {
description = "Cost comparison across cloud providers"
value = {
aws_monthly_cost = local.cloud_costs.aws
azure_monthly_cost = local.cloud_costs.azure
gcp_monthly_cost = local.cloud_costs.gcp
cheapest_provider = local.cheapest_cloud
savings_percentage = data.external.cloud_pricing.result.savings_percentage
}
}
Multi-Cloud Security and Compliance
Unified Security Policies
# Cross-cloud security policy enforcement
resource "aws_config_configuration_recorder" "recorder" {
count = var.deploy_to_aws ? 1 : 0
name = "${var.project_name}-aws-config-recorder"
role_arn = aws_iam_role.config[0].arn
recording_group {
all_supported = true
include_global_resource_types = true
}
}
resource "azurerm_resource_group_policy_assignment" "security_baseline" {
count = var.deploy_to_azure ? 1 : 0
name = "${var.project_name}-security-baseline"
resource_group_id = azurerm_resource_group.main.id
policy_definition_id = "/providers/Microsoft.Authorization/policySetDefinitions/179d1daa-458f-4e47-8086-2a68d0d6c38f"
parameters = jsonencode({
logAnalyticsWorkspaceId = {
value = azurerm_log_analytics_workspace.main[0].id
}
})
}
# Cloud Security Posture Management (CSPM) integration
resource "aws_securityhub_account" "main" {
count = var.deploy_to_aws ? 1 : 0
}
resource "azurerm_security_center_subscription_pricing" "main" {
count = var.deploy_to_azure ? 1 : 0
tier = "Standard"
resource_type = "VirtualMachines"
}
This multi-cloud section has covered the core building blocks for managing infrastructure across providers: cross-cloud networking, data replication, disaster recovery, cost optimization, and unified security policies.
17. Advanced Deployment Patterns
Understanding Advanced Deployment Strategies
Advanced deployment patterns go beyond basic infrastructure provisioning to include sophisticated strategies for rolling updates, canary deployments, blue-green deployments, and infrastructure evolution without downtime.
graph TB
A[Advanced Deployment Patterns] --> B[Blue-Green Deployment]
A --> C[Canary Deployment]
A --> D[Rolling Updates]
A --> E[Feature Toggles]
A --> F[Infrastructure Versioning]
A --> G[Progressive Delivery]
A --> H[Chaos Engineering]
B --> I[Zero Downtime]
B --> J[Quick Rollback]
C --> K[Risk Mitigation]
C --> L[Gradual Rollout]
D --> M[Minimal Disruption]
D --> N[Automated Updates]
E --> O[Runtime Control]
E --> P[A/B Testing]
F --> Q[Immutable Infrastructure]
F --> R[Version Tracking]
G --> S[Observability-Driven]
G --> T[Automated Decisions]
Blue-Green Deployment Pattern
Blue-green deployment maintains two identical production environments, allowing instant switching between versions with zero downtime.
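Because the listener in this module keeps the standby target group attached at weight 0, a cutover is nothing more than swapping the two weights. Within Terraform you flip `var.active_environment`; for an emergency out-of-band cutover, the same swap can be scripted against the ELBv2 API. A hedged sketch (the weight-swapping helper is pure; `modify_listener` is a real boto3 call, but ARN discovery and error handling are omitted):

```python
def swapped_forward_action(action: dict) -> dict:
    """Invert the 100/0 weights on a blue-green forward action."""
    groups = action["ForwardConfig"]["TargetGroups"]
    swapped = [
        {"TargetGroupArn": g["TargetGroupArn"], "Weight": 100 - g["Weight"]}
        for g in groups
    ]
    return {"Type": "forward", "ForwardConfig": {"TargetGroups": swapped}}

def cutover(listener_arn: str) -> None:
    # boto3 is imported here so the pure helper stays testable offline.
    import boto3
    elbv2 = boto3.client("elbv2")
    listener = elbv2.describe_listeners(ListenerArns=[listener_arn])["Listeners"][0]
    action = listener["DefaultActions"][0]
    elbv2.modify_listener(
        ListenerArn=listener_arn,
        DefaultActions=[swapped_forward_action(action)],
    )
```

Note that an out-of-band swap drifts from Terraform state; flipping `var.active_environment` and re-applying remains the state-managed path.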
# modules/blue-green-deployment/main.tf
locals {
# Determine active and standby environments
blue_active = var.active_environment == "blue"
green_active = var.active_environment == "green"
# Current and next environment configurations
current_env = var.active_environment
next_env = var.active_environment == "blue" ? "green" : "blue"
# Environment-specific configurations
environments = {
blue = {
target_group_arn = aws_lb_target_group.blue.arn
asg_name = aws_autoscaling_group.blue.name
color = "blue"
}
green = {
target_group_arn = aws_lb_target_group.green.arn
asg_name = aws_autoscaling_group.green.name
color = "green"
}
}
}
# Application Load Balancer (shared)
resource "aws_lb" "main" {
name = "${var.project_name}-${var.environment}-alb"
internal = false
load_balancer_type = "application"
security_groups = var.security_group_ids
subnets = var.public_subnet_ids
enable_deletion_protection = var.environment == "prod"
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-alb"
DeploymentPattern = "blue-green"
})
}
# Blue Environment Target Group
resource "aws_lb_target_group" "blue" {
name = "${var.project_name}-${var.environment}-blue-tg"
port = var.application_port
protocol = "HTTP"
vpc_id = var.vpc_id
health_check {
enabled = true
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 5
interval = 30
path = var.health_check_path
matcher = "200"
port = "traffic-port"
protocol = "HTTP"
}
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-blue-tg"
Environment = "blue"
})
}
# Green Environment Target Group
resource "aws_lb_target_group" "green" {
name = "${var.project_name}-${var.environment}-green-tg"
port = var.application_port
protocol = "HTTP"
vpc_id = var.vpc_id
health_check {
enabled = true
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 5
interval = 30
path = var.health_check_path
matcher = "200"
port = "traffic-port"
protocol = "HTTP"
}
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-green-tg"
Environment = "green"
})
}
# ALB Listener with traffic routing
resource "aws_lb_listener" "main" {
load_balancer_arn = aws_lb.main.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
forward {
target_group {
arn = local.blue_active ? aws_lb_target_group.blue.arn : aws_lb_target_group.green.arn
weight = 100
}
# Standby environment for testing
target_group {
arn = local.blue_active ? aws_lb_target_group.green.arn : aws_lb_target_group.blue.arn
weight = 0
}
}
}
tags = var.tags
}
# Blue Environment Auto Scaling Group
resource "aws_autoscaling_group" "blue" {
name = "${var.project_name}-${var.environment}-blue-asg"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = [aws_lb_target_group.blue.arn]
min_size = local.blue_active ? var.min_instances : 0
max_size = var.max_instances
desired_capacity = local.blue_active ? var.desired_instances : 0
launch_template {
id = aws_launch_template.blue.id
version = "$Latest"
}
# Health check configuration
health_check_type = "ELB"
health_check_grace_period = 300
# Instance refresh for rolling updates
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
instance_warmup = 300
}
}
tag {
key = "Name"
value = "${var.project_name}-${var.environment}-blue"
propagate_at_launch = true
}
tag {
key = "Environment"
value = "blue"
propagate_at_launch = true
}
tag {
key = "Active"
value = local.blue_active ? "true" : "false"
propagate_at_launch = true
}
lifecycle {
create_before_destroy = true
}
}
# Green Environment Auto Scaling Group
resource "aws_autoscaling_group" "green" {
name = "${var.project_name}-${var.environment}-green-asg"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = [aws_lb_target_group.green.arn]
min_size = local.green_active ? var.min_instances : 0
max_size = var.max_instances
desired_capacity = local.green_active ? var.desired_instances : 0
launch_template {
id = aws_launch_template.green.id
version = "$Latest"
}
health_check_type = "ELB"
health_check_grace_period = 300
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
instance_warmup = 300
}
}
tag {
key = "Name"
value = "${var.project_name}-${var.environment}-green"
propagate_at_launch = true
}
tag {
key = "Environment"
value = "green"
propagate_at_launch = true
}
tag {
key = "Active"
value = local.green_active ? "true" : "false"
propagate_at_launch = true
}
lifecycle {
create_before_destroy = true
}
}
# Launch Templates for Blue Environment
resource "aws_launch_template" "blue" {
name_prefix = "${var.project_name}-${var.environment}-blue-"
image_id = var.blue_ami_id
instance_type = var.instance_type
key_name = var.key_name
vpc_security_group_ids = var.security_group_ids
user_data = base64encode(templatefile("${path.module}/templates/user-data.sh.tpl", {
environment = "blue"
application_port = var.application_port
app_version = var.blue_app_version
config_bucket = var.config_bucket
region = var.aws_region
}))
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = var.volume_size
volume_type = "gp3"
encrypted = true
}
}
iam_instance_profile {
name = var.instance_profile_name
}
tag_specifications {
resource_type = "instance"
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-blue"
Environment = "blue"
})
}
lifecycle {
create_before_destroy = true
}
}
# Launch Templates for Green Environment
resource "aws_launch_template" "green" {
name_prefix = "${var.project_name}-${var.environment}-green-"
image_id = var.green_ami_id
instance_type = var.instance_type
key_name = var.key_name
vpc_security_group_ids = var.security_group_ids
user_data = base64encode(templatefile("${path.module}/templates/user-data.sh.tpl", {
environment = "green"
application_port = var.application_port
app_version = var.green_app_version
config_bucket = var.config_bucket
region = var.aws_region
}))
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = var.volume_size
volume_type = "gp3"
encrypted = true
}
}
iam_instance_profile {
name = var.instance_profile_name
}
tag_specifications {
resource_type = "instance"
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-green"
Environment = "green"
})
}
lifecycle {
create_before_destroy = true
}
}
# CloudWatch Alarms for Blue Environment
resource "aws_cloudwatch_metric_alarm" "blue_high_cpu" {
count = local.blue_active ? 1 : 0
alarm_name = "${var.project_name}-${var.environment}-blue-high-cpu"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "300"
statistic = "Average"
threshold = "80"
alarm_description = "This metric monitors ec2 cpu utilization for blue environment"
dimensions = {
AutoScalingGroupName = aws_autoscaling_group.blue.name
}
alarm_actions = [var.sns_topic_arn]
tags = merge(var.tags, {
Environment = "blue"
})
}
# CloudWatch Alarms for Green Environment
resource "aws_cloudwatch_metric_alarm" "green_high_cpu" {
count = local.green_active ? 1 : 0
alarm_name = "${var.project_name}-${var.environment}-green-high-cpu"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "300"
statistic = "Average"
threshold = "80"
alarm_description = "This metric monitors ec2 cpu utilization for green environment"
dimensions = {
AutoScalingGroupName = aws_autoscaling_group.green.name
}
alarm_actions = [var.sns_topic_arn]
tags = merge(var.tags, {
Environment = "green"
})
}
Canary Deployment Pattern
Canary deployments gradually shift traffic from the stable version to the new version, allowing for real-world testing with minimal risk.
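The `traffic_shifter` Lambda used later in this module ships as `traffic_shifter.zip`; its core logic boils down to walking `var.traffic_shift_steps` one step at a time while the canary stays healthy, and collapsing back to zero on any failure. A hypothetical sketch of that decision:

```python
def next_canary_weight(current: int, steps: list, healthy: bool) -> int:
    """Return the next canary traffic percentage.

    Advances along `steps` (e.g. [5, 10, 25, 50, 100]) while the canary
    passes its health checks; rolls straight back to 0 on any failure.
    """
    if not healthy:
        return 0
    for step in steps:
        if step > current:
            return step
    return current  # already at the final step

print(next_canary_weight(10, [5, 10, 25, 50, 100], healthy=True))   # 25
print(next_canary_weight(25, [5, 10, 25, 50, 100], healthy=False))  # 0
```

The EventBridge schedule defined below invokes this logic every `traffic_shift_interval_minutes`, so a full rollout takes `len(steps)` intervals when everything stays green.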
# modules/canary-deployment/main.tf
locals {
# Calculate traffic weights
canary_weight = var.enable_canary ? var.canary_traffic_percentage : 0
stable_weight = 100 - local.canary_weight
# Determine if canary is healthy
canary_healthy = var.enable_canary && var.canary_health_check_passed
}
# Application Load Balancer
resource "aws_lb" "main" {
name = "${var.project_name}-${var.environment}-canary-alb"
internal = false
load_balancer_type = "application"
security_groups = var.security_group_ids
subnets = var.public_subnet_ids
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-canary-alb"
DeploymentPattern = "canary"
})
}
# Stable Environment Target Group
resource "aws_lb_target_group" "stable" {
name = "${var.project_name}-${var.environment}-stable-tg"
port = var.application_port
protocol = "HTTP"
vpc_id = var.vpc_id
health_check {
enabled = true
healthy_threshold = 2
unhealthy_threshold = 3
timeout = 5
interval = 30
path = var.health_check_path
matcher = "200"
port = "traffic-port"
protocol = "HTTP"
}
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-stable-tg"
Purpose = "stable"
})
}
# Canary Environment Target Group
resource "aws_lb_target_group" "canary" {
name = "${var.project_name}-${var.environment}-canary-tg"
port = var.application_port
protocol = "HTTP"
vpc_id = var.vpc_id
health_check {
enabled = true
healthy_threshold = 2
unhealthy_threshold = 3
timeout = 5
interval = 30
path = var.health_check_path
matcher = "200"
port = "traffic-port"
protocol = "HTTP"
}
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-canary-tg"
Purpose = "canary"
})
}
# ALB Listener with weighted routing
resource "aws_lb_listener" "main" {
load_balancer_arn = aws_lb.main.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
forward {
# Stable environment
target_group {
arn = aws_lb_target_group.stable.arn
weight = local.stable_weight
}
# Canary environment (only if enabled)
dynamic "target_group" {
for_each = var.enable_canary ? [1] : []
content {
arn = aws_lb_target_group.canary.arn
weight = local.canary_weight
}
}
}
}
tags = var.tags
}
# Stable Environment Auto Scaling Group
resource "aws_autoscaling_group" "stable" {
name = "${var.project_name}-${var.environment}-stable-asg"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = [aws_lb_target_group.stable.arn]
min_size = var.stable_min_instances
max_size = var.stable_max_instances
desired_capacity = var.stable_desired_instances
launch_template {
id = aws_launch_template.stable.id
version = "$Latest"
}
health_check_type = "ELB"
health_check_grace_period = 300
tag {
key = "Name"
value = "${var.project_name}-${var.environment}-stable"
propagate_at_launch = true
}
tag {
key = "Purpose"
value = "stable"
propagate_at_launch = true
}
tag {
key = "AppVersion"
value = var.stable_app_version
propagate_at_launch = true
}
lifecycle {
create_before_destroy = true
}
}
# Canary Environment Auto Scaling Group
resource "aws_autoscaling_group" "canary" {
count = var.enable_canary ? 1 : 0
name = "${var.project_name}-${var.environment}-canary-asg"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = [aws_lb_target_group.canary.arn]
min_size = var.canary_min_instances
max_size = var.canary_max_instances
desired_capacity = var.canary_desired_instances
launch_template {
id = aws_launch_template.canary[0].id
version = "$Latest"
}
health_check_type = "ELB"
health_check_grace_period = 300
tag {
key = "Name"
value = "${var.project_name}-${var.environment}-canary"
propagate_at_launch = true
}
tag {
key = "Purpose"
value = "canary"
propagate_at_launch = true
}
tag {
key = "AppVersion"
value = var.canary_app_version
propagate_at_launch = true
}
lifecycle {
create_before_destroy = true
}
}
# Canary Monitoring and Automated Rollback
resource "aws_cloudwatch_metric_alarm" "canary_error_rate" {
count = var.enable_canary ? 1 : 0
alarm_name = "${var.project_name}-${var.environment}-canary-error-rate"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = "2"
metric_name = "HTTPCode_Target_5XX_Count"
namespace = "AWS/ApplicationELB"
period = "300"
statistic = "Sum"
threshold = var.canary_error_threshold
alarm_description = "This alarm monitors canary deployment error rate"
dimensions = {
LoadBalancer = aws_lb.main.arn_suffix
TargetGroup = aws_lb_target_group.canary.arn_suffix
}
alarm_actions = [
aws_sns_topic.canary_alerts.arn,
aws_lambda_function.canary_rollback[0].arn
]
tags = merge(var.tags, {
Purpose = "canary-monitoring"
})
}
# Lambda function for automated canary rollback
resource "aws_lambda_function" "canary_rollback" {
count = var.enable_canary ? 1 : 0
filename = "canary_rollback.zip"
function_name = "${var.project_name}-${var.environment}-canary-rollback"
role = aws_iam_role.lambda_rollback[0].arn
handler = "index.handler"
runtime = "python3.9"
timeout = 300
environment {
variables = {
ALB_LISTENER_ARN = aws_lb_listener.main.arn
STABLE_TARGET_GROUP = aws_lb_target_group.stable.arn
CANARY_ASG_NAME = aws_autoscaling_group.canary[0].name
SNS_TOPIC_ARN = aws_sns_topic.canary_alerts.arn
}
}
tags = merge(var.tags, {
Purpose = "canary-rollback"
})
}
# CloudWatch custom metrics for canary analysis
resource "aws_cloudwatch_log_metric_filter" "canary_success_rate" {
count = var.enable_canary ? 1 : 0
name = "${var.project_name}-${var.environment}-canary-success-rate"
log_group_name = var.application_log_group
pattern = "[timestamp, request_id, \"SUCCESS\", ...]"
metric_transformation {
name = "CanarySuccessRate"
namespace = "Custom/CanaryDeployment"
value = "1"
}
}
# Progressive traffic shifting
resource "aws_lambda_function" "traffic_shifter" {
count = var.enable_canary ? 1 : 0
filename = "traffic_shifter.zip"
function_name = "${var.project_name}-${var.environment}-traffic-shifter"
role = aws_iam_role.lambda_traffic_shifter[0].arn
handler = "index.handler"
runtime = "python3.9"
timeout = 300
environment {
variables = {
ALB_LISTENER_ARN = aws_lb_listener.main.arn
STABLE_TARGET_GROUP = aws_lb_target_group.stable.arn
CANARY_TARGET_GROUP = aws_lb_target_group.canary.arn
TRAFFIC_SHIFT_STEPS = jsonencode(var.traffic_shift_steps)
CLOUDWATCH_NAMESPACE = "Custom/CanaryDeployment"
}
}
tags = merge(var.tags, {
Purpose = "traffic-shifting"
})
}
# EventBridge rule for scheduled traffic shifting
resource "aws_cloudwatch_event_rule" "traffic_shift_schedule" {
count = var.enable_canary ? 1 : 0
name = "${var.project_name}-${var.environment}-traffic-shift"
description = "Trigger traffic shifting for canary deployment"
schedule_expression = "rate(${var.traffic_shift_interval_minutes} minutes)"
tags = var.tags
}
resource "aws_cloudwatch_event_target" "traffic_shift_target" {
count = var.enable_canary ? 1 : 0
rule = aws_cloudwatch_event_rule.traffic_shift_schedule[0].name
target_id = "TrafficShifterLambdaTarget"
arn = aws_lambda_function.traffic_shifter[0].arn
}
Rolling Update Strategy
Rolling updates gradually replace instances in an Auto Scaling Group while maintaining service availability.
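Instance refresh replaces capacity in batches sized by `min_healthy_percentage`. The arithmetic is worth making explicit: with 10 instances and `min_healthy_percentage = 50`, at most 5 instances can be out of service at once. A small sketch of that calculation:

```python
import math

def max_batch_size(desired_capacity: int, min_healthy_percentage: int) -> int:
    """Largest number of instances that can be replaced at once while
    keeping min_healthy_percentage of desired capacity in service."""
    min_healthy = math.ceil(desired_capacity * min_healthy_percentage / 100)
    return desired_capacity - min_healthy

print(max_batch_size(10, 50))  # 5
print(max_batch_size(10, 90))  # 1
print(max_batch_size(4, 50))   # 2
```

Higher percentages mean smaller batches and therefore slower but safer rollouts; `instance_warmup` then adds a pause after each batch before the next one starts.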
# modules/rolling-update/main.tf
resource "aws_autoscaling_group" "rolling" {
name = "${var.project_name}-${var.environment}-rolling-asg"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = var.target_group_arns
min_size = var.min_instances
max_size = var.max_instances
desired_capacity = var.desired_instances
# Launch template with versioning
launch_template {
id = aws_launch_template.app.id
version = var.use_latest_template ? "$Latest" : var.template_version
}
# Health check configuration
health_check_type = "ELB"
health_check_grace_period = var.health_check_grace_period
# Instance refresh for rolling updates
instance_refresh {
strategy = "Rolling"
preferences {
# Percentage of instances to keep healthy during refresh
min_healthy_percentage = var.min_healthy_percentage
# Time to wait before considering instance healthy
instance_warmup = var.instance_warmup_seconds
# Maximum percentage to replace at once
max_healthy_percentage = var.max_healthy_percentage
# Skip matching instances
skip_matching = var.skip_matching_instances
}
triggers = var.instance_refresh_triggers
}
# Lifecycle hooks for graceful shutdown
initial_lifecycle_hook {
name = "${var.project_name}-${var.environment}-terminating-hook"
default_result = "ABANDON"
heartbeat_timeout = var.termination_timeout_seconds
lifecycle_transition = "autoscaling:EC2_INSTANCE_TERMINATING"
notification_target_arn = aws_sns_topic.lifecycle_notifications.arn
role_arn = aws_iam_role.lifecycle_hook.arn
}
# Warm pool for faster scaling
warm_pool {
pool_state = "Stopped"
min_size = var.warm_pool_min_size
max_group_prepared_capacity = var.warm_pool_max_size
instance_reuse_policy {
reuse_on_scale_in = true
}
}
# Tags
dynamic "tag" {
for_each = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-rolling"
DeploymentStrategy = "rolling"
Version = var.app_version
})
content {
key = tag.key
value = tag.value
propagate_at_launch = true
}
}
lifecycle {
create_before_destroy = true
ignore_changes = [desired_capacity]
}
}
# Launch template with advanced configuration
resource "aws_launch_template" "app" {
name_prefix = "${var.project_name}-${var.environment}-"
image_id = var.ami_id
instance_type = var.instance_type
key_name = var.key_name
# security groups are attached via the network_interfaces block below;
# also setting vpc_security_group_ids here would conflict
# Advanced user data with health checks
user_data = base64encode(templatefile("${path.module}/templates/advanced-user-data.sh.tpl", {
app_version = var.app_version
health_check_path = var.health_check_path
graceful_shutdown_timeout = var.graceful_shutdown_timeout
log_group_name = aws_cloudwatch_log_group.app.name
region = var.aws_region
config_bucket = var.config_bucket
deployment_id = random_uuid.deployment_id.result
}))
# Instance metadata service configuration
metadata_options {
http_endpoint = "enabled"
http_tokens = "required"
http_put_response_hop_limit = 2
instance_metadata_tags = "enabled"
}
# EBS optimization
ebs_optimized = true
# Block device mappings
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = var.root_volume_size
volume_type = "gp3"
iops = var.root_volume_iops
throughput = var.root_volume_throughput
encrypted = true
kms_key_id = var.kms_key_id
delete_on_termination = true
}
}
# IAM instance profile
iam_instance_profile {
name = var.instance_profile_name
}
# Network interface configuration
network_interfaces {
associate_public_ip_address = false
security_groups = var.security_group_ids
delete_on_termination = true
}
# Instance monitoring
monitoring {
enabled = var.detailed_monitoring
}
# Tag specifications
tag_specifications {
resource_type = "instance"
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}"
AppVersion = var.app_version
DeploymentId = random_uuid.deployment_id.result
})
}
tag_specifications {
resource_type = "volume"
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-volume"
})
}
lifecycle {
create_before_destroy = true
}
}
# Deployment tracking
resource "random_uuid" "deployment_id" {
keepers = {
ami_id = var.ami_id
app_version = var.app_version
# note: avoid adding timestamp() as a keeper; it would mint a new ID,
# and trigger an instance refresh, on every apply
}
}
# CloudWatch Log Group for application logs
resource "aws_cloudwatch_log_group" "app" {
name = "/aws/ec2/${var.project_name}-${var.environment}"
retention_in_days = var.log_retention_days
kms_key_id = var.cloudwatch_kms_key_id
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-logs"
})
}
# SNS topic for lifecycle notifications
resource "aws_sns_topic" "lifecycle_notifications" {
name = "${var.project_name}-${var.environment}-lifecycle"
tags = var.tags
}
# Lambda function for lifecycle hook processing
resource "aws_lambda_function" "lifecycle_processor" {
filename = "lifecycle_processor.zip"
function_name = "${var.project_name}-${var.environment}-lifecycle-processor"
role = aws_iam_role.lifecycle_lambda.arn
handler = "index.handler"
runtime = "python3.9"
timeout = var.lifecycle_lambda_timeout
environment {
variables = {
LOG_GROUP_NAME = aws_cloudwatch_log_group.app.name
GRACEFUL_TIMEOUT = var.graceful_shutdown_timeout
HEALTH_CHECK_PATH = var.health_check_path
}
}
tags = merge(var.tags, {
Purpose = "lifecycle-processing"
})
}
# SNS subscription for lifecycle hook (the Lambda also needs an aws_lambda_permission allowing SNS to invoke it)
resource "aws_sns_topic_subscription" "lifecycle_lambda" {
topic_arn = aws_sns_topic.lifecycle_notifications.arn
protocol = "lambda"
endpoint = aws_lambda_function.lifecycle_processor.arn
}
# CloudWatch custom metrics for deployment tracking
resource "aws_cloudwatch_metric_alarm" "deployment_success_rate" {
alarm_name = "${var.project_name}-${var.environment}-deployment-success"
comparison_operator = "LessThanThreshold"
evaluation_periods = "2"
metric_name = "DeploymentSuccessRate"
namespace = "Custom/Deployment"
period = "300"
statistic = "Average"
threshold = var.deployment_success_threshold
alarm_description = "This alarm monitors deployment success rate"
dimensions = {
DeploymentId = random_uuid.deployment_id.result
Environment = var.environment
}
alarm_actions = [var.alert_sns_topic_arn]
tags = var.tags
}
Feature Toggle Infrastructure
Feature toggles allow runtime control of application features without redeployment.
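Before the module itself, here is a hedged sketch of what a flag record looks like in practice: a flag can be seeded directly from Terraform with `aws_dynamodb_table_item`. The flag name and attributes below are illustrative, not part of the module's contract.

```hcl
# Hypothetical example: seed an initial feature flag (flag name and attributes are illustrative)
resource "aws_dynamodb_table_item" "new_checkout_flow" {
  table_name = aws_dynamodb_table.feature_flags.name
  hash_key   = aws_dynamodb_table.feature_flags.hash_key
  item = jsonencode({
    feature_name = { S = "new-checkout-flow" }
    environment  = { S = "prod" }
    enabled      = { BOOL = false }   # flip to true at runtime, no redeploy needed
    rollout_pct  = { N = "0" }        # percentage-based rollout, read by the application
  })
}
```

Managing flag values through Terraform works for defaults; day-to-day toggling is usually done through the management API defined below so changes take effect without a plan/apply cycle.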
# modules/feature-toggles/main.tf
# DynamoDB table for feature flags
resource "aws_dynamodb_table" "feature_flags" {
name = "${var.project_name}-${var.environment}-feature-flags"
billing_mode = "PAY_PER_REQUEST"
hash_key = "feature_name"
stream_enabled = true
stream_view_type = "NEW_AND_OLD_IMAGES"
attribute {
name = "feature_name"
type = "S"
}
attribute {
name = "environment"
type = "S"
}
global_secondary_index {
name = "environment-index"
hash_key = "environment"
projection_type = "ALL"
}
ttl {
attribute_name = "ttl"
enabled = true
}
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-feature-flags"
Purpose = "feature-toggles"
})
}
# Lambda function for feature flag management
resource "aws_lambda_function" "feature_flag_manager" {
filename = "feature_flag_manager.zip"
function_name = "${var.project_name}-${var.environment}-feature-flag-manager"
role = aws_iam_role.feature_flag_lambda.arn
handler = "index.handler"
runtime = "python3.9"
timeout = 30
environment {
variables = {
FEATURE_FLAGS_TABLE = aws_dynamodb_table.feature_flags.name
ENVIRONMENT = var.environment
CACHE_TTL_SECONDS = var.cache_ttl_seconds
}
}
tags = merge(var.tags, {
Purpose = "feature-flag-management"
})
}
# API Gateway for feature flag API
resource "aws_apigatewayv2_api" "feature_flags" {
name = "${var.project_name}-${var.environment}-feature-flags-api"
protocol_type = "HTTP"
description = "Feature flags management API"
cors_configuration {
allow_credentials = false
allow_headers = ["content-type", "x-amz-date", "authorization"]
allow_methods = ["GET", "POST", "PUT", "DELETE", "OPTIONS"]
allow_origins = var.allowed_origins
expose_headers = ["date", "keep-alive"]
max_age = 86400
}
tags = var.tags
}
# ElastiCache for feature flag caching
resource "aws_elasticache_subnet_group" "feature_flags" {
name = "${var.project_name}-${var.environment}-ff-cache-subnet-group"
subnet_ids = var.private_subnet_ids
tags = var.tags
}
resource "aws_elasticache_replication_group" "feature_flags" {
replication_group_id = "${var.project_name}-${var.environment}-ff-cache"
description = "Feature flags cache"
node_type = var.cache_node_type
port = 6379
parameter_group_name = "default.redis7"
num_cache_clusters = var.cache_num_nodes
automatic_failover_enabled = var.cache_num_nodes > 1
subnet_group_name = aws_elasticache_subnet_group.feature_flags.name
security_group_ids = [aws_security_group.feature_flags_cache.id]
at_rest_encryption_enabled = true
transit_encryption_enabled = true
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-ff-cache"
Purpose = "feature-flags-cache"
})
}
# Security group for cache
resource "aws_security_group" "feature_flags_cache" {
name_prefix = "${var.project_name}-${var.environment}-ff-cache-"
vpc_id = var.vpc_id
ingress {
from_port = 6379
to_port = 6379
protocol = "tcp"
security_groups = var.application_security_group_ids
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-ff-cache-sg"
})
}
# Lambda function for cache invalidation
resource "aws_lambda_function" "cache_invalidation" {
filename = "cache_invalidation.zip"
function_name = "${var.project_name}-${var.environment}-cache-invalidation"
role = aws_iam_role.cache_invalidation_lambda.arn
handler = "index.handler"
runtime = "python3.9"
timeout = 30
environment {
variables = {
REDIS_ENDPOINT = aws_elasticache_replication_group.feature_flags.primary_endpoint_address
REDIS_PORT = aws_elasticache_replication_group.feature_flags.port
}
}
tags = merge(var.tags, {
Purpose = "cache-invalidation"
})
}
# DynamoDB Stream trigger for cache invalidation
resource "aws_lambda_event_source_mapping" "dynamodb_stream" {
event_source_arn = aws_dynamodb_table.feature_flags.stream_arn
function_name = aws_lambda_function.cache_invalidation.arn
starting_position = "LATEST"
filter_criteria {
filter {
pattern = jsonencode({
eventName = ["INSERT", "MODIFY", "REMOVE"]
})
}
}
}
# CloudWatch dashboard for feature flag metrics
resource "aws_cloudwatch_dashboard" "feature_flags" {
dashboard_name = "${var.project_name}-${var.environment}-feature-flags"
dashboard_body = jsonencode({
widgets = [
{
type = "metric"
x = 0
y = 0
width = 12
height = 6
properties = {
metrics = [
["AWS/DynamoDB", "ConsumedReadCapacityUnits", "TableName", aws_dynamodb_table.feature_flags.name],
[".", "ConsumedWriteCapacityUnits", ".", "."],
]
view = "timeSeries"
stacked = false
region = var.aws_region
title = "DynamoDB Feature Flags Usage"
period = 300
}
},
{
type = "metric"
x = 0
y = 6
width = 12
height = 6
properties = {
metrics = [
["AWS/ElastiCache", "CacheHits", "CacheClusterId", aws_elasticache_replication_group.feature_flags.replication_group_id],
[".", "CacheMisses", ".", "."],
]
view = "timeSeries"
stacked = false
region = var.aws_region
title = "ElastiCache Performance"
period = 300
}
}
]
})
tags = var.tags
}
Immutable Infrastructure Pattern
Immutable infrastructure treats infrastructure components as disposable, creating new instances rather than modifying existing ones.
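The essential mechanic can be reduced to a minimal sketch: bake the release version into the resource name so every release produces new resources, and let `create_before_destroy` stand up replacements before old ones are removed. The names and variables below are illustrative, simpler than the full module that follows.

```hcl
# Minimal immutable-replacement sketch (illustrative names and variables)
resource "aws_launch_template" "app" {
  # Embedding the version in the name forces a new template per release
  name_prefix   = "app-${var.app_version}-"
  image_id      = var.ami_id   # a freshly baked AMI per release, never patched in place
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true # the replacement exists before the old resource is destroyed
  }
}
```

The full module below adds the remaining pieces: Packer-driven AMI builds, rolling instance refresh in the Auto Scaling Group, and cleanup of superseded AMIs.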
# modules/immutable-infrastructure/main.tf
locals {
# Generate unique identifiers for immutable deployments
deployment_id = "${var.app_version}-${substr(sha256(jsonencode({
ami_id = var.ami_id
app_version = var.app_version
config_hash = var.config_hash
timestamp = var.deployment_timestamp
})), 0, 8)}"
}
# AMI builder using Packer (triggered by Terraform)
resource "null_resource" "ami_builder" {
triggers = {
app_version = var.app_version
base_ami = var.base_ami_id
config_hash = var.config_hash
}
provisioner "local-exec" {
command = <<-EOT
packer build \
-var 'app_version=${var.app_version}' \
-var 'base_ami=${var.base_ami_id}' \
-var 'region=${var.aws_region}' \
-var 'deployment_id=${local.deployment_id}' \
${path.module}/packer/app-ami.pkr.hcl
EOT
}
}
# Data source to get the newly built AMI
data "aws_ami" "app" {
depends_on = [null_resource.ami_builder]
most_recent = true
owners = ["self"]
filter {
name = "name"
values = ["${var.project_name}-${var.environment}-${local.deployment_id}"]
}
filter {
name = "state"
values = ["available"]
}
}
# Launch template for immutable infrastructure
resource "aws_launch_template" "immutable" {
name_prefix = "${var.project_name}-${var.environment}-${local.deployment_id}-"
image_id = data.aws_ami.app.id
instance_type = var.instance_type
key_name = var.key_name
vpc_security_group_ids = var.security_group_ids
# Minimal user data since everything is baked into AMI
user_data = base64encode(templatefile("${path.module}/templates/immutable-user-data.sh.tpl", {
deployment_id = local.deployment_id
environment = var.environment
log_group_name = aws_cloudwatch_log_group.deployment.name
region = var.aws_region
}))
# Instance metadata
metadata_options {
http_endpoint = "enabled"
http_tokens = "required"
}
# Block device mappings
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = var.root_volume_size
volume_type = "gp3"
encrypted = true
delete_on_termination = true
}
}
# IAM instance profile
iam_instance_profile {
name = var.instance_profile_name
}
# Tags
tag_specifications {
resource_type = "instance"
tags = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-${local.deployment_id}"
DeploymentId = local.deployment_id
AppVersion = var.app_version
AMI = data.aws_ami.app.id
Immutable = "true"
})
}
lifecycle {
create_before_destroy = true
}
}
# Auto Scaling Group with immutable updates
resource "aws_autoscaling_group" "immutable" {
name = "${var.project_name}-${var.environment}-${local.deployment_id}"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = var.target_group_arns
min_size = var.min_instances
max_size = var.max_instances
desired_capacity = var.desired_instances
launch_template {
id = aws_launch_template.immutable.id
version = "$Latest"
}
# Health checks
health_check_type = "ELB"
health_check_grace_period = var.health_check_grace_period
# Instance refresh for immutable updates
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
instance_warmup = var.instance_warmup_seconds
}
}
# Lifecycle management
termination_policies = ["OldestInstance"]
# Tags
dynamic "tag" {
for_each = merge(var.tags, {
Name = "${var.project_name}-${var.environment}-${local.deployment_id}"
DeploymentId = local.deployment_id
AppVersion = var.app_version
Immutable = "true"
})
content {
key = tag.key
value = tag.value
propagate_at_launch = true
}
}
lifecycle {
create_before_destroy = true
}
}
# CloudWatch Log Group for deployment tracking
resource "aws_cloudwatch_log_group" "deployment" {
name = "/aws/deployment/${var.project_name}-${var.environment}"
retention_in_days = var.log_retention_days
tags = merge(var.tags, {
DeploymentId = local.deployment_id
})
}
# CloudWatch custom metrics for deployment tracking
resource "aws_cloudwatch_log_metric_filter" "deployment_events" {
name = "${var.project_name}-${var.environment}-deployment-events"
log_group_name = aws_cloudwatch_log_group.deployment.name
pattern = "[timestamp, level=\"INFO\", message=\"DEPLOYMENT_*\", ...]"
metric_transformation {
name = "DeploymentEvents"
namespace = "Custom/ImmutableDeployment"
value = "1"
default_value = "0"
}
}
# Lambda function for old AMI cleanup
resource "aws_lambda_function" "ami_cleanup" {
filename = "ami_cleanup.zip"
function_name = "${var.project_name}-${var.environment}-ami-cleanup"
role = aws_iam_role.ami_cleanup_lambda.arn
handler = "index.handler"
runtime = "python3.9"
timeout = 300
environment {
variables = {
PROJECT_NAME = var.project_name
ENVIRONMENT = var.environment
RETAIN_VERSIONS = var.retain_ami_versions
DRY_RUN = var.ami_cleanup_dry_run
}
}
tags = merge(var.tags, {
Purpose = "ami-cleanup"
})
}
# EventBridge rule for scheduled AMI cleanup
resource "aws_cloudwatch_event_rule" "ami_cleanup_schedule" {
name = "${var.project_name}-${var.environment}-ami-cleanup"
description = "Trigger AMI cleanup for immutable infrastructure"
schedule_expression = "rate(${var.ami_cleanup_interval_hours} hours)"
tags = var.tags
}
resource "aws_cloudwatch_event_target" "ami_cleanup_target" {
rule = aws_cloudwatch_event_rule.ami_cleanup_schedule.name
target_id = "AMICleanupLambdaTarget"
arn = aws_lambda_function.ami_cleanup.arn
}
# SNS topic for deployment notifications
resource "aws_sns_topic" "deployment_notifications" {
name = "${var.project_name}-${var.environment}-deployment-notifications"
tags = merge(var.tags, {
Purpose = "deployment-notifications"
})
}
# Lambda function for deployment notifications
resource "aws_lambda_function" "deployment_notifier" {
filename = "deployment_notifier.zip"
function_name = "${var.project_name}-${var.environment}-deployment-notifier"
role = aws_iam_role.deployment_notifier_lambda.arn
handler = "index.handler"
runtime = "python3.9"
timeout = 60
environment {
variables = {
SNS_TOPIC_ARN = aws_sns_topic.deployment_notifications.arn
SLACK_WEBHOOK = var.slack_webhook_url
DEPLOYMENT_ID = local.deployment_id
APP_VERSION = var.app_version
ENVIRONMENT = var.environment
}
}
tags = merge(var.tags, {
Purpose = "deployment-notifications"
})
}
Progressive Delivery with Observability
Progressive delivery combines deployment strategies with comprehensive observability to make data-driven decisions about rollouts.
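The traffic-shifting half of progressive delivery is typically a weighted forward action in front of two target groups; the orchestrator then adjusts the weights based on the metrics it observes. A hedged sketch, assuming stable and canary target groups already exist (the variable names and weights are illustrative):

```hcl
# Hypothetical canary split: 90% of traffic to stable, 10% to canary
resource "aws_lb_listener_rule" "canary" {
  listener_arn = var.listener_arn
  priority     = 10

  action {
    type = "forward"
    forward {
      target_group {
        arn    = var.stable_target_group_arn
        weight = 90
      }
      target_group {
        arn    = var.canary_target_group_arn
        weight = 10
      }
    }
  }

  condition {
    path_pattern {
      values = ["/*"]
    }
  }
}
```

The module below supplies the observability and decision-making side: dashboards, X-Ray sampling, and a Step Functions workflow that promotes or rolls back based on measured success rates.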
# modules/progressive-delivery/main.tf
# CloudWatch Observability
resource "aws_cloudwatch_dashboard" "progressive_delivery" {
dashboard_name = "${var.project_name}-${var.environment}-progressive-delivery"
dashboard_body = jsonencode({
widgets = [
{
type = "metric"
x = 0
y = 0
width = 8
height = 6
properties = {
metrics = [
["AWS/ApplicationELB", "TargetResponseTime", "LoadBalancer", var.load_balancer_arn_suffix],
[".", "HTTPCode_Target_2XX_Count", ".", "."],
[".", "HTTPCode_Target_4XX_Count", ".", "."],
[".", "HTTPCode_Target_5XX_Count", ".", "."]
]
view = "timeSeries"
stacked = false
region = var.aws_region
title = "Application Performance Metrics"
period = 300
}
},
{
type = "metric"
x = 8
y = 0
width = 8
height = 6
properties = {
metrics = [
["Custom/Application", "BusinessMetric1", "Environment", var.environment],
[".", "BusinessMetric2", ".", "."],
[".", "ConversionRate", ".", "."]
]
view = "timeSeries"
stacked = false
region = var.aws_region
title = "Business Metrics"
period = 300
}
}
]
})
tags = var.tags
}
# X-Ray tracing for observability
resource "aws_xray_sampling_rule" "progressive_delivery" {
rule_name = "${var.project_name}-${var.environment}-progressive-delivery"
priority = 9000
version = 1
reservoir_size = 1
fixed_rate = 0.1
url_path = "*"
host = "*"
http_method = "*"
service_type = "*"
service_name = "*"
resource_arn = "*"
tags = var.tags
}
# Lambda function for deployment decision making
resource "aws_lambda_function" "deployment_orchestrator" {
filename = "deployment_orchestrator.zip"
function_name = "${var.project_name}-${var.environment}-deployment-orchestrator"
role = aws_iam_role.deployment_orchestrator.arn
handler = "index.handler"
runtime = "python3.9"
timeout = 900
environment {
variables = {
PROJECT_NAME = var.project_name
ENVIRONMENT = var.environment
CLOUDWATCH_NAMESPACE = "Custom/ProgressiveDelivery"
ROLLBACK_THRESHOLD = var.rollback_threshold
SUCCESS_CRITERIA = jsonencode(var.success_criteria)
NOTIFICATION_TOPIC = aws_sns_topic.deployment_notifications.arn
}
}
tags = merge(var.tags, {
Purpose = "deployment-orchestration"
})
}
# Step Functions for deployment workflow
resource "aws_sfn_state_machine" "progressive_deployment" {
name = "${var.project_name}-${var.environment}-progressive-deployment"
role_arn = aws_iam_role.step_functions.arn
definition = jsonencode({
Comment = "Progressive deployment workflow"
StartAt = "InitiateDeployment"
States = {
InitiateDeployment = {
Type = "Task"
Resource = aws_lambda_function.deployment_orchestrator.arn
Parameters = {
action = "initiate"
deployment_config = var.deployment_config
}
Next = "WaitForStabilization"
}
WaitForStabilization = {
Type = "Wait"
Seconds = var.stabilization_wait_seconds
Next = "EvaluateMetrics"
}
EvaluateMetrics = {
Type = "Task"
Resource = aws_lambda_function.deployment_orchestrator.arn
Parameters = {
action = "evaluate"
}
Next = "DecisionGate"
}
DecisionGate = {
Type = "Choice"
Choices = [
{
Variable = "$.metrics.success_rate"
NumericGreaterThan = var.success_threshold
Next = "ProceedDeployment"
}
]
Default = "RollbackDeployment"
}
ProceedDeployment = {
Type = "Task"
Resource = aws_lambda_function.deployment_orchestrator.arn
Parameters = {
action = "proceed"
}
Next = "CheckCompletion"
}
CheckCompletion = {
Type = "Choice"
Choices = [
{
Variable = "$.deployment.complete"
BooleanEquals = true
Next = "DeploymentSuccess"
}
]
Default = "WaitForStabilization"
}
RollbackDeployment = {
Type = "Task"
Resource = aws_lambda_function.deployment_orchestrator.arn
Parameters = {
action = "rollback"
}
Next = "DeploymentFailed"
}
DeploymentSuccess = {
Type = "Succeed"
}
DeploymentFailed = {
Type = "Fail"
Cause = "Deployment failed metrics evaluation"
}
}
})
tags = var.tags
}
# EventBridge rule for automated deployment triggers
resource "aws_cloudwatch_event_rule" "deployment_trigger" {
name = "${var.project_name}-${var.environment}-deployment-trigger"
description = "Trigger progressive deployment workflow"
event_pattern = jsonencode({
source = ["custom.deployment"]
detail-type = ["Deployment Request"]
detail = {
environment = [var.environment]
project = [var.project_name]
}
})
tags = var.tags
}
resource "aws_cloudwatch_event_target" "deployment_workflow" {
rule = aws_cloudwatch_event_rule.deployment_trigger.name
target_id = "ProgressiveDeploymentWorkflow"
arn = aws_sfn_state_machine.progressive_deployment.arn
role_arn = aws_iam_role.eventbridge_sfn.arn
}
# Custom CloudWatch metrics for deployment tracking
resource "aws_cloudwatch_metric_alarm" "deployment_success_rate" {
alarm_name = "${var.project_name}-${var.environment}-deployment-success-rate"
comparison_operator = "LessThanThreshold"
evaluation_periods = "3"
metric_name = "SuccessRate"
namespace = "Custom/ProgressiveDelivery"
period = "300"
statistic = "Average"
threshold = var.success_rate_threshold
alarm_description = "Deployment success rate below threshold"
dimensions = {
Environment = var.environment
DeploymentId = var.deployment_id
}
alarm_actions = [
aws_sns_topic.deployment_notifications.arn,
aws_lambda_function.emergency_rollback.arn
]
tags = var.tags
}
# Emergency rollback Lambda
resource "aws_lambda_function" "emergency_rollback" {
filename = "emergency_rollback.zip"
function_name = "${var.project_name}-${var.environment}-emergency-rollback"
role = aws_iam_role.emergency_rollback.arn
handler = "index.handler"
runtime = "python3.9"
timeout = 300
environment {
variables = {
PROJECT_NAME = var.project_name
ENVIRONMENT = var.environment
STATE_MACHINE = aws_sfn_state_machine.progressive_deployment.arn
}
}
tags = merge(var.tags, {
Purpose = "emergency-rollback"
})
}
18. Real-world Projects
Project 1: Three-Tier Web Application
graph TB
A[Internet] --> B[Application Load Balancer]
B --> C[Web Tier - Auto Scaling Group]
C --> D[Application Tier - Auto Scaling Group]
D --> E[Database Tier - RDS Multi-AZ]
F[Route 53] --> A
G[CloudFront CDN] --> A
H[S3 Static Assets] --> G
C --> I[ElastiCache Redis]
D --> I
J[NAT Gateway] --> K[Internet Gateway]
C --> J
D --> J
# environments/prod/main.tf
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
backend "s3" {
bucket = "company-terraform-state-prod"
key = "web-app/terraform.tfstate"
region = "us-west-2"
dynamodb_table = "terraform-state-locks"
encrypt = true
}
}
locals {
project_name = "web-app"
environment = "prod"
common_tags = {
Project = local.project_name
Environment = local.environment
ManagedBy = "Terraform"
Owner = "Platform Team"
CostCenter = "Engineering"
}
}
# Data sources
data "aws_availability_zones" "available" {
state = "available"
}
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
}
# VPC Module
module "vpc" {
source = "../../modules/vpc"
name = local.project_name
environment = local.environment
cidr_block = "10.0.0.0/16"
availability_zones = slice(data.aws_availability_zones.available.names, 0, 3)
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
private_subnet_cidrs = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
database_subnet_cidrs = ["10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24"]
enable_nat_gateway = true
enable_vpn_gateway = false
tags = local.common_tags
}
# Security Groups Module
module "security_groups" {
source = "../../modules/security-groups"
name = local.project_name
vpc_id = module.vpc.vpc_id
vpc_cidr = module.vpc.vpc_cidr_block
tags = local.common_tags
}
# Load Balancer Module
module "alb" {
source = "../../modules/alb"
name = local.project_name
environment = local.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.public_subnet_ids
security_groups = [module.security_groups.alb_security_group_id]
certificate_arn = aws_acm_certificate_validation.web_cert.certificate_arn
tags = local.common_tags
}
# Auto Scaling Groups
module "web_asg" {
source = "../../modules/auto-scaling"
name = "${local.project_name}-web"
environment = local.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnet_ids
ami_id = data.aws_ami.amazon_linux.id
instance_type = "t3.medium"
key_name = aws_key_pair.web_app.key_name
security_groups = [module.security_groups.web_security_group_id]
min_size = 3
max_size = 10
desired_capacity = 5
target_group_arns = [module.alb.target_group_arn]
user_data = base64encode(templatefile("${path.module}/user-data/web-tier.sh", {
environment = local.environment
app_config = local.app_config
}))
tags = local.common_tags
}
module "app_asg" {
source = "../../modules/auto-scaling"
name = "${local.project_name}-app"
environment = local.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnet_ids
ami_id = data.aws_ami.amazon_linux.id
instance_type = "t3.large"
key_name = aws_key_pair.web_app.key_name
security_groups = [module.security_groups.app_security_group_id]
min_size = 2
max_size = 8
desired_capacity = 4
user_data = base64encode(templatefile("${path.module}/user-data/app-tier.sh", {
environment = local.environment
database_host = module.rds.endpoint
redis_host = module.elasticache.primary_endpoint
app_config = local.app_config
}))
tags = local.common_tags
}
# Database Module
module "rds" {
source = "../../modules/rds"
name = local.project_name
environment = local.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.database_subnet_ids
engine_version = "8.0.35"
instance_class = "db.r5.xlarge"
allocated_storage = 500
database_name = "webapp"
username = "admin"
password = random_password.db_password.result
backup_retention_period = 30
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
multi_az = true
deletion_protection = true
skip_final_snapshot = false
security_group_ids = [module.security_groups.database_security_group_id]
tags = local.common_tags
}
# ElastiCache Module
module "elasticache" {
source = "../../modules/elasticache"
name = local.project_name
environment = local.environment
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnet_ids
node_type = "cache.r5.large"
num_cache_clusters = 2
parameter_group_name = "default.redis7"
security_group_ids = [module.security_groups.cache_security_group_id]
tags = local.common_tags
}
# CloudFront and S3 for static assets
module "cdn" {
source = "../../modules/cloudfront"
name = local.project_name
environment = local.environment
origin_domain_name = module.alb.dns_name
s3_bucket_name = "${local.project_name}-${local.environment}-static-assets-${random_id.bucket_suffix.hex}"
certificate_arn = aws_acm_certificate.cloudfront_cert.arn
tags = local.common_tags
}
# Supporting resources
resource "random_password" "db_password" {
length = 32
special = true
}
resource "aws_secretsmanager_secret" "db_password" {
name = "${local.project_name}/${local.environment}/database/password"
description = "Database password for ${local.project_name}"
tags = local.common_tags
}
resource "aws_secretsmanager_secret_version" "db_password" {
secret_id = aws_secretsmanager_secret.db_password.id
secret_string = jsonencode({
password = random_password.db_password.result
})
}
resource "random_id" "bucket_suffix" {
byte_length = 4
}
# SSL Certificates
resource "aws_acm_certificate" "web_cert" {
domain_name = var.domain_name
subject_alternative_names = ["*.${var.domain_name}"]
validation_method = "DNS"
lifecycle {
create_before_destroy = true
}
tags = local.common_tags
}
resource "aws_acm_certificate" "cloudfront_cert" {
provider = aws.us_east_1
domain_name = var.domain_name
subject_alternative_names = ["*.${var.domain_name}"]
validation_method = "DNS"
lifecycle {
create_before_destroy = true
}
tags = local.common_tags
}
# Route 53
resource "aws_route53_zone" "main" {
name = var.domain_name
tags = local.common_tags
}
resource "aws_route53_record" "web_app" {
zone_id = aws_route53_zone.main.zone_id
name = var.domain_name
type = "A"
alias {
name = module.cdn.cloudfront_domain_name
zone_id = module.cdn.cloudfront_hosted_zone_id
evaluate_target_health = false
}
}
# Monitoring and Alerting
module "monitoring" {
source = "../../modules/monitoring"
name = local.project_name
environment = local.environment
alb_arn = module.alb.arn
asg_names = [module.web_asg.name, module.app_asg.name]
rds_instance_id = module.rds.instance_id
elasticache_cluster_id = module.elasticache.cluster_id
notification_email = var.notification_email
tags = local.common_tags
}
19. Appendices and Quick References
Appendix A: Terraform Command Reference
Essential Commands Cheat Sheet
# Initialization and Setup
terraform init # Initialize working directory
terraform init -upgrade # Upgrade providers to latest version
terraform init -reconfigure # Reconfigure backend
terraform init -backend=false # Skip backend initialization
# Planning and Validation
terraform validate # Validate configuration syntax
terraform fmt # Format configuration files
terraform fmt -check # Check if files are formatted
terraform fmt -diff # Show formatting differences
terraform plan # Create execution plan
terraform plan -out=tfplan # Save plan to file
terraform plan -target=resource # Plan for specific resource
terraform plan -var="key=value" # Override variable values
terraform plan -refresh=false # Skip state refresh
# Applying Changes
terraform apply # Apply changes
terraform apply tfplan # Apply saved plan
terraform apply -auto-approve # Apply without confirmation
terraform apply -target=resource # Apply specific resource
terraform apply -parallelism=10 # Set parallel operations
# State Management
terraform state list # List resources in state
terraform state show resource # Show resource details
terraform state mv old new # Move resource in state
terraform state rm resource # Remove resource from state
terraform state pull # Download state file
terraform state push state.json # Upload state file
terraform refresh # Update state from infrastructure
# Import and Output
terraform import resource id # Import existing resource
terraform output # Show all outputs
terraform output name # Show specific output
terraform output -json # JSON format output
# Workspace Management
terraform workspace list # List workspaces
terraform workspace new name # Create workspace
terraform workspace select name # Switch workspace
terraform workspace show # Show current workspace
terraform workspace delete name # Delete workspace
# Cleanup
terraform destroy # Destroy infrastructure
terraform destroy -target=resource # Destroy specific resource
terraform destroy -auto-approve # Destroy without confirmation
# Advanced Operations
terraform force-unlock LOCK_ID # Force unlock state
terraform taint resource # Mark resource for recreation
terraform untaint resource # Remove taint from resource
terraform graph # Generate dependency graph
terraform providers # Show provider requirements
terraform version # Show Terraform version
Appendix B: HCL Language Reference
Data Types and Syntax
# Basic Data Types
variable "string_example" {
type = string
default = "Hello World"
}
variable "number_example" {
type = number
default = 42
}
variable "boolean_example" {
type = bool
default = true
}
# Collection Types
variable "list_example" {
type = list(string)
default = ["item1", "item2", "item3"]
}
variable "map_example" {
type = map(string)
default = {
key1 = "value1"
key2 = "value2"
}
}
variable "set_example" {
type = set(string)
default = ["unique1", "unique2"]
}
# Complex Types
variable "object_example" {
type = object({
name = string
age = number
active = bool
tags = map(string)
})
default = {
name = "example"
age = 30
active = true
tags = { environment = "dev" }
}
}
variable "tuple_example" {
type = tuple([string, number, bool])
default = ["example", 42, true]
}
# Operators
locals {
# Arithmetic
sum = 5 + 3
difference = 10 - 4
product = 6 * 7
quotient = 20 / 4
modulo = 17 % 5
# Comparison
equal = 5 == 5
not_equal = 5 != 3
less_than = 3 < 5
less_than_equal = 5 <= 5
greater_than = 7 > 5
greater_than_equal = 5 >= 5
# Logical
and_operation = true && false
or_operation = true || false
not_operation = !true
# Conditional
conditional = var.environment == "prod" ? "production" : "development"
}
Built-in Functions Reference
locals {
# String Functions
upper_text = upper("hello") # "HELLO"
lower_text = lower("WORLD") # "world"
title_text = title("hello world") # "Hello World"
trim_text = trim(" hello ", " ") # "hello"
split_text = split(",", "a,b,c") # ["a", "b", "c"]
join_text = join(",", ["a", "b"]) # "a,b"
format_text = format("Hello %s", "World") # "Hello World"
substr_text = substr("hello", 1, 3) # "ell"
replace_text = replace("hello", "l", "x") # "hexxo"
regex_text = regex("[0-9]+", "abc123") # "123"
# Numeric Functions
max_value = max(1, 2, 3) # 3
min_value = min(1, 2, 3) # 1
abs_value = abs(-5) # 5
ceil_value = ceil(4.3) # 5
floor_value = floor(4.7) # 4
# Collection Functions
length_list = length(["a", "b", "c"]) # 3
element_at = element(["a", "b", "c"], 1) # "b"
index_of = index(["a", "b", "c"], "b") # 1
concat_lists = concat(["a"], ["b", "c"]) # ["a", "b", "c"]
distinct_list = distinct(["a", "b", "a"]) # ["a", "b"]
flatten_list = flatten([["a"], ["b", "c"]]) # ["a", "b", "c"]
reverse_list = reverse(["a", "b", "c"]) # ["c", "b", "a"]
sort_list = sort(["c", "a", "b"]) # ["a", "b", "c"]
# Map Functions
keys_map = keys({a = 1, b = 2}) # ["a", "b"]
values_map = values({a = 1, b = 2}) # [1, 2]
merge_maps = merge({a = 1}, {b = 2}) # {a = 1, b = 2}
lookup_value = lookup({a = 1, b = 2}, "a", 0) # 1
# Type Conversion
to_string = tostring(42) # "42"
to_number = tonumber("42") # 42
to_bool = tobool("true") # true
to_list = tolist(["a", "b"]) # ["a", "b"]
to_set = toset(["a", "b", "a"]) # ["a", "b"]
to_map = tomap({a = "1", b = "2"}) # {a = "1", b = "2"}
# Date/Time Functions
timestamp_now = timestamp() # Current timestamp
time_add = timeadd(timestamp(), "1h") # Add 1 hour
format_date = formatdate("YYYY-MM-DD", timestamp()) # Format date
# Encoding Functions
base64_encode = base64encode("hello") # "aGVsbG8="
base64_decode = base64decode("aGVsbG8=") # "hello"
json_encode = jsonencode({a = 1}) # "{\"a\":1}"
json_decode = jsondecode("{\"a\":1}") # {a = 1}
url_encode = urlencode("hello world") # "hello%20world"
# File Functions
file_content = file("${path.module}/file.txt")
template_file = templatefile("${path.module}/template.tpl", {
name = "World"
})
# Network Functions
cidr_host = cidrhost("10.0.0.0/24", 1) # "10.0.0.1"
cidr_netmask = cidrnetmask("10.0.0.0/24") # "255.255.255.0"
cidr_subnet = cidrsubnet("10.0.0.0/16", 8, 1) # "10.0.1.0/24"
cidr_subnets = cidrsubnets("10.0.0.0/16", 8, 8, 8) # Multiple subnets
# Hash Functions
md5_hash = md5("hello") # MD5 hash
sha1_hash = sha1("hello") # SHA1 hash
sha256_hash = sha256("hello") # SHA256 hash
sha512_hash = sha512("hello") # SHA512 hash
# UUID Functions
uuid_v4 = uuidv4() # Generate UUIDv4
uuid_v5 = uuidv5("dns", "example.com") # Generate UUIDv5
# Validation Functions
can_function = can(regex("^[0-9]+$", "123")) # true if valid
try_function = try(tonumber("abc"), 0) # Returns 0 if conversion fails
}
Appendix C: AWS Resource Quick Reference
Common AWS Resources
# VPC Resources
resource "aws_vpc" "example" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = { Name = "example-vpc" }
}
resource "aws_subnet" "example" {
vpc_id = aws_vpc.example.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-west-2a"
map_public_ip_on_launch = true
tags = { Name = "example-subnet" }
}
resource "aws_internet_gateway" "example" {
vpc_id = aws_vpc.example.id
tags = { Name = "example-igw" }
}
resource "aws_route_table" "example" {
vpc_id = aws_vpc.example.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.example.id
}
tags = { Name = "example-rt" }
}
# Security Groups
resource "aws_security_group" "web" {
name_prefix = "web-"
vpc_id = aws_vpc.example.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = { Name = "web-sg" }
}
# EC2 Instances
resource "aws_instance" "example" {
ami = "ami-0c02fb55956c7d316"
instance_type = "t3.micro"
subnet_id = aws_subnet.example.id
vpc_security_group_ids = [aws_security_group.web.id]
user_data = base64encode(<<-EOF
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
EOF
)
tags = { Name = "example-instance" }
}
# Load Balancer
resource "aws_lb" "example" {
name = "example-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.web.id]
subnets = [aws_subnet.example.id] # note: AWS requires ALBs to span at least two subnets in different AZs
tags = { Name = "example-alb" }
}
# RDS Database
resource "aws_db_instance" "example" {
identifier = "example-db"
engine = "mysql"
engine_version = "8.0"
instance_class = "db.t3.micro"
allocated_storage = 20
storage_encrypted = true
db_name = "exampledb"
username = "admin"
password = random_password.db_password.result
vpc_security_group_ids = [aws_security_group.db.id]
db_subnet_group_name = aws_db_subnet_group.example.name
backup_retention_period = 7
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
skip_final_snapshot = true
tags = { Name = "example-db" }
}
# S3 Bucket
resource "aws_s3_bucket" "example" {
bucket = "example-bucket-${random_id.bucket_suffix.hex}"
tags = { Name = "example-bucket" }
}
resource "aws_s3_bucket_versioning" "example" {
bucket = aws_s3_bucket.example.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "example" {
bucket = aws_s3_bucket.example.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}

Appendix D: Error Troubleshooting Guide
Common Errors and Solutions
flowchart TD
A[Terraform Error] --> B{Error Type}
B -->|State Lock| C[State Lock Error]
B -->|Provider| D[Provider Error]
B -->|Resource| E[Resource Error]
B -->|Configuration| F[Configuration Error]
C --> C1[terraform force-unlock LOCK_ID]
C --> C2[Check backend permissions]
C --> C3[Verify state backend access]
D --> D1[Check provider credentials]
D --> D2[Verify provider version]
D --> D3[Check network connectivity]
E --> E1[Check resource quotas]
E --> E2[Verify resource dependencies]
E --> E3[Check resource permissions]
F --> F1[terraform validate]
F --> F2[terraform fmt -check]
F --> F3[Check variable values]

Error Reference Table
| Error Message | Cause | Solution |
|---|---|---|
| Error locking state | State file is locked | `terraform force-unlock LOCK_ID` |
| Provider configuration not present | Missing provider block | Add the provider configuration |
| Resource already exists | Resource conflicts | Import the existing resource or rename |
| Invalid resource name | Name doesn't follow conventions | Use a valid naming pattern |
| Cycle in resource dependencies | Circular dependency | Break the dependency cycle |
| Authentication failed | Invalid credentials | Check AWS/Azure/GCP credentials |
| Quota exceeded | Service limits reached | Request a quota increase |
| Variable not declared | Undefined variable | Declare the variable in variables.tf |
| Module not found | Incorrect module source | Verify the module source path |
| Syntax error | Invalid HCL syntax | Run `terraform validate` |
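For the "Resource already exists" case, Terraform 1.5+ can adopt the existing object declaratively with an `import` block; a minimal sketch (the bucket and resource names here are placeholders):

```hcl
# Adopt an object that already exists in AWS into Terraform state.
# Run `terraform plan` to preview the import, then `terraform apply`.
import {
  to = aws_s3_bucket.legacy      # the resource block that will manage it
  id = "my-existing-bucket-name" # provider-specific import ID (placeholder)
}

resource "aws_s3_bucket" "legacy" {
  bucket = "my-existing-bucket-name"
}
```

On older Terraform versions, the CLI equivalent is `terraform import aws_s3_bucket.legacy my-existing-bucket-name`.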
Appendix E: Best Practices Checklist
Pre-deployment Checklist
- Code Review
- Terraform configuration follows naming conventions
- All resources have appropriate tags
- Sensitive data is not hardcoded
- Provider versions are pinned
- Security Review
- IAM policies follow least privilege principle
- Security groups have minimal required access
- Encryption is enabled where applicable
- Secrets are managed through secure stores
- Testing
- Configuration passes terraform validate
- Code is formatted with terraform fmt
- Static analysis tools pass (TFLint, Checkov)
- Unit tests pass (if using Terratest)
- Documentation
- README.md is updated
- Variables are documented
- Outputs are documented
- Architecture diagrams are current
- State Management
- Remote state backend is configured
- State locking is enabled
- State backups are configured
- Workspace strategy is defined
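The state-management and version-pinning items above typically translate into a single `terraform` block; a minimal sketch assuming an S3 bucket and DynamoDB lock table that already exist (all names are placeholders):

```hcl
terraform {
  required_version = ">= 1.5.0"

  # Pin provider versions so plans are reproducible across machines.
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  # Remote state with locking: S3 stores the state file,
  # DynamoDB provides the lock table.
  backend "s3" {
    bucket         = "my-terraform-state" # placeholder bucket name
    key            = "prod/terraform.tfstate"
    region         = "us-west-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"    # placeholder table name
  }
}
```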
Post-deployment Checklist
- Verification
- All resources were created successfully
- Application is accessible
- Monitoring is working
- Logs are being collected
- Security
- Access controls are working
- Encryption is active
- Security monitoring is enabled
- Compliance requirements are met
- Operations
- Backup procedures are in place
- Disaster recovery is tested
- Cost monitoring is configured
- Alerting is configured
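Several of the verification checks are easier when key endpoints are exposed as outputs; a sketch assuming the Appendix C resources:

```hcl
# Surface the load balancer endpoint so "Application is accessible"
# can be verified immediately after apply.
output "alb_dns_name" {
  description = "Public DNS name of the application load balancer"
  value       = aws_lb.example.dns_name
}

output "db_endpoint" {
  description = "RDS connection endpoint"
  value       = aws_db_instance.example.endpoint
}
```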
Appendix F: Resource Naming Conventions
Recommended Naming Patterns
# General Pattern: {environment}-{project}-{resource-type}-{identifier}
# Examples:
locals {
# Environment prefixes
env_prefix = {
development = "dev"
staging = "stg"
production = "prod"
}
# Project identifier
project = "myapp"
# Resource naming
vpc_name = "${local.env_prefix[var.environment]}-${local.project}-vpc"
subnet_name = "${local.env_prefix[var.environment]}-${local.project}-subnet"
sg_name = "${local.env_prefix[var.environment]}-${local.project}-sg"
ec2_name = "${local.env_prefix[var.environment]}-${local.project}-ec2"
rds_name = "${local.env_prefix[var.environment]}-${local.project}-db"
s3_name = "${local.env_prefix[var.environment]}-${local.project}-bucket"
# Tag standards
standard_tags = {
Environment = var.environment
Project = local.project
ManagedBy = "Terraform"
Owner = var.team_name
CostCenter = var.cost_center
CreatedDate = formatdate("YYYY-MM-DD", timestamp()) # caution: timestamp() changes on every plan, causing perpetual diffs
}
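  # In practice, standard_tags is combined with per-resource names via merge()
  # (sketch; on key conflicts, merge() gives later arguments precedence):
  vpc_tags = merge(local.standard_tags, { Name = local.vpc_name })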
}

File Organization Standards
terraform-project/
├── main.tf # Main resources
├── variables.tf # Input variables
├── outputs.tf # Output values
├── versions.tf # Provider requirements
├── locals.tf # Local values
├── data.tf # Data sources
├── providers.tf # Provider configurations
├── terraform.tfvars.example # Example variable values
├── README.md # Project documentation
├── .gitignore # Git ignore patterns
├── .terraform-version # Terraform version
├── environments/ # Environment-specific configs
│ ├── dev/
│ ├── staging/
│ └── prod/
├── modules/ # Custom modules
│ ├── vpc/
│ ├── security-groups/
│ └── compute/
└── scripts/ # Helper scripts
├── deploy.sh
└── destroy.sh

This comprehensive guide now includes enhanced sections with practical examples, better organization, and complete appendices that serve as quick references for daily Terraform operations. The guide progresses logically from basic concepts to expert-level implementations, making it suitable for learners at all levels.