DevOps & Deployment

WordPress Infrastructure as Code with Terraform: Provisioning Production-Ready AWS Architecture

Marcus Chen
59 min read

Running WordPress on a single server works until it does not. The moment traffic spikes, a disk fills up, or a PHP process hangs, that lone instance becomes the single point of failure your business never planned for. The answer is not to manually provision more servers and hope the configuration stays consistent. The answer is to treat your infrastructure like software: versioned, tested, repeatable, and reviewed by your team before any change hits production.

Terraform, created by HashiCorp, gives you exactly that capability. You declare what your infrastructure should look like in HCL (HashiCorp Configuration Language), and Terraform figures out how to make reality match your declaration. For WordPress specifically, this means you can spin up an entire production environment on AWS with a single command: load balancers, auto-scaling application servers, managed databases with failover, Redis caching layers, shared file systems, CDN distributions, and secrets management. Every piece is documented in code, and every change goes through version control.

This article walks through building a production-grade AWS architecture for WordPress using Terraform. We will cover module design, networking, compute, databases, caching, file storage, CDN integration, secrets management, state handling, and cost optimization. Every Terraform block is annotated and explained. By the end, you will have a complete configuration you can adapt for your own deployments.

Why Infrastructure as Code for WordPress

WordPress powers over 40% of the web, yet a surprising number of production WordPress installations still run on manually configured servers. An engineer SSHs in, installs packages, edits configuration files, and the server works. Six months later, nobody remembers exactly what was done or why. When the server needs replacing, the team reverse-engineers the setup from memory and scattered notes.

Infrastructure as Code eliminates this problem entirely. Your Terraform files become the single source of truth. New team members read the code to understand the architecture. Changes go through pull requests with peer review. Rolling back a bad change means reverting a commit. Spinning up a staging environment identical to production takes minutes instead of days.

For WordPress workloads specifically, IaC solves several pain points that are unique to the platform:

  • Media uploads need shared storage. When you scale horizontally, uploaded files must be accessible from every application server. EFS handles this, but configuring it manually across multiple instances is error-prone.
  • Database connections need careful management. WordPress uses persistent connections by default, and auto-scaling can overwhelm an RDS instance if connection limits are not planned correctly.
  • Object caching transforms performance. Redis or Memcached can cut database queries by 80% or more, but the caching layer must be provisioned with the right instance type and memory allocation.
  • SSL termination and CDN configuration interact. CloudFront, the Application Load Balancer, and WordPress all need to agree on protocol handling, or you end up with redirect loops.

Terraform lets you solve all of these problems once, encode the solutions in version-controlled files, and replicate them across environments without drift.
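In practice, that replication is just a variable file per environment fed to the same module tree. A sketch of two such files (variable names match the root module shown later; the values are illustrative):

```hcl
# staging.tfvars - smaller, cheaper footprint
environment          = "staging"
instance_type        = "t3.small"
asg_min_size         = 1
asg_max_size         = 2
asg_desired_capacity = 1

# production.tfvars - identical code, different scale
# environment          = "production"
# instance_type        = "t3.large"
# asg_min_size         = 2
# asg_max_size         = 8
# asg_desired_capacity = 3
```

Apply with `terraform apply -var-file=staging.tfvars`, and the only differences between environments are the ones you wrote down.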

Terraform Module Design for WordPress

A well-structured Terraform project uses modules to group related resources. Each module handles one concern, accepts inputs through variables, and exposes outputs that other modules can reference. For a WordPress deployment on AWS, the module layout looks like this:

terraform-wordpress/
├── main.tf                  # Root module, composes all child modules
├── variables.tf             # Top-level input variables
├── outputs.tf               # Top-level outputs (URLs, endpoints)
├── terraform.tfvars         # Environment-specific values
├── backend.tf               # Remote state configuration
├── modules/
│   ├── networking/          # VPC, subnets, route tables, NAT
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── security/            # Security groups, NACLs
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── compute/             # Launch templates, ASG, ALB
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── user_data.sh
│   ├── database/            # RDS MySQL, read replicas
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── cache/               # ElastiCache Redis
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── storage/             # EFS for wp-content/uploads
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── cdn/                 # CloudFront distribution
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── secrets/             # AWS Secrets Manager, SSM params
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf

This separation means your database team can review changes to the database module without wading through networking code. Your security team can audit the security module independently. And when AWS releases a new RDS engine version, you update one module and the change propagates cleanly.

The Root Module

The root module ties everything together. It calls each child module and passes outputs between them. Here is the skeleton:

# main.tf - Root module

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Project     = "wordpress"
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}

module "networking" {
  source = "./modules/networking"

  environment        = var.environment
  vpc_cidr           = var.vpc_cidr
  availability_zones = var.availability_zones
}

module "security" {
  source = "./modules/security"

  vpc_id      = module.networking.vpc_id
  environment = var.environment
}

module "secrets" {
  source = "./modules/secrets"

  environment = var.environment
  db_password = var.db_password
}

module "database" {
  source = "./modules/database"

  environment          = var.environment
  vpc_id               = module.networking.vpc_id
  private_subnet_ids   = module.networking.private_subnet_ids
  db_security_group_id = module.security.db_security_group_id
  db_password_arn      = module.secrets.db_password_arn
}

module "cache" {
  source = "./modules/cache"

  environment             = var.environment
  private_subnet_ids      = module.networking.private_subnet_ids
  cache_security_group_id = module.security.cache_security_group_id
}

module "storage" {
  source = "./modules/storage"

  environment            = var.environment
  vpc_id                 = module.networking.vpc_id
  private_subnet_ids     = module.networking.private_subnet_ids
  efs_security_group_id  = module.security.efs_security_group_id
}

module "compute" {
  source = "./modules/compute"

  environment            = var.environment
  vpc_id                 = module.networking.vpc_id
  public_subnet_ids      = module.networking.public_subnet_ids
  private_subnet_ids     = module.networking.private_subnet_ids
  app_security_group_id  = module.security.app_security_group_id
  alb_security_group_id  = module.security.alb_security_group_id
  certificate_arn        = var.certificate_arn
  db_endpoint            = module.database.primary_endpoint
  db_password_arn        = module.secrets.db_password_arn
  redis_endpoint         = module.cache.redis_endpoint
  efs_id                 = module.storage.efs_id
  instance_type          = var.instance_type
  ami_id                 = var.ami_id
  min_size               = var.asg_min_size
  max_size               = var.asg_max_size
  desired_capacity       = var.asg_desired_capacity
}

module "cdn" {
  source = "./modules/cdn"

  environment     = var.environment
  alb_dns_name    = module.compute.alb_dns_name
  domain_name     = var.domain_name
  certificate_arn = var.certificate_arn
}

Notice how each module receives only the information it needs. The database module gets subnet IDs and a security group but never sees the compute configuration. This principle of least privilege applies to your Terraform code just as it applies to IAM policies.
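For reference, here is a minimal sketch of the top-level variables.tf the root module expects. Types, defaults, and the validation rule are illustrative; adjust them to your conventions:

```hcl
# variables.tf - root module inputs (defaults shown for development)
variable "aws_region" {
  type    = string
  default = "us-east-1"
}

variable "environment" {
  type = string
  validation {
    condition     = contains(["production", "staging", "development"], var.environment)
    error_message = "environment must be production, staging, or development."
  }
}

variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

variable "availability_zones" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

variable "db_password" {
  type      = string
  sensitive = true  # kept out of plan output; stored in Secrets Manager
}
```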

VPC, Subnets, and Security Groups

The networking foundation determines everything that follows. A poorly designed VPC creates problems that cascade through every layer of the stack. For WordPress on AWS, you need public subnets for the Application Load Balancer, private subnets for the application servers and database, and NAT Gateways so private instances can reach the internet for package updates.

VPC and Subnet Configuration

# modules/networking/main.tf

resource "aws_vpc" "wordpress" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "${var.environment}-wordpress-vpc"
  }
}

# Public subnets - one per AZ, for ALB and NAT Gateways
resource "aws_subnet" "public" {
  count = length(var.availability_zones)

  vpc_id                  = aws_vpc.wordpress.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 4, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.environment}-public-${var.availability_zones[count.index]}"
    Tier = "public"
  }
}

# Private subnets - one per AZ, for EC2 instances, RDS, ElastiCache
resource "aws_subnet" "private" {
  count = length(var.availability_zones)

  vpc_id            = aws_vpc.wordpress.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 4, count.index + length(var.availability_zones))
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${var.environment}-private-${var.availability_zones[count.index]}"
    Tier = "private"
  }
}

# Internet Gateway for public subnets
resource "aws_internet_gateway" "wordpress" {
  vpc_id = aws_vpc.wordpress.id

  tags = {
    Name = "${var.environment}-wordpress-igw"
  }
}

# Elastic IP for NAT Gateway
resource "aws_eip" "nat" {
  count  = 1  # Single NAT for cost savings; use count = length(var.availability_zones) for HA
  domain = "vpc"

  tags = {
    Name = "${var.environment}-nat-eip"
  }
}

# NAT Gateway in the first public subnet
resource "aws_nat_gateway" "wordpress" {
  count         = 1
  allocation_id = aws_eip.nat[0].id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.environment}-nat-gw"
  }

  depends_on = [aws_internet_gateway.wordpress]
}

# Route table for public subnets - routes to Internet Gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.wordpress.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.wordpress.id
  }

  tags = {
    Name = "${var.environment}-public-rt"
  }
}

# Route table for private subnets - routes to NAT Gateway
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.wordpress.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.wordpress[0].id
  }

  tags = {
    Name = "${var.environment}-private-rt"
  }
}

# Associate public subnets with public route table
resource "aws_route_table_association" "public" {
  count = length(var.availability_zones)

  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Associate private subnets with private route table
resource "aws_route_table_association" "private" {
  count = length(var.availability_zones)

  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}

The cidrsubnet function automatically carves the VPC CIDR into smaller blocks. With a /16 VPC (10.0.0.0/16) and a newbits value of 4, each subnet gets a /20 block, providing 4,091 usable IP addresses per subnet. That is more than enough for even large WordPress deployments.
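To see the carving concretely, the expressions above evaluate as follows for a 10.0.0.0/16 VPC with three availability zones (you can verify each in terraform console):

```hcl
# cidrsubnet(prefix, newbits, netnum): /16 plus 4 new bits yields /20 blocks
locals {
  public_a  = cidrsubnet("10.0.0.0/16", 4, 0)  # "10.0.0.0/20"
  public_b  = cidrsubnet("10.0.0.0/16", 4, 1)  # "10.0.16.0/20"
  public_c  = cidrsubnet("10.0.0.0/16", 4, 2)  # "10.0.32.0/20"
  private_a = cidrsubnet("10.0.0.0/16", 4, 3)  # "10.0.48.0/20" (netnum offset by AZ count)
}
```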

A single NAT Gateway keeps costs down for development and staging environments. For production, you should deploy one NAT Gateway per Availability Zone so that a zone failure does not cut off internet access for instances in surviving zones. Change the count parameter and adjust the route tables accordingly.
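The per-AZ variant expands the three single-NAT resources above with count. A sketch, assuming the EIP count is raised to match (resource names like per_az are illustrative):

```hcl
# HA variant: one NAT Gateway and one private route table per AZ
resource "aws_nat_gateway" "per_az" {
  count         = length(var.availability_zones)
  allocation_id = aws_eip.nat[count.index].id  # requires matching aws_eip count
  subnet_id     = aws_subnet.public[count.index].id

  depends_on = [aws_internet_gateway.wordpress]
}

resource "aws_route_table" "private_per_az" {
  count  = length(var.availability_zones)
  vpc_id = aws_vpc.wordpress.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.per_az[count.index].id
  }
}

# Each private subnet routes through the NAT Gateway in its own AZ,
# so a zone failure never severs outbound traffic in surviving zones
resource "aws_route_table_association" "private_per_az" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private_per_az[count.index].id
}
```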

Security Groups

Security groups act as virtual firewalls for each tier. The key principle: only allow traffic that has a legitimate reason to flow between components.

# modules/security/main.tf

# ALB Security Group - accepts HTTP/HTTPS from anywhere
resource "aws_security_group" "alb" {
  name_prefix = "${var.environment}-alb-"
  description = "Security group for WordPress ALB"
  vpc_id      = var.vpc_id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${var.environment}-alb-sg"
  }
}

# Application Security Group - accepts traffic only from ALB
resource "aws_security_group" "app" {
  name_prefix = "${var.environment}-app-"
  description = "Security group for WordPress application servers"
  vpc_id      = var.vpc_id

  ingress {
    description     = "HTTP from ALB (application traffic and health checks)"
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${var.environment}-app-sg"
  }
}

# Database Security Group - accepts MySQL only from app servers
resource "aws_security_group" "db" {
  name_prefix = "${var.environment}-db-"
  description = "Security group for WordPress RDS"
  vpc_id      = var.vpc_id

  ingress {
    description     = "MySQL from application servers"
    from_port       = 3306
    to_port         = 3306
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${var.environment}-db-sg"
  }
}

# ElastiCache Security Group - accepts Redis only from app servers
resource "aws_security_group" "cache" {
  name_prefix = "${var.environment}-cache-"
  description = "Security group for WordPress ElastiCache"
  vpc_id      = var.vpc_id

  ingress {
    description     = "Redis from application servers"
    from_port       = 6379
    to_port         = 6379
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${var.environment}-cache-sg"
  }
}

# EFS Security Group - accepts NFS only from app servers
resource "aws_security_group" "efs" {
  name_prefix = "${var.environment}-efs-"
  description = "Security group for WordPress EFS"
  vpc_id      = var.vpc_id

  ingress {
    description     = "NFS from application servers"
    from_port       = 2049
    to_port         = 2049
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${var.environment}-efs-sg"
  }
}

Every security group references other security groups rather than CIDR blocks for internal traffic. This is intentional. When an auto-scaling group launches a new instance, that instance automatically inherits the correct access rules through its security group membership. You never need to update firewall rules when instances come and go.

The create_before_destroy lifecycle rule prevents downtime during security group updates. Terraform creates the new group, attaches it to resources, and only then deletes the old group.

Auto-Scaling Groups with Custom AMIs

Running WordPress on a single EC2 instance limits you to vertical scaling: bigger instance, bigger cost, same single point of failure. Auto-scaling groups let you scale horizontally. You define a launch template that describes your ideal WordPress server, set minimum and maximum instance counts, and let AWS handle the rest.

Building Custom AMIs with Packer

Before Terraform can launch instances, you need an Amazon Machine Image that has WordPress, PHP, Nginx, and all dependencies pre-installed. Building this with Packer (another HashiCorp tool) keeps your AMI creation reproducible. Here is a condensed Packer template:

# wordpress-ami.pkr.hcl

source "amazon-ebs" "wordpress" {
  ami_name      = "wordpress-${formatdate("YYYYMMDD-hhmm", timestamp())}"
  instance_type = "t3.medium"
  region        = "us-east-1"
  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    owners      = ["099720109477"]  # Canonical
    most_recent = true
  }
  ssh_username = "ubuntu"
}

build {
  sources = ["source.amazon-ebs.wordpress"]

  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx php8.1-fpm php8.1-mysql php8.1-redis php8.1-curl php8.1-gd php8.1-xml php8.1-mbstring php8.1-zip php8.1-intl nfs-common awscli",
      # amazon-efs-utils is not in Ubuntu's apt repositories; build the .deb
      # from https://github.com/aws/efs-utils, or mount EFS via plain NFS
      "sudo systemctl enable nginx php8.1-fpm",
      "sudo mkdir -p /var/www/html",
      "cd /var/www && sudo wget -q https://wordpress.org/latest.tar.gz",
      "sudo tar -xzf latest.tar.gz -C /var/www/html --strip-components=1",
      "sudo chown -R www-data:www-data /var/www/html",
    ]
  }

  provisioner "file" {
    source      = "configs/nginx-wordpress.conf"
    destination = "/tmp/wordpress.conf"
  }

  provisioner "shell" {
    inline = [
      "sudo mv /tmp/wordpress.conf /etc/nginx/sites-available/wordpress",
      "sudo ln -sf /etc/nginx/sites-available/wordpress /etc/nginx/sites-enabled/",
      "sudo rm -f /etc/nginx/sites-enabled/default",
    ]
  }
}

The AMI bakes in everything that does not change between deployments: the operating system, web server, PHP runtime, and WordPress core files. Configuration that varies by environment (database credentials, Redis endpoints, EFS mount points) gets injected at boot time through the launch template’s user data script.

Launch Template and Auto-Scaling Group

# modules/compute/main.tf

# Application Load Balancer
resource "aws_lb" "wordpress" {
  name               = "${var.environment}-wordpress-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [var.alb_security_group_id]
  subnets            = var.public_subnet_ids

  enable_deletion_protection = var.environment == "production" ? true : false

  tags = {
    Name = "${var.environment}-wordpress-alb"
  }
}

# Target group for WordPress instances
resource "aws_lb_target_group" "wordpress" {
  name     = "${var.environment}-wp-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled             = true
    path                = "/wp-login.php"
    port                = "traffic-port"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    timeout             = 10
    interval            = 30
    matcher             = "200,302"
  }

  stickiness {
    type            = "lb_cookie"
    cookie_duration = 3600
    enabled         = true
  }

  tags = {
    Name = "${var.environment}-wp-tg"
  }
}

# HTTPS listener (primary)
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.wordpress.arn
  port              = "443"
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.wordpress.arn
  }
}

# HTTP listener - redirect to HTTPS
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.wordpress.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

# IAM role for EC2 instances
resource "aws_iam_role" "wordpress" {
  name = "${var.environment}-wordpress-instance-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

# Policy: allow reading secrets from Secrets Manager and SSM
resource "aws_iam_role_policy" "secrets_access" {
  name = "${var.environment}-secrets-access"
  role = aws_iam_role.wordpress.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "secretsmanager:GetSecretValue",
          "ssm:GetParameter",
          "ssm:GetParameters"
        ]
        Resource = [
          var.db_password_arn,
          "arn:aws:ssm:*:*:parameter/${var.environment}/wordpress/*"
        ]
      }
    ]
  })
}

# Attach SSM managed policy for Session Manager access (replaces SSH)
resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.wordpress.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_instance_profile" "wordpress" {
  name = "${var.environment}-wordpress-instance-profile"
  role = aws_iam_role.wordpress.name
}

# Launch template defines how each instance is configured
resource "aws_launch_template" "wordpress" {
  name_prefix   = "${var.environment}-wordpress-"
  image_id      = var.ami_id
  instance_type = var.instance_type

  vpc_security_group_ids = [var.app_security_group_id]

  iam_instance_profile {
    arn = aws_iam_instance_profile.wordpress.arn
  }

  # User data script runs on every boot
  user_data = base64encode(templatefile("${path.module}/user_data.sh", {
    environment     = var.environment
    db_endpoint     = var.db_endpoint
    db_password_arn = var.db_password_arn
    redis_endpoint  = var.redis_endpoint
    efs_id          = var.efs_id
    aws_region      = data.aws_region.current.name
  }))

  monitoring {
    enabled = true
  }

  metadata_options {
    http_endpoint               = "enabled"
    http_tokens                 = "required"  # Enforce IMDSv2
    http_put_response_hop_limit = 1
  }

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "${var.environment}-wordpress-app"
    }
  }

  lifecycle {
    create_before_destroy = true
  }
}

data "aws_region" "current" {}

# Auto-scaling group manages instance lifecycle
resource "aws_autoscaling_group" "wordpress" {
  name_prefix         = "${var.environment}-wordpress-"
  desired_capacity    = var.desired_capacity
  min_size            = var.min_size
  max_size            = var.max_size
  vpc_zone_identifier = var.private_subnet_ids
  target_group_arns   = [aws_lb_target_group.wordpress.arn]
  health_check_type   = "ELB"
  health_check_grace_period = 300

  launch_template {
    id      = aws_launch_template.wordpress.id
    version = "$Latest"
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
      instance_warmup        = 120
    }
  }

  tag {
    key                 = "Name"
    value               = "${var.environment}-wordpress"
    propagate_at_launch = true
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Scale up policy - triggered by high CPU
resource "aws_autoscaling_policy" "scale_up" {
  name                   = "${var.environment}-wordpress-scale-up"
  autoscaling_group_name = aws_autoscaling_group.wordpress.name
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = 1
  cooldown               = 300
}

resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "${var.environment}-wordpress-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 70
  alarm_description   = "Scale up when CPU exceeds 70% for 4 minutes"
  alarm_actions       = [aws_autoscaling_policy.scale_up.arn]

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.wordpress.name
  }
}

# Scale down policy - triggered by low CPU
resource "aws_autoscaling_policy" "scale_down" {
  name                   = "${var.environment}-wordpress-scale-down"
  autoscaling_group_name = aws_autoscaling_group.wordpress.name
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = -1
  cooldown               = 300
}

resource "aws_cloudwatch_metric_alarm" "low_cpu" {
  alarm_name          = "${var.environment}-wordpress-low-cpu"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 3
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 30
  alarm_description   = "Scale down when CPU below 30% for 6 minutes"
  alarm_actions       = [aws_autoscaling_policy.scale_down.arn]

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.wordpress.name
  }
}
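The step-scaling pair above works, but for simple CPU-based scaling a single target-tracking policy is often the cleaner choice: AWS creates and manages the CloudWatch alarms for you. A sketch (the 55% target is illustrative):

```hcl
# Alternative: one target-tracking policy replaces both step policies
# and their hand-written CloudWatch alarms
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "${var.environment}-wordpress-cpu-target"
  autoscaling_group_name = aws_autoscaling_group.wordpress.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 55.0  # keep average CPU near this; ASG adds/removes instances
  }
}
```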

The User Data Script

The user data script runs when each instance boots. It mounts EFS, retrieves secrets, writes the WordPress configuration, and starts the services. This is the bridge between your baked AMI and your environment-specific settings.

#!/bin/bash
# modules/compute/user_data.sh
set -euo pipefail

# Mount EFS for shared uploads directory
mkdir -p /var/www/html/wp-content/uploads
mount -t efs -o tls ${efs_id}:/ /var/www/html/wp-content/uploads
echo "${efs_id}:/ /var/www/html/wp-content/uploads efs _netdev,tls 0 0" >> /etc/fstab

# Retrieve database password from Secrets Manager
DB_PASSWORD=$(aws secretsmanager get-secret-value \
  --secret-id "${db_password_arn}" \
  --query 'SecretString' \
  --output text \
  --region ${aws_region})

# Generate WordPress salts
WP_SALTS=$(curl -s https://api.wordpress.org/secret-key/1.1/salt/)

# Write wp-config.php. $${...} escapes shell variables past Terraform's
# templatefile(); ${db_endpoint} and ${redis_endpoint} are template variables.
cat > /var/www/html/wp-config.php << WPCONFIG
<?php
define( 'DB_NAME', 'wordpress' );
define( 'DB_USER', 'wordpress' );
define( 'DB_PASSWORD', '$${DB_PASSWORD}' );
define( 'DB_HOST', '${db_endpoint}' );
define( 'WP_REDIS_HOST', '${redis_endpoint}' );

// The ALB terminates TLS; trust X-Forwarded-Proto so WordPress knows the
// original request was HTTPS (this is what prevents redirect loops)
if ( isset( \$_SERVER['HTTP_X_FORWARDED_PROTO'] ) && 'https' === \$_SERVER['HTTP_X_FORWARDED_PROTO'] ) {
    \$_SERVER['HTTPS'] = 'on';
}
define( 'FORCE_SSL_ADMIN', true );

// Code changes must flow through the deployment pipeline, not wp-admin
define( 'DISALLOW_FILE_EDIT', true );

$${WP_SALTS}

\$table_prefix = 'wp_';
define( 'ABSPATH', __DIR__ . '/' );
require_once ABSPATH . 'wp-settings.php';
WPCONFIG

systemctl restart php8.1-fpm nginx

There are several WordPress-specific details worth calling out in this script. The FORCE_SSL_ADMIN directive combined with the HTTP_X_FORWARDED_PROTO check prevents the infamous redirect loop that occurs when the ALB terminates SSL and forwards plain HTTP to WordPress. Without this, WordPress sees an HTTP request, tries to redirect to HTTPS, the ALB forwards it as HTTP again, and the browser gives up after too many redirects.

The DISALLOW_FILE_EDIT flag is critical in a multi-server environment. If an admin edits a theme file through the WordPress editor, that change only applies to whichever server handled the request. The other instances still run the old code. Disabling the editor forces all code changes through your deployment pipeline where they apply consistently.

Amazon RDS with Read Replicas and Failover

WordPress is a database-heavy application. Every page load triggers multiple queries, and complex sites with many plugins can execute hundreds of queries per request. RDS gives you a managed MySQL (or MariaDB) instance with automated backups, patching, and Multi-AZ failover without the operational burden of running your own database server.

# modules/database/main.tf

# Subnet group tells RDS which subnets to use
resource "aws_db_subnet_group" "wordpress" {
  name       = "${var.environment}-wordpress-db-subnet"
  subnet_ids = var.private_subnet_ids

  tags = {
    Name = "${var.environment}-wordpress-db-subnet"
  }
}

# Parameter group for MySQL tuning
resource "aws_db_parameter_group" "wordpress" {
  family = "mysql8.0"
  name   = "${var.environment}-wordpress-params"

  # Increase max connections for auto-scaling environment
  parameter {
    name  = "max_connections"
    value = "500"
  }

  # Enable slow query log for performance analysis
  parameter {
    name  = "slow_query_log"
    value = "1"
  }

  parameter {
    name  = "long_query_time"
    value = "2"
  }

  # InnoDB buffer pool - let RDS manage based on instance memory
  parameter {
    name  = "innodb_buffer_pool_size"
    value = "{DBInstanceClassMemory*3/4}"
  }

  # Query cache is deprecated in MySQL 8.0, use Redis instead
  # WordPress benefits more from object caching than query caching

  tags = {
    Name = "${var.environment}-wordpress-params"
  }
}

# Primary RDS instance with Multi-AZ for automatic failover
resource "aws_db_instance" "wordpress" {
  identifier     = "${var.environment}-wordpress-primary"
  engine         = "mysql"
  engine_version = "8.0"

  instance_class        = var.environment == "production" ? "db.r6g.large" : "db.t4g.medium"
  allocated_storage     = 100
  max_allocated_storage = 500  # Auto-scaling storage

  db_name  = "wordpress"
  username = "wordpress"
  password = data.aws_secretsmanager_secret_version.db_password.secret_string

  db_subnet_group_name   = aws_db_subnet_group.wordpress.name
  vpc_security_group_ids = [var.db_security_group_id]
  parameter_group_name   = aws_db_parameter_group.wordpress.name

  multi_az            = var.environment == "production" ? true : false
  publicly_accessible = false

  # Backup configuration
  backup_retention_period = 14
  backup_window           = "03:00-04:00"
  maintenance_window      = "sun:04:00-sun:05:00"

  # Encryption at rest
  storage_encrypted = true

  # Performance Insights for query analysis
  performance_insights_enabled          = true
  performance_insights_retention_period = 7

  # Deletion protection for production
  deletion_protection = var.environment == "production" ? true : false

  # Final snapshot before deletion. Note: timestamp() changes on every plan,
  # so this attribute shows a perpetual diff; a static identifier avoids that.
  skip_final_snapshot       = var.environment == "production" ? false : true
  final_snapshot_identifier = var.environment == "production" ? "${var.environment}-wordpress-final-${formatdate("YYYYMMDD", timestamp())}" : null

  tags = {
    Name = "${var.environment}-wordpress-primary"
  }
}

# Read replica for offloading read-heavy queries
resource "aws_db_instance" "wordpress_replica" {
  count = var.environment == "production" ? 1 : 0

  identifier          = "${var.environment}-wordpress-replica"
  replicate_source_db = aws_db_instance.wordpress.identifier
  instance_class      = "db.r6g.large"

  publicly_accessible    = false
  vpc_security_group_ids = [var.db_security_group_id]
  parameter_group_name   = aws_db_parameter_group.wordpress.name

  performance_insights_enabled          = true
  performance_insights_retention_period = 7

  tags = {
    Name = "${var.environment}-wordpress-replica"
  }
}

data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = var.db_password_arn
}

output "primary_endpoint" {
  value = aws_db_instance.wordpress.endpoint
}

output "replica_endpoint" {
  value = var.environment == "production" ? aws_db_instance.wordpress_replica[0].endpoint : null
}

Using Read Replicas with WordPress

WordPress does not natively support read replicas. Out of the box, every query goes to the same database host. To split reads and writes, you need a drop-in plugin like HyperDB (from Automattic) or LudicrousDB. These plugins intercept WordPress database queries, examine whether they are reads (SELECT) or writes (INSERT, UPDATE, DELETE), and route them to the appropriate server.

Add the replica endpoint to your wp-config.php as a secondary database server. The HyperDB configuration looks like this:

// db-config.php (HyperDB configuration)
$wpdb->add_database(array(
    'host'     => 'primary-endpoint.rds.amazonaws.com',
    'user'     => 'wordpress',
    'password' => $db_password,
    'name'     => 'wordpress',
    'write'    => 1,     // Primary handles writes
    'read'     => 1,     // Primary also handles reads as fallback
    'dataset'  => 'global',
    'timeout'  => 0.2,
));

$wpdb->add_database(array(
    'host'     => 'replica-endpoint.rds.amazonaws.com',
    'user'     => 'wordpress',
    'password' => $db_password,
    'name'     => 'wordpress',
    'write'    => 0,     // Replica never handles writes
    'read'     => 2,     // Higher priority for reads
    'dataset'  => 'global',
    'timeout'  => 0.2,
));

The read priority value (2 for the replica vs. 1 for the primary) ensures read queries prefer the replica. If the replica goes down, reads fall back to the primary. This setup can cut the load on your primary instance by 60-80% on read-heavy WordPress sites, which is most of them.

ElastiCache Redis for Object Cache and Sessions

Object caching is the single most effective performance optimization for WordPress after page caching. Every time WordPress needs an option value, a user's metadata, or a transient, it queries the database. With a persistent object cache backed by Redis, these values are stored in memory and retrieved in microseconds instead of milliseconds.

In a horizontally scaled environment, Redis serves a second critical purpose: session storage. PHP sessions are stored on the local filesystem by default. When a user's requests get routed to different servers by the load balancer, their session data disappears. Storing sessions in Redis makes them available from any application server.

# modules/cache/main.tf

# Subnet group for ElastiCache
resource "aws_elasticache_subnet_group" "wordpress" {
  name       = "${var.environment}-wordpress-cache-subnet"
  subnet_ids = var.private_subnet_ids

  tags = {
    Name = "${var.environment}-wordpress-cache-subnet"
  }
}

# Parameter group for Redis tuning
resource "aws_elasticache_parameter_group" "wordpress" {
  family = "redis7"
  name   = "${var.environment}-wordpress-redis-params"

  # Maximum memory policy: evict least recently used keys when memory is full
  parameter {
    name  = "maxmemory-policy"
    value = "allkeys-lru"
  }

  # Disable snapshotting for cache-only use (saves memory)
  parameter {
    name  = "save"
    value = ""
  }

  tags = {
    Name = "${var.environment}-wordpress-redis-params"
  }
}

# Redis replication group with automatic failover
resource "aws_elasticache_replication_group" "wordpress" {
  replication_group_id = "${var.environment}-wp-redis"
  description          = "Redis cluster for WordPress object cache and sessions"

  node_type            = var.environment == "production" ? "cache.r6g.large" : "cache.t4g.medium"
  num_cache_clusters   = var.environment == "production" ? 2 : 1
  port                 = 6379

  subnet_group_name    = aws_elasticache_subnet_group.wordpress.name
  security_group_ids   = [var.cache_security_group_id]
  parameter_group_name = aws_elasticache_parameter_group.wordpress.name

  automatic_failover_enabled = var.environment == "production" ? true : false
  multi_az_enabled           = var.environment == "production" ? true : false

  at_rest_encryption_enabled = true
  transit_encryption_enabled = false  # Set to true if you need encryption in transit

  # Maintenance and snapshot windows
  maintenance_window       = "sun:05:00-sun:06:00"
  snapshot_retention_limit = 0  # No snapshots for cache-only use

  tags = {
    Name = "${var.environment}-wordpress-redis"
  }
}

output "redis_endpoint" {
  value = aws_elasticache_replication_group.wordpress.primary_endpoint_address
}

The allkeys-lru eviction policy is important for WordPress. When Redis runs out of memory, it evicts the least recently used keys to make room for new ones. Under the default noeviction policy, a full Redis instance rejects new writes instead, and the object cache plugin silently falls back to database queries; the cache quietly stops helping exactly when traffic is highest.

For production deployments, two cache clusters with automatic failover give you a standby node that promotes to primary within seconds if the active node fails. Your WordPress site keeps serving cached content without interruption.
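When automatic failover is enabled, the replication group also exposes a reader endpoint that load-balances reads across replica nodes. Exposing it is a one-block addition to the cache module above (the output name is mine):

```hcl
# Reader endpoint of the replication group; clients that support
# read/write splitting can send read-heavy traffic here.
output "redis_reader_endpoint" {
  value = aws_elasticache_replication_group.wordpress.reader_endpoint_address
}
```

The Redis Object Cache plugin talks to a single host, so this is mainly useful for custom code or session stores that can tolerate replica lag.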

Configuring the Redis Object Cache Plugin

On the WordPress side, install the Redis Object Cache plugin by Till Kruss. The plugin reads the Redis connection details from wp-config.php constants that the user data script already writes. The key constants are:

// Already in wp-config.php from user_data.sh
define('WP_REDIS_HOST', 'primary-endpoint.cache.amazonaws.com');
define('WP_REDIS_PORT', 6379);
define('WP_REDIS_DATABASE', 0);     // DB 0 for object cache
define('WP_REDIS_TIMEOUT', 1);       // 1 second connection timeout
define('WP_REDIS_READ_TIMEOUT', 1);  // 1 second read timeout

// For PHP session storage, add to php.ini or pool config:
// session.save_handler = redis
// session.save_path = "tcp://primary-endpoint.cache.amazonaws.com:6379?database=1"

Using a separate Redis database (database 1) for sessions keeps them isolated from the object cache. When you flush the object cache during a deployment, sessions remain intact and users stay logged in.

EFS for Shared wp-content/uploads

WordPress stores uploaded media files on the local filesystem by default. In a single-server setup, this works fine. In a multi-server auto-scaling environment, it becomes a problem immediately. A user uploads an image, and it lands on server A. The next request goes to server B, which has no knowledge of that file. The image returns a 404.

Amazon Elastic File System (EFS) provides a shared NFS filesystem that multiple EC2 instances can mount simultaneously. Every server sees the same files, and writes from one instance are immediately visible to all others.

# modules/storage/main.tf

# EFS filesystem for shared WordPress uploads
resource "aws_efs_file_system" "wordpress" {
  creation_token = "${var.environment}-wordpress-uploads"
  encrypted      = true

  performance_mode = "generalPurpose"
  throughput_mode  = "bursting"  # Use "elastic" for unpredictable workloads

  lifecycle_policy {
    transition_to_ia = "AFTER_30_DAYS"  # Move old files to Infrequent Access tier
  }

  lifecycle_policy {
    transition_to_primary_storage_class = "AFTER_1_ACCESS"  # Move back on access
  }

  tags = {
    Name = "${var.environment}-wordpress-uploads"
  }
}

# Mount targets - one per AZ so instances in any AZ can connect
resource "aws_efs_mount_target" "wordpress" {
  count = length(var.private_subnet_ids)

  file_system_id  = aws_efs_file_system.wordpress.id
  subnet_id       = var.private_subnet_ids[count.index]
  security_groups = [var.efs_security_group_id]
}

# EFS access point with POSIX permissions matching www-data
resource "aws_efs_access_point" "wordpress" {
  file_system_id = aws_efs_file_system.wordpress.id

  posix_user {
    gid = 33  # www-data group
    uid = 33  # www-data user
  }

  root_directory {
    path = "/uploads"
    creation_info {
      owner_gid   = 33
      owner_uid   = 33
      permissions = "0755"
    }
  }

  tags = {
    Name = "${var.environment}-wordpress-uploads-ap"
  }
}

# Backup policy - daily automated backups
resource "aws_efs_backup_policy" "wordpress" {
  file_system_id = aws_efs_file_system.wordpress.id

  backup_policy {
    status = "ENABLED"
  }
}

output "efs_id" {
  value = aws_efs_file_system.wordpress.id
}

output "efs_dns_name" {
  value = aws_efs_file_system.wordpress.dns_name
}

The lifecycle policies deserve attention. WordPress sites accumulate media files over years, but older content gets accessed far less frequently than recent uploads. The Infrequent Access tier costs about $0.016 per GB-month compared to $0.30 for the standard tier. By automatically transitioning files older than 30 days, you can cut storage costs by 90% for archival media while keeping everything accessible.

The access point sets POSIX user and group IDs to 33, which is the numeric ID for the www-data user on Ubuntu systems. This prevents permission conflicts between the web server and the filesystem. Without this, Nginx or PHP-FPM might not be able to read or write uploaded files, causing mysterious upload failures.

EFS Performance Considerations

EFS has higher latency than local EBS storage. For WordPress media uploads, this latency is usually acceptable because users do not upload files as frequently as they request pages. However, you should NOT mount your entire WordPress installation on EFS. PHP files, theme files, and plugins should live on the local filesystem (or EBS) for fast execution. Only the wp-content/uploads directory belongs on EFS.

If your site serves a lot of media directly (rather than through a CDN), consider EFS Elastic throughput mode instead of Bursting. Bursting provides a baseline of 50 MiB/s per TiB of storage and allows bursts up to 100 MiB/s, but once you exhaust your burst credits, throughput drops dramatically. Elastic mode scales automatically based on workload and charges per-transfer, which is more predictable for high-traffic sites.
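Switching is a one-attribute change on the filesystem resource. A sketch of the Elastic variant, assuming the storage module above:

```hcl
# Elastic throughput scales automatically with the workload and is billed
# per GiB transferred, avoiding the burst-credit cliff of Bursting mode.
resource "aws_efs_file_system" "wordpress" {
  creation_token   = "production-wordpress-uploads"
  encrypted        = true
  performance_mode = "generalPurpose"
  throughput_mode  = "elastic"
}
```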

CloudFront CDN Integration

A Content Delivery Network caches your site's static assets at edge locations worldwide, reducing load on your origin servers and improving page load times for visitors far from your AWS region. CloudFront integrates tightly with other AWS services and supports custom cache behaviors that map well to WordPress URL patterns.

# modules/cdn/main.tf

resource "aws_cloudfront_distribution" "wordpress" {
  enabled             = true
  is_ipv6_enabled     = true
  comment             = "${var.environment} WordPress CDN"
  default_root_object = ""
  price_class         = "PriceClass_100"  # US, Canada, Europe only; use PriceClass_All for global
  aliases             = [var.domain_name, "www.${var.domain_name}"]
  web_acl_id          = var.waf_web_acl_arn  # Optional: attach AWS WAF

  # Origin pointing to the ALB
  origin {
    domain_name = var.alb_dns_name
    origin_id   = "wordpress-alb"

    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }

    custom_header {
      name  = "X-CloudFront-Secret"
      value = var.cloudfront_secret  # Verify requests actually come from CloudFront
    }
  }

  # Default behavior - dynamic WordPress pages (no caching)
  default_cache_behavior {
    allowed_methods  = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "wordpress-alb"

    forwarded_values {
      query_string = true
      headers      = ["Host", "Authorization", "CloudFront-Forwarded-Proto"]

      cookies {
        forward = "whitelist"
        whitelisted_names = [
          "wordpress_*",
          "wp-settings-*",
          "comment_author_*",
        ]
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 0      # Do not cache dynamic pages by default
    max_ttl                = 0
    compress               = true
  }

  # Cache static assets aggressively
  ordered_cache_behavior {
    path_pattern     = "/wp-content/*"
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "wordpress-alb"

    forwarded_values {
      query_string = true
      headers      = ["Host", "Origin", "Access-Control-Request-Headers", "Access-Control-Request-Method"]

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 86400     # 24 hours
    max_ttl                = 31536000  # 1 year
    compress               = true
  }

  # Cache wp-includes static files
  ordered_cache_behavior {
    path_pattern     = "/wp-includes/*"
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "wordpress-alb"

    forwarded_values {
      query_string = true
      headers      = ["Host"]

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 86400
    max_ttl                = 31536000
    compress               = true
  }

  # Never cache admin or login pages
  ordered_cache_behavior {
    path_pattern     = "/wp-admin/*"
    allowed_methods  = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "wordpress-alb"

    forwarded_values {
      query_string = true
      headers      = ["*"]

      cookies {
        forward = "all"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 0
    max_ttl                = 0
    compress               = true
  }

  # SSL certificate
  viewer_certificate {
    acm_certificate_arn      = var.certificate_arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  tags = {
    Name = "${var.environment}-wordpress-cdn"
  }
}

# DNS record pointing to CloudFront
resource "aws_route53_record" "wordpress" {
  zone_id = var.hosted_zone_id
  name    = var.domain_name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.wordpress.domain_name
    zone_id                = aws_cloudfront_distribution.wordpress.hosted_zone_id
    evaluate_target_health = false
  }
}

output "cloudfront_domain" {
  value = aws_cloudfront_distribution.wordpress.domain_name
}

output "cloudfront_distribution_id" {
  value = aws_cloudfront_distribution.wordpress.id
}

The cache behavior ordering matters. CloudFront evaluates ordered behaviors from top to bottom and uses the first matching pattern. Static assets under /wp-content/ and /wp-includes/ get cached for up to a year. Admin pages never get cached. Everything else (the default behavior) passes through to the origin without caching, which lets WordPress handle dynamic content generation, cookies, and authenticated sessions correctly.
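One detail worth noting: wp-login.php lives at the site root, outside the /wp-admin/* pattern, so it is handled by the default behavior, which forwards only whitelisted headers and cookies. A dedicated behavior that forwards everything for the login page is a common hardening step; a sketch mirroring the wp-admin block above:

```hcl
# Never cache the login page; it sits outside /wp-admin/
ordered_cache_behavior {
  path_pattern     = "/wp-login.php"
  allowed_methods  = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
  cached_methods   = ["GET", "HEAD"]
  target_origin_id = "wordpress-alb"

  forwarded_values {
    query_string = true
    headers      = ["*"]

    cookies {
      forward = "all"
    }
  }

  viewer_protocol_policy = "redirect-to-https"
  min_ttl                = 0
  default_ttl            = 0
  max_ttl                = 0
}
```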

The X-CloudFront-Secret custom header is a security measure. Without it, anyone who discovers your ALB's DNS name can bypass CloudFront entirely, skipping your CDN cache and any WAF rules attached to the distribution. Add a check in your Nginx configuration that rejects requests missing this header:

# In nginx WordPress server block
if ($http_x_cloudfront_secret != "your-secret-value-here") {
    return 403;
}

Cache Invalidation Strategy

When you publish a new post or update content, the cached version on CloudFront becomes stale. There are two approaches to handle this. The first is to use short TTLs for HTML pages and rely on CloudFront's cache headers. Set Cache-Control: s-maxage=300 in your Nginx config for non-admin pages, giving you a 5-minute cache that balances performance with freshness. The second approach is to create CloudFront invalidations programmatically using a WordPress plugin that calls the CloudFront API when content changes. The first approach is simpler and works well for most sites. The second is better for news sites or any situation where stale content has business impact.

Secrets Management with AWS Secrets Manager and SSM

Hardcoding credentials in Terraform files is a security antipattern that will eventually burn you. Database passwords, API keys, and other sensitive values must never appear in version control. AWS provides two services for this: Secrets Manager for credentials that need rotation, and Systems Manager Parameter Store for configuration values.

# modules/secrets/main.tf

# Database password stored in Secrets Manager
resource "aws_secretsmanager_secret" "db_password" {
  name                    = "${var.environment}/wordpress/db-password"
  description             = "WordPress database password"
  recovery_window_in_days = var.environment == "production" ? 30 : 0

  tags = {
    Name = "${var.environment}-wordpress-db-password"
  }
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db_password.id
  secret_string = var.db_password
}

# Optional: automatic rotation for the database password
resource "aws_secretsmanager_secret_rotation" "db_password" {
  count = var.environment == "production" ? 1 : 0

  secret_id           = aws_secretsmanager_secret.db_password.id
  rotation_lambda_arn = var.rotation_lambda_arn

  rotation_rules {
    automatically_after_days = 30
  }
}

# SSM Parameter Store for non-secret configuration
resource "aws_ssm_parameter" "wordpress_config" {
  for_each = {
    "site_url"        = "https://${var.domain_name}"
    "admin_email"     = var.admin_email
    "cache_ttl"       = "3600"
    "max_upload_size" = "64M"
  }

  name  = "/${var.environment}/wordpress/${each.key}"
  type  = "String"
  value = each.value

  tags = {
    Name = "${var.environment}-wordpress-${each.key}"
  }
}

# SSM Parameter for sensitive values that don't need rotation
resource "aws_ssm_parameter" "stripe_key" {
  name  = "/${var.environment}/wordpress/stripe_secret_key"
  type  = "SecureString"
  value = var.stripe_secret_key

  tags = {
    Name = "${var.environment}-wordpress-stripe-key"
  }
}

output "db_password_arn" {
  value = aws_secretsmanager_secret.db_password.arn
}

The difference between Secrets Manager and Parameter Store comes down to features and cost. Secrets Manager costs $0.40 per secret per month plus $0.05 per 10,000 API calls, but it supports automatic rotation with Lambda functions. Parameter Store standard parameters are free (up to 10,000 per Region), while advanced parameters cost about $0.05 per parameter per month. Use Secrets Manager for database passwords that should rotate automatically. Use Parameter Store for everything else.

In your user data script, the EC2 instance retrieves secrets at boot time using the AWS CLI. The IAM role attached to the instance profile grants permission to read only the specific secrets it needs. If an instance is compromised, the attacker can access the WordPress database password but not, for example, the production Stripe key for a different application.
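A minimal sketch of what that least-privilege policy document could look like in the secrets module (the policy name is mine; resource references follow the module above):

```hcl
# Grant the instance role read access to only this environment's secrets.
data "aws_iam_policy_document" "wordpress_instance_secrets" {
  statement {
    sid       = "ReadDbPassword"
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [aws_secretsmanager_secret.db_password.arn]
  }

  statement {
    sid       = "ReadWordPressParameters"
    actions   = ["ssm:GetParameter", "ssm:GetParametersByPath"]
    resources = ["arn:aws:ssm:*:*:parameter/${var.environment}/wordpress/*"]
  }
}
```

Attach the resulting policy to the instance profile role in the compute module; any secret outside these ARNs stays unreadable even from a compromised instance.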

Handling Secrets in Terraform Variables

The initial secret values still need to come from somewhere. The most common approach is to pass them through environment variables or a .tfvars file that is excluded from version control:

# terraform.tfvars (DO NOT commit this file)
db_password      = "your-strong-generated-password"
stripe_secret_key = "sk_live_..."

# .gitignore
*.tfvars
*.tfvars.json
.terraform/
*.tfstate
*.tfstate.backup

For team environments, use a CI/CD pipeline that pulls secrets from a secure vault (like HashiCorp Vault or 1Password) and injects them as environment variables during terraform apply. Terraform reads any environment variable named TF_VAR_<name>, so exporting TF_VAR_db_password populates var.db_password without a .tfvars file. This way, no human ever needs to see the production database password.

Terraform State Management and Team Workflows

Terraform tracks the current state of your infrastructure in a state file. This file maps Terraform resource addresses to real AWS resource IDs. Without it, Terraform cannot determine what already exists, what needs updating, and what should be destroyed. By default, state lives in a local terraform.tfstate file. For teams, this is unworkable. Two engineers running terraform apply simultaneously from different machines will corrupt the state or create duplicate resources.

Remote State with S3 and DynamoDB

The standard approach is to store state in an S3 bucket with DynamoDB-based locking. The S3 bucket provides durable, versioned storage. DynamoDB provides a distributed lock that prevents concurrent applies.

# backend.tf

terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "wordpress/production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

# You need to create the S3 bucket and DynamoDB table before using this backend.
# Use a separate bootstrap Terraform configuration or create them manually.

# bootstrap/main.tf (run once, store state locally)
resource "aws_s3_bucket" "terraform_state" {
  bucket = "mycompany-terraform-state"

  tags = {
    Name = "Terraform State"
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  tags = {
    Name = "Terraform State Lock"
  }
}

S3 bucket versioning is essential. If a state file becomes corrupted or someone runs a destructive operation by accident, you can recover a previous version. The KMS encryption ensures state data, which contains sensitive information like database endpoints and resource ARNs, stays encrypted at rest.
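One further guard worth adding to the bootstrap configuration: a lifecycle rule that stops terraform destroy from ever deleting the state bucket. A sketch extending the bucket resource above:

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "mycompany-terraform-state"

  # Refuse to destroy the bucket that holds all state history;
  # terraform destroy will error instead of deleting it.
  lifecycle {
    prevent_destroy = true
  }
}
```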

Team Workflow with Terraform Cloud or CI/CD

For teams, the safest workflow prevents anyone from running terraform apply on their local machine. Instead, all changes go through a code review process:

  1. An engineer creates a branch and modifies Terraform files.
  2. They open a pull request. The CI pipeline runs terraform plan and posts the output as a PR comment.
  3. A teammate reviews the plan, checking for unexpected resource deletions or modifications.
  4. After approval and merge, the CD pipeline runs terraform apply against the main branch.

Here is a GitHub Actions workflow that implements this pattern:

# .github/workflows/terraform.yml
name: Terraform WordPress Infrastructure

on:
  pull_request:
    paths:
      - 'terraform/**'
  push:
    branches:
      - main
    paths:
      - 'terraform/**'

permissions:
  id-token: write   # For OIDC auth with AWS
  contents: read
  pull-requests: write

jobs:
  plan:
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/terraform-github-actions
          aws-region: us-east-1

      - name: Terraform Init
        run: terraform init
        working-directory: terraform

      - name: Terraform Plan
        id: plan
        run: terraform plan -no-color -out=tfplan
        working-directory: terraform

      - name: Comment PR with Plan
        uses: actions/github-script@v7
        with:
          script: |
            const plan = `${{ steps.plan.outputs.stdout }}`;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${plan}\n\`\`\``
            });

  apply:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/terraform-github-actions
          aws-region: us-east-1

      - name: Terraform Init
        run: terraform init
        working-directory: terraform

      - name: Terraform Apply
        run: terraform apply -auto-approve
        working-directory: terraform

The OIDC authentication (id-token: write) removes the need for stored AWS access keys. GitHub Actions assumes an IAM role directly, and the temporary credentials expire after the workflow completes. This is far safer than storing long-lived access keys as repository secrets.
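The IAM role referenced in the workflow needs a trust policy that accepts GitHub's OIDC tokens. A sketch of that trust relationship (the account ID and repository name are placeholders):

```hcl
# Trust policy allowing GitHub Actions to assume the role via OIDC.
data "aws_iam_policy_document" "github_actions_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = ["arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"]
    }

    # Only tokens issued for this repository may assume the role.
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:mycompany/wordpress-infrastructure:*"]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "terraform_github_actions" {
  name               = "terraform-github-actions"
  assume_role_policy = data.aws_iam_policy_document.github_actions_trust.json
}
```

Tightening the sub condition to a specific branch (for example repo:mycompany/wordpress-infrastructure:ref:refs/heads/main) restricts applies to merges on main.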

Cost Estimation and Right-Sizing

Infrastructure as Code does not mean unlimited infrastructure. Every resource in your Terraform configuration has a cost, and those costs add up quickly on AWS. Understanding the cost profile of each component helps you make informed decisions about instance sizing, redundancy, and optimization.

Monthly Cost Breakdown

Here is a realistic cost estimate for a production WordPress deployment on AWS using the configuration described in this article. All prices are for the us-east-1 region as of early 2022.

Compute (EC2 Auto-Scaling Group)

  • 2x t3.large instances (baseline): ~$120/month
  • Reserved instances (1-year, no upfront): ~$80/month (33% savings)
  • Savings plans (1-year, compute): ~$85/month

Database (RDS MySQL)

  • db.r6g.large Multi-AZ primary: ~$350/month
  • db.r6g.large read replica: ~$175/month
  • 100 GB storage: ~$23/month
  • Reserved instances (1-year): ~$230/month for primary (34% savings)

Cache (ElastiCache Redis)

  • cache.r6g.large with replica: ~$290/month
  • Reserved nodes (1-year): ~$190/month (34% savings)

Storage (EFS)

  • 10 GB Standard: ~$3/month
  • 50 GB Infrequent Access: ~$0.80/month
  • Practically free for most WordPress sites

Networking

  • NAT Gateway (1x): ~$32/month + $0.045/GB processed
  • Application Load Balancer: ~$16/month + $0.008/LCU-hour
  • Data transfer: Varies widely, budget $20-100/month

CDN (CloudFront)

  • Depends entirely on traffic volume
  • First 1 TB/month: free under the CloudFront always-free tier, then $0.085/GB (first 10 TB pricing tier)
  • 10 million HTTP requests: ~$7.50
  • Typical WordPress site (100K monthly visitors): $10-30/month

Total estimated range: $700-1,200/month for production, $200-400/month for staging (smaller instances, no Multi-AZ, no replicas).

Right-Sizing Strategies

These costs assume a medium-traffic WordPress site. You can reduce them significantly with some targeted choices:

Start small and scale up. Use t3.medium instances instead of t3.large and monitor CPU credit consumption. If your application consistently uses less than 20% CPU, you are paying for capacity you do not need. The auto-scaling group handles traffic spikes, so your base instances can be smaller than you think.

Use Graviton (ARM) instances. The r6g, t4g, and m6g instance families use AWS Graviton processors and cost about 20% less than their x86 equivalents while delivering equal or better performance. WordPress runs perfectly on ARM since PHP has had ARM support for years. Swap t3.large for t4g.large to capture these savings on the application tier; the RDS and ElastiCache configurations in this article already use Graviton (r6g) instance classes.

Reserve what you can predict. Your baseline instances (minimum ASG capacity), primary database, and Redis cluster run 24/7. Reserved instances or Savings Plans cut those costs by 30-60% depending on the commitment term and payment option. Only reserve the baseline. Let on-demand pricing handle the variable auto-scaling capacity.

Skip the read replica until you need it. For most WordPress sites, the primary RDS instance handles both reads and writes without breaking a sweat. Add the replica when Performance Insights shows your read IOPS consistently exceeding 70% of the instance maximum, or when you need the additional failover protection.
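Decoupling the replica from the environment name makes it easy to toggle independently. A sketch using a dedicated boolean variable instead of the environment check shown earlier (the variable name is mine):

```hcl
variable "enable_read_replica" {
  description = "Create the RDS read replica (enable once read load justifies it)"
  type        = bool
  default     = false
}

# In modules/database/main.tf, gate the replica on the flag instead:
# resource "aws_db_instance" "wordpress_replica" {
#   count = var.enable_read_replica ? 1 : 0
#   ...
# }
```

Flipping the flag to true in production.tfvars then creates the replica on the next apply, and flipping it back destroys it.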

Integrate Infracost into your CI pipeline. Infracost is an open-source tool that estimates costs from Terraform plan output. Add it to your GitHub Actions workflow so every pull request shows the cost impact of infrastructure changes. Engineers see "This change adds $150/month" directly on the PR, which drives better cost-aware decisions.

# Add to your GitHub Actions workflow
- name: Infracost Breakdown
  uses: infracost/actions/setup@v2
  with:
    api-key: ${{ secrets.INFRACOST_API_KEY }}

- name: Generate Infracost JSON
  run: infracost breakdown --path=terraform --format=json --out-file=/tmp/infracost.json

- name: Post Infracost Comment
  run: |
    infracost comment github \
      --path=/tmp/infracost.json \
      --repo=$GITHUB_REPOSITORY \
      --pull-request=${{ github.event.pull_request.number }} \
      --github-token=${{ github.token }}

Complete Annotated Variables and Outputs

With all the modules defined, here are the top-level variables and outputs that tie the configuration together. These represent the knobs and dials you adjust for each environment.

# variables.tf - Top-level input variables

variable "aws_region" {
  description = "AWS region for all resources"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Deployment environment (staging, production)"
  type        = string
  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "Environment must be staging or production."
  }
}

variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "List of AZs to deploy across"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

variable "instance_type" {
  description = "EC2 instance type for WordPress servers"
  type        = string
  default     = "t4g.medium"
}

variable "ami_id" {
  description = "AMI ID for WordPress servers (built with Packer)"
  type        = string
}

variable "asg_min_size" {
  description = "Minimum number of instances in the auto-scaling group"
  type        = number
  default     = 2
}

variable "asg_max_size" {
  description = "Maximum number of instances in the auto-scaling group"
  type        = number
  default     = 10
}

variable "asg_desired_capacity" {
  description = "Starting number of instances in the auto-scaling group"
  type        = number
  default     = 2
}

variable "db_password" {
  description = "Master password for the RDS instance"
  type        = string
  sensitive   = true
}

variable "domain_name" {
  description = "Primary domain name for the WordPress site"
  type        = string
}

variable "certificate_arn" {
  description = "ARN of the ACM certificate for SSL"
  type        = string
}

variable "stripe_secret_key" {
  description = "Stripe secret key for payment processing"
  type        = string
  sensitive   = true
  default     = ""
}

variable "admin_email" {
  description = "WordPress admin email address"
  type        = string
  default     = "admin@example.com"
}
# outputs.tf - Top-level outputs

output "alb_dns_name" {
  description = "DNS name of the Application Load Balancer"
  value       = module.compute.alb_dns_name
}

output "cloudfront_domain" {
  description = "CloudFront distribution domain name"
  value       = module.cdn.cloudfront_domain
}

output "cloudfront_distribution_id" {
  description = "CloudFront distribution ID (for cache invalidation)"
  value       = module.cdn.cloudfront_distribution_id
}

output "rds_primary_endpoint" {
  description = "RDS primary instance endpoint"
  value       = module.database.primary_endpoint
}

output "rds_replica_endpoint" {
  description = "RDS read replica endpoint (null if not production)"
  value       = module.database.replica_endpoint
}

output "redis_endpoint" {
  description = "ElastiCache Redis primary endpoint"
  value       = module.cache.redis_endpoint
}

output "efs_id" {
  description = "EFS filesystem ID"
  value       = module.storage.efs_id
}

output "vpc_id" {
  description = "VPC ID"
  value       = module.networking.vpc_id
}

output "estimated_monthly_cost" {
  description = "Rough monthly cost estimate"
  value       = "Run 'infracost breakdown --path=.' for detailed cost estimate"
}

Environment-Specific Configurations

Create separate .tfvars files for each environment. The staging configuration uses smaller instances, skips Multi-AZ, and disables read replicas:

# staging.tfvars
environment          = "staging"
instance_type        = "t4g.small"
asg_min_size         = 1
asg_max_size         = 2
asg_desired_capacity = 1
domain_name          = "staging.yourdomain.com"
certificate_arn      = "arn:aws:acm:us-east-1:123456789012:certificate/staging-cert-id"
admin_email          = "admin@yourdomain.com"

# production.tfvars
environment          = "production"
instance_type        = "t4g.large"
asg_min_size         = 2
asg_max_size         = 10
asg_desired_capacity = 2
domain_name          = "yourdomain.com"
certificate_arn      = "arn:aws:acm:us-east-1:123456789012:certificate/prod-cert-id"
admin_email          = "admin@yourdomain.com"

Deploy to staging with terraform apply -var-file=staging.tfvars and production with terraform apply -var-file=production.tfvars. Same code, different parameters, consistent architecture across environments.

Operational Considerations and Day-Two Operations

Provisioning infrastructure is only the beginning. Running WordPress in production on AWS requires ongoing operational attention. Terraform helps with the infrastructure layer, but you need additional tooling and processes for application-level concerns.

Deployments and Rolling Updates

WordPress deployments in an auto-scaled environment work differently than deploying to a single server. You cannot just SSH in and run git pull. Two approaches work well:

AMI-based deployments. Build a new AMI with the updated WordPress code using Packer. Update the AMI ID in your Terraform variables. Run terraform apply. The auto-scaling group performs a rolling replacement, launching new instances with the updated AMI and draining old instances. This is the most reliable approach because each instance boots from a known-good image.

CodeDeploy-based deployments. Keep the AMI stable and use AWS CodeDeploy to push application code to running instances. This is faster because you do not need to build a new AMI for every code change, but it introduces configuration drift between the AMI and the running state of instances. If an instance is replaced by auto-scaling, it launches with the old code until CodeDeploy catches up.

For WordPress specifically, AMI-based deployments are usually the better choice. WordPress code changes are infrequent (core updates, plugin updates, theme changes), and each change should be tested before deployment anyway. The extra time to build an AMI is a worthwhile tradeoff for the consistency guarantee.
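A Packer template for this flow might look like the sketch below. The plugin version, source AMI filter, instance type, and provisioning script name are illustrative assumptions, not part of the article's configuration:

```hcl
# packer/wordpress.pkr.hcl -- hypothetical AMI build for the AMI-based
# deployment flow. Adjust region, base AMI, and the provisioning script.
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.2"
    }
  }
}

source "amazon-ebs" "wordpress" {
  ami_name      = "wordpress-{{timestamp}}"
  instance_type = "t4g.small"
  region        = "us-east-1"

  # Track the latest Amazon Linux 2023 ARM image as the base
  source_ami_filter {
    filters = {
      name                = "al2023-ami-*-arm64"
      virtualization-type = "hvm"
    }
    owners      = ["amazon"]
    most_recent = true
  }

  ssh_username = "ec2-user"
}

build {
  sources = ["source.amazon-ebs.wordpress"]

  # Installs the web server, PHP, and the tested WordPress release
  provisioner "shell" {
    script = "scripts/install-wordpress.sh"
  }
}
```

After `packer build` reports the new AMI ID, feed it into your Terraform variables and apply to trigger the rolling replacement.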

Database Migrations

WordPress handles its own database schema migrations through the wp_upgrade() function. When you update WordPress core, the first admin page load detects the version mismatch and runs the necessary database updates. In a multi-server environment, this means the first server to run the update modifies the shared database. Subsequent servers detect that the migration has already completed and skip it.

This usually works without issues, but there is a race condition risk if multiple servers boot simultaneously with a new WordPress version. To eliminate this risk, include a pre-deployment step that runs wp core update-db via WP-CLI on a single instance before updating the auto-scaling group.
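One way to run that pre-deployment step without SSH is SSM Run Command, assuming your instances are registered with SSM. The tag key/value and paths here are hypothetical; `--max-concurrency 1` runs instances one at a time, and the upgrade is a no-op on every instance after the first:

```shell
# Run the schema upgrade sequentially across tagged instances before
# rolling the ASG. Tag name, web user, and WordPress path are placeholders.
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Role,Values=wordpress" \
  --max-concurrency "1" \
  --max-errors "0" \
  --parameters 'commands=["sudo -u www-data wp core update-db --path=/var/www/html"]'
```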

Monitoring and Alerting

Add CloudWatch alarms for the metrics that actually matter for WordPress performance:

# monitoring.tf (add to root module or create a monitoring module)

# RDS: High CPU indicates slow queries or undersized instance
resource "aws_cloudwatch_metric_alarm" "rds_cpu" {
  alarm_name          = "${var.environment}-rds-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 3
  metric_name         = "CPUUtilization"
  namespace           = "AWS/RDS"
  period              = 300
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "RDS CPU exceeds 80% for 15 minutes"
  alarm_actions       = [aws_sns_topic.alerts.arn]

  dimensions = {
    DBInstanceIdentifier = module.database.primary_instance_id
  }
}

# RDS: Free storage running low
resource "aws_cloudwatch_metric_alarm" "rds_storage" {
  alarm_name          = "${var.environment}-rds-low-storage"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 1
  metric_name         = "FreeStorageSpace"
  namespace           = "AWS/RDS"
  period              = 300
  statistic           = "Average"
  threshold           = 5368709120  # 5 GB in bytes
  alarm_description   = "RDS free storage below 5 GB"
  alarm_actions       = [aws_sns_topic.alerts.arn]

  dimensions = {
    DBInstanceIdentifier = module.database.primary_instance_id
  }
}

# ElastiCache: Memory usage approaching limit
resource "aws_cloudwatch_metric_alarm" "redis_memory" {
  alarm_name          = "${var.environment}-redis-high-memory"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "DatabaseMemoryUsagePercentage"
  namespace           = "AWS/ElastiCache"
  period              = 300
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "Redis memory usage exceeds 80%"
  alarm_actions       = [aws_sns_topic.alerts.arn]

  dimensions = {
    CacheClusterId = "${var.environment}-wp-redis-001"
  }
}

# ALB: High 5xx error rate indicates application problems
resource "aws_cloudwatch_metric_alarm" "alb_5xx" {
  alarm_name          = "${var.environment}-alb-5xx-errors"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "HTTPCode_Target_5XX_Count"
  namespace           = "AWS/ApplicationELB"
  period              = 300
  statistic           = "Sum"
  threshold           = 50
  alarm_description   = "More than 50 5xx errors in 5 minutes"
  alarm_actions       = [aws_sns_topic.alerts.arn]

  dimensions = {
    LoadBalancer = module.compute.alb_arn_suffix
  }
}

# SNS topic for alarm notifications
resource "aws_sns_topic" "alerts" {
  name = "${var.environment}-wordpress-alerts"
}

resource "aws_sns_topic_subscription" "email" {
  topic_arn = aws_sns_topic.alerts.arn
  protocol  = "email"
  endpoint  = var.admin_email
}

These four alarms catch the most common failure modes in WordPress AWS deployments. High RDS CPU usually means a rogue plugin is running unoptimized queries. Low storage means your database is growing faster than expected, often from logging plugins or accumulated post revisions. High Redis memory suggests your eviction policy needs tuning or your instance size needs increasing. And a spike in 5xx errors on the ALB signals application crashes that need immediate attention.

Backup and Disaster Recovery

The Terraform configuration already includes several backup mechanisms: RDS automated backups with 14-day retention, EFS automatic backups through AWS Backup, and S3 versioning on the Terraform state bucket. For a complete disaster recovery plan, add cross-region replication for your RDS backups and S3 assets:

# Cross-region RDS backup replication
resource "aws_db_instance_automated_backups_replication" "wordpress" {
  count = var.environment == "production" ? 1 : 0

  source_db_instance_arn = aws_db_instance.wordpress.arn
  retention_period       = 7

  provider = aws.dr_region  # Requires a second provider for the DR region
}
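The aws.dr_region alias referenced above must be declared alongside the default provider. A minimal sketch, with us-west-2 as an assumed DR region:

```hcl
# providers.tf -- default provider plus an alias for the DR region.
# The region choices are assumptions; pick a DR region far from primary.
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "dr_region"
  region = "us-west-2"
}
```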

Test your recovery process regularly. A backup you have never restored from is a backup you cannot trust. Schedule quarterly recovery drills where you restore the database to a new RDS instance, mount the EFS filesystem, and verify the site functions correctly.
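A drill can start from the most recent automated snapshot. The instance identifiers below are placeholders:

```shell
# Find the newest automated snapshot of the production database
SNAPSHOT=$(aws rds describe-db-snapshots \
  --db-instance-identifier production-wordpress \
  --snapshot-type automated \
  --query 'max_by(DBSnapshots, &SnapshotCreateTime).DBSnapshotIdentifier' \
  --output text)

# Restore it into a throwaway instance for verification, then delete it
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier drill-wordpress \
  --db-snapshot-identifier "$SNAPSHOT"
```

Record how long the restore takes; that number is your realistic recovery time objective, not whatever the runbook claims.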

Common Pitfalls and How to Avoid Them

After deploying WordPress on AWS with Terraform across dozens of projects, certain mistakes appear repeatedly. Knowing these in advance saves hours of debugging.

WordPress cron and auto-scaling do not mix well. WordPress uses a pseudo-cron system triggered by page visits. In an auto-scaled environment, cron events can fire multiple times (once per instance) or not at all (if instances are scaling down). Disable WP-Cron in wp-config.php with define('DISABLE_WP_CRON', true) and replace it with a real cron job using EventBridge (CloudWatch Events) that hits wp-cron.php on a single instance through the ALB.
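One way to wire that up in Terraform is an EventBridge schedule invoking an API destination. This sketch is an assumption-heavy illustration: the resource names are invented, the IAM role (eventbridge_invoke, with events:InvokeApiDestination permission) is presumed to exist elsewhere, and because EventBridge connections require an auth block even though wp-cron.php needs none, a harmless dummy header stands in:

```hcl
# wp-cron.tf -- replace page-visit-triggered WP-Cron with a real schedule.
resource "aws_cloudwatch_event_connection" "wp_cron" {
  name               = "${var.environment}-wp-cron"
  authorization_type = "API_KEY"

  # wp-cron.php needs no auth, but a connection requires one; send a
  # harmless header instead of a real credential.
  auth_parameters {
    api_key {
      key   = "X-Requested-By"
      value = "eventbridge"
    }
  }
}

resource "aws_cloudwatch_event_api_destination" "wp_cron" {
  name                             = "${var.environment}-wp-cron"
  invocation_endpoint              = "https://${var.domain_name}/wp-cron.php?doing_wp_cron"
  http_method                      = "GET"
  connection_arn                   = aws_cloudwatch_event_connection.wp_cron.arn
  invocation_rate_limit_per_second = 1
}

resource "aws_cloudwatch_event_rule" "wp_cron" {
  name                = "${var.environment}-wp-cron"
  schedule_expression = "rate(5 minutes)"
}

resource "aws_cloudwatch_event_target" "wp_cron" {
  rule     = aws_cloudwatch_event_rule.wp_cron.name
  arn      = aws_cloudwatch_event_api_destination.wp_cron.arn
  role_arn = aws_iam_role.eventbridge_invoke.arn  # assumed role defined elsewhere
}
```

Because the request goes through the ALB, any healthy instance can handle a given invocation, but each scheduled event fires exactly once.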

ALB health checks can trigger the WordPress installation wizard. If your health check hits a page that requires database access before WordPress is fully configured, the health check "succeeds" with a 200 status but actually returns the installation page. Use /wp-login.php as the health check path and accept both 200 and 302 status codes. The login page always exists and requires a working database connection.
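In the target group, that looks like the following sketch (resource names, port, and thresholds are assumptions):

```hcl
resource "aws_lb_target_group" "wordpress" {
  name     = "${var.environment}-wordpress"
  port     = 80
  protocol = "HTTP"
  vpc_id   = module.networking.vpc_id

  health_check {
    path                = "/wp-login.php"
    matcher             = "200,302"  # 302 covers redirects to HTTPS or a custom login URL
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 30
    timeout             = 5
  }
}
```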

Security group changes can cause downtime. Terraform sometimes needs to destroy and recreate security groups when their configuration changes significantly. If instances reference the old security group, they lose network access during the transition. The create_before_destroy lifecycle rule on security groups prevents this, but you must also use name_prefix instead of name to avoid naming conflicts between the old and new groups.
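The pattern looks like this (resource name and ingress rules assumed):

```hcl
resource "aws_security_group" "web" {
  # name_prefix lets Terraform create the replacement group under a fresh
  # generated name before destroying the old one, avoiding a name collision.
  name_prefix = "${var.environment}-web-"
  vpc_id      = module.networking.vpc_id

  lifecycle {
    create_before_destroy = true
  }
}
```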

EFS mount can block instance boot. If the EFS mount target is unavailable when an instance boots, the mount command hangs indefinitely unless you set a timeout. Use the _netdev mount option in fstab, which tells the OS to wait for network availability before mounting. Also add noresvport to handle NFS port reuse during reconnection after brief network interruptions.
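A corresponding /etc/fstab entry, using the NFS client options AWS documents for EFS; the filesystem ID, region, and mount point below are placeholders:

```shell
# Append the EFS mount to fstab. fs-12345678 and the mount path are placeholders.
echo "fs-12345678.efs.us-east-1.amazonaws.com:/ /var/www/html/wp-content/uploads nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,_netdev 0 0" | sudo tee -a /etc/fstab
```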

Terraform state can contain sensitive data. Your state file includes every attribute of every resource, including database passwords if you pass them as variables. Always encrypt your state bucket with KMS, restrict access to the S3 bucket using IAM policies, and never share state files over unencrypted channels. (HashiCorp Terraform relies on the backend for encryption at rest; if you use the OpenTofu fork, versions 1.7 and later also offer client-side state encryption.)
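A backend configuration with KMS encryption enabled might look like the following; the bucket, key path, KMS alias, and lock table names are placeholders:

```hcl
terraform {
  backend "s3" {
    bucket         = "yourcompany-terraform-state"  # placeholder bucket
    key            = "wordpress/production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "alias/terraform-state"        # placeholder KMS alias
    dynamodb_table = "terraform-locks"              # placeholder lock table
  }
}
```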

Putting It All Together

The architecture described in this article gives you a WordPress deployment that handles traffic spikes through auto-scaling, survives Availability Zone failures through Multi-AZ RDS and distributed subnets, accelerates global performance through CloudFront, and stores every infrastructure decision in version-controlled code that your team can review, discuss, and improve.

To deploy this from scratch, follow these steps:

  1. Bootstrap the state backend. Run the bootstrap Terraform configuration to create the S3 bucket and DynamoDB table for remote state.
  2. Build the WordPress AMI. Run Packer to create an AMI with your OS, web server, PHP, and WordPress pre-installed.
  3. Set your variables. Create a production.tfvars file with your AMI ID, domain name, certificate ARN, and other environment-specific values. Store sensitive values in environment variables or a secrets manager.
  4. Initialize Terraform. Run terraform init to download providers and configure the remote backend.
  5. Review the plan. Run terraform plan -var-file=production.tfvars and carefully review every resource that will be created. Pay special attention to the security groups, IAM policies, and public accessibility settings.
  6. Apply. Run terraform apply -var-file=production.tfvars and confirm. The first apply takes 15-20 minutes, mostly waiting for the RDS instance to provision.
  7. Configure DNS. If you are not using Route 53 (or if your domain is registered elsewhere), create a CNAME record pointing your domain to the CloudFront distribution domain name.
  8. Complete WordPress setup. Visit your domain and complete the WordPress installation wizard. After that, install and activate the Redis Object Cache plugin, configure HyperDB if using a read replica, and disable WP-Cron in favor of EventBridge.
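Steps 4 through 6 above condense to three commands. Writing the plan to a file and applying that file guarantees you apply exactly what you reviewed:

```shell
terraform init
terraform plan -var-file=production.tfvars -out=production.plan
terraform apply production.plan  # applying a saved plan skips the confirmation prompt
```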

Every subsequent infrastructure change follows the same pull-request-driven workflow: modify Terraform files, open a PR, review the plan, merge, and let CI/CD apply the changes. Your WordPress infrastructure becomes as manageable and auditable as your application code.

The Terraform configurations in this article are designed as a starting point. Your specific requirements will drive modifications. Maybe you need a WAF to block malicious traffic patterns. Maybe you want to use Aurora Serverless instead of standard RDS to handle unpredictable traffic without pre-provisioning capacity. Maybe you need a VPN connection to an on-premises network for a hybrid deployment. The modular structure makes these additions straightforward: add a new module, wire it into the root module, and apply.

Infrastructure as Code is not just about automation. It is about bringing software engineering discipline to infrastructure management. Version control, code review, automated testing, and continuous deployment are practices that have made application development more reliable and more collaborative over the past two decades. Your infrastructure deserves the same rigor. Terraform gives you the tools. The architecture patterns in this article give you the blueprint. What you build with them is up to you.


Marcus Chen

Staff engineer with 12 years in WordPress infrastructure. Previously at Automattic and a large media company. Writes about hosting platforms, caching, and deployment pipelines.