DevOps & Deployment

Zero-Downtime Deployment Strategies for WordPress: Blue-Green, Canary, and Rolling Updates in Practice

Tom Bradley
39 min read

Why Downtime During WordPress Deployments Is Unacceptable

Every second your WordPress site is offline costs money. For WooCommerce stores processing orders around the clock, a five-minute maintenance window at the wrong time can mean thousands in lost revenue. For media sites riding a traffic spike, a deployment-induced outage destroys the moment. And for SaaS platforms built on WordPress, downtime erodes the trust you spent months building.

The traditional WordPress deployment method is alarmingly primitive: SSH into the server, pull the latest code, maybe run a quick search-replace, and hope nothing breaks. If something does break, you scramble to fix it while users stare at a white screen or a maintenance page. This approach worked when WordPress powered simple blogs. It does not work when WordPress powers businesses.

Zero-downtime deployment eliminates the gap between “old version running” and “new version running.” Users never see a maintenance page. Requests are never dropped. The transition from one release to the next is invisible to anyone visiting your site.

This article covers three primary strategies for achieving zero-downtime WordPress deployments: blue-green deployments, canary releases, and rolling updates. We will walk through real configurations for Nginx, HAProxy, Deployer PHP, Kubernetes, and GitHub Actions. We will address the hard problems that WordPress-specific concerns introduce, including database migrations, shared uploads directories, WooCommerce session persistence, and instant rollback procedures.

None of this is theoretical. Every configuration example in this article has been tested in production environments handling real traffic.

Symlink-Based Atomic Deployments with Deployer PHP

Before discussing blue-green or canary strategies, you need a deployment mechanism that can switch between releases instantly. Symlink-based atomic deployment is the foundation that makes everything else possible.

The concept is straightforward. Each deployment creates a new directory containing the full application code. A symlink called current points to the active release directory. Switching releases means updating where the symlink points. This operation is atomic on Linux filesystems, meaning there is no moment where the symlink points to nothing.
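The switch can be sketched in plain shell. One subtlety: a bare ln -sfn is not atomic (it unlinks the old link, then creates the new one), so tools in this space stage the new link under a temporary name and rename it over current, because rename() is a single atomic operation. The paths below are illustrative:

```shell
# Build two fake releases and switch between them atomically
mkdir -p /tmp/atomic-demo/releases/42 /tmp/atomic-demo/releases/43
cd /tmp/atomic-demo

ln -sfn releases/42 current        # initial state: current -> releases/42

ln -sfn releases/43 current.tmp    # stage the new link under a temp name
mv -T current.tmp current          # single rename(): no moment without a target

readlink current                   # prints: releases/43
```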

Deployer PHP (deployer.org) is the most popular tool for this pattern in PHP projects. Here is a practical Deployer configuration for WordPress:

<?php
// deploy.php

namespace Deployer;

require 'recipe/common.php';

set('application', 'wordpress-site');
set('repository', 'git@github.com:yourorg/wordpress-site.git');
set('keep_releases', 5);

// Shared files and directories persist across releases
set('shared_files', [
    'wp-config.php',
    '.htaccess',
]);

set('shared_dirs', [
    'wp-content/uploads',
    'wp-content/cache',
    'wp-content/wflogs',
]);

// Writable directories
set('writable_dirs', [
    'wp-content/uploads',
    'wp-content/cache',
]);

host('production')
    ->set('hostname', 'prod-server.example.com')
    ->set('remote_user', 'deploy')
    ->set('deploy_path', '/var/www/wordpress-site');

host('staging')
    ->set('hostname', 'staging.example.com')
    ->set('remote_user', 'deploy')
    ->set('deploy_path', '/var/www/staging-wordpress');

// Build assets before uploading
task('build:assets', function () {
    runLocally('cd wp-content/themes/your-theme && npm ci && npm run build');
});

// Upload compiled assets
task('upload:assets', function () {
    upload('wp-content/themes/your-theme/dist/', '{{release_path}}/wp-content/themes/your-theme/dist/');
});

// Flush object cache after deploy
task('cache:flush', function () {
    run('cd {{release_path}} && wp cache flush --allow-root');
});

// Flush OPcache (assumes opcache_key is defined, e.g. set('opcache_key', ...))
task('opcache:reset', function () {
    run('curl -s https://{{hostname}}/opcache-reset.php?key={{opcache_key}} || true');
});

// Full deployment pipeline
after('deploy:update_code', 'build:assets');
after('deploy:update_code', 'upload:assets');
after('deploy:symlink', 'cache:flush');
after('deploy:symlink', 'opcache:reset');

after('deploy:failed', 'deploy:unlock');

The directory structure on your server looks like this after several deployments:

/var/www/wordpress-site/
├── current -> /var/www/wordpress-site/releases/42
├── releases/
│   ├── 38/
│   ├── 39/
│   ├── 40/
│   ├── 41/
│   └── 42/    <-- active release
└── shared/
    ├── wp-config.php
    ├── .htaccess
    └── wp-content/
        ├── uploads/
        ├── cache/
        └── wflogs/

When Deployer runs, it clones your repository into releases/43/, creates symlinks from that release into the shared/ directory for uploads and config files, runs any build tasks, and then atomically switches the current symlink to point at releases/43/.
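With keep_releases set to 5, Deployer also prunes the oldest release directories after each successful deploy. The pruning amounts to the following (an illustrative sketch, not Deployer's actual code):

```shell
# Simulate seven numbered releases, then keep only the newest five
mkdir -p /tmp/prune-demo/releases
cd /tmp/prune-demo
for i in $(seq 38 44); do mkdir -p "releases/$i"; done

# Sort numerically, skip the last 5 (the keepers), delete the rest
ls releases | sort -n | head -n -5 | while read -r r; do
    rm -rf "releases/$r"
done

ls releases   # prints 40 through 44, one per line
```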

Your Nginx configuration points its root directive at the current symlink:

server {
    listen 80;
    server_name example.com;
    root /var/www/wordpress-site/current;
    
    index index.php;
    
    location / {
        try_files $uri $uri/ /index.php?$args;
    }
    
    location ~ \.php$ {
        fastcgi_pass unix:/run/php/php8.1-fpm.sock;
        include fastcgi_params;
        
        # Critical: resolve symlinks so OPcache keys on the real path
        fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
        fastcgi_param DOCUMENT_ROOT $realpath_root;
    }
}

Pay close attention to the $realpath_root variable in the PHP location block. This resolves the symlink to the actual filesystem path, which prevents OPcache from serving stale bytecode after a release switch. Without this, PHP-FPM may continue serving cached opcodes from the previous release directory even after the symlink has changed.
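$realpath_root is simply the symlink-resolved document root, the same value realpath prints on the command line (paths below are illustrative):

```shell
# current -> releases/42, mimicking the deploy layout
BASE=$(mktemp -d)
mkdir -p "$BASE/releases/42"
ln -sfn "$BASE/releases/42" "$BASE/current"

# $document_root would be the symlink path; $realpath_root is this resolved one
realpath "$BASE/current"
```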

An alternative approach to the OPcache problem is to reset OPcache after each deployment. You can do this with a small PHP file that calls opcache_reset(), triggered via a curl request in your deployment pipeline. Some teams do both: use $realpath_root and reset OPcache.

OPcache Reset Script

<?php
// opcache-reset.php - place in web root, protect with a secret key
$secret = getenv('OPCACHE_RESET_KEY');
if (!is_string($secret) || $secret === '' || !isset($_GET['key']) || !hash_equals($secret, (string) $_GET['key'])) {
    http_response_code(403);
    exit('Forbidden');
}

if (function_exists('opcache_reset')) {
    opcache_reset();
    echo 'OPcache cleared';
} else {
    echo 'OPcache not available';
}

Blue-Green Deployment: Dual DocumentRoot with Nginx Traffic Switching

Blue-green deployment takes the atomic switching concept further by maintaining two complete, independent environments. One environment (let’s call it “blue”) serves live traffic. The other (“green”) sits idle or serves as a staging target. When you deploy, you deploy to the idle environment, verify it works, and then switch traffic from blue to green.

The key difference from simple symlink deployment is that blue-green gives you a full, running environment to test against before any traffic hits it. With symlink-based deployment, the new release goes live the instant the symlink changes. With blue-green, you can hit the green environment with test requests, run smoke tests, check database connectivity, and verify plugin compatibility before switching a single real user over.

Here is how to set this up with Nginx using two upstream blocks:

# /etc/nginx/conf.d/upstream.conf

upstream blue_backend {
    server 127.0.0.1:8081;
}

upstream green_backend {
    server 127.0.0.1:8082;
}

Each backend runs its own PHP-FPM pool with a separate document root:

# /etc/php/8.1/fpm/pool.d/blue.conf
[blue]
user = www-data
group = www-data
listen = 127.0.0.1:8081
pm = dynamic
pm.max_children = 20
pm.start_servers = 5
pm.min_spare_servers = 3
pm.max_spare_servers = 10

php_admin_value[open_basedir] = /var/www/blue:/tmp
env[WP_ENV] = blue

# /etc/php/8.1/fpm/pool.d/green.conf
[green]
user = www-data
group = www-data
listen = 127.0.0.1:8082
pm = dynamic
pm.max_children = 20
pm.start_servers = 5
pm.min_spare_servers = 3
pm.max_spare_servers = 10

php_admin_value[open_basedir] = /var/www/green:/tmp
env[WP_ENV] = green

The Nginx server block uses a variable to determine which upstream receives traffic:

# /etc/nginx/conf.d/upstream-map.conf
# Kept in its own file so the switch script can rewrite just this line

map $host $active_backend {
    default blue_backend;
}

# /etc/nginx/sites-available/wordpress.conf

server {
    listen 443 ssl http2;
    server_name example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    # Dynamic root based on active environment
    set $doc_root /var/www/blue;
    if ($active_backend = green_backend) {
        set $doc_root /var/www/green;
    }
    root $doc_root;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    location ~ \.php$ {
        fastcgi_pass $active_backend;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }

    # Static assets with long cache
    location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff2)$ {
        expires 30d;
        add_header Cache-Control "public, immutable";
    }
}

Switching traffic is done by updating the map directive and reloading Nginx:

#!/bin/bash
# switch-environment.sh

CURRENT=$(grep -oP 'default \K\w+_backend' /etc/nginx/conf.d/upstream-map.conf)

if [ "$CURRENT" = "blue_backend" ]; then
    NEW="green_backend"
else
    NEW="blue_backend"
fi

sed -i "s/default ${CURRENT}/default ${NEW}/" /etc/nginx/conf.d/upstream-map.conf

# Test config before reloading
nginx -t && systemctl reload nginx

echo "Switched from $CURRENT to $NEW"

The nginx -t command validates the configuration before reloading. If the configuration is invalid, the reload never happens, and the previous environment continues serving traffic. The reload itself is graceful: Nginx finishes processing in-flight requests with old worker processes while new workers pick up the updated configuration.
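The grep/sed pair in the switch script only toggles the default entry in the map file. You can watch the mechanics on a scratch copy:

```shell
# Recreate the one-line map file and toggle it, as switch-environment.sh does
MAP=$(mktemp)
printf 'map $host $active_backend {\n    default blue_backend;\n}\n' > "$MAP"

CURRENT=$(grep -oP 'default \K\w+_backend' "$MAP")
if [ "$CURRENT" = "blue_backend" ]; then NEW="green_backend"; else NEW="blue_backend"; fi
sed -i "s/default ${CURRENT}/default ${NEW}/" "$MAP"

grep default "$MAP"   # prints:     default green_backend;
```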

Smoke Testing the Idle Environment

Before switching, you should run automated tests against the idle environment. Both environments run simultaneously, but ports 8081 and 8082 are PHP-FPM FastCGI listeners, which curl cannot speak to directly. Expose each environment over HTTP through an internal-only Nginx server block on a test port (say, 8181 for blue and 8182 for green) and aim the smoke tests there:

#!/bin/bash
# smoke-test.sh

IDLE_PORT=$1  # 8181 (blue) or 8182 (green)

# Basic health check
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:${IDLE_PORT}/)
if [ "$HTTP_CODE" != "200" ]; then
    echo "FAIL: Homepage returned $HTTP_CODE"
    exit 1
fi

# Check that WordPress loaded correctly
BODY=$(curl -s http://127.0.0.1:${IDLE_PORT}/)
if ! echo "$BODY" | grep -q "wp-content"; then
    echo "FAIL: Response does not look like WordPress"
    exit 1
fi

# Check REST API
API_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:${IDLE_PORT}/wp-json/wp/v2/posts?per_page=1)
if [ "$API_CODE" != "200" ]; then
    echo "FAIL: REST API returned $API_CODE"
    exit 1
fi

# Check WooCommerce endpoint if applicable
WC_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:${IDLE_PORT}/shop/)
if [ "$WC_CODE" != "200" ]; then
    echo "WARN: WooCommerce shop returned $WC_CODE"
fi

echo "All smoke tests passed"
exit 0
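The test ports have to exist somewhere. Since 8081 and 8082 are FastCGI listeners, a sketch of internal-only Nginx server blocks that give each environment an HTTP face for smoke testing (the 8181/8182 ports are illustrative choices, not a convention):

```nginx
# /etc/nginx/conf.d/testing.conf (illustrative)
# Bound to localhost only, never exposed to the public internet

server {
    listen 127.0.0.1:8181;   # blue test endpoint
    root /var/www/blue;
    index index.php;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }
    location ~ \.php$ {
        fastcgi_pass blue_backend;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}

server {
    listen 127.0.0.1:8182;   # green test endpoint
    root /var/www/green;
    index index.php;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }
    location ~ \.php$ {
        fastcgi_pass green_backend;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
```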

Handling Database Migrations During Zero-Downtime Deploys

The database is where zero-downtime deployment gets genuinely difficult for WordPress. Code can be swapped atomically. Databases cannot.

WordPress itself handles schema changes through its dbDelta() function during updates. But custom plugins, themes, and WooCommerce extensions often need their own migrations. The fundamental challenge: how do you run a migration that changes the database schema while the old code is still handling requests?

The answer is backward-compatible migrations. Every database change must be deployable in a way that does not break the currently running version of the code.

The Expand-Contract Pattern

This pattern splits destructive migrations into two or three phases:

Phase 1: Expand. Add new columns, tables, or indexes. Do not remove or rename anything. The old code ignores the new columns, so it continues working. Deploy the new code that uses the new columns.

Phase 2: Migrate Data. If you need to move data from old columns to new ones, do it now. Both old and new code can run against this schema.

Phase 3: Contract. In a future release, after all servers run the new code, remove the old columns. This is the only step that could break old code, but no old code is running anymore.

Here is a practical example. Suppose you need to rename a column in a custom table from user_email to subscriber_email:

// Release 1: Expand - add the new column, keep the old one
function wpkite_migration_001_expand() {
    global $wpdb;
    $table = $wpdb->prefix . 'wpkite_subscribers';
    
    // Add new column
    $wpdb->query("ALTER TABLE {$table} ADD COLUMN subscriber_email VARCHAR(255) AFTER user_email");
    
    // Copy data
    $wpdb->query("UPDATE {$table} SET subscriber_email = user_email WHERE subscriber_email IS NULL");
    
    // Add index on new column
    $wpdb->query("CREATE INDEX idx_subscriber_email ON {$table} (subscriber_email)");
}

// Update code to write to BOTH columns
function wpkite_add_subscriber($email, $source) {
    global $wpdb;
    $wpdb->insert(
        $wpdb->prefix . 'wpkite_subscribers',
        [
            'user_email' => $email,        // old column (for backward compat)
            'subscriber_email' => $email,  // new column
            'source' => $source,
            'status' => 'active',
        ]
    );
}

// Release 2: Contract - remove old column (deployed weeks later,
// once no code path reads or writes user_email anymore)
function wpkite_migration_002_contract() {
    global $wpdb;
    $table = $wpdb->prefix . 'wpkite_subscribers';
    
    // Drop the old column's index first; dropping the column would remove
    // the index implicitly and make a later DROP INDEX fail
    $wpdb->query("DROP INDEX idx_user_email ON {$table}");
    $wpdb->query("ALTER TABLE {$table} DROP COLUMN user_email");
}

Running Migrations Safely

Never run migrations as part of the symlink switch. Run them before the code deployment, during the build phase. If a migration fails, the deployment stops, and the old code continues serving traffic unchanged.

// In Deployer: run migrations before switching symlink
task('database:migrate', function () {
    run('cd {{release_path}} && wp eval-file scripts/run-migrations.php');
});

before('deploy:symlink', 'database:migrate');

For large tables (millions of rows), ALTER TABLE operations can lock the table for minutes. Use tools like pt-online-schema-change or gh-ost to perform online schema changes without locking:

# Using pt-online-schema-change for lock-free ALTER TABLE
pt-online-schema-change \
    --alter "ADD COLUMN subscriber_email VARCHAR(255) AFTER user_email" \
    --execute \
    --max-load Threads_running=25 \
    --critical-load Threads_running=50 \
    D=wordpress,t=wp_wpkite_subscribers,u=admin,p=secret

This tool creates a shadow copy of the table, applies the change to the copy, migrates rows in small batches using triggers, and then performs an atomic rename. The table remains fully available for reads and writes throughout the process.

Shared Persistent Storage: wp-content/uploads Across Releases

WordPress stores uploaded media in wp-content/uploads/. This directory must persist across deployments. If each release gets its own uploads directory, users would lose access to all previously uploaded images and documents.

Deployer handles this through its shared_dirs configuration, which creates a symlink from each release’s wp-content/uploads to a single shared directory. But this introduces its own challenges in multi-server environments.

Single Server: Symlink Approach

On a single server, the shared directory approach is simple and reliable:

# Directory structure
/var/www/site/shared/wp-content/uploads/  # Actual files live here
/var/www/site/releases/42/wp-content/uploads -> /var/www/site/shared/wp-content/uploads/

Multi-Server: Shared Filesystem

When running WordPress across multiple servers (for blue-green or rolling updates), all servers need access to the same uploads. The most common solutions:

NFS Mount:

# /etc/fstab on each web server
nfs-server:/exports/wp-uploads  /var/www/site/shared/wp-content/uploads  nfs  defaults,noatime,_netdev  0  0

NFS works but adds a network dependency. If the NFS server goes down, your entire site breaks. Use NFSv4 with proper caching to reduce latency:

# Mount with caching options
nfs-server:/exports/wp-uploads  /var/www/site/shared/wp-content/uploads  nfs4  rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev  0  0

Object Storage with Plugin:

A more scalable solution is offloading media to S3 or a compatible object storage service. Plugins like WP Offload Media or custom implementations using the WordPress media filters can redirect uploads to S3:

// wp-config.php
define('AS3CF_SETTINGS', serialize([
    'provider' => 'aws',
    'access-key-id' => getenv('AWS_ACCESS_KEY_ID'),
    'secret-access-key' => getenv('AWS_SECRET_ACCESS_KEY'),
    'bucket' => 'my-wp-uploads',
    'region' => 'us-east-1',
    'copy-to-s3' => true,
    'serve-from-s3' => true,
    'remove-local-file' => true,
]));

With S3-backed media, the uploads directory becomes stateless. Each server can operate independently without shared filesystem concerns. This is the recommended approach for any multi-server WordPress deployment.

GlusterFS for Self-Hosted Clusters:

For teams that prefer self-hosted solutions without cloud vendor lock-in, GlusterFS provides a replicated filesystem:

# On storage nodes
gluster volume create wp-uploads replica 2 \
    storage1:/data/brick1/wp-uploads \
    storage2:/data/brick1/wp-uploads

gluster volume start wp-uploads

# On web servers
mount -t glusterfs storage1:/wp-uploads /var/www/site/shared/wp-content/uploads

Canary Releases with Load Balancer Percentage Routing

Canary deployment sends a small percentage of traffic to the new version while the majority continues hitting the old version. If the new version behaves well (low error rate, acceptable response times), you gradually increase the percentage until all traffic goes to the new version.

This strategy is less common in traditional WordPress hosting but becomes practical when you run WordPress behind a load balancer. HAProxy is particularly well-suited for canary routing because of its flexible backend weighting system.

HAProxy Configuration for Canary Routing

# /etc/haproxy/haproxy.cfg

global
    log /dev/log local0
    maxconn 4096
    stats socket /var/run/haproxy.sock mode 600 level admin

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    retries 3

frontend wordpress_front
    bind *:443 ssl crt /etc/ssl/certs/example.com.pem
    
    # Session affinity for WooCommerce carts is handled with cookie
    # insertion in the backend (see the WooCommerce section later);
    # "stick on" is only valid inside a backend, not a frontend
    
    default_backend wp_stable

backend wp_stable
    balance roundrobin
    option httpchk GET /wp-login.php
    http-check expect status 200
    
    server wp-stable-1 10.0.1.10:80 check weight 100
    server wp-stable-2 10.0.1.11:80 check weight 100

backend wp_canary
    balance roundrobin
    option httpchk GET /wp-login.php
    http-check expect status 200
    
    server wp-canary-1 10.0.1.20:80 check weight 100

To route a percentage of traffic to the canary, use HAProxy ACLs with the rand function:

frontend wordpress_front
    bind *:443 ssl crt /etc/ssl/certs/example.com.pem
    
    # Route 5% of traffic to canary
    acl is_canary rand(100) lt 5
    
    # Don't canary logged-in users or admin pages. The login cookie name
    # carries a per-site hash suffix, so match it as a substring
    acl is_admin path_beg /wp-admin
    acl is_logged_in req.hdr(Cookie) -m sub wordpress_logged_in
    
    use_backend wp_canary if is_canary !is_admin !is_logged_in
    default_backend wp_stable

This sends 5% of anonymous, non-admin traffic to the canary backend. Logged-in users and admin requests always go to the stable backend, which prevents inconsistencies in the WordPress admin experience during deployments.
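As a sanity check on what rand(100) lt 5 means: HAProxy draws an integer in [0,100) per request, so roughly one eligible request in twenty lands on the canary. A quick simulation (awk here, purely illustrative):

```shell
# Simulate 100,000 draws of rand(100) and count how many fall below 5
awk 'BEGIN {
    srand()
    n = 100000
    for (i = 0; i < n; i++)
        if (int(rand() * 100) < 5) hits++
    printf "canary share: %.1f%%\n", 100 * hits / n   # ~5.0%
}'
```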

Gradually Increasing Canary Traffic

Inline ACL thresholds like rand(100) lt 5 cannot be changed through the HAProxy runtime API, so the simplest reliable way to adjust the canary percentage is to rewrite the threshold in the configuration and perform a graceful reload. HAProxy hands the listening sockets to the new process without dropping established connections:

#!/bin/bash
# canary-promote.sh - Gradually increase canary traffic

CFG="/etc/haproxy/haproxy.cfg"

set_canary_weight() {
    local pct=$1
    echo "Setting canary to ${pct}%"
    
    # Rewrite the ACL threshold, validate, then reload gracefully
    sed -i -E "s/(acl is_canary rand\(100\) lt )[0-9]+/\1${pct}/" "$CFG"
    haproxy -c -f "$CFG" && systemctl reload haproxy
}

# Progressive rollout
set_canary_weight 5
echo "Waiting 10 minutes at 5%..."
sleep 600

# Check canary response errors before proceeding
# (field 15 of the HAProxy CSV stats is eresp, a cumulative error count, not a rate)
ERRORS=$(curl -s "http://localhost:8404/stats;csv" | grep wp_canary | awk -F, '{print $15}' | head -1)
if [ "${ERRORS:-0}" -gt 1 ]; then
    echo "Canary response errors detected ($ERRORS), rolling back"
    set_canary_weight 0
    exit 1
fi

set_canary_weight 25
echo "Waiting 10 minutes at 25%..."
sleep 600

set_canary_weight 50
echo "Waiting 10 minutes at 50%..."
sleep 600

set_canary_weight 100
echo "Canary promoted to 100%"

Monitoring the Canary

The canary is useless without monitoring. You need to compare error rates, response times, and application-level metrics between the stable and canary backends. A basic approach uses HAProxy stats combined with server-side logging:

# Add response time tracking to HAProxy
frontend wordpress_front
    # Log backend response time
    log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"
    
    # Tag canary requests in logs
    http-request set-header X-Canary true if is_canary

On the WordPress side, you can log the canary header for application-level monitoring:

// In your theme's functions.php or a must-use plugin
add_action('shutdown', function() {
    if (isset($_SERVER['HTTP_X_CANARY'])) {
        $response_time = microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'];
        error_log(sprintf(
            'CANARY request=%s time=%.4f status=%d memory=%d',
            $_SERVER['REQUEST_URI'],
            $response_time,
            http_response_code(),
            memory_get_peak_usage(true)
        ));
    }
});

Rolling Updates in Containerized WordPress (Kubernetes)

If you run WordPress in containers (Docker/Kubernetes), rolling updates are the standard zero-downtime strategy. Kubernetes replaces pods one at a time, waiting for each new pod to pass health checks before terminating the old one.

WordPress Kubernetes Deployment

# wordpress-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Add 1 new pod at a time
      maxUnavailable: 0   # Never have fewer than desired replicas
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
      - name: wordpress
        image: your-registry.com/wordpress:v2.3.1
        ports:
        - containerPort: 80
        env:
        - name: WORDPRESS_DB_HOST
          valueFrom:
            secretKeyRef:
              name: wordpress-db
              key: host
        - name: WORDPRESS_DB_NAME
          valueFrom:
            secretKeyRef:
              name: wordpress-db
              key: name
        - name: WORDPRESS_DB_USER
          valueFrom:
            secretKeyRef:
              name: wordpress-db
              key: user
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: wordpress-db
              key: password
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /wp-login.php
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /wp-login.php
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
          failureThreshold: 5
        volumeMounts:
        - name: uploads
          mountPath: /var/www/html/wp-content/uploads
      volumes:
      - name: uploads
        persistentVolumeClaim:
          claimName: wordpress-uploads-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-uploads-pvc
spec:
  accessModes:
    - ReadWriteMany   # Required for multiple pods
  resources:
    requests:
      storage: 50Gi
  storageClassName: efs-sc   # Amazon EFS or similar RWX storage

The WordPress Docker Image

Your Docker image should contain the full WordPress installation plus your custom theme and plugins, baked in at build time:

# Dockerfile
# The -apache variant serves HTTP on port 80, matching the Deployment's
# containerPort and probes (the -fpm variant only speaks FastCGI on 9000)
FROM wordpress:6.4-php8.2-apache

# Install additional PHP extensions
RUN docker-php-ext-install pdo_mysql opcache

# OPcache settings for production
# (opcache.fast_shutdown was removed in PHP 7.2, so it is omitted here)
RUN { \
    echo 'opcache.memory_consumption=256'; \
    echo 'opcache.interned_strings_buffer=16'; \
    echo 'opcache.max_accelerated_files=20000'; \
    echo 'opcache.revalidate_freq=0'; \
    echo 'opcache.validate_timestamps=0'; \
    echo 'opcache.save_comments=1'; \
} > /usr/local/etc/php/conf.d/opcache-recommended.ini

# Copy custom theme
COPY wp-content/themes/your-theme/ /var/www/html/wp-content/themes/your-theme/

# Copy must-use plugins
COPY wp-content/mu-plugins/ /var/www/html/wp-content/mu-plugins/

# Copy plugins (version-locked via composer)
COPY vendor/ /var/www/html/vendor/
COPY wp-content/plugins/ /var/www/html/wp-content/plugins/

# Set ownership
RUN chown -R www-data:www-data /var/www/html

Setting opcache.validate_timestamps=0 is safe in containers because the code never changes inside a running container. When you deploy a new image, Kubernetes creates new pods with the new code, and OPcache starts fresh.

Kubernetes Service and Ingress

# wordpress-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: wordpress
spec:
  selector:
    app: wordpress
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wordpress
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "64m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
spec:
  tls:
  - hosts:
    - example.com
    secretName: tls-example-com
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: wordpress
            port:
              number: 80

During a rolling update, Kubernetes performs these steps for each pod:

1. Creates a new pod with the updated image.
2. Waits for the readiness probe to pass (confirming WordPress is responding on the new pod).
3. Adds the new pod to the Service endpoints (it starts receiving traffic).
4. Sends SIGTERM to an old pod.
5. Removes the old pod from Service endpoints (it stops receiving new traffic).
6. Waits for the termination grace period (default 30 seconds) to allow in-flight requests to complete.
7. Kills the old pod.

With maxSurge: 1 and maxUnavailable: 0, you always have at least 4 healthy pods serving traffic. The update proceeds one pod at a time, so for 4 replicas the full rollout takes several minutes.
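The "several minutes" figure follows from the probe settings in the manifest above: each replacement pod needs its readiness delay plus a few probe periods before it serves traffic, and each old pod gets up to the termination grace period. A rough, worst-case sequential back-of-envelope:

```shell
# Per-pod cost: ~10s readiness initialDelay plus ~3 probe periods (5s each),
# then the 30s default termination grace period for the pod it replaces
REPLICAS=4
READY_SECONDS=25
GRACE_SECONDS=30

TOTAL=$(( REPLICAS * (READY_SECONDS + GRACE_SECONDS) ))
echo "worst-case sequential rollout: ~${TOTAL}s"   # prints: ~220s
```

In practice pod creation and termination overlap, so real rollouts usually finish faster than this bound.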

Handling PHP Sessions in Kubernetes

WordPress plugins that use PHP sessions will break in a multi-pod setup because the default file-based session handler stores session data locally to each pod. Use Redis for session storage:

// In wp-config.php or a mu-plugin: redirect PHP sessions to Redis
// (requires the phpredis extension on each pod)
ini_set('session.save_handler', 'redis');
ini_set('session.save_path', 'tcp://redis-service:6379?auth=' . getenv('REDIS_PASSWORD'));

Rollback Procedures: Instant Symlink Rollback vs. Database Challenges

When a deployment goes wrong, you need to revert quickly. The rollback strategy depends on what you deployed and whether database changes were involved.

Code-Only Rollback: The Easy Case

If your deployment only changed PHP files, templates, or assets (no database schema changes), rollback is instant with symlink-based deployments:

# Deployer built-in rollback
dep rollback production

This switches the current symlink back to the previous release directory. The operation takes less than a second. For blue-green deployments, you run the traffic switch script to point back to the previous environment.

In Kubernetes, rollback is equally straightforward:

# Roll back to previous revision
kubectl rollout undo deployment/wordpress

# Or roll back to a specific revision
kubectl rollout undo deployment/wordpress --to-revision=5

# Check rollout history
kubectl rollout history deployment/wordpress

Database Rollback: The Hard Case

If your deployment included database migrations, rolling back the code does not undo the database changes. This is why the expand-contract pattern discussed earlier is so important. If you followed it, the old code is compatible with the new schema, and a code rollback works without touching the database.

But what if a migration went wrong and corrupted data? You need a pre-migration database snapshot:

#!/bin/bash
# pre-deploy-snapshot.sh

TIMESTAMP=$(date +%Y%m%d_%H%M%S)
DB_NAME="wordpress_production"
BACKUP_DIR="/var/backups/deploy-snapshots"

# Create snapshot before migration
mysqldump --single-transaction --quick \
    --host=db-server \
    --user=backup_user \
    --password="$DB_BACKUP_PASSWORD" \
    "$DB_NAME" | gzip > "${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.sql.gz"

# Record snapshot in deployment metadata
echo "$TIMESTAMP" > /var/www/site/current/.deploy-snapshot

echo "Snapshot created: ${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.sql.gz"

Restoring from a snapshot means data loss: any orders, comments, or user registrations that happened between the snapshot and the rollback will be gone. This is the nuclear option. For high-traffic WooCommerce sites, you might lose dozens of orders during even a brief window.

A safer approach for non-destructive migrations (the expand phase of expand-contract) is to simply leave the new columns in place. They consume a negligible amount of space, and the old code ignores them.

Automated Rollback Triggers

You can automate rollback based on health checks:

#!/bin/bash
# post-deploy-monitor.sh - Run after deployment, auto-rollback on failure

SITE_URL="https://example.com"
MAX_ERRORS=3
CHECK_INTERVAL=10
CHECK_DURATION=120  # Monitor for 2 minutes

errors=0
checks=0
start_time=$(date +%s)

while true; do
    current_time=$(date +%s)
    elapsed=$((current_time - start_time))
    
    if [ $elapsed -ge $CHECK_DURATION ]; then
        echo "Monitoring complete. Deployment looks healthy."
        exit 0
    fi
    
    HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$SITE_URL")
    checks=$((checks + 1))
    
    if [ "$HTTP_CODE" != "200" ]; then
        errors=$((errors + 1))
        echo "Check $checks: FAIL (HTTP $HTTP_CODE) - Error count: $errors"
        
        if [ $errors -ge $MAX_ERRORS ]; then
            echo "ERROR THRESHOLD REACHED. Initiating rollback..."
            dep rollback production
            
            # Notify team
            curl -X POST "$SLACK_WEBHOOK" \
                -H 'Content-type: application/json' \
                -d '{"text":"AUTOMATED ROLLBACK triggered for production. '"$errors"' consecutive failures detected."}'
            
            exit 1
        fi
    else
        errors=0  # Reset consecutive error count
        echo "Check $checks: OK (HTTP $HTTP_CODE)"
    fi
    
    sleep $CHECK_INTERVAL
done

WooCommerce Considerations: Order Tables, Sessions, and Cart Persistence

WooCommerce adds significant complexity to zero-downtime deployments because of its stateful nature. Customers have active sessions, carts with items, and potentially in-progress checkout flows.

Session Persistence During Deployment

WooCommerce stores session data in the wp_woocommerce_sessions table by default. This means sessions survive code deployments as long as the database remains the same, which it does in all the strategies we have discussed.

However, if you are using file-based PHP sessions (some plugins do this), sessions will be lost when traffic switches to a new server or container. Always verify your session storage mechanism before implementing zero-downtime deploys.
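
A quick way to audit this on each host is to read the effective session.save_handler out of the pool's PHP configuration. A sketch (the ini path in the example is an assumption; adjust for your PHP version and pool):

```shell
# Print the last (winning) session.save_handler value in a PHP ini file.
get_session_handler() {
    grep -E '^[[:space:]]*session\.save_handler' "$1" | tail -n 1 | cut -d= -f2 | tr -d '[:space:]'
}

# Example: get_session_handler /etc/php/8.2/fpm/php.ini
# "files" means sessions die with the server; "redis" or "memcached" survive a switch.
```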

For load-balanced environments, you must ensure session affinity (sticky sessions) or use centralized session storage:

# HAProxy sticky session configuration for WooCommerce
backend wp_servers
    balance roundrobin
    
    # Insert a SERVERID cookie so each client sticks to one backend server
    cookie SERVERID insert indirect nocache
    
    server wp1 10.0.1.10:80 check cookie wp1
    server wp2 10.0.1.11:80 check cookie wp2
    server wp3 10.0.1.12:80 check cookie wp3

WooCommerce High-Performance Order Storage (HPOS)

WooCommerce 8.2 made High-Performance Order Storage (HPOS) the default for new installations. HPOS moves orders from WordPress post meta into dedicated custom tables, which affects deployment migrations because the order schema differs from standard WordPress tables.

When deploying WooCommerce updates that involve HPOS migrations, follow these additional precautions:

// Check HPOS status before running migrations
function check_hpos_status() {
    if (class_exists('Automattic\WooCommerce\Utilities\OrderUtil')) {
        $hpos_enabled = \Automattic\WooCommerce\Utilities\OrderUtil::custom_orders_table_usage_is_enabled();
        $sync_enabled = get_option('woocommerce_custom_orders_table_data_sync_enabled');
        
        error_log(sprintf(
            'HPOS Status: enabled=%s sync=%s',
            $hpos_enabled ? 'yes' : 'no',
            $sync_enabled
        ));
        
        return [
            'hpos_enabled' => $hpos_enabled,
            'sync_enabled' => $sync_enabled === 'yes',
        ];
    }
    return ['hpos_enabled' => false, 'sync_enabled' => false];
}

Cart and Checkout During Deployment

A customer in the middle of checkout when a deployment happens should not have their cart emptied or their payment flow interrupted. Since WooCommerce cart data lives in the session table (database) and payment processing happens through external gateways (Stripe, PayPal), the deployment itself does not interrupt these flows.

The risk is in code changes that alter the checkout flow. If your new release changes form field names, modifies validation rules, or restructures the checkout template, a customer who loaded the old checkout page might submit a form that the new code does not understand.

Mitigate this by ensuring backward compatibility in form handlers for at least one release cycle:

// Accept both old and new field names during transition
function handle_checkout_submission() {
    // New field name
    $phone = isset($_POST['billing_phone_number']) ? sanitize_text_field($_POST['billing_phone_number']) : '';
    
    // Fall back to old field name
    if (empty($phone) && isset($_POST['billing_phone'])) {
        $phone = sanitize_text_field($_POST['billing_phone']);
    }
    
    // Process order...
}

WooCommerce Scheduled Actions

WooCommerce uses Action Scheduler for background tasks like processing pending orders, sending emails, and syncing inventory. During a rolling deployment, you may have old and new code running simultaneously, both processing scheduled actions.

To prevent conflicts, designate a single worker for scheduled actions:

# In Kubernetes, run a separate deployment for WP-Cron/Action Scheduler
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-worker
spec:
  replicas: 1   # Only ONE worker to prevent duplicate processing
  template:
    spec:
      containers:
      - name: wordpress-cron
        image: your-registry.com/wordpress:v2.3.1
        command: ["/bin/sh", "-c"]
        args:
          - |
            while true; do
              php /var/www/html/wp-cron.php
              sleep 60
            done
        env:
        - name: DISABLE_WP_CRON
          value: "true"

And disable WP-Cron on the web-serving pods:

// wp-config.php
define('DISABLE_WP_CRON', true);
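
On a single server or a small cluster without Kubernetes, the same single-worker guarantee can come from system cron plus flock, which refuses to start a second run while one is still in flight. A sketch (the crontab paths are assumptions):

```shell
# Crontab entry for ONE designated host -- flock -n exits immediately
# instead of stacking up overlapping cron runs:
#   * * * * * flock -n /tmp/wp-cron.lock php /var/www/html/current/wp-cron.php

# Demonstration of the mutual exclusion flock provides:
lock=$(mktemp)
flock -n "$lock" -c 'sleep 2' &      # first worker holds the lock
sleep 0.5
if flock -n "$lock" -c 'true'; then  # second worker tries while it is held
    echo "acquired"
else
    echo "blocked"                   # prints "blocked"
fi
wait
```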

CI/CD Integration: Triggering Deploys from GitHub Actions

Automating the full pipeline from code push to production deployment eliminates human error and makes deployments routine rather than events.

GitHub Actions Workflow

# .github/workflows/deploy.yml
name: Deploy WordPress

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Deployment target'
        required: true
        default: 'staging'
        type: choice
        options:
          - staging
          - production

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      mysql:
        image: mysql:8.0
        env:
          MYSQL_ROOT_PASSWORD: test
          MYSQL_DATABASE: wordpress_test
        ports:
          - 3306:3306
        options: --health-cmd="mysqladmin ping" --health-interval=10s --health-timeout=5s --health-retries=3
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: '8.2'
          extensions: mysqli, pdo_mysql, gd, zip, opcache
          tools: composer, wp-cli
      
      - name: Install dependencies
        # Keep dev dependencies: PHPStan and PHPUnit run in later steps of this job
        run: composer install --optimize-autoloader
      
      - name: Run PHP lint
        # xargs (unlike find -exec ... \;) propagates a failing php -l as a non-zero exit code
        run: find wp-content/themes/your-theme -name "*.php" -print0 | xargs -0 -n1 php -l
      
      - name: Run PHPStan
        run: vendor/bin/phpstan analyse wp-content/themes/your-theme --level=6
      
      - name: Build assets
        run: |
          cd wp-content/themes/your-theme
          npm ci
          npm run build
      
      - name: Run integration tests
        env:
          WP_TESTS_DB_HOST: 127.0.0.1
          WP_TESTS_DB_NAME: wordpress_test
          WP_TESTS_DB_USER: root
          WP_TESTS_DB_PASS: test
        run: vendor/bin/phpunit

  deploy-staging:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: '8.2'
          tools: composer
      
      - name: Install Deployer
        run: composer global require deployer/deployer
      
      - name: Setup SSH
        uses: webfactory/ssh-agent@v0.9.0
        with:
          ssh-private-key: ${{ secrets.DEPLOY_SSH_KEY }}
      
      - name: Add known hosts
        run: |
          mkdir -p ~/.ssh
          ssh-keyscan -H ${{ secrets.STAGING_HOST }} >> ~/.ssh/known_hosts
      
      - name: Build assets
        run: |
          cd wp-content/themes/your-theme
          npm ci
          npm run build
      
      - name: Deploy to staging
        run: dep deploy staging -v
      
      - name: Smoke test staging
        run: |
          sleep 5
          HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" https://staging.example.com)
          if [ "$HTTP_CODE" != "200" ]; then
            echo "Staging smoke test failed with HTTP $HTTP_CODE"
            dep rollback staging
            exit 1
          fi

  deploy-production:
    needs: deploy-staging
    if: github.event.inputs.environment == 'production' || (github.ref == 'refs/heads/main' && github.event_name == 'push')
    runs-on: ubuntu-latest
    environment: production
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: '8.2'
          tools: composer
      
      - name: Install Deployer
        run: composer global require deployer/deployer
      
      - name: Setup SSH
        uses: webfactory/ssh-agent@v0.9.0
        with:
          ssh-private-key: ${{ secrets.DEPLOY_SSH_KEY }}
      
      - name: Add known hosts
        run: |
          mkdir -p ~/.ssh
          ssh-keyscan -H ${{ secrets.PRODUCTION_HOST }} >> ~/.ssh/known_hosts
      
      - name: Build assets
        run: |
          cd wp-content/themes/your-theme
          npm ci
          npm run build
      
      - name: Create database snapshot
        run: |
          ssh deploy@${{ secrets.PRODUCTION_HOST }} \
            "mysqldump --single-transaction --quick wordpress_prod | gzip > /var/backups/pre-deploy-$(date +%Y%m%d_%H%M%S).sql.gz"
      
      - name: Deploy to production
        run: dep deploy production -v
      
      - name: Post-deploy monitoring
        run: |
          FAILURES=0
          for i in $(seq 1 12); do
            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 https://example.com)
            if [ "$HTTP_CODE" != "200" ]; then
              echo "Health check failed: HTTP $HTTP_CODE"
              FAILURES=$((FAILURES + 1))
              if [ "$FAILURES" -ge 3 ]; then
                echo "Rolling back..."
                dep rollback production
                exit 1
              fi
            else
              FAILURES=0
            fi
            sleep 10
          done
          echo "Production deployment verified"
      
      - name: Notify Slack
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          fields: repo,message,commit,author,action
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

Container-Based CI/CD with GitHub Actions

For Kubernetes-based deployments, the workflow builds a Docker image instead of using Deployer:

# .github/workflows/deploy-k8s.yml
name: Deploy WordPress (Kubernetes)

on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Build theme assets
        run: |
          cd wp-content/themes/your-theme
          npm ci
          npm run build
      
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      
      - name: Login to container registry
        uses: docker/login-action@v3
        with:
          registry: your-registry.com
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_PASSWORD }}
      
      - name: Build and push image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            your-registry.com/wordpress:${{ github.sha }}
            your-registry.com/wordpress:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
      
      # Assumes cluster credentials are stored in a KUBECONFIG repository secret
      - name: Set Kubernetes context
        uses: azure/k8s-set-context@v4
        with:
          kubeconfig: ${{ secrets.KUBECONFIG }}
      
      - name: Deploy to Kubernetes
        uses: azure/k8s-deploy@v4
        with:
          manifests: |
            k8s/wordpress-deployment.yaml
          images: |
            your-registry.com/wordpress:${{ github.sha }}
          strategy: canary
          percentage: 20
      
      - name: Monitor canary
        run: |
          echo "Waiting for canary to stabilize..."
          sleep 120
          kubectl get pods -l app=wordpress
          
          # Check for crash loops
          RESTARTS=$(kubectl get pods -l app=wordpress -o jsonpath='{.items[*].status.containerStatuses[0].restartCount}')
          for count in $RESTARTS; do
            if [ "$count" -gt 2 ]; then
              echo "Pod restart detected, rejecting canary"
              kubectl rollout undo deployment/wordpress
              exit 1
            fi
          done
      
      - name: Promote canary
        run: |
          kubectl set image deployment/wordpress wordpress=your-registry.com/wordpress:${{ github.sha }}
          kubectl rollout status deployment/wordpress --timeout=300s

Real Nginx and HAProxy Configuration Examples

Here are the complete, production-ready configurations that tie these strategies together.

Full Nginx Configuration for Blue-Green with SSL

# /etc/nginx/nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    client_max_body_size 64m;
    
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    
    # Logging
    log_format detailed '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent" '
                        'rt=$request_time uct=$upstream_connect_time '
                        'uht=$upstream_header_time urt=$upstream_response_time '
                        'backend=$upstream_addr';
    
    access_log /var/log/nginx/access.log detailed;
    error_log /var/log/nginx/error.log;
    
    # Gzip
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml text/javascript image/svg+xml;
    
    # Upstream definitions
    upstream blue {
        server unix:/run/php/php-fpm-blue.sock;
    }
    
    upstream green {
        server unix:/run/php/php-fpm-green.sock;
    }
    
    # Active environment: change the default to "green" and reload nginx to switch traffic
    map $uri $active_env {
        default blue;
    }
    
    # Rate limiting
    limit_req_zone $binary_remote_addr zone=wp_login:10m rate=3r/s;
    limit_req_zone $binary_remote_addr zone=wp_xmlrpc:10m rate=1r/s;
    
    # FastCGI cache
    fastcgi_cache_path /var/cache/nginx/wordpress 
        levels=1:2 
        keys_zone=wordpress:100m 
        max_size=1g 
        inactive=60m 
        use_temp_path=off;
    
    fastcgi_cache_key "$scheme$request_method$host$request_uri";
    
    server {
        listen 80;
        server_name example.com www.example.com;
        return 301 https://example.com$request_uri;
    }
    
    server {
        listen 443 ssl http2;
        server_name example.com;
        
        ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
        ssl_prefer_server_ciphers off;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 1d;
        ssl_session_tickets off;
        
        # HSTS
        add_header Strict-Transport-Security "max-age=63072000" always;
        
        # Dynamic document root based on active environment
        set $doc_root_blue /var/www/blue/current;
        set $doc_root_green /var/www/green/current;
        
        # Default to blue
        set $active_root $doc_root_blue;
        set $active_upstream blue;
        
        # Switch based on map
        if ($active_env = green) {
            set $active_root $doc_root_green;
            set $active_upstream green;
        }
        
        root $active_root;
        index index.php;
        
        # Security headers
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-XSS-Protection "1; mode=block" always;
        
        # Rate-limit xmlrpc.php (swap the body for "deny all;" to block it outright)
        location = /xmlrpc.php {
            limit_req zone=wp_xmlrpc burst=2 nodelay;
            include fastcgi_params;
            fastcgi_pass $active_upstream;
            fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
            fastcgi_param DOCUMENT_ROOT $realpath_root;
        }
        
        # Rate limit wp-login
        location = /wp-login.php {
            limit_req zone=wp_login burst=5 nodelay;
            include fastcgi_params;
            fastcgi_pass $active_upstream;
            fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
            fastcgi_param DOCUMENT_ROOT $realpath_root;
        }
        
        # WordPress admin - no caching
        location /wp-admin/ {
            try_files $uri $uri/ /index.php?$args;
            
            location ~ \.php$ {
                include fastcgi_params;
                fastcgi_pass $active_upstream;
                fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
                fastcgi_param DOCUMENT_ROOT $realpath_root;
                fastcgi_no_cache 1;
                fastcgi_cache_bypass 1;
            }
        }
        
        # Static assets
        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2|ttf|eot|webp|avif)$ {
            expires 30d;
            add_header Cache-Control "public, immutable";
            log_not_found off;
            access_log off;
        }
        
        # Main location
        location / {
            try_files $uri $uri/ /index.php?$args;
        }
        
        # PHP handling with FastCGI cache
        location ~ \.php$ {
            try_files $uri =404;
            
            include fastcgi_params;
            fastcgi_pass $active_upstream;
            fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
            fastcgi_param DOCUMENT_ROOT $realpath_root;
            
            # FastCGI cache settings
            fastcgi_cache wordpress;
            fastcgi_cache_valid 200 10m;
            fastcgi_cache_valid 404 1m;
            
            # Don't cache logged-in users or POST requests
            set $skip_cache 0;
            if ($request_method = POST) {
                set $skip_cache 1;
            }
            if ($http_cookie ~* "wordpress_logged_in|comment_author|woocommerce_cart_hash|woocommerce_items_in_cart") {
                set $skip_cache 1;
            }
            if ($request_uri ~* "/wp-admin/|/wp-json/|/xmlrpc.php|wp-.*\.php|/feed/|index\.php|sitemap") {
                set $skip_cache 1;
            }
            
            fastcgi_cache_bypass $skip_cache;
            fastcgi_no_cache $skip_cache;
            
            # Cache status header for debugging. Note that add_header in a
            # location discards headers inherited from the server block, so
            # restate the ones this location still needs:
            add_header X-Cache-Status $upstream_cache_status;
            add_header Strict-Transport-Security "max-age=63072000" always;
            add_header X-Frame-Options "SAMEORIGIN" always;
            add_header X-Content-Type-Options "nosniff" always;
        }
        
        # Deny access to sensitive files
        location ~ /\.(ht|git|env) {
            deny all;
        }
        
        location ~ /wp-config\.php$ {
            deny all;
        }
    }
}
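
Switching environments then comes down to editing the map's default and reloading. A small helper in the same spirit as the HAProxy control script in this article, sketched with the config path as an assumption:

```shell
# Flip the "default blue;" / "default green;" line inside the map block.
switch_active_env() {
    cfg="$1"; target="$2"   # target: blue or green
    case "$target" in
        blue|green) ;;
        *) echo "usage: switch_active_env <config-file> blue|green"; return 1 ;;
    esac
    sed -i "s/\(default \)\(blue\|green\);/\1${target};/" "$cfg"
}

# In production, validate before reloading:
#   switch_active_env /etc/nginx/nginx.conf green && nginx -t && systemctl reload nginx
```

The sed pattern only matches "default blue;" or "default green;", so other default directives in the file (such as default_type) are untouched.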

Full HAProxy Configuration for Canary with Health Checks

# /etc/haproxy/haproxy.cfg
global
    log stdout format raw local0
    maxconn 10000
    tune.ssl.default-dh-param 2048
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
    stats socket /var/run/haproxy.sock mode 660 level admin
    stats timeout 30s

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    option  http-server-close
    option  forwardfor except 127.0.0.0/8
    timeout connect 5s
    timeout client  30s
    timeout server  60s
    timeout http-request 10s
    timeout http-keep-alive 15s
    timeout queue 30s
    retries 3
    errorfile 503 /etc/haproxy/errors/503.http

# Stats page for monitoring
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s
    stats admin if TRUE
    stats auth "admin:$STATS_PASSWORD"   # HAProxy expands env vars only inside double quotes

frontend http_front
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

frontend https_front
    bind *:443 ssl crt /etc/ssl/certs/example.com.pem alpn h2,http/1.1
    
    # Security headers
    http-response set-header Strict-Transport-Security "max-age=63072000; includeSubDomains"
    http-response set-header X-Frame-Options SAMEORIGIN
    http-response set-header X-Content-Type-Options nosniff
    
    # ACLs for routing
    acl is_admin path_beg /wp-admin /wp-login.php
    acl is_api path_beg /wp-json
    acl is_cron path_beg /wp-cron.php
    acl is_logged_in req.cook(wordpress_logged_in) -m found
    acl is_woo_cart req.cook(woocommerce_items_in_cart) -m found
    
    # Canary routing: percentage of anonymous traffic
    acl canary_eligible !is_admin !is_logged_in !is_woo_cart !is_cron
    acl canary_selected rand(1000) lt 50  # 5% canary
    
    # Tag canary requests
    http-request set-header X-Canary true if canary_eligible canary_selected
    
    # Routing rules
    use_backend wp_canary if canary_eligible canary_selected
    use_backend wp_stable if is_admin or is_logged_in or is_woo_cart
    default_backend wp_stable

backend wp_stable
    balance leastconn
    option httpchk GET /wp-login.php HTTP/1.1\r\nHost:\ example.com
    http-check expect status 200
    
    # Sticky sessions for WooCommerce
    cookie SERVERID insert indirect nocache
    
    server stable-1 10.0.1.10:80 check inter 5s fall 3 rise 2 cookie s1 weight 100
    server stable-2 10.0.1.11:80 check inter 5s fall 3 rise 2 cookie s2 weight 100
    server stable-3 10.0.1.12:80 check inter 5s fall 3 rise 2 cookie s3 weight 100

backend wp_canary
    balance leastconn
    option httpchk GET /wp-login.php HTTP/1.1\r\nHost:\ example.com
    http-check expect status 200
    
    # If canary is down, fall back to stable
    option allbackups
    
    server canary-1 10.0.2.10:80 check inter 3s fall 2 rise 2 weight 100
    server stable-1 10.0.1.10:80 check backup

The option allbackups directive in the canary backend is a safety net. If the canary server fails health checks, traffic automatically falls back to a stable server instead of returning errors. This means a bad canary deployment self-heals from the user’s perspective.

Environment Switch Script for HAProxy

#!/bin/bash
# haproxy-canary-control.sh

SOCKET="/var/run/haproxy.sock"

case "$1" in
    status)
        echo "show stat" | socat stdio $SOCKET | grep -E "wp_stable|wp_canary" | \
            awk -F, '{printf "%-20s %-15s status=%s weight=%s\n", $1, $2, $18, $19}'
        ;;
    
    set-canary)
        PCT=$2
        if [ -z "$PCT" ]; then
            echo "Usage: $0 set-canary <percent>"
            exit 1
        fi
        # Calculate threshold out of 1000
        THRESHOLD=$((PCT * 10))
        
        # Update HAProxy config
        sed -i "s/rand(1000) lt [0-9]*/rand(1000) lt ${THRESHOLD}/" /etc/haproxy/haproxy.cfg
        
        # Validate and reload
        haproxy -c -f /etc/haproxy/haproxy.cfg
        if [ $? -eq 0 ]; then
            systemctl reload haproxy
            echo "Canary set to ${PCT}%"
        else
            echo "Config validation failed, no changes applied"
            exit 1
        fi
        ;;
    
    disable-canary)
        $0 set-canary 0
        ;;
    
    promote-canary)
        $0 set-canary 100
        echo "Canary promoted. Update stable servers and reset canary percentage."
        ;;
    
    drain-canary)
        echo "set server wp_canary/canary-1 state drain" | socat stdio $SOCKET
        echo "Canary server draining. Existing connections will complete."
        ;;
    
    *)
        echo "Usage: $0 {status|set-canary <percent>|disable-canary|promote-canary|drain-canary}"
        exit 1
        ;;
esac

Putting It All Together: Choosing the Right Strategy

Each zero-downtime strategy suits different operational contexts. Here is a practical decision framework.

Symlink-based atomic deployment is the right choice when you run a single server or a small cluster where each server gets the same code deployed sequentially. It is the simplest to implement, requires no additional infrastructure, and works with any hosting provider that gives you SSH access. Start here if you are currently deploying via FTP or manual git pulls.
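
The core mechanism reduces to a two-step symlink swap: build the release in its own directory, then rename a temporary symlink over current so the switch is a single atomic rename. A minimal sketch (the directory layout follows the Deployer convention; mv -T is GNU coreutils):

```shell
# Atomically point $base/current at $base/releases/$release.
activate_release() {
    base="$1"; release="$2"
    ln -sfn "$base/releases/$release" "$base/current.tmp"
    mv -T "$base/current.tmp" "$base/current"   # rename(2): atomic on one filesystem
}
```

The two-step dance matters: renaming a prepared symlink is atomic, while recreating current in place would leave a brief window where the path does not exist.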

Blue-green deployment makes sense when you need a pre-production verification step on the exact production infrastructure. If you have been burned by deployments that worked in staging but failed in production because of environment differences, blue-green eliminates that variable. The cost is maintaining two parallel environments, which doubles your server resources (though the idle environment can run at reduced capacity).

Canary releases are appropriate when you have enough traffic that statistical significance matters. Sending 5% of traffic to a new release only provides useful signal if 5% of your traffic is more than a handful of requests per minute. For a site serving 100 requests per minute, 5 requests per minute on the canary can reveal issues within a few minutes. For a site serving 10 requests per minute, you might wait an hour before the canary has handled enough requests to draw conclusions.
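
That arithmetic is worth making explicit. A small helper, assuming you want roughly N canary-served requests before judging a release (integer math, rounded up):

```shell
# Minutes until the canary has served $3 requests, given site-wide
# requests per minute ($1) and the canary percentage ($2). Ceiling division.
canary_wait_minutes() {
    rpm="$1"; pct="$2"; needed="$3"
    echo $(( (needed * 100 + rpm * pct - 1) / (rpm * pct) ))
}

# canary_wait_minutes 100 5 30  -> 6 minutes at 100 req/min
# canary_wait_minutes 10 5 30   -> 60 minutes at 10 req/min
```

The target of 30 requests is an arbitrary illustration; pick a sample size that matches the error rates you are trying to detect.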

Rolling updates in Kubernetes are the natural choice if you have already containerized your WordPress application. They provide a smooth upgrade path with built-in health checks and automatic rollback. The overhead is maintaining a Kubernetes cluster, which is not trivial, but the operational benefits extend far beyond deployment.

Combining Strategies

These strategies are not mutually exclusive. A common production setup combines several:

1. Symlink-based atomic deployment within each server (Deployer manages individual server releases).
2. Blue-green at the infrastructure level (two pools of servers, traffic switches between them).
3. Canary at the load balancer level (small percentage of traffic to the new blue/green environment before full switch).
4. Automated rollback based on error rate monitoring in the CI/CD pipeline.

This layered approach provides multiple safety nets. If the canary detects an issue, only 5% of traffic was affected. If the canary passes but a problem emerges after full promotion, the blue-green switch reverses traffic instantly. And if a server within the active pool has issues, the symlink rollback on that specific server takes effect in under a second.

The key to making any of these strategies work for WordPress is addressing the stateful components: the database, the uploads directory, the object cache, and user sessions. Stateless code deployments are the easy part. Managing shared state across releases, environments, and containers is where the real engineering happens.

Start with symlink-based deployment and Deployer. Get comfortable with atomic releases and instant rollback. Then layer on blue-green or canary releases as your traffic and reliability requirements demand. The investment in deployment infrastructure pays for itself the first time a deploy goes wrong and you recover in seconds instead of scrambling for minutes.

Share this article

Tom Bradley

DevOps engineer focused on WordPress deployment automation. Builds CI/CD pipelines and infrastructure-as-code solutions for WordPress agencies.