The Complete Guide to WordPress Race Conditions: WP-Cron, Options, and Transients
Race conditions are among the hardest bugs to find in any software system. They hide in the gap between “this works on my local machine” and “this breaks under production traffic.” WordPress, for all its strengths, was not designed with concurrent programming as a first-class concern. The architecture leans heavily on a single-threaded PHP request model, shared MySQL state, and an object cache layer that can drift out of sync with the database under load.
This guide walks through the specific race conditions that live inside WordPress core, focusing on three subsystems where they cause the most damage: WP-Cron, the Options API, and Transients. Each section includes source-level analysis drawn from the actual WordPress codebase, concrete code examples that reproduce the problem, and battle-tested fixes you can apply today.
Understanding the Execution Model That Creates Race Conditions
Before dissecting individual race conditions, you need a clear picture of how WordPress processes requests in a typical production environment.
A standard WordPress deployment runs behind a web server (Nginx or Apache) that spawns multiple PHP processes. Each process handles one HTTP request at a time, but dozens or hundreds of processes run simultaneously. Every one of those processes can read from and write to the same MySQL database and the same object cache (Memcached or Redis). There is no built-in coordination between these processes at the PHP application layer.
This means two requests arriving 5 milliseconds apart can both read the same option value from the database, both modify it in PHP memory, and both write it back. The second write silently overwrites the first. No error is thrown. No warning appears in the logs. The data from the first write simply vanishes.
WordPress core uses several strategies to mitigate this problem, but none of them eliminate it entirely. The cron option uses a transient-based lock. The alloptions cache uses a read-through pattern with wp_cache_add(). Transients with expiration times use separate timeout options. Each of these mechanisms has specific failure modes under concurrent access, and understanding those failure modes is the key to writing WordPress code that holds up under real traffic.
spawn_cron() Source Code Walkthrough and the 60-Second Lock
The spawn_cron() function is WordPress’s mechanism for triggering cron execution. It lives in wp-includes/cron.php and is called on every page load (via the wp_cron() function, which since WordPress 6.9 registers _wp_cron() on the shutdown action rather than wp_loaded). Its job is to fire a non-blocking HTTP request to wp-cron.php when scheduled events are due. The critical question is: how does it prevent multiple simultaneous page loads from all spawning cron at the same time?
The answer is a transient-based lock. Here is the actual flow from core source:
function spawn_cron( $gmt_time = 0 ) {
    if ( ! $gmt_time ) {
        $gmt_time = microtime( true );
    }

    if ( defined( 'DOING_CRON' ) || isset( $_GET['doing_wp_cron'] ) ) {
        return false;
    }

    $lock = (float) get_transient( 'doing_cron' );

    if ( $lock > $gmt_time + 10 * MINUTE_IN_SECONDS ) {
        $lock = 0;
    }

    // Don't run if another process is currently running it
    // or more than once every 60 sec.
    if ( $lock + WP_CRON_LOCK_TIMEOUT > $gmt_time ) {
        return false;
    }

    // ... check for ready cron jobs ...

    $doing_wp_cron = sprintf( '%.22F', $gmt_time );
    set_transient( 'doing_cron', $doing_wp_cron );

    // ... fire the HTTP request to wp-cron.php ...
}
The constant WP_CRON_LOCK_TIMEOUT is defined in wp-includes/default-constants.php and defaults to MINUTE_IN_SECONDS (60). This means once a cron spawn starts, no other process should trigger another spawn for at least 60 seconds.
The Lock Check Is Not Atomic
Look carefully at the sequence of operations:
1. Read the doing_cron transient with get_transient( 'doing_cron' )
2. Compare the lock value against the current time
3. Write a new lock value with set_transient( 'doing_cron', $doing_wp_cron )
Between step 1 and step 3, another PHP process can execute the same steps. Both processes read the same (expired) lock value, both decide the lock is free, and both proceed to spawn cron. This is a textbook check-then-act race condition.
On a busy site handling 100 requests per second, the window between reading the transient and writing the new lock value might be 1-5 milliseconds. That is more than enough time for two or three other requests to slip through the same gap.
Why the Lock Still Works (Mostly)
The reason this is not a catastrophic problem in practice comes down to the design of wp-cron.php itself. When wp-cron.php receives the request, it performs its own lock check:
$doing_wp_cron = $_GET['doing_wp_cron'];

if ( get_transient( 'doing_cron' ) !== $doing_wp_cron ) {
    // Another process got the lock. Bail out.
    exit;
}
The value stored in the transient is the high-precision timestamp (sprintf( '%.22F', $gmt_time )) from the process that set it. When multiple processes race to set the transient, only the value from the last writer survives. All other processes pass their own timestamp as the doing_wp_cron query parameter. When they hit wp-cron.php, they compare their own timestamp against whatever is now in the transient. Only one will match.
This is a clever use of a last-writer-wins property. It does not prevent multiple HTTP requests to wp-cron.php from being fired (those requests still consume resources), but it ensures that only one of those requests actually executes cron events. The redundant requests bail out early.
When the Lock Breaks Down
The 60-second lock fails in several real-world scenarios:
External object cache flush. If your Redis or Memcached instance is flushed (or restarts), the doing_cron transient disappears. When wp_using_ext_object_cache() returns true, transients are stored in the object cache rather than the database. A cache flush means every subsequent request sees no lock at all, and all of them attempt to spawn cron simultaneously.
Extremely long cron jobs. If a cron event takes more than 60 seconds, the lock expires while the job is still running. The next page load sees an expired lock and spawns a new cron runner. If the cron event is not idempotent (for example, it sends emails or processes payments), you now have two instances running the same job concurrently.
Clock skew between servers. The lock value and the comparison time both use microtime( true ), which depends on the system clock. If your web servers have clocks that are even a few seconds apart (not uncommon in cloud environments without NTP properly configured), one server might see the lock as expired while another server just set it.
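If your cron events legitimately run longer than a minute, you can widen the lock window by overriding the constant in wp-config.php, and on high-traffic sites it is common to disable page-load spawning entirely in favor of a real system cron:

```php
// In wp-config.php: raise the cron lock window (in seconds) so a
// long-running job does not let a second runner start mid-job.
define( 'WP_CRON_LOCK_TIMEOUT', 300 );

// Optionally disable page-load cron spawning entirely and hit
// wp-cron.php (or run `wp cron event run --due-now`) from the
// server's own crontab on a fixed schedule.
define( 'DISABLE_WP_CRON', true );
```

Both constants are core-supported; WP_CRON_LOCK_TIMEOUT is the same value spawn_cron() checks in the source above.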
The Cron Option Serialization Race Condition
All WordPress cron events are stored in a single option row in the wp_options table, under the key cron. This is a serialized PHP array that can grow to tens of kilobytes on sites with many scheduled events. Every function that modifies the cron schedule follows the same pattern:
1. Call _get_cron_array(), which calls get_option( 'cron' ) to read the entire cron array
2. Modify the array in PHP memory (add an event, remove an event, reschedule)
3. Call _set_cron_array( $cron ), which calls update_option( 'cron', $cron, true ) to write the entire array back
This is a read-modify-write cycle on a single database row with no locking. The _get_cron_array() function reads the current state:
function _get_cron_array() {
    $cron = get_option( 'cron' );

    if ( ! is_array( $cron ) ) {
        return array();
    }

    if ( ! isset( $cron['version'] ) ) {
        $cron = _upgrade_cron_array( $cron );
    }

    unset( $cron['version'] );

    return $cron;
}
And _set_cron_array() writes the modified state:
function _set_cron_array( $cron, $wp_error = false ) {
    if ( ! is_array( $cron ) ) {
        $cron = array();
    }

    $cron['version'] = 2;

    $result = update_option( 'cron', $cron, true );

    if ( $wp_error && ! $result ) {
        return new WP_Error(
            'could_not_set',
            __( 'The cron event list could not be saved.' )
        );
    }

    return $result;
}
The Lost Update Problem
Consider two requests arriving nearly simultaneously. Request A wants to schedule event X. Request B wants to schedule event Y.
Time 0ms: Request A calls _get_cron_array() -> gets {existing_events}
Time 1ms: Request B calls _get_cron_array() -> gets {existing_events}
Time 2ms: Request A adds event X -> {existing_events, X}
Time 3ms: Request B adds event Y -> {existing_events, Y}
Time 4ms: Request A calls _set_cron_array({existing_events, X})
Time 5ms: Request B calls _set_cron_array({existing_events, Y})
The final state of the cron option is {existing_events, Y}. Event X has been silently lost. Request A returned true, indicating success. No error was logged. The callback registered with wp_schedule_single_event() will never fire.
This is not a theoretical concern. On sites that schedule events during page loads (for example, scheduling an async processing task in response to a form submission, or plugins that schedule cleanup tasks during init), this lost-update problem happens regularly under moderate traffic.
Why update_option() Does Not Protect Against This
You might expect update_option() to use some form of optimistic locking, perhaps checking a version number before writing. It does not. The function in wp-includes/option.php performs a direct $wpdb->update() call:
$result = $wpdb->update(
    $wpdb->options,
    $update_args,
    array( 'option_name' => $option )
);
This generates a SQL statement like:
UPDATE wp_options
SET option_value = '{serialized_data}'
WHERE option_name = 'cron'
There is no WHERE option_value = '{previous_value}' clause. There is no row version check. The UPDATE succeeds unconditionally, overwriting whatever was there before. MySQL’s own row-level locking ensures the two UPDATEs do not corrupt the row itself, but it does nothing to prevent the logical data loss from the lost update.
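If you need that missing guarantee for your own option writes, one application-level fix is optimistic locking: include the previously read value in the WHERE clause and treat zero affected rows as "lost the race, re-read and retry." A minimal sketch (the function name is illustrative, not a core API):

```php
function compare_and_swap_option( $option, $old_serialized, $new_serialized ) {
    global $wpdb;

    // The UPDATE only matches if nobody changed the row since we read it.
    $rows = $wpdb->query( $wpdb->prepare(
        "UPDATE {$wpdb->options}
         SET option_value = %s
         WHERE option_name = %s AND option_value = %s",
        $new_serialized,
        $option,
        $old_serialized
    ) );

    // Caveat: this also returns false when the new value equals the
    // old one, since MySQL reports zero changed rows in that case.
    return (bool) $rows;
}
```

Callers must loop: re-read the option, re-apply their change, and attempt the swap again when it returns false. Any object caches for the option must also be invalidated after a successful swap.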
wp_schedule_single_event() Atomicity Failures Under Concurrency
The wp_schedule_single_event() function has a built-in duplicate detection mechanism. Before scheduling a new event, it checks whether an identical event (same hook, same args) already exists within a 10-minute window:
$crons = _get_cron_array();

$key       = md5( serialize( $event->args ) );
$duplicate = false;

if ( $event->timestamp < time() + 10 * MINUTE_IN_SECONDS ) {
    $min_timestamp = 0;
} else {
    $min_timestamp = $event->timestamp - 10 * MINUTE_IN_SECONDS;
}

if ( $event->timestamp < time() ) {
    $max_timestamp = time() + 10 * MINUTE_IN_SECONDS;
} else {
    $max_timestamp = $event->timestamp + 10 * MINUTE_IN_SECONDS;
}

foreach ( $crons as $event_timestamp => $cron ) {
    if ( $event_timestamp < $min_timestamp ) {
        continue;
    }
    if ( $event_timestamp > $max_timestamp ) {
        break;
    }
    if ( isset( $cron[ $event->hook ][ $key ] ) ) {
        $duplicate = true;
        break;
    }
}

if ( $duplicate ) {
    // ... return false or WP_Error ...
}
The duplicate check reads the cron array, scans for a matching event, and then either bails out or proceeds to add the event and write the array back. This entire sequence is not atomic. Between the moment the duplicate check completes (finding no duplicate) and the moment the new event is written, another request can schedule the same event.
Reproducing the Duplicate Event Bug
Here is a plugin that demonstrates this race condition. It schedules a single event on every page load, relying on the built-in duplicate detection to prevent multiple copies:
add_action( 'init', function () {
    $hook = 'my_async_task';
    $args = array( get_current_user_id() );

    if ( ! wp_next_scheduled( $hook, $args ) ) {
        wp_schedule_single_event( time() + 300, $hook, $args );
    }
} );
This code has two layers of race conditions. First, the wp_next_scheduled() check and the wp_schedule_single_event() call are not atomic. Second, even if you removed the outer check and relied solely on wp_schedule_single_event()'s internal duplicate detection, that detection itself is not atomic.
Under concurrent requests, you can end up with two, three, or more copies of the same event in the cron array. Each copy will fire independently when its timestamp arrives, causing whatever side effects the callback produces to happen multiple times.
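You can often reproduce the duplicate-event bug locally by firing a burst of concurrent requests at a page that runs the scheduling code, then counting how many copies of the hook landed in the cron array (the URL is a placeholder for your test site):

```shell
# Fire 20 concurrent requests at the site.
for i in $(seq 1 20); do
    curl -s -o /dev/null "https://example.test/" &
done
wait

# Count how many copies of the example hook were scheduled.
wp cron event list --format=csv | grep -c my_async_task
```

On a site with a persistent object cache and several PHP workers, a count greater than 1 here confirms the race.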
A Safer Approach Using Transient Locks
To prevent duplicate scheduling under concurrency, you can wrap the scheduling operation in a transient-based lock. This does not make the operation truly atomic, but it narrows the race window significantly:
function schedule_task_safely( $hook, $args, $delay = 300 ) {
    $lock_key = 'sched_lock_' . md5( $hook . serialize( $args ) );

    // Attempt to acquire the lock with a single atomic INSERT IGNORE.
    // We go to the database directly to avoid cache inconsistency.
    global $wpdb;

    $inserted = $wpdb->query( $wpdb->prepare(
        "INSERT IGNORE INTO {$wpdb->options}
         (option_name, option_value, autoload)
         VALUES (%s, %s, 'no')",
        '_transient_' . $lock_key,
        time()
    ) );

    if ( ! $inserted ) {
        // Another process holds the lock.
        return false;
    }

    // We hold the lock. Schedule the event.
    if ( ! wp_next_scheduled( $hook, $args ) ) {
        wp_schedule_single_event( time() + $delay, $hook, $args );
    }

    // Convert the lock into a short-lived transient so it expires
    // on its own shortly after the write window closes.
    set_transient( $lock_key, time(), 30 );

    return true;
}
The key insight here is that INSERT IGNORE is atomic at the database level. If two processes try to insert the same option name simultaneously, only one will succeed. The other gets a zero-row result. This provides a genuine mutual exclusion primitive, unlike the check-then-set pattern used in core.
update_option() and the alloptions Cache Race Condition
The alloptions cache is one of the most performance-critical caches in WordPress. On every request, wp_load_alloptions() fetches all autoloaded options in a single query and stores the result in the object cache under the key alloptions in the options group:
function wp_load_alloptions( $force_cache = false ) {
    global $wpdb;

    // ... filter checks ...

    if ( ! wp_installing() || ! is_multisite() ) {
        $alloptions = wp_cache_get( 'alloptions', 'options', $force_cache );
    } else {
        $alloptions = false;
    }

    if ( ! $alloptions ) {
        // ... query database for all autoloaded options ...
        $alloptions = array();
        foreach ( (array) $alloptions_db as $o ) {
            $alloptions[ $o->option_name ] = $o->option_value;
        }

        wp_cache_add( 'alloptions', $alloptions, 'options' );
    }

    return $alloptions;
}
The function uses wp_cache_add() rather than wp_cache_set(). The difference is critical: wp_cache_add() only writes to the cache if the key does not already exist. This prevents one request from overwriting another request’s more recent cache entry. Or at least, that is the intent.
The Read-Modify-Write Race in update_option()
When update_option() successfully writes a new value to the database, it needs to update the alloptions cache to reflect the change. Look at the relevant section of update_option():
if ( ! wp_installing() ) {
    if ( ! isset( $update_args['autoload'] ) ) {
        $alloptions = wp_load_alloptions( true );

        if ( isset( $alloptions[ $option ] ) ) {
            $alloptions[ $option ] = $serialized_value;
            wp_cache_set( 'alloptions', $alloptions, 'options' );
        } else {
            wp_cache_set( $option, $serialized_value, 'options' );
        }
    }
    // ... other autoload cases ...
}
The sequence is:
1. Load the entire alloptions array from cache with wp_load_alloptions( true ), where true forces a fresh read from the persistent cache
2. Modify the single option value in the local PHP copy of the array
3. Write the entire array back to cache with wp_cache_set()
This is another read-modify-write cycle. If two processes update different options at nearly the same time, the following sequence can occur:
Time 0ms: Process A updates option 'foo' in the database
Time 1ms: Process B updates option 'bar' in the database
Time 2ms: Process A loads alloptions from cache (contains old 'foo', old 'bar')
Time 3ms: Process B loads alloptions from cache (contains old 'foo', old 'bar')
Time 4ms: Process A sets alloptions with new 'foo', old 'bar'
Time 5ms: Process B sets alloptions with old 'foo', new 'bar'
After this sequence, the cache contains old values for ‘foo’ because Process B’s write overwrote Process A’s update. The database has the correct values for both options, but the cache is now stale for ‘foo’. Every subsequent request that reads ‘foo’ from the alloptions cache will get the old value until the cache is invalidated or expires.
Symptoms of the alloptions Race
This race condition manifests as “phantom” settings changes. An admin changes a plugin setting, sees the confirmation message, but the setting appears to revert to its old value. Refreshing the page sometimes shows the new value and sometimes shows the old value, depending on which cache server handles the request.
The problem is most visible on sites with persistent object caches (Redis, Memcached) and multiple application servers. Without a persistent object cache, the in-memory object cache is per-request and cannot experience this cross-process race. But without a persistent object cache, every request pays the cost of a full alloptions query, which is exactly why persistent caches are used in the first place.
Mitigation: Cache Invalidation Instead of Update
Rather than trying to update the alloptions cache atomically, a safer pattern for plugins that update options frequently is to delete the cache and let the next read repopulate it:
function safe_update_option( $option, $value ) {
    $result = update_option( $option, $value );

    if ( $result ) {
        // Delete the alloptions cache entirely. The next call to
        // wp_load_alloptions() will rebuild it from the database.
        wp_cache_delete( 'alloptions', 'options' );
    }

    return $result;
}
This trades a small performance cost (one extra database query on the next request) for correctness. The database is the authoritative source of truth, and forcing a rebuild from the database eliminates the stale-cache problem.
That said, be careful with this approach on high-traffic sites. If you delete the alloptions cache on every option update, and you update options frequently, you can create a “thundering herd” situation where hundreds of concurrent requests all try to rebuild the cache at the same time. For high-frequency updates, consider using a non-autoloaded option (which bypasses the alloptions cache entirely) or using a transient with its own cache key.
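The non-autoloaded route needs no custom SQL; the standard Options API accepts an autoload flag. A brief sketch (the option name is hypothetical):

```php
// Create the option with autoload disabled so it never becomes part
// of the shared alloptions cache entry.
add_option( 'myplugin_hot_counter', 0, '', false );

// Updates then touch only this option's own cache key, not the big
// alloptions array, sidestepping the read-modify-write race above.
update_option( 'myplugin_hot_counter', $new_value, false );
```

The empty string is the deprecated third parameter of add_option(); the final argument is the autoload flag.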
Transient-Based Mutual Exclusion: The TransientLock Pattern
WordPress does not provide a built-in mutex or locking primitive. The closest thing is the pattern used by spawn_cron(), but as we saw, it has race windows. A tighter approach builds on the atomic properties of MySQL’s INSERT statement and the object cache’s add() method.
The Basic Pattern
class WP_Transient_Lock {

    /**
     * Attempt to acquire a named lock.
     *
     * @param string $name    Lock name.
     * @param int    $timeout Maximum lock duration in seconds.
     * @return string|false Lock token on success, false on failure.
     */
    public static function acquire( $name, $timeout = 30 ) {
        $token     = wp_generate_uuid4();
        $cache_key = 'lock_' . $name;

        if ( wp_using_ext_object_cache() ) {
            // wp_cache_add() is atomic in Redis/Memcached.
            $acquired = wp_cache_add( $cache_key, $token, 'locks', $timeout );
        } else {
            // Fall back to database-level atomicity. Note: the expiry is
            // stored in the value but nothing purges expired rows, so
            // callers must always release.
            global $wpdb;

            $option_name = '_transient_' . $cache_key;

            $acquired = (bool) $wpdb->query( $wpdb->prepare(
                "INSERT IGNORE INTO {$wpdb->options}
                 (option_name, option_value, autoload)
                 VALUES (%s, %s, 'no')",
                $option_name,
                $token . '|' . ( time() + $timeout )
            ) );
        }

        return $acquired ? $token : false;
    }

    /**
     * Release a named lock.
     *
     * @param string $name  Lock name.
     * @param string $token Token returned by acquire().
     * @return bool True if released, false otherwise.
     */
    public static function release( $name, $token ) {
        $cache_key = 'lock_' . $name;

        if ( wp_using_ext_object_cache() ) {
            // Note: this get-then-delete pair is itself not atomic. A fully
            // safe release needs a server-side compare-and-delete (e.g. a
            // small Redis Lua script), but the token check still prevents
            // the common case of deleting another process's lock.
            $current = wp_cache_get( $cache_key, 'locks' );
            if ( $current === $token ) {
                wp_cache_delete( $cache_key, 'locks' );
                return true;
            }
            return false;
        }

        global $wpdb;

        $option_name = '_transient_' . $cache_key;

        $deleted = $wpdb->query( $wpdb->prepare(
            "DELETE FROM {$wpdb->options}
             WHERE option_name = %s AND option_value LIKE %s",
            $option_name,
            $wpdb->esc_like( $token ) . '%'
        ) );

        return (bool) $deleted;
    }
}
Why wp_cache_add() Works as a Lock Primitive
Both Redis and Memcached implement ADD (or SETNX in Redis) as an atomic operation. If the key does not exist, it is created and the operation returns success. If the key already exists, the operation returns failure without modifying the existing value. This happens at the cache server level, not in PHP, so there is no check-then-act race window.
The wp_cache_add() function in WordPress maps directly to this operation when an external object cache is configured. When using the built-in file or in-memory cache, wp_cache_add() is per-process and does not provide cross-process atomicity, which is why the database fallback is necessary.
Using the Lock Pattern
function process_expensive_task( $task_id ) {
    $token = WP_Transient_Lock::acquire( 'task_' . $task_id, 60 );

    if ( ! $token ) {
        // Another process is already handling this task.
        return;
    }

    try {
        // Do the expensive work here.
        do_the_actual_work( $task_id );
    } finally {
        WP_Transient_Lock::release( 'task_' . $task_id, $token );
    }
}
The token-based release ensures that a process can only release its own lock. If Process A’s lock expires (because the work took longer than the timeout) and Process B acquires a new lock, Process A cannot accidentally release Process B’s lock when it finishes. This prevents a cascading failure where expired locks cause a chain of premature releases.
Stale-While-Revalidate for WordPress Transient Regeneration
The most common race condition in WordPress plugin code involves transient regeneration. The typical pattern looks like this:
function get_expensive_data() {
    $data = get_transient( 'expensive_data' );

    if ( false === $data ) {
        $data = compute_expensive_data(); // Takes 2-5 seconds.
        set_transient( 'expensive_data', $data, HOUR_IN_SECONDS );
    }

    return $data;
}
When the transient expires, every concurrent request finds false and calls compute_expensive_data() simultaneously. If this function makes external API calls, runs heavy database queries, or processes large datasets, you get a stampede of identical expensive operations. On a site with 50 concurrent requests, that is 50 copies of the same expensive computation running at the same time, often bringing the server to its knees.
This is commonly called a “cache stampede” or “thundering herd” problem. The stale-while-revalidate pattern solves it by serving expired (stale) data while exactly one process regenerates the cache in the background.
Implementing Stale-While-Revalidate
function get_data_with_swr( $key, $ttl, $callback ) {
    // Use a soft TTL stored alongside the data.
    $cached = get_transient( $key );

    if ( false !== $cached ) {
        $expires_at = get_transient( $key . '_expires' );

        if ( $expires_at && time() < $expires_at ) {
            // Data is fresh. Return it.
            return $cached;
        }

        // Data is stale. Try to acquire a regeneration lock.
        $lock_key = $key . '_regen_lock';
        $acquired = wp_cache_add( $lock_key, 1, 'transient_locks', 60 );

        if ( ! $acquired ) {
            // Another process is regenerating. Return stale data.
            return $cached;
        }

        // We hold the lock. Regenerate in this request.
        $fresh_data = call_user_func( $callback );

        // Set a hard TTL much longer than the soft TTL. This ensures
        // stale data survives even if regeneration is slow.
        set_transient( $key, $fresh_data, $ttl * 3 );
        set_transient( $key . '_expires', time() + $ttl, $ttl * 3 );

        // Release the lock.
        wp_cache_delete( $lock_key, 'transient_locks' );

        return $fresh_data;
    }

    // No cached data at all (first run or hard expiration).
    // Must compute synchronously.
    $data = call_user_func( $callback );

    set_transient( $key, $data, $ttl * 3 );
    set_transient( $key . '_expires', time() + $ttl, $ttl * 3 );

    return $data;
}
The Two-TTL Strategy Explained
The pattern uses two expiration times. The "soft TTL" is stored as a separate transient (_expires) and represents when the data should ideally be refreshed. The "hard TTL" is the actual expiration on the main transient and is set to 3x the soft TTL. This buffer means that even if regeneration takes a long time, or if the regenerating process crashes, stale data remains available for a significant window.
When the soft TTL passes, the first process to arrive acquires the regeneration lock and refreshes the data. All other processes see the lock and return the stale data immediately. The user experience is that the page loads fast for everyone, and the data is at most one TTL cycle behind.
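Usage looks like an ordinary transient wrapper. Here is a sketch that caches a slow remote API call for fifteen minutes (the function name, cache key, and endpoint are all made up for the example):

```php
function get_remote_stats() {
    return get_data_with_swr(
        'remote_stats',
        15 * MINUTE_IN_SECONDS,
        function () {
            $response = wp_remote_get( 'https://api.example.com/stats' );

            if ( is_wp_error( $response ) ) {
                // Design choice: cache an empty result on failure rather
                // than hammering a broken API on every stale hit.
                return array();
            }

            return json_decode( wp_remote_retrieve_body( $response ), true );
        }
    );
}
```

When the API is slow, only one request per soft-TTL cycle pays the cost of the wp_remote_get() call; everyone else gets the cached payload.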
Choosing the Right Lock Backend
The wp_cache_add() call used for the regeneration lock only works as a true atomic lock when an external object cache is in use. If your site runs without Redis or Memcached, the in-memory cache provides no cross-process protection. For sites without external object caches, replace the wp_cache_add() lock with the database-level INSERT IGNORE approach described in the TransientLock pattern above.
Database-Level Locking vs. WordPress Application-Level Locking
WordPress application-level locks (transients, options used as flags) are built on top of the database but do not use the database's own locking features. MySQL and MariaDB provide several locking mechanisms that are far stronger than anything WordPress implements at the application layer.
MySQL GET_LOCK() and RELEASE_LOCK()
MySQL's named lock functions provide true mutual exclusion:
function mysql_lock_acquire( $name, $timeout = 5 ) {
    global $wpdb;

    $result = $wpdb->get_var( $wpdb->prepare(
        "SELECT GET_LOCK(%s, %d)",
        'wp_' . $name,
        $timeout
    ) );

    return $result === '1';
}

function mysql_lock_release( $name ) {
    global $wpdb;

    $wpdb->query( $wpdb->prepare(
        "SELECT RELEASE_LOCK(%s)",
        'wp_' . $name
    ) );
}
GET_LOCK() acquires a server-wide named lock. If the lock is already held by another connection, the calling connection blocks for up to $timeout seconds. Only one connection can hold a given named lock at any time. The lock is automatically released when the connection closes, which prevents orphaned locks from lingering after a crashed process.
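A convenience wrapper ties the two helpers together so the lock is always released, even when the callback throws. A sketch (run_exclusive() is not a core function, just an illustrative name):

```php
function run_exclusive( $name, callable $callback, $timeout = 5 ) {
    if ( ! mysql_lock_acquire( $name, $timeout ) ) {
        return false; // Timed out waiting for the lock.
    }

    try {
        return $callback();
    } finally {
        // Released explicitly here; MySQL would also release the lock
        // automatically if this connection dropped.
        mysql_lock_release( $name );
    }
}
```

Because GET_LOCK() blocks with a timeout, callers queue up briefly instead of failing instantly, which is often the behavior you actually want for short critical sections.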
Advantages Over Transient Locks
True atomicity. There is no check-then-act gap. The lock acquisition and the exclusion guarantee are a single operation at the database level.
Automatic cleanup. If a PHP process crashes or the connection drops, MySQL automatically releases the lock. Transient locks remain set until they expire, which can block other processes even after the lock holder is gone.
Blocking with timeout. GET_LOCK() can wait for the lock to become available, up to a specified timeout. Transient locks are non-blocking: if you cannot acquire the lock, you must decide immediately whether to retry, skip, or fail.
Disadvantages of MySQL Locks
Connection scoped. The lock is tied to the MySQL connection. In WordPress, the $wpdb connection is typically a single persistent or non-persistent connection per PHP process. If your hosting environment uses connection pooling (such as ProxySQL), locks may not behave as expected because different PHP requests might share the same MySQL connection.
Limited number of locks. Before MySQL 5.7.5, each connection could hold only one named lock at a time. Acquiring a second lock implicitly released the first. MySQL 5.7.5+ supports multiple concurrent locks per connection, but you should verify your database version before relying on this.
Not replicated. Named locks are local to the MySQL server instance. If your WordPress installation uses read replicas, locks acquired on the primary server are not visible to connections hitting replicas. This matters for lock checks but not for lock acquisition, since writes always go to the primary.
SELECT ... FOR UPDATE
For operations that need to atomically read and update a specific row, SELECT ... FOR UPDATE provides row-level locking within a transaction:
function atomic_increment_counter( $option_name ) {
    global $wpdb;

    $wpdb->query( 'START TRANSACTION' );

    $current = $wpdb->get_var( $wpdb->prepare(
        "SELECT option_value FROM {$wpdb->options}
         WHERE option_name = %s FOR UPDATE",
        $option_name
    ) );

    $new_value = (int) $current + 1;

    $wpdb->update(
        $wpdb->options,
        array( 'option_value' => $new_value ),
        array( 'option_name' => $option_name )
    );

    $wpdb->query( 'COMMIT' );

    // Invalidate the option caches so a later get_option() call does
    // not return the stale pre-increment value.
    wp_cache_delete( $option_name, 'options' );
    wp_cache_delete( 'alloptions', 'options' );

    return $new_value;
}
The FOR UPDATE clause acquires an exclusive lock on the selected row. Any other transaction that tries to SELECT ... FOR UPDATE the same row will block until the first transaction completes. This is the proper way to implement atomic read-modify-write operations on the wp_options table.
However, using transactions and row locks in WordPress requires care. WordPress does not use transactions by default, and many plugins assume they can read uncommitted data. Long-running transactions can also cause lock contention and deadlocks under high concurrency. Use this approach for specific critical sections, not as a general replacement for the Options API.
Real-World Debugging with Query Monitor and WP-CLI Cron Commands
Race conditions are notoriously difficult to debug because they are timing-dependent. You cannot set a breakpoint and step through the code, because the act of pausing one process changes the timing that triggers the bug. Instead, you need tools that observe the system without significantly altering its behavior.
Query Monitor for Option and Transient Analysis
The Query Monitor plugin (by John Blackbourn) shows every database query executed during a request, including the full SQL and the PHP call stack that triggered it. This is invaluable for identifying redundant option reads, unexpected cache misses, and duplicate cron operations.
To investigate a suspected cron race condition, enable Query Monitor and look for:
- Multiple SELECT option_value FROM wp_options WHERE option_name = 'cron' queries in a single request (indicates the cron array is being read multiple times, suggesting it was invalidated mid-request)
- UPDATE wp_options SET option_value = ... WHERE option_name = 'cron' appearing more than once per request
- Cache misses on alloptions after the first load (indicates another process invalidated the cache between reads)
For transient-based race conditions, watch for:
- DELETE FROM wp_options WHERE option_name = '_transient_timeout_...' immediately followed by INSERT INTO wp_options for the same transient (this is the delete-then-recreate path in set_transient(), which has its own race window)
- Multiple identical INSERT attempts for the same transient name (indicates multiple processes trying to set the transient simultaneously)
WP-CLI Cron Commands
WP-CLI provides several commands for inspecting and manipulating the cron system without going through the web-based cron runner:
# List all scheduled cron events with their next run time.
wp cron event list
# Run all due cron events immediately.
wp cron event run --due-now
# Run a specific cron hook.
wp cron event run my_custom_hook
# Show the raw cron option value (useful for debugging serialization).
wp option get cron --format=json | python -m json.tool
# Check if WP-Cron is working.
wp cron test
The wp cron event list command is particularly useful for diagnosing duplicate events. If you see the same hook with the same args appearing multiple times with different timestamps, you have a race condition in your scheduling code.
# Count occurrences of each hook to find duplicates.
wp cron event list --format=csv | cut -d',' -f1 | sort | uniq -c | sort -rn
Logging Concurrent Access
For race conditions that you cannot catch with Query Monitor (because they involve two separate requests), add targeted logging:
function log_cron_modification( $option, $old_value, $value ) {
    if ( $option !== 'cron' ) {
        return;
    }

    $old_count = is_array( $old_value ) ? count( $old_value ) : 0;
    $new_count = is_array( $value ) ? count( $value ) : 0;

    error_log( sprintf(
        '[CRON_RACE_DEBUG] PID=%d Time=%.6f Events: %d -> %d Backtrace: %s',
        getmypid(),
        microtime( true ),
        $old_count,
        $new_count,
        wp_debug_backtrace_summary()
    ) );
}
add_action( 'update_option', 'log_cron_modification', 10, 3 );
This hooks into the update_option action (fired inside update_option() before the database write) and logs the process ID, a high-resolution timestamp, the event count change, and the call stack. When you see two log entries with different PIDs and timestamps within a few milliseconds of each other, both showing a decrease in event count (or different event counts), you have found your race condition.
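Once the log entries exist, a short script can flag overlapping writers automatically. Here is a minimal sketch; the log excerpt below is fabricated sample data, and in practice you would grep CRON_RACE_DEBUG out of wp-content/debug.log instead:

```shell
# Fabricated sample entries in the format the logger above produces.
cat > /tmp/cron_race.log <<'EOF'
[CRON_RACE_DEBUG] PID=311 Time=1700000000.001200 Events: 12 -> 11 Backtrace: wp_unschedule_event
[CRON_RACE_DEBUG] PID=312 Time=1700000000.004900 Events: 12 -> 11 Backtrace: wp_unschedule_event
EOF

# Flag consecutive entries from different PIDs less than 10ms apart.
awk '{
    pid = substr($2, 5); t = substr($3, 6) + 0
    if (NR > 1 && pid != prev_pid && t - prev_t < 0.010)
        print "OVERLAP:", prev_pid, pid
    prev_pid = pid; prev_t = t
}' /tmp/cron_race.log
```

Two entries from different PIDs inside the same few milliseconds, both modifying the cron option, are the fingerprint of the race.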
Code Examples: Race Conditions and Their Fixes
Example 1: Counter Increment Race
The buggy version:
function increment_page_views( $post_id ) {
$views = (int) get_post_meta( $post_id, 'view_count', true );
update_post_meta( $post_id, 'view_count', $views + 1 );
}
Two requests reading view_count = 42 at the same time will both write 43, losing one count. The fixed version uses a SQL-level atomic increment:
function increment_page_views( $post_id ) {
global $wpdb;
$meta_key = 'view_count';
// Ensure the meta row exists. (add_post_meta()'s $unique check is
// itself check-then-act: in the worst case two processes both insert
// a row, and the UPDATE below increments every matching row.)
if ( ! metadata_exists( 'post', $post_id, $meta_key ) ) {
add_post_meta( $post_id, $meta_key, 0, true );
}
// Atomic increment at the SQL level.
$wpdb->query( $wpdb->prepare(
"UPDATE {$wpdb->postmeta}
SET meta_value = meta_value + 1
WHERE post_id = %d AND meta_key = %s",
$post_id,
$meta_key
) );
// Invalidate the meta cache so subsequent reads
// in this request see the new value.
wp_cache_delete( $post_id, 'post_meta' );
}
The SQL meta_value = meta_value + 1 expression is evaluated by MySQL in a single operation. No read-modify-write gap exists at the application level.
Example 2: API Rate Limit Tracker Race
The buggy version:
function check_api_rate_limit( $api_name, $max_per_minute ) {
$key = 'rate_' . $api_name;
$count = (int) get_transient( $key );
if ( $count >= $max_per_minute ) {
return false; // Rate limited.
}
set_transient( $key, $count + 1, 60 );
return true; // Allowed.
}
Under concurrent requests, multiple processes read the same count, all see it as below the limit, and all increment it. The actual request count exceeds the intended limit. The fixed version:
function check_api_rate_limit( $api_name, $max_per_minute ) {
global $wpdb;
$option_name = '_rate_limit_' . $api_name;
$now = time();
$window = 60;
    // Atomically create the window row or increment its counter,
    // resetting the window first if it has expired.
    $wpdb->query( $wpdb->prepare(
        "INSERT INTO {$wpdb->options}
            (option_name, option_value, autoload)
        VALUES (%s, %s, 'no')
        ON DUPLICATE KEY UPDATE option_value =
            IF(
                SUBSTRING_INDEX(option_value, '|', 1) + %d < %d,
                CONCAT(%d, '|', 1),
                CONCAT(
                    SUBSTRING_INDEX(option_value, '|', 1),
                    '|',
                    CAST(SUBSTRING_INDEX(option_value, '|', -1) AS UNSIGNED) + 1
                )
            )",
        $option_name,
        $now . '|1',
        $window,
        $now,
        $now
    ) );
    // Check if we are within the limit.
    $value = $wpdb->get_var( $wpdb->prepare(
        "SELECT option_value FROM {$wpdb->options}
        WHERE option_name = %s",
        $option_name
    ) );
    if ( ! $value ) {
        return true;
    }
    list( $window_start, $count ) = explode( '|', $value );
    // Defensive: if the window has somehow expired, allow
    // (the next INSERT resets it).
    if ( (int) $window_start + $window < $now ) {
        return true;
    }
    return (int) $count <= $max_per_minute;
}
This is more complex, but it pushes the atomicity into MySQL using INSERT ... ON DUPLICATE KEY UPDATE. The window timestamp and counter are stored together in a single value ("start|count"); the counter always increments, and the window resets atomically when it expires. Note that capping the increment inside the SQL (refusing to count requests once the limit is reached) is subtly wrong: a blocked increment leaves the stored count equal to the limit, which still passes the final count <= limit check, so every request past the limit would be allowed. Incrementing unconditionally and comparing afterwards keeps the check correct, and no PHP-level read-modify-write cycle exists.
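On sites with a persistent object cache there is a much shorter alternative: lean on the cache's atomic primitives instead of SQL. The following sketch (function name is ours) assumes a Redis or Memcached backend, where wp_cache_add() and wp_cache_incr() map to atomic server-side operations; with the default in-request object cache it degrades to per-process counting and offers no cross-request protection.

```php
function check_api_rate_limit_cached( $api_name, $max_per_minute ) {
    // One key per minute-aligned window, so cache expiry handles the reset.
    $key = 'rate_' . $api_name . '_' . (int) floor( time() / 60 );

    // Atomic create-if-missing: exactly one process seeds the counter.
    wp_cache_add( $key, 0, 'rate_limits', 2 * MINUTE_IN_SECONDS );

    // Atomic increment; returns the post-increment value, or false on failure.
    $count = wp_cache_incr( $key, 1, 'rate_limits' );

    return ( false !== $count ) && ( $count <= $max_per_minute );
}
```

Because the window is encoded in the key itself, there is no timestamp to parse and no reset logic: old windows simply expire out of the cache.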
Example 3: Idempotent Cron Event Handler
Even with perfect scheduling, cron events can fire multiple times due to the race conditions described earlier. The handler itself should be idempotent:
add_action( 'send_weekly_report', function( $user_id ) {
// Use a flag to prevent duplicate sends.
// 'o' is the ISO-8601 week-numbering year, which stays consistent with
// 'W' across year boundaries ('Y-W' misbehaves in late December).
$flag_key = 'weekly_report_sent_' . $user_id . '_' . date( 'o-W' );
// Atomic flag check using INSERT IGNORE.
global $wpdb;
$inserted = $wpdb->query( $wpdb->prepare(
"INSERT IGNORE INTO {$wpdb->options}
(option_name, option_value, autoload)
VALUES (%s, %s, 'no')",
'_transient_' . $flag_key,
'1'
) );
if ( ! $inserted ) {
// Report already sent this week.
error_log( "Weekly report already sent for user {$user_id}, skipping." );
return;
}
// Set an expiration for cleanup.
set_transient( $flag_key, '1', WEEK_IN_SECONDS );
// Actually send the report.
send_report_to_user( $user_id );
} );
The INSERT IGNORE flag ensures that no matter how many times the cron event fires, the report is only sent once per week per user. The transient expiration provides automatic cleanup so the options table does not accumulate stale flags indefinitely. Two caveats: set_transient()'s delete-then-recreate path briefly reopens the window between the flag insert and the expiration write, and on sites with a persistent object cache, transients bypass wp_options entirely, so the raw row inserted here will never be cleaned up by transient expiration. The date-stamped key at least keeps those rows unique per week, but on such sites it is worth scheduling a periodic cleanup of stale _transient_ rows.
Testing for Race Conditions in Development
Finding race conditions in a development environment is hard because development typically means one developer making one request at a time. You need to simulate concurrency deliberately.
Method 1: Parallel curl Requests
The simplest approach is to fire many requests simultaneously using curl or a load testing tool:
# Fire 50 concurrent requests to a page that
# triggers your scheduling code.
seq 50 | xargs -P 50 -I {} curl -s -o /dev/null \
-w "Request {}: HTTP %{http_code} in %{time_total}s\n" \
"https://example.com/page-that-schedules-events/"
# Then check for duplicate cron events.
wp cron event list --format=csv | sort | uniq -c | sort -rn | head -20
Method 2: Strategic sleep() Injection
To widen a narrow race window for debugging, temporarily inject a sleep() call between the read and write operations:
// TEMPORARY DEBUG CODE - REMOVE BEFORE DEPLOYING.
function debug_widen_cron_race() {
// Hook into the read phase of cron scheduling.
add_filter( 'pre_option_cron', function( $pre ) {
// Only slow down non-cron requests.
if ( ! defined( 'DOING_CRON' ) ) {
usleep( 500000 ); // 500ms delay.
}
return $pre;
}, 1 );
}
add_action( 'plugins_loaded', 'debug_widen_cron_race' );
This makes the race window 500ms wide instead of a few microseconds, making it trivial to reproduce with even two concurrent requests. Remember to remove this code before deploying.
Method 3: PHPUnit with Forked Processes
For automated testing, you can use pcntl_fork() to create concurrent processes within a test:
class Test_Cron_Race_Condition extends WP_UnitTestCase {
public function test_concurrent_scheduling_does_not_duplicate() {
$hook = 'test_race_hook';
$args = array( 'test_data' );
// Fork 10 child processes.
$pids = array();
for ( $i = 0; $i < 10; $i++ ) {
$pid = pcntl_fork();
if ( $pid === 0 ) {
// Child process: schedule the event.
wp_schedule_single_event( time() + 300, $hook, $args );
exit( 0 );
}
$pids[] = $pid;
}
// Wait for all children to complete.
foreach ( $pids as $pid ) {
pcntl_waitpid( $pid, $status );
}
// Count how many copies of the event exist.
$crons = _get_cron_array();
$count = 0;
foreach ( $crons as $timestamp => $hooks ) {
if ( isset( $hooks[ $hook ] ) ) {
$count += count( $hooks[ $hook ] );
}
}
// Should be exactly 1, but will likely be more
// due to the race condition.
$this->assertEquals( 1, $count,
"Expected 1 event but found {$count}. Race condition detected."
);
}
}
Note that pcntl_fork() is not available on all PHP installations and does not work well with the WordPress test framework's database transaction handling. Each forked process gets a copy of the database connection, which can cause issues. A more reliable approach is to use a load testing tool (like ab, wrk, or k6) to hit an endpoint that triggers the scheduling code.
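When pcntl is unavailable, WP-CLI itself can supply the concurrency. This sketch (the hook name is illustrative; it assumes wp is configured for the site) fires ten full WordPress bootstraps in parallel, each trying to schedule the same event:

```shell
# Schedule the same single event from 10 parallel WP-CLI processes.
seq 10 | xargs -P 10 -I {} wp eval \
  'wp_schedule_single_event( time() + 300, "test_race_hook", array( "x" ) );'

# More than one surviving copy means the scheduling race fired.
wp cron event list --format=csv | grep -c 'test_race_hook'

# Clean up all copies afterwards.
wp cron event delete test_race_hook
```

Each wp eval process opens its own database connection, which makes this a more faithful reproduction of production concurrency than forked PHP sharing a connection.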
Method 4: MySQL Slow Query Log for Lock Contention
Enable the MySQL slow query log with a very low threshold to catch lock wait times:
# In MySQL configuration or at runtime:
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0.01; -- 10ms
SET GLOBAL log_queries_not_using_indexes = 1;
Then examine the slow query log for queries against wp_options that take longer than expected. Lock contention on the cron option row or the alloptions cache row will show up as queries with elevated execution times even though the queries themselves are simple.
Method 5: Using wp cron event list for Post-Mortem Analysis
After running a concurrency test, use WP-CLI to dump the cron state and look for anomalies:
# Full cron dump as JSON for analysis.
wp option get cron --format=json > cron_dump.json
# Count total events per hook.
wp cron event list --fields=hook --format=csv | \
tail -n +2 | sort | uniq -c | sort -rn
# Look for events scheduled in the past
# (indicates they were missed due to lock contention).
wp cron event list --fields=hook,next_run_relative --format=table | \
grep 'ago'
If you find events scheduled "3 hours ago" or "1 day ago" that have not fired, the cron runner was likely blocked by lock contention. The doing_cron transient might have gotten stuck, or the cron option write might have failed silently due to a deadlock.
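When the doing_cron lock does get stuck, WP-CLI can inspect and clear it directly. The lock is an ordinary transient holding the microtime at which the runner started:

```shell
# Inspect the cron runner lock; a value means a runner claims to be active.
wp transient get doing_cron

# If that runner died and the lock is stale, clear it...
wp transient delete doing_cron

# ...then drain the backlog of overdue events.
wp cron event run --due-now
```

Only clear the lock if you are confident no runner is actually executing; deleting it while a legitimate run is in progress reintroduces the duplicate-execution race this guide describes.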
Architectural Recommendations
After working through all these race conditions, several high-level principles emerge for building WordPress code that survives concurrent access.
Push atomicity into the database. PHP cannot provide cross-process atomicity. MySQL can. Whenever you need to atomically read-modify-write, use SQL expressions (meta_value = meta_value + 1), INSERT IGNORE, INSERT ... ON DUPLICATE KEY UPDATE, SELECT ... FOR UPDATE, or GET_LOCK(). Do not read a value into PHP, modify it, and write it back.
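As a concrete sketch of the GET_LOCK() route (the helper name is ours): MySQL named locks are connection-scoped, so if the PHP process dies mid-critical-section, MySQL releases the lock automatically when the connection closes.

```php
function with_mysql_lock( $lock_name, callable $callback, $timeout = 5 ) {
    global $wpdb;

    // GET_LOCK() blocks for up to $timeout seconds;
    // returns 1 on success, 0 on timeout.
    $acquired = $wpdb->get_var(
        $wpdb->prepare( 'SELECT GET_LOCK(%s, %d)', $lock_name, $timeout )
    );

    if ( '1' !== $acquired ) {
        return false; // Another process holds the lock.
    }

    try {
        return $callback();
    } finally {
        // Always release, even if the callback throws.
        $wpdb->query( $wpdb->prepare( 'SELECT RELEASE_LOCK(%s)', $lock_name ) );
    }
}
```

A call like with_mysql_lock( 'myplugin_rebuild_index', function () { /* rebuild */ } ) then serializes the rebuild across every web process sharing the database, with no lock table to maintain.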
Design for idempotency. Assume your code will run more than once. Cron handlers should check whether their work has already been done before doing it. API endpoints should handle duplicate submissions gracefully. If running the same operation twice produces the same result as running it once, race conditions become harmless.
Prefer cache invalidation over cache update. When you modify data in the database, delete the relevant cache key rather than trying to update it. Let the next read repopulate the cache from the database. The database is the single source of truth; the cache is a performance optimization that should never be treated as authoritative.
Use external object caches for lock primitives. Redis's SETNX (exposed through wp_cache_add() when using a Redis object cache) provides true atomic lock acquisition. The built-in WordPress object cache does not. If your application needs real mutex behavior, Redis or Memcached is not optional.
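A minimal mutex sketch on top of wp_cache_add() (the key and work function are illustrative; this assumes a persistent Redis or Memcached backend, where the add is atomic):

```php
$lock_key = 'rebuild_report_lock'; // Illustrative lock name.

// Atomic "create if missing": exactly one process gets true.
// The 60s TTL is a safety net so a crashed worker cannot wedge the lock.
if ( wp_cache_add( $lock_key, getmypid(), 'locks', 60 ) ) {
    try {
        rebuild_report(); // Illustrative expensive work.
    } finally {
        wp_cache_delete( $lock_key, 'locks' );
    }
} else {
    // Another process is already doing the work; skip rather than duplicate.
}
```

The TTL is the important design choice: it trades a small chance of a second worker starting early (if the first overruns 60 seconds) for the guarantee that the lock can never be held forever.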
Avoid storing high-contention data in autoloaded options. The alloptions cache is a single, large cache entry that is read on every request and invalidated whenever any autoloaded option changes. If your plugin updates an autoloaded option frequently, every update invalidates the entire alloptions cache for all concurrent requests. Use non-autoloaded options, transients, or custom tables for data that changes more than a few times per minute.
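To find out whether your own options are part of the problem, WP-CLI can size the autoloaded set (size_bytes is a built-in field of wp option list):

```shell
# The 20 largest autoloaded options — deserialized on every request.
wp option list --autoload=on --fields=option_name,size_bytes --format=csv \
  | tail -n +2 | sort -t',' -k2 -rn | head -20

# Total autoloaded payload in bytes.
wp option list --autoload=on --fields=size_bytes --format=csv \
  | tail -n +2 | paste -sd+ - | bc
```

A total in the low hundreds of kilobytes is typical; multi-megabyte totals mean every request pays a heavy deserialization cost and every autoloaded write invalidates a very large alloptions cache entry.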
Replace WP-Cron with a system cron for production sites. Add define( 'DISABLE_WP_CRON', true ); to wp-config.php and run wp-cron.php via a system cron job. This eliminates the spawn_cron() race condition entirely, because only one cron process runs at each scheduled interval. The typical crontab entry is:
* * * * * cd /var/www/html && php wp-cron.php >> /var/log/wp-cron.log 2>&1
Or using WP-CLI:
* * * * * cd /var/www/html && wp cron event run --due-now >> /var/log/wp-cron.log 2>&1
With a system cron, you get predictable timing, no per-request overhead from spawn_cron(), and no duplicate cron spawning from concurrent web requests. The cron option serialization race condition still exists if plugins schedule events during web requests, but the execution side is now single-process.
Monitor for symptoms. Race conditions rarely announce themselves. Set up monitoring for: duplicate cron events (same hook+args at different timestamps), options that appear to "revert" after being changed, transient regeneration storms (many identical expensive queries in the slow query log), and deadlock errors in the MySQL error log. Catching these symptoms early lets you identify and fix the underlying race condition before it causes data loss.
Race conditions in WordPress are not bugs you can fix once and forget about. They are a consequence of the architecture, and every piece of code that reads shared state, modifies it, and writes it back is potentially vulnerable. The strategies in this guide provide the tools to identify, reproduce, and fix these problems. But the most effective defense is to write code that assumes concurrent access from the start, rather than treating it as an edge case to handle later.
Rachel Torres
Senior WordPress developer and core contributor. Specializes in WordPress internals, performance optimization, and PHP best practices. Runs a WordPress consultancy in Austin, Texas.