sql server – log send queue size and redo queue size in AG

Both the log send queue and redo queue can be found in sys.sys.dm_hadr_database_replica_states (docs), and also available using the SQLServer:Database Replica Perfmon counters (docs).


With Perfmon counters, you’ll need to collect counters individually on each replica. Each replica knows about the Send/Redo(recovery) queue affecting that replica itself.

It sounds like you already found the Log Send Queue perfmon counter, which is defined as:

Amount of log records in the log files of the primary database, in kilobytes, that haven’t been sent to the secondary replica. This value is sent to the secondary replica from the primary replica. Queue size doesn’t include FILESTREAM files that are sent to a secondary.

The perfmon counter for the Redo Queue is actually the Recovery Queue counter, which is defined as:

Amount of log records in the log files of the secondary replica that have not been redone.

You can also access the Perfmon counters using T-SQL by querying the sys.dm_os_performance_counters DMV.

FROM sys.dm_os_performance_counters
WHERE object_name = 'SQLServer:Database Replica'
AND counter_name IN ('Log Send Queue','Recovery Queue')
ORDER BY cntr_value DESC;


From T-SQL, you can query the DMVs to get this info. The below query is from my standard toolbox, and includes both the Send & Redo queue sizes & rates (in the first few columns), plus some other information on the replica states & send/redo status. Note that running this on the Primary Replica will report status information for all replicas, running it on the secondary replica will only report local information for that secondary replica.

The advantage of monitoring via T-SQL is that you can have a single/central collection point to check on the status of all replicas at once.

-- ▶ Run this on **PRIMARY**

    --AG info
    AGName                          = ag.name, 
    ReplicaServer                   = ar.replica_server_name, 
    DbName                          = adc.database_name, 
    -- Queue Size & rates
    SendQueueSize                   = drs.log_send_queue_size,
    RedoQueueSize                   = drs.redo_queue_size, 
    SendRateKb                      = drs.log_send_rate, 
    RedoRateKb                      = drs.redo_rate, 
    -- State Info
    IsLocal                         = drs.is_local, 
    IsPrimary                       = drs.is_primary_replica, 
    SynchronizationState            = drs.synchronization_state_desc, 
    SynchronizationHealthState      = drs.synchronization_health_desc, 
    -- Last ____ Time
    LastSentTime                    = drs.last_sent_time, 
    LastRedoTime                    = drs.last_redone_time, 
    LastReceivedTime                = drs.last_received_time, 
    LastHardenedTime                = drs.last_hardened_time, 
    LastCommitTime                  = drs.last_commit_time, 
    -- LSN info
    RecoveryLSN                     = drs.recovery_lsn, 
    TruncationLSN                   = drs.truncation_lsn, 
    LastSentLSN                     = drs.last_sent_lsn, 
    LastReceivedLSN                 = drs.last_received_lsn, 
    LastHardenedLSN                 = drs.last_hardened_lsn, 
    LastRedoneLSN                   = drs.last_redone_lsn, 
    EndOfLogLSN                     = drs.end_of_log_lsn, 
    LastCommitLSN                   = drs.last_commit_lsn,
    --Oh yeah, filestream, too
    FileStreamSendRate              = drs.filestream_send_rate
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_databases_cluster AS adc 
    ON drs.group_id = adc.group_id AND drs.group_database_id = adc.group_database_id
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
JOIN sys.availability_replicas AS ar 
    ON drs.group_id = ar.group_id AND drs.replica_id = ar.replica_id
    -- ag.name, ar.replica_server_name, adc.database_name;
    drs.log_send_queue_size + drs.redo_queue_size DESC;