Daily Data Guard Monitoring Checklist for Cloud Ops DBAs
Oracle Data Guard ensures high availability and disaster recovery for critical databases running in Oracle Cloud Infrastructure (OCI). For a Cloud Ops DBA, monitoring Data Guard daily is essential to ensure the primary and standby databases remain healthy, synchronized, and ready for switchover or failover.
This blog provides a step-by-step daily monitoring checklist with commands, SQL queries, and sample outputs.
Step 1: Verify Data Guard Association with OCI CLI
Check the Data Guard association status to confirm that the standby database is healthy.
oci db data-guard-association list \
--database-id ocid1.database.oc1.ap-mumbai-1.abcdexample
Sample output
{
"data": [
{
"id": "ocid1.dataguardassociation.oc1.ap-mumbai-1.abcdefexample",
"lifecycle-state": "AVAILABLE",
"role": "PRIMARY",
"peer-role": "STANDBY",
"protection-mode": "MAXIMUM_PERFORMANCE",
"transport-type": "ASYNC"
}
]
}
Interpretation: The primary database is in AVAILABLE state and Data Guard is active.
Step 2: Check Archive Destination Status on Primary
Connect to the primary database and verify archive log transport.
SELECT dest_id,
status,
protection_mode,
synchronization_status,
error
FROM v$archive_dest_status
WHERE dest_id > 1;
Sample output
DEST_ID STATUS PROTECTION_MODE SYNCHRONIZATION_STATUS ERROR
------- --------- ------------------- ---------------------- -----
2 VALID MAXIMUM PERFORMANCE SYNCHRONIZED
Interpretation: The destination is valid and synchronized, which means redo logs are shipping to the standby.
Step 3: Verify Managed Recovery Process on Standby
On the standby database, confirm that the Managed Recovery Process (MRP) is applying redo.
SELECT process, status, thread#, sequence#
FROM v$managed_standby
WHERE process LIKE 'MRP%';
Sample output
PROCESS STATUS THREAD# SEQUENCE#
--------- --------- ------- ---------
MRP0 APPLYING 1 14523
Interpretation: The redo logs are being applied successfully by the standby.
Step 4: Monitor Transport and Apply Lag
Lag is a critical indicator of standby health.
SELECT name, value, unit
FROM v$dataguard_stats
WHERE name IN ('transport lag','apply lag');
Sample output
NAME VALUE UNIT
-------------- ----- ----------------
transport lag +00 00:00:04
apply lag +00 00:00:07
Interpretation: Transport lag is 4 seconds, apply lag is 7 seconds, which is acceptable in asynchronous mode.
Step 5: Review Errors in Data Guard
Check for Data Guard errors in the alert log.
SELECT originating_timestamp, message_text
FROM v$diag_alert_ext
WHERE message_text LIKE '%ORA-%'
AND originating_timestamp > SYSDATE - 1
ORDER BY originating_timestamp DESC;
Sample output
ORIGINATING_TIMESTAMP MESSAGE_TEXT
----------------------- -----------------------------------------
05-SEP-25 09:45:21 ORA-16826: apply service unexpectedly terminated
Interpretation: Any errors like ORA-16826 must be investigated immediately.
Step 6: Check Synchronization Using OCI CLI Metrics
Fetch Data Guard lag metrics directly from OCI.
oci monitoring metric-data summarize-metrics-data \
--compartment-id ocid1.compartment.oc1..aaaaexample \
--namespace oci_database \
--query-text "DataGuardTransportLag[1m].mean(), DataGuardApplyLag[1m].mean()" \
--start-time 2025-09-05T09:30:00Z \
--end-time 2025-09-05T09:40:00Z
Sample output
{
"data": [
{
"aggregated-datapoints": [
{
"timestamp": "2025-09-05T09:35:00+00:00",
"DataGuardTransportLag[1m].mean()": 3,
"DataGuardApplyLag[1m].mean()": 6
}
]
}
]
}
Interpretation: Lag values are low, confirming standby synchronization.
Step 7: Confirm Role and Readiness
On both primary and standby databases, check the database role.
SELECT database_role, open_mode, protection_mode
FROM v$database;
Sample output on primary
DATABASE_ROLE OPEN_MODE PROTECTION_MODE
------------- ---------- -------------------
PRIMARY READ WRITE MAXIMUM PERFORMANCE
Sample output on standby
DATABASE_ROLE OPEN_MODE PROTECTION_MODE
------------- ---------- -------------------
PHYSICAL STANDBY MOUNTED MAXIMUM PERFORMANCE
Interpretation: The roles are correct and consistent.
Step 8: Record and Report
Document daily results in logs or send alerts. Many DBAs integrate these queries and commands into shell scripts that email results to the DBA team.
Conclusion
This daily checklist enables Cloud Ops DBAs to ensure Data Guard is functioning properly in Oracle Cloud Databases. By combining OCI CLI commands with SQL queries, you gain full visibility into replication, synchronization, and lag. Regular monitoring prevents surprises during switchover or failover, keeping mission-critical applications highly available.
No Comments