Daily Data Guard Monitoring Checklist for Cloud Ops DBAs

Oracle Data Guard ensures high availability and disaster recovery for critical databases running in Oracle Cloud Infrastructure (OCI). For a Cloud Ops DBA, monitoring Data Guard daily is essential to ensure the primary and standby databases remain healthy, synchronized, and ready for switchover or failover.

This blog provides a step-by-step daily monitoring checklist with commands, SQL queries, and sample outputs.

Step 1: Verify Data Guard Association with OCI CLI

Check the Data Guard association status to confirm that the standby database is healthy.

oci db data-guard-association list \
--database-id ocid1.database.oc1.ap-mumbai-1.abcdexample

Sample output

{
  "data": [
    {
      "id": "ocid1.dataguardassociation.oc1.ap-mumbai-1.abcdefexample",
      "lifecycle-state": "AVAILABLE",
      "role": "PRIMARY",
      "peer-role": "STANDBY",
      "protection-mode": "MAXIMUM_PERFORMANCE",
      "transport-type": "ASYNC"
    }
  ]
}

Interpretation: The primary database is in AVAILABLE state and Data Guard is active.

Step 2: Check Archive Destination Status on Primary

Connect to the primary database and verify archive log transport.

SELECT dest_id,
       status,
       protection_mode,
       synchronization_status,
       error
FROM   v$archive_dest_status
WHERE  dest_id > 1;

Sample output

DEST_ID STATUS    PROTECTION_MODE     SYNCHRONIZATION_STATUS ERROR
------- --------- ------------------- ---------------------- -----
      2 VALID     MAXIMUM PERFORMANCE SYNCHRONIZED

Interpretation: The destination is valid and synchronized, which means redo logs are shipping to the standby.

Step 3: Verify Managed Recovery Process on Standby

On the standby database, confirm that the Managed Recovery Process (MRP) is applying redo.

SELECT process, status, thread#, sequence#
FROM   v$managed_standby
WHERE  process LIKE 'MRP%';

Sample output

PROCESS   STATUS    THREAD# SEQUENCE#
--------- --------- ------- ---------
MRP0      APPLYING       1     14523

Interpretation: The redo logs are being applied successfully by the standby.

Step 4: Monitor Transport and Apply Lag

Lag is a critical indicator of standby health.

SELECT name, value, unit
FROM   v$dataguard_stats
WHERE  name IN ('transport lag','apply lag');

Sample output

NAME           VALUE UNIT
-------------- ----- ----------------
transport lag  +00 00:00:04
apply lag      +00 00:00:07

Interpretation: Transport lag is 4 seconds, apply lag is 7 seconds, which is acceptable in asynchronous mode.

Step 5: Review Errors in Data Guard

Check for Data Guard errors in the alert log.

SELECT originating_timestamp, message_text
FROM   v$diag_alert_ext
WHERE  message_text LIKE '%ORA-%'
AND    originating_timestamp > SYSDATE - 1
ORDER BY originating_timestamp DESC;

Sample output

ORIGINATING_TIMESTAMP   MESSAGE_TEXT
----------------------- -----------------------------------------
05-SEP-25 09:45:21      ORA-16826: apply service unexpectedly terminated

Interpretation: Any errors like ORA-16826 must be investigated immediately.

Step 6: Check Synchronization Using OCI CLI Metrics

Fetch Data Guard lag metrics directly from OCI.

oci monitoring metric-data summarize-metrics-data \
--compartment-id ocid1.compartment.oc1..aaaaexample \
--namespace oci_database \
--query-text "DataGuardTransportLag[1m].mean(), DataGuardApplyLag[1m].mean()" \
--start-time 2025-09-05T09:30:00Z \
--end-time 2025-09-05T09:40:00Z

Sample output

{
  "data": [
    {
      "aggregated-datapoints": [
        {
          "timestamp": "2025-09-05T09:35:00+00:00",
          "DataGuardTransportLag[1m].mean()": 3,
          "DataGuardApplyLag[1m].mean()": 6
        }
      ]
    }
  ]
}

Interpretation: Lag values are low, confirming standby synchronization.

Step 7: Confirm Role and Readiness

On both primary and standby databases, check the database role.

SELECT database_role, open_mode, protection_mode
FROM   v$database;

Sample output on primary

DATABASE_ROLE OPEN_MODE  PROTECTION_MODE
------------- ---------- -------------------
PRIMARY       READ WRITE MAXIMUM PERFORMANCE

Sample output on standby

DATABASE_ROLE OPEN_MODE  PROTECTION_MODE
------------- ---------- -------------------
PHYSICAL STANDBY MOUNTED MAXIMUM PERFORMANCE

Interpretation: The roles are correct and consistent.

Step 8: Record and Report

Document daily results in logs or send alerts. Many DBAs integrate these queries and commands into shell scripts that email results to the DBA team.

Conclusion

This daily checklist enables Cloud Ops DBAs to ensure Data Guard is functioning properly in Oracle Cloud Databases. By combining OCI CLI commands with SQL queries, you gain full visibility into replication, synchronization, and lag. Regular monitoring prevents surprises during switchover or failover, keeping mission-critical applications highly available.

Oracle DBA

Experienced OCM-certified Oracle Database Administrator with over 18 years of expertise in designing, implementing, and managing complex database solutions. My expertise spans performance optimization, security, and high-stakes solution implementation. Adept at managing complex environments with precision.

No Comments

    Leave a Message

    Your email address will not be published. All fields are mandatory. **