Troubleshooting Performance Issues in Oracle Cloud Databases

Performance issues in Oracle Cloud Infrastructure (OCI) databases can arise due to CPU saturation, I/O bottlenecks, long-running queries, or inefficient application design. As a Cloud Ops DBA, your role is to identify the root cause quickly and take corrective actions.

This guide walks through a structured troubleshooting approach using both OCI CLI and Oracle dynamic views.

Step 1: Check Database System Health

Start with confirming that the DB system is in an AVAILABLE state.

oci db system list --compartment-id ocid1.compartment.oc1..aaaaexample

Sample output

{
  "data": [
    {
      "display-name": "Prod-DB-System",
      "id": "ocid1.dbsystem.oc1.ap-mumbai-1.abcdexample",
      "cpu-core-count": 4,
      "node-count": 2,
      "lifecycle-state": "AVAILABLE",
      "version": "19.21.0.0.0"
    }
  ]
}

If the state is AVAILABLE, move ahead. If not, check if maintenance or failover is in progress.

Step 2: Review Resource Utilization via OCI Metrics

High CPU or memory utilization may indicate a bottleneck.

oci monitoring metric-data summarize-metrics-data \
--compartment-id ocid1.compartment.oc1..aaaaexample \
--namespace oci_database \
--query-text "CpuUtilization[1m].mean()"

Sample output

{
  "data": [
    {
      "aggregated-datapoints": [
        { "timestamp": "2025-09-05T14:00:00+00:00", "value": 85.2 },
        { "timestamp": "2025-09-05T14:01:00+00:00", "value": 87.5 }
      ]
    }
  ]
}

CPU values consistently above 80 percent indicate the need to check sessions and queries.

Step 3: Identify Long-running Queries

Connect to the database using SQL*Plus and run:

SET LINESIZE 200
SET PAGESIZE 100

SELECT s.sid,
       s.serial#,
       s.username,
       s.sql_id,
       q.sql_text,
       ROUND(s.last_call_et/60,2) AS minutes_running,
       s.event
FROM   v$session s
JOIN   v$sql q
ON     s.sql_id = q.sql_id
WHERE  s.username IS NOT NULL
AND    s.status = 'ACTIVE'
AND    s.last_call_et > 600
ORDER BY s.last_call_et DESC;

Sample output

SID  SERIAL# USERNAME SQL_ID    MINUTES_RUNNING EVENT                     SQL_TEXT
---  ------- -------- --------- --------------- ------------------------- ---------------------------------------
123  4567    HR       8h2x9k1m             45.2 db file sequential read   SELECT * FROM employees e, departments d ...
234  8910    APPUSER  9k3y7z2p             28.1 direct path read temp     SELECT COUNT(*) FROM orders WHERE status='PENDING'

Queries running longer than expected should be tuned or terminated if they block others.

Step 4: Analyze Wait Events

Wait events show what sessions are waiting on.

SELECT event, COUNT(*) AS sessions_waiting
FROM   v$session
WHERE  state='WAITING'
GROUP BY event
ORDER BY sessions_waiting DESC;

Sample output

EVENT                          SESSIONS_WAITING
------------------------------ ----------------
db file sequential read                        7
log file sync                                  3
direct path read temp                          2

If the majority of waits are I/O related, storage performance should be investigated.

Step 5: Check Historical Data from ASH

SELECT sql_id,
       COUNT(*) AS active_samples,
       ROUND(AVG(wait_time + time_waited)/1000,2) AS avg_ms_wait
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - (1/24)
GROUP BY sql_id
ORDER BY active_samples DESC;

Sample output

SQL_ID    ACTIVE_SAMPLES AVG_MS_WAIT
--------- -------------- -----------
8h2x9k1m             120       65.4
9k3y7z2p              90       70.1

This reveals which SQLs consumed the most time in the last hour.

Step 6: Review Alert Logs

OCI allows fetching alert logs from the console, but you can also query from inside the DB.

SELECT originating_timestamp, message_text
FROM   v$diag_alert_ext
WHERE  message_text LIKE '%ORA-%'
ORDER BY originating_timestamp DESC;

Sample output

ORIGINATING_TIMESTAMP   MESSAGE_TEXT
----------------------- ---------------------------------------------------
05-SEP-25 14:32:01      ORA-01555: snapshot too old
05-SEP-25 14:28:14      ORA-01652: unable to extend temp segment

Errors such as ORA-01555 or ORA-01652 can explain poor performance.

Step 7: Take Corrective Actions

Based on findings:

  • If CPU is saturated → scale up cores in OCI or tune queries.
  • If storage waits are high → check IOPS limits or move to higher storage tier.
  • If SQL is inefficient → engage developers to optimize execution plans.
  • If temporary space is exhausted → add tempfiles.

Conclusion

Troubleshooting performance issues in OCI requires a mix of OCI CLI checks and database internal diagnostics. By starting with system status, then drilling down into metrics, sessions, waits, historical ASH data, and alert logs, a Cloud Ops DBA can quickly isolate the cause of performance degradation and apply the right solution.

Oracle DBA

Experienced OCM-certified Oracle Database Administrator with over 18 years of expertise in designing, implementing, and managing complex database solutions. My expertise spans performance optimization, security, and high-stakes solution implementation. Adept at managing complex environments with precision.

No Comments

    Leave a Message

    Your email address will not be published. All fields are mandatory. **