Troubleshooting Performance Issues in Oracle Cloud Databases
Performance issues in Oracle Cloud Infrastructure (OCI) databases can arise due to CPU saturation, I/O bottlenecks, long-running queries, or inefficient application design. As a Cloud Ops DBA, your role is to identify the root cause quickly and take corrective actions.
This guide walks through a structured troubleshooting approach using both OCI CLI and Oracle dynamic views.
Step 1: Check Database System Health
Start by confirming that the DB system is in an AVAILABLE state.
oci db system list --compartment-id ocid1.compartment.oc1..aaaaexample
Sample output
{
  "data": [
    {
      "display-name": "Prod-DB-System",
      "id": "ocid1.dbsystem.oc1.ap-mumbai-1.abcdexample",
      "cpu-core-count": 4,
      "node-count": 2,
      "lifecycle-state": "AVAILABLE",
      "version": "19.21.0.0.0"
    }
  ]
}
If the state is AVAILABLE, move ahead. If not, check whether a maintenance run or a failover is in progress.
Step 2: Review Resource Utilization via OCI Metrics
High CPU or memory utilization may indicate a bottleneck.
oci monitoring metric-data summarize-metrics-data \
--compartment-id ocid1.compartment.oc1..aaaaexample \
--namespace oci_database \
--query-text "CpuUtilization[1m].mean()"
Sample output
{
  "data": [
    {
      "aggregated-datapoints": [
        { "timestamp": "2025-09-05T14:00:00+00:00", "value": 85.2 },
        { "timestamp": "2025-09-05T14:01:00+00:00", "value": 87.5 }
      ]
    }
  ]
}
CPU values consistently above 80 percent warrant a closer look at active sessions and the queries they are running.
Step 3: Identify Long-running Queries
Connect to the database using SQL*Plus and run:
SET LINESIZE 200
SET PAGESIZE 100
-- Sessions active for more than 10 minutes (600 seconds)
SELECT s.sid,
       s.serial#,
       s.username,
       s.sql_id,
       ROUND(s.last_call_et/60, 2) AS minutes_running,
       s.event,
       q.sql_text
FROM   v$session s
JOIN   v$sql q
       ON  s.sql_id = q.sql_id
       AND s.sql_child_number = q.child_number  -- avoid duplicate rows per child cursor
WHERE  s.username IS NOT NULL
AND    s.status = 'ACTIVE'
AND    s.last_call_et > 600
ORDER  BY s.last_call_et DESC;
Sample output
SID SERIAL# USERNAME SQL_ID MINUTES_RUNNING EVENT SQL_TEXT
--- ------- -------- --------- --------------- ------------------------- ---------------------------------------
123 4567 HR 8h2x9k1m 45.2 db file sequential read SELECT * FROM employees e, departments d ...
234 8910 APPUSER 9k3y7z2p 28.1 direct path read temp SELECT COUNT(*) FROM orders WHERE status='PENDING'
Queries running longer than expected should be tuned or terminated if they block others.
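If a runaway session has to be stopped, it can be terminated with ALTER SYSTEM using the SID and SERIAL# returned by the query above. The values below come from the sample output; verify them in your own environment before killing anything.

-- Terminate the HR session identified above (SID 123, SERIAL# 4567)
ALTER SYSTEM KILL SESSION '123,4567' IMMEDIATE;
-- On a RAC system, include the instance number: 'sid,serial#,@inst_id'
ALTER SYSTEM KILL SESSION '123,4567,@1' IMMEDIATE;

IMMEDIATE rolls back the session's transaction and releases its locks without waiting for the session to notice it has been killed.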
Step 4: Analyze Wait Events
Wait events show what sessions are waiting on.
SELECT event, COUNT(*) AS sessions_waiting
FROM   v$session
WHERE  state = 'WAITING'
AND    wait_class <> 'Idle'   -- exclude idle waits such as 'SQL*Net message from client'
GROUP  BY event
ORDER  BY sessions_waiting DESC;
Sample output
EVENT SESSIONS_WAITING
------------------------------ ----------------
db file sequential read 7
log file sync 3
direct path read temp 2
If the majority of waits are I/O-related, storage performance should be investigated.
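To confirm whether I/O dominates instance-wide rather than in a point-in-time snapshot, the cumulative counters in v$system_event can be grouped by wait class. A quick sketch (TIME_WAITED_MICRO is in microseconds):

SELECT wait_class,
       SUM(total_waits) AS total_waits,
       ROUND(SUM(time_waited_micro)/1E6, 1) AS seconds_waited
FROM   v$system_event
WHERE  wait_class <> 'Idle'
GROUP  BY wait_class
ORDER  BY seconds_waited DESC;

A large share of time in the 'User I/O' or 'System I/O' classes supports the storage hypothesis; note these counters accumulate since instance startup, so compare two snapshots for recent behavior.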
Step 5: Check Historical Data from ASH
Active Session History (ASH) samples active sessions roughly once per second. Note that querying v$active_session_history requires the Oracle Diagnostics Pack license.
SELECT sql_id,
       COUNT(*) AS active_samples,
       ROUND(AVG(wait_time + time_waited)/1000, 2) AS avg_ms_wait  -- microseconds to ms
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - (1/24)   -- last hour
AND    sql_id IS NOT NULL
GROUP  BY sql_id
ORDER  BY active_samples DESC;
Sample output
SQL_ID ACTIVE_SAMPLES AVG_MS_WAIT
--------- -------------- -----------
8h2x9k1m 120 65.4
9k3y7z2p 90 70.1
Because each sample represents roughly one second of active time, the sample count approximates seconds of DB time, revealing which SQL statements consumed the most database time in the last hour.
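Once a heavy SQL_ID is isolated, its execution plan can be pulled from the shared pool with DBMS_XPLAN. The SQL_ID below is the one from the sample output; substitute your own.

-- Show the last executed plan for the top SQL_ID from the ASH query
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('8h2x9k1m', NULL, 'ALLSTATS LAST'));
-- Runtime row counts (A-Rows) appear only if STATISTICS_LEVEL=ALL
-- or the /*+ gather_plan_statistics */ hint was used at execution time.

Comparing estimated and actual row counts in the plan is often the fastest way to spot a stale-statistics or bad-join-order problem.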
Step 6: Review Alert Logs
OCI allows fetching alert logs from the console, but you can also query from inside the DB.
SELECT originating_timestamp, message_text
FROM v$diag_alert_ext
WHERE message_text LIKE '%ORA-%'
ORDER BY originating_timestamp DESC;
Sample output
ORIGINATING_TIMESTAMP MESSAGE_TEXT
----------------------- ---------------------------------------------------
05-SEP-25 14:32:01 ORA-01555: snapshot too old
05-SEP-25 14:28:14 ORA-01652: unable to extend temp segment
Errors such as ORA-01555 or ORA-01652 can explain poor performance.
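For ORA-01555 specifically, a first check is whether the undo configuration matches the workload. This is a quick look, not a full diagnosis:

-- Current undo settings
SELECT name, value
FROM   v$parameter
WHERE  name IN ('undo_retention', 'undo_tablespace');

-- Longest query seen vs. the retention Oracle actually achieved (seconds)
SELECT MAX(maxquerylen)        AS longest_query_sec,
       MAX(tuned_undoretention) AS tuned_retention_sec
FROM   v$undostat;

If the longest query runs well beyond the tuned retention, either the undo tablespace needs to grow or the long-running query needs tuning.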
Step 7: Take Corrective Actions
Based on findings:
- If CPU is saturated → scale up cores in OCI or tune queries.
- If storage waits are high → check IOPS limits or move to higher storage tier.
- If SQL is inefficient → engage developers to optimize execution plans.
- If temporary space is exhausted → add tempfiles.
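For the last case, a tempfile can be added with a single statement. The example below assumes Oracle Managed Files (typical on OCI), which picks the file name automatically; adjust the size and tablespace name to your environment.

-- Add an autoextending 10 GB tempfile to the TEMP tablespace
ALTER TABLESPACE temp ADD TEMPFILE SIZE 10G
  AUTOEXTEND ON NEXT 1G MAXSIZE 32G;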
Conclusion
Troubleshooting performance issues in OCI requires a mix of OCI CLI checks and database internal diagnostics. By starting with system status, then drilling down into metrics, sessions, waits, historical ASH data, and alert logs, a Cloud Ops DBA can quickly isolate the cause of performance degradation and apply the right solution.