Deep Dive – Performance Issues in Oracle Cloud Databases

When a performance issue occurs in Oracle Cloud Infrastructure (OCI), the Cloud Ops DBA’s job is not just to react but to diagnose systematically and resolve the root cause. Performance problems may stem from compute, storage, network, database configuration, or inefficient SQL.

This blog provides a step-by-step deep dive troubleshooting workflow with OCI CLI commands, SQL queries, and interpretation of outputs.

Step 1: Verify Database System and Instance Status

First, confirm that the database system is available.

oci db system list --compartment-id ocid1.compartment.oc1..aaaaexample

Sample output

"display-name": "Prod-DB-System",
"cpu-core-count": 8,
"node-count": 2,
"lifecycle-state": "AVAILABLE"

Then check the instance state:

oci db database list --compartment-id ocid1.compartment.oc1..aaaaexample

Output

"db-name": "PRODDB",
"lifecycle-state": "AVAILABLE",
"db-version": "19.21.0.0.0"

If lifecycle-state is not AVAILABLE, performance degradation may be due to maintenance or failover.

Step 2: Check Resource Utilization (CPU, Memory, Storage IOPS)

CPU Utilization

oci monitoring metric-data summarize-metrics-data \
--compartment-id ocid1.compartment.oc1..aaaaexample \
--namespace oci_database \
--query-text "CpuUtilization[1m].mean()"

Output

{ "timestamp": "2025-09-05T14:00:00+00:00", "value": 92.3 }

Interpretation: CPU > 90% indicates bottleneck.

Storage IOPS

oci monitoring metric-data summarize-metrics-data \
--compartment-id ocid1.compartment.oc1..aaaaexample \
--namespace oci_blockstorage \
--query-text "ReadThroughput[1m].mean() + WriteThroughput[1m].mean()"

Output

{ "timestamp": "2025-09-05T14:05:00+00:00", "value": 580.2 }

Interpretation: If IOPS or throughput hits service limits, performance slows down.

Memory Pressure

On Exadata Cloud Service or bare metal systems, connect and check with:

vmstat 5 5

Output

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  1      0 102400  20480 204800    0    0   250   100 1500 2000 85 10  5  0  0

High us + wa means CPU + I/O waits.

Step 3: Investigate Database Sessions

Identify Active Sessions

SELECT sid, serial#, username, status, sql_id, event, blocking_session, last_call_et/60 mins_running
FROM   v$session
WHERE  status = 'ACTIVE'
AND    username IS NOT NULL
ORDER BY last_call_et DESC;

Output

SID SERIAL# USERNAME STATUS  SQL_ID   EVENT                     BLOCKING_SESSION MINS_RUNNING
--- ------- -------- ------- -------  ------------------------- ---------------- ------------
123  4567    HR      ACTIVE  8h2x9k1m db file sequential read   null             42
234  8910    APPUSER ACTIVE  9k3y7z2p direct path read temp     345              28

Analyze Blocking Sessions

SELECT blocking_session, sid, serial#, sql_id, wait_class, event
FROM   v$session
WHERE  blocking_session IS NOT NULL;

Output

BLOCKING_SESSION SID SERIAL# SQL_ID   WAIT_CLASS EVENT
---------------- --- ------- -------- ---------- -------------------------
345              234 8910    9k3y7z2p Concurrency enq: TX - row lock contention

Interpretation: A blocking session is causing contention.

Step 4: Review Top SQLs by Resource Usage

From v$sql

SELECT sql_id,
       executions,
       elapsed_time/1000000 elapsed_sec,
       cpu_time/1000000 cpu_sec,
       disk_reads,
       buffer_gets
FROM   v$sql
WHERE  elapsed_time > 10000000
ORDER BY elapsed_time DESC FETCH FIRST 5 ROWS ONLY;

Output

SQL_ID    EXECUTIONS ELAPSED_SEC CPU_SEC DISK_READS BUFFER_GETS
--------- ---------- ----------- ------- ---------- -----------
8h2x9k1m          15        1200     950     450000    15000000
9k3y7z2p          10         800     600     300000    12000000

These SQL IDs are prime suspects for tuning.

Execution Plan

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('8h2x9k1m'));

Output

-------------------------------------------------------------------------------------
| Id  | Operation                  | Name        | Rows  | Bytes | Cost  | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |             |       |       | 45000 | 00:09:00 |
|   1 |  HASH JOIN                 |             |   10M |   2GB | 45000 | 00:09:00 |
|   2 |   TABLE ACCESS FULL        | EMPLOYEES   |   10M |  800M | 20000 | 00:04:00 |
|   3 |   TABLE ACCESS FULL        | DEPARTMENTS |    1M |   20M | 25000 | 00:05:00 |
-------------------------------------------------------------------------------------

Interpretation: Full table scans and large hash joins indicate need for indexing or rewriting SQL.

Step 5: Historical Analysis

If the issue happened earlier, use ASH and AWR.

ASH

SELECT sql_id,
       COUNT(*) active_samples,
       ROUND(AVG(wait_time + time_waited)/1000,2) avg_ms_wait
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - (1/24)
GROUP BY sql_id
ORDER BY active_samples DESC;

Output

SQL_ID    ACTIVE_SAMPLES AVG_MS_WAIT
--------- -------------- -----------
8h2x9k1m             120       65.4
9k3y7z2p              90       70.1

AWR

SELECT sql_id,
       plan_hash_value,
       executions_delta,
       elapsed_time_delta/1000000 elapsed_sec
FROM   dba_hist_sqlstat
WHERE  elapsed_time_delta/1000000 > 300
ORDER BY elapsed_sec DESC;

Output

SQL_ID    PLAN_HASH_VALUE EXECUTIONS_DELTA ELAPSED_SEC
--------- --------------- ---------------- -----------
8h2x9k1m        23456789               10        2450
9k3y7z2p        98765432                5         960

Step 6: Review Alert Logs for Errors

SELECT originating_timestamp, message_text
FROM   v$diag_alert_ext
WHERE  message_text LIKE '%ORA-%'
ORDER BY originating_timestamp DESC;

Output

05-SEP-25 14:32:01 ORA-01555: snapshot too old
05-SEP-25 14:28:14 ORA-01652: unable to extend temp segment

These errors explain poor performance related to undo and temp usage.

Step 7: Remediation Options

  • CPU bottleneck → scale cores in OCI, optimize queries.
  • I/O bottleneck → move to higher storage tier, optimize SQL with indexes.
  • Long-running queries → tune execution plans, use partitioning, consider parallelism.
  • Blocking sessions → terminate blockers or redesign transaction handling.
  • Undo/Temp errors → add undo/temporary space, review batch jobs.

Conclusion

Troubleshooting performance issues in Oracle Cloud Databases requires combining OCI-level monitoring (CLI metrics for CPU, memory, IOPS) with database internals (sessions, SQL, waits, AWR/ASH). By following a structured approach, a Cloud Ops DBA can isolate the root cause and take precise action rather than trial-and-error fixes.

Oracle DBA

Experienced OCM-certified Oracle Database Administrator with over 18 years of expertise in designing, implementing, and managing complex database solutions. My expertise spans performance optimization, security, and high-stakes solution implementation. Adept at managing complex environments with precision.

No Comments

    Leave a Message

    Your email address will not be published. All fields are mandatory. **