portal.octodemo.cloud

OCI Observability, organized around the work your teams need to do.

Use this launchpad to choose the right OCI service, understand how the platform fits together, and move from overview to hands-on guidance across Monitoring, Log Analytics, APM, Database Management, enterprise applications, and integrations.

Read Oracle Docs Try LiveLabs

Start Here

A guided landing page for operators, developers, DBAs, security teams, and decision makers using OCI Observability.

portal.octodemo.cloud launchpad

Find the shortest route from “something looks wrong” to the OCI capability that answers it.

The platform already has the depth. This homepage changes the starting point: begin with a role, follow a troubleshooting workflow, or use the command palette when you know exactly where you want to go.

Platform teams App owners Security teams DBAs Executives

Quick jump

Role-based paths Keyboard-first navigation Same shortcut on every module

Start with you

Tailor the launchpad to your role and industry

Pick who you are and where you work — we'll highlight the levels, services, and the lens to start from. Access to every view is governed by OCI IAM: each persona maps to an OCI Group with policies scoped to the right compartments.

Follow the path

From govern to multitenancy — in order

Each level builds on the last. Jump straight to the launchpad module that delivers it, or open the full ladder in the guide.

Persona view

How each role identifies and uses the data

Same eight personas as the guide — what each one looks for, and the OCI services they live in.

Recommended starting points

OCI Observability Implementation Hub

Your step-by-step path to complete cloud-native observability, from business value to code

Strategic Core Objective

Unified Log Management & Governance

Establish a centralized, auditable record of all cloud and on-premises actions. Ingest, parse, and retain log streams to satisfy global compliance standards (ISO 42001, SOC 2, GDPR) while laying the foundation for threat hunting.

100% Audit & Compliance Readiness

Zero Siloed Application Log Files

50% Storage Cost Reduction via Lifecycle Rules

Business Impact

Consolidation & Risk Mitigation

Consolidating disparate logs into OCI Log Analytics removes blind spots, reduces compliance liabilities, and eliminates the cost of managing scattered log storage servers.

Cross-Environment Visibility: Unified view across VMs, OKE, on-prem, and external clouds.
Automated Ingestion: Zero-maintenance managed ingestion pipeline.
Compliance Guardrails: Encrypted, read-only indexes with lifecycle controls.

Onboarding Management Agents & Logging (Phase 1)

Set Up IAM Groups & Policies

Create dedicated OCI Compartments and write IAM policies granting the Management Agent permission to upload telemetry.
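As a rough illustration of the policies this step refers to, the sketch below renders two statements for a dynamic group of Management Agents. The group name, compartment name, and the helper `agent_policy_statements` are hypothetical, and the statement wording is an assumption to check against the current IAM policy reference before use.

```python
def agent_policy_statements(dynamic_group: str, compartment: str) -> list[str]:
    """Render IAM policy statements for a Management Agent dynamic group.

    Assumption: agents publish metrics to Monitoring and upload logs
    to Log Analytics log groups in the same compartment.
    """
    return [
        f"Allow dynamic-group {dynamic_group} to use metrics "
        f"in compartment {compartment}",
        # Doubled braces produce the literal {PERMISSION} form used in policies.
        f"Allow dynamic-group {dynamic_group} to "
        f"{{LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS}} in compartment {compartment}",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in agent_policy_statements("mgmt-agents-dg", "observability"):
        print(statement)
```

Generating the statements from one function keeps the dynamic group and compartment names consistent across environments.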

Deploy Unified Host Agents

Deploy the OCI Management Agent across your VM fleets and bare-metal nodes to collect system logs and OS metrics.

Enable Log Groups & Parsers

Establish Retention Rules

Define automated Object Storage archival rules to transition older logs from hot query pools to cost-effective cold tiers.
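A minimal sketch of what such an archival rule might look like as a lifecycle policy payload. The field names follow the Object Storage lifecycle policy JSON as best understood here and should be verified against the API reference; the rule name, prefix, and 30-day cutoff are illustrative assumptions.

```python
import json


def archive_rule(name: str, prefix: str, after_days: int) -> dict:
    """Build one lifecycle rule that archives objects under `prefix`."""
    if after_days < 1:
        raise ValueError("after_days must be at least 1")
    return {
        "name": name,
        "action": "ARCHIVE",          # move matching objects to the archive tier
        "timeAmount": after_days,
        "timeUnit": "DAYS",
        "isEnabled": True,
        "objectNameFilter": {"inclusionPrefixes": [prefix]},
    }


# Hypothetical rule: archive application logs older than 30 days.
policy = {"items": [archive_rule("archive-app-logs", "logs/app/", 30)]}
print(json.dumps(policy, indent=2))
```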

Fluentd Open-Source Log Shipping Integration

Use industry-standard open source collectors like Fluentd to securely forward application log files to OCI Log Analytics endpoints.

Fluentd Configuration (td-agent.conf)

<source>
  @type tail
  path /var/log/app/app.log
  pos_file /var/log/td-agent/app.log.pos
  tag oci.loganalytics.portal
  <parse>
    @type json
    time_key timestamp
    time_format %Y-%m-%dT%H:%M:%S.%L%z
  </parse>
</source>

<match oci.loganalytics.**>
  @type oci_log_analytics
  namespace <TENANCY_NAMESPACE>
  log_group_id ocid1.loggroup.oc1.eu-frankfurt-1.aaaaaaaaxf...
  log_source_name OCI_APM_Trace
  co_id ocid1.compartment.oc1..aaaaaaaasx...
  region eu-frankfurt-1
  auth_type instance_principal
</match>

Developer Checklist

Verify Agent State: Ensure `mgmt_agent` daemon is active and configured with a valid registration key.
Log Ingestion: Verify that logs appear under the configured log source in OCI Log Analytics using the `Search` panel.
Metadata Tagging: Enrich Fluentd records with `host.name` and `app.environment` fields for multi-dimensional filtering.
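The checklist above assumes the application writes JSON lines that the Fluentd `tail` source can parse. A minimal sketch of an emitter that matches the `time_key timestamp` and `time_format %Y-%m-%dT%H:%M:%S.%L%z` settings and adds the `host.name` and `app.environment` fields; the function name and sample values are hypothetical.

```python
import json
from datetime import datetime, timezone


def format_record(message, host_name, environment, now=None):
    """Return one JSON log line with a millisecond timestamp and offset."""
    now = now or datetime.now(timezone.utc)
    # Python has no %L directive, so milliseconds are formatted by hand.
    timestamp = (
        now.strftime("%Y-%m-%dT%H:%M:%S")
        + f".{now.microsecond // 1000:03d}"
        + now.strftime("%z")
    )
    return json.dumps(
        {
            "timestamp": timestamp,
            "message": message,
            "host.name": host_name,
            "app.environment": environment,
        }
    )


if __name__ == "__main__":
    print(format_record("portal started", "web-1", "prod"))
```

Setting the fields at the source is one option; they could equally be added by a Fluentd filter before the match block.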

Strategic Core Objective

Operational Telemetry & SLO Alarms

Establish metric baselines and automated alert gates. Move from a reactive fire-fighting posture to proactive, anomaly-driven operations, protecting system availability before outages reach end users.

90% Reduction in MTTD / MTTR

Zero Silent Application Outages

100% Service Level Objective (SLO) Coverage

Business Impact

Proactive Uptime & Alert Lifecycle

Centralized alerting connects OCI metrics to on-call schedules, minimizing disruption and driving strict accountability via automated SLA/SLO reporting.

Static & Anomaly Alarms: Trigger alerts on simple thresholds or ML-derived baseline deviations.
Unified Connector Pipelines: Stream alarms to Slack, Teams, or ITSM tools automatically.
Capacity Guards: Predict CPU and storage constraints before they cause outages.

Alert Automation & SLO Design (Phase 2)

Define SLIs and SLOs

Align platform SLIs (latency, error rate) with business outcomes. Establish acceptable SLO budgets (e.g. 99.9% uptime).
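To make the budget concrete, the arithmetic can be sketched as follows: a 99.9% availability SLO over a 30-day window leaves about 43.2 minutes of allowed downtime. The function names and the 30-day default window are assumptions for illustration.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO."""
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be between 0 and 100 (exclusive)")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_percent, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    print(round(error_budget_minutes(99.9), 1))      # about 43.2 minutes
    print(round(budget_remaining(99.9, 21.6), 2))    # about half the budget left
```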

Deploy Metric Streams

Publish custom application and system metrics to the secure OCI Monitoring service namespace.
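A hedged sketch of the request body for publishing one custom datapoint. The field names follow the Monitoring PostMetricData shape as best understood here and should be checked against the API reference; the namespace, metric name, dimension, and compartment OCID placeholder are hypothetical.

```python
from datetime import datetime, timezone


def metric_payload(namespace, compartment_id, name, value, dimensions, at=None):
    """Build a payload carrying a single datapoint for a custom metric."""
    # Assumption: custom namespaces may not use the reserved prefixes.
    if namespace.startswith(("oci_", "oracle_")):
        raise ValueError("custom namespaces must not start with oci_ or oracle_")
    at = (at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "metricData": [
            {
                "namespace": namespace,
                "compartmentId": compartment_id,
                "name": name,
                "dimensions": dimensions,
                "datapoints": [
                    {"timestamp": at.strftime("%Y-%m-%dT%H:%M:%SZ"),
                     "value": float(value)}
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(metric_payload("custom_portal", "ocid1.compartment.oc1..example",
                         "HttpRequests", 3, {"status_code": "500"}))
```

The payload would then be sent with the OCI SDK or CLI of your choice; that call is deliberately left out of the sketch.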

Create Alarms & Notification Topics

Create OCI Monitoring alarms and bind them to OCI Notification topics representing on-call responder groups.

Wire Service Connectors

Configure Service Connector Hub to route critical alarms directly to PagerDuty or Slack webhooks.

OCI Monitoring MQL & Prometheus Scraping

Configure OCI Monitoring alarms using Metric Query Language (MQL), or pull metrics directly into an open-source Prometheus/Grafana stack using OCI Management Agent endpoint exporters.

OCI MQL Expression

# Alert when the HTTP error rate exceeds 5% in a 1-minute interval.
# The "three consecutive intervals" condition is configured on the alarm (trigger delay), not in the query.
# HttpRequests and status_code are custom metric and dimension names; confirm the != selector against the MQL reference.
(HttpRequests[1m]{resourceId = "ocid1.instance.oc1...", status_code != "200"}.count() / HttpRequests[1m]{resourceId = "ocid1.instance.oc1..."}.count()) * 100 > 5
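The intent of the expression (error rate above 5% for three consecutive 1-minute intervals) can be checked offline with a small sketch. This is plain Python over per-minute counts, not how the Monitoring service evaluates alarms; the function name and defaults are assumptions.

```python
def alarm_should_fire(errors, totals, threshold_pct=5.0, consecutive=3):
    """Return True once `consecutive` intervals in a row exceed the threshold.

    `errors` and `totals` are per-interval counts in time order.
    """
    streak = 0
    for err, total in zip(errors, totals):
        # An interval without traffic is treated as healthy.
        rate = (err / total * 100) if total else 0.0
        streak = streak + 1 if rate > threshold_pct else 0
        if streak >= consecutive:
            return True
    return False


if __name__ == "__main__":
    print(alarm_should_fire([6, 7, 8], [100, 100, 100]))       # sustained breach
    print(alarm_should_fire([6, 1, 7, 8], [100, 100, 100, 100]))  # streak broken
```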

Prometheus YML Scrape block

scrape_configs:
  - job_name: 'oci-management-agent-metrics'
    scrape_interval: 15s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.0.2.15:9100'] # IP of the node running OCI Management Agent exporter
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'oci_computeagent_(.*)'
        target_label: __name__
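The relabel rule above relies on Prometheus defaults: the regex is fully anchored, the action is `replace`, and the replacement is the first capture group. A small sketch of the resulting rename, written in Python for illustration only; the sample metric names are hypothetical.

```python
import re

# Same pattern as the scrape config; fullmatch mimics Prometheus anchoring.
PATTERN = re.compile(r"oci_computeagent_(.*)")


def relabel_metric_name(name: str) -> str:
    """Strip the oci_computeagent_ prefix, leaving other names untouched."""
    match = PATTERN.fullmatch(name)
    return match.group(1) if match else name


if __name__ == "__main__":
    for sample in ("oci_computeagent_cpu_utilization", "node_load1"):
        print(sample, "->", relabel_metric_name(sample))
```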

Developer Checklist

Dimension Matching: Alarms must specify the correct compartment and resource OCID dimension filters.
Testing Alarms: Trigger metric anomalies (e.g. stress test CPU) to verify the PagerDuty responder loop.
OSS Exporter: Secure your local Prometheus endpoint with basic auth or network security groups (NSGs).

Strategic Core Objective

Distributed Tracing & Database Performance Hub

Pinpoint distributed latency bottlenecks and transaction-level query blocks. By tracing requests across microservices and cloud databases, you keep slow page responses from driving away active users.

< 2s Average End-User Page Load (RUM)

40% Database CPU Optimization

100% Transaction Trace Visibility

Business Impact

Browser-to-DB Visibility & EBS Modernization

APM Browser RUM tracks actual user transaction latency, while DB Management provides DBA dashboards that replace archaic database maintenance CLI workflows.

W3C Context Tracing: Follow span context headers from browser to application and database.
EBS & Fusion Lane: Instrument with OCI APM; Stack Monitoring's capabilities are merging into OCI Monitoring.
DB Performance Hub: Real-time AWR charts, ASH diagnostics, and SQL tuning advisors.

🔀

Stack Monitoring is converging into OCI Monitoring

OCI Stack Monitoring's capabilities — full-stack discovery, topology, and application-stack health — are being merged into the OCI Monitoring service. Existing investment carries forward; there is no need to replace it with separate services.

APM & Database Management Migration Roadmap (Phase 3)

Create OCI APM Domains

Provision OCI APM domains and retrieve your Public Datakeys (for browser RUM) and Private Datakeys (for server agents).

Instrument Browser RUM

Inject the APM Browser RUM JS agent into main templates to record end-user page speed and AJAX performance.

Configure OTel Backends

Deploy OpenTelemetry backend instrumentation across your microservices to stream tracing spans to the APM collector.

Register Databases to DB Mgmt

OpenTelemetry distributed tracing context propagation

Propagate tracing context across microservices using W3C Trace Context HTTP headers, and capture execution spans for SQL queries.

OpenTelemetry Span Tracing

from opentelemetry import trace

tracer = trace.get_tracer("demo-portal")

# Parameterized statement: the driver binds search_term, so user input never becomes SQL text
TREASURY_QUERY = "SELECT * FROM treasury WHERE name = %s"

def get_treasury_records(db_conn, search_term):
    # Start a span to trace this transaction
    with tracer.start_as_current_span("treasury.search") as span:
        span.set_attribute("db.system", "mssql")
        # Record the parameterized statement, not the user-supplied value
        span.set_attribute("db.statement", TREASURY_QUERY)

        # Flag a possible SQL injection attempt (Level 3 security attribute tracking)
        if "'" in search_term:
            span.set_attribute("security.attack.type", "sqli")
            span.set_attribute("security.attack.severity", "critical")

        cursor = db_conn.cursor()
        cursor.execute(TREASURY_QUERY, (search_term,))
        return cursor.fetchall()

W3C Trace Context Propagation

import httpx
from opentelemetry.propagate import inject

def call_inventory_service():
    # Inject the current span context into W3C Trace Context headers
    headers = {}
    inject(headers)

    # With an active span, headers now contains for example:
    # 'traceparent': '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'

    response = httpx.get("http://inventory.service/items", headers=headers, timeout=5.0)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()
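When debugging propagation it can help to decode the `traceparent` value by hand. The sketch below parses the version 00 format (version, trace ID, parent ID, flags) using only the standard library; it rejects anything longer than four fields, which is an intentional simplification, and the type and function names are hypothetical.

```python
import re
from typing import NamedTuple

_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> TraceParent:
    """Parse a W3C traceparent header, raising ValueError if malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the Trace Context spec.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        raise ValueError(f"invalid traceparent: {header!r}")
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


if __name__ == "__main__":
    print(parse_traceparent(
        "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
```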

Developer Checklist

W3C Header Support: Verify HTTP client and proxy elements do not strip `traceparent` or `tracestate` headers.
Database Diagnostics: Log query execution latencies and cross-reference them with Database Performance Hub session IDs.
Secure Data Keys: Ensure APM private datakeys are kept secure and loaded via environment variables, not hardcoded in source.
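For the last checklist item, a minimal sketch of loading the private datakey from the environment and failing fast when it is missing. The variable name `OCI_APM_PRIVATE_DATAKEY` and both helper names are hypothetical; use whatever your deployment tooling defines.

```python
import os


def load_private_datakey(env=os.environ, name="OCI_APM_PRIVATE_DATAKEY"):
    """Read the APM private datakey from the environment or fail fast."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; refusing to start without it")
    return key


def redact(secret: str, visible: int = 4) -> str:
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    # Illustrative value only; real keys come from the deployment environment.
    demo_env = {"OCI_APM_PRIVATE_DATAKEY": "abcdefgh"}
    print(redact(load_private_datakey(demo_env)))
```

Passing the environment mapping as a parameter keeps the function easy to test without touching the real process environment.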

Strategic Core Objective

Operations Insights & GenAI Troubleshooting

Apply machine learning and LLMs to infrastructure diagnostic datasets. Predict database and compute capacity boundaries, cluster raw logs to identify anomalies, and let AI agents summarize complex incidents using natural language.

99% Accuracy in Capacity Forecasting

95% Log Clustering Anomaly Detection

Minutes-Long Incident Diagnostics Cycle

Business Impact

AIOps & Secure AI Agent Operations

Centralized AI diagnostic services let non-expert engineers query cluster metrics, analyze log patterns, and predict scaling bottlenecks safely through secure Model Context Protocol (MCP) bridges.

Operations Insights Capacity: 30/90-day Exadata database storage and CPU growth projections.
LoganAI Clustering: Condenses millions of raw log entries into a few dozen anomaly clusters automatically.
AIOps MCP Engine: Securely binds LLMs to read-only OCI APIs to answer operator questions.
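To illustrate what a 30-day growth projection means, the sketch below fits a straight line to daily usage samples and extrapolates it. This is ordinary least squares in plain Python, not the forecasting model Operations Insights uses; the function name and sample data are assumptions.

```python
def linear_forecast(daily_usage, days_ahead):
    """Extrapolate a least-squares trend line `days_ahead` past the last sample."""
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2                      # mean of day indexes 0..n-1
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + days_ahead)


if __name__ == "__main__":
    # Hypothetical storage usage in GB growing by 2 GB per day for 14 days.
    usage = [100 + 2 * day for day in range(14)]
    print(round(linear_forecast(usage, 30), 1))
```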

Enabling Capacity Forecasting & AIOps (Phase 4)

Enable OCI Operations Insights

Set Up Log Analytics Cluster Rules

Enable LoganAI machine learning models to cluster recurring log lines and highlight anomalous spikes.

Configure MCP Server

Deploy the Model Context Protocol (MCP) server container, mounting secure OCI API read-only credentials.

Connect LLM Chat Copilots

Integrate your corporate chat client or development environment with the MCP server to start querying OCI.

AIOps Model Context Protocol & Custom OCL

Configure an MCP server schema to connect LLMs to OCI APIs, or use OCI Log Analytics Query Language (OCL) with machine learning modifiers like `cluster` to isolate log anomalies.

MCP Server Tool Configuration Schema

{
  "mcpServers": {
    "oci-observability": {
      "command": "python",
      "args": ["-m", "mcp_server.main"],
      "env": {
        "OCI_COMPARTMENT_OCID": "ocid1.compartment.oc1..aaaaaa...",
        "OCI_AUTH_MODE": "instance_principal",
        "OCI_LOG_ANALYTICS_NAMESPACE": "tenancy_namespace"
      }
    }
  }
}

OCI OCL Query

# Cluster application logs over the last 24 hours to isolate warning and error anomalies
'Log Source' = 'OCI APM Trace'
| where 'Entity Type' = 'Host'
| cluster
| where 'Cluster Type' = 'Anomaly'
| stats count() as Occurrences by 'Cluster Content'
| sort - Occurrences
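The idea behind clustering, collapsing many similar lines into a few signatures, can be shown with a toy sketch that masks numeric tokens and counts the resulting patterns. It is not the LoganAI algorithm; the masking rule and sample lines are illustrative assumptions.

```python
import re
from collections import Counter

# Mask hex literals, integers, and dotted numbers such as IP addresses.
_VARIABLE = re.compile(r"\b(?:0x[0-9a-f]+|\d+(?:\.\d+)*)\b", re.IGNORECASE)


def signature(line: str) -> str:
    """Replace variable numeric tokens with a placeholder."""
    return _VARIABLE.sub("<*>", line)


def cluster(lines):
    """Group lines by signature, most frequent first."""
    return Counter(signature(line) for line in lines).most_common()


if __name__ == "__main__":
    sample = [
        "timeout after 30 ms on node 7",
        "timeout after 45 ms on node 12",
        "disk full on /dev/sda1",
    ]
    for pattern, count in cluster(sample):
        print(count, pattern)
```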

Developer Checklist

Cluster Settings: Ensure Log Analytics entity mappings are complete so log clusters link correctly to host names.
MCP Permissions: Apply strict IAM policies limiting the MCP API credentials to read-only metadata calls.
Trend Accuracy: Operations Insights requires a minimum of 14 days of metrics data before displaying capacity forecasting charts.
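Alongside IAM policies, an MCP server can refuse write operations before any API call is made. The sketch below is one possible guard, assuming tool operations are named with verb prefixes such as `list_` or `get_`; the prefixes and function names are hypothetical, and IAM remains the actual enforcement point.

```python
READ_ONLY_PREFIXES = ("get_", "list_", "summarize_", "search_")


def is_read_only(operation: str) -> bool:
    """True when the operation name starts with an allowed read verb."""
    return operation.lower().startswith(READ_ONLY_PREFIXES)


def guard(operation: str) -> str:
    """Return the operation unchanged, or raise if it could modify state."""
    if not is_read_only(operation):
        raise PermissionError(f"{operation!r} is not on the read-only allowlist")
    return operation


if __name__ == "__main__":
    print(guard("list_alarms"))
    print(is_read_only("delete_alarm"))
```

An allowlist is used rather than a denylist so that newly added operations are blocked until someone reviews them.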

Common journeys

Use the product map that matches how people actually troubleshoot.

These flows turn the service catalog into a user journey: observe, investigate, explain impact, and automate the next step.

Recommended first stop

Learn the platform through workflows, not product names.

New users usually do better when they start with one goal: establish health, investigate an incident, tune an application, or connect observability to the rest of the estate.

OCI adoption guidance Hands-on LiveLabs

Cloud Guard

OCI O&M Platform

Monitoring: Metrics & Alarms

Stack Monitoring: Full Stack Discovery (merging → OCI Monitoring)

APM: Tracing & Synthetics

Log Analytics: AI/ML Analysis

Ops Insights: Capacity Planning

Database Mgmt: Fleet Management

OCI Services Integration

Service Connector Hub

Events

Notifications

OCI Functions

Java Management

Streaming

Data Sources

Management Agents

OCI Logging

VCN Flow Logs

Service Logs

Audit Logs

Object Storage

2.4M Metrics/min

847 GB Logs Analyzed

156ms Avg Response

99.7% Security Score

OCI Observability Services

Comprehensive tools for monitoring, analysis, and management

Free Tier

Monitoring

Collect and visualize metrics from OCI resources with customizable dashboards and intelligent alerting.

✓ 500M ingested datapoints/month

✓ Metrics Explorer & Dashboards

✓ Alarms & Notifications

★ Custom Metrics (pay per use)

Metrics Explorer Custom Dashboards Alarms Notifications

Free Tier Paid

Log Analytics

AI-powered log analysis with ML clustering, anomaly detection, and 300+ pre-built parsers for any log source.

✓ 10 GB Active Storage/month

✓ 300+ Pre-built Parsers

★ LoganAI (NLP Queries)

★ Extended Storage & Retention

ML Clustering Anomaly Detection 300+ Parsers Link Analysis

Free Tier Paid

Application Performance Monitoring

End-to-end visibility from browser to database with distributed tracing and real user monitoring.

✓ 1000 Trace Events/hour

✓ Basic Distributed Tracing

★ Synthetic Monitoring

★ Real User Monitoring (RUM)

Distributed Tracing Real User Monitoring Synthetic Monitoring

Paid Only → OCI Monitoring

Stack Monitoring Merging

Auto-discover and monitor your full application stack. Its capabilities are being merged into the OCI Monitoring service.

✗ No Free Tier Available

★ Auto Discovery & Topology

★ EBS & Fusion Support

Auto Discovery Topology View EBS Support

Paid Enterprise

Operations Insights

Analyze resource utilization, forecast capacity needs, and identify optimization opportunities across your fleet.

✗ No Free Tier Available

★ Capacity Planning & Forecasting

★ SQL Warehouse Analytics

★ Exadata & Host Insights

Capacity Planning SQL Warehouse Exadata Insights Host Insights

Free Tier Full DBMS

Database Management

Comprehensive fleet management with Performance Hub, SQL tuning advisors, and proactive problem detection.

✓ Basic Fleet Overview

✓ Basic Metrics & Monitoring

★ Performance Hub & AWR

★ SQL Tuning Advisor

Fleet Summary Performance Hub SQL Tuning Job Scheduling

Service Tier Comparison

Free tier vs. paid capabilities at a glance

Service | Free Tier | Paid / Enterprise
Monitoring | ✓ 500M datapoints/month, Alarms, Dashboards | ★ Custom Metrics (usage-based pricing)
Log Analytics | ✓ 10 GB/month, 300+ Parsers, ML Clustering | ★ LoganAI, Extended Storage, Custom Retention
APM | ✓ 1000 Traces/hr, Basic Distributed Tracing | ★ Synthetic & Real User Monitoring
Database Mgmt | ✓ Basic Fleet View, Core Metrics | ★ Performance Hub, AWR, SQL Tuning
Ops Insights | ✗ No free tier available | ★ Capacity Planning, SQL Warehouse, Exadata
Stack Monitoring → Monitoring | ✗ No free tier available | ★ Auto Discovery & Topology (merging into OCI Monitoring)

Getting Started

Begin your observability journey in minutes

Enable Services

Navigate to the OCI Console and enable the Observability services you need. Most services are available in all OCI regions and can be enabled in minutes with just a few clicks.

Deploy Management Agents

Install the unified Management Agent on your compute instances to collect metrics, logs, and traces. The agent supports Linux, Windows, and Solaris operating systems.

Configure Data Collection

Set up log sources, enable service connectors, and configure APM instrumentation for your applications. Use pre-built integrations or create custom configurations.

Build Dashboards & Alerts

Create custom dashboards to visualize your data and set up alarms to get notified when something needs attention. Use saved searches and scheduled reports for ongoing monitoring.

Helpful Resources

OCI Observability Documentation LiveLabs Workshops Observability Blog Video Tutorials

OCI 2025 Observability Professional Certification

Validate your expertise with the official Oracle certification (1Z0-1111-25)

OCI Observability Professional 1Z0-1111-25

Target Audience

Site Reliability Engineers, Cloud Operations, Application Developers, Cloud Administrators, and roles involved in monitoring and troubleshooting OCI environments.

View Exam Details Read Blog Post

The 7 Pillars of OCI Observability

Exam objectives mapped to core competency areas

Define the Pillars of Observability

Summarize OCI Observability and Management Services portfolio and understand how they work together.

Platform Overview Service Integration

18%

Monitor Cloud Environments

Demonstrate OCI Monitoring service with metrics and alarms. Configure alarm definitions using best practices.

Metrics Explorer Alarm Configuration Custom Metrics

10%

Respond to Cloud Changes

Analyze OCI Events Service concepts. Determine event structure, types, rules, and integration with OCI services.

Events Service Event Rules Notifications

16%

Centrally Manage Log Data

Distinguish log categories and enable log collection. Work with OCI Logging for centralized log management.

OCI Logging Service Connector Hub Log Groups

22% Highest Weight

Log Analytics & Advanced Analytics

Create visualizations and identify log data patterns with OCI Logging Analytics. Use ML clustering and anomaly detection.

Log Analytics ML Clustering Link Analysis Dashboards

20%

Application Performance Monitoring

Monitor applications with deep visibility into end-user experience using distributed tracing, RUM, and synthetics.

Distributed Tracing Real User Monitoring Synthetic Tests

Monitor Distributed Components

Monitor distributed components of an application stack with full-stack observability and topology views.

Stack Monitoring Topology Discovery Dependencies

End-to-End Workflow Use Cases

Real-world scenarios showcasing how OCI Observability services work together

🔥

Incident Investigation & Root Cause Analysis

From alert to resolution in minutes

When a critical alarm fires, use OCI's integrated observability stack to quickly correlate metrics, logs, and traces to identify the root cause and reduce mean time to resolution (MTTR).

Alarm Fires → Check Metrics → Query Logs → Trace Request → Root Cause

1 Receive alert via OCI Monitoring alarm → Notifications → PagerDuty/Slack

2 View metric charts in Monitoring to identify the affected resource and timing

3 Pivot to Log Analytics to search for error patterns during the incident window

4 Use APM distributed tracing to follow the request path and identify the slow/failing service

5 Correlate with Database Management Performance Hub if the database is suspected

Monitoring Log Analytics APM Database Management Notifications

Monitoring Docs Log Analytics Guide Try LiveLabs

🔐

Security Monitoring & Threat Detection

Proactive security with Cloud Guard + Log Analytics

Combine Cloud Guard's automated threat detection with Log Analytics' ML-powered anomaly detection to identify and investigate security incidents across your OCI tenancy.

Cloud Guard → Problem Detected → Audit Logs → VCN Flow Logs → Remediate

1 Cloud Guard continuously monitors your tenancy and detects security problems

2 Review problem details and risk score in the Cloud Guard console

3 Analyze OCI Audit logs in Log Analytics to understand user activity timeline

4 Correlate with VCN Flow Logs to identify suspicious network traffic

5 Use Cloud Guard Responders to automatically remediate the issue

Cloud Guard Log Analytics OCI Audit VCN Flow Logs Notifications

Cloud Guard Docs Audit Logging Security LiveLab

⚡

Application Performance Optimization

End-to-end visibility from browser to database

Use APM's distributed tracing combined with Real User Monitoring (RUM) to identify performance bottlenecks across your entire application stack, from frontend to backend to database.

Browser (RUM) → API Gateway → Microservices → Database

1 Deploy APM agents to your applications (Java, Node.js, .NET, Python, Go)

2 Enable Real User Monitoring to capture actual user experience metrics

3 Analyze trace waterfalls to identify slow service calls and database queries

4 Use Synthetic Monitoring to proactively test critical user journeys

5 Correlate with Database Management for SQL-level performance insights

APM Real User Monitoring Synthetic Monitoring Database Management

APM Documentation Browser Agent Setup APM LiveLab Reference demo: octo-observability-demo

📊

Capacity Planning & Resource Optimization

Predict needs and optimize costs with Operations Insights

Use Operations Insights to analyze historical resource utilization patterns, forecast future capacity needs, and identify optimization opportunities across your database and compute fleet.

Enable OPSI → Collect Data → Analyze Trends → Forecast → Right-size

1 Enable Operations Insights for your databases and compute hosts

2 Review CPU, memory, and storage utilization dashboards

3 Use ML-powered forecasting to predict when you'll hit capacity limits

4 Analyze SQL Warehouse for inefficient queries consuming resources

5 Identify underutilized resources for right-sizing opportunities

Operations Insights SQL Warehouse Database Management Host Insights

OPSI Documentation Capacity Planning OPSI LiveLab

🐳

Kubernetes & Microservices Observability

Full-stack visibility for containerized workloads

Monitor your OKE (Oracle Container Engine for Kubernetes) clusters with comprehensive observability covering cluster health, pod metrics, application traces, and container logs.

OKE Cluster → Prometheus Metrics → Container Logs → APM Traces

1 Enable OCI Monitoring for OKE cluster metrics (CPU, memory, network)

2 Configure Fluentd or OCI Logging to collect container logs

3 Send logs to Log Analytics for ML-powered clustering and analysis

4 Deploy APM agents for distributed tracing across microservices

5 Create unified dashboards combining metrics, logs, and traces

Monitoring Log Analytics APM OCI Logging Service Connector

OKE Monitoring K8s Log Analytics OKE Observability Lab

🗄️

Database Performance Troubleshooting

Deep dive into Oracle Database performance

Use Database Management's Performance Hub to analyze ASH, AWR, and real-time SQL monitoring data to identify and resolve database performance issues.

Slow Query Alert → Performance Hub → ASH Analysis → SQL Tuning

1 Enable Database Management for your Oracle Databases (on-prem or cloud)

2 Use Fleet Summary to identify databases with performance issues

3 Open Performance Hub to analyze ASH data and wait events

4 Identify problematic SQL statements using SQL Monitoring

5 Get SQL Tuning Advisor recommendations and implement fixes

Database Management Performance Hub SQL Tuning Advisor AWR

DB Management Docs Performance Hub DB Management Labs Hands-on lab: oci-dbman-opsi

🏢

E-Business Suite Full Stack Monitoring

Complete visibility for Oracle EBS on OCI

Monitor your Oracle E-Business Suite deployment end-to-end using Stack Monitoring's auto-discovery capabilities combined with Log Analytics for application and database logs.

Apache/OHS → Forms/OAF → Concurrent Mgr → EBS Database

1 Install Management Agent on EBS application and database servers

2 Enable Stack Monitoring to auto-discover the EBS stack topology

3 Configure Log Analytics to collect CM logs, Apache logs, and DB alert logs

4 Enable Database Management for EBS database performance monitoring

5 Set up alarms for critical metrics (CM queue depth, form server health)

Stack Monitoring Log Analytics Database Management Management Agent

Stack Monitoring EBS Discovery Guide EBS Observability Lab

💰

Cost Optimization with Observability Data

Use telemetry to reduce cloud spend

Leverage observability data from Operations Insights and Monitoring to identify underutilized resources, right-size infrastructure, and optimize your OCI spend.

Collect Metrics → Analyze Usage → Identify Waste → Right-size

1 Enable Operations Insights across your database and compute fleet

2 Review resource utilization dashboards to identify underutilized assets

3 Use Monitoring metrics to identify idle or over-provisioned instances

4 Analyze SQL performance to identify queries that can be optimized

5 Implement right-sizing recommendations and track savings

Operations Insights Monitoring Cost Analysis Budgets

Cost Management Resource Analysis Cost Estimator

📸 Product Capabilities Showcase

Real screenshots from OCI Observability services

Kubernetes Cluster Summary

Comprehensive view of cluster health, resource utilization, and workload status in Log Analytics

Log Analytics

Kubernetes Nodes Dashboard

Node-level metrics including CPU, memory, disk usage, and pod distribution across the cluster

Log Analytics

Workloads Dashboard

Deployment health, replica status, and container-level insights for all workloads

Log Analytics

Cluster Topology View

Visual representation of cluster topology with namespaces, workloads, and services

Log Analytics

🔒 Security & Compliance Use Cases

Security monitoring, threat detection, and compliance solutions

🛡️

Security Operations Center (SOC)

Centralized security monitoring with Log Analytics

Build a cloud-native Security Operations Center using OCI Log Analytics Security Fundamentals Dashboards. Aggregate and analyze security events across your entire OCI infrastructure with pre-built threat detection rules and compliance reporting.

Security Logs → Log Analytics → SOC Dashboards → Alerts

1 Enable OCI Audit Logs and VCN Flow Logs across all compartments

2 Import Security Fundamentals Dashboards from OCI GitHub quickstart

3 Configure OS-level security log collection (auditd, secure, syslog)

4 Set up scheduled queries for MITRE ATT&CK technique detection

5 Configure real-time alerts via Notifications and PagerDuty/Slack integration

Log Analytics Cloud Guard Audit Logs VCN Flow Logs

Security Dashboards Blog GitHub - Security Dashboards Cloud Guard Docs

🔥

WAF Security Monitoring

Web Application Firewall analytics and threat intelligence

Monitor OCI Web Application Firewall traffic and security events using Log Analytics. Visualize attack patterns, blocked requests, and protection rule effectiveness with pre-built WAF dashboards.

Web Traffic → OCI WAF → Log Analytics → Threat Intel

1 Enable WAF logging to OCI Logging service

2 Configure Service Connector to stream WAF logs to Log Analytics

3 Import pre-built WAF dashboards for traffic and threat analysis

4 Set up geolocation-based attack pattern detection

5 Configure alerts for SQL injection, XSS, and bot attacks

WAF Log Analytics Service Connector Hub Notifications

WAF Monitoring Blog WAF Documentation

🌐

API Gateway Monitoring

API traffic analysis and performance insights

Gain deep insights into OCI API Gateway traffic patterns, latency metrics, error rates, and usage analytics. Create custom dashboards to monitor API consumers, throttling events, and backend response times.

API Requests → API Gateway → Log Analytics → API Insights

1 Enable access and execution logging on API Gateway deployments

2 Configure Service Connector Hub to stream logs to Log Analytics

3 Create dashboards for latency percentiles, error rates, and throughput

4 Monitor API consumer patterns and rate limiting events

5 Set up alerts for backend failures and SLA breaches
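The latency percentiles and error rates mentioned in step 3 can be prototyped offline before building the dashboard. The sketch below uses the nearest-rank percentile method over parsed log values; it is plain Python rather than a Log Analytics query, and the function names and sample data are assumptions.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def error_rate(status_codes):
    """Percentage of responses with a 5xx status code."""
    if not status_codes:
        return 0.0
    failures = sum(1 for code in status_codes if 500 <= code <= 599)
    return failures / len(status_codes) * 100


if __name__ == "__main__":
    latencies_ms = list(range(1, 101))          # hypothetical samples
    print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))
    print(error_rate([200, 200, 502, 200]))
```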

API Gateway Log Analytics Monitoring Service Connector Hub

API Gateway Analytics Blog API Gateway Docs

🗄️ Oracle Database Use Cases

Comprehensive observability and security solutions for Oracle Databases

🛡️

Data Safe Enabled

Free tier available, paid for non-OCI databases

Oracle Data Safe provides essential security capabilities for Oracle Databases including security assessments, user assessments, data masking, activity auditing, and sensitive data discovery. Available free for OCI databases or as a paid service for on-premises and non-OCI databases.

Oracle DB → Data Safe → Security Insights → Compliance

1 Register your Oracle Database with Data Safe (OCI, on-prem, or other clouds)

2 Run Security Assessment to identify vulnerabilities and misconfigurations

3 Perform User Assessment to analyze user privileges and risky configurations

4 Enable Activity Auditing to track database operations and user activities

5 Use Sensitive Data Discovery to locate PII and sensitive columns

Data Safe Security Assessment Activity Auditing Data Masking

Data Safe Docs Register Database Data Safe LiveLab

📊

Data Safe → Log Analytics

Unified security analysis and compliance reporting

Stream Data Safe audit events to Log Analytics for advanced correlation, custom dashboards, and long-term retention. Combine database security events with infrastructure logs for comprehensive security posture analysis.

Oracle DB → Data Safe → Service Connector → Log Analytics

1 Enable Data Safe Activity Auditing for your Oracle Databases

2 Configure Service Connector Hub to stream audit data to Log Analytics

3 Create custom Log Analytics dashboards for security event visualization

4 Set up scheduled queries for compliance reporting (SOX, GDPR, HIPAA)

5 Configure alerts for anomalous database activities and policy violations

Data Safe Log Analytics Service Connector Hub Notifications

Activity Auditing Service Connector Log Analytics Ingestion

📈

Ops Insights + DB Management + Log Analytics

Complete Oracle Database observability stack

Achieve comprehensive Oracle Database observability by combining Operations Insights for capacity planning and SQL analytics, Database Management for performance monitoring, and Log Analytics for alert and audit log analysis.

Ops Insights

DB Mgmt

Log Analytics

360° View

1 Enable Database Management for real-time performance monitoring and Performance Hub

2 Enable Operations Insights for capacity planning and SQL Warehouse analytics

3 Configure Log Analytics to collect alert logs, listener logs, and audit trails

4 Use Fleet Summary for at-a-glance health across all databases

5 Create unified dashboards correlating performance metrics with log events

Operations Insights Database Management Log Analytics SQL Warehouse

Ops Insights Docs DB Management Docs Ops Insights LiveLab

🤖

Oracle DB MCP Server + GenAI

AI-powered troubleshooting and daily DBA tasks

Pair the Oracle Database MCP (Model Context Protocol) server with Generative AI to troubleshoot databases, analyze performance, and automate routine DBA tasks through natural-language conversation.

DBA Query

→

GenAI + MCP

→

Oracle DB

→

Insights

1 Deploy Oracle Database MCP server connected to your database fleet

2 Configure GenAI agent (Claude, OCI GenAI) with MCP server access

3 Use natural language to query performance metrics and diagnose issues

4 Automate routine tasks: space analysis, user management, backup status

5 Generate AWR reports and performance recommendations via conversation

MCP Server OCI GenAI Database Management Claude Code

OCI GenAI Docs MCP Protocol Oracle DB Tools

🔍

Log Analytics Enterprise Platform

OS Security, Prometheus, Data Safe, Network, Fusion HCM

Use OCI Log Analytics as a unified platform for OS security logs, Prometheus metrics ingestion, Data Safe audit events, network flow logs, and Oracle Fusion HCM application logs. LoganAI adds intelligent analysis, and the Logan MCP server integrates third-party LLMs.

Multi-Source

→

Log Analytics

→

LoganAI

→

3rd Party LLMs

1 Configure OS security log collection (auditd, syslog, secure logs)

2 Set up Prometheus metric ingestion via remote write or Service Connector

3 Stream Data Safe audit events and VCN Flow Logs to Log Analytics

4 Collect Oracle Fusion HCM logs via Management Agent or API integration

5 Deploy Logan MCP server for Claude/ChatGPT natural language log queries

Log Analytics LoganAI Logan MCP Server Prometheus Data Safe

Log Analytics Docs Prometheus Ingestion Security Log Analytics Lab Reference demo: octo-observability-demo

⚡

APM for Oracle Fusion

Visual Builder apps and Fusion Cloud performance monitoring

Monitor Oracle Fusion Cloud applications including Visual Builder apps with OCI Application Performance Monitoring. Gain end-to-end visibility into user experience, transaction traces, and backend performance for your Fusion deployments.

Visual Builder

→

APM Browser

→

APM Tracer

→

Fusion Backend

1 Create APM Domain and obtain data upload endpoint and keys

2 Inject APM Browser Agent into Visual Builder applications

3 Configure Real User Monitoring (RUM) for end-user experience tracking

4 Enable server-side tracing for Fusion backend services if accessible

5 Create Apdex-based dashboards and alerts for SLA monitoring
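The Apdex score behind step 5 can be sketched as follows. This is a minimal illustration of the standard Apdex formula, not OCI APM's implementation. The function name `apdex`, the 500 ms threshold, the conventional 4×T tolerating bound, and the sample durations are all assumptions made for the example.

```javascript
// Apdex = (satisfied + tolerating / 2) / total samples.
// satisfied:  duration <= T
// tolerating: T < duration <= 4T (conventional bound, assumed here)
// frustrated: everything slower
function apdex(durationsMs, thresholdMs) {
  if (durationsMs.length === 0) return null; // no samples, no score
  let satisfied = 0;
  let tolerating = 0;
  for (const d of durationsMs) {
    if (d <= thresholdMs) satisfied += 1;
    else if (d <= 4 * thresholdMs) tolerating += 1;
  }
  return (satisfied + tolerating / 2) / durationsMs.length;
}

// Illustrative page-load samples in milliseconds (made up for this sketch).
const samples = [120, 300, 450, 900, 1800, 2600];
const score = apdex(samples, 500);
console.log(score.toFixed(2));
```

An alert rule would then compare the score against the SLA target, for example firing when it drops below a value the team has agreed on.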

APM Real User Monitoring Visual Builder Fusion Cloud

Fusion Observability Guide Fusion Audit Logs (Part 1) Fusion Audit Logs (Part 2) APM Docs

🔎

Log Analytics for Oracle Fusion

Fusion ERP, HCM, SCM & CX application, audit and security logs

Ingest application, audit, and security logs from Oracle Fusion ERP, HCM, SCM, and CX into OCI Log Analytics using the BI Cloud Connector (BICC) export together with the Fusion Applications Observability solution. Parse, enrich, and correlate them with OCI-side signals, then build saved searches and dashboards for functional and security teams.

Fusion Apps

→

BICC / Audit export

→

Log Analytics

→

Dashboards

1 Enable Fusion Applications audit and security log ingestion, or schedule a BICC extract to Object Storage

2 Use the predefined Fusion Applications log sources and parsers in OCI Log Analytics

3 Map entities to business services (payroll, order-to-cash, procurement) and tag by environment

4 Correlate Fusion logs with APM traces and Integration Cloud activity on a common timeline

5 Build saved searches, scheduled detections, and role-based dashboards for functional and security teams

Log Analytics Fusion HCM Audit Management Agent

Fusion Apps Observability Fusion Log Ingestion Guide Fusion HCM to LA (DevRel)

🔗

Log Analytics for OIC

Oracle Integration Cloud monitoring and troubleshooting

Monitor Oracle Integration Cloud (OIC) flows, connections, and errors using Log Analytics. Collect integration activity logs, track message flows, and create dashboards for integration health and performance analysis.

OIC Flows

→

Activity Stream

→

Log Analytics

→

Insights

1 Enable OIC Activity Stream API or configure log export to Object Storage

2 Create custom Log Analytics parser for OIC activity log format

3 Set up Service Connector to ingest OIC logs into Log Analytics

4 Build dashboards for integration success/failure rates and latency trends

5 Configure alerts for integration failures and connection timeouts

Log Analytics OIC Service Connector Hub Notifications

OIC Log Analytics Dashboards OIC Logging & Monitoring OIC Docs Custom Parsers

OCI Monitoring Query Builder

Build and execute metrics queries with dynamic namespace and region selection

View on GitHub

Query 1

Advanced mode

COMPARTMENTS (select one or more)

root

REGIONS (type or select, optional)

METRIC NAMESPACE

RESOURCE GROUP Optional

METRIC NAME

INTERVAL

STATISTIC

Metric dimensions

Aggregate metric streams
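The fields above map onto a Monitoring Query Language (MQL) expression of the general shape `MetricName[interval]{dimension = "value"}.statistic()`. The sketch below shows one way to assemble it. `buildMql` and its option names are hypothetical and not part of any OCI SDK, and the use of `.grouping()` for "Aggregate metric streams" is this sketch's reading of MQL, so verify it against the Monitoring documentation. Compartment, region, namespace, and resource group are request parameters rather than part of the expression, so they are left out.

```javascript
// Hypothetical helper: assembles an MQL expression from Query Builder fields.
function buildMql({ metric, interval = '1m', statistic = 'mean', dimensions = {}, aggregate = false }) {
  if (!metric) {
    throw new Error('metric name is required');
  }
  // Render dimension filters as key = "value" pairs.
  const filters = Object.entries(dimensions)
    .map(([key, value]) => `${key} = "${value}"`)
    .join(', ');
  const filterPart = filters ? `{${filters}}` : '';
  // Assumed mapping: "Aggregate metric streams" adds .grouping().
  const groupPart = aggregate ? '.grouping()' : '';
  return `${metric}[${interval}]${filterPart}${groupPart}.${statistic}()`;
}

const query = buildMql({
  metric: 'CpuUtilization',
  interval: '5m',
  statistic: 'mean',
  dimensions: { resourceDisplayName: 'web-01' }, // example value, made up
});
console.log(query);
```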


Common Metric Namespaces

Quick reference for OCI metric namespaces

Compute

oci_computeagent - Agent-collected metrics
oci_compute_infrastructure_health - Instance health

Database

oci_autonomous_database - ADB metrics
oci_database - Base DB metrics
oci_mysql_database - MySQL HeatWave

Networking

oci_vcn - VNIC traffic metrics
oci_lbaas - Load Balancer
oci_service_gateway - Service Gateway

Storage

oci_objectstorage - Object Storage
oci_blockstore - Block Volume
oci_filestorage - File Storage

EBS Observability

Complete monitoring for Oracle E-Business Suite on OCI

Presentation Tier

Apache / OHS

Forms Server

oacore (OA Framework)

Application Tier

Concurrent Manager

Workflow Engine

PL/SQL APIs

Database Tier

EBS Database

CPU: 34% IO: 12ms Sessions: 247

Integrated Observability Stack

Stack Monitoring

Auto-discover and monitor the complete EBS stack with pre-built dashboards

Auto Discovery Topology View Alerts

Log Analytics

Collect and analyze EBS application logs, concurrent manager logs, and database alert logs

300+ Parsers ML Clustering Dashboards

APM for EBS

End-to-end transaction tracing from browser to database with RUEI integration

Real User Monitoring Traces RUEI

Database Management

Performance Hub, SQL Tuning, and fleet-wide database management

AWR Analysis SQL Tuning Fleet View

Try the LiveLabs Workshop

Hands-on experience with EBS Observability using OCI Stack Monitoring and Log Analytics

Launch Workshop

The Connected Enterprise

Observability for Oracle Fusion Applications and Integration Cloud

Order-to-Cash Flow Visualization

Fusion ERP 1,247 orders

OIC 1,245 processed 2 failed

Order DB 1,243 stored 2 rejected

Log Analytics Insight

Root Cause Identified: Order #ORD-2024-78432 failed at OIC integration due to malformed JSON payload (missing "shipping_address" field).

ERROR [IntegrationFlow] OrderSync: ValidationException - Required field 'shipping_address' is null for order ORD-2024-78432

User Activity Timeline

12 AM 3 AM 6 AM 9 AM 12 PM

HCM Access

Data Egress

VCN Flow

Correlated Evidence

Fusion Audit Logs

User: payroll_admin_jsmith accessed HCM_EMPLOYEES table

VCN Flow Logs

Unusual data egress to external IP: 203.0.113.42

Log Analytics ML

Classified as "Anomalous Access Pattern" (confidence: 94%)

ESS Job Schedule Heatmap

GL Post

AP Aging

AR Aging

BIP Report STUCK

Inv Close

FA Depr

GL Consol

Report 1

Report 2

PO Close

SO Sync

Data Load

Job Details: BIP_FINANCIAL_CLOSE_REPORT

Status: WAITING

Waiting For: External File (FTP)

Expected File: /incoming/bank_reconciliation_jan2024.csv

Root Cause: OIC Integration flow failed - FTP server unreachable


[2024-01-31 02:34:17] BIP_REPORT: Waiting for input file...
[2024-01-31 02:34:17] OIC_FTP_ADAPTER: Connection timeout to ftp.partner-bank.com:21
[2024-01-31 02:34:17] OIC_FTP_ADAPTER: Retry 1/3 failed
[2024-01-31 02:34:47] OIC_FTP_ADAPTER: Retry 2/3 failed
[2024-01-31 02:35:17] OIC_FTP_ADAPTER: Retry 3/3 failed - giving up

Integrations & Deep Dives

One place for Prometheus, Fusion, OIC, MCP, LoganAI, and third-party incident workflows.

Deep Dive Hub

Design the path from OCI telemetry to action.

Use this module when the question is not “which OCI service exists?” but “how do I wire the platform into the rest of the estate?” It brings together ingestion, packaged-app monitoring, AI copilots, and incident-response handoffs.

Service Connector Hub MCP Reference

Primary Paths 4 integration lanes

Telemetry ingestion, enterprise apps, AI copilots, and incident escalation.

Deep Dive Outcome Faster pivots

Move from metric spike to parser, trace, or responder handoff without hunting across modules.

Choose Your Starting Surface

If the team knows the symptom but not the OCI service, start here and then pivot into the deeper lanes below.

Telemetry Ingestion

Prometheus → OCI Log Analytics

Use Management Agent collection or remote-write pipelines to get Prometheus data into the OCI observability plane, then enrich that data with LoganAI, clustering, and incident correlations.

Prometheus Management Agent Log Analytics Monitoring

Best for OKE, app runtimes, and custom service SLOs that need shared dashboards.

Pair with Monitoring alarms for thresholds and Log Analytics for root-cause drilldowns.

Prometheus Ingestion Security & Log Analytics Lab

Enterprise Applications

Fusion, OIC, and packaged-app observability

Bring together Fusion audit/activity data, OIC activity streams, Log Analytics parsers, APM browser/server telemetry, and database insights to debug business flows instead of isolated services.

Fusion OIC APM Log Analytics

Start in Fusion & OIC for storyline-driven troubleshooting, then use this lane for the concrete integration patterns.

Use custom parsers and activity APIs where out-of-box telemetry is incomplete.

Fusion Guide OIC Dashboard Pattern

AI Copilots

MCP, LoganAI, and LLM-assisted investigations

Expose observability domains through MCP so copilots can query live logs, metrics, AWR, and Cloud Guard context instead of relying on pre-baked answers.

User Query LLM Client MCP Server OCI APIs

Use LoganAI for natural-language log exploration and MCP for cross-system orchestration.

Keep the AI module for demo conversations; use this lane for concrete implementation choices.

OCI GenAI MCP Protocol

Incident Response

Notifications, Slack, PagerDuty, and responders

Close the loop by sending OCI Monitoring or Log Analytics signals into the systems teams already use for triage, escalation, and remediation.

Monitoring Notifications Slack / PagerDuty Functions

Use Notifications for fan-out, Functions for enrichment, and Cloud Guard responders for automatic containment.

Keep escalation metadata consistent so on-call can pivot back into OCI without context loss.
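One way to keep escalation metadata consistent is to build every chat message from the same alarm fields. The sketch below maps an alarm notification to a Slack incoming-webhook payload (a JSON object with a `text` field). The alarm field names (`title`, `severity`, `type`, `dedupeKey`, `timestampEpochMillis`) are assumptions based on the OCI alarm message format and should be checked against a real message. `toSlackMessage` and `consoleUrl` are hypothetical names.

```javascript
// Maps an alarm notification to a Slack incoming-webhook payload.
// Field names on `alarm` are assumptions; verify them against your messages.
function toSlackMessage(alarm, consoleUrl) {
  const when = new Date(alarm.timestampEpochMillis).toISOString();
  return {
    text: [
      `[${alarm.severity}] ${alarm.title} (${alarm.type})`,
      `Time: ${when}`,
      `Dedupe key: ${alarm.dedupeKey}`, // lets responders match repeats of the same alarm
      `Open in OCI: ${consoleUrl}`, // deep link back to the alarm, supplied by the caller
    ].join('\n'),
  };
}
```

An OCI Function subscribed to the Notifications topic could call a helper like this and POST the result to the webhook URL, so every alert carries the same fields regardless of which alarm produced it.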

Notifications Cloud Guard

Reference Blueprints

Choose the integration shape that matches your operating model.

Platform Team

Central observability hub

Prometheus, service logs, Flow Logs, and Data Safe feed a shared Log Analytics tenancy. Monitoring drives alerts. MCP copilots read from curated observability APIs.

Log Analytics Monitoring MCP

Enterprise Apps

Fusion + OIC investigation lane

Business-process issues start in Fusion or OIC, pivot into parsers, trace evidence, and Database Management, and finish with a handoff to the application team or partner endpoint owner.

Fusion OIC DB Mgmt

On-call Operations

Alert-to-remediation loop

Monitoring alarms fan out through Notifications, Slack, or PagerDuty. Incident responders use LoganAI and MCP-backed copilots to summarize impact and trigger automations.

Notifications Slack PagerDuty

Recommended MCP Surfaces

Prioritize the server contracts that give users the highest leverage during investigation and remediation.

Log Analytics MCP

Run searches, summarize clusters, and pivot into anomaly investigations.

query_logs cluster_patterns summarize_window

Database Insight MCP

Expose AWR, SQL tuning, and performance-hub summaries to copilots.

get_awr_report analyze_sql list_wait_events

Fusion / OIC MCP

Normalize packaged-app evidence so business incidents can be explained in plain language.

list_failed_flows fetch_activity_logs trace_business_order

Responder MCP

Bridge observability signals into notifications, runbooks, and remediation workflows.

create_incident notify_channel trigger_function
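To make the idea of a server contract concrete, the sketch below shows what the `query_logs` tool listed above might look like. An MCP server describes each tool with a name, a description, and a JSON Schema for its arguments, and returns results as a list of content items. The argument names (`query`, `lastMinutes`), the handler, and the injected `search` function are hypothetical. A real server would call the Log Analytics API asynchronously; the handler is synchronous here to keep the sketch short.

```javascript
// Tool descriptor in the shape an MCP server returns from "tools/list".
const queryLogsTool = {
  name: 'query_logs',
  description: 'Run a Log Analytics search over a recent time window.',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string', description: 'Search expression' },
      lastMinutes: { type: 'number', description: 'Look-back window in minutes' },
    },
    required: ['query'],
  },
};

// Handles the params of a "tools/call" request. `search` is a hypothetical
// backend function (query, lastMinutes) => rows, injected by the caller.
function handleToolCall(params, search) {
  const { name, arguments: args = {} } = params;
  // Unknown tools and bad arguments are protocol-level errors.
  if (name !== queryLogsTool.name) throw new Error(`Unknown tool: ${name}`);
  if (typeof args.query !== 'string' || args.query === '') {
    throw new Error('Missing required argument: query');
  }
  try {
    const rows = search(args.query, args.lastMinutes ?? 60);
    return { content: [{ type: 'text', text: JSON.stringify(rows) }], isError: false };
  } catch (err) {
    // Execution failures are reported inside the tool result.
    return { content: [{ type: 'text', text: `Search failed: ${err.message}` }], isError: true };
  }
}
```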

One Tool to Rule Them All

AI-powered log analysis for any source, any format, any scale


2,847 log records 12 clusters identified 3 anomalies detected

847 CrashLoopBackOff

523 OOMKilled

412 ImagePullBackOff

289 Liveness probe failed

198 Connection refused

156 Pod scheduled

Cluster: CrashLoopBackOff (847 records)

Common Pattern: Back-off restarting failed container * in pod *

14:32:17 kube-system/coredns-7d89d9b847-x2jkl Back-off restarting failed container coredns in pod coredns-7d89d9b847-x2jkl

14:32:45 app-namespace/api-server-5f8c9d7b6-abc12 Back-off restarting failed container api-server in pod api-server-5f8c9d7b6-abc12
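The relationship between the two sample records and the common pattern can be illustrated with a simple token comparison. This is a deliberately naive stand-in for the service's ML clustering, not a description of how Log Analytics works internally. The function name `commonPattern` is hypothetical, and the sketch assumes all messages in a cluster have the same number of whitespace-separated tokens.

```javascript
// Derives a wildcard template: positions where the messages disagree become "*".
function commonPattern(messages) {
  if (messages.length === 0) return '';
  const rows = messages.map((m) => m.trim().split(/\s+/));
  const width = rows[0].length;
  if (rows.some((r) => r.length !== width)) {
    throw new Error('messages must have the same number of tokens');
  }
  return rows[0]
    .map((token, i) => (rows.every((r) => r[i] === token) ? token : '*'))
    .join(' ');
}

const pattern = commonPattern([
  'Back-off restarting failed container coredns in pod coredns-7d89d9b847-x2jkl',
  'Back-off restarting failed container api-server in pod api-server-5f8c9d7b6-abc12',
]);
console.log(pattern); // Back-off restarting failed container * in pod *
```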

AI Insight

These CrashLoopBackOff errors correlate with a memory limit change deployed at 14:15. Consider reverting the change or raising the pod memory limits.

ML-Powered Clustering

Automatically group similar log patterns without manual regex

Anomaly Detection

Spot unusual patterns and deviations from baseline behavior

Natural Language Queries

Ask questions in plain English, get actionable results

300+ Pre-built Parsers

Out-of-the-box support for OCI, Linux, databases, and more
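The "deviation from baseline" idea behind the Anomaly Detection tile can be illustrated with a z-score check. This is a textbook stand-in rather than the algorithm the service uses. The function name `isAnomalous`, the threshold of 3 standard deviations, and the sample counts are assumptions.

```javascript
// Flags a value whose z-score against the baseline exceeds a threshold.
function isAnomalous(baseline, value, zThreshold = 3) {
  const n = baseline.length;
  if (n < 2) throw new Error('baseline needs at least two samples');
  const mean = baseline.reduce((sum, v) => sum + v, 0) / n;
  // Sample variance (n - 1 in the denominator).
  const variance = baseline.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
  const std = Math.sqrt(variance);
  if (std === 0) return value !== mean; // flat baseline: any change is a deviation
  return Math.abs(value - mean) / std > zThreshold;
}

// Made-up hourly error counts forming the baseline.
const baseline = [40, 45, 38, 42, 44, 41, 39, 43];
console.log(isAnomalous(baseline, 42));  // false: within normal variation
console.log(isAnomalous(baseline, 198)); // true: far outside the baseline
```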

LoganAI

AI-Powered Log Intelligence

LoganAI enables you to analyze logs using AI for faster investigation and troubleshooting.

📝

AI-Powered Summarization

Get concise summaries of large log datasets in plain language, dramatically reducing time-to-insight.

❓

Actionable Follow-up Questions

AI suggests relevant follow-up queries based on your current investigation context.

💡

User-Friendly Explanations

Complex log patterns explained in simple terms for faster understanding.

🔗

Correlation & Context

Automatically correlate related events across multiple log sources.

ML Analysis Capabilities

Unsupervised ML automatically categorizes logs into actionable insights

⚠️ Potential Issues 76

Clusters automatically identified as potential problems requiring attention

🔴 Outliers 12

One-time events that indicate anomalies in the system

📈 Trends 23

Cluster patterns correlated in time showing emerging issues

500+ Functional Checks With fix guidance for packaged apps

250+ Pre-built Sources Customizable for seamless operations

200+ Analytics Widgets Built-in dashboards for visualization

Log Analytics Use Cases

🔍

Security Investigation

Correlate audit logs, VCN flow logs, and application logs to investigate security incidents.

Audit Logs → Flow Logs → Correlation

🐛

Application Debugging

Use ML clustering to identify error patterns across distributed microservices.

Error Logs → Clustering → Root Cause

📊

Compliance Reporting

Generate compliance reports from centralized log data with scheduled searches.

Log Ingestion → Saved Searches → Reports

🚀

Performance Analysis

Analyze application and infrastructure logs to identify performance bottlenecks.

App Logs → Baseline → Anomaly Detection

The Digital Experience

End-to-end visibility from browser click to database query

🔄

"The Spinning Wheel of Death"

Users complain checkout is slow, but server CPU is only 20%

💻

"It works on my machine"

Devs can't reproduce errors that users in Brazil are seeing

Real User Monitoring - Global View

Configure RUM

< 200ms

200-500ms

> 500ms

Brazil - Performance Analysis

Avg Load Time 4.2s

Active Sessions 1,247

Error Rate 8.3%

Top Issue JavaScript Error

Issue: JavaScript Error on Checkout Button

Uncaught TypeError: Cannot read property 'submit' of undefined at checkout.min.js:847

This client-side error affects 73% of Brazilian users due to a CDN caching issue with the new checkout script.

Custom Attributes for RUM

Enrich your Real User Monitoring data with custom dimensions and metrics for deeper analysis.

Text Attributes (Dimensions)

Filter and group by user properties, page categories, or custom tags

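// Create the agent namespace if it does not exist yet, then set text attributes.
window.apmrum = window.apmrum || {};
window.apmrum.udfAttributes = [
  { name: 'userId', value: 'user@example.com' },
  { name: 'pageCategory', value: 'checkout' }
];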

Numeric Attributes (Metrics)

Aggregate and alert on custom numeric values like cart totals or item counts

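// Numeric values are reported as metrics that can be aggregated and alerted on.
window.apmrum = window.apmrum || {};
window.apmrum.udfAttributes = [
  { name: 'cartTotal', value: 149.99 },
  { name: 'itemCount', value: 3 }
];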

Dynamic Attributes (Functions)

Execute functions on each span to capture real-time state

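// getCartCount() is an application-defined function, evaluated on each span.
window.apmrum = window.apmrum || {};
window.apmrum.udfAttributes = [
  { name: 'cartItems', value: () => getCartCount() }
];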
Attribute Limits

50 Max Attributes

50 Name Length

500 Value Length
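A pre-flight check against these limits can catch oversized attributes before they are set. The sketch below assumes the name and value lengths are counted in characters, which the limits above do not state. `checkUdfAttributes` is a hypothetical helper and not part of the APM Browser Agent.

```javascript
// Limits as listed above; length units are assumed to be characters.
const LIMITS = { maxAttributes: 50, maxNameLength: 50, maxValueLength: 500 };

// Returns a list of human-readable problems (empty when everything fits).
function checkUdfAttributes(attributes, limits = LIMITS) {
  const problems = [];
  if (attributes.length > limits.maxAttributes) {
    problems.push(`too many attributes: ${attributes.length} > ${limits.maxAttributes}`);
  }
  for (const { name, value } of attributes) {
    if (String(name).length > limits.maxNameLength) {
      problems.push(`name too long: ${name}`);
    }
    // Function values are evaluated later, so only static values are checked.
    if (typeof value !== 'function' && String(value).length > limits.maxValueLength) {
      problems.push(`value too long for attribute: ${name}`);
    }
  }
  return problems;
}

console.log(checkUdfAttributes([
  { name: 'pageCategory', value: 'checkout' },
  { name: 'itemCount', value: 3 },
]));
```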

Distributed Tracing - Transaction Waterfall

Service 0ms 500ms 1000ms 1500ms 2000ms 2500ms 3000ms

Browser

50ms

Load Balancer

10ms

Front-End API

200ms

Payment Gateway SLOW

3000ms

💡

Root Cause: The slowness is caused by the third-party payment provider, not your code. The Payment Gateway service is taking 3 seconds to respond.
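The reasoning behind that conclusion can be reproduced from the waterfall numbers. The sketch below finds the span with the largest duration and its share of the summed time. It assumes the four durations are non-overlapping self-times, which the waterfall does not state, and `slowestSpan` is a hypothetical helper.

```javascript
// Durations taken from the waterfall above.
const spans = [
  { service: 'Browser', durationMs: 50 },
  { service: 'Load Balancer', durationMs: 10 },
  { service: 'Front-End API', durationMs: 200 },
  { service: 'Payment Gateway', durationMs: 3000 },
];

// Returns the slowest span and its share of the summed duration.
function slowestSpan(spans) {
  const total = spans.reduce((sum, s) => sum + s.durationMs, 0);
  const slowest = spans.reduce((a, b) => (b.durationMs > a.durationMs ? b : a));
  return { service: slowest.service, share: slowest.durationMs / total };
}

const result = slowestSpan(spans);
console.log(`${result.service}: ${(result.share * 100).toFixed(0)}% of total time`);
```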

Synthetic Monitoring - Availability Monitoring

Documentation

Proactively detect availability and performance issues before end users experience them using 8 different monitor types.

🌐

Browser

Monitor specific URLs and page load times

📜

Scripted Browser

Record and replay user journeys with multi-step transactions

⚡

REST API

Test REST endpoints with custom headers and payloads

🔌

Network

Monitor server/host availability via ping and port checks

🔍

DNS

DNS Server, DNS Trace, and DNSSEC validation

🗃️

SQL

Measure SQL query performance against Oracle databases

📁

FTP

Measure FTP operation performance and latency

🔗

Scripted REST

Multi-step API transaction workflows

Global Vantage Points - Live Status

🤖

🇺🇸 Virginia ✓ 234ms 2 min ago

🇩🇪 Frankfurt ✓ 189ms 2 min ago

🇯🇵 Tokyo ⚠ 523ms 2 min ago

🇧🇷 São Paulo ✗ Failed 2 min ago

A scripted browser monitor runs the login test every 5 minutes from global vantage points. Failures trigger immediate alerts.

Code-Level Diagnostics

Stack Trace for Trace #TR-2024-0131-4521 Captured at 14:32:17 UTC

com.example.checkout.PaymentService.processPayment PaymentService.java:247

com.example.checkout.PaymentGateway.submitTransaction PaymentGateway.java:89

com.example.http.HttpClient.post HttpClient.java:156

Captured SQL: SELECT * FROM payment_tokens WHERE user_id = ? AND status = 'ACTIVE' ORDER BY created_at DESC LIMIT 1

Execution: 2847ms

APM Data Sources & Integrations

View All Sources

Collect traces and metrics from multiple sources including language agents, open standards, and APIs.

OCI APM Agents

☕

Java Agent JVMs, application servers, Spring Boot

⬡

.NET Agent Windows & Linux, ASP.NET Core

🌐

Browser Agent Real User Monitoring for web apps

Open Standards

🔭

OpenTelemetry OTLP protocol support

🟡

Jaeger Jaeger client libraries

🔶

Zipkin Zipkin-compatible tracers

Additional Sources

⚙️

Trace Explorer API Direct trace data uploads

🤖

Synthetic Monitors Automated availability tests

📊

OpenTracing Java Tracer with JVM metrics

APM Use Cases

🔍

Root Cause Analysis

Drill down from slow page loads to the exact database query or API call causing issues.

Browser → API → Service → Database

🌍

Geographic Performance

Identify regional performance disparities with RUM and optimize CDN configuration.

RUM → Region Analysis → CDN Tuning

⏰

Proactive Monitoring

Detect outages before users report them with synthetic monitors running 24/7.

Synthetic → Alert → PagerDuty

📈

Release Validation

Compare performance metrics before and after deployments to catch regressions.

Deploy → Baseline → Compare

🔗

Microservices Debugging

Trace requests across dozens of services with distributed tracing and service maps.

Request → Trace → Dependencies

💰

SLA Monitoring

Track response times and error rates against SLA commitments with custom dashboards.

Metrics → Dashboard → Reports

Capacity Planning & Analytics

Operations Insights

Gain deep visibility into your database and host fleet performance. Operations Insights provides capacity planning, resource utilization analysis, and SQL analytics to help you optimize costs and ensure performance.

Documentation Try LiveLabs

Operations Insights

Capacity planning and resource analytics for databases and hosts

Capacity Planning

Forecast CPU, memory, and storage needs up to 30 days ahead with ML-powered predictions.

SQL Warehouse

Analyze SQL execution patterns across your entire database fleet in a centralized repository.

Exadata Insights

Deep visibility into Exadata systems including storage, compute, and network performance.

Host Insights

Monitor compute resource utilization patterns across your infrastructure fleet.

Resource Utilization Dashboards

Visualize CPU, memory, I/O and storage utilization trends across your database and host fleet.

Trend Analysis Heat Maps Top-N Reports

Demand Forecasting

Predict future resource requirements using machine learning based on historical patterns.

ML Predictions 30-Day Forecast Alerts
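As a rough illustration of trend-based forecasting, the sketch below fits a straight line to historical usage and extrapolates to a capacity limit. Operations Insights uses its own ML models, so this least-squares fit is only a simplified stand-in. The function names, the usage history, and the 2,400 GB capacity are all made up for the example.

```javascript
// Ordinary least-squares fit of y = slope * x + intercept.
function linearFit(points) {
  const n = points.length;
  if (n < 2) throw new Error('need at least two points');
  const meanX = points.reduce((s, p) => s + p.x, 0) / n;
  const meanY = points.reduce((s, p) => s + p.y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (const p of points) {
    sxy += (p.x - meanX) * (p.y - meanY);
    sxx += (p.x - meanX) ** 2;
  }
  if (sxx === 0) throw new Error('x values must not all be equal');
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Days from the last observation until the trend line reaches capacity.
function daysUntilFull(points, capacity) {
  const { slope, intercept } = linearFit(points);
  if (slope <= 0) return Infinity; // usage is flat or shrinking
  const lastDay = points[points.length - 1].x;
  return Math.max(0, (capacity - intercept) / slope - lastDay);
}

// Made-up history: x = day number, y = used storage in GB (10 GB/day growth).
const history = [
  { x: 0, y: 2000 }, { x: 1, y: 2010 }, { x: 2, y: 2020 }, { x: 3, y: 2030 },
];
console.log(daysUntilFull(history, 2400)); // 37
```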

SQL Analytics

Identify problematic SQL statements and optimization opportunities across your database fleet.

SQL Insights Plan Changes Degradation Alerts

AWR Hub

Consolidate AWR data from multiple databases for centralized performance analysis.

Centralized AWR Cross-DB Analysis Long-term Storage

Capacity Planning Dashboard

ML-Powered Forecasting

Database Fleet CPU

67%

↑ 12% vs last week

Memory Utilization

72%

→ Stable

Storage Growth

2.4 TB

↑ 340 GB/month

Days Until Full

↓ Better than expected

SQL Warehouse

Centralized SQL analysis across your entire database fleet - identify problematic queries and optimization opportunities.

847K

SQL Statements

156

Degraded SQLs

Plan Changes

Databases

SQL ID	Database	Avg Elapsed (ms)	Executions
9xkz4wm7c2n3p	PROD_ERP_01	4,523	12,847
7fnh2kp9v4m8q	PROD_CRM_02	2,891	8,234
3bmr6yt1x9k2w	PROD_HR_01	1,756	24,891
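Average elapsed time alone can hide the statements that consume the most database time overall. The sketch below ranks the three rows above by total elapsed time, taken as average elapsed time multiplied by executions. `rankByTotalElapsed` is a hypothetical helper, and the calculation assumes both columns cover the same reporting period.

```javascript
// Rows copied from the SQL Warehouse table above.
const rows = [
  { sqlId: '9xkz4wm7c2n3p', database: 'PROD_ERP_01', avgElapsedMs: 4523, executions: 12847 },
  { sqlId: '7fnh2kp9v4m8q', database: 'PROD_CRM_02', avgElapsedMs: 2891, executions: 8234 },
  { sqlId: '3bmr6yt1x9k2w', database: 'PROD_HR_01', avgElapsedMs: 1756, executions: 24891 },
];

// Total elapsed time = average elapsed time x executions, sorted descending.
function rankByTotalElapsed(rows) {
  return rows
    .map((r) => ({ ...r, totalElapsedMs: r.avgElapsedMs * r.executions }))
    .sort((a, b) => b.totalElapsedMs - a.totalElapsedMs);
}

const ranked = rankByTotalElapsed(rows);
for (const r of ranked) {
  console.log(`${r.sqlId} (${r.database}): ${r.totalElapsedMs} ms total`);
}
```

In this data the PROD_HR_01 statement has the lowest average elapsed time but ranks second by total time because it runs far more often.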

Exadata Insights

Enterprise Feature

💾 Storage Server

Flash Cache Hit Ratio 98.7%

IOPS 45.2K

Throughput 3.2 GB/s

Smart Scan Offload 87%

🖥️ Compute Node

CPU Utilization 78%

Memory Used 512 GB

Active Sessions 234

DB Time 1,247s

🌐 InfiniBand Network

Network Bandwidth 12.4 GB/s

RDMA Latency 0.8 μs

Port Health All OK

Errors 0

Pro Tip: Enable OPSI for Cost Optimization

Operations Insights can help identify underutilized resources and right-sizing opportunities, potentially saving significant infrastructure costs. Enable it on your production databases to get actionable recommendations.

Fleet Management & Tuning

Database Management

Monitor and manage your Oracle Database fleet with a unified cloud service. Get Performance Hub diagnostics, automated tuning recommendations, and proactive alerting for your on-premises and cloud databases.

Documentation Try LiveLabs

Database Management

Unified monitoring and management for your Oracle Database fleet

Performance Hub

Real-time and historical performance analysis with ASH, AWR, and SQL monitoring in one place.

Fleet Summary

Single pane of glass view across all your managed databases with health status and alerts.

SQL Tuning

Get automated SQL tuning recommendations and implement fixes with a single click.

Job Scheduling

Schedule and monitor database jobs across your fleet with centralized management.

Managed Database Groups

Organize databases into logical groups for easier management and bulk operations.

Grouping Bulk Actions Policies

Proactive Alerting

Get notified about performance issues, space problems, and security concerns before they impact users.

Threshold Alerts Anomaly Detection Notifications

Tablespace Management

Monitor tablespace usage and get recommendations for optimization and space reclamation.

Usage Tracking Growth Trends Alerts

Optimizer Statistics

Manage optimizer statistics gathering and staleness across your database fleet.

Auto Gathering Staleness Check Recommendations

Supported Database Types

Database Management supports Oracle Autonomous Databases, Base DB Systems, Exadata Cloud Service, and on-premises Oracle Databases (11g, 12c, 18c, 19c, 21c, 23ai). Connect on-premises databases using Management Gateway or direct registration.

Performance Hub

PROD_ERP_01 (19c)

CPU

User I/O

Application Wait

Other

DB CPU

42.3%

↓ 8% from baseline

User I/O

28.1%

↑ 12% from baseline

Application Wait

18.4%

↑ 5% from baseline

Concurrency

11.2%

→ Stable

SQL Tuning Advisor

Automated recommendations to optimize SQL performance based on execution analysis.

📊

Create Index on ORDERS.CUSTOMER_ID

SQL ID: 7fnh2kp9v4m8q performs full table scan on ORDERS. Creating an index would reduce execution time by ~85%.

↓ 85% Response Time

🔄

SQL Profile Available

SQL ID: 3bmr6yt1x9k2w has a suboptimal execution plan. Accepting the SQL Profile would improve cardinality estimates.

↓ 62% CPU Usage

📈

Gather Statistics on INVENTORY

Table INVENTORY has stale statistics (last gathered 45 days ago). Fresh statistics would improve optimizer decisions.

↓ 40% Plan Changes

Fleet Overview

12 Databases

3 Alerts

PROD_ERP_01

Healthy

CPU 42%

Memory 67%

Storage 78%

PROD_CRM_02

Warning

CPU 78%

Memory 85%

Storage 62%

PROD_HR_01

Healthy

CPU 23%

Memory 45%

Storage 34%

DEV_TEST_01

Critical

CPU 92%

Memory 94%

Storage 89%

AWR Management

Automatic Workload Repository for historical performance analysis and diagnostics.

📸

Snapshot Management

By default, AWR takes a snapshot every 60 minutes and retains it for 8 days. Configure custom intervals and retention periods.

📋

AWR Reports

Generate HTML/text AWR reports for any time range. Compare periods to identify performance changes.

🔍

ASH Analytics

Active Session History analysis for real-time and historical session activity visualization.

The Future Interface

Conversational observability powered by OCI GenAI and MCP

ℹ️

Interactive Demo

This is a demonstration with pre-built responses showcasing real-world troubleshooting scenarios. The examples below represent actual use cases that can be implemented using OCI GenAI and MCP servers for your observability workflows.

Hello! I'm your OCI Observability AI Assistant. I can help you analyze logs, investigate incidents, and provide insights across your entire observability stack. What would you like to explore?

Why is my EBS system running slow today?

I've analyzed your EBS environment across multiple data sources. Here's what I found:

🔴

Database Contention Detected

The GL_BALANCES table has 47 concurrent sessions waiting on row locks. This correlates with the month-end GL posting job that started at 08:00.

🟡

Concurrent Manager Queue Depth

Standard Manager queue has 234 pending requests vs. typical 45. Consider increasing parallelism for FNDCM.

🔵

Forms Session Spike

Active Forms sessions: 847 (normal: ~400). Likely due to end-of-month user activity.

Recommendation: The primary bottleneck is the GL posting job. I suggest pausing non-critical concurrent requests until the posting completes, or consider partitioning the GL_BALANCES table for future relief.

Real-World Troubleshooting Scenarios

These examples demonstrate how AI-powered observability can accelerate incident resolution across different OCI services.

🗄️ Database Management

"Why is my Autonomous Database running slow queries?"

AI Analysis: Detected 3 SQL statements consuming 78% of DB time. Top offender: SELECT * FROM ORDERS WHERE status='pending' - missing index on status column. AWR shows buffer cache hit ratio dropped to 67%.

Recommended Actions: Create index on ORDERS(status), increase SGA_TARGET, consider query rewrite with pagination.

Database Management Performance Hub AWR

🔐 Security Investigation

"Show me failed SSH login attempts from the last 24 hours"

AI Analysis: Detected 2,847 failed SSH attempts across 12 hosts. 94% originated from 3 IP ranges (185.x.x.x, 45.x.x.x, 91.x.x.x). Peak activity: 03:00-05:00 UTC. Target: root account (89%), admin (11%).

Recommended Actions: Block identified IP ranges via NSG, enforce key-based auth, consider fail2ban or Cloud Guard responder.

Log Analytics Cloud Guard Security Zones

⚡ APM & Tracing

"Users report checkout page is timing out"

AI Analysis: Trace analysis shows /api/checkout avg response time: 12.4s (SLA: 2s). Root cause: payment-service → external-gateway call timing out. 3rd party API latency increased from 200ms to 8s starting 14:30 UTC.

Recommended Actions: Implement circuit breaker pattern, add timeout with fallback, contact payment provider about degradation.

APM Distributed Tracing Synthetic Monitoring

☸️ Kubernetes (OKE)

"Why are my pods constantly restarting in production?"

AI Analysis: Pods in prod/api-deployment restarted 47 times in 2 hours. Cause: OOMKilled (memory limit 512Mi, actual usage peaks at 680Mi). Memory leak pattern detected in heap dumps correlating with connection pool growth.

Recommended Actions: Increase memory limit to 1Gi, investigate connection pool leak in API service, enable HPA for automatic scaling.

Log Analytics Monitoring Stack Monitoring

🌐 Network & Connectivity

"Application can't connect to database after VCN changes"

AI Analysis: VCN Flow Logs show REJECT for TCP/1521 from subnet-app (10.0.1.0/24) to subnet-db (10.0.2.0/24). Security List sl-app-subnet missing egress rule for DB port. Change made by user@example.com at 09:15 UTC.

Recommended Actions: Add egress rule to sl-app-subnet allowing TCP/1521 to 10.0.2.0/24, or use NSG for more granular control.

VCN Flow Logs Log Analytics Audit Logs

👥 Fusion Applications

"Payroll processing jobs are failing in Fusion HCM"

AI Analysis: ESS job PYUGEN failed for 3 consecutive runs. Error: "Unable to acquire lock on PAY_PAYROLL_ACTIONS". Concurrent job ARCHIVE_PAYMENTS holding lock for 4+ hours. Data volume increased 340% due to new hire batch.

Recommended Actions: Terminate stuck archive job, increase UCM storage allocation, schedule archive jobs during off-peak, partition payroll tables by period.

Log Analytics Fusion HCM Database Management

🔗 Integration Cloud (OIC)

"Why are my OIC integrations failing with timeout errors?"

AI Analysis: Integration SYNC_ORDERS_TO_ERP failed 156 times today. 89% timeout on REST invoke to on-prem endpoint. Network latency to FastConnect increased from 15ms to 890ms. BGP route flapping detected on customer edge router.

Recommended Actions: Check FastConnect partner status, increase OIC connection timeout to 120s, implement async pattern with callback for large payloads.

OIC Log Analytics FastConnect

🛡️ Data Safe & Compliance

"Show me unusual database activity that might indicate a breach"

AI Analysis: Anomaly detected: User APP_SERVICE exported 2.3M rows from CUSTOMERS table at 02:14 UTC (normal: <1000 rows/day). Same user accessed 47 tables in HR schema they never accessed before. Login from new IP: 203.x.x.x (geolocation: unexpected country).

Recommended Actions: Immediately revoke APP_SERVICE credentials, review exported data scope, enable Data Safe SQL Firewall, investigate compromised credentials source.

Data Safe Activity Auditing Cloud Guard

Model Context Protocol (MCP) Integration

Connect your AI assistant to live OCI data through secure, standardized protocols.

👤

User Query

→

🧠

LLM + MCP Client

→

⚡

MCP Servers

→

☁️

OCI APIs

Log Analytics MCP

Query logs, run ML analysis, detect anomalies

logan_execute_query logan_detect_anomalies

Database Observatory MCP

AWR reports, SQL tuning, performance analysis

get_awr_report analyze_sql

Security MCP

Cloud Guard problems, vulnerability scans

list_problems get_security_score

OCI Core MCP

Compute, networking, observability metrics

list_instances get_metrics