portal.octodemo.cloud

OCI Observability, organized around the work your teams need to do.

Use this launchpad to choose the right OCI service, understand how the platform fits together, and move from overview to hands-on guidance across Monitoring, Log Analytics, APM, Database Management, enterprise applications, and integrations.

Start Here

A guided landing page for operators, developers, DBAs, security teams, and decision makers using OCI Observability.

portal.octodemo.cloud launchpad

Find the shortest route from “something looks wrong” to the OCI capability that answers it.

The platform already has the depth. This homepage changes the starting point: begin with a role, follow a troubleshooting workflow, or use the command palette when you know exactly where you want to go.

Start with you

Tailor the launchpad to your role and industry

Pick who you are and where you work — we'll highlight the levels, services, and the lens to start from. Access to every view is governed by OCI IAM: each persona maps to an OCI Group with policies scoped to the right compartments.


Follow the path

From govern to multitenancy — in order

Each level builds on the last. Jump straight to the launchpad module that delivers it, or open the full ladder in the guide.

Persona view

How each role identifies and uses the data

Same eight personas as the guide — what each one looks for, and the OCI services they live in.

Quick jump

Recommended starting points

OCI Observability Implementation Hub

Your step-by-step path to complete cloud-native observability, from business value to code

Strategic Core Objective

Unified Log Management & Governance

Establish a centralized, auditable record of all cloud and on-premises actions. Ingest, parse, and retain log streams to satisfy global compliance standards (ISO 42001, SOC 2, GDPR) while laying the foundation for threat hunting.

100% Audit & Compliance Readiness
Zero Siloed Application Log Files
50% Storage Cost Reduction via Lifecycle Rules
Business Impact

Consolidation & Risk Mitigation

Consolidating disparate logs into OCI Log Analytics removes blind spots, reduces compliance liabilities, and eliminates the cost of managing scattered log storage servers.

  • Cross-Environment Visibility: Unified view across VMs, OKE, on-prem, and external clouds.
  • Automated Ingestion: Zero-maintenance managed ingestion pipeline.
  • Compliance Guardrails: Encrypted, read-only indexes with lifecycle controls.

Onboarding Management Agents & Logging (Phase 1)

1

Set Up IAM Groups & Policies

Create dedicated OCI Compartments and write IAM policies granting the Management Agent permission to upload telemetry.

2

Deploy Unified Host Agents

Deploy the OCI Management Agent across your VM fleets and bare-metal nodes to collect system logs and OS metrics.

3

Enable Log Groups & Parsers

Register Log Groups, configure log collection paths, and apply built-in or custom parsers (JSON, Syslog, WebLogic).

4

Establish Retention Rules

Define automated Object Storage archival rules to transition older logs from hot query pools to cost-effective cold tiers.
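
As a rough illustration of step 1, the sketch below renders the kind of IAM policy statements that let Management Agents upload telemetry. It assumes the agents are collected in a dynamic group; the group name `mgmt-agent-dg` and compartment name `observability` are hypothetical, and the statement wording follows the general form Oracle documents for Management Agents, so verify it against the current documentation before applying it.

```python
from typing import List


def agent_policy_statements(dynamic_group: str, compartment: str) -> List[str]:
    """Render IAM policy statements for a Management Agent dynamic group."""
    subject = f"ALLOW DYNAMIC-GROUP {dynamic_group}"
    scope = f"IN COMPARTMENT {compartment}"
    return [
        # Lets agents post metric data to OCI Monitoring.
        f"{subject} TO USE METRICS {scope}",
        # Lets agents upload logs into Log Analytics log groups.
        f"{subject} TO {{LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS}} {scope}",
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute your own dynamic group and compartment.
    for statement in agent_policy_statements("mgmt-agent-dg", "observability"):
        print(statement)
```

Generating the statements from one function keeps the subject and scope consistent across compartments.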

Fluentd Open-Source Log Shipping Integration

Use industry-standard open-source collectors such as Fluentd to securely forward application log files to OCI Log Analytics endpoints.

Fluentd Configuration (td-agent.conf)
<source>
  @type tail
  path /var/log/app/app.log
  pos_file /var/log/td-agent/app.log.pos
  tag oci.loganalytics.portal
  <parse>
    @type json
    time_key timestamp
    time_format %Y-%m-%dT%H:%M:%S.%L%z
  </parse>
</source>

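# Assumes the fluent-plugin-oci-logging-analytics output plugin, which reads the
# log group and log source from oci_la_* fields on each record. Check parameter
# names against the plugin README for your installed version.
<filter oci.loganalytics.**>
  @type record_transformer
  <record>
    oci_la_log_group_id ocid1.loggroup.oc1.eu-frankfurt-1.aaaaaaaaxf...
    oci_la_log_source_name OCI_APM_Trace
  </record>
</filter>

<match oci.loganalytics.**>
  @type oci-logging-analytics
  namespace <TENANCY_NAMESPACE>
  auth_type InstancePrincipal
</match>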

Developer Checklist

  • Verify Agent State: Ensure `mgmt_agent` daemon is active and configured with a valid registration key.
  • Log Ingestion: Verify that logs appear in OCI Log Analytics under the `OCI_APM_Trace` log source, using the `Search` panel.
  • Metadata Tagging: Enrich Fluentd records with `host.name` and `app.environment` fields for multi-dimensional filtering.
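
To make the metadata tagging item concrete, here is a minimal sketch of an application writing JSON log lines that the Fluentd `tail` source above could parse. It emits a `timestamp` key in the `%Y-%m-%dT%H:%M:%S.%L%z` layout from the configuration and adds the `host.name` and `app.environment` fields at the application side. The function names are illustrative, not part of any OCI SDK.

```python
import json
import socket
from datetime import datetime, timezone
from typing import Optional


def format_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime with millisecond precision and a numeric offset."""
    millis = moment.microsecond // 1000
    return f"{moment.strftime('%Y-%m-%dT%H:%M:%S')}.{millis:03d}{moment.strftime('%z')}"


def build_log_record(message: str, environment: str,
                     moment: Optional[datetime] = None) -> str:
    """Return one JSON log line enriched with host and environment metadata."""
    moment = moment or datetime.now(timezone.utc)
    record = {
        "timestamp": format_timestamp(moment),
        "message": message,
        "host.name": socket.gethostname(),
        "app.environment": environment,
    }
    return json.dumps(record)


if __name__ == "__main__":
    print(build_log_record("portal started", "staging"))
```

Enrichment could equally be done in Fluentd with a `record_transformer` filter; doing it in the application keeps the fields present even if the shipping pipeline changes.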
Strategic Core Objective

Operational Telemetry & SLO Alarms

Establish metric baselines and automated alert gates. Move from a reactive fire-fighting posture to proactive, anomaly-driven operations, protecting system availability before outages reach end users.

90% Reduction in MTTD / MTTR
Zero Silent Application Outages
100% Service Level Objective (SLO) Coverage
Business Impact

Proactive Uptime & Alert Lifecycle

Centralized alerting connects OCI metrics to on-call schedules, minimizing disruption and driving strict accountability via automated SLA/SLO reporting.

  • Static & Anomaly Alarms: Trigger alerts on simple thresholds or ML-derived baseline deviations.
  • Unified Connector Pipelines: Stream alarms to Slack, Teams, or ITSM tools automatically.
  • Capacity Guards: Predict CPU and storage constraints before they cause outages.

Alert Automation & SLO Design (Phase 2)

1

Define SLIs and SLOs

Align platform SLIs (latency, error rate) with business outcomes. Establish acceptable SLO budgets (e.g. 99.9% uptime).

2

Deploy Metric Streams

Publish custom application and system metrics to the secure OCI Monitoring service namespace.

3

Create Alarms & Notification Topics

Create OCI Monitoring alarms and bind them to OCI Notification topics representing on-call responder groups.

4

Wire Service Connectors

Configure Service Connector Hub to route critical alarms directly to PagerDuty or Slack webhooks.
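
The SLO budget mentioned in step 1 is simple arithmetic: a 99.9% availability target over a 30-day window leaves about 43.2 minutes of allowed downtime. The sketch below shows that calculation; the function names and the 30-day default window are assumptions chosen for illustration.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over the window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the unspent error budget; a negative value means the SLO is breached."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes


if __name__ == "__main__":
    print(round(error_budget_minutes(99.9), 1))   # minutes allowed in 30 days
    print(round(budget_remaining(99.9, 12.0), 1)) # minutes left after 12 min of downtime
```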

OCI Monitoring MQL & Prometheus Scraping

Configure OCI Monitoring alarms using Metric Query Language (MQL), or pull metrics directly into an open-source Prometheus/Grafana stack using OCI Management Agent endpoint exporters.

OCI MQL Expression
# Fire when the HTTP error rate exceeds 5% in a 1-minute interval.
# To require three consecutive intervals, set the alarm's trigger delay to 3 minutes.
# HttpErrors is a hypothetical custom metric that counts non-200 responses.
(HttpErrors[1m]{resourceId = "ocid1.instance.oc1..."}.sum() / HttpRequests[1m]{resourceId = "ocid1.instance.oc1..."}.sum()) * 100 > 5
Prometheus YML Scrape block
scrape_configs:
  - job_name: 'oci-management-agent-metrics'
    scrape_interval: 15s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.0.2.15:9100'] # IP of the node running OCI Management Agent exporter
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'oci_computeagent_(.*)'
        target_label: __name__

Developer Checklist

  • Dimension Matching: Alarms must specify the correct compartment and resource OCID dimension filters.
  • Testing Alarms: Trigger metric anomalies (e.g. stress test CPU) to verify the PagerDuty responder loop.
  • OSS Exporter: Secure your local Prometheus endpoint with basic auth or network security groups (NSGs).
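
When testing alarms, it can help to reproduce the alarm condition offline against recorded counts. The sketch below evaluates the rule described in this section (an error rate above 5% for three consecutive 1-minute intervals) over two lists of per-interval counts. It is a local approximation for test planning, not how OCI Monitoring evaluates alarms internally.

```python
from typing import Sequence


def alarm_should_fire(error_counts: Sequence[int], total_counts: Sequence[int],
                      threshold_percent: float = 5.0,
                      required_intervals: int = 3) -> bool:
    """Return True if the error rate breaches the threshold for N consecutive intervals."""
    consecutive = 0
    for errors, total in zip(error_counts, total_counts):
        # Treat an interval with no traffic as a 0% error rate.
        rate = (errors / total) * 100 if total else 0.0
        consecutive = consecutive + 1 if rate > threshold_percent else 0
        if consecutive >= required_intervals:
            return True
    return False


if __name__ == "__main__":
    # Four 1-minute intervals with 100 requests each.
    print(alarm_should_fire([1, 6, 7, 8], [100, 100, 100, 100]))
    print(alarm_should_fire([6, 7, 1, 8], [100, 100, 100, 100]))
```
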
Strategic Core Objective

Distributed Tracing & Database Performance Hub

Pinpoint distributed latency bottlenecks and transaction-level query blocks. By tracing requests across microservices and cloud databases, you keep slow page responses from driving away active users.

< 2s Average End-User Page Load (RUM)
40% Database CPU Optimization
100% Transaction Trace Visibility
Business Impact

Browser-to-DB Visibility & EBS Modernization

APM Browser RUM tracks actual user transaction latency, while DB Management provides DBA dashboards that replace archaic database maintenance CLI workflows.

  • W3C Context Tracing: Follow span context headers from browser to application and database.
  • EBS & Fusion Lane: Instrument with OCI APM; Stack Monitoring's capabilities are merging into OCI Monitoring.
  • DB Performance Hub: Real-time AWR charts, ASH diagnostics, and SQL tuning advisors.
🔀
Stack Monitoring is converging into OCI Monitoring

OCI Stack Monitoring's capabilities — full-stack discovery, topology, and application-stack health — are being merged into the OCI Monitoring service. Existing investment carries forward; there is no need to replace it with separate services.

APM & Database Management Migration Roadmap (Phase 3)

1

Create OCI APM Domains

Provision OCI APM domains and retrieve your Public Datakeys (for browser RUM) and Private Datakeys (for server agents).

2

Instrument Browser RUM

Inject the APM Browser RUM JS agent into main templates to record end-user page speed and AJAX performance.

3

Configure OTel Backends

Deploy OpenTelemetry backend instrumentation across your microservices to stream tracing spans to the APM collector.

4

Register Databases to DB Mgmt

Register Autonomous, Cloud, and External DBs into OCI Database Management for Performance Hub and AWR metrics.
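
For steps 1 and 3, the private datakey has to reach the OpenTelemetry exporter without being committed to source. The sketch below reads it from the environment and assembles exporter settings. The variable names `OCI_APM_ENDPOINT` and `OCI_APM_PRIVATE_DATAKEY` are hypothetical, and the URL path and `dataKey` authorization header reflect the pattern in Oracle's OpenTelemetry guidance as best understood here, so confirm both against your APM domain's documentation.

```python
import os
from typing import Dict, Mapping


def apm_otlp_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Build OTLP trace exporter settings from environment variables."""
    endpoint = env.get("OCI_APM_ENDPOINT")
    private_key = env.get("OCI_APM_PRIVATE_DATAKEY")
    if not endpoint or not private_key:
        raise RuntimeError("Set OCI_APM_ENDPOINT and OCI_APM_PRIVATE_DATAKEY")
    return {
        # Assumed path for private-key trace uploads; verify for your domain.
        "endpoint": endpoint.rstrip("/") + "/20200101/opentelemetry/private/v1/traces",
        # Assumed header format for datakey authentication.
        "headers": {"Authorization": f"dataKey {private_key}"},
    }
```

The returned dictionary can be passed to whichever OTLP HTTP exporter your services use; failing fast on a missing key avoids silently dropping spans.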

OpenTelemetry distributed tracing context propagation

Propagate tracing context across microservices using W3C Trace Context HTTP headers, and capture execution spans for SQL queries.

OpenTelemetry Span Tracing
from opentelemetry import trace

tracer = trace.get_tracer("demo-portal")

QUERY = "SELECT * FROM treasury WHERE name = %s"

def get_treasury_records(db_conn, search_term):
    # Start a span to trace this transaction
    with tracer.start_as_current_span("treasury.search") as span:
        span.set_attribute("db.system", "mssql")
        # Record only the parameterized statement so user input never lands in span data
        span.set_attribute("db.statement", QUERY)

        # Flag a possible SQL injection attempt (Level 3 security attribute tracking)
        if "'" in search_term:
            span.set_attribute("security.attack.type", "sqli")
            span.set_attribute("security.attack.severity", "critical")

        cursor = db_conn.cursor()
        # Bind the search term as a parameter instead of interpolating it into the SQL
        cursor.execute(QUERY, (search_term,))
        return cursor.fetchall()
W3C Trace Context Propagation
import httpx
from opentelemetry.propagate import inject

def call_inventory_service():
    # Inject the current span context into W3C Trace Context headers
    headers = {}
    inject(headers)

    # With an active span and the default propagators, headers will contain e.g.:
    # 'traceparent': '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'

    response = httpx.get("http://inventory.service/items", headers=headers)
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return response.json()

Developer Checklist

  • W3C Header Support: Verify HTTP client and proxy elements do not strip `traceparent` or `tracestate` headers.
  • Database Diagnostics: Log query execution latencies and cross-reference them with Database Performance Hub session IDs.
  • Secure Data Keys: Ensure APM private datakeys are kept secure and loaded via environment variables, not hardcoded in source.
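
To check the W3C header item, a small validator can confirm that a `traceparent` value survived a proxy hop intact. The sketch below parses the four-field layout used by version `00` of the W3C Trace Context specification (version, trace ID, parent ID, flags). The class and function names are illustrative, and it does not handle `tracestate` or future versions with extra fields.

```python
import re
from typing import NamedTuple, Optional

# version(2 hex)-trace_id(32 hex)-parent_id(16 hex)-flags(2 hex), lowercase only
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Parse a traceparent header; return None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The specification treats version ff and all-zero IDs as invalid.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


if __name__ == "__main__":
    print(parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
```
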
Strategic Core Objective

Operations Insights & GenAI Troubleshooting

Apply machine learning and LLMs to infrastructure diagnostic datasets. Predict database and compute capacity boundaries, cluster raw logs to identify anomalies, and let AI agents summarize complex incidents using natural language.

99% Accuracy in Capacity Forecasting
95% Log Clustering Anomaly Detection
Minutes per Incident Diagnostics Cycle
Business Impact

AIOps & Secure AI Agent Operations

Centralized AI diagnostic services enable non-expert engineers to query cluster metrics, analyze log patterns, and predict scaling blockages safely via secure Model Context Protocol (MCP) bridges.

  • Operations Insights Capacity: 30/90-day Exadata database storage and CPU growth projections.
  • LoganAI Clustering: Condenses millions of raw log entries into a few dozen anomaly clusters automatically.
  • AIOps MCP Engine: Securely binds LLMs to read-only OCI APIs to answer operator questions.

Enabling Capacity Forecasting & AIOps (Phase 4)

1

Enable OCI Operations Insights

Register your databases and host compartments into Operations Insights to begin 30-day capacity trend tracking.

2

Set Up Log Analytics Cluster Rules

Enable LoganAI machine learning models to cluster recurring log lines and highlight anomalous spikes.

3

Configure MCP Server

Deploy the Model Context Protocol (MCP) server container, mounting secure OCI API read-only credentials.

4

Connect LLM Chat Copilots

Integrate your corporate chat client or development environment with the MCP server to start querying OCI.
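
To give a feel for what a capacity projection involves, the sketch below fits a straight line to daily usage samples and estimates how many days remain until a capacity limit is reached. This is a deliberately simple least-squares illustration with hypothetical function names; it is not the forecasting model Operations Insights uses.

```python
from typing import Optional, Sequence, Tuple


def fit_line(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * day + intercept for days 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_capacity(daily_usage: Sequence[float], capacity: float) -> Optional[float]:
    """Days after the last sample until the trend line reaches capacity, or None if flat/declining."""
    slope, intercept = fit_line(daily_usage)
    if slope <= 0:
        return None
    crossing_day = (capacity - intercept) / slope
    return max(0.0, crossing_day - (len(daily_usage) - 1))


if __name__ == "__main__":
    usage_gb = [100 + 2 * day for day in range(30)]  # synthetic: grows 2 GB per day
    print(days_until_capacity(usage_gb, capacity=200))
```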

AIOps Model Context Protocol & Custom OCL

Configure an MCP server schema to connect LLMs to OCI APIs, or use OCI Log Analytics Query Language (OCL) with machine learning modifiers like `cluster` to isolate log anomalies.

MCP Server Tool Configuration Schema
{
  "mcpServers": {
    "oci-observability": {
      "command": "python",
      "args": ["-m", "mcp_server.main"],
      "env": {
        "OCI_COMPARTMENT_OCID": "ocid1.compartment.oc1..aaaaaa...",
        "OCI_AUTH_MODE": "instance_principal",
        "OCI_LOG_ANALYTICS_NAMESPACE": "tenancy_namespace"
      }
    }
  }
}
OCI OCL Query
# Cluster application logs over the last 24 hours to isolate warning and error anomalies
'Log Source' = 'OCI APM Trace'
| where 'Entity Type' = 'Host'
| cluster
| where 'Cluster Type' = 'Anomaly'
| stats count() as Occurrences by 'Cluster Content'
| sort - Occurrences
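
The idea behind the `cluster` command can be illustrated with a toy version: mask the variable parts of each log line so that lines sharing a template fall into one group, then treat rare templates as anomaly candidates. The sketch below is only a simplified stand-in with hypothetical function names; LoganAI's actual clustering is more sophisticated and is not reproduced here.

```python
import re
from collections import Counter
from typing import Iterable, List

_HEX = re.compile(r"\b0x[0-9a-fA-F]+\b")
_NUM = re.compile(r"\d+")


def log_template(line: str) -> str:
    """Replace hex literals and digit runs with placeholders."""
    return _NUM.sub("<NUM>", _HEX.sub("<HEX>", line))


def cluster_logs(lines: Iterable[str]) -> Counter:
    """Count how many lines share each masked template."""
    return Counter(log_template(line) for line in lines)


def rare_templates(clusters: Counter, max_count: int = 1) -> List[str]:
    """Return templates seen at most max_count times (anomaly candidates)."""
    return [template for template, count in clusters.items() if count <= max_count]


if __name__ == "__main__":
    sample = [
        "User 42 logged in from 10.0.2.15",
        "User 7 logged in from 10.0.2.99",
        "Disk failure on /dev/sda1",
    ]
    clusters = cluster_logs(sample)
    print(clusters.most_common())
    print(rare_templates(clusters))
```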

Developer Checklist

  • Cluster Settings: Ensure Log Analytics entity mappings are complete so log clusters link correctly to host names.
  • MCP Permissions: Apply strict IAM policies limiting the MCP API credentials to read-only metadata calls.
  • Trend Accuracy: Operations Insights requires a minimum of 14 days of metrics data before displaying capacity forecasting charts.
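
As defense in depth for the MCP permissions item, the server can also refuse non-read operations before any API call is made. The sketch below is a hypothetical allowlist check keyed on operation-name prefixes in the style of the OCI Python SDK (`get_`, `list_`, `summarize_`). IAM policies remain the real control; this guard only catches mistakes earlier.

```python
READ_ONLY_PREFIXES = ("get_", "list_", "summarize_")


def is_read_only(operation: str) -> bool:
    """Return True if the operation name looks like a read-only call."""
    return operation.lower().startswith(READ_ONLY_PREFIXES)


def guard_tool_call(operation: str) -> str:
    """Return the operation name if it is read-only, otherwise raise PermissionError."""
    if not is_read_only(operation):
        raise PermissionError(f"Operation '{operation}' is not read-only; refusing to call it")
    return operation


if __name__ == "__main__":
    print(guard_tool_call("list_alarms"))
    try:
        guard_tool_call("delete_alarm")
    except PermissionError as err:
        print(err)
```
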
Common journeys

Use the product map that matches how people actually troubleshoot.

These flows turn the service catalog into a user journey: observe, investigate, explain impact, and automate the next step.

Recommended first stop

Learn the platform through workflows, not product names.

New users usually do better when they start with one goal: establish health, investigate an incident, tune an application, or connect observability to the rest of the estate.

Platform diagram
Cloud Guard
OCI O&M Platform: Monitoring (Metrics & Alarms), Stack Monitoring (Full Stack Discovery, merging into OCI Monitoring), APM (Tracing & Synthetics), Log Analytics (AI/ML Analysis), Ops Insights (Capacity Planning), Database Mgmt (Fleet Management)
OCI Services Integration: Service Connector Hub, Events, Notifications, OCI Functions, Java Management, Streaming
Data Sources: Management Agents, OCI Logging, VCN Flow Logs, Service Logs, Audit Logs, Object Storage
2.4M Metrics/min
847 GB Logs Analyzed
156ms Avg Response
99.7% Security Score

OCI Observability Services

Comprehensive tools for monitoring, analysis, and management

OCI Monitoring
Free Tier

Monitoring

Collect and visualize metrics from OCI resources with customizable dashboards and intelligent alerting.

500M ingested datapoints/month
Metrics Explorer & Dashboards
Alarms & Notifications
Custom Metrics (pay per use)
Metrics Explorer Custom Dashboards Alarms Notifications
OCI Log Analytics
Free Tier

Log Analytics

AI-powered log analysis with ML clustering, anomaly detection, and 300+ pre-built parsers for any log source.

10 GB Active Storage/month
300+ Pre-built Parsers
LoganAI (NLP Queries)
Extended Storage & Retention
ML Clustering Anomaly Detection 300+ Parsers Link Analysis
OCI Application Performance Monitoring
Free Tier

Application Performance Monitoring

End-to-end visibility from browser to database with distributed tracing and real user monitoring.

1000 Trace Events/hour
Basic Distributed Tracing
Synthetic Monitoring
Real User Monitoring (RUM)
Distributed Tracing Real User Monitoring Synthetic Monitoring
OCI Stack Monitoring
Merging into OCI Monitoring

Stack Monitoring

Auto-discover and monitor your full application stack. Its capabilities are being merged into the OCI Monitoring service.

No Free Tier Available
Auto Discovery & Topology
EBS & Fusion Support
Auto Discovery Topology View EBS Support
OCI Operations Insights
Enterprise

Operations Insights

Analyze resource utilization, forecast capacity needs, and identify optimization opportunities across your fleet.

No Free Tier Available
Capacity Planning & Forecasting
SQL Warehouse Analytics
Exadata & Host Insights
Capacity Planning SQL Warehouse Exadata Insights Host Insights
OCI Database Management
Free Tier

Database Management

Comprehensive fleet management with Performance Hub, SQL tuning advisors, and proactive problem detection.

Basic Fleet Overview
Basic Metrics & Monitoring
Performance Hub & AWR
SQL Tuning Advisor
Fleet Summary Performance Hub SQL Tuning Job Scheduling

Service Tier Comparison

Free tier vs. paid capabilities at a glance

Service | Free Tier
Monitoring | 500M datapoints/month, Alarms, Dashboards
Log Analytics | 10 GB/month, 300+ Parsers, ML Clustering
APM | 1000 Traces/hr, Basic Distributed Tracing
Database Management | Basic Fleet View, Core Metrics
Operations Insights | No free tier available
Stack Monitoring (merging into Monitoring) | No free tier available

Getting Started

Begin your observability journey in minutes

1

Enable Services

Navigate to the OCI Console and enable the Observability services you need. Most services are available in all OCI regions and can be enabled in minutes with just a few clicks.

2

Deploy Management Agents

Install the unified Management Agent on your compute instances to collect metrics, logs, and traces. The agent supports Linux, Windows, and Solaris operating systems.

3

Configure Data Collection

Set up log sources, enable service connectors, and configure APM instrumentation for your applications. Use pre-built integrations or create custom configurations.

4

Build Dashboards & Alerts

Create custom dashboards to visualize your data and set up alarms to get notified when something needs attention. Use saved searches and scheduled reports for ongoing monitoring.

OCI 2025 Observability Professional Certification

Validate your expertise with the official Oracle certification (1Z0-1111-25)

OCI Observability Professional 1Z0-1111-25

Target Audience

Site Reliability Engineers, Cloud Operations, Application Developers, Cloud Administrators, and roles involved in monitoring and troubleshooting OCI environments.

The 7 Pillars of OCI Observability

Exam objectives mapped to core competency areas

7%
Define the Pillars of Observability

Summarize OCI Observability and Management Services portfolio and understand how they work together.

Platform Overview Service Integration
18%
Monitor Cloud Environments

Demonstrate OCI Monitoring service with metrics and alarms. Configure alarm definitions using best practices.

Metrics Explorer Alarm Configuration Custom Metrics
10%
Respond to Cloud Changes

Analyze OCI Events Service concepts. Determine event structure, types, rules, and integration with OCI services.

Events Service Event Rules Notifications
16%
Centrally Manage Log Data

Distinguish log categories and enable log collection. Work with OCI Logging for centralized log management.

OCI Logging Service Connector Hub Log Groups
20%
Application Performance Monitoring

Monitor applications with deep visibility into end-user experience using distributed tracing, RUM, and synthetics.

Distributed Tracing Real User Monitoring Synthetic Tests
7%
Monitor Distributed Components

Monitor distributed components of an application stack with full-stack observability and topology views.

Stack Monitoring Topology Discovery Dependencies

End-to-End Workflow Use Cases

Real-world scenarios showcasing how OCI Observability services work together

🔥

Incident Investigation & Root Cause Analysis

From alert to resolution in minutes

When a critical alarm fires, use OCI's integrated observability stack to quickly correlate metrics, logs, and traces to identify the root cause and reduce mean time to resolution (MTTR).

Alarm Fires
Check Metrics
Query Logs
Trace Request
Root Cause
1 Receive alert via OCI Monitoring alarm → Notifications → PagerDuty/Slack
2 View metric charts in Monitoring to identify the affected resource and timing
3 Pivot to Log Analytics to search for error patterns during the incident window
4 Use APM distributed tracing to follow the request path and identify the slow/failing service
5 Correlate with Database Management Performance Hub if database is suspected
Monitoring Log Analytics APM Database Management Notifications
🔐

Security Monitoring & Threat Detection

Proactive security with Cloud Guard + Log Analytics

Combine Cloud Guard's automated threat detection with Log Analytics' ML-powered anomaly detection to identify and investigate security incidents across your OCI tenancy.

Cloud Guard
Problem Detected
Audit Logs
VCN Flow Logs
Remediate
1 Cloud Guard continuously monitors your tenancy and detects security problems
2 Review problem details and risk score in Cloud Guard console
3 Analyze OCI Audit logs in Log Analytics to understand user activity timeline
4 Correlate with VCN Flow Logs to identify suspicious network traffic
5 Use Cloud Guard Responders to automatically remediate the issue
Cloud Guard Log Analytics OCI Audit VCN Flow Logs Notifications

Application Performance Optimization

End-to-end visibility from browser to database

Use APM's distributed tracing combined with Real User Monitoring (RUM) to identify performance bottlenecks across your entire application stack, from frontend to backend to database.

Browser (RUM)
API Gateway
Microservices
Database
1 Deploy APM agents to your applications (Java, Node.js, .NET, Python, Go)
2 Enable Real User Monitoring to capture actual user experience metrics
3 Analyze trace waterfalls to identify slow service calls and database queries
4 Use Synthetic Monitoring to proactively test critical user journeys
5 Correlate with Database Management for SQL-level performance insights
APM Real User Monitoring Synthetic Monitoring Database Management
📊

Capacity Planning & Resource Optimization

Predict needs and optimize costs with Operations Insights

Use Operations Insights to analyze historical resource utilization patterns, forecast future capacity needs, and identify optimization opportunities across your database and compute fleet.

Enable OPSI
Collect Data
Analyze Trends
Forecast
Right-size
1 Enable Operations Insights for your databases and compute hosts
2 Review CPU, memory, and storage utilization dashboards
3 Use ML-powered forecasting to predict when you'll hit capacity limits
4 Analyze SQL Warehouse for inefficient queries consuming resources
5 Identify underutilized resources for right-sizing opportunities
Operations Insights SQL Warehouse Database Management Host Insights
🐳

Kubernetes & Microservices Observability

Full-stack visibility for containerized workloads

Monitor your OKE (Oracle Container Engine for Kubernetes) clusters with comprehensive observability covering cluster health, pod metrics, application traces, and container logs.

OKE Cluster
Prometheus Metrics
Container Logs
APM Traces
1 Enable OCI Monitoring for OKE cluster metrics (CPU, memory, network)
2 Configure Fluentd or OCI Logging to collect container logs
3 Send logs to Log Analytics for ML-powered clustering and analysis
4 Deploy APM agents for distributed tracing across microservices
5 Create unified dashboards combining metrics, logs, and traces
Monitoring Log Analytics APM OCI Logging Service Connector
🗄️

Database Performance Troubleshooting

Deep dive into Oracle Database performance

Use Database Management's Performance Hub to analyze ASH, AWR, and real-time SQL monitoring data to identify and resolve database performance issues.

Slow Query Alert
Performance Hub
ASH Analysis
SQL Tuning
1 Enable Database Management for your Oracle Databases (on-prem or cloud)
2 Use Fleet Summary to identify databases with performance issues
3 Open Performance Hub to analyze ASH data and wait events
4 Identify problematic SQL statements using SQL Monitoring
5 Get SQL Tuning Advisor recommendations and implement fixes
Database Management Performance Hub SQL Tuning Advisor AWR
🏢

E-Business Suite Full Stack Monitoring

Complete visibility for Oracle EBS on OCI

Monitor your Oracle E-Business Suite deployment end-to-end using Stack Monitoring's auto-discovery capabilities combined with Log Analytics for application and database logs.

Apache/OHS
Forms/OAF
Concurrent Mgr
EBS Database
1 Install Management Agent on EBS application and database servers
2 Enable Stack Monitoring to auto-discover the EBS stack topology
3 Configure Log Analytics to collect CM logs, Apache logs, and DB alert logs
4 Enable Database Management for EBS database performance monitoring
5 Set up alarms for critical metrics (CM queue depth, form server health)
Stack Monitoring Log Analytics Database Management Management Agent
💰

Cost Optimization with Observability Data

Use telemetry to reduce cloud spend

Leverage observability data from Operations Insights and Monitoring to identify underutilized resources, right-size infrastructure, and optimize your OCI spend.

Collect Metrics
Analyze Usage
Identify Waste
Right-size
1 Enable Operations Insights across your database and compute fleet
2 Review resource utilization dashboards to identify underutilized assets
3 Use Monitoring metrics to identify idle or over-provisioned instances
4 Analyze SQL performance to identify queries that can be optimized
5 Implement right-sizing recommendations and track savings
Operations Insights Monitoring Cost Analysis Budgets

📸 Product Capabilities Showcase

Real screenshots from OCI Observability services

Kubernetes Cluster Summary (Log Analytics)

Comprehensive view of cluster health, resource utilization, and workload status in Log Analytics

Kubernetes Nodes Dashboard (Log Analytics)

Node-level metrics including CPU, memory, disk usage, and pod distribution across the cluster

Workloads Dashboard (Log Analytics)

Deployment health, replica status, and container-level insights for all workloads

Cluster Topology View (Log Analytics)

Visual representation of cluster topology with namespaces, workloads, and services

🔒 Security & Compliance Use Cases

Security monitoring, threat detection, and compliance solutions

🛡️

Security Operations Center (SOC)

Centralized security monitoring with Log Analytics

Build a cloud-native Security Operations Center using OCI Log Analytics Security Fundamentals Dashboards. Aggregate and analyze security events across your entire OCI infrastructure with pre-built threat detection rules and compliance reporting.

Security Logs
Log Analytics
SOC Dashboards
Alerts
1 Enable OCI Audit Logs and VCN Flow Logs across all compartments
2 Import Security Fundamentals Dashboards from OCI GitHub quickstart
3 Configure OS-level security log collection (auditd, secure, syslog)
4 Set up scheduled queries for MITRE ATT&CK technique detection
5 Configure real-time alerts via Notifications and PagerDuty/Slack integration
Log Analytics Cloud Guard Audit Logs VCN Flow Logs
🔥

WAF Security Monitoring

Web Application Firewall analytics and threat intelligence

Monitor OCI Web Application Firewall traffic and security events using Log Analytics. Visualize attack patterns, blocked requests, and protection rule effectiveness with pre-built WAF dashboards.

Web Traffic
OCI WAF
Log Analytics
Threat Intel
1 Enable WAF logging to OCI Logging service
2 Configure Service Connector to stream WAF logs to Log Analytics
3 Import pre-built WAF dashboards for traffic and threat analysis
4 Set up geolocation-based attack pattern detection
5 Configure alerts for SQL injection, XSS, and bot attacks
WAF Log Analytics Service Connector Hub Notifications
🌐

API Gateway Monitoring

API traffic analysis and performance insights

Gain deep insights into OCI API Gateway traffic patterns, latency metrics, error rates, and usage analytics. Create custom dashboards to monitor API consumers, throttling events, and backend response times.

API Requests
API Gateway
Log Analytics
API Insights
1 Enable access and execution logging on API Gateway deployments
2 Configure Service Connector Hub to stream logs to Log Analytics
3 Create dashboards for latency percentiles, error rates, and throughput
4 Monitor API consumer patterns and rate limiting events
5 Set up alerts for backend failures and SLA breaches
API Gateway Log Analytics Monitoring Service Connector Hub

🗄️ Oracle Database Use Cases

Comprehensive observability and security solutions for Oracle Databases

🛡️

Data Safe Enabled

Free tier available, paid for non-OCI databases

Oracle Data Safe provides essential security capabilities for Oracle Databases including security assessments, user assessments, data masking, activity auditing, and sensitive data discovery. Available free for OCI databases or as a paid service for on-premises and non-OCI databases.

Oracle DB
Data Safe
Security Insights
Compliance
1 Register your Oracle Database with Data Safe (OCI, on-prem, or other clouds)
2 Run Security Assessment to identify vulnerabilities and misconfigurations
3 Perform User Assessment to analyze user privileges and risky configurations
4 Enable Activity Auditing to track database operations and user activities
5 Use Sensitive Data Discovery to locate PII and sensitive columns
Data Safe Security Assessment Activity Auditing Data Masking
📊

Data Safe → Log Analytics

Unified security analysis and compliance reporting

Stream Data Safe audit events to Log Analytics for advanced correlation, custom dashboards, and long-term retention. Combine database security events with infrastructure logs for comprehensive security posture analysis.

Oracle DB
Data Safe
Service Connector
Log Analytics
1 Enable Data Safe Activity Auditing for your Oracle Databases
2 Configure Service Connector Hub to stream audit data to Log Analytics
3 Create custom Log Analytics dashboards for security event visualization
4 Set up scheduled queries for compliance reporting (SOX, GDPR, HIPAA)
5 Configure alerts for anomalous database activities and policy violations
Data Safe Log Analytics Service Connector Hub Notifications
📈

Ops Insights + DB Management + Log Analytics

Complete Oracle Database observability stack

Achieve comprehensive Oracle Database observability by combining Operations Insights for capacity planning and SQL analytics, Database Management for performance monitoring, and Log Analytics for alert and audit log analysis.

Ops Insights + DB Mgmt + Log Analytics = 360° View
1 Enable Database Management for real-time performance monitoring and Performance Hub
2 Enable Operations Insights for capacity planning and SQL Warehouse analytics
3 Configure Log Analytics to collect alert logs, listener logs, and audit trails
4 Use Fleet Summary for at-a-glance health across all databases
5 Create unified dashboards correlating performance metrics with log events
Operations Insights Database Management Log Analytics SQL Warehouse
🤖

Oracle DB MCP Server + GenAI

AI-powered troubleshooting and daily DBA tasks

Leverage the Oracle Database MCP (Model Context Protocol) server with Generative AI to enable natural language interactions for database troubleshooting, performance analysis, and automating routine DBA tasks through conversational interfaces.

DBA Query → GenAI + MCP → Oracle DB → Insights
1. Deploy Oracle Database MCP server connected to your database fleet
2. Configure GenAI agent (Claude, OCI GenAI) with MCP server access
3. Use natural language to query performance metrics and diagnose issues
4. Automate routine tasks: space analysis, user management, backup status
5. Generate AWR reports and performance recommendations via conversation
MCP Server · OCI GenAI · Database Management · Claude Code
🔍

Log Analytics Enterprise Platform

OS Security, Prometheus, Data Safe, Network, Fusion HCM

Leverage OCI Log Analytics as a unified platform for OS security logs, Prometheus metrics ingestion, Data Safe audit events, network flow logs, and Oracle Fusion HCM application logs. LoganAI adds intelligent analysis, and the Logan MCP server connects third-party LLMs.

Multi-Source → Log Analytics → LoganAI → 3rd Party LLMs
1. Configure OS security log collection (auditd, syslog, secure logs)
2. Set up Prometheus metric ingestion via remote write or Service Connector
3. Stream Data Safe audit events and VCN Flow Logs to Log Analytics
4. Collect Oracle Fusion HCM logs via Management Agent or API integration
5. Deploy Logan MCP server for Claude/ChatGPT natural language log queries
Log Analytics · LoganAI · Logan MCP Server · Prometheus · Data Safe

APM for Oracle Fusion

Visual Builder apps and Fusion Cloud performance monitoring

Monitor Oracle Fusion Cloud applications including Visual Builder apps with OCI Application Performance Monitoring. Gain end-to-end visibility into user experience, transaction traces, and backend performance for your Fusion deployments.

Visual Builder → APM Browser → APM Tracer → Fusion Backend
1. Create APM Domain and obtain data upload endpoint and keys
2. Inject APM Browser Agent into Visual Builder applications
3. Configure Real User Monitoring (RUM) for end-user experience tracking
4. Enable server-side tracing for Fusion backend services if accessible
5. Create Apdex-based dashboards and alerts for SLA monitoring
APM · Real User Monitoring · Visual Builder · Fusion Cloud
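As a rough illustration of what an Apdex-based dashboard in step 5 would compute, the sketch below applies the standard Apdex formula to a list of page-load times. The 500 ms target and the sample durations are assumptions for illustration, not values taken from OCI APM.

```javascript
// Apdex = (satisfied + tolerating / 2) / total samples.
// thresholdMs is the assumed target T; responses up to 4T count as tolerating.
function apdex(durationsMs, thresholdMs) {
  if (durationsMs.length === 0) return null;
  let satisfied = 0;
  let tolerating = 0;
  for (const d of durationsMs) {
    if (d <= thresholdMs) satisfied += 1;
    else if (d <= 4 * thresholdMs) tolerating += 1;
  }
  return (satisfied + tolerating / 2) / durationsMs.length;
}

// Hypothetical page-load times in milliseconds.
const sampleLoads = [120, 180, 450, 900, 2500];
console.log(apdex(sampleLoads, 500)); // 0.7
```

With these inputs, three loads are satisfied, one is tolerating, and one is frustrated, which gives (3 + 0.5) / 5.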
🔎

Log Analytics for Oracle Fusion

Fusion ERP, HCM, SCM & CX application, audit and security logs

Ingest Oracle Fusion Applications logs into OCI Log Analytics — application, audit, and security logs from Fusion ERP, HCM, SCM, and CX — using the BI Cloud Connector (BICC) export plus the Fusion Applications Observability solution. Parse, enrich, and correlate them with the OCI-side signals, then build saved searches and dashboards for functional and security teams.

Fusion Apps → BICC / Audit export → Log Analytics → Dashboards
1. Enable Fusion Applications audit and security log ingestion, or schedule a BICC extract to Object Storage
2. Use the predefined Fusion Applications log sources and parsers in OCI Log Analytics
3. Map entities to business services (payroll, order-to-cash, procurement) and tag by environment
4. Correlate Fusion logs with APM traces and Integration Cloud activity on a common timeline
5. Build saved searches, scheduled detections, and role-based dashboards for functional and security teams
Log Analytics Fusion HCM Audit Management Agent
🔗

Log Analytics for OIC

Oracle Integration Cloud monitoring and troubleshooting

Monitor Oracle Integration Cloud (OIC) flows, connections, and errors using Log Analytics. Collect integration activity logs, track message flows, and create dashboards for integration health and performance analysis.

OIC Flows → Activity Stream → Log Analytics → Insights
1. Enable OIC Activity Stream API or configure log export to Object Storage
2. Create custom Log Analytics parser for OIC activity log format
3. Set up Service Connector to ingest OIC logs into Log Analytics
4. Build dashboards for integration success/failure rates and latency trends
5. Configure alerts for integration failures and connection timeouts
Log Analytics · OIC · Service Connector Hub · Notifications

OCI Monitoring Query Builder

Build and execute metrics queries with dynamic namespace and region selection


Common Metric Namespaces

Quick reference for OCI metric namespaces

Compute

  • oci_computeagent - Agent-collected metrics
  • oci_compute_infrastructure_health - Instance health

Database

  • oci_autonomous_database - ADB metrics
  • oci_database - Base DB metrics
  • oci_mysql_database - MySQL HeatWave

Networking

  • oci_vcn - VCN flow metrics
  • oci_lbaas - Load Balancer
  • oci_service_gateway - Service Gateway

Storage

  • oci_objectstorage - Object Storage
  • oci_blockstore - Block Volume
  • oci_filestorage - File Storage
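The namespaces above are what a Monitoring Query Language (MQL) expression runs against. The sketch below assembles an MQL string of the form metric[interval]{dimension = "value"}.statistic(). The helper name buildMql and the instance name web-01 are hypothetical, and CpuUtilization with the resourceDisplayName dimension is assumed to be available in oci_computeagent.

```javascript
// Build an MQL expression string from its parts.
// Dimension values are inserted as-is; escaping of quotes is not handled here.
function buildMql({ metric, interval, dimensions = {}, statistic }) {
  const filters = Object.entries(dimensions)
    .map(([key, value]) => `${key} = "${value}"`)
    .join(', ');
  const dimensionClause = filters ? `{${filters}}` : '';
  return `${metric}[${interval}]${dimensionClause}.${statistic}()`;
}

// Hypothetical query against the oci_computeagent namespace.
const cpuQuery = {
  namespace: 'oci_computeagent',
  query: buildMql({
    metric: 'CpuUtilization',
    interval: '1m',
    dimensions: { resourceDisplayName: 'web-01' },
    statistic: 'mean',
  }),
};
console.log(cpuQuery.query); // CpuUtilization[1m]{resourceDisplayName = "web-01"}.mean()
```

Keeping the namespace next to the query string mirrors how the query builder pairs a namespace selection with each query.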

EBS Observability

Complete monitoring for Oracle E-Business Suite on OCI

Presentation Tier
Apache / OHS
Forms Server
oacore (OA Framework)
Application Tier
Concurrent Manager
Workflow Engine
PL/SQL APIs
Database Tier
EBS Database
CPU: 34% IO: 12ms Sessions: 247

Integrated Observability Stack

Stack Monitoring

Auto-discover and monitor the complete EBS stack with pre-built dashboards

Auto Discovery Topology View Alerts

Log Analytics

Collect and analyze EBS application logs, concurrent manager logs, and database alert logs

300+ Parsers ML Clustering Dashboards

APM for EBS

End-to-end transaction tracing from browser to database with RUEI integration

Real User Monitoring Traces RUEI

Database Management

Performance Hub, SQL Tuning, and fleet-wide database management

AWR Analysis SQL Tuning Fleet View

Try the LiveLabs Workshop

Hands-on experience with EBS Observability using OCI Stack Monitoring and Log Analytics

Launch Workshop

The Connected Enterprise

Observability for Oracle Fusion Applications and Integration Cloud

⚠️

Customer Crisis

"A customer's order vanished somewhere between Fusion ERP and our Order Management system. We have no idea where it failed."

Order-to-Cash Flow Visualization

Fusion ERP 1,247 orders
OIC 1,245 processed 2 failed
Order DB 1,243 stored 2 rejected
Log Analytics Insight

Root Cause Identified: Order #ORD-2024-78432 failed at the OIC integration step because the JSON payload failed validation (the required "shipping_address" field was null).

ERROR [IntegrationFlow] OrderSync: ValidationException - Required field 'shipping_address' is null for order ORD-2024-78432
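To show how a log line like the one above becomes searchable fields, here is a minimal parsing sketch. The regular expression and the field names (level, component, flow, exception, field, orderId) are assumptions made for this example; they are not the parser definition Log Analytics ships.

```javascript
// Matches lines shaped like:
// LEVEL [Component] Flow: Exception - Required field 'name' is null for order ID
const VALIDATION_LINE =
  /^(\w+) \[(\w+)\] (\w+): (\w+) - Required field '([^']+)' is null for order (\S+)$/;

function parseValidationError(line) {
  const match = VALIDATION_LINE.exec(line);
  if (!match) return null; // line has a different shape
  const [, level, component, flow, exception, field, orderId] = match;
  return { level, component, flow, exception, field, orderId };
}

const parsed = parseValidationError(
  "ERROR [IntegrationFlow] OrderSync: ValidationException - Required field 'shipping_address' is null for order ORD-2024-78432"
);
console.log(parsed.orderId, parsed.field); // ORD-2024-78432 shipping_address
```

Once the order ID and the failing field are separate values, the same order can be followed across the Fusion, OIC, and database stages.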
🔒

Security Alert

"A Payroll Admin suddenly downloaded 5,000 employee records at 3 AM. Was it a hack or a mistake?"

User Activity Timeline

Time axis: 12 AM to 12 PM
HCM Access: 5,000 records accessed
Data Egress: 847 MB transferred
VCN Flow: Unusual outbound traffic

Correlated Evidence

Fusion Audit Logs: User payroll_admin_jsmith accessed the HCM_EMPLOYEES table
VCN Flow Logs: Unusual data egress to external IP 203.0.113.42
Log Analytics ML: Classified as "Anomalous Access Pattern" (confidence: 94%)

Operations Alert

"Month-End Close is stalled. Everyone is blaming the Database, but the DBAs say it's fine."

ESS Job Schedule Heatmap

Jobs: GL Post, AP Aging, AR Aging, BIP Report (STUCK), Inv Close, FA Depr, GL Consol, Report 1, Report 2, PO Close, SO Sync, Data Load

Job Details: BIP_FINANCIAL_CLOSE_REPORT

Status: WAITING
Waiting For: External File (FTP)
Expected File: /incoming/bank_reconciliation_jan2024.csv
Root Cause: OIC Integration flow failed - FTP server unreachable
[2024-01-31 02:34:17] BIP_REPORT: Waiting for input file...
[2024-01-31 02:34:17] OIC_FTP_ADAPTER: Connection timeout to ftp.partner-bank.com:21
[2024-01-31 02:34:17] OIC_FTP_ADAPTER: Retry 1/3 failed
[2024-01-31 02:34:47] OIC_FTP_ADAPTER: Retry 2/3 failed
[2024-01-31 02:35:17] OIC_FTP_ADAPTER: Retry 3/3 failed - giving up
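A short sketch of how the retry cadence can be read out of log lines like these. The line format is taken from the excerpt; treating the timestamps as UTC and the helper names are assumptions for this example.

```javascript
const LOG_LINE = /^\[(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})\] (\w+): (.*)$/;

function parseLine(line) {
  const match = LOG_LINE.exec(line);
  if (!match) return null;
  const [, date, time, component, message] = match;
  // Assumption: the log shows no time zone, so timestamps are read as UTC.
  return { at: new Date(`${date}T${time}Z`), component, message };
}

// Returns the gaps, in seconds, between consecutive failed retries.
function retryIntervalsSeconds(lines) {
  const retries = lines
    .map(parseLine)
    .filter((entry) => entry && /^Retry \d+\/\d+ failed/.test(entry.message));
  const gaps = [];
  for (let i = 1; i < retries.length; i += 1) {
    gaps.push((retries[i].at - retries[i - 1].at) / 1000);
  }
  return gaps;
}

const ftpLog = [
  '[2024-01-31 02:34:17] BIP_REPORT: Waiting for input file...',
  '[2024-01-31 02:34:17] OIC_FTP_ADAPTER: Connection timeout to ftp.partner-bank.com:21',
  '[2024-01-31 02:34:17] OIC_FTP_ADAPTER: Retry 1/3 failed',
  '[2024-01-31 02:34:47] OIC_FTP_ADAPTER: Retry 2/3 failed',
  '[2024-01-31 02:35:17] OIC_FTP_ADAPTER: Retry 3/3 failed - giving up',
];
console.log(retryIntervalsSeconds(ftpLog)); // [ 30, 30 ]
```

The adapter gave up one minute after the first failed retry, which is why the BIP report is still waiting on its input file.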

Integrations & Deep Dives

One place for Prometheus, Fusion, OIC, MCP, LoganAI, and third-party incident workflows.

Deep Dive Hub

Design the path from OCI telemetry to action.

Use this module when the question is not “which OCI service exists?” but “how do I wire the platform into the rest of the estate?” It brings together ingestion, packaged-app monitoring, AI copilots, and incident-response handoffs.

Primary Paths: 4 integration lanes

Telemetry ingestion, enterprise apps, AI copilots, and incident escalation.

Deep Dive Outcome: Faster pivots

Move from metric spike to parser, trace, or responder handoff without hunting across modules.

Choose Your Starting Surface

If the team knows the symptom but not the OCI service, start here and then pivot into the deeper lanes below.

Telemetry Ingestion

Prometheus → OCI Log Analytics

Use Management Agent collection or remote-write pipelines to get Prometheus data into the OCI observability plane, then enrich that data with LoganAI, clustering, and incident correlations.

Prometheus Management Agent Log Analytics Monitoring
Best for OKE, app runtimes, and custom service SLOs that need shared dashboards.
Pair with Monitoring alarms for thresholds and Log Analytics for root-cause drilldowns.
Enterprise Applications

Fusion, OIC, and packaged-app observability

Bring together Fusion audit/activity data, OIC activity streams, Log Analytics parsers, APM browser/server telemetry, and database insights to debug business flows instead of isolated services.

Fusion OIC APM Log Analytics
Start in Fusion & OIC for storyline-driven troubleshooting, then use this lane for the concrete integration patterns.
Use custom parsers and activity APIs where out-of-box telemetry is incomplete.
AI Copilots

MCP, LoganAI, and LLM-assisted investigations

Expose observability domains through MCP so copilots can query live logs, metrics, AWR, and Cloud Guard context instead of relying on pre-baked answers.

User Query → LLM Client → MCP Server → OCI APIs
Use LoganAI for natural-language log exploration and MCP for cross-system orchestration.
Keep the AI module for demo conversations; use this lane for concrete implementation choices.
Incident Response

Notifications, Slack, PagerDuty, and responders

Close the loop by sending OCI Monitoring or Log Analytics signals into the systems teams already use for triage, escalation, and remediation.

Monitoring Notifications Slack / PagerDuty Functions
Use Notifications for fan-out, Functions for enrichment, and Cloud Guard responders for automatic containment.
Keep escalation metadata consistent so on-call can pivot back into OCI without context loss.
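The enrichment step in this lane could look like the sketch below, which turns an alarm notification into a Slack webhook payload. The field names on the incoming alarm object (severity, title, body, consoleUrl) are assumptions for illustration and should be checked against the actual Notifications message format; the { text } shape is the basic payload Slack incoming webhooks accept.

```javascript
// Convert a (hypothetical) alarm message into a Slack incoming-webhook payload.
function toSlackMessage(alarm) {
  const severity = alarm.severity || 'UNKNOWN';
  const title = alarm.title || 'Untitled alarm';
  const lines = [`[${severity}] ${title}`];
  if (alarm.body) lines.push(alarm.body);
  // consoleUrl is a hypothetical field carrying a link back into OCI.
  if (alarm.consoleUrl) lines.push(`Open in OCI: ${alarm.consoleUrl}`);
  return { text: lines.join('\n') };
}

const slackPayload = toSlackMessage({
  severity: 'CRITICAL',
  title: 'High CPU on checkout tier',
  body: 'CpuUtilization above 90% for 5 minutes',
});
console.log(slackPayload.text);
```

Carrying the severity and a link back into OCI in every message is one way to keep the escalation metadata consistent for on-call.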

Reference Blueprints

Choose the integration shape that matches your operating model.

Platform Team

Central observability hub

Prometheus, service logs, Flow Logs, and Data Safe feed a shared Log Analytics tenancy. Monitoring drives alerts. MCP copilots read from curated observability APIs.

Log Analytics Monitoring MCP
Enterprise Apps

Fusion + OIC investigation lane

Business-process issues start in Fusion or OIC, pivot into parsers, trace evidence, and Database Management, and finish with a handoff to the application team or partner endpoint owner.

Fusion OIC DB Mgmt
On-call Operations

Alert-to-remediation loop

Monitoring alarms fan out through Notifications, Slack, or PagerDuty. Incident responders use LoganAI and MCP-backed copilots to summarize impact and trigger automations.

Notifications Slack PagerDuty

Recommended MCP Surfaces

Prioritize the server contracts that give users the highest leverage during investigation and remediation.

Log Analytics MCP

Run searches, summarize clusters, and pivot into anomaly investigations.

query_logs cluster_patterns summarize_window

Database Insight MCP

Expose AWR, SQL tuning, and performance-hub summaries to copilots.

get_awr_report analyze_sql list_wait_events

Fusion / OIC MCP

Normalize packaged-app evidence so business incidents can be explained in plain language.

list_failed_flows fetch_activity_logs trace_business_order

Responder MCP

Bridge observability signals into notifications, runbooks, and remediation workflows.

create_incident notify_channel trigger_function
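For orientation, an MCP client invokes a tool such as query_logs by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds that request. The tool name comes from the list above, while the argument names (query, timeRange) are hypothetical and depend on how a given server defines its tool schema.

```javascript
// Build a JSON-RPC 2.0 request for an MCP tool invocation.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical arguments; a real server publishes its own input schema.
const toolRequest = buildToolCall(1, 'query_logs', {
  query: "'Log Source' = 'OIC Activity' | stats count by Status",
  timeRange: 'last_1h',
});
console.log(JSON.stringify(toolRequest, null, 2));
```

Keeping the tool names stable across servers lets a copilot move between the Log Analytics, database, and responder surfaces with the same calling convention.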

One Tool to Rule Them All

AI-powered log analysis for any source, any format, any scale

2,847 log records 12 clusters identified 3 anomalies detected
847 CrashLoopBackOff
523 OOMKilled
412 ImagePullBackOff
289 Liveness probe failed
198 Connection refused
156 Pod scheduled

Cluster: CrashLoopBackOff (847 records)

Common Pattern: Back-off restarting failed container * in pod *
14:32:17 kube-system/coredns-7d89d9b847-x2jkl Back-off restarting failed container coredns in pod coredns-7d89d9b847-x2jkl
14:32:45 app-namespace/api-server-5f8c9d7b6-abc12 Back-off restarting failed container api-server in pod api-server-5f8c9d7b6-abc12
AI Insight

These CrashLoopBackOff errors correlate with a memory limit change deployed at 14:15. Consider reverting the change or raising the pod memory limits.
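The "Common Pattern" line above can be reproduced with a very small token comparison: positions where every message agrees are kept, and positions that vary become a wildcard. This is a simplification for illustration under the assumption that all messages have the same number of tokens; it is not the clustering algorithm Log Analytics uses.

```javascript
// Derive a wildcard pattern from messages that share the same token count.
function commonPattern(messages) {
  if (messages.length === 0) return null;
  const tokenized = messages.map((m) => m.trim().split(/\s+/));
  const first = tokenized[0];
  if (!tokenized.every((tokens) => tokens.length === first.length)) return null;
  return first
    .map((token, i) => (tokenized.every((tokens) => tokens[i] === token) ? token : '*'))
    .join(' ');
}

const crashMessages = [
  'Back-off restarting failed container coredns in pod coredns-7d89d9b847-x2jkl',
  'Back-off restarting failed container api-server in pod api-server-5f8c9d7b6-abc12',
];
console.log(commonPattern(crashMessages));
// Back-off restarting failed container * in pod *
```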

ML-Powered Clustering

Automatically group similar log patterns without manual regex

Anomaly Detection

Spot unusual patterns and deviations from baseline behavior

Natural Language Queries

Ask questions in plain English, get actionable results

300+ Pre-built Parsers

Out-of-the-box support for OCI, Linux, databases, and more

LoganAI

AI-Powered Log Intelligence

LoganAI enables you to analyze logs using AI for faster investigation and troubleshooting.

📝

AI-Powered Summarization

Get concise summaries of large log datasets in plain language, dramatically reducing time-to-insight.

Actionable Follow-up Questions

AI suggests relevant follow-up queries based on your current investigation context.

💡

User-Friendly Explanations

Complex log patterns explained in simple terms for faster understanding.

🔗

Correlation & Context

Automatically correlate related events across multiple log sources.

ML Analysis Capabilities

Unsupervised ML automatically categorizes logs into actionable insights

⚠️ Potential Issues 76

Clusters automatically identified as potential problems requiring attention

🔴 Outliers 12

One-time events that indicate anomalies in the system

Cluster patterns correlated in time showing emerging issues

500+ Functional Checks With fix guidance for packaged apps
250+ Pre-built Sources Customizable for seamless operations
200+ Analytics Widgets Built-in dashboards for visualization

Log Analytics Use Cases

🔍

Security Investigation

Correlate audit logs, VCN flow logs, and application logs to investigate security incidents.

Audit Logs → Flow Logs → Correlation
🐛

Application Debugging

Use ML clustering to identify error patterns across distributed microservices.

Error Logs → Clustering → Root Cause
📊

Compliance Reporting

Generate compliance reports from centralized log data with scheduled searches.

Log Ingestion → Saved Searches → Reports
🚀

Performance Analysis

Analyze application and infrastructure logs to identify performance bottlenecks.

App Logs → Baseline → Anomaly Detection

The Digital Experience

End-to-end visibility from browser click to database query

🔄

"The Spinning Wheel of Death"

Users complain checkout is slow, but server CPU is only 20%

💻

"It works on my machine"

Devs can't reproduce errors that users in Brazil are seeing

Real User Monitoring - Global View

Configure RUM
North America: 124ms avg
Europe: 89ms avg
Asia: 312ms avg
Brazil: 4.2s avg
Australia: 198ms avg
< 200ms
200-500ms
> 500ms

Brazil - Performance Analysis

Avg Load Time 4.2s
Active Sessions 1,247
Error Rate 8.3%
Top Issue JavaScript Error
Issue: JavaScript Error on Checkout Button
Uncaught TypeError: Cannot read property 'submit' of undefined at checkout.min.js:847

This client-side error affects 73% of Brazilian users due to a CDN caching issue with the new checkout script.
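The color legend on the global view (under 200ms, 200-500ms, over 500ms) amounts to a simple bucketing rule. The sketch below applies it to the regional averages shown above; the bucket names fast, moderate, and slow are labels invented for this example.

```javascript
// Map an average load time in milliseconds to a legend bucket.
function latencyBucket(avgMs) {
  if (avgMs < 200) return 'fast';
  if (avgMs <= 500) return 'moderate';
  return 'slow';
}

// Regional averages from the global view, in milliseconds.
const regionAverages = {
  'North America': 124,
  Europe: 89,
  Asia: 312,
  Brazil: 4200,
  Australia: 198,
};

const slowRegions = Object.entries(regionAverages)
  .filter(([, avgMs]) => latencyBucket(avgMs) === 'slow')
  .map(([region]) => region);
console.log(slowRegions); // [ 'Brazil' ]
```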

Custom Attributes for RUM

Enrich your Real User Monitoring data with custom dimensions and metrics for deeper analysis.


Text Attributes (Dimensions)

Filter and group by user properties, page categories, or custom tags

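// Create the apmrum namespace first if the page has not defined it yet.
window.apmrum = window.apmrum || {};
window.apmrum.udfAttributes = [
  { name: 'userId', value: 'user@example.com' },
  { name: 'pageCategory', value: 'checkout' }
];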

Numeric Attributes (Metrics)

Aggregate and alert on custom numeric values like cart totals or item counts

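// Create the apmrum namespace first if the page has not defined it yet.
window.apmrum = window.apmrum || {};
window.apmrum.udfAttributes = [
  { name: 'cartTotal', value: 149.99 },
  { name: 'itemCount', value: 3 }
];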

Dynamic Attributes (Functions)

Execute functions on each span to capture real-time state

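// Create the apmrum namespace first if the page has not defined it yet.
window.apmrum = window.apmrum || {};
// getCartCount() is an application function that is called for each span.
window.apmrum.udfAttributes = [
  { name: 'cartItems', value: () => getCartCount() }
];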

Attribute Limits

Max attributes: 50
Max name length: 50
Max value length: 500
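A pre-flight check against these limits might look like the sketch below. It assumes the name and value limits are counted in characters and that function-valued attributes are measured by the string form of what they return; both are assumptions, as is the helper name validateAttributes.

```javascript
const ATTRIBUTE_LIMITS = { maxAttributes: 50, maxNameLength: 50, maxValueLength: 500 };

// Returns a list of human-readable problems; an empty list means the input fits.
function validateAttributes(attributes) {
  const problems = [];
  if (attributes.length > ATTRIBUTE_LIMITS.maxAttributes) {
    problems.push(`too many attributes: ${attributes.length}`);
  }
  for (const { name, value } of attributes) {
    if (name.length > ATTRIBUTE_LIMITS.maxNameLength) {
      problems.push(`name too long: ${name.slice(0, 20)}...`);
    }
    // Resolve dynamic attributes before measuring them.
    const resolved = typeof value === 'function' ? value() : value;
    if (String(resolved).length > ATTRIBUTE_LIMITS.maxValueLength) {
      problems.push(`value too long for attribute: ${name.slice(0, 20)}`);
    }
  }
  return problems;
}

console.log(validateAttributes([
  { name: 'pageCategory', value: 'checkout' },
  { name: 'cartTotal', value: 149.99 },
])); // []
```

Running a check like this in development catches oversized attributes before they reach the browser agent.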

Distributed Tracing - Transaction Waterfall

Span durations (time axis 0ms to 3000ms):
Browser: 50ms
Load Balancer: 10ms
Front-End API: 200ms
Payment Gateway: 3000ms (SLOW)
💡

Root Cause: The slowness is caused by the 3rd party payment provider, not your code. The Payment Gateway service is taking 3 seconds to respond.
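The conclusion above follows from comparing span durations. The sketch below finds the dominant span and its share of the total, assuming the four durations in the waterfall are non-overlapping self-times (an assumption, since the waterfall does not say whether spans nest).

```javascript
// Span durations from the waterfall, in milliseconds.
const waterfallSpans = [
  { service: 'Browser', durationMs: 50 },
  { service: 'Load Balancer', durationMs: 10 },
  { service: 'Front-End API', durationMs: 200 },
  { service: 'Payment Gateway', durationMs: 3000 },
];

// Returns the slowest span and its rounded share of the summed duration.
function slowestSpan(spans) {
  const totalMs = spans.reduce((sum, span) => sum + span.durationMs, 0);
  const slowest = spans.reduce((a, b) => (b.durationMs > a.durationMs ? b : a));
  return {
    service: slowest.service,
    sharePercent: Math.round((slowest.durationMs / totalMs) * 100),
  };
}

console.log(slowestSpan(waterfallSpans));
// { service: 'Payment Gateway', sharePercent: 92 }
```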

Synthetic Monitoring - Availability Monitoring

Documentation

Proactively detect availability and performance issues before end users experience them using 8 different monitor types.

🌐

Browser

Monitor specific URLs and page load times

📜

Scripted Browser

Record and replay user journeys with multi-step transactions

REST API

Test REST endpoints with custom headers and payloads

🔌

Network

Monitor server/host availability via ping and port checks

🔍

DNS

DNS Server, DNS Trace, and DNSSEC validation

🗃️

SQL

Measure SQL query performance against Oracle databases

📁

FTP

Measure FTP operation performance and latency

🔗

Scripted REST

Multi-step API transaction workflows

Global Vantage Points - Live Status

🤖
🇺🇸 Virginia ✓ 234ms 2 min ago
🇩🇪 Frankfurt ✓ 189ms 2 min ago
🇯🇵 Tokyo ⚠ 523ms 2 min ago
🇧🇷 São Paulo ✗ Failed 2 min ago

A scripted browser monitor runs a login test every 5 minutes from global vantage points. Failures trigger immediate alerts.

Code-Level Diagnostics

Stack trace for Trace #TR-2024-0131-4521 (captured at 14:32:17 UTC)
com.example.checkout.PaymentService.processPayment (PaymentService.java:247)
com.example.checkout.PaymentGateway.submitTransaction (PaymentGateway.java:89)
com.example.http.HttpClient.post (HttpClient.java:156)
Captured SQL: SELECT * FROM payment_tokens WHERE user_id = ? AND status = 'ACTIVE' ORDER BY created_at DESC LIMIT 1
Execution: 2847ms

APM Data Sources & Integrations

View All Sources

Collect traces and metrics from multiple sources including language agents, open standards, and APIs.

OCI APM Agents

Java Agent: JVMs, application servers, Spring Boot
.NET Agent: Windows & Linux, ASP.NET Core
Browser Agent: Real User Monitoring for web apps

Open Standards

OpenTelemetry: OTLP protocol support
Jaeger: Jaeger client libraries
Zipkin: Zipkin-compatible tracers

Additional Sources

Trace Explorer API: Direct trace data uploads
Synthetic Monitors: Automated availability tests
OpenTracing: Java Tracer with JVM metrics

APM Use Cases

🔍

Root Cause Analysis

Drill down from slow page loads to the exact database query or API call causing issues.

Browser → API → Service → Database
🌍

Geographic Performance

Identify regional performance disparities with RUM and optimize CDN configuration.

RUM → Region Analysis → CDN Tuning

Proactive Monitoring

Detect outages before users report them with synthetic monitors running 24/7.

Synthetic → Alert → PagerDuty
📈

Release Validation

Compare performance metrics before and after deployments to catch regressions.

Deploy → Baseline → Compare
🔗

Microservices Debugging

Trace requests across dozens of services with distributed tracing and service maps.

Request → Trace → Dependencies
💰

SLA Monitoring

Track response times and error rates against SLA commitments with custom dashboards.

Metrics → Dashboard → Reports
Capacity Planning & Analytics

Operations Insights

Gain deep visibility into your database and host fleet performance. Operations Insights provides capacity planning, resource utilization analysis, and SQL analytics to help you optimize costs and ensure performance.

Documentation Try LiveLabs

Operations Insights

Capacity planning and resource analytics for databases and hosts

Capacity Planning

Forecast CPU, memory, and storage needs up to 30 days ahead with ML-powered predictions.

SQL Warehouse

Analyze SQL execution patterns across your entire database fleet in a centralized repository.

Exadata Insights

Deep visibility into Exadata systems including storage, compute, and network performance.

Host Insights

Monitor compute resource utilization patterns across your infrastructure fleet.

Resource Utilization Dashboards

Visualize CPU, memory, I/O and storage utilization trends across your database and host fleet.

Trend Analysis Heat Maps Top-N Reports

Demand Forecasting

Predict future resource requirements using machine learning based on historical patterns.

ML Predictions 30-Day Forecast Alerts

SQL Analytics

Identify problematic SQL statements and optimization opportunities across your database fleet.

SQL Insights Plan Changes Degradation Alerts

AWR Hub

Consolidate AWR data from multiple databases for centralized performance analysis.

Centralized AWR Cross-DB Analysis Long-term Storage

Capacity Planning Dashboard

ML-Powered Forecasting
Database Fleet CPU: 67% (↑ 12% vs last week)
Memory Utilization: 72% (→ Stable)
Storage Growth: 2.4 TB (↑ 340 GB/month)
Days Until Full: 47 (↓ Better than expected)
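A "days until full" figure can be approximated with a straight-line projection, shown below. This is a deliberately simple sketch: the forecast on the dashboard is described as ML-powered, so its number need not match a linear estimate. The 3,000 GB capacity is a hypothetical value, and 2.4 TB is read here as 2,400 GB.

```javascript
// Linear estimate of days remaining before storage is exhausted.
function daysUntilFull(capacityGb, usedGb, growthGbPerMonth, daysPerMonth = 30) {
  if (growthGbPerMonth <= 0) return Infinity; // no growth, never fills
  const freeGb = capacityGb - usedGb;
  const growthGbPerDay = growthGbPerMonth / daysPerMonth;
  return Math.max(0, Math.floor(freeGb / growthGbPerDay));
}

// Hypothetical 3,000 GB capacity, 2,400 GB used, growing 340 GB per month.
console.log(daysUntilFull(3000, 2400, 340)); // 52
```

A projection like this is mainly useful as a sanity check next to the forecast, for example to spot when growth has accelerated beyond the recent trend.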

SQL Warehouse

Centralized SQL analysis across your entire database fleet - identify problematic queries and optimization opportunities.

SQL Statements: 847K
Degraded SQLs: 156
Plan Changes: 23
Databases: 12

SQL ID          Database      Avg Elapsed (ms)   Executions
9xkz4wm7c2n3p   PROD_ERP_01   4,523              12,847
7fnh2kp9v4m8q   PROD_CRM_02   2,891              8,234
3bmr6yt1x9k2w   PROD_HR_01    1,756              24,891

Exadata Insights

Enterprise Feature

💾 Storage Server

Flash Cache Hit Ratio 98.7%
IOPS 45.2K
Throughput 3.2 GB/s
Smart Scan Offload 87%

🖥️ Compute Node

CPU Utilization 78%
Memory Used 512 GB
Active Sessions 234
DB Time 1,247s

🌐 InfiniBand Network

Network Bandwidth 12.4 GB/s
RDMA Latency 0.8 μs
Port Health All OK
Errors 0

Pro Tip: Enable OPSI for Cost Optimization

Operations Insights can help identify underutilized resources and right-sizing opportunities, potentially saving significant infrastructure costs. Enable it on your production databases to get actionable recommendations.

Fleet Management & Tuning

Database Management

Monitor and manage your Oracle Database fleet with a unified cloud service. Get Performance Hub diagnostics, automated tuning recommendations, and proactive alerting for your on-premises and cloud databases.

Documentation Try LiveLabs

Database Management

Unified monitoring and management for your Oracle Database fleet

Performance Hub

Real-time and historical performance analysis with ASH, AWR, and SQL monitoring in one place.

Fleet Summary

Single pane of glass view across all your managed databases with health status and alerts.

SQL Tuning

Get automated SQL tuning recommendations and implement fixes with a single click.

Job Scheduling

Schedule and monitor database jobs across your fleet with centralized management.

Managed Database Groups

Organize databases into logical groups for easier management and bulk operations.

Grouping Bulk Actions Policies

Proactive Alerting

Get notified about performance issues, space problems, and security concerns before they impact users.

Threshold Alerts Anomaly Detection Notifications

Tablespace Management

Monitor tablespace usage and get recommendations for optimization and space reclamation.

Usage Tracking Growth Trends Alerts

Optimizer Statistics

Manage optimizer statistics gathering and staleness across your database fleet.

Auto Gathering Staleness Check Recommendations

Supported Database Types

Database Management supports Oracle Autonomous Databases, Base DB Systems, Exadata Cloud Service, and on-premises Oracle Databases (11g, 12c, 18c, 19c, 21c, 23ai). Connect on-premises databases using Management Gateway or direct registration.

Performance Hub

PROD_ERP_01 (19c)
CPU
User I/O
Application Wait
Other

DB CPU: 42.3% (↓ 8% from baseline)
User I/O: 28.1% (↑ 12% from baseline)
Application Wait: 18.4% (↑ 5% from baseline)
Concurrency: 11.2% (→ Stable)

SQL Tuning Advisor

Automated recommendations to optimize SQL performance based on execution analysis.

📊

Create Index on ORDERS.CUSTOMER_ID

SQL ID: 7fnh2kp9v4m8q performs full table scan on ORDERS. Creating an index would reduce execution time by ~85%.

↓ 85% Response Time
🔄

SQL Profile Available

SQL ID: 3bmr6yt1x9k2w has a suboptimal execution plan. Accepting the SQL Profile would improve cardinality estimates.

↓ 62% CPU Usage
📈

Gather Statistics on INVENTORY

Table INVENTORY has stale statistics (last gathered 45 days ago). Fresh statistics would improve optimizer decisions.

↓ 40% Plan Changes

Fleet Overview

12 Databases · 3 Alerts

PROD_ERP_01: Healthy (CPU 42%, Memory 67%, Storage 78%)
PROD_CRM_02: Warning (CPU 78%, Memory 85%, Storage 62%)
PROD_HR_01: Healthy (CPU 23%, Memory 45%, Storage 34%)
DEV_TEST_01: Critical (CPU 92%, Memory 94%, Storage 89%)

AWR Management

Automatic Workload Repository for historical performance analysis and diagnostics.

📸

Snapshot Management

By default, AWR takes a snapshot every 60 minutes and retains snapshots for 8 days. Configure custom intervals and retention periods.

📋

AWR Reports

Generate HTML/text AWR reports for any time range. Compare periods to identify performance changes.

🔍

ASH Analytics

Active Session History analysis for real-time and historical session activity visualization.

Performance Metrics: 100+
Data Granularity: 5 min
Historical Retention: 31 days
Basic Tier: Free

The Future Interface

Conversational observability powered by OCI GenAI and MCP

ℹ️
Interactive Demo

This is a demonstration with pre-built responses showcasing real-world troubleshooting scenarios. The examples below represent actual use cases that can be implemented using OCI GenAI and MCP servers for your observability workflows.

Hello! I'm your OCI Observability AI Assistant. I can help you analyze logs, investigate incidents, and provide insights across your entire observability stack. What would you like to explore?

Why is my EBS system running slow today?

I've analyzed your EBS environment across multiple data sources. Here's what I found:

🔴
Database Contention Detected

The GL_BALANCES table has 47 concurrent sessions waiting on row locks. This correlates with the month-end GL posting job that started at 08:00.

🟡
Concurrent Manager Queue Depth

Standard Manager queue has 234 pending requests vs. typical 45. Consider increasing parallelism for FNDCM.

🔵
Forms Session Spike

Active Forms sessions: 847 (normal: ~400). Likely due to end-of-month user activity.

Recommendation: The primary bottleneck is the GL posting job. I suggest pausing non-critical concurrent requests until the posting completes, or consider partitioning the GL_BALANCES table for future relief.

Real-World Troubleshooting Scenarios

These examples demonstrate how AI-powered observability can accelerate incident resolution across different OCI services.

🗄️ Database Management
"Why is my Autonomous Database running slow queries?"

AI Analysis: Detected 3 SQL statements consuming 78% of DB time. Top offender: SELECT * FROM ORDERS WHERE status='pending' - missing index on status column. AWR shows buffer cache hit ratio dropped to 67%.

Recommended Actions: Create index on ORDERS(status), increase SGA_TARGET, consider query rewrite with pagination.

Database Management Performance Hub AWR
🔐 Security Investigation
"Show me failed SSH login attempts from the last 24 hours"

AI Analysis: Detected 2,847 failed SSH attempts across 12 hosts. 94% originated from 3 IP ranges (185.x.x.x, 45.x.x.x, 91.x.x.x). Peak activity: 03:00-05:00 UTC. Target: root account (89%), admin (11%).

Recommended Actions: Block identified IP ranges via NSG, enforce key-based auth, consider fail2ban or Cloud Guard responder.

Log Analytics Cloud Guard Security Zones
APM & Tracing
"Users report checkout page is timing out"

AI Analysis: Trace analysis shows /api/checkout avg response time: 12.4s (SLA: 2s). Root cause: payment-service → external-gateway call timing out. 3rd party API latency increased from 200ms to 8s starting 14:30 UTC.

Recommended Actions: Implement circuit breaker pattern, add timeout with fallback, contact payment provider about degradation.

APM Distributed Tracing Synthetic Monitoring
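The circuit breaker recommended above can be sketched as a small state holder that the calling code consults before each request to the payment provider. The class name, the thresholds, and the injected clock are all choices made for this example, not part of any OCI SDK.

```javascript
// Minimal circuit breaker: opens after repeated failures, allows a trial
// request again once resetAfterMs has elapsed.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 30000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  allowRequest() {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.resetAfterMs;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}

// Demonstration with a fake clock so the behavior is deterministic.
let fakeClockMs = 0;
const breaker = new CircuitBreaker({ failureThreshold: 2, resetAfterMs: 1000, now: () => fakeClockMs });
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.allowRequest()); // false
fakeClockMs = 1000;
console.log(breaker.allowRequest()); // true
```

In the checkout path, the caller would skip the gateway call and return a fallback response whenever allowRequest() is false, then report the outcome of each real call with recordSuccess() or recordFailure().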
☸️ Kubernetes (OKE)
"Why are my pods constantly restarting in production?"

AI Analysis: Pods in prod/api-deployment restarted 47 times in 2 hours. Cause: OOMKilled (memory limit 512Mi, actual usage peaks at 680Mi). Memory leak pattern detected in heap dumps correlating with connection pool growth.

Recommended Actions: Increase memory limit to 1Gi, investigate connection pool leak in API service, enable HPA for automatic scaling.

Log Analytics Monitoring Stack Monitoring
🌐 Network & Connectivity
"Application can't connect to database after VCN changes"

AI Analysis: VCN Flow Logs show REJECT for TCP/1521 from subnet-app (10.0.1.0/24) to subnet-db (10.0.2.0/24). Security List sl-app-subnet missing egress rule for DB port. Change made by user@example.com at 09:15 UTC.

Recommended Actions: Add egress rule to sl-app-subnet allowing TCP/1521 to 10.0.2.0/24, or use NSG for more granular control.

VCN Flow Logs Log Analytics Audit Logs
👥 Fusion Applications
"Payroll processing jobs are failing in Fusion HCM"

AI Analysis: ESS job PYUGEN failed for 3 consecutive runs. Error: "Unable to acquire lock on PAY_PAYROLL_ACTIONS". Concurrent job ARCHIVE_PAYMENTS holding lock for 4+ hours. Data volume increased 340% due to new hire batch.

Recommended Actions: Terminate stuck archive job, increase UCM storage allocation, schedule archive jobs during off-peak, partition payroll tables by period.

Log Analytics Fusion HCM Database Management
🔗 Integration Cloud (OIC)
"Why are my OIC integrations failing with timeout errors?"

AI Analysis: Integration SYNC_ORDERS_TO_ERP failed 156 times today. 89% timeout on REST invoke to on-prem endpoint. Network latency to FastConnect increased from 15ms to 890ms. BGP route flapping detected on customer edge router.

Recommended Actions: Check FastConnect partner status, increase OIC connection timeout to 120s, implement async pattern with callback for large payloads.

OIC Log Analytics FastConnect
🛡️ Data Safe & Compliance
"Show me unusual database activity that might indicate a breach"

AI Analysis: Anomaly detected: User APP_SERVICE exported 2.3M rows from CUSTOMERS table at 02:14 UTC (normal: <1000 rows/day). Same user accessed 47 tables in HR schema they never accessed before. Login from new IP: 203.x.x.x (geolocation: unexpected country).

Recommended Actions: Immediately revoke APP_SERVICE credentials, review exported data scope, enable Data Safe SQL Firewall, investigate compromised credentials source.

Data Safe Activity Auditing Cloud Guard

Model Context Protocol (MCP) Integration

Connect your AI assistant to live OCI data through secure, standardized protocols.

User Query → LLM + MCP Client → MCP Servers → OCI APIs

Log Analytics MCP

Query logs, run ML analysis, detect anomalies

logan_execute_query logan_detect_anomalies

Database Observatory MCP

AWR reports, SQL tuning, performance analysis

get_awr_report analyze_sql

Security MCP

Cloud Guard problems, vulnerability scans

list_problems get_security_score

OCI Core MCP

Compute, networking, observability metrics

list_instances get_metrics
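On the client side, the four servers above imply a routing step: given a tool name, decide which server should receive the call. The lookup below is a minimal sketch of that idea using the tool names listed on this page; the function name serverFor is hypothetical, and a real MCP client would normally discover tools from each server rather than hard-code them.

```javascript
// Tool-to-server map built from the tool names listed above.
const TOOL_SERVERS = {
  logan_execute_query: 'Log Analytics MCP',
  logan_detect_anomalies: 'Log Analytics MCP',
  get_awr_report: 'Database Observatory MCP',
  analyze_sql: 'Database Observatory MCP',
  list_problems: 'Security MCP',
  get_security_score: 'Security MCP',
  list_instances: 'OCI Core MCP',
  get_metrics: 'OCI Core MCP',
};

// Returns the server name for a tool, or null when the tool is unknown.
function serverFor(toolName) {
  return Object.prototype.hasOwnProperty.call(TOOL_SERVERS, toolName)
    ? TOOL_SERVERS[toolName]
    : null;
}

console.log(serverFor('get_awr_report')); // Database Observatory MCP
console.log(serverFor('drop_database')); // null
```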