Oracle Database 23ai Property Graph, SQL/PGQ & In-Database Machine Learning — Technical Deep Dive

Oracle Database 23ai Property Graph, SQL/PGQ & In-Database Machine Learning — Technical Deep Dive

  • Article By : Rostan Team
  • Jan 24, 2024
  • Share This:

Oracle Database 23ai brings two underrated capabilities that most teams miss: first-class Property Graph support via the SQL/PGQ standard, and a mature in-database Machine Learning engine (OML) that runs 30+ algorithms natively inside Oracle — no data export to Python, no ETL pipeline, no separate ML platform. This post covers both in technical depth.

Part 1: Property Graphs with SQL/PGQ

What Is SQL/PGQ?

SQL/PGQ (Property Graph Queries in SQL) is an ISO standard extension to SQL:2023. Oracle 23ai is one of the first databases to implement it. Unlike PGQL or Cypher, SQL/PGQ is a first-class SQL extension — graph traversals are SQL clauses, results are SQL result sets, and you can join graph traversal results with relational tables in a single query.

Creating a Property Graph

-- Relational tables (can be existing ERP tables)
CREATE TABLE accounts (
  acc_id   NUMBER PRIMARY KEY,
  name     VARCHAR2(200),
  type     VARCHAR2(50),    -- CUSTOMER, VENDOR, BANK, INTERNAL
  country  VARCHAR2(100)
);

CREATE TABLE transactions (
  txn_id    NUMBER PRIMARY KEY,
  from_acc  NUMBER REFERENCES accounts(acc_id),
  to_acc    NUMBER REFERENCES accounts(acc_id),
  amount    NUMBER(18,2),
  txn_date  DATE,
  txn_type  VARCHAR2(50)   -- PAYMENT, REFUND, TRANSFER
);

-- Define the graph on top of existing relational tables
CREATE PROPERTY GRAPH finance_graph
  VERTEX TABLES (
    accounts
      KEY (acc_id)
      LABEL Account
      PROPERTIES (name, type, country)
  )
  EDGE TABLES (
    transactions
      KEY (txn_id)
      SOURCE KEY (from_acc) REFERENCES accounts (acc_id)
      DESTINATION KEY (to_acc) REFERENCES accounts (acc_id)
      LABEL Transfer
      PROPERTIES (amount, txn_date, txn_type)
  );

The graph is a virtual view over your existing relational data — no data is moved or duplicated.

SQL/PGQ Queries

Find Circular Payment Chains (Fraud Pattern)

-- Find accounts that send and receive money to/from the same counterpart within 7 days
SELECT src.name AS source_account,
       dst.name AS dest_account,
       SUM(e.amount) AS total_cycled
FROM   GRAPH_TABLE (finance_graph
         MATCH (src:Account) -[t1:Transfer]-> (mid:Account) -[t2:Transfer]-> (src)
         WHERE t1.txn_date BETWEEN t2.txn_date - 7 AND t2.txn_date
           AND t1.amount > 100000
         COLUMNS (src.name, dst.name, t1.amount)
       ) AS e,
       accounts src, accounts dst
GROUP BY src.name, dst.name
HAVING SUM(e.amount) > 500000
ORDER BY total_cycled DESC;

Shortest Path Between Entities

-- Find the shortest payment chain between two companies
SELECT path_length, nodes
FROM   GRAPH_TABLE (finance_graph
         MATCH SHORTEST (
           (a:Account WHERE a.name = 'Acme Corp')
           -[t:Transfer]->+
           (b:Account WHERE b.name = 'Overseas LLC')
         )
         COLUMNS (path_length(t) AS path_length,
                  vertex_id(a) AS nodes)
       );

k-Hop Neighbour Analysis

-- All accounts reachable within 3 hops from a flagged account
SELECT DISTINCT acc.name, acc.type, acc.country
FROM   GRAPH_TABLE (finance_graph
         MATCH (flagged:Account WHERE flagged.acc_id = :flagged_id)
               -[t:Transfer]->{1,3}
               (connected:Account)
         COLUMNS (connected.acc_id AS cid)
       ) AS g
JOIN   accounts acc ON acc.acc_id = g.cid;

Part 2: Oracle Machine Learning (OML) In-Database

Why In-Database ML?

  • No data export — the model trains on Oracle data inside the DB engine
  • Inference at query time via SQL — PREDICTION() function in any SELECT
  • 30+ algorithms: classification, regression, clustering, anomaly detection, time series
  • AutoML UI in Oracle Cloud — best algorithm and hyperparameters found automatically

Training a Fraud Detection Model

-- Step 1: Create training settings
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL2(
    model_name          => 'FRAUD_CLASSIFIER',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_query          => 'SELECT txn_id, amount, txn_type, hour_of_day,
                                   day_of_week, country_risk_score, is_fraud
                            FROM   txn_training_view',
    set_list            => DBMS_DATA_MINING.SETTING_LIST(
                             'ALGO_NAME' VALUE DBMS_DATA_MINING.ALGO_GRADIENT_BOOSTING,
                             'TREE_IMPURITY_METRIC' VALUE 'TREE_IMPURITY_GINI',
                             'RFOR_NUM_TREES' VALUE '100',
                             'CLAS_COST_TABLE_NAME' VALUE 'FRAUD_COST_TABLE'
                           ),
    case_id_column_name => 'txn_id',
    target_column_name  => 'is_fraud'
  );
END;
/

Real-Time Inference in SQL

-- Score new transactions at query time — no application code needed
SELECT t.txn_id,
       t.amount,
       t.from_acc,
       t.to_acc,
       PREDICTION(FRAUD_CLASSIFIER USING
         t.amount,
         t.txn_type,
         EXTRACT(HOUR FROM t.txn_date)   AS hour_of_day,
         TO_CHAR(t.txn_date, 'D')       AS day_of_week,
         a.country_risk_score
       )                                   AS predicted_fraud,
       PREDICTION_PROBABILITY(FRAUD_CLASSIFIER, 1 USING
         t.amount, t.txn_type,
         EXTRACT(HOUR FROM t.txn_date)   AS hour_of_day,
         TO_CHAR(t.txn_date, 'D')       AS day_of_week,
         a.country_risk_score
       )                                   AS fraud_probability
FROM   transactions t
JOIN   accounts a ON a.acc_id = t.from_acc
WHERE  t.txn_date > SYSDATE - 1   -- last 24 hours
  AND  t.processed_flag = 'N'
ORDER BY fraud_probability DESC;

Part 3: Combining Graph + ML + Vector Search in One Query

Oracle 23ai's real power is composing all three paradigms in a single SQL statement:

-- Find high-risk transactions that:
-- 1. Are predicted fraudulent by ML model (score > 0.85)
-- 2. Involve accounts within 2 hops of a known bad actor (graph)
-- 3. Have description text semantically similar to known fraud patterns (vector)
SELECT t.txn_id, t.amount, t.from_acc, t.to_acc,
       ml.fraud_prob,
       vd.description_similarity
FROM transactions t
-- ML scoring
JOIN (
  SELECT txn_id,
         PREDICTION_PROBABILITY(FRAUD_CLASSIFIER, 1 USING amount, txn_type) AS fraud_prob
  FROM   transactions
  WHERE  txn_date > SYSDATE - 1
) ml ON ml.txn_id = t.txn_id AND ml.fraud_prob > 0.85
-- Graph: within 2 hops of flagged account
JOIN (
  SELECT DISTINCT g.cid AS connected_acc
  FROM GRAPH_TABLE (finance_graph
    MATCH (bad:Account WHERE bad.acc_id IN (SELECT acc_id FROM flagged_accounts))
    -[t:Transfer]->{1,2} (c:Account)
    COLUMNS (c.acc_id AS cid)
  ) g
) graph_risk ON (t.from_acc = graph_risk.connected_acc OR t.to_acc = graph_risk.connected_acc)
-- Vector: description similar to known fraud patterns
JOIN (
  SELECT txn_id,
         1 - VECTOR_DISTANCE(description_embedding,
               (SELECT fraud_pattern_embedding FROM fraud_patterns WHERE id = 1),
               COSINE) AS description_similarity
  FROM   transactions
  WHERE  VECTOR_DISTANCE(description_embedding,
           (SELECT fraud_pattern_embedding FROM fraud_patterns WHERE id = 1),
           COSINE) < 0.25
) vd ON vd.txn_id = t.txn_id
ORDER BY ml.fraud_prob DESC;

This query combines ML inference, graph traversal, and vector similarity — all in native SQL, in one round-trip, with Oracle's cost-based optimiser choosing the best execution plan across all three subsystems.

OML AutoML UI — Best Model Without Code

  1. Oracle Machine Learning UI (bundled with Oracle DB on OCI, ADB, and on-prem with ORDS)
  2. Create Experiment → select training table → choose target column
  3. AutoML runs: feature selection, algorithm comparison (GLM, SVM, RF, GBM, Neural Net, XGBoost), hyperparameter optimisation
  4. Deploy best model as a SQL function automatically
  5. No Python, no Jupyter, no data export required

Key Takeaways

  • SQL/PGQ lets you write graph traversal queries as SQL — results join natively with relational data
  • Oracle Property Graphs are virtual views — no data duplication, no ETL
  • OML's PREDICTION() function scores ML models inside any SQL SELECT — real-time inference at query time
  • Graph + ML + Vector search can compose in a single SQL query — Oracle 23ai is the only database that offers all three natively
  • AutoML UI finds and deploys the best model without writing Python

ROSTAN Technologies builds Oracle 23ai solutions combining graph analytics, in-database ML, and vector search for fraud detection, ERP analytics, and compliance use cases. Contact our Oracle data team to explore what is possible on your database.

Have questions about Oracle, AWS or Cloud?

Talk to our certified experts — free consultation, no commitment.


You May Also Know About
Back to Top
ROSTAN Support
Online · Typically replies instantly
WhatsApp Chat directly, fastest response Call Us +91-9810958952 Email Us info@rostantechnologies.com Send a Message Fill the contact form
Chat with us