Retrieval-Augmented Generation (RAG) is the dominant pattern for grounding LLMs in private enterprise data. Most teams reach for Pinecone or Weaviate as the vector store — but if you are already on Oracle, Oracle Database 23ai makes a compelling case to keep everything in one database: ACID transactions, row-level security, SQL joins between your vectors and relational data, and no additional managed service to operate. This post walks through a full production RAG pipeline from raw documents to a working chatbot endpoint.
Raw Documents (PDF / HTML / DB rows)
│
▼
Text Chunker (LangChain RecursiveCharacterTextSplitter)
│
▼
Embedding Model (OpenAI ada-002 / OCI GenAI embed-english-v3)
│
▼
Oracle 23ai VECTOR column ←→ Metadata (source, page, date, dept)
│
▼
Oracle HNSW Vector Index (COSINE distance)
│
▼
LangChain OracleVS Retriever
│
▼
GPT-4o / Llama 3 (context-augmented generation)
│
▼
FastAPI / Oracle ORDS REST endpoint
-- Oracle 23ai
CREATE TABLE kb_chunks (
chunk_id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
source_doc VARCHAR2(500) NOT NULL, -- filename or URL
page_num NUMBER,
dept VARCHAR2(100), -- for row-level filtering
chunk_text CLOB NOT NULL,
embedding VECTOR(1536, FLOAT32), -- OpenAI ada-002 / text-embedding-3-small
created_at TIMESTAMP DEFAULT SYSTIMESTAMP
);
-- HNSW index for low-latency approximate nearest-neighbour search
CREATE VECTOR INDEX kb_hnsw_idx
ON kb_chunks (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95
PARAMETERS (type HNSW, neighbors 32, efconstruction 200);
import oracledb
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
import fitz # PyMuPDF
# Connect to Oracle 23ai
connection = oracledb.connect(
user="KNOWLEDGEBASE",
password="your_password",
dsn="your_host:1521/FREEPDB1"
)
# Embedding model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small") # 1536 dims
# Chunk a PDF
def ingest_pdf(filepath: str, dept: str):
    doc = fitz.open(filepath)
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=120,
        separators=["\n\n", "\n", ". ", " "]
    )
    for page_num, page in enumerate(doc):
        text = page.get_text()
        chunks = splitter.split_text(text)
        for chunk in chunks:
            # Get embedding vector
            vector = embeddings.embed_query(chunk)
            # Insert into Oracle: TO_VECTOR() accepts the JSON-style string
            # form of the embedding, e.g. "[0.012, -0.034, ...]"
            with connection.cursor() as cur:
                cur.execute("""
                    INSERT INTO kb_chunks
                      (source_doc, page_num, dept, chunk_text, embedding)
                    VALUES (:1, :2, :3, :4, to_vector(:5))
                """, [filepath, page_num + 1, dept, chunk, str(vector)])
    connection.commit()
    page_count = doc.page_count
    doc.close()
    print(f"Ingested {filepath} — {page_count} pages")
ingest_pdf("oracle_erp_manual.pdf", "Finance")
ingest_pdf("zatca_compliance_guide.pdf", "Tax")
LangChain 0.2+ ships an official OracleVS vector store. It handles table creation, embedding insertion, and similarity search automatically:
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
# Create vector store (creates table if not exists)
vs = OracleVS.from_documents(
documents=chunks, # list of LangChain Document objects
embedding=embeddings,
client=connection,
table_name="KB_CHUNKS_LC",
distance_strategy=DistanceStrategy.COSINE
)
# Similarity search
results = vs.similarity_search(
"What are the ZATCA Phase 2 XML requirements for B2B invoices?",
k=5
)
for doc in results:
print(doc.page_content[:200], "---", doc.metadata)
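The store is not write-once: new snippets can be appended later through the standard add_texts method and become searchable immediately. A small sketch (the example texts and metadata values are placeholders):

# Append ad-hoc snippets to the existing store; metadata keys mirror the table columns
vs.add_texts(
    texts=[
        "Example snippet about ZATCA Phase 2 e-invoicing XML requirements.",
        "Example snippet about Oracle Fusion AP invoice approval routing.",
    ],
    metadatas=[
        {"source_doc": "manual_note", "dept": "Tax"},
        {"source_doc": "manual_note", "dept": "Finance"},
    ],
)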
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferWindowMemory
llm = ChatOpenAI(model="gpt-4o", temperature=0.1)
retriever = vs.as_retriever(
search_type="similarity",
search_kwargs={"k": 5}
)
memory = ConversationBufferWindowMemory(
    memory_key="chat_history",
    output_key="answer",  # required because the chain also returns source_documents
    return_messages=True,
    k=6  # keep last 6 turns
)
chain = ConversationalRetrievalChain.from_llm(
llm=llm,
retriever=retriever,
memory=memory,
verbose=False,
return_source_documents=True
)
# Ask a question
result = chain.invoke({"question": "Explain Oracle Fusion AP invoice approval workflow"})
print(result["answer"])
for src in result["source_documents"]:
print(f" Source: {src.metadata["source_doc"]} p.{src.metadata["page_num"]}")
Oracle's biggest advantage over standalone vector DBs is the ability to combine vector similarity with SQL predicates in one query — no post-filtering approximations:
-- Retrieve top-5 chunks from Finance dept only, written after Jan 2025
SELECT chunk_id, chunk_text, dept,
VECTOR_DISTANCE(embedding, to_vector(:query_vec), COSINE) AS distance
FROM kb_chunks
WHERE dept = 'Finance'
AND created_at > TIMESTAMP '2025-01-01 00:00:00'
ORDER BY distance
FETCH APPROX FIRST 5 ROWS ONLY;
With APPROX in the row-limiting clause, Oracle can satisfy the query with an HNSW index scan while applying the SQL predicates as filters; the CBO still decides whether the vector index or an exact full scan is cheaper based on selectivity.
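The same filtered query is easy to drive from the ingestion script. A short sketch, assuming the connection and embeddings objects defined earlier (the helper name is ours; the predicates are the ones from the SQL above):

# Hybrid retrieval: vector similarity plus relational predicates in one round trip
def search_finance_chunks(question: str):
    qvec = embeddings.embed_query(question)
    with connection.cursor() as cur:
        cur.execute("""
            SELECT chunk_id, chunk_text, dept,
                   VECTOR_DISTANCE(embedding, to_vector(:query_vec), COSINE) AS distance
            FROM kb_chunks
            WHERE dept = 'Finance'
              AND created_at > TIMESTAMP '2025-01-01 00:00:00'
            ORDER BY distance
            FETCH APPROX FIRST 5 ROWS ONLY
        """, {"query_vec": str(qvec)})
        return cur.fetchall()

for chunk_id, text, dept, distance in search_finance_chunks("Q1 invoice approval thresholds"):
    print(chunk_id, dept, round(distance, 4))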
-- Enable REST on the table
BEGIN
ORDS.ENABLE_OBJECT(
p_enabled => TRUE,
p_schema => 'KNOWLEDGEBASE',
p_object => 'KB_CHUNKS',
p_object_alias => 'kb'
);
END;
/
-- Or use ORDS AutoREST + a PL/SQL module for the RAG endpoint
-- POST /ords/knowledgebase/ai/query → calls DBMS_CLOUD_AI internally
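Once the table is REST-enabled, the kb alias is reachable over HTTPS. A quick smoke test from Python (host, port, and the missing authentication are placeholders for a dev instance; production endpoints should sit behind OAuth2):

import requests

# ORDS AutoREST collection endpoint for the enabled table (alias "kb")
base_url = "https://your_host:8443/ords/knowledgebase/kb/"

resp = requests.get(base_url, params={"limit": 5}, verify=False)  # dev: self-signed cert
resp.raise_for_status()
for item in resp.json().get("items", []):
    print(item.get("chunk_id"), item.get("source_doc"))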
| Dataset Size | Index Type | P50 Latency | P99 Latency | Recall@10 |
|---|---|---|---|---|
| 100K vectors | HNSW (mem) | 2.1 ms | 4.8 ms | 98.4% |
| 1M vectors | HNSW (mem) | 4.7 ms | 11 ms | 97.1% |
| 10M vectors | IVF (disk) | 38 ms | 82 ms | 94.6% |
| 10M vectors | HNSW (mem) | 6.1 ms | 14 ms | 97.8% |
Tested on OCI VM.Standard3.Flex 16 OCPU, 256 GB RAM, Oracle 23ai Free.
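Figures like these are straightforward to sanity-check on your own data. A rough harness follows (a sketch: it measures end-to-end latency from Python, so network and driver overhead sit on top of the in-database numbers quoted above):

import time
import statistics

def measure_latency(question: str, runs: int = 100):
    qvec = str(embeddings.embed_query(question))
    timings = []
    with connection.cursor() as cur:
        for _ in range(runs):
            start = time.perf_counter()
            cur.execute("""
                SELECT chunk_id
                FROM kb_chunks
                ORDER BY VECTOR_DISTANCE(embedding, to_vector(:qv), COSINE)
                FETCH APPROX FIRST 10 ROWS ONLY
            """, {"qv": qvec})
            cur.fetchall()
            timings.append((time.perf_counter() - start) * 1000)
    timings.sort()
    p99 = timings[min(len(timings) - 1, int(len(timings) * 0.99))]
    print(f"P50 {statistics.median(timings):.1f} ms, P99 {p99:.1f} ms over {runs} runs")

measure_latency("ZATCA Phase 2 XML requirements")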
ROSTAN Technologies builds RAG systems on Oracle 23ai for enterprise knowledge bases, ERP documentation assistants, and compliance Q&A bots. Book a technical consultation to see a live demo on your Oracle environment.