Getting Started
Go from zero to your first semantic query in under 5 minutes.
Prerequisites
What you'll need
- Docker & Docker Compose — to run the database and backend
- An LLM API key — from Gemini, OpenAI, Anthropic, Azure, or OpenRouter
- curl (or any HTTP client) — to test the API
Start the Server
Clone & launch with Docker Compose
```shell
git clone https://github.com/harshit-sandilya/CortexDB.git
cd CortexDB/memory
docker compose up -d
```
This starts two containers:
| Service | Port | Description |
|---|---|---|
| PostgreSQL (pgvector) | 5432 | Vector-enabled database |
| CortexDB Backend | 8080 | REST API server |
Wait for all containers to be healthy before proceeding. Run `docker compose ps` to check.
Configure an LLM Provider
Tell CortexDB which LLM to use
CortexDB needs an LLM for embeddings and entity extraction. Configure it with a single API call:
```shell
curl -X POST http://localhost:8080/api/setup \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "GEMINI",
    "chatModelName": "gemini-2.0-flash",
    "embedModelName": "gemini-embedding-001",
    "apiKey": "YOUR_API_KEY"
  }'
```
Supported Providers
| Provider | Value | Extra Fields |
|---|---|---|
| Google Gemini | GEMINI | apiKey |
| OpenAI | OPENAI | apiKey |
| Anthropic | ANTHROPIC | apiKey |
| Azure OpenAI | AZURE | apiKey, baseUrl |
| OpenRouter | OPENROUTER | apiKey |
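When scripting the setup call, the per-provider field requirements above can be enforced before sending the request. A minimal client-side sketch (the helper and its names are illustrative, not part of CortexDB):

```python
# Required extra fields per provider, mirroring the table above.
REQUIRED_FIELDS = {
    "GEMINI": {"apiKey"},
    "OPENAI": {"apiKey"},
    "ANTHROPIC": {"apiKey"},
    "AZURE": {"apiKey", "baseUrl"},
    "OPENROUTER": {"apiKey"},
}

def build_setup_payload(provider: str, chat_model: str, embed_model: str, **extra) -> dict:
    """Build the /api/setup request body, failing fast on missing fields."""
    missing = REQUIRED_FIELDS[provider] - extra.keys()
    if missing:
        raise ValueError(f"{provider} requires: {sorted(missing)}")
    return {
        "provider": provider,
        "chatModelName": chat_model,
        "embedModelName": embed_model,
        **extra,
    }
```

Validating locally gives a clearer error than a round-trip to the server when, say, an Azure configuration is missing its baseUrl.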
Response
```json
{
  "message": "LLM configured successfully",
  "success": true,
  "configuredProvider": "GEMINI",
  "configuredChatModel": "gemini-2.0-flash",
  "configuredEmbedModel": "gemini-embedding-001",
  "timestamp": "2025-01-15T10:00:00Z"
}
```
Ingest a Document
Send content to CortexDB
```shell
curl -X POST http://localhost:8080/api/ingest/document \
  -H "Content-Type: application/json" \
  -d '{
    "uid": "user-1",
    "converser": "USER",
    "content": "Java uses garbage collection for automatic memory management. The JVM handles this process efficiently."
  }'
```
Response
```json
{
  "knowledgeBase": {
    "id": "a1b2c3d4-...",
    "uid": "user-1",
    "content": "Java uses garbage collection...",
    "createdAt": "2025-01-15T10:00:00Z"
  },
  "status": "SUCCESS",
  "message": "Document ingested successfully",
  "processingTimeMs": 245,
  "embeddingTimeMs": 120
}
```
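When scripting ingestion, check the status field of this response before moving on. A small sketch against the response shape shown above (the helper name is ours, not part of any SDK):

```python
def ingest_ok(response: dict) -> str:
    """Return the stored knowledge-base id, or raise if ingestion failed."""
    if response.get("status") != "SUCCESS":
        raise RuntimeError(response.get("message", "ingest failed"))
    return response["knowledgeBase"]["id"]
```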
The response returns immediately after the knowledge base entry is stored. Chunking, embedding, entity extraction, and graph building happen asynchronously in the background via PostgreSQL triggers. Check your server console logs for `KB_ROW`, `CONTEXT_ROW`, `ENTITY_ROW`, and `RELATION_NEW` tags.
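Because processing is asynchronous, a query issued immediately after ingest may find no chunks yet. One way to handle this in a script is a short retry loop; here `fetch` stands in for whatever function performs the query (a sketch, not a CortexDB API):

```python
import time

def wait_for_results(fetch, attempts: int = 10, delay: float = 1.0) -> list:
    """Call fetch() until it returns a non-empty result list or attempts run out."""
    for _ in range(attempts):
        results = fetch()
        if results:
            return results
        time.sleep(delay)
    return []
```

With roughly ten one-second attempts this comfortably covers the background pipeline for a small document; tune both knobs to your ingest volume.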
Run Your First Query
Semantic search on your ingested content
```shell
curl -X POST http://localhost:8080/api/query/contexts \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How does Java manage memory?",
    "limit": 5,
    "minRelevance": 0.7
  }'
```
Response
```json
{
  "query": "How does Java manage memory?",
  "results": [
    {
      "id": "c1d2e3f4-...",
      "content": "Java uses garbage collection for automatic memory management.",
      "score": 0.92,
      "type": "CHUNK",
      "metadata": { "chunkNumber": 1, "totalChunks": 2 }
    }
  ],
  "processingTimeMs": 85
}
```
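In application code you usually want just the matching text, perhaps with a stricter cutoff than the server-side minRelevance. A small helper over the response shape shown above (the function name is illustrative):

```python
def top_contexts(response: dict, min_score: float = 0.7) -> list[str]:
    """Extract result contents at or above min_score, highest score first."""
    hits = [r for r in response["results"] if r["score"] >= min_score]
    return [r["content"] for r in sorted(hits, key=lambda r: r["score"], reverse=True)]
```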
You can also try:
- Entity search — find extracted concepts
- Graph traversal — explore entity connections
- Hybrid search — combine vector + graph results
Use an SDK
Install a client library for your language
Python
```shell
pip install cortexdb
```

```python
from cortexdb import CortexDB

db = CortexDB("http://localhost:8080")
db.setup.configure(
    provider="GEMINI",
    api_key="...",
    chat_model="gemini-2.0-flash",
    embed_model="gemini-embedding-001",
)
db.ingest.document(uid="user-1", converser="USER", content="Hello world")
results = db.query.search_contexts("greeting", limit=5)
```
JavaScript / TypeScript
```shell
npm install cortexdb
```

```typescript
import { CortexDB } from "cortexdb";

const db = new CortexDB("http://localhost:8080");
await db.setup.configure("GEMINI", "gemini-2.0-flash",
  "gemini-embedding-001", "YOUR_KEY");
await db.ingest.document("user-1", "USER", "Hello world");
const results = await db.query.searchContexts("greeting");
```
Java
```java
CortexDBClient db = new CortexDBClient("http://localhost:8080");
db.setup().configure(LLMApiProvider.GEMINI, "YOUR_KEY",
    "gemini-2.0-flash", "gemini-embedding-001");
db.ingest().document("user-1", ConverserRole.USER, "Hello world");
QueryResponse results = db.query().searchContexts("greeting");
```
See the SDK documentation for full API details.