API Reference

This page documents the internal data model, database schema, Pydantic validation layer, and server configuration of MCP Memory v2. It is intended for contributors who need to understand or modify the implementation.

Server Configuration

Setting	Value
Entry point	`mcp-memory` → `mcp_memory.server:main`
Transport	stdio
Logs	stderr
MCP name	`"memory"`
Default DB path	`~/.config/opencode/mcp-memory/memory.db`
Model cache	`~/.cache/mcp-memory-v2/models/`

The database directory is created automatically on first run. Model files (ONNX + tokenizer) are downloaded via scripts/download_model.py.

Entity-Relationship Diagram

erDiagram
    entities ||--o{ observations : "has"
    entities ||--o{ relations : "from"
    entities ||--o{ relations : "to"
    entities ||--|| entity_embeddings : "1:1 (rowid)"
    entities ||--|| entity_access : "tracks"
    entities ||--o{ co_occurrences : "co-occurs"
    entities ||--|| entity_fts : "indexed (FTS5)"

    entities {
        INTEGER id PK
        TEXT name UK
        TEXT entity_type
        TEXT created_at
        TEXT updated_at
    }

    observations {
        INTEGER id PK
        INTEGER entity_id FK
        TEXT content
        TEXT created_at
    }

    relations {
        INTEGER id PK
        INTEGER from_entity FK
        INTEGER to_entity FK
        TEXT relation_type
        TEXT created_at
    }

    entity_embeddings {
        INTEGER rowid PK
        FLOAT embedding_384
    }

    entity_access {
        INTEGER entity_id PK_FK
        INTEGER access_count
        TEXT last_access
    }

    co_occurrences {
        INTEGER entity_a_id FK
        INTEGER entity_b_id FK
        INTEGER co_count
        TEXT last_co
    }

    entity_fts {
        INTEGER rowid PK
        TEXT name
        TEXT entity_type
        TEXT obs_text
    }

    db_metadata {
        TEXT key PK
        TEXT value
    }

Database Schema

MCP Memory v2 stores a knowledge graph in SQLite composed of three core elements — entities, observations, and relations — extended by a fourth layer of vector embeddings (via sqlite-vec) for semantic search.

Entities are nodes in the graph.
Observations are facts attached to an entity.
Relations connect two entities with a typed link.
Embeddings project each entity into a 384-dimensional vector space for cosine similarity search.

Two additional auxiliary tables (entity_access, co_occurrences) power the Limbic Scoring system, and one FTS5 virtual table enables full-text search.

`entities`

The primary table of the knowledge graph. Each row represents a node with a unique name.

Column	Type	Constraints	Description
`id`	`INTEGER`	`PRIMARY KEY AUTOINCREMENT`	Unique internal identifier
`name`	`TEXT`	`NOT NULL UNIQUE`	Human-readable entity name. Business key — no two entities may share a name
`entity_type`	`TEXT`	`NOT NULL DEFAULT 'Generic'`	Entity classification (e.g. `Sesion`, `Componente`, `Sistema`)
`created_at`	`TEXT`	`NOT NULL DEFAULT (datetime('now'))`	Creation timestamp in ISO-8601 format
`updated_at`	`TEXT`	`NOT NULL DEFAULT (datetime('now'))`	Last update timestamp

`observations`

Facts or data points attached to an entity. An entity may have zero or many observations.

Column	Type	Constraints	Description
`id`	`INTEGER`	`PRIMARY KEY AUTOINCREMENT`	Unique internal identifier
`entity_id`	`INTEGER`	`NOT NULL REFERENCES entities(id) ON DELETE CASCADE`	FK to the parent entity. Cascade deletes observations when the entity is removed
`content`	`TEXT`	`NOT NULL`	Free-text content of the observation
`created_at`	`TEXT`	`NOT NULL DEFAULT (datetime('now'))`	Creation timestamp

`relations`

Edges connecting two entities with a semantic relation type.

Column	Type	Constraints	Description
`id`	`INTEGER`	`PRIMARY KEY AUTOINCREMENT`	Unique internal identifier
`from_entity`	`INTEGER`	`NOT NULL REFERENCES entities(id) ON DELETE CASCADE`	FK to the source entity
`to_entity`	`INTEGER`	`NOT NULL REFERENCES entities(id) ON DELETE CASCADE`	FK to the target entity
`relation_type`	`TEXT`	`NOT NULL`	Type of relationship (e.g. `uses`, `depends_on`, `part_of`)
`created_at`	`TEXT`	`NOT NULL DEFAULT (datetime('now'))`	Creation timestamp

`entity_embeddings` (Virtual)

Virtual table implemented with the sqlite-vec extension (vec0). Stores the embedding vector for each entity to power semantic search.

Column	Type	Description
`embedding`	`float[384]`	384-dimensional vector generated by the ONNX model. Distance metric: cosine
`rowid`	`INTEGER` (implicit)	Corresponds to `entities.id`. Links the embedding to its entity

Virtual vec0 tables do not participate in SQLite’s ON DELETE CASCADE. When deleting an entity, embeddings must be removed manually before the entity row. The codebase handles this in delete_entities_by_names:

# 1. Delete embeddings first (vec0 has no CASCADE support)
self.db.execute(
    f"DELETE FROM entity_embeddings WHERE rowid IN ({id_placeholders})", ids
)
# 2. Delete entity (CASCADE handles observations & relations)
self.db.execute(
    f"DELETE FROM entities WHERE id IN ({id_placeholders})", ids
)

If you delete entities via raw SQL, clean up embeddings first to avoid orphaned records.

The rowid-based link enables direct JOINs without an explicit FK column:

SELECT e.name, e.entity_type
FROM entities e
JOIN entity_embeddings ee ON e.id = ee.rowid
WHERE ee.embedding MATCH ?
ORDER BY distance;

`db_metadata`

Auxiliary key-value table for system metadata (schema version, last migration timestamp, internal configuration).

Column	Type	Constraints	Description
`key`	`TEXT`	`PRIMARY KEY`	Unique metadata key
`value`	`TEXT`	`NOT NULL`	Associated value

`entity_access`

Support table for the Limbic Scoring system. Records how often and how recently each entity appears in search_semantic results.

Column	Type	Constraints	Description
`entity_id`	`INTEGER`	`PRIMARY KEY REFERENCES entities(id) ON DELETE CASCADE`	FK to the entity. One row per entity
`access_count`	`INTEGER`	`NOT NULL DEFAULT 1`	Number of times the entity appeared in semantic search results
`last_access`	`TEXT`	`NOT NULL DEFAULT (datetime('now'))`	Timestamp of last access (used for temporal decay)

`co_occurrences`

Support table for the Limbic Scoring system. Records how often two entities appear together in search_semantic results.

Column	Type	Constraints	Description
`entity_a_id`	`INTEGER`	`NOT NULL REFERENCES entities(id) ON DELETE CASCADE`	FK to the entity with the lower ID (canonical ordering)
`entity_b_id`	`INTEGER`	`NOT NULL REFERENCES entities(id) ON DELETE CASCADE`	FK to the entity with the higher ID
`co_count`	`INTEGER`	`NOT NULL DEFAULT 1`	Number of recorded co-occurrences
`last_co`	`TEXT`	`NOT NULL DEFAULT (datetime('now'))`	Timestamp of the last co-occurrence

`entity_fts` (FTS5 Virtual)

FTS5 virtual table for full-text search. Indexes entity names, types, and concatenated observation text.

Column	Type	Description
`name`	`TEXT`	Entity name (searchable)
`entity_type`	`TEXT`	Entity type (searchable)
`obs_text`	`TEXT`	All observations concatenated with `" \| "` as separator
`rowid`	`INTEGER` (implicit)	Corresponds to `entities.id`

The tokenizer is unicode61, which correctly handles accented characters (é, ñ, ü) and other Unicode. Synchronization is code-level (not SQLite triggers): _sync_fts() is called manually in upsert_entity, add_observations, and delete_observations. On init_db(), if the FTS table is empty but entities exist, _backfill_fts() populates it.

Full SQL Schema

The complete DDL including all tables, virtual tables, and indexes:

CREATE TABLE entities (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT    NOT NULL UNIQUE,
    entity_type TEXT    NOT NULL DEFAULT 'Generic',
    created_at  TEXT    NOT NULL DEFAULT (datetime('now')),
    updated_at  TEXT    NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE observations (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    content    TEXT    NOT NULL,
    created_at TEXT    NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE relations (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    from_entity   INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    to_entity     INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    relation_type TEXT    NOT NULL,
    created_at    TEXT    NOT NULL DEFAULT (datetime('now')),
    UNIQUE(from_entity, to_entity, relation_type)
);

CREATE VIRTUAL TABLE entity_embeddings
USING vec0(embedding float[384] distance_metric=cosine);

CREATE TABLE db_metadata (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL
);

-- Limbic scoring tables
CREATE TABLE entity_access (
    entity_id    INTEGER PRIMARY KEY REFERENCES entities(id) ON DELETE CASCADE,
    access_count INTEGER NOT NULL DEFAULT 1,
    last_access  TEXT    NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE co_occurrences (
    entity_a_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    entity_b_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    co_count     INTEGER NOT NULL DEFAULT 1,
    last_co      TEXT    NOT NULL DEFAULT (datetime('now')),
    PRIMARY KEY (entity_a_id, entity_b_id)
);

-- Full-text search (FTS5)
CREATE VIRTUAL TABLE IF NOT EXISTS entity_fts
USING fts5(name, entity_type, obs_text, tokenize="unicode61");

-- Indexes
CREATE INDEX idx_entities_name  ON entities(name);
CREATE INDEX idx_entities_type  ON entities(entity_type);
CREATE INDEX idx_obs_entity     ON observations(entity_id);
CREATE INDEX idx_rel_from       ON relations(from_entity);
CREATE INDEX idx_rel_to         ON relations(to_entity);
CREATE INDEX idx_rel_type       ON relations(relation_type);
CREATE INDEX idx_access_last    ON entity_access(last_access);
CREATE INDEX idx_cooc_b         ON co_occurrences(entity_b_id);

Indexes

Index	Table	Column(s)	Purpose
`idx_entities_name`	`entities`	`name`	Fast lookup by entity name
`idx_entities_type`	`entities`	`entity_type`	Filter by entity type
`idx_obs_entity`	`observations`	`entity_id`	Retrieve all observations for an entity
`idx_rel_from`	`relations`	`from_entity`	Relations originating from an entity
`idx_rel_to`	`relations`	`to_entity`	Relations targeting an entity
`idx_rel_type`	`relations`	`relation_type`	Filter by relation type
`idx_access_last`	`entity_access`	`last_access`	Sort by access recency (temporal decay)
`idx_cooc_b`	`co_occurrences`	`entity_b_id`	Lookup co-occurrences by entity B

Pydantic Models

Pydantic models serve a dual purpose in MCP Memory v2:

Input validation: Each MCP tool receives JSON from the client. Models validate structure and types before touching the database.
Output serialization: Tool responses are serialized to consistently typed JSON.

The 10 MCP tools use these models to validate and return data about entities and relations.

Complete Source (`models.py`)

from pydantic import BaseModel, Field


class EntityInput(BaseModel):
    name: str = Field(..., min_length=1)
    entityType: str = Field(default="Generic")
    observations: list[str] = Field(default_factory=list)


class EntityOutput(BaseModel):
    name: str
    entityType: str
    observations: list[str]


class RelationInput(BaseModel):
    from_entity: str = Field(..., alias="from")
    to_entity: str   = Field(..., alias="to")
    relationType: str
    model_config = {"populate_by_name": True}


class RelationOutput(BaseModel):
    from_entity: str = Field(..., alias="from")
    to_entity: str   = Field(..., alias="to")
    relationType: str
    model_config = {"populate_by_name": True}

`EntityInput`

Input model for creating or updating entities.

Field	Type	Required	Default	Validation
`name`	`str`	Yes	—	`min_length=1`
`entityType`	`str`	No	`"Generic"`	—
`observations`	`list[str]`	No	`[]` (via factory)	New list per instance

`EntityOutput`

Output model for entity responses. All fields are required — the server always populates them from the database.

`RelationInput` / `RelationOutput`

Both models share the same field structure. RelationInput validates incoming client data; RelationOutput serializes responses.

Field	JSON Alias	Type	Required
`from_entity`	`"from"`	`str`	Yes
`to_entity`	`"to"`	`str`	Yes
`relationType`	—	`str`	Yes

from and to are Python reserved keywords, so they cannot be used as attribute names. Pydantic aliases solve this:

In Python: access as relation.from_entity (legal name)
In JSON: serializes as "from" (format expected by Anthropic MCP)

populate_by_name=True enables bidirectional deserialization — both forms are accepted:

# Both work:
RelationInput(**{"from": "EntityA", "to": "EntityB", "relationType": "uses"})
RelationInput(**{"from_entity": "EntityA", "to_entity": "EntityB", "relationType": "uses"})

This guarantees compatibility with Anthropic’s MCP format (which sends "from"/"to") and with internal calls that use Python attribute names.

SQLite PRAGMA Configuration

The following PRAGMAs are set on every database connection in MemoryStore:

PRAGMA journal_mode = WAL       # Write without blocking reads
PRAGMA busy_timeout  = 10000    # Wait 10s if locked
PRAGMA synchronous   = NORMAL   # Balance safety vs speed
PRAGMA cache_size    = -64000   # 64 MB cache
PRAGMA temp_store    = MEMORY   # Temp tables in RAM
PRAGMA foreign_keys  = ON       # Enforce referential integrity

PRAGMA	Value	Rationale
`journal_mode`	`WAL`	Write-Ahead Logging allows concurrent readers while a writer is active. Readers never block writers and writers never block readers.
`busy_timeout`	`10000`	Wait up to 10 seconds for lock contention before raising `SQLITE_BUSY`. In the MCP context (sequential tool calls), this is more than sufficient.
`synchronous`	`NORMAL`	Safe enough for WAL mode (the WAL file is still synced), but faster than `FULL` which syncs the database file too. The right trade-off for a local knowledge graph.
`cache_size`	`-64000`	64 MB page cache. Negative values indicate KiB. Reduces disk I/O for repeated queries.
`temp_store`	`MEMORY`	Temporary tables and intermediate results stay in RAM. Speeds up complex queries and index rebuilding.
`foreign_keys`	`ON`	Enforces `ON DELETE CASCADE` constraints. Without this pragma, SQLite silently ignores foreign key enforcement.

WAL Mode and Concurrency

Operation	Behavior
Concurrent reads	Allowed (WAL supports multiple simultaneous readers)
Writes	Sequential (single writer)
Lock contention	Readers wait up to 10 seconds (`busy_timeout`) for a write lock
Cache	64 MB in memory to reduce I/O

In the MCP context, where tool calls are sequential, this model is well-suited.

Technical Gotchas

vec0 Does Not Support CASCADE

Virtual vec0 tables do not participate in ON DELETE CASCADE. When deleting an entity, observations and relations are removed by CASCADE, but the embedding is not. Always delete embeddings manually before the entity row (see the code example in the entity_embeddings section above).

”Cannot Start a Transaction” Error

sqlite-vec can intermittently fail with "cannot start a transaction" during delete operations. This is a known bug in sqlite-vec related to how virtual tables interact with SQLite’s transaction system. Retrying the operation usually resolves it. The codebase catches this error and logs it as a warning without failing.

Embeddings Are Non-Incremental

Each time an entity’s observations change, the embedding is completely regenerated from scratch. The input text includes a full snapshot of all current observations. This means:

Consistency: the embedding always reflects the current state, with no partial-update artifacts
Cost: every update triggers a full ONNX encoding (~5 ms on CPU for a single vector)
Overwrite: INSERT OR REPLACE in vec0 ensures old versions don’t accumulate

Lazy Embedding Engine

The MCP server starts in ~1 second because it does not load the embedding model at startup. The lazy architecture has two layers:

Import lazy: mcp_memory.embeddings is not imported at module scope in server.py. The import happens inside _get_engine().
Instance lazy: EmbeddingEngine.get_instance() creates the singleton only on the first call.

Consequences:

First search_semantic call: ~3–5 extra seconds while the model loads
Subsequent calls: millisecond responses (engine already in memory)
Server startup: always fast, regardless of whether the model is downloaded