
API Reference

This page documents the internal data model, database schema, Pydantic validation layer, and server configuration of MCP Memory v2. It is intended for contributors who need to understand or modify the implementation.

| Setting | Value |
| --- | --- |
| Entry point | mcp-memory (mcp_memory.server:main) |
| Transport | stdio |
| Logs | stderr |
| MCP name | "memory" |
| Default DB path | ~/.config/opencode/mcp-memory/memory.db |
| Model cache | ~/.cache/mcp-memory-v2/models/ |

The database directory is created automatically on first run. Model files (ONNX + tokenizer) are downloaded via scripts/download_model.py.
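As a rough illustration of the "created automatically on first run" behavior, the sketch below expands the default path and creates its parent directory. The helper name `resolve_db_path` is hypothetical and not taken from the codebase; only the default path comes from the settings table.

```python
from pathlib import Path
from typing import Union

# Default location taken from the settings table above.
DEFAULT_DB_PATH = Path("~/.config/opencode/mcp-memory/memory.db")


def resolve_db_path(raw: Union[str, Path] = DEFAULT_DB_PATH) -> Path:
    """Expand ``~`` and make sure the parent directory exists.

    Hypothetical helper: the name and signature are illustrative only.
    The database file itself is not created here; SQLite creates it
    when the first connection is opened.
    """
    path = Path(raw).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `resolve_db_path()` with no argument would use the default location; passing an explicit path is convenient for tests.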

erDiagram
entities ||--o{ observations : "has"
entities ||--o{ relations : "from"
entities ||--o{ relations : "to"
entities ||--|| entity_embeddings : "1:1 (rowid)"
entities ||--|| entity_access : "tracks"
entities ||--o{ co_occurrences : "co-occurs"
entities ||--|| entity_fts : "indexed (FTS5)"
entities {
INTEGER id PK
TEXT name UK
TEXT entity_type
TEXT created_at
TEXT updated_at
}
observations {
INTEGER id PK
INTEGER entity_id FK
TEXT content
TEXT created_at
}
relations {
INTEGER id PK
INTEGER from_entity FK
INTEGER to_entity FK
TEXT relation_type
TEXT created_at
}
entity_embeddings {
INTEGER rowid PK
FLOAT embedding_384
}
entity_access {
INTEGER entity_id PK_FK
INTEGER access_count
TEXT last_access
}
co_occurrences {
INTEGER entity_a_id FK
INTEGER entity_b_id FK
INTEGER co_count
TEXT last_co
}
entity_fts {
INTEGER rowid PK
TEXT name
TEXT entity_type
TEXT obs_text
}
db_metadata {
TEXT key PK
TEXT value
}

MCP Memory v2 stores a knowledge graph in SQLite composed of three core elements — entities, observations, and relations — extended by a fourth layer of vector embeddings (via sqlite-vec) for semantic search.

  • Entities are nodes in the graph.
  • Observations are facts attached to an entity.
  • Relations connect two entities with a typed link.
  • Embeddings project each entity into a 384-dimensional vector space for cosine similarity search.

Two additional auxiliary tables (entity_access, co_occurrences) power the Limbic Scoring system, and one FTS5 virtual table enables full-text search.

entities is the primary table of the knowledge graph. Each row represents a node with a unique name.

| Column | Type | Constraints | Description |
| --- | --- | --- | --- |
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique internal identifier |
| name | TEXT | NOT NULL UNIQUE | Human-readable entity name. Business key: no two entities may share a name |
| entity_type | TEXT | NOT NULL DEFAULT 'Generic' | Entity classification (e.g. Sesion, Componente, Sistema) |
| created_at | TEXT | NOT NULL DEFAULT (datetime('now')) | Creation timestamp in UTC, formatted YYYY-MM-DD HH:MM:SS (ISO-8601 style) |
| updated_at | TEXT | NOT NULL DEFAULT (datetime('now')) | Last update timestamp, same format |

observations holds facts or data points attached to an entity. An entity may have zero or many observations.

| Column | Type | Constraints | Description |
| --- | --- | --- | --- |
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique internal identifier |
| entity_id | INTEGER | NOT NULL REFERENCES entities(id) ON DELETE CASCADE | FK to the parent entity. Observations are deleted by cascade when the entity is removed |
| content | TEXT | NOT NULL | Free-text content of the observation |
| created_at | TEXT | NOT NULL DEFAULT (datetime('now')) | Creation timestamp |

relations holds the edges connecting two entities with a semantic relation type.

| Column | Type | Constraints | Description |
| --- | --- | --- | --- |
| id | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique internal identifier |
| from_entity | INTEGER | NOT NULL REFERENCES entities(id) ON DELETE CASCADE | FK to the source entity |
| to_entity | INTEGER | NOT NULL REFERENCES entities(id) ON DELETE CASCADE | FK to the target entity |
| relation_type | TEXT | NOT NULL | Type of relationship (e.g. uses, depends_on, part_of) |
| created_at | TEXT | NOT NULL DEFAULT (datetime('now')) | Creation timestamp |

The DDL also declares UNIQUE(from_entity, to_entity, relation_type), so the same typed edge cannot be stored twice.

entity_embeddings is a virtual table implemented with the sqlite-vec extension (vec0). It stores the embedding vector for each entity to power semantic search.

| Column | Type | Description |
| --- | --- | --- |
| embedding | float[384] | 384-dimensional vector generated by the ONNX model. Distance metric: cosine |
| rowid | INTEGER (implicit) | Corresponds to entities.id. Links the embedding to its entity |

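The rowid-based link enables direct JOINs without an explicit FK column. Note that sqlite-vec expects a KNN query to bound the number of results, either with a k = ? constraint or with a LIMIT clause:

SELECT e.name, e.entity_type, ee.distance
FROM entities e
JOIN entity_embeddings ee ON e.id = ee.rowid
WHERE ee.embedding MATCH ?
  AND k = ?
ORDER BY ee.distance;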
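To make the ordering concrete, here is a pure-Python sketch of what a cosine-distance nearest-neighbor search computes. It is a brute-force stand-in for illustration only, not how sqlite-vec is implemented, and it assumes the usual definition of cosine distance as 1 minus cosine similarity. The function names are hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float],
    rows: Iterable[Tuple[int, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Rank (entity_id, vector) rows by distance to the query and keep the k closest."""
    scored = [(entity_id, cosine_distance(query, vec)) for entity_id, vec in rows]
    return sorted(scored, key=lambda item: item[1])[:k]
```

In the real schema the vectors have 384 components and the ranking is done inside the vec0 virtual table; the sketch only shows the metric and the top-k selection.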

db_metadata is an auxiliary key-value table for system metadata (schema version, last migration timestamp, internal configuration).

| Column | Type | Constraints | Description |
| --- | --- | --- | --- |
| key | TEXT | PRIMARY KEY | Unique metadata key |
| value | TEXT | NOT NULL | Associated value |

entity_access is a support table for the Limbic Scoring system. It records how often and how recently each entity appears in search_semantic results.

| Column | Type | Constraints | Description |
| --- | --- | --- | --- |
| entity_id | INTEGER | PRIMARY KEY REFERENCES entities(id) ON DELETE CASCADE | FK to the entity. One row per entity |
| access_count | INTEGER | NOT NULL DEFAULT 1 | Number of times the entity appeared in semantic search results |
| last_access | TEXT | NOT NULL DEFAULT (datetime('now')) | Timestamp of last access (used for temporal decay) |
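One plausible way to maintain this table is an upsert that inserts a row on first access and increments the counter afterwards. The sketch below is an assumption about how such bookkeeping could look; the function name `record_access` is hypothetical, and the statement actually used by the codebase is not shown in this page. It relies on SQLite's `ON CONFLICT ... DO UPDATE` clause (SQLite 3.24 or later).

```python
import sqlite3
from typing import Iterable

# Same columns as entity_access above; the FK clause is omitted so the
# sketch runs without the entities table.
ACCESS_DDL = """
CREATE TABLE entity_access (
    entity_id    INTEGER PRIMARY KEY,
    access_count INTEGER NOT NULL DEFAULT 1,
    last_access  TEXT NOT NULL DEFAULT (datetime('now'))
);
"""


def record_access(conn: sqlite3.Connection, entity_ids: Iterable[int]) -> None:
    """Insert a row on first access, otherwise bump the counter and timestamp."""
    conn.executemany(
        """
        INSERT INTO entity_access (entity_id) VALUES (?)
        ON CONFLICT(entity_id) DO UPDATE SET
            access_count = access_count + 1,
            last_access  = datetime('now')
        """,
        [(entity_id,) for entity_id in entity_ids],
    )
    conn.commit()
```

The column defaults (`access_count` 1, `last_access` now) cover the first insert, so the statement only needs the entity ID.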

co_occurrences is a support table for the Limbic Scoring system. It records how often two entities appear together in search_semantic results.

| Column | Type | Constraints | Description |
| --- | --- | --- | --- |
| entity_a_id | INTEGER | NOT NULL REFERENCES entities(id) ON DELETE CASCADE | FK to the entity with the lower ID (canonical ordering) |
| entity_b_id | INTEGER | NOT NULL REFERENCES entities(id) ON DELETE CASCADE | FK to the entity with the higher ID |
| co_count | INTEGER | NOT NULL DEFAULT 1 | Number of recorded co-occurrences |
| last_co | TEXT | NOT NULL DEFAULT (datetime('now')) | Timestamp of the last co-occurrence |

The primary key is the pair (entity_a_id, entity_b_id), so each unordered pair of entities has at most one row.
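The canonical ordering (lower ID first) can be sketched as follows: sort the result IDs, enumerate each pair once, and upsert. This is an illustrative sketch under the assumption that every pair in a result set counts as one co-occurrence; `record_co_occurrences` is a hypothetical name, not a function from the codebase.

```python
import sqlite3
from itertools import combinations
from typing import Iterable, List, Tuple

# Same columns as co_occurrences above; FK clauses are omitted so the
# sketch runs without the entities table.
COOC_DDL = """
CREATE TABLE co_occurrences (
    entity_a_id INTEGER NOT NULL,
    entity_b_id INTEGER NOT NULL,
    co_count    INTEGER NOT NULL DEFAULT 1,
    last_co     TEXT NOT NULL DEFAULT (datetime('now')),
    PRIMARY KEY (entity_a_id, entity_b_id)
);
"""


def record_co_occurrences(
    conn: sqlite3.Connection, entity_ids: Iterable[int]
) -> List[Tuple[int, int]]:
    """Upsert one row per unordered pair, always stored as (lower_id, higher_id)."""
    # Sorting unique IDs first means combinations() yields pairs with a < b.
    pairs = list(combinations(sorted(set(entity_ids)), 2))
    conn.executemany(
        """
        INSERT INTO co_occurrences (entity_a_id, entity_b_id) VALUES (?, ?)
        ON CONFLICT(entity_a_id, entity_b_id) DO UPDATE SET
            co_count = co_count + 1,
            last_co  = datetime('now')
        """,
        pairs,
    )
    conn.commit()
    return pairs
```

Because the pair is normalized before it reaches the database, (3, 1) and (1, 3) update the same row.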

entity_fts is an FTS5 virtual table for full-text search. It indexes entity names, types, and concatenated observation text.

| Column | Type | Description |
| --- | --- | --- |
| name | TEXT | Entity name (searchable) |
| entity_type | TEXT | Entity type (searchable) |
| obs_text | TEXT | All observations concatenated with " \| " as separator |
| rowid | INTEGER (implicit) | Corresponds to entities.id |

The tokenizer is unicode61, which correctly handles accented characters (é, ñ, ü) and other Unicode. Synchronization is code-level (not SQLite triggers): _sync_fts() is called manually in upsert_entity, add_observations, and delete_observations. On init_db(), if the FTS table is empty but entities exist, _backfill_fts() populates it.
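The code-level synchronization described above could look roughly like the sketch below: drop the entity's FTS row, then re-insert it with the current observations joined by " | ". The body of the real `_sync_fts()` is not shown in this page, so `sync_fts` here is a hypothetical stand-in, and the reduced schema exists only to keep the example self-contained. It assumes the local SQLite build includes FTS5.

```python
import sqlite3

# Reduced schema, just enough to demonstrate the sync step.
FTS_SCHEMA = """
CREATE TABLE entities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE,
    entity_type TEXT NOT NULL DEFAULT 'Generic'
);
CREATE TABLE observations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    content TEXT NOT NULL
);
CREATE VIRTUAL TABLE entity_fts
    USING fts5(name, entity_type, obs_text, tokenize="unicode61");
"""


def sync_fts(conn: sqlite3.Connection, entity_id: int) -> None:
    """Rebuild the FTS row for one entity from its current observations."""
    entity = conn.execute(
        "SELECT name, entity_type FROM entities WHERE id = ?", (entity_id,)
    ).fetchone()
    # Remove the stale row first; FTS rows share their rowid with entities.id.
    conn.execute("DELETE FROM entity_fts WHERE rowid = ?", (entity_id,))
    if entity is None:
        return  # entity no longer exists, nothing to index
    observations = [
        row[0]
        for row in conn.execute(
            "SELECT content FROM observations WHERE entity_id = ? ORDER BY id",
            (entity_id,),
        )
    ]
    conn.execute(
        "INSERT INTO entity_fts (rowid, name, entity_type, obs_text) VALUES (?, ?, ?, ?)",
        (entity_id, entity[0], entity[1], " | ".join(observations)),
    )
```

A call after every write that touches an entity's observations keeps the index current, which is the trade-off of not using triggers: any new write path has to remember to call it.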

The complete DDL including all tables, virtual tables, and indexes:

CREATE TABLE entities (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT NOT NULL UNIQUE,
    entity_type TEXT NOT NULL DEFAULT 'Generic',
    created_at  TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at  TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE observations (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    content    TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE relations (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    from_entity   INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    to_entity     INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    relation_type TEXT NOT NULL,
    created_at    TEXT NOT NULL DEFAULT (datetime('now')),
    UNIQUE(from_entity, to_entity, relation_type)
);

CREATE VIRTUAL TABLE entity_embeddings
    USING vec0(embedding float[384] distance_metric=cosine);

CREATE TABLE db_metadata (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL
);

-- Limbic scoring tables
CREATE TABLE entity_access (
    entity_id    INTEGER PRIMARY KEY REFERENCES entities(id) ON DELETE CASCADE,
    access_count INTEGER NOT NULL DEFAULT 1,
    last_access  TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE co_occurrences (
    entity_a_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    entity_b_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    co_count    INTEGER NOT NULL DEFAULT 1,
    last_co     TEXT NOT NULL DEFAULT (datetime('now')),
    PRIMARY KEY (entity_a_id, entity_b_id)
);

-- Full-text search (FTS5)
CREATE VIRTUAL TABLE IF NOT EXISTS entity_fts
    USING fts5(name, entity_type, obs_text, tokenize="unicode61");

-- Indexes
CREATE INDEX idx_entities_name ON entities(name);
CREATE INDEX idx_entities_type ON entities(entity_type);
CREATE INDEX idx_obs_entity    ON observations(entity_id);
CREATE INDEX idx_rel_from      ON relations(from_entity);
CREATE INDEX idx_rel_to        ON relations(to_entity);
CREATE INDEX idx_rel_type      ON relations(relation_type);
CREATE INDEX idx_access_last   ON entity_access(last_access);
CREATE INDEX idx_cooc_b        ON co_occurrences(entity_b_id);

| Index | Table | Column(s) | Purpose |
| --- | --- | --- | --- |
| idx_entities_name | entities | name | Fast lookup by entity name |
| idx_entities_type | entities | entity_type | Filter by entity type |
| idx_obs_entity | observations | entity_id | Retrieve all observations for an entity |
| idx_rel_from | relations | from_entity | Relations originating from an entity |
| idx_rel_to | relations | to_entity | Relations targeting an entity |
| idx_rel_type | relations | relation_type | Filter by relation type |
| idx_access_last | entity_access | last_access | Sort by access recency (temporal decay) |
| idx_cooc_b | co_occurrences | entity_b_id | Look up co-occurrences by entity B |

Pydantic models serve a dual purpose in MCP Memory v2:

  1. Input validation: Each MCP tool receives JSON from the client. Models validate structure and types before touching the database.
  2. Output serialization: Tool responses are serialized to consistently typed JSON.

The 10 MCP tools use these models to validate and return data about entities and relations.

from pydantic import BaseModel, Field


class EntityInput(BaseModel):
    name: str = Field(..., min_length=1)
    entityType: str = Field(default="Generic")
    observations: list[str] = Field(default_factory=list)


class EntityOutput(BaseModel):
    name: str
    entityType: str
    observations: list[str]


class RelationInput(BaseModel):
    from_entity: str = Field(..., alias="from")
    to_entity: str = Field(..., alias="to")
    relationType: str

    model_config = {"populate_by_name": True}


class RelationOutput(BaseModel):
    from_entity: str = Field(..., alias="from")
    to_entity: str = Field(..., alias="to")
    relationType: str

    model_config = {"populate_by_name": True}

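EntityInput is the input model for creating or updating entities.

| Field | Type | Required | Default | Validation |
| --- | --- | --- | --- | --- |
| name | str | Yes | | min_length=1 |
| entityType | str | No | "Generic" | |
| observations | list[str] | No | [] (via factory) | New list per instance |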

EntityOutput is the output model for entity responses. All fields are required because the server always populates them from the database.

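RelationInput and RelationOutput share the same field structure. RelationInput validates incoming client data; RelationOutput serializes responses. Because "from" is a reserved word in Python, the fields are named from_entity and to_entity and exposed in JSON through aliases; populate_by_name lets the models accept either the alias or the field name.

| Field | JSON Alias | Type | Required |
| --- | --- | --- | --- |
| from_entity | "from" | str | Yes |
| to_entity | "to" | str | Yes |
| relationType | | str | Yes |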

The following PRAGMAs are set on every database connection in MemoryStore:

PRAGMA journal_mode = WAL;      -- Write without blocking reads
PRAGMA busy_timeout = 10000;    -- Wait 10 s if locked
PRAGMA synchronous = NORMAL;    -- Balance safety vs speed
PRAGMA cache_size = -64000;     -- About 64 MB cache
PRAGMA temp_store = MEMORY;     -- Temp tables in RAM
PRAGMA foreign_keys = ON;       -- Enforce referential integrity

| PRAGMA | Value | Rationale |
| --- | --- | --- |
| journal_mode | WAL | Write-Ahead Logging allows concurrent readers while a writer is active. Readers do not block the writer, and the writer does not block readers. |
| busy_timeout | 10000 | Wait up to 10 seconds on lock contention before raising SQLITE_BUSY. In the MCP context (sequential tool calls), this is more than sufficient. |
| synchronous | NORMAL | In WAL mode the database stays consistent, but the WAL is synced at checkpoints rather than on every commit, so the most recent commits can be lost on power failure. FULL adds a WAL sync after each commit at the cost of speed. A reasonable trade-off for a local knowledge graph. |
| cache_size | -64000 | About 64 MB of page cache. A negative value is a size in KiB (here 64000 KiB). Reduces disk I/O for repeated queries. |
| temp_store | MEMORY | Temporary tables and intermediate results stay in RAM. Speeds up complex queries and index rebuilding. |
| foreign_keys | ON | Enforces the foreign key constraints, including ON DELETE CASCADE. Without this pragma, SQLite does not enforce foreign keys at all. |

| Operation | Behavior |
| --- | --- |
| Concurrent reads | Allowed (WAL supports multiple simultaneous readers) |
| Writes | Sequential (single writer) |
| Lock contention | A writer waits up to 10 seconds (busy_timeout) for the write lock; readers are not blocked by it |
| Cache | About 64 MB in memory to reduce I/O |

In the MCP context, where tool calls are sequential, this model is well-suited.
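A minimal sketch of applying these PRAGMAs when opening a connection, assuming the standard `sqlite3` module. `open_connection` is a hypothetical helper name; how `MemoryStore` actually structures this is not shown in this page.

```python
import sqlite3

# Values copied from the PRAGMA list above.
PRAGMAS = (
    ("journal_mode", "WAL"),
    ("busy_timeout", "10000"),
    ("synchronous", "NORMAL"),
    ("cache_size", "-64000"),
    ("temp_store", "MEMORY"),
    ("foreign_keys", "ON"),
)


def open_connection(db_path: str) -> sqlite3.Connection:
    """Open a connection and apply the per-connection PRAGMAs."""
    conn = sqlite3.connect(db_path)
    for name, value in PRAGMAS:
        # PRAGMA arguments cannot be bound as parameters; the names and
        # values here are constants, never user input.
        conn.execute(f"PRAGMA {name} = {value}")
    return conn
```

Most of these settings are scoped to the connection and must be reapplied each time; journal_mode = WAL is the exception, since it is stored in the database file. WAL also needs a file-backed database: an in-memory database reports journal mode "memory" instead.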

Virtual vec0 tables do not participate in ON DELETE CASCADE. When an entity is deleted, its observations and relations are removed by CASCADE, but its embedding is not. Always delete the embedding manually, by removing the entity_embeddings row whose rowid equals the entity's id, before deleting the entity row (the entity_embeddings section above describes the rowid link).
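The delete order can be sketched as follows. To keep the example runnable without the sqlite-vec extension, an ordinary table with no foreign key stands in for the vec0 virtual table; like vec0, it is untouched by cascades. `delete_entity` is a hypothetical helper, not the codebase's function.

```python
import sqlite3

SCHEMA = """
CREATE TABLE entities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE observations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    content TEXT NOT NULL
);
-- Stand-in for the vec0 virtual table (assumption for this sketch only).
CREATE TABLE entity_embeddings (embedding BLOB);
"""


def delete_entity(conn: sqlite3.Connection, name: str) -> bool:
    """Delete an entity by name, removing its embedding row explicitly."""
    row = conn.execute("SELECT id FROM entities WHERE name = ?", (name,)).fetchone()
    if row is None:
        return False
    entity_id = row[0]
    with conn:  # one transaction: commit on success, roll back on error
        # 1. The embedding has no FK, so it must be removed by hand.
        conn.execute("DELETE FROM entity_embeddings WHERE rowid = ?", (entity_id,))
        # 2. Deleting the entity cascades to observations (foreign_keys = ON).
        conn.execute("DELETE FROM entities WHERE id = ?", (entity_id,))
    return True
```

Skipping step 1 would leave an orphaned vector that can still surface in KNN results with no matching entity row.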

sqlite-vec can intermittently fail with "cannot start a transaction" during delete operations. This is a known bug in sqlite-vec related to how virtual tables interact with SQLite’s transaction system. Retrying the operation usually resolves it. The codebase catches this error and logs it as a warning without failing.
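Since retrying usually resolves the error, a small retry wrapper is one way to handle it. The sketch below is an assumption, not the codebase's handler (which, as noted, logs a warning and continues). It also assumes the error surfaces as `sqlite3.OperationalError` with that message text; the helper name and its parameters are hypothetical.

```python
import logging
import sqlite3
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def retry_on_vec_transaction_error(
    operation: Callable[[], T], attempts: int = 3, delay: float = 0.05
) -> T:
    """Run operation(), retrying only on the 'cannot start a transaction' error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            transient = "cannot start a transaction" in str(exc)
            if not transient or attempt == attempts:
                raise  # unrelated error, or out of retries
            logger.warning("sqlite-vec transaction error, retry %d/%d", attempt, attempts)
            time.sleep(delay)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Any other `OperationalError` is re-raised immediately, so the wrapper does not mask real failures such as a missing table.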

Each time an entity’s observations change, the embedding is completely regenerated from scratch. The input text includes a full snapshot of all current observations. This means:

  • Consistency: the embedding always reflects the current state, with no partial-update artifacts
  • Cost: every update triggers a full ONNX encoding (~5 ms on CPU for a single vector)
  • Overwrite: INSERT OR REPLACE in vec0 ensures old versions don’t accumulate

The MCP server starts in ~1 second because it does not load the embedding model at startup. The lazy architecture has two layers:

  1. Lazy import: mcp_memory.embeddings is not imported at module scope in server.py. The import happens inside _get_engine().
  2. Lazy instance: EmbeddingEngine.get_instance() creates the singleton only on the first call.

Consequences:

  • First search_semantic call: ~3–5 extra seconds while the model loads
  • Subsequent calls: millisecond responses (engine already in memory)
  • Server startup: always fast, regardless of whether the model is downloaded