# Getting Started

## What is mcp-memory?

mcp-memory is a drop-in replacement for Anthropic's MCP Memory server. It provides a persistent knowledge graph where AI agents store entities, observations, and relationships — and retrieve them across sessions.
It keeps full API compatibility with Anthropic’s 8 tools while adding semantic search, hybrid retrieval, and a dynamic scoring engine. All data is stored in SQLite with WAL mode for safe concurrent access. See the Architecture page for a deep dive into how it works.
## Why it exists

The official Anthropic server stores the entire knowledge graph in a single JSONL file. This works for demos, but breaks under real usage:
| Dimension | JSONL (Anthropic) | mcp-memory |
|---|---|---|
| Indexing | None — full file scan on every query | SQLite indexes on name, type, and content |
| Semantic search | Not available | KNN with ONNX embeddings (94+ languages) |
| Hybrid search | Not available | KNN + FTS5 via RRF |
| Query routing | Not available | Dynamic 3-strategy routing (COSINE_HEAVY/LIMBIC_HEAVY/HYBRID_BALANCED) |
| Limbic scoring | Not available | Salience + temporal decay + co-occurrence |
| Entity splitting | Not available | Automatic TF-IDF based splitting with approval workflow |
| A/B testing | Not available | Shadow mode with NDCG@K metrics |
| Auto-tuning | Not available | Grid search for GAMMA/BETA_SAL optimization |
| Concurrency | Race conditions confirmed | SQLite WAL with 5-second busy timeout |
| Scale | Degrades linearly with file size | O(log n) indexed queries |
| Data corruption | Documented in issues #1819, #2579 (May 2025, still open) | ACID transactions with auto-rollback |
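The "KNN + FTS5 via RRF" row can be sketched in a few lines. Reciprocal Rank Fusion combines two ranked result lists by summing `1 / (k + rank)` for each item; `rrf_fuse` is a hypothetical helper for illustration, not mcp-memory's actual function, and `k=60` is the constant from the original RRF paper:

```python
def rrf_fuse(knn_results, fts_results, k=60):
    """Reciprocal Rank Fusion: combine two ranked lists of entity names.

    Each item's fused score is the sum of 1/(k + rank) over every list
    it appears in, so items ranked well by both retrievers rise to the top.
    """
    scores = {}
    for results in (knn_results, fts_results):
        for rank, name in enumerate(results, start=1):
            scores[name] = scores.get(name, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# "B" is ranked by both KNN and FTS5, so it beats "A" (KNN-only #1)
fused = rrf_fuse(["A", "B"], ["B", "C"])
```

An entity that appears in both the vector and full-text result lists accumulates score from each, which is why hybrid retrieval favors documents that match both by meaning and by keyword.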
The official server rewrites the entire file on every operation. Without locking or atomic writes, concurrent operations interleave their writes, producing malformed merged JSON and duplicate lines. mcp-memory solves these problems at the root with a storage engine designed for persistent data.
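A minimal sketch of the SQLite setup described above, using Python's standard `sqlite3` module: WAL journal mode, a 5-second busy timeout, and an atomic transaction. The table and column names are illustrative, not mcp-memory's actual schema:

```python
import os
import sqlite3
import tempfile

# WAL mode needs a file-backed database (it does not apply to :memory:)
path = os.path.join(tempfile.mkdtemp(), "memory.db")

# timeout=5.0 makes writers wait up to 5 seconds on a locked database
# instead of failing immediately
conn = sqlite3.connect(path, timeout=5.0)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# ACID transaction: either both statements commit or neither does;
# an exception inside the block triggers an automatic rollback
with conn:
    conn.execute("CREATE TABLE entities (name TEXT PRIMARY KEY, type TEXT)")
    conn.execute("INSERT INTO entities VALUES ('My Project', 'Project')")
```

WAL mode lets readers proceed while a writer holds the lock, which is what makes concurrent MCP tool calls safe where a rewrite-the-whole-file approach is not.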
## Requirements

- Python >= 3.12
- uv (recommended) or pip for dependency management
- Git for cloning the repository
- ~465 MB disk space if you download the embedding model (optional)
- ~50 MB for the test suite (313 tests passing)
## Installation

### 1. Clone the repository

```shell
git clone https://github.com/Yarlan1503/mcp-memory.git
cd mcp-memory
```

### 2. Install dependencies

```shell
uv sync
```

uv sync creates a virtual environment, resolves all dependencies from pyproject.toml, and generates the mcp-memory entry point.
### 3. Download the embedding model (optional)

```shell
uv run python scripts/download_model.py
```

This downloads the sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 model (~465 MB) to ~/.cache/mcp-memory-v2/models/:
| File | Purpose |
|---|---|
| model.onnx | ONNX-exported model for CPU inference |
| tokenizer.json | HuggingFace fast tokenizer (Rust) |
| tokenizer_config.json | Tokenizer configuration |
| special_tokens_map.json | Special token mappings |
### 4. Verify the installation

```shell
uv run mcp-memory
```

The server starts as a stdio process. It registers as "memory" in the MCP protocol, listens for JSON-RPC on stdin, and writes logs to stderr (so logging does not interfere with MCP communication).
## Configuration

### OpenCode

Add to the mcp section of your opencode.json:

```json
{
  "mcp": {
    "memory": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-memory", "run", "mcp-memory"]
    }
  }
}
```

Replace /path/to/mcp-memory with the absolute path to the cloned repository.
### Claude Desktop

Add to your Claude Desktop config file:

```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["run", "mcp-memory"],
      "cwd": "/path/to/mcp-memory"
    }
  }
}
```

Replace /path/to/mcp-memory with the absolute path to the cloned repository.
### uvx (no clone required)

If you prefer not to clone the repo, run directly from GitHub:

```json
{
  "mcpServers": {
    "memory": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/Yarlan1503/mcp-memory", "mcp-memory"]
    }
  }
}
```

## First steps
Section titled “First steps”Create entities
Section titled “Create entities”Store knowledge as entities with a name, type, and observations:
{ "entities": [ { "name": "My Project", "entityType": "Project", "observations": [ "Built with Astro and Starlight", "Deployed on Vercel", "Uses Pagefind for search" ] } ]}If an entity already exists, create_entities merges observations instead of overwriting. Duplicates are discarded.
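The merge-on-existing behavior described above can be sketched as follows; `merge_observations` is a hypothetical helper for illustration, not the server's actual code:

```python
def merge_observations(existing, incoming):
    """Merge semantics: keep existing observations in order, append only
    those incoming observations not already present (duplicates dropped)."""
    seen = set(existing)
    merged = list(existing)
    for obs in incoming:
        if obs not in seen:
            seen.add(obs)
            merged.append(obs)
    return merged

# Calling create_entities again for "My Project" with one repeated and
# one new observation keeps the original list and appends only the new one
merged = merge_observations(
    ["Built with Astro and Starlight", "Deployed on Vercel"],
    ["Deployed on Vercel", "Uses Pagefind for search"],
)
```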
### Link entities with relations

Connect entities with typed relationships:

```json
{
  "relations": [
    { "from": "My Project", "to": "Astro", "relationType": "uses" },
    { "from": "My Project", "to": "Vercel", "relationType": "deployed_on" }
  ]
}
```

Both entities must exist before creating a relation between them.
### Search by substring

Find entities by keyword across names, types, and observation content:

```json
{ "query": "project" }
```

search_nodes uses LIKE pattern matching. It requires no embedding model and returns all entities whose name, type, or observations contain the query string.
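This kind of substring search boils down to a SQL LIKE query. A sketch with an illustrative schema (not mcp-memory's actual tables):

```python
import sqlite3

# Toy table standing in for the entity store (illustrative schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities (name TEXT, entity_type TEXT, observations TEXT)"
)
conn.executemany("INSERT INTO entities VALUES (?, ?, ?)", [
    ("My Project", "Project", "Built with Astro; Deployed on Vercel"),
    ("Astro", "Framework", "Static site generator"),
])

# SQLite's LIKE is case-insensitive for ASCII, so "project" matches "Project"
pattern = "%project%"
rows = conn.execute(
    "SELECT name FROM entities "
    "WHERE name LIKE ? OR entity_type LIKE ? OR observations LIKE ?",
    (pattern, pattern, pattern),
).fetchall()
```

With the indexes mcp-memory keeps on name, type, and content, such lookups avoid the full-file scan the JSONL backend requires.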
### Search by meaning

Find entities that are semantically related to your query, even without matching keywords:

```json
{ "query": "web framework deployment", "limit": 5 }
```

search_semantic encodes the query into a 384-dimensional vector and finds the nearest neighbors by cosine similarity. Results are re-ranked by the Limbic Scoring engine, which considers access frequency, recency, and co-occurrence patterns.
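The KNN step can be sketched with NumPy. The vectors here are toy 3-dimensional stand-ins for the 384-dimensional MiniLM embeddings, and `cosine_knn` is a hypothetical helper, not mcp-memory's actual implementation:

```python
import numpy as np

def cosine_knn(query_vec, entity_vecs, names, k=5):
    """Return the k entity names nearest to query_vec by cosine similarity."""
    # Normalize so the dot product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    m = entity_vecs / np.linalg.norm(entity_vecs, axis=1, keepdims=True)
    sims = m @ q                    # one similarity score per entity
    order = np.argsort(-sims)[:k]   # highest similarity first
    return [names[i] for i in order]

names = ["deploy docs", "cooking recipe", "web hosting notes"]
vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9],
    [0.8, 0.2, 0.1],
])
top = cosine_knn(np.array([1.0, 0.0, 0.0]), vecs, names, k=2)
```

In the real server this ranking is only the first pass; the Limbic Scoring engine then re-orders the candidates using access frequency, recency, and co-occurrence.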
### Split large entities automatically

Entities with many observations can be automatically split into focused sub-entities:

```json
{ "entity_name": "My Project" }
```

analyze_entity_split evaluates whether an entity exceeds its type threshold (Sesion=15, Proyecto=25, others=20) and uses TF-IDF to group observations into topics. If splitting is recommended, propose_entity_split returns suggested new entity names and the relations to create.

```json
{
  "entity_name": "My Project",
  "approved_splits": [
    {
      "name": "My Project - Architecture",
      "entity_type": "Project",
      "observations": ["Stack: FastMCP + SQLite", "MCP Memory v2"]
    }
  ]
}
```

execute_entity_split creates the new entities, moves observations, and establishes contiene/parte_de relations — all within an atomic transaction.
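The TF-IDF signal behind topic grouping can be sketched as follows. This toy `tfidf_top_terms` helper is for illustration only and is not mcp-memory's actual splitting code; it just shows how TF-IDF surfaces the distinctive terms of each observation group, the kind of signal a splitter can use to name topics:

```python
import math
from collections import Counter

def tfidf_top_terms(docs, top_n=2):
    """Score each term of each document by TF-IDF and return the top terms.

    TF is the term's frequency within its document; IDF down-weights terms
    that appear across many documents, leaving each group's distinctive words.
    """
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each term appear?
    df = Counter(t for doc in tokenized for t in set(doc))
    out = []
    for doc in tokenized:
        tf = Counter(doc)
        scores = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        out.append(sorted(scores, key=scores.get, reverse=True)[:top_n])
    return out

# Two observation groups from one oversized entity
obs_groups = [
    "fastmcp sqlite storage engine sqlite",
    "astro starlight docs site docs",
]
tops = tfidf_top_terms(obs_groups)
```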
## Without the model

The server works without the embedding model downloaded. Here's what changes:
| Feature | Without model | With model |
|---|---|---|
| create_entities | ✅ Works | ✅ Works + generates embedding |
| create_relations | ✅ Works | ✅ Works |
| add_observations | ✅ Works | ✅ Works + regenerates embedding |
| delete_entities | ✅ Works | ✅ Works + removes embedding |
| delete_observations | ✅ Works | ✅ Works + regenerates embedding |
| delete_relations | ✅ Works | ✅ Works |
| search_nodes | ✅ Works | ✅ Works |
| open_nodes | ✅ Works | ✅ Works |
| migrate | ✅ Works | ✅ Works + generates embeddings |
| search_semantic | ❌ Error | ✅ Works |
| find_duplicate_observations | ❌ Error | ✅ Works |
| consolidation_report | ✅ Works | ✅ Works |
| end_relation | ✅ Works | ✅ Works |
| add_reflection | ✅ Works | ✅ Works + generates embedding |
| search_reflections | ❌ Error | ✅ Works |
When the model is not available, the embedding-dependent tools (search_semantic, find_duplicate_observations, search_reflections) return a clear error message instructing you to run the download script. All other tools function normally.
## Next steps

- Architecture — understand the storage engine, embedding pipeline, and data flow
- Tools Reference — parameters, responses, and edge cases for all 19 tools
- Semantic Search — how vector search, hybrid retrieval, and Limbic Scoring work together
- Maintenance & Operations — deduplication, entity splitting, consolidation reports, and best practices
- Auto-tuning — optimize GAMMA and BETA_SAL via grid search