v2 substrate · m9 · v2.1 brain-aligned · m12

the cognitive
substrate for
autonomous agents.

agidb is a content-addressable hyperdimensional memory database with first-class goals, beliefs, sensory input, and self-model — and in v2.1, the only agent memory substrate with brain-aligned multimodal encoding.

not a vector database. not a graph database. a new category — built around the seven things an autonomous agent must persist, each as a first-class typed shape. one rust binary. one API. observe() · recall() · set_goal() · assert_belief() · unlearn(). no LLM in the read path. sub-50ms p95.

$ cargo add agidb

→ read the brain-alignment spec

primitive · 8192-bit HV

recall p95 · < 50ms

floors · 7 cognitive

network · zero

┌── agidb.observe_multimodal · live pipeline ── v2.1 · brain-aligned

▸ video · V-JEPA 2 · 1.2B · 1024d

▸ audio · Wav2Vec-BERT 2.0 · 1024d

▸ text · Llama-3.2-3B · 2048d

Charikar '02 · sign(R·x) ↓ VSA · ⊕ role-filler bind

[ bound episode signature · 8192-bit ] ep #—

ROLE_v⊗ŝᵥ ⊕ ROLE_a⊗ŝₐ ⊕ ROLE_t⊗ŝₜ ⊕ ROLE_τ⊗ŝ_τ

BAMS · RSA vs TRIBE v2 · DMN · D-Att · FPN · SM · Vis · V-Att · r̄ = 0.67

└── no LLM in read path · pure rust · embedded ── ▼

┌─ inheriting: hippocampal indexing · complementary learning systems · McClelland '95 · HDC · Kanerva '88 · Charikar '02 · bi-temporal · Snodgrass '99 · TRIBE v2 · Meta FAIR · Mar '26 · V-JEPA 2 · Wav2Vec-BERT · Llama-3.2-3B

[ 001 ] premise · why agidb exists

every database was built
for a different consumer.

postgres for accountants. mongo for app developers. neo4j for analysts. pinecone for retrieval pipelines. mem0 / letta / zep for chat-style RAG memory. none were designed for an autonomous agent that must remember, reason, revise, and forget across years — with provenance, audit, and goal awareness.

six structural problems of the vector-DB pipeline

latency — p95 1–3s, sometimes 60s, every recall is multiple network calls
cost — embedding API + context-window tokens, every query
no temporal grounding — vector DBs don't know what was true when
no provenance — weak attribution, no audit trail
no graceful degradation — below threshold = empty result
no consolidation — store grows without bound, no sleep

four cognitive gaps specific to autonomous agents

no first-class goals — text-stored; state, parent-child, success criteria in agent code
no first-class beliefs — confidence + revision history live in agent code, badly
no introspection — agent can't ask "what did I learn?" — no event log
no clean unlearn — removal cascades through atoms, beliefs, procedures · nobody handles this

▾ these aren't bugs · they are properties of the wrong primitive. ▾

today six-step glue pipeline

01 · embed every conversation
02 · store vectors · pinecone / qdrant / pgvector
03 · graph DB for relations · neo4j / kuzu
04 · at recall: embed query · cosine top-k
05 · rerank with another LLM call
06 · stuff chunks into prompt · pray

p95 · 1–3s · some hit 60s

cost · $/recall · embed + tokens

goals? · no · not first-class

unlearn? · DELETE · destructive

agidb one substrate · seven floors

let db = Agidb::open("./memory.agidb").await?;

db.observe("sarah recommended bawri in bandra last weekend").await?;
db.assert_belief(Belief::new("sarah likes thai food").with_confidence(0.8)).await?;
db.set_goal(Goal::new("find a thai place for the team dinner")).await?;

let r = db.recall("what thai place did sarah mention?").await?;
// → "Bawri"  conf=0.94  tier=Exact  goal-biased  source=msg#1217

p95 · < 50ms · local POPCOUNT

cost · $0 · no API in read

goals? · typed · state machines

unlearn? · cascading · + self-vector subtract

[ 002 ] the seven cognitive floors core architecture

seven things an
autonomous agent persists.

50 years of memory research (tulving, squire, baddeley, mcclelland) name five biological memory systems. add the two things an agent needs that a biological organism doesn't (goals/beliefs and the self-model) and you get the seven floors agidb stores as first-class typed shapes.

notation

↗ = surprise-gated promotion · ↻ = revision · ⊕ = bound role-filler · α = EMA rate

brain mapping

floor 1 → primary cortices · floor 2 → dlPFC · floor 3 → hippocampus · floor 4 → neocortex · floor 5 → basal ganglia · floor 6 → PFC · floor 7 → DMN + dlPFC

phase status

✅ shipped from sochdb v1 inheritance · ⬜ phase 9–16 (v2.0 + v2.1)

self-model DMN · dlPFC ⬜ phase 10 · v2.0

append-only log of every learning event · self-vector EMA — a slowly drifting 8192-bit centroid representing "what kind of agent am I"

db.what_did_i_learn(since) · db.self_vector() · db.attention_trace(recall_id)

goals + beliefs PFC ⬜ phase 9 · v2.0

what the agent wants (state machine: Active / Paused / Completed / Abandoned) and what it thinks is true (confidence + evidence + revision audit)

db.set_goal(g) · db.assert_belief(b) · db.revise_belief(evidence) · db.what_do_i_believe(about)

procedural basal ganglia · cerebellum ⬜ phase 9+

skills and workflows with execution traces and success-count statistics — typed episode shape

db.observe_procedure(p) · db.record_execution(p, trace) · db.procedure_stats(p)

semantic neocortex ✅ inherited · phase 6

decoupled general knowledge · facts consolidated from N≥3 episodic patterns into a SemanticAtom

db.consolidate() · db.what_about(concept_id)

episodic hippocampus · CA1 · CA3 ✅ inherited · phase 4

events with bi-temporal stamps and HDC signatures · the core typed shape · multimodal in v2.1

db.observe(text, ctx) · db.observe_multimodal(video, audio, text) · db.recall(query)

working dlPFC · ~7±2 slots ✅ inherited

active context · session-scoped · recency-weighted retrieval over episodic — no separate table

db.recall(Query::with_session(sid))

sensory primary cortices (V1 / A1 / S1) ⬜ phase 10 · multimodal v2.1

raw signal ring buffer with surprise-gated promotion to episodic · v2.1: multimodal (video + audio + text) with brain-calibrated θ_brain

db.observe_sensory(frame) · db.observe_multimodal(...) · surprise > θ_brain ↗ episodic
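
the goals + beliefs floor above describes a goal state machine (Active / Paused / Completed / Abandoned) and beliefs that carry confidence, evidence, and a revision audit. a minimal sketch of what such typed shapes could look like follows. every type, method, and the allowed-transition table here is a hypothetical illustration, not agidb's actual definitions, and the confidence update rule is an assumption chosen for simplicity.

```rust
// hypothetical types for illustration; not agidb's real definitions.

#[derive(Debug, Clone, Copy, PartialEq)]
enum GoalState {
    Active,
    Paused,
    Completed,
    Abandoned,
}

impl GoalState {
    /// completed and abandoned are treated as terminal (an assumption).
    fn can_transition_to(self, next: GoalState) -> bool {
        use GoalState::*;
        matches!(
            (self, next),
            (Active, Paused)
                | (Paused, Active)
                | (Active, Completed)
                | (Active, Abandoned)
                | (Paused, Abandoned)
        )
    }
}

#[derive(Debug)]
struct Revision {
    old_confidence: f32,
    new_confidence: f32,
    evidence: String,
}

#[derive(Debug)]
struct Belief {
    claim: String,
    confidence: f32,
    history: Vec<Revision>, // append-only revision audit
}

impl Belief {
    fn new(claim: &str, confidence: f32) -> Self {
        Belief {
            claim: claim.to_string(),
            confidence: confidence.clamp(0.0, 1.0),
            history: Vec::new(),
        }
    }

    /// nudge confidence toward 1.0 (supporting) or 0.0 (contradicting) by `weight`
    /// and record the change instead of overwriting it.
    fn revise(&mut self, evidence: &str, supports: bool, weight: f32) {
        let target = if supports { 1.0 } else { 0.0 };
        let old = self.confidence;
        self.confidence = (old + weight * (target - old)).clamp(0.0, 1.0);
        self.history.push(Revision {
            old_confidence: old,
            new_confidence: self.confidence,
            evidence: evidence.to_string(),
        });
    }
}

fn main() {
    let mut b = Belief::new("sarah likes thai food", 0.8);
    b.revise("sarah recommended bawri", true, 0.5);
    println!("{:?}", b);
    println!(
        "completed -> active allowed: {}",
        GoalState::Completed.can_transition_to(GoalState::Active)
    );
}
```

keeping the transition table inside the type means an invalid move (say, reviving a completed goal) is rejected by the store rather than by agent code, which is the gap the premise section points at.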

[ 003 ] content-addressable retrieval

give it a fragment.
it returns the whole pattern.

classical DBs need keys. vector DBs need similar embeddings. agidb takes the partial pattern itself — a few entities, a rough time, half a sentence — and lets activation converge toward the stored memory it best overlaps. this is what a hippocampus does.

memory cortex · 8192-bit space · projected 40×16 · 4 stored episodes

[ interactive grid · 40 columns × 16 rows · starts at phase 1 · awaiting query ]

stored episode patterns

ep #1217 · sarah's thai pick
ep #841 · "i live in berlin"
ep #1402 · lisbon offsite
ep #1654 · auth refactor

the four phases

01 · store · episode binds to sparse cell pattern
02 · probe · partial query activates ~35% of target
03 · propagate · activation spreads through neighbors
04 · converge · nearest attractor lights · match

░ inactive cell · ▒ stored pattern (dim) · ▓ probe activation · ▓ propagating · █ converged match
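
the probe-then-converge idea can be approximated with nothing more than XOR and POPCOUNT. the sketch below stores four random 8192-bit signatures, builds a probe that keeps roughly 35% of one target and fills the rest with noise, and returns the nearest stored pattern with a similarity tier instead of an empty result. the `Tier` names other than Exact, the thresholds, and every function name are assumptions for illustration. the real system is described as spreading activation, which this nearest-neighbor scan only stands in for.

```rust
const WORDS: usize = 128; // 128 × 64 = 8192 bits
type Hv = [u64; WORDS];

/// deterministic pseudo-random signature (splitmix64), standing in for a stored episode.
fn random_hv(seed: u64) -> Hv {
    let mut state = seed;
    let mut hv = [0u64; WORDS];
    for w in hv.iter_mut() {
        state = state.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        *w = z ^ (z >> 31);
    }
    hv
}

/// hamming distance via XOR + POPCOUNT.
fn hamming(a: &Hv, b: &Hv) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

#[derive(Debug, PartialEq)]
enum Tier {
    Exact,
    Approximate,
    Weak,
}

/// nearest stored signature; always returns a graded answer when the store is non-empty.
fn recall(store: &[(u32, Hv)], probe: &Hv) -> Option<(u32, f32, Tier)> {
    let (id, dist) = store
        .iter()
        .map(|(id, hv)| (*id, hamming(hv, probe)))
        .min_by_key(|&(_, d)| d)?;
    let sim = 1.0 - dist as f32 / (WORDS as f32 * 64.0);
    let tier = if sim > 0.9 {
        Tier::Exact
    } else if sim > 0.6 {
        Tier::Approximate
    } else {
        Tier::Weak
    };
    Some((id, sim, tier))
}

/// keep the first `keep_words` words of the target, replace the rest with noise.
fn partial_probe(target: &Hv, keep_words: usize, noise_seed: u64) -> Hv {
    let mut probe = random_hv(noise_seed);
    probe[..keep_words].copy_from_slice(&target[..keep_words]);
    probe
}

fn main() {
    let store: Vec<(u32, Hv)> = [1217u32, 841, 1402, 1654]
        .iter()
        .map(|&id| (id, random_hv(id as u64)))
        .collect();
    // 45 of 128 words is about 35% of the target pattern
    let probe = partial_probe(&store[0].1, 45, 999);
    let (id, sim, tier) = recall(&store, &probe).expect("store is non-empty");
    println!("matched ep #{id} sim={sim:.3} tier={tier:?}");
}
```

unrelated random signatures sit near 50% bit agreement, so a probe that preserves a third of the target still stands well clear of the other stored patterns.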

[ 004 ] multimodal sensory encoding v2.1 · phase 14

one episode.
three modalities.
one bound signature.

v2.1 adds a multimodal sensory pipeline using the same frozen encoder stack as Meta FAIR's TRIBE v2 brain-encoding foundation model. each modality projects to an 8192-bit signature via Charikar '02 random projection (training-free; the expected fraction of differing bits between two signatures is θ/π for latents at angle θ). modalities are bound via VSA role-filler XOR, which stays factorable, unlike attention fusion.
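
a minimal sketch of the sign(R·x) projection, using only the standard library. the tiny PRNG, the function names, and the 64-dimensional toy latents are assumptions for illustration. real encoder outputs would be 1024-d or 2048-d, and a production build would cache R rather than regenerate it.

```rust
/// splitmix64 step: a small deterministic PRNG so the sketch needs no crates.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E3779B97F4A7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

/// standard normal sample via box-muller.
fn gaussian(state: &mut u64) -> f64 {
    let u1 = ((next_u64(state) >> 11) as f64 + 1.0) / (1u64 << 53) as f64; // in (0, 1]
    let u2 = (next_u64(state) >> 11) as f64 / (1u64 << 53) as f64; // in [0, 1)
    (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
}

/// sign(R·x): one output bit per random hyperplane. R is regenerated from `seed`,
/// so every vector that will be compared must use the same seed and dimension.
fn sign_project(x: &[f64], bits: usize, seed: u64) -> Vec<bool> {
    let mut state = seed;
    (0..bits)
        .map(|_| {
            let dot: f64 = x.iter().map(|xi| xi * gaussian(&mut state)).sum();
            dot >= 0.0
        })
        .collect()
}

/// fraction of bit positions where two signatures disagree.
fn hamming_fraction(a: &[bool], b: &[bool]) -> f64 {
    a.iter().zip(b).filter(|(p, q)| p != q).count() as f64 / a.len() as f64
}

fn main() {
    let dim = 64;
    let mut x = vec![0.0; dim];
    x[0] = 1.0;
    // y sits at 60 degrees from x
    let mut y = vec![0.0; dim];
    y[0] = 0.5;
    y[1] = 3.0f64.sqrt() / 2.0;

    let sx = sign_project(&x, 8192, 42);
    let sy = sign_project(&y, 8192, 42);
    // expected to land near θ/π = 1/3 for a 60 degree angle
    println!("differing fraction: {:.3}", hamming_fraction(&sx, &sy));
}
```

because each bit is an independent hyperplane test, the estimate tightens as the signature grows. at 8192 bits the standard error on the differing fraction is roughly half a percentage point.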

▸ video 64 frames · 256×256 · 4s window

V-JEPA 2 Gigantic-256

1.2B params · self-supervised on 1M+ hours · ViT + 3D-RoPE · EMA target

CPU 1.5s · GPU 200ms

1024d latent → 8192-bit ŝᵥ

▸ audio 60s @ 16kHz · resampled 50→2 Hz

Wav2Vec-BERT 2.0

multilingual SSL · frame-level 1024-d latents · mean-pooled

CPU 400ms · GPU 80ms

1024d latent → 8192-bit ŝₐ

▸ text 1024 tokens · preceding context

Llama-3.2-3B

layer-32 mean-pool hidden state · no generation · forward only

CPU 200ms · GPU 30ms

2048d latent → 8192-bit ŝₜ

↓ VSA role-filler XOR · factorable

episode = ROLE_v⊗ŝᵥ ⊕ ROLE_a⊗ŝₐ ⊕ ROLE_t⊗ŝₜ ⊕ ROLE_τ⊗ŝ_τ

recover any modality ⇒ ŝₐ ≈ episode ⊗ ROLE_a (XOR binding is its own inverse), then nearest-neighbor cleanup against the audio codebook

attention fusion · TRIBE / mem0 / letta

dense hidden state. components entangled. cannot recover original audio from a fused episode.

↘ ↓ ↙ ▓▓▓ entangled ▓▓▓

VSA · XOR role-filler · agidb

each modality bound to its own role HV. unbinding with ⊗ ROLE recovers the modality signature after clean-up.

↘ ↓ ↙ ⊕ [ factorable bound HV ]
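
a sketch of the factorable binding described above, under stated assumptions: ⊗ is taken as bitwise XOR, ⊕ as a per-bit majority vote (with a fixed tie-breaker vector when the number of terms is even), and cleanup as a nearest-neighbor scan. the function names and seeds are hypothetical, and the signatures are random stand-ins for real encoder output.

```rust
const WORDS: usize = 128; // 128 × 64 = 8192 bits
type Hv = [u64; WORDS];

fn random_hv(seed: u64) -> Hv {
    let mut state = seed;
    let mut hv = [0u64; WORDS];
    for w in hv.iter_mut() {
        state = state.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        *w = z ^ (z >> 31);
    }
    hv
}

/// bind (⊗): bitwise XOR. applying the same role again undoes it.
fn bind(a: &Hv, b: &Hv) -> Hv {
    let mut out = [0u64; WORDS];
    for i in 0..WORDS {
        out[i] = a[i] ^ b[i];
    }
    out
}

/// bundle (⊕): per-bit majority vote over all terms.
fn bundle(items: &[Hv]) -> Hv {
    let mut all: Vec<Hv> = items.to_vec();
    if all.len() % 2 == 0 {
        all.push(random_hv(0x7135)); // fixed tie-breaker keeps the vote odd
    }
    let mut out = [0u64; WORDS];
    for w in 0..WORDS {
        for bit in 0..64 {
            let ones = all.iter().filter(|hv| (hv[w] >> bit) & 1 == 1).count();
            if 2 * ones > all.len() {
                out[w] |= 1u64 << bit;
            }
        }
    }
    out
}

fn hamming(a: &Hv, b: &Hv) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// cleanup: index of the codebook entry closest to a noisy signature.
fn cleanup(noisy: &Hv, codebook: &[Hv]) -> usize {
    (0..codebook.len())
        .min_by_key(|&i| hamming(noisy, &codebook[i]))
        .expect("codebook must not be empty")
}

fn main() {
    // roles and fillers in the order video, audio, text, time
    let roles: Vec<Hv> = (0..4).map(|i| random_hv(100 + i)).collect();
    let fillers: Vec<Hv> = (0..4).map(|i| random_hv(200 + i)).collect();
    let bound: Vec<Hv> = roles.iter().zip(&fillers).map(|(r, f)| bind(r, f)).collect();
    let episode = bundle(&bound);

    // unbind the audio role, then clean up against an audio codebook
    let recovered = bind(&episode, &roles[1]);
    let codebook = vec![random_hv(300), fillers[1], random_hv(301)];
    println!("audio codebook hit: {}", cleanup(&recovered, &codebook));
}
```

the unbound result is only a noisy copy of ŝₐ (the other three terms act as noise in the vote), which is why the cleanup step against a codebook is part of recovery rather than an optional extra.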

[ 005 ] brain alignment · BAMS v2.1 · phase 15–16

not a metaphor.
a benchmark.

because agidb uses the same encoder stack as TRIBE v2 (Meta FAIR, March 2026 · the brain-encoding foundation model that won Algonauts 2025), its internal HDC signatures can be benchmarked directly against TRIBE-predicted cortical activations across 720 fMRI subjects. that benchmark is BAMS — representational similarity analysis across six functional networks.

cortical flatmap · schaefer 1000-parcel atlas · 6 networks · TRIBE v2 · n=720 · OOD r̄=0.215

DMN · default mode · r=0.72
dorsal attention · r=0.64
frontoparietal · r=0.81
somatomotor · r=0.41
visual · r=0.88
ventral attention · r=0.55

RSA correlation · agidb signatures × TRIBE-predicted BOLD

legend: r < 0.3 · 0.3–0.6 · > 0.6 · diag mean · 0.67
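
for readers unfamiliar with representational similarity analysis, a toy sketch: build one dissimilarity matrix from signatures (Hamming distance) and one from predicted responses, then correlate their upper triangles. the data here is made up, Pearson correlation is an assumption (rank correlation is also common in RSA), and none of the names come from the BAMS harness. the last lines simply check that the six per-network values listed above average to the quoted 0.67.

```rust
/// pearson correlation of two equal-length slices.
fn pearson(a: &[f64], b: &[f64]) -> f64 {
    let n = a.len() as f64;
    let (ma, mb) = (a.iter().sum::<f64>() / n, b.iter().sum::<f64>() / n);
    let cov: f64 = a.iter().zip(b).map(|(x, y)| (x - ma) * (y - mb)).sum();
    let va: f64 = a.iter().map(|x| (x - ma).powi(2)).sum();
    let vb: f64 = b.iter().map(|y| (y - mb).powi(2)).sum();
    cov / (va.sqrt() * vb.sqrt())
}

/// upper triangle of a representational dissimilarity matrix over `items`.
fn rdm<T>(items: &[T], dist: impl Fn(&T, &T) -> f64) -> Vec<f64> {
    let mut out = Vec::new();
    for i in 0..items.len() {
        for j in (i + 1)..items.len() {
            out.push(dist(&items[i], &items[j]));
        }
    }
    out
}

fn main() {
    // toy stimuli: three signatures and three scalar "predicted responses"
    let sigs: Vec<u64> = vec![0b0000, 0b0001, 0b1111];
    let bold: Vec<f64> = vec![0.0, 1.0, 4.0];

    let rdm_sig = rdm(&sigs, |a, b| (a ^ b).count_ones() as f64);
    let rdm_bold = rdm(&bold, |a, b| (a - b).abs());
    println!("rsa r = {:.3}", pearson(&rdm_sig, &rdm_bold));

    // per-network values as listed on this page
    let networks = [0.72, 0.64, 0.81, 0.41, 0.88, 0.55];
    let mean = networks.iter().sum::<f64>() / networks.len() as f64;
    println!("mean across six networks = {:.2}", mean);
}
```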

θ_brain · brain-calibrated surprise gate phase 15 calibration

surprise(t) = 1 − ham_sim( s(t), bundle( s[t-K..t] ) )

fit against TRIBE-predicted neural surprise · associative cortex (TPJ · dlPFC · DMN) θ_brain = 0.52
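
a sketch of the surprise gate under two loud assumptions: `bundle` is a per-bit majority vote over the last K signatures, and `ham_sim` is the centered similarity 1 − 2·d/D, so unrelated signatures score near 0 and surprise near 1. the page does not define either, and with a different `ham_sim` the 0.52 threshold would behave differently. all names are hypothetical.

```rust
const WORDS: usize = 128; // 128 × 64 = 8192 bits
const THETA_BRAIN: f64 = 0.52; // threshold quoted on this page
type Hv = [u64; WORDS];

fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E3779B97F4A7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

fn random_hv(seed: u64) -> Hv {
    let mut state = seed;
    let mut hv = [0u64; WORDS];
    for w in hv.iter_mut() {
        *w = next_u64(&mut state);
    }
    hv
}

/// copy of `base` with roughly 1 bit in 16 flipped (AND of four random words).
fn noisy_copy(base: &Hv, seed: u64) -> Hv {
    let mut state = seed;
    let mut out = *base;
    for w in out.iter_mut() {
        let mut mask = u64::MAX;
        for _ in 0..4 {
            mask &= next_u64(&mut state);
        }
        *w ^= mask;
    }
    out
}

/// per-bit majority over the window; ties resolve to 0.
fn bundle(window: &[Hv]) -> Hv {
    let mut out = [0u64; WORDS];
    for w in 0..WORDS {
        for bit in 0..64 {
            let ones = window.iter().filter(|hv| (hv[w] >> bit) & 1 == 1).count();
            if 2 * ones > window.len() {
                out[w] |= 1u64 << bit;
            }
        }
    }
    out
}

/// centered hamming similarity in [-1, 1] (assumed definition).
fn ham_sim(a: &Hv, b: &Hv) -> f64 {
    let d: u32 = a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum();
    1.0 - 2.0 * d as f64 / (WORDS as f64 * 64.0)
}

/// surprise(t) = 1 − ham_sim( s(t), bundle( s[t-K..t] ) )
fn surprise(current: &Hv, window: &[Hv]) -> f64 {
    1.0 - ham_sim(current, &bundle(window))
}

fn main() {
    let scene = random_hv(1);
    let window: Vec<Hv> = (0..5).map(|k| noisy_copy(&scene, 10 + k)).collect();

    let familiar = noisy_copy(&scene, 99); // more of the same scene
    let novel = random_hv(2); // something unrelated
    for (label, frame) in [("familiar", &familiar), ("novel", &novel)] {
        let s = surprise(frame, &window);
        println!("{label}: surprise={s:.2} promote={}", s > THETA_BRAIN);
    }
}
```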

┌─ inheriting from
TRIBE v2 · Meta FAIR · arxiv 2507.22229 · mar '26 · CC-BY-NC · 720 subjects · 70k voxel-level
V-JEPA 2 · arxiv 2506.09985 · Gigantic-256 · 1.2B params
Charikar '02 · STOC · similarity estimation via rounding · JL preservation
Algonauts 2025 · TRIBE v1 · 1st of 263 teams

[ 006 ] bi-temporal supersession ✅ inherited · phase 2

it remembers
what was true when.

new facts don't overwrite. they're stamped t_valid_start = now; the old fact gets t_valid_end = now − 1ms and a superseded_by pointer. ask the db "as of" any date — get the answer that was true then.

┌── valid-time axis ──

2024-01 (jan) · 2025-08 (aug) · 2026-03 (mar) · today (may '26)

i live in mumbai · superseded

i live in berlin · superseded

i live in lisbon · current

recall("where do i live?") as of 2025-09-15

berlin

tier · exact · confidence · 0.97 · source · msg #841
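
the supersession rule and the "as of" lookup can be sketched in a few lines. this is an illustration only: the `Fact` and `Timeline` types are hypothetical, it tracks a single subject, timestamps are milliseconds from an arbitrary origin (2024-01-01), and the ids other than #841 are invented.

```rust
const DAY_MS: i64 = 86_400_000;

#[derive(Debug, Clone)]
struct Fact {
    id: u32,
    text: String,
    t_valid_start: i64,       // ms since the sketch's origin
    t_valid_end: Option<i64>, // None = still current
    superseded_by: Option<u32>,
}

#[derive(Default)]
struct Timeline {
    facts: Vec<Fact>,
}

impl Timeline {
    /// new facts never overwrite: the current fact is closed at now − 1ms
    /// and pointed at its successor.
    fn assert_fact(&mut self, id: u32, text: &str, now: i64) {
        if let Some(prev) = self.facts.iter_mut().find(|f| f.t_valid_end.is_none()) {
            prev.t_valid_end = Some(now - 1);
            prev.superseded_by = Some(id);
        }
        self.facts.push(Fact {
            id,
            text: text.to_string(),
            t_valid_start: now,
            t_valid_end: None,
            superseded_by: None,
        });
    }

    /// the fact whose validity interval contains `t`, if any.
    fn as_of(&self, t: i64) -> Option<&Fact> {
        self.facts
            .iter()
            .find(|f| f.t_valid_start <= t && f.t_valid_end.map_or(true, |end| t <= end))
    }
}

fn main() {
    let mut tl = Timeline::default();
    tl.assert_fact(101, "i live in mumbai", 0); // 2024-01-01
    tl.assert_fact(841, "i live in berlin", 578 * DAY_MS); // 2025-08-01
    tl.assert_fact(1901, "i live in lisbon", 790 * DAY_MS); // 2026-03-01

    let then = 623 * DAY_MS; // 2025-09-15
    if let Some(f) = tl.as_of(then) {
        println!("as of 2025-09-15: {} (msg #{})", f.text, f.id);
    }
    println!("berlin row now: {:?}", tl.facts[1]);
}
```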

[ 007 ] sleep consolidation · self-vector EMA phase 6 ✅ · phase 10 ⬜

the substrate
that sleeps & becomes.

every five minutes a background worker clusters episodic patterns into semantic atoms, detects contradictions, decays unused entries, and — new in v2 — updates the self-vector EMA: self_vec ← (1−α)·self_vec + α·bundle(consolidated). inspired by V-JEPA 2's target encoder + TRIBE's per-subject embedding.

cluster

scan 7d · hamming group

surprise

score vs current beliefs

bundle

N≥3 → semantic atom

contradict

overlap t_valid → supersede

self-vec

EMA update α=0.05

decay

unused 90d → cold

compact

rewrite signatures.dat
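
three of the steps above (cluster, bundle, self-vec) can be sketched together. assumptions, all mine: clustering is a greedy pass against each cluster's first member, bundling is a per-bit majority, and the self-vector is held as one f32 accumulator per bit with bits mapped to ±1 before the EMA. the radius of 2048 bits is arbitrary. none of these names are agidb's.

```rust
const WORDS: usize = 128; // 128 × 64 = 8192 bits
type Hv = [u64; WORDS];

fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E3779B97F4A7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

fn random_hv(seed: u64) -> Hv {
    let mut state = seed;
    let mut hv = [0u64; WORDS];
    for w in hv.iter_mut() {
        *w = next_u64(&mut state);
    }
    hv
}

/// copy of `base` with roughly 1 bit in 16 flipped.
fn noisy_copy(base: &Hv, seed: u64) -> Hv {
    let mut state = seed;
    let mut out = *base;
    for w in out.iter_mut() {
        let mut mask = u64::MAX;
        for _ in 0..4 {
            mask &= next_u64(&mut state);
        }
        *w ^= mask;
    }
    out
}

fn hamming(a: &Hv, b: &Hv) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// per-bit majority; ties resolve to 0.
fn bundle(items: &[Hv]) -> Hv {
    let mut out = [0u64; WORDS];
    for w in 0..WORDS {
        for bit in 0..64 {
            let ones = items.iter().filter(|hv| (hv[w] >> bit) & 1 == 1).count();
            if 2 * ones > items.len() {
                out[w] |= 1u64 << bit;
            }
        }
    }
    out
}

/// greedy clustering: join the first cluster whose founding member is within `radius` bits.
fn cluster(episodes: &[Hv], radius: u32) -> Vec<Vec<usize>> {
    let mut clusters: Vec<Vec<usize>> = Vec::new();
    for (i, ep) in episodes.iter().enumerate() {
        let hit = clusters.iter().position(|c| hamming(&episodes[c[0]], ep) <= radius);
        match hit {
            Some(p) => clusters[p].push(i),
            None => clusters.push(vec![i]),
        }
    }
    clusters
}

/// clusters with at least `min_members` episodes become one bundled semantic atom each.
fn consolidate(episodes: &[Hv], radius: u32, min_members: usize) -> Vec<Hv> {
    cluster(episodes, radius)
        .iter()
        .filter(|c| c.len() >= min_members)
        .map(|c| bundle(&c.iter().map(|&i| episodes[i]).collect::<Vec<Hv>>()))
        .collect()
}

/// self_vec ← (1−α)·self_vec + α·bundle(consolidated), with bits mapped to ±1.
fn ema_update(self_vec: &mut [f32], bundled: &Hv, alpha: f32) {
    for (i, v) in self_vec.iter_mut().enumerate() {
        let target = if (bundled[i / 64] >> (i % 64)) & 1 == 1 { 1.0 } else { -1.0 };
        *v = (1.0 - alpha) * *v + alpha * target;
    }
}

fn main() {
    let topic = random_hv(1);
    let episodes = vec![
        noisy_copy(&topic, 11),
        random_hv(2), // unrelated one-off
        noisy_copy(&topic, 12),
        noisy_copy(&topic, 13),
    ];
    let atoms = consolidate(&episodes, 2048, 3);
    let mut self_vec = vec![0.0f32; WORDS * 64];
    if !atoms.is_empty() {
        ema_update(&mut self_vec, &bundle(&atoms), 0.05);
    }
    println!("{} atom(s); self_vec[0] = {:+.2}", atoms.len(), self_vec[0]);
}
```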


┌─ last consolidation report

scanned · 12,471 episodes · 7d
clusters · 218 candidates · ≥ 3 members
surprise > θ · 1,084 (8.7%) · brain-cal
semantic atoms · + 84 new · bundled
contradictions · 11 resolved · superseded
self-vec drift · Δ 184 bits · hamming
decayed · 2,103 archived · cold
duration · 1.84s · idle

self_vector trajectory · 30d (day -30 → today) · α=0.05 · ||drift|| = 1,842 bits

[ 008 ] non-destructive unlearn · self-vector subtraction ⬜ phase 11 · v2.0

the only substrate
that can truly forget.

GDPR Article 17. poisoned memories. right-to-be-forgotten. a hard DELETE leaves the self-model contaminated by centroid drift from the deleted data. agidb cascades through episodes → beliefs → semantic atoms → procedures, tombstones non-destructively for a 30-day recovery window, and subtracts the deleted signatures from the self-vector itself.

identify cascade

find episodes · beliefs · atoms · procedures referencing target · compute dependency graph

47 episodes · 12 beliefs · 3 atoms · 1 procedure

tombstone

mark t_tombstoned = now · invalidate in mmap · concept HV withdrawn · removed from inverted index

47 signatures invalidated · 30-day recovery window

cascade revise

beliefs with evidence under threshold → confidence reduced or withdrawn · atoms recomputed without removed evidence

8 beliefs withdrawn · 4 revised · 1 procedure degraded

self-vector subtract

the step nobody else does. self_vec ← self_vec − α·bundle(tombstoned_sigs). without it, the self-model still "remembers" via centroid contamination.

Δ self-vector = 184 bits · drift recorded in self_vector_history

self_vec ← self_vec − α · bundle(tombstoned)

audit

emit LearningEvent::Unlearned with target_ref, cascade_size, self_vec_drift, dependency_graph_id, reason — permanent record

audit entry persisted · cannot be unlearned itself
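
the five steps above, compressed into a toy sketch. everything here is an assumption made for illustration: signatures are 64-bit stand-ins for 8192-bit ones, the cascade only covers episodes and beliefs, confidence is scaled by the share of surviving evidence, and drift is reported as a summed magnitude rather than in bits. the struct and function names are hypothetical.

```rust
use std::collections::HashSet;

struct Episode {
    id: u32,
    sig: u64, // 64-bit stand-in for an 8192-bit signature
    t_tombstoned: Option<i64>,
}

struct Belief {
    claim: String,
    evidence: Vec<u32>, // episode ids
    confidence: f32,
    withdrawn: bool,
}

#[derive(Debug)]
struct UnlearnAudit {
    episodes_tombstoned: usize,
    beliefs_withdrawn: usize,
    beliefs_revised: usize,
    self_vec_drift: f32,
}

/// per-bit majority; ties resolve to 0.
fn majority(sigs: &[u64]) -> u64 {
    let mut out = 0u64;
    for bit in 0..64 {
        let ones = sigs.iter().filter(|s| (**s >> bit) & 1 == 1).count();
        if 2 * ones > sigs.len() {
            out |= 1u64 << bit;
        }
    }
    out
}

fn unlearn(
    episodes: &mut [Episode],
    beliefs: &mut [Belief],
    self_vec: &mut [f32; 64],
    targets: &HashSet<u32>,
    now: i64,
    alpha: f32,
    min_evidence: usize,
) -> UnlearnAudit {
    // tombstone: mark, don't delete
    let mut removed = Vec::new();
    for ep in episodes.iter_mut().filter(|e| targets.contains(&e.id) && e.t_tombstoned.is_none()) {
        ep.t_tombstoned = Some(now);
        removed.push(ep.sig);
    }
    // cascade revise: drop tombstoned evidence, then reduce or withdraw
    let (mut withdrawn, mut revised) = (0usize, 0usize);
    for b in beliefs.iter_mut().filter(|b| !b.withdrawn) {
        let before = b.evidence.len();
        b.evidence.retain(|id| !targets.contains(id));
        if b.evidence.len() == before {
            continue;
        }
        if b.evidence.len() < min_evidence {
            b.withdrawn = true;
            b.confidence = 0.0;
            withdrawn += 1;
        } else {
            b.confidence *= b.evidence.len() as f32 / before as f32;
            revised += 1;
        }
    }
    // self-vector subtract: self_vec ← self_vec − α · bundle(tombstoned)
    let mut drift = 0.0f32;
    if !removed.is_empty() {
        let bundled = majority(&removed);
        for (i, v) in self_vec.iter_mut().enumerate() {
            let sign: f32 = if (bundled >> i) & 1 == 1 { 1.0 } else { -1.0 };
            *v -= alpha * sign;
            drift += alpha;
        }
    }
    UnlearnAudit {
        episodes_tombstoned: removed.len(),
        beliefs_withdrawn: withdrawn,
        beliefs_revised: revised,
        self_vec_drift: drift,
    }
}

fn main() {
    let mut episodes = vec![
        Episode { id: 1, sig: 0xFF, t_tombstoned: None },
        Episode { id: 2, sig: 0x0F, t_tombstoned: None },
        Episode { id: 3, sig: 0xF0, t_tombstoned: None },
    ];
    let mut beliefs = vec![
        Belief { claim: "a".into(), evidence: vec![1, 2], confidence: 0.8, withdrawn: false },
        Belief { claim: "b".into(), evidence: vec![1, 3], confidence: 0.8, withdrawn: false },
    ];
    let mut self_vec = [0.0f32; 64];
    let targets: HashSet<u32> = [1u32, 2].iter().copied().collect();
    let audit = unlearn(&mut episodes, &mut beliefs, &mut self_vec, &targets, 1_000, 0.05, 1);
    println!("{:?} · '{}' withdrawn={}", audit, beliefs[0].claim, beliefs[0].withdrawn);
}
```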

[ 009 ] 5-year trajectory · v2.0 → v2.5

the database
AGI will run on.

v2.0 is the substrate. v2.1 is brain-aligned multimodal. v2.2–v2.5 build the cognitive engine on top: pattern completion, formal belief revision, causal claims, world model fragments, closed-loop self-modification with formal safety guarantees. five years committed.

v2.0

2026 · m9

substrate

seven floors · typed cognitive shapes
goals + beliefs first-class
self-model · learning log · unlearn
neurosymbolic interface

v2.1

2026 · m12

brain-aligned

V-JEPA 2 + W2V-BERT + Llama-3.2-3B sensory
brain-calibrated θ_brain
BAMS benchmark · ICLR '26 paper

v2.2

2027

cognitive engine v0.1

Hopfield pattern completion
AGM-formal belief revision
analogical retrieval via HDC bind

v2.3

2028

causal + world model

causal claim storage · intervention semantics
world model fragments
on-line learning state

v2.4

2029–30

production · distributed

enterprise tier · distributed mode
formal safety on self-modification
BCI sensory experimental · Brain-JEPA

v2.5

2031

AGI-grade

closed-loop self-modification
causal reasoning over beliefs
full cognitive engine

[ 010 ] against the field

memory frameworks above the LLM.
agidb is a substrate beneath.

mem0, letta, zep, cognee, hippoRAG, MemMachine all sit above the LLM as python frameworks. agidb is a rust substrate beneath the agent loop. different layer, different shape.

layer

read path

representation

cognitive prims

brain-align

agidb · rust substrate

beneath agent

no LLM · POPCOUNT

HDC · 8192-bit · VSA bind

goals · beliefs · self-vec · unlearn

BAMS · TRIBE-aligned

mem0 · $24M · 41K★

python framework

LLM optional · 1–3s p95

vector + graph + kv

letta · memgpt · $10M · 22K★

agent runtime

LLM in loop

memory blocks · postgres

core mem text

zep · graphiti · 25.7K★

python + neo4j

cypher + embed hybrid

temporal KG

cognee · €7.5M · 12K★

python pipeline

LLM-heavy

vector + graph hybrid

hippoRAG / hippoMM · OSU-NLP · linyueqian

app on LLM

PPR over KG

KG + dentate gyrus

architectural claim only

hyperon · singularitynet

research substrate

MeTTa interpreter

metagraph · AtomSpace

yes

agentmemory · rohitg00 · rust

rust server

BM25 + HNSW

RocksDB hybrid

[ 011 ] why now · five converging trends

may 2026 is
the right window.

agent memory became a category.

real funding · the question shifted from whether to how.

mem0 · $24M · oct '25
letta · $10M · sep '24 · felicis
cognee · €7.5M · feb '26 · pebblebed

HDC matured for production.

torchhd · karunaratne nature electronics · pathHD · HPE hippocampus papers.

31× lower latency vs vector dbs
14× lower token cost
math is no longer experimental

rust embedded dbs grew up.

duckdb · lancedb · redb · surrealdb · tigerbeetle — single binary is normal.

redb · pure rust · ACID · MVCC
cargo add · sqlite-shaped story
MCP reaches agents directly

frontier labs aren't shipping substrates.

anthropic memory tool = CRUD over /memories. openai memory = product feature. gemini personal context = product feature. vendor-neutral wedge is open.

anthropic · sep '25 · file directory
openai · apr '25 · feature
nobody else · open

TRIBE v2 made brain-alignment tractable.

meta FAIR released open weights for fMRI-prediction across 720 subjects from V-JEPA 2 + Wav2Vec-BERT + Llama-3.2-3B. shared encoder stack = free brain-alignment for agidb.

720 subjects · 70k voxel
algonauts 2025 · 1st of 263
no other agent memory can ship this

[ 012 ] the decision gate · week 12 binding

one week
collapses the bet.

every claim on this page resolves at phase 7 · week 12 against a shared harness: LongMemEval-S · LoCoMo · BEAM · cognitive (goal · belief · unlearn · multi-floor). six metrics per run · never a single number · raw logs + harness hash with every claim. three outcomes — and the project commits to one.

→ commit proceed to launch · v2.1 · fundraise

≥ Zep/Graphiti accuracy on LongMemEval-S (within 1pp F1 + LLM-judge)
≥ 3× lower p95 latency vs mem0 (target < 50ms)
≥ 3× lower token cost vs mem0 (< 2,500 tokens/query)
wins noisy-cue degradation test
all four cognitive benchmarks pass
holds across all three standard benchmarks · no cherry-picking

↦ reposition ship smaller · no v2.1 · no fundraise

within 3pp of mem0 F1
≥ 10× memory savings vs alternatives
partial cognitive benchmark pass acceptable
reposition as agidb-lite · embedded cognitive memory for edge agents
skip brain-alignment milestone

↤ retreat fold back · reposition product

>10pp behind dense baselines
gap doesn't close with reranking
cognitive benchmarks fail
reposition as ctxgraph · temporal graph memory
preserve the IP · publish what we learned

[ 013 ] ship

embedded.
one binary.
zero infra.

agidb v2 is pre-alpha · milestone-driven · open-source. v2.0 substrate ships month 9, v2.1 brain-aligned multimodal at month 12.

license · apache-2.0

platforms · macos · linux

install · curl · one line

bindings · rust · python · MCP

status · v2 pre-alpha

tests · 128 passing

v2.1 weights · ~4GB · downloaded on first use

# one-line installer · pinned release (always in sync with binaries)
$ curl -fsSL https://github.com/rohansx/agidb/releases/download/v0.1.0-dev.1/install.sh | sh

# or latest from master (may lag a few minutes after release)
$ curl -fsSL https://raw.githubusercontent.com/rohansx/agidb/master/install.sh | sh

# verify
$ agidb --version
agidb 0.1.0-dev

# use — one CLI, one store, no query language
$ agidb observe  ./mem.agidb "sarah recommended bawri in bandra"
$ agidb recall   ./mem.agidb "what thai place did sarah pick?"
$ agidb set-goal ./mem.agidb "find a thai place"
$ agidb assert-belief ./mem.agidb "sarah likes thai" --confidence 0.8
$ agidb consolidate ./mem.agidb
$ agidb stats    ./mem.agidb
$ agidb serve    ./mem.agidb      # MCP stdio for claude desktop / cursor

# pin a version, choose install dir, or override repo
$ curl -fsSL .../install.sh | sh -s -- --tag v0.1.0-dev.1 --to ~/bin
$ curl -fsSL .../install.sh | sh -s -- --repo myfork/agidb

# Cargo.toml
[dependencies]
agidb = "0.2"     # v2.0 pre-alpha

// main.rs
use agidb::{Agidb, Goal, Belief};

let db = Agidb::open("./memory.agidb").await?;

db.observe("sarah recommended bawri in bandra").await?;
db.assert_belief(Belief::new("sarah likes thai").with_confidence(0.8)).await?;
db.set_goal(Goal::new("find a thai place")).await?;

let r = db.recall("what thai place did sarah pick?").await?;
println!("{:?}", r.top());
// → Match { text: "Bawri", confidence: 0.94, tier: Exact, source: "msg#1217" }

# install
$ pip install agidb

# use
from agidb import Agidb, Goal, Belief

db = await Agidb.open("./memory.agidb")
await db.observe("sarah recommended bawri")
await db.assert_belief(Belief("sarah likes thai", confidence=0.8))
await db.set_goal(Goal("find a thai place"))

r = await db.recall("what thai place?")
print(r.top)   # → "Bawri" conf=0.94

# run as an MCP server
$ agidb mcp --path ./memory.agidb

# MCP tools exposed:
#   agidb.observe / agidb.recall / agidb.consolidate
#   agidb.set_goal / agidb.assert_belief / agidb.revise_belief
#   agidb.what_did_i_learn / agidb.unlearn

# claude desktop / cursor / any MCP-compatible agent
# now has typed cognitive memory · no glue code

// v2.1 · multimodal observation
use agidb::{Agidb, AudioClip, Modality, VideoClip};

let db = Agidb::open("./memory.agidb").await?;
// encoders download on first use · ~4GB

let ep_id = db.observe_multimodal(
    VideoClip::from_path("clip.mp4")?,
    AudioClip::from_path("clip.wav")?,
    "sarah pointed at the bawri sign".into(),
).await?;
// V-JEPA 2 · W2V-BERT · Llama-3.2-3B
// → bound 8192-bit episode HV · ~2s laptop CPU

// later: recover the audio component
let audio_sig = db.extract_modality(ep_id, Modality::Audio).await?;
