A TOML-based configuration format and Node.js library for structured AI agent memory.
Six layers. Explicit decay. Zero amnesia. Model agnostic.
architecture
Each layer serves a distinct cognitive role. Memories flow through layers as they age, compress, and consolidate.
What the agent is thinking about right now. Short-lived, high-throughput context that drives the current inference window. Expires in seconds to minutes and evicts old entries via LRU when capacity is reached.
[memory.buffer]
ttl = "5m"
capacity = 1000
strategy = "lru"
priority = 6
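Under this config, BUFFER behaves like a TTL-bounded LRU cache. A minimal sketch of that eviction behavior, assuming a simple in-memory map (`LruBuffer` is illustrative, not the library's internal class):

```typescript
// Illustrative sketch only -- not ENGRAM's internal implementation.
// Entries expire after ttlMs; at capacity, the least-recently-used
// entry is evicted, mirroring the ttl/capacity/strategy keys above.
class LruBuffer<T> {
  private entries = new Map<string, { value: T; storedAt: number }>();
  constructor(private capacity: number, private ttlMs: number) {}

  set(key: string, value: T, now = Date.now()): void {
    this.entries.delete(key); // re-insert so the key counts as most recent
    if (this.entries.size >= this.capacity) {
      // Map iterates keys in insertion order, so the first key is the LRU entry
      const lru = this.entries.keys().next().value as string;
      this.entries.delete(lru);
    }
    this.entries.set(key, { value, storedAt: now });
  }

  get(key: string, now = Date.now()): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired past its TTL
      return undefined;
    }
    this.entries.delete(key); // refresh recency on access
    this.entries.set(key, entry);
    return entry.value;
  }
}
```

JavaScript's `Map` guarantees insertion-order iteration, which is what makes the first key the least recently used once every access re-inserts its entry.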
A chronological record of what happened and when. Every entry carries a timestamp and decays according to a configurable half-life — older events grow dimmer without being erased outright.
[memory.episode]
half_life = "2h"
decay = "exponential"
max_entries = 500
priority = 5
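The half-life decay declared above follows the standard exponential form: a memory's retrieval weight halves every `half_life` interval. A sketch (`decayWeight` is a hypothetical helper, not part of the library API):

```typescript
// weight = 2^(-age / halfLife): halves every half-life, never reaches zero.
// Hypothetical helper, not an ENGRAM export.
function decayWeight(ageSeconds: number, halfLifeSeconds: number): number {
  return Math.pow(2, -ageSeconds / halfLifeSeconds);
}

// With half_life = "2h" (7200 s):
decayWeight(0, 7200);     // 1.0  -- fresh entry, full weight
decayWeight(7200, 7200);  // 0.5  -- one half-life old
decayWeight(14400, 7200); // 0.25 -- two half-lives old
```

Because the weight falls smoothly rather than hitting a cutoff, old events grow dimmer instead of disappearing all at once.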
Structured facts and relational knowledge stored as a persistent, updatable graph. What the agent knows about the world — independent of when it learned it. Nodes merge and evolve over time.
[memory.graph]
persistent = true
storage = "json"
strategy = "merge"
priority = 4
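The `merge` strategy means a new fact about an existing node updates it in place rather than appending a duplicate. A sketch under an assumed node shape (the library's actual node format is not shown here):

```typescript
// Hypothetical node shape -- illustrative only.
interface GraphNode {
  id: string;
  facts: Record<string, string>;
  updatedAt: number;
}

// Merge semantics: newer facts overwrite matching keys, untouched facts
// survive, and the node's timestamp advances.
function mergeNode(graph: Map<string, GraphNode>, incoming: GraphNode): void {
  const existing = graph.get(incoming.id);
  if (!existing) {
    graph.set(incoming.id, incoming);
    return;
  }
  existing.facts = { ...existing.facts, ...incoming.facts };
  existing.updatedAt = Math.max(existing.updatedAt, incoming.updatedAt);
}
```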
Encoded workflows, task patterns, and learned behaviors. The agent's procedural memory — how to accomplish goals, not just what goals exist. Reinforced on success, versioned across updates.
[memory.skill]
persistent = true
versioned = true
reinforce_on = "success"
priority = 3
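`reinforce_on = "success"` implies only successful executions strengthen a skill, while `versioned = true` keeps revisions distinct. A toy sketch with an assumed skill shape:

```typescript
// Hypothetical skill shape -- the real record format is not shown here.
interface Skill {
  name: string;
  strength: number; // bumped on success, per reinforce_on = "success"
  version: number;  // bumped on content updates, per versioned = true
}

function reinforce(skill: Skill, outcome: 'success' | 'failure'): Skill {
  return outcome === 'success'
    ? { ...skill, strength: skill.strength + 1 }
    : skill; // failures leave the skill untouched
}

function revise(skill: Skill): Skill {
  return { ...skill, version: skill.version + 1 };
}
```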
Lossy but searchable distillations of memories that have aged out of EPISODE. Not deleted — compressed. The residue of experience that shapes behavior without consuming context budget.
[memory.residue]
compression = "semantic"
ratio = 0.05
source_layers = ["episode"]
priority = 2
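`ratio = 0.05` means a distilled entry lands in RESIDUE at roughly 5% of its source size. Real semantic compression would run through a model; this toy stand-in only enforces the size budget, to show what the ratio controls:

```typescript
// Toy stand-in for semantic compression -- it illustrates only the size
// budget that ratio = 0.05 implies, not how the distillation is done.
function compressToResidue(text: string, ratio = 0.05): string {
  const budget = Math.max(1, Math.floor(text.length * ratio));
  return text.slice(0, budget);
}
```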
Write-once memories that define the agent's fundamental identity, values, and constraints. CORE never decays, is never overwritten, and always resolves first. The bedrock that holds everything else steady.
[memory.core]
immutable = true
ttl = "forever"
priority = 1
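The `priority` keys across the six layer tables form a total order: lower number wins, so CORE (1) always beats BUFFER (6). A sketch of that resolution rule (`resolveConflict` is hypothetical, not a library export):

```typescript
// Priority values as declared in the layer configs above: lower wins.
const LAYER_PRIORITY: Record<string, number> = {
  core: 1, residue: 2, skill: 3, graph: 4, episode: 5, buffer: 6,
};

// When two layers disagree, the higher-priority (lower-numbered) layer's
// memory is kept. Hypothetical helper, not part of the ENGRAM API.
function resolveConflict<T extends { layer: string }>(a: T, b: T): T {
  return LAYER_PRIORITY[a.layer] <= LAYER_PRIORITY[b.layer] ? a : b;
}
```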
comparison
Retrieval is not memory. ENGRAM gives agents structured, layered, decaying memory — not just a vector dump.
typical vector store
All memories treated as equal vectors — no hierarchy, no priority, no structure
Embeddings carry no intrinsic timestamp — recency is bolted on as metadata at best
Old context survives indefinitely or vanishes entirely — no graceful aging
Locked to cosine distance — semantically distant but critically important memories get lost
Every inference starts fresh — no continuity, no accumulated understanding
No way to declare that some memories must always win conflicts over others
Retrieval quality degrades as the vector store grows — signal drowns in noise
Opaque internals — memory state is unreadable without embedding model access
engram
BUFFER, EPISODE, GRAPH, SKILL, RESIDUE, CORE — each with a defined cognitive role
EPISODE layer carries full timestamps and half-life decay from the moment of storage
TTL, half-life, and decay strategy are declared per layer in plain TOML
Explicit conflict resolution — CORE always wins, BUFFER is lowest priority
Memory accumulates across sessions — agents build genuine continuity over time
BUFFER → EPISODE → GRAPH consolidation moves memories up the stack automatically
Decayed EPISODE entries compress into RESIDUE — searchable, lossy, never truly lost
Full memory architecture is a human-readable file you can read, commit, and review in a PR
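The BUFFER → EPISODE → GRAPH consolidation flow can be sketched as one pass over three in-memory layers; the `isStableFact` predicate stands in for whatever promotion criteria the library actually applies:

```typescript
type Layers = { buffer: string[]; episode: string[]; graph: string[] };

// One consolidation pass: buffer entries age into EPISODE, and episodes
// that qualify as stable facts promote into GRAPH. The predicate is a
// placeholder -- real criteria would weigh decay, importance, and semantics.
function consolidate(layers: Layers, isStableFact: (e: string) => boolean): Layers {
  const promoted = layers.episode.filter(isStableFact);
  const remaining = layers.episode.filter(e => !isStableFact(e));
  return {
    buffer: [],                                // drained upward
    episode: [...remaining, ...layers.buffer], // aged buffer entries arrive
    graph: [...layers.graph, ...promoted],     // stable knowledge accumulates
  };
}
```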
design principles
ENGRAM is opinionated by design. These four principles drive every decision in the library and the format.
A slot-based memory architecture beats a bag of vectors. Know what each memory means before you store it. ENGRAM gives every memory a declared type, layer, and role — nothing floats anonymously in latent space.
Memories don't vanish — they compress into RESIDUE and consolidate into GRAPH. Nothing is truly lost, just transformed. Forgetting should be intentional, gradual, and auditable.
Not all memories are equal. CORE never changes. BUFFER expires in minutes. The architecture reflects cognitive reality — cramming everything into one context window is not a memory strategy.
Your agent's memory should be readable by humans. Declare it in a file, commit it to git, review it in a PR. If you can't explain your memory architecture in a text editor, you don't control it.
usage
The ENGRAM library gives you a clean TypeScript/JavaScript API over the full layer stack. One config file. Six layers. Zero boilerplate.
# engram.toml
[agent]
id = "research-assistant"
version = "0.1.0"
[memory.buffer]
ttl = "5m"
capacity = 1000
strategy = "lru"
[memory.episode]
half_life = "2h"
max_entries = 500
[memory.graph]
persistent = true
storage = "json"
[memory.core]
immutable = true
// npm install @MateoKnox/engram
import { EngramEngine } from '@MateoKnox/engram';
const engine = new EngramEngine('./engram.toml');
await engine.init();
// Store a memory in the episode layer
await engine.store('episode', 'User asked about photosynthesis', {
  tags: ['biology', 'user-query'],
  importance: 0.8
});
// Recall across all layers
const memories = await engine.recall('photosynthesis', {
  layers: ['graph', 'episode', 'core'],
  limit: 5
});
// Run decay pass
await engine.decay();
// Consolidate buffer → episode → graph
await engine.consolidate();
Install the library, drop in a config file, and give your agent a memory that actually works across sessions.