# Memory System
## Three-Layer Architecture

### Layer 1: Episodic Memory (Raw Sessions)

Every message, tool call, and event is persisted as append-only JSONL. One file per session (typically per burst) at .annihilation/sessions/.
Session.Writer appends events atomically. Session.Reader replays them for analysis.
Searchable via the Search.Index — a separate SQLite database (.annihilation/search.db) with an FTS5 virtual table using porter unicode61 tokenization:
```elixir
# Search past sessions
Search.Index.search(project_root, "SSE parser edge case", limit: 10, agent: "coder")
# -> {:ok, [%{burst_id: "...", content_snippet: "...>>>SSE parser<<<...", ...}]}
```

Supports boolean operators (AND, OR, NOT), phrase matching ("exact phrase"), prefix matching (word*), and column-specific filtering (content:word).
### Layer 2: Working Memory (Diary Entries)

After each burst, the reflection pipeline processes transcripts into structured summaries:
```elixir
%Annihilation.Memory.DiaryEntry{
  id: "diary_42_12345",
  burst_id: "burst_20260227T103045_001",
  bead_id: "42",
  agent_template: "coder",
  accomplishments: ["Implemented phase transitions", "Added tests for all phases"],
  decisions: ["Used handle_continue pattern over gen_statem"],
  challenges: ["SSE parser edge case with split chunks"],
  key_learnings: ["Always buffer SSE until double newline"],
  tags: ["elixir", "genserver", "streaming"],
  status: :success,        # :success | :failure | :mixed
  quality_score: 0.85,
  timestamp: "2026-02-27T10:45:00Z"
}
```

Diary entries are persisted as JSONL via Memory.DiaryWriter. They support tag matching, quality scoring, and success/failure filtering.
### Layer 3: Procedural Memory (Playbook Rules)

Distilled from diary entries into reusable, confidence-scored rules:
```elixir
%Annihilation.Memory.PlaybookRule{
  id: "rule_xyz",
  text: "Always buffer SSE chunks until \\n\\n delimiter before parsing",
  confidence: 0.75,
  maturity: :established,  # :nascent | :established | :proven
  success_count: 5,
  failure_count: 0,
  anti_pattern: false,
  source_entries: ["diary_42_12345"],
  tags: ["streaming", "sse"],
  created_at: ~U[2026-02-27 10:50:00Z],
  last_applied_at: ~U[2026-02-27 12:00:00Z]
}
```

## Storage

Playbook rules are stored in YAML for human readability:
- Project: .annihilation/playbook.yaml (project-specific rules)
- Global: ~/.annihilation/playbook.yaml (cross-project rules)
Atomic writes via temp file + rename prevent corruption. Project rules take precedence over global rules with the same ID via Playbook.load_merged/1.
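The temp-file-plus-rename pattern can be sketched as follows. AtomicWriteSketch and the .tmp suffix are illustrative assumptions, not the actual implementation:

```elixir
# Illustrative sketch only -- module name and .tmp suffix are assumptions.
defmodule AtomicWriteSketch do
  # Write the full contents to a sibling temp file, then rename over the target.
  # rename/2 is atomic on POSIX filesystems when both paths share a filesystem,
  # so readers see either the old file or the new one, never a partial write.
  def atomic_write(path, contents) do
    tmp = path <> ".tmp"
    File.write!(tmp, contents)
    File.rename!(tmp, path)
  end
end
```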
## Confidence Scoring

```
# Scoring formula
decayed_confidence = confidence * 0.5^(days_since_last_applied / 90)

# Feedback asymmetry
success_increment = +0.05
failure_decrement = -0.20  # 4x asymmetry -- one bug outweighs several successes
```

### Time Decay

Rules lose half their confidence every 90 days of non-use. If last_applied_at is nil, created_at is used as the baseline.
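As a sanity check, the decay formula above can be written as a small pure function. DecaySketch is a hypothetical name, not part of the codebase:

```elixir
# Hypothetical sketch of the time-decay formula above.
defmodule DecaySketch do
  @half_life_days 90

  # Confidence halves for every 90 days since the rule was last applied.
  def decayed_confidence(confidence, days_since_last_applied) do
    confidence * :math.pow(0.5, days_since_last_applied / @half_life_days)
  end
end

DecaySketch.decayed_confidence(0.8, 0)    # -> 0.8 (no decay)
DecaySketch.decayed_confidence(0.8, 90)   # -> 0.4 (one half-life)
DecaySketch.decayed_confidence(0.8, 180)  # -> 0.2 (two half-lives)
```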
## Maturity Progression

```
:nascent  ------------------->  :established  ------------------->  :proven
 (new)     confidence >= 0.5                   confidence > 0.8
           AND 3+ applications                 AND 10+ applications
```

Promotion happens one level at a time. Demotion occurs when confidence falls below thresholds:

- :proven -> :established if confidence < 0.5
- :established -> :nascent if confidence < 0.3
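The transition rules above can be sketched as guard clauses. MaturitySketch and next_maturity/3 are illustrative names, and applications is assumed to mean success_count + failure_count:

```elixir
# Illustrative sketch of the promotion/demotion thresholds above (names are assumptions).
defmodule MaturitySketch do
  # Promotion: one level at a time, gated on confidence and application count.
  def next_maturity(:nascent, confidence, applications)
      when confidence >= 0.5 and applications >= 3,
      do: :established

  def next_maturity(:established, confidence, applications)
      when confidence > 0.8 and applications >= 10,
      do: :proven

  # Demotion when confidence falls below the thresholds.
  def next_maturity(:proven, confidence, _) when confidence < 0.5, do: :established
  def next_maturity(:established, confidence, _) when confidence < 0.3, do: :nascent

  # Otherwise the maturity level is unchanged.
  def next_maturity(maturity, _confidence, _applications), do: maturity
end

MaturitySketch.next_maturity(:nascent, 0.6, 4)      # -> :established
MaturitySketch.next_maturity(:proven, 0.45, 20)     # -> :established
MaturitySketch.next_maturity(:established, 0.6, 5)  # -> :established (no change)
```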
## Maintenance Sweep

Confidence.sweep/2 runs periodic maintenance across the entire playbook:

```elixir
{:ok, %{promoted: 2, demoted: 1, flagged: 3}} = Confidence.sweep(project_root)
```

It applies time decay, promotes/demotes based on thresholds, and flags demotion candidates (confidence < 0.2) and removal candidates (confidence < 0.1 with more failures than successes).
## Anti-Patterns

When a rule accumulates too many failures, the system proposes inversion.

### Detection Criteria

```
failure_count >= 3 AND failure_count > 2 * success_count
```

### Inversion Process
Section titled “Inversion Process”AntiPattern.scan/1identifies candidates across the playbook- Each candidate gets an inversion proposal:
- Original rule is deprecated (confidence set to 0.0)
- Inverted rule created:
"AVOID: {original rule} -- this pattern has caused repeated issues ({N} failures vs {M} successes)."
- Tether reviews proposals before they are applied
AntiPattern.apply_inversions/2deprecates originals and adds inversions
Anti-patterns are surfaced alongside positive rules so psychonauts know what NOT to do.
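The detection predicate and inversion text are simple to express. AntiPatternSketch is a hypothetical module; the message format follows the template quoted above:

```elixir
# Hypothetical sketch of anti-pattern detection and inversion text.
defmodule AntiPatternSketch do
  # A rule is a candidate once it has at least 3 failures and
  # more than twice as many failures as successes.
  def candidate?(%{failure_count: f, success_count: s}), do: f >= 3 and f > 2 * s

  # Inverted rule text, following the template from the inversion proposal.
  def inversion_text(rule_text, failures, successes) do
    "AVOID: #{rule_text} -- this pattern has caused repeated issues " <>
      "(#{failures} failures vs #{successes} successes)."
  end
end

AntiPatternSketch.candidate?(%{failure_count: 4, success_count: 1})  # -> true
AntiPatternSketch.candidate?(%{failure_count: 3, success_count: 2})  # -> false (3 is not > 4)
```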
## Context Injection

Before starting a task, Memory.Context queries the playbook for relevant rules and formats them for injection into the agent’s system prompt.
### Relevance Scoring

```
relevance = matching_tags / total_tags          # 0.0 - 1.0
combined_score = decayed_confidence * relevance
# Anti-pattern boost: combined_score * 1.5
```

Rule tags are matched against bead labels and type (case-insensitive).
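Combining the two formulas gives a small sketch. RelevanceSketch is a made-up name, and tag/label lowercasing is assumed to happen upstream:

```elixir
# Illustrative sketch of combined relevance scoring (names are assumptions).
defmodule RelevanceSketch do
  def combined_score(rule_tags, bead_labels, decayed_confidence, anti_pattern? \\ false) do
    matching = Enum.count(rule_tags, &(&1 in bead_labels))
    relevance = matching / max(length(rule_tags), 1)   # 0.0 - 1.0
    score = decayed_confidence * relevance
    # Anti-patterns get a 1.5x boost so warnings surface more readily.
    if anti_pattern?, do: score * 1.5, else: score
  end
end

RelevanceSketch.combined_score(["streaming", "sse"], ["sse", "elixir"], 0.75)
# -> 0.375 (1 of 2 tags match: 0.75 * 0.5)
```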
### Selection

```elixir
context = Memory.Context.select_rules(project_root, bead_labels, bead_type)
# -> %{
#      rules: [%PlaybookRule{}, ...],          # Positive rules, ranked by score
#      anti_patterns: [%PlaybookRule{}, ...],  # Things to avoid
#      total_score: 2.45,
#      token_count: 312
#    }
```

Budget: max 500 tokens, max 10 rules. Rules scoring below 0.05 are excluded.
### Prompt Format

```elixir
Memory.Context.format_for_prompt(context)
# ->
# ## Relevant Guidelines
#
# The following rules are based on past experience with similar tasks:
#
# 1. [ESTABLISHED] Always buffer SSE chunks until \n\n delimiter (confidence: 0.75)
# 2. [PROVEN] Use handle_continue for phase transitions (confidence: 0.92)
#
# ## Patterns to Avoid
#
# These patterns have caused problems in similar past work:
#
# 1. AVOID: Using gen_statem for simple state machines (confidence: 0.67)
```

### Feedback Attribution

Context.track_injection/3 records which rules were injected for each agent/bead combo (in ETS). After task completion, Context.record_outcome/4 applies success/failure feedback to all injected rules:
```elixir
# After the agent completes successfully
Memory.Context.record_outcome(project_root, agent_id, bead_id, :success)
# -> All injected rules get +0.05 confidence
```

## Skills

Skills are reusable prompt/code templates stored at .annihilation/skills.yaml:
```elixir
%Annihilation.Skills.Skill{
  id: "abc123",
  name: "GenServer Testing",
  description: "Pattern for testing GenServer state machines",
  template: "defmodule MyServerTest do\n  use ExUnit.Case...",
  tags: ["testing", "genserver", "elixir"],
  success_count: 7,
  failure_count: 1,
  alpha: 8.0,  # Beta distribution parameter (successes + 1)
  beta: 2.0,   # Beta distribution parameter (failures + 1)
  created_by: "coder",
  created_at: ~U[2026-02-20 14:00:00Z]
}
```

### Thompson Sampling Ranking

Skills are ranked using Thompson sampling — a bandit algorithm that balances exploitation (proven skills) with exploration (new or uncertain skills):
```elixir
# Non-deterministic: samples from each skill's Beta(alpha, beta) distribution
ranked = Skills.Ranking.rank(skills)

# Deterministic alternatives for debugging:
Skills.Ranking.rank_by_mean(skills)  # Expected value ranking
Skills.Ranking.rank_by_ucb(skills)   # Upper Confidence Bound ranking
```

New skills (alpha=1, beta=1, a uniform prior) get natural exploration because their sampled values vary widely. Proven skills with many successes converge to consistently high samples.
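For intuition, Beta sampling with the integer-valued alpha/beta shown above can be built from exponential draws. ThompsonSketch is illustrative, not the real Skills.Ranking implementation:

```elixir
# Illustrative sketch only -- not the real Skills.Ranking implementation.
defmodule ThompsonSketch do
  # Gamma(k, 1) for integer k is a sum of k Exp(1) draws.
  defp gamma_sample(k) do
    Enum.reduce(1..round(k), 0.0, fn _, acc -> acc - :math.log(:rand.uniform()) end)
  end

  # Beta(a, b) = Ga / (Ga + Gb) for independent Gamma(a, 1) and Gamma(b, 1).
  def beta_sample(alpha, beta) do
    ga = gamma_sample(alpha)
    gb = gamma_sample(beta)
    ga / (ga + gb)
  end

  # Rank skills by a fresh Beta draw each call -- deliberately non-deterministic.
  def rank(skills), do: Enum.sort_by(skills, &beta_sample(&1.alpha, &1.beta), :desc)
end

skills = [%{name: "proven", alpha: 8.0, beta: 2.0}, %{name: "new", alpha: 1.0, beta: 1.0}]
ThompsonSketch.rank(skills)
# Usually ranks "proven" first, but "new" wins often enough to keep being explored.
```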
### Skill Tools

- search_skills — Find relevant skills by query, ranked by Thompson sampling
- create_skill — Add a new skill to the catalog
### Feedback

```elixir
Skills.Ranking.record_feedback(project_root, skill_id, :success)
# -> Increments alpha by 1.0 (and success_count)

Skills.Ranking.record_feedback(project_root, skill_id, :failure)
# -> Increments beta by 1.0 (and failure_count)
```