Norns Memory System

The Raven Model cognitive architecture for long-term memory, learning, and RAG.


Overview

The Norns agent implements a three-plane cognitive architecture inspired by Odin's ravens (Huginn and Muninn), providing real-time state management, contextual awareness, long-term learning, and document-based RAG.

┌─────────────────────────────────────────────────────────────┐
│ HUGINN (State Plane) - Real-time Session State │
│ - Conversation flow and immediate context │
│ - Active messages and task state │
│ - Redis-backed (session_id keyed, fast access) │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ CONTEXT (Identity Plane) - Assembled User Context │
│ - UserIdentity, DomainAccess permissions │
│ - Active projects, calendar, voice mode │
│ - Applicable rules and defaults │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ MUNINN (Memory Plane) - Long-term Memory & Learning │
│ - Episodic: Raw experiences (interactions, observations) │
│ - Semantic: Learned patterns and correlations │
│ - Knowledge: Entity graph (people, places, concepts) │
│ - Documents: RAG chunks for grounded responses │
│ - Vector embeddings for unified semantic search │
└─────────────────────────────────────────────────────────────┘

Software Stack

FastAPI Server (norns-agent:8000)
├── LangGraph Graph Processing
├── Database Services (PostgreSQL + pgvector)
│ ├── ContextService (identity, domains)
│ ├── MuninnContextService (memory assembly)
│ ├── EpisodicMemoryService (experience storage)
│ ├── SemanticMemoryService (pattern learning)
│ └── DocumentService (RAG ingestion & retrieval)
├── Cache Layer (Redis)
├── Embedding Service
│ ├── Ollama (primary - nomic-embed-text, 768 dims)
│ └── OpenAI (fallback - ada-002, 1536 dims)
├── LLM (Claude via Anthropic API)
└── Integrations (Slack, Calendar, Home Assistant)

The memory system provides unified search across 4 memory types:

| Type | Table | Purpose |
|------|-------|---------|
| episodic | episodic_memories | Past interactions and observations |
| semantic | semantic_patterns | Learned behavioral patterns |
| knowledge | knowledge_entities | Named entities and relationships |
| documents | document_chunks | RAG document chunks |

Endpoint: POST /api/memories/search

from pydantic import BaseModel

class MemorySearchRequest(BaseModel):
    query: str               # Text to search for
    user_id: str             # User UUID
    memory_types: list[str]  # ["episodic", "semantic", "knowledge", "documents"]
    limit: int = 10          # Max results per type

All memory types use the same embedding model and vector similarity search, enabling seamless cross-memory retrieval.


Memory Layers

Episodic Memory

Stores raw experiences as searchable vector-embedded memories.

Table: episodic_memories

| Column | Type | Purpose |
|--------|------|---------|
| memory_id | UUID | Primary key |
| user_id | UUID | Memory owner |
| episode_type | varchar | interaction, observation |
| source_channel | varchar | slack, voice, api |
| raw_content | text | Full message + response |
| content_embedding | vector(768) | Semantic search vector |
| context_snapshot | jsonb | Full context at interaction time |
| importance_score | float | 0.0-1.0, heuristic-based |
| extracted_entities | jsonb | NER results |
| extracted_intents | jsonb | Intent classification |
| access_count | int | Usage tracking |
| consolidated_to_semantic | bool | Marked for pattern extraction |

Key Operations:

  • record_interaction() - Store chat with embeddings
  • record_observation() - Store system observations
  • search_memories() - Vector similarity search
  • mark_for_consolidation() - Flag for pattern extraction

Importance Scoring:

  • Base score: 0.5
  • Length factor: +0.1 each for >500 and >1000 chars
  • Entity factor: +0.05 per entity (max 0.2)
  • Action intent: +0.1 (create, update, complete, delete)
  • Question factor: +0.05 (learning opportunity)
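Read as a pure function, the rules above amount to something like this (an illustrative sketch, not the service's actual code):

```python
def importance_score(content: str, entities: list[str],
                     intents: list[str], is_question: bool) -> float:
    """Heuristic importance score following the documented rules (sketch)."""
    score = 0.5                               # base score
    if len(content) > 500:                    # length factor, first tier
        score += 0.1
    if len(content) > 1000:                   # length factor, second tier
        score += 0.1
    score += min(0.05 * len(entities), 0.2)   # entity factor, capped at 0.2
    if any(i in {"create", "update", "complete", "delete"} for i in intents):
        score += 0.1                          # action intent
    if is_question:
        score += 0.05                         # learning opportunity
    return min(score, 1.0)                    # clamp to the 0.0-1.0 range
```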

Semantic Memory

Extracts patterns and learns from episodic experiences.

Table: semantic_patterns

| Column | Type | Purpose |
|--------|------|---------|
| pattern_id | UUID | Primary key |
| user_id | UUID | Pattern owner |
| pattern_type | varchar | behavioral, temporal, causal, preference |
| pattern_category | varchar | action_frequency, activity_time, domain_focus |
| pattern_name | varchar | Human-readable name |
| pattern_embedding | vector(768) | For pattern search |
| confidence_score | float | 0.0-1.0 |
| evidence_count | int | Supporting episode count |
| status | varchar | emerging, active, deprecated |
| derived_from_episodes | uuid[] | Source episodic memories |

Table: semantic_correlations

| Column | Type | Purpose |
|--------|------|---------|
| concept_a_type | varchar | domain, time, entity |
| concept_a_value | varchar | Specific value |
| concept_b_type | varchar | Correlation target type |
| concept_b_value | varchar | Correlation target value |
| correlation_strength | float | 0.0-1.0 |

Table: action_outcome_mappings

| Column | Type | Purpose |
|--------|------|---------|
| action_type | varchar | task_scheduling, priority_assignment |
| action_context | jsonb | Context of action |
| outcome_type | varchar | success, failure, partial |
| effectiveness_score | float | 0.0-1.0 |

Pattern Types:

  • Temporal: Peak activity hours, days of week
  • Behavioral: Repeated actions, frequent task creation
  • Domain: Which domains user focuses on
  • Preference: Consistent choices over time

Knowledge Entities

Entity graph for semantic understanding and disambiguation.

Table: knowledge_entities

| Column | Type | Purpose |
|--------|------|---------|
| entity_id | UUID | Primary key |
| entity_type | varchar | person, place, project, concept |
| entity_name | varchar | Primary name |
| entity_aliases | varchar[] | Alternative names |
| entity_embedding | vector(768) | For disambiguation |
| properties | jsonb | Entity-specific data |
| confidence | float | 0.0-1.0 |

Admin UI: norns.ravenhelm.dev/knowledge

  • Create/edit entities with type, description, aliases
  • Set confidence scores
  • Add custom properties (JSON)

Document RAG

Upload and retrieve document chunks for grounded responses.

Table: document_chunks

| Column | Type | Purpose |
|--------|------|---------|
| chunk_id | UUID | Primary key |
| document_id | UUID | Parent document |
| user_id | UUID | Document owner |
| document_name | varchar | Original filename |
| document_type | varchar | general, technical, personal, reference |
| chunk_index | int | 0-based position in document |
| chunk_content | text | Raw text of chunk |
| chunk_embedding | vector(768) | Semantic search vector |
| metadata | jsonb | {original_size, total_chunks} |
| created_at | timestamptz | Upload timestamp |

Admin UI: norns.ravenhelm.dev/documents

  • Drag-and-drop upload (.txt, .md, .pdf)
  • Select document type
  • Configure chunk size (100-5000 chars, default 1000)
  • View documents grouped by type
  • Delete documents with confirmation

RAG Pipeline

Document Ingestion Flow

┌─────────────────────────────────────────────────────────────────┐
│ Documents Admin UI (norns.ravenhelm.dev/documents) │
│ - Drag-and-drop or file browser │
│ - Document type: general, technical, personal, reference │
│ - Chunk size: 100-5000 chars (default 1000) │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Upload Endpoint: POST /api/documents/upload │
│ - Validate UTF-8 encoding │
│ - Generate document_id (UUID) │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Chunking: Word-based, character-bounded │
│ - Split text by whitespace │
│ - Accumulate words until chunk_size threshold │
│ - Respects word boundaries (never splits mid-word) │
│ - Each chunk gets sequential chunk_index │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Embedding: Ollama (nomic-embed-text, 768 dims) │
│ - Each chunk embedded individually │
│ - Fallback: OpenAI ada-002 (1536 dims) │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Storage: PostgreSQL + pgvector │
│ - INSERT into document_chunks │
│ - chunk_embedding as vector(768) │
│ - metadata: {original_size, total_chunks} │
└─────────────────────────────────────────────────────────────────┘

Document Retrieval at Runtime

User Question → Agent

┌─────────────────────────────────────────────────────────────────┐
│ Embed Query │
│ - Same embedding model as ingestion (Ollama/OpenAI) │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Vector Similarity Search │
│ SELECT chunk_id, document_name, chunk_content, │
│ 1 - (chunk_embedding <=> $query_vector) as similarity │
│ FROM document_chunks │
│ WHERE user_id = $user_id │
│ AND chunk_embedding IS NOT NULL │
│ ORDER BY similarity DESC │
│ LIMIT $limit │
└─────────────────────────────────────────────────────────────────┘
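Here `<=>` is pgvector's cosine-distance operator, so `1 - (a <=> b)` is cosine similarity. A pure-Python illustration of the same ranking (not the production code path, which runs entirely in PostgreSQL):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Equivalent of 1 - (a <=> b), where <=> is pgvector's cosine distance."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, rows, limit=10):
    """Mirror ORDER BY similarity DESC LIMIT $limit over (chunk_id, embedding) rows."""
    scored = [(cid, cosine_similarity(query_vec, emb)) for cid, emb in rows]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:limit]
```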

┌─────────────────────────────────────────────────────────────────┐
│ Combine with Other Memory Types │
│ - Episodic memories (past interactions) │
│ - Semantic patterns (learned behaviors) │
│ - Knowledge entities (named entities) │
│ - Document chunks (RAG) │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Agent Context Assembly │
│ - MemoryWorkers integrate all memory types │
│ - Documents included as "RELEVANT DOCUMENTS" in prompt │
│ - Ranked by similarity score (0-1) │
└─────────────────────────────────────────────────────────────────┘

Agent Response (grounded in retrieved documents)

Chunking Algorithm

def chunk_document(text: str, chunk_size: int = 1000) -> list[str]:
    chunks = []
    words = text.split()
    current_chunk = []
    current_size = 0

    for word in words:
        current_chunk.append(word)
        current_size += len(word) + 1  # +1 for space

        if current_size >= chunk_size:
            chunks.append(" ".join(current_chunk))
            current_chunk = []
            current_size = 0

    if current_chunk:
        chunks.append(" ".join(current_chunk))

    return chunks

Muninn Context Service

The main memory interface that assembles context for agent processing.

ContextBundle Output:

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ContextBundle:
    episodes: list[dict]    # Recent relevant memories
    patterns: list[dict]    # Applicable semantic patterns
    entities: list[dict]    # Known entities
    documents: list[dict]   # Relevant document chunks
    suggestions: list[str]  # Memory-based recommendations
    built_at: datetime
    query_used: Optional[str]

Key Methods:

  • build_context() - Assemble full context bundle (all 4 memory types)
  • record_interaction() - Store interaction and trigger memory recording
  • recall_similar() - Explicit memory recall across all types
  • get_entity() / record_entity() - Entity management

Bifrost MCP Integration

Bifrost exposes Norns capabilities as MCP (Model Context Protocol) tools.

Claude Code / External Service

POST /mcp/tools/call (X-API-Key auth)

Bifrost API (tool registry, auth, audit)

Norns Agent execution

Response + tool_executions audit log

Key Tables:

  • tool_definitions - MCP tool registry
  • tool_executions - Audit log of tool calls
  • api_keys - Scoped authentication

Dataflows

Chat Interaction (with RAG)

USER MESSAGE (Slack)

POST /slack/events

┌─ CONTEXT ASSEMBLY (Huginn) ─────────────────────────────────┐
│ • Resolve user identity │
│ • Load domain permissions │
│ • Get active projects, calendar, work hours │
│ • Cache in Redis (5 min TTL) │
└─────────────────────────────────────────────────────────────┘

┌─ MEMORY CONTEXT (Muninn) ───────────────────────────────────┐
│ • Generate embedding: ollama.embed(message) │
│ • Search ALL memory types: │
│ - episodic_memories (past interactions) │
│ - semantic_patterns (learned behaviors) │
│ - knowledge_entities (named entities) │
│ - document_chunks (RAG documents) │
│ • Build unified ContextBundle │
└─────────────────────────────────────────────────────────────┘

┌─ LANGGRAPH EXECUTION ───────────────────────────────────────┐
│ • Initialize AgentState with messages + context │
│ • Include relevant documents in system prompt │
│ • Call Claude with grounded context │
│ • Execute tools if needed │
│ • Generate response citing document sources │
└─────────────────────────────────────────────────────────────┘

┌─ MEMORY RECORDING ──────────────────────────────────────────┐
│ • Generate embedding for (message + response) │
│ • Calculate importance_score │
│ • INSERT into episodic_memories │
│ • Add to episode_sequence (conversation grouping) │
│ • Record observation for pattern detection │
└─────────────────────────────────────────────────────────────┘

SLACK RESPONSE

Memory Recall

User: "Remember when I said I wanted to learn Spanish?"

recall_memories("spanish", user_id)

muninn.recall_similar(user_uuid, "spanish")

Parallel search across all memory types:
• episodic_memories → past conversations
• semantic_patterns → learning preferences
• knowledge_entities → "Spanish" as concept
• document_chunks → Spanish learning materials

Combine and rank by similarity

Return: [{type, id, content, similarity}, ...]
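The combine-and-rank step can be sketched as a simple merge (illustrative; the actual service may weight memory types differently):

```python
def combine_results(result_sets: list[list[dict]], limit: int = 10) -> list[dict]:
    """Merge per-type result lists into [{type, id, content, similarity}, ...],
    ranked by similarity descending."""
    merged = [r for results in result_sets for r in results]
    merged.sort(key=lambda r: r["similarity"], reverse=True)
    return merged[:limit]
```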

Pattern Consolidation (Background)

Runs periodically to extract patterns from aged memories.

semantic.consolidate_episodes(user_id, min_age_days=7)

SELECT unconsolidated episodic_memories from 7+ days ago

extract_patterns():
• _analyze_temporal_patterns() → Peak hours, weekday patterns
• _analyze_intent_patterns() → Frequent actions
• _analyze_domain_patterns() → Domain focus areas

For each pattern:
• If similar exists → update confidence (max 0.95)
• If new → generate embedding, INSERT

detect_correlations():
• Find co-occurring concepts (domain+time)
• INSERT into semantic_correlations

Mark episodes as consolidated

Bifrost Tool Invocation

External Service → POST /mcp/tools/call

Authenticate with X-API-Key

Look up tool_definitions row
├── Get http_config (endpoint, method, headers)
├── Get auth_config_ref
└── Validate input against schema

Transform MCP request → HTTP request

Call Norns endpoint

INSERT into tool_executions (audit log)

Return response to caller

Domain Model

Users interact with 8 life domains:

| Domain | Icon | Description |
|--------|------|-------------|
| hrafnhoard | 💰 | Personal Finance |
| ravenhelm | ⚔️ | Work & Business |
| idunns_garden | 🌸 | Family & Relationships |
| eirs_vitality | ❤️ | Health & Fitness |
| bragis_quill | 🖋️ | Writing & Creative |
| midgard | 🏠 | Home & Property |
| friggs_hearth | 🔥 | Household Operations |
| mimirs_legacy | 📚 | Digital Legacy |

Memory is scoped to domains where applicable, allowing domain-specific pattern learning.


Key Files

| File | Purpose |
|------|---------|
| agent/main.py | FastAPI endpoints (upload, list, delete, search) |
| agent/memory/episodic.py | Episodic memory CRUD |
| agent/memory/semantic.py | Pattern extraction & learning |
| agent/memory/muninn_context.py | Memory assembly interface |
| agent/memory/embeddings.py | Embedding provider abstraction |
| agent/agents/workers/memory_workers.py | RAG context integration |
| agent/graph.py | LangGraph state machine |
| agent/context_service.py | Context loading & caching |
| admin/app/(authenticated)/documents/page.tsx | Document admin UI |
| admin/app/(authenticated)/knowledge/page.tsx | Knowledge entity admin UI |
| bifrost/api/main.py | MCP gateway server |

Configuration

Embedding Settings

environment:
  OLLAMA_URL: http://ollama:11434
  OLLAMA_EMBED_MODEL: nomic-embed-text
  EMBEDDING_PROVIDER: auto  # auto|ollama|openai|mock

Supported Embedding Models

| Provider | Model | Dimensions |
|----------|-------|------------|
| Ollama | nomic-embed-text | 768 (default) |
| Ollama | mxbai-embed-large | 1024 |
| Ollama | all-minilm | 384 |
| OpenAI | text-embedding-ada-002 | 1536 |

Memory Thresholds

| Setting | Value | Purpose |
|---------|-------|---------|
| Similarity threshold | 0.5 | Minimum for semantic match |
| Default chunk size | 1000 | Characters per document chunk |
| Consolidation age | 7 days | Min age before pattern extraction |
| Pattern deprecation | 90 days | Archive unreinforced patterns |
| Redis context TTL | 5 min | Context cache expiration |
| Redis identity TTL | 1 hour | Identity cache expiration |

Quick Commands

# View memory-related logs
docker logs norns-agent 2>&1 | grep -i "memory\|muninn\|episodic\|document"

# Check embedding service
docker exec ollama ollama list

# Query all memory types
docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c "
SELECT 'episodic' as type, COUNT(*) FROM episodic_memories
UNION ALL
SELECT 'semantic', COUNT(*) FROM semantic_patterns
UNION ALL
SELECT 'knowledge', COUNT(*) FROM knowledge_entities
UNION ALL
SELECT 'documents', COUNT(*) FROM document_chunks;
"

# Query document stats
docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c "
SELECT document_type, COUNT(DISTINCT document_id) as docs, COUNT(*) as chunks
FROM document_chunks
GROUP BY document_type;
"

# Check consolidation status
docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c \
"SELECT consolidated_to_semantic, COUNT(*) FROM episodic_memories GROUP BY 1;"

Troubleshooting

Memory Search Returns No Results

Symptoms: Agent doesn't recall relevant past interactions or documents

Diagnosis:

# Check embedding service
curl -s http://ollama:11434/api/tags | jq .

# Verify memories exist with embeddings
docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c "
SELECT 'episodic' as type,
COUNT(*) as total,
COUNT(*) FILTER (WHERE content_embedding IS NOT NULL) as with_embedding
FROM episodic_memories
UNION ALL
SELECT 'documents',
COUNT(*),
COUNT(*) FILTER (WHERE chunk_embedding IS NOT NULL)
FROM document_chunks;
"

Solutions:

  1. Verify Ollama is running with nomic-embed-text model
  2. Check pgvector extension is installed
  3. Confirm memories/documents have embeddings (not NULL)
  4. Lower similarity threshold if too restrictive

Document Upload Fails

Symptoms: Error when uploading documents

Diagnosis:

docker logs norns-agent 2>&1 | grep -i "upload\|document\|error"

Solutions:

  1. Verify file is UTF-8 encoded text
  2. Check file size limits
  3. Ensure Ollama is responding for embeddings
  4. Check database connectivity

Pattern Consolidation Not Running

Symptoms: No semantic patterns being created

Diagnosis:

# Check for unconsolidated old memories
docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c \
"SELECT COUNT(*) FROM episodic_memories
WHERE consolidated_to_semantic = FALSE
AND occurred_at < NOW() - INTERVAL '7 days';"

Solutions:

  1. Verify consolidation background job is scheduled
  2. Check for errors in agent logs
  3. Manually trigger consolidation via API

High Memory Latency

Symptoms: Slow responses when context building

Diagnosis:

# Check Redis connectivity
docker exec redis redis-cli PING

# Check vector indexes exist
docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c "
SELECT tablename, indexname FROM pg_indexes
WHERE tablename IN ('episodic_memories', 'document_chunks', 'semantic_patterns');
"

Solutions:

  1. Verify Redis is responding quickly
  2. Add IVFFlat or HNSW index to embedding columns:
CREATE INDEX idx_doc_chunks_embedding ON document_chunks 
USING ivfflat (chunk_embedding vector_cosine_ops) WITH (lists = 100);
  3. Reduce similarity search limit

PDF Support & Adaptive Chunking

Added: 2026-01-03

Enhanced Document Upload

The document upload endpoint now supports:

  • PDF files via pdfplumber extraction
  • Adaptive chunking based on model context window
  • Page-aware chunking for PDFs with page number tracking
  • Document metadata extraction (title, author, subject)

Upload Endpoint Parameters

Endpoint: POST /api/documents/upload

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| file | File | required | PDF or text file |
| user_id | UUID | required | Document owner |
| document_type | string | "general" | Category: general, technical, personal, reference |
| chunk_strategy | string | "adaptive" | See chunking strategies below |
| chunk_size | int | auto | Override automatic chunk sizing (tokens) |
| model_context | string | "default" | Model name for adaptive sizing |

Chunking Strategies

| Strategy | Description | Best For |
|----------|-------------|----------|
| adaptive | 5% of model context window (256-4096 tokens) | General use, balances retrieval granularity |
| page_based | Respects PDF page boundaries, tracks page numbers | PDFs where page context matters |
| semantic | Paragraph-based, 512 tokens | Narrative documents |
| fixed | Fixed 1000 tokens | Consistent chunk sizes |

Context Window Sizing

When using adaptive strategy, chunk size is calculated based on the model:

| Model | Context Window | Chunk Size (5%) |
|-------|----------------|-----------------|
| claude-3-5-sonnet | 200K | 4096 (capped) |
| claude-3-opus | 200K | 4096 (capped) |
| gpt-4-turbo | 128K | 4096 (capped) |
| llama-3.1-70b | 128K | 4096 (capped) |
| gpt-4 | 8K | 409 |
| default | 8K | 409 |

Min: 256 tokens, Max: 4096 tokens, with 10% overlap between chunks.
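The sizing rule can be sketched as a pure function. The context-window values below assume binary K (8K = 8192 tokens), which reproduces the table's 409-token figure; the exact constants in the implementation are not documented here:

```python
# Context windows in tokens, assuming K = 1024 (so 8K = 8192, giving 409 at 5%)
MODEL_CONTEXT = {
    "claude-3-5-sonnet": 200 * 1024,
    "claude-3-opus": 200 * 1024,
    "gpt-4-turbo": 128 * 1024,
    "llama-3.1-70b": 128 * 1024,
    "gpt-4": 8 * 1024,
    "default": 8 * 1024,
}

def adaptive_chunk_size(model: str) -> int:
    """Adaptive strategy: 5% of the context window, clamped to [256, 4096] tokens."""
    context = MODEL_CONTEXT.get(model, MODEL_CONTEXT["default"])
    return max(256, min(4096, int(context * 0.05)))

def chunk_overlap(chunk_size: int) -> int:
    """10% overlap between consecutive chunks."""
    return chunk_size // 10
```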

Document Metadata Table

Table: documents

| Column | Type | Purpose |
|--------|------|---------|
| document_id | UUID | Primary key |
| user_id | UUID | Document owner |
| filename | varchar | Original filename |
| content_type | varchar | MIME type (application/pdf, text/plain) |
| file_size_bytes | int | File size |
| total_pages | int | PDF page count |
| title | text | PDF metadata: title |
| author | text | PDF metadata: author |
| subject | text | PDF metadata: subject |
| chunk_strategy | varchar | Strategy used |
| chunk_size_tokens | int | Actual chunk size |
| total_chunks | int | Number of chunks created |
| embedding_model | varchar | Model used for embeddings |
| uploaded_at | timestamptz | Upload timestamp |
| processed_at | timestamptz | Processing completion |
| metadata | jsonb | Additional metadata |

Page Number Tracking

Enhanced document_chunks columns:

| Column | Type | Purpose |
|--------|------|---------|
| page_number | int | Primary page for this chunk (PDF only) |
| total_pages | int | Total document pages |
| content_type | varchar | MIME type of source |

When using page_based strategy, chunks maintain page boundaries. For other strategies, page numbers are approximated based on content position.
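The approximation formula itself is not documented; a plausible sketch based on relative character position (an assumption, not the actual code):

```python
def approximate_page(char_pos: int, total_chars: int, total_pages: int) -> int:
    """Estimate the page for a chunk starting at char_pos (assumed linear mapping)."""
    if total_chars <= 0 or total_pages <= 0:
        return 1
    return min(total_pages, char_pos * total_pages // total_chars + 1)
```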

Example Usage

# Upload PDF with adaptive chunking for Claude
curl -X POST 'https://norns-pm.ravenhelm.dev/api/documents/upload' \
-H 'X-API-Key: <key>' \
-F 'file=@technical-spec.pdf' \
-F 'user_id=<uuid>' \
-F 'document_type=technical' \
-F 'chunk_strategy=adaptive' \
-F 'model_context=claude-3-5-sonnet'

# Upload PDF preserving page boundaries
curl -X POST 'https://norns-pm.ravenhelm.dev/api/documents/upload' \
-H 'X-API-Key: <key>' \
-F 'file=@manual.pdf' \
-F 'user_id=<uuid>' \
-F 'document_type=reference' \
-F 'chunk_strategy=page_based'

# Upload with custom chunk size
curl -X POST 'https://norns-pm.ravenhelm.dev/api/documents/upload' \
-H 'X-API-Key: <key>' \
-F 'file=@notes.txt' \
-F 'user_id=<uuid>' \
-F 'chunk_size=500'

Response Format

{
  "document_id": "fc505fe0-c9c2-423c-a087-26e6fabacfdb",
  "filename": "technical-spec.pdf",
  "content_type": "application/pdf",
  "total_pages": 42,
  "chunks_created": 156,
  "chunk_strategy": "adaptive",
  "chunk_size_tokens": 4096,
  "document_type": "technical",
  "metadata": {
    "title": "System Architecture Specification",
    "author": "Engineering Team"
  }
}

Key Files (PDF Support)

| File | Purpose |
|------|---------|
| agent/memory/document_processor.py | PDF extraction, adaptive chunking, file detection |
| agent/requirements.txt | Added: pdfplumber, python-magic, tiktoken |
| migrations/006_rag_pdf_support.sql | Schema updates for PDF metadata |

Troubleshooting PDF Upload

PDF extraction fails:

# Check pdfplumber is installed
docker exec norns-agent pip list | grep pdfplumber

# Test PDF directly
docker exec norns-agent python3 -c "import pdfplumber; print('OK')"

Content type detection fails:

# python-magic requires libmagic
docker exec norns-agent python3 -c "import magic; print('OK')"

If libmagic is missing, content-type detection falls back to the file extension.

Vector dimension mismatch:

-- Verify 768-dimensional vectors
SELECT pg_typeof(chunk_embedding),
       vector_dims(chunk_embedding)
FROM document_chunks LIMIT 1;

Should show vector type with 768 dimensions for Ollama.

Debugging RAG Issues

If documents are being found but not reflected in responses:

  1. Check similarity threshold: Default is 0.3, may need adjustment
  2. Check content preview length: Currently 800 chars per document chunk in prompt
  3. Verify documents in prompt: Check logs for "Memory prompt generated with X episodes, Y documents"

Spouse/family information, for example, requires the full 800-character preview because those personal details appear later in the Identity chunk (around character position 446+).