Design Philosophy

FeLLAMA is built around core principles that guide every architectural decision. These aren't aspirational — they are enforced through code structure, hard rules, and review.

Separation of Concerns

The Gateway handles transport: WebSocket sessions, status fan-out, replay, and scheduling. The Butler handles orchestration: intake, objective routing, dispatch, and result ownership. These two layers never mix.

Worker Autonomy

Each agent creates its own LLM client, manages its own context window, and handles its own retries. The Butler controls whether to launch a worker, not how it talks to the LLM. No centralized LLM proxy.

Safety First

Skill packages are evaluated by the Prompt Safety Advisor before execution. Workers run with constraint isolation — restricted file access, script timeouts, and scoped resources.

Full Observability

Trace logging at every process and network boundary. Every LLM request, WebSocket message, and subprocess invocation is captured. Single-turn agents use global logs; multi-turn agents use session-scoped traces.

Composability

All agents follow shared patterns (Simple, Orchestrated, Web), use the same output envelope, the same error types, and the same CLI argument conventions. New agents slot in without framework changes.

Fault Tolerance

Stuck detection identifies repeated actions and stalled research. Time budgets enforce soft and hard limits. Failed tasks can retry with alternative agents. Sessions checkpoint for resume.
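As a minimal sketch of how soft and hard time budgets might be checked (the `TimeBudget` and `Budget` names here are illustrative, not FeLLAMA's actual types):

```rust
use std::time::{Duration, Instant};

/// Outcome of a budget check (illustrative names).
#[derive(Debug, PartialEq)]
enum Budget {
    Ok,
    SoftExceeded, // wrap up: finish the current step, emit the best result so far
    HardExceeded, // stop immediately
}

struct TimeBudget {
    started: Instant,
    soft: Duration,
    hard: Duration,
}

impl TimeBudget {
    fn new(soft: Duration, hard: Duration) -> Self {
        Self { started: Instant::now(), soft, hard }
    }

    fn check(&self) -> Budget {
        let elapsed = self.started.elapsed();
        if elapsed >= self.hard {
            Budget::HardExceeded
        } else if elapsed >= self.soft {
            Budget::SoftExceeded
        } else {
            Budget::Ok
        }
    }
}

fn main() {
    let budget = TimeBudget::new(Duration::from_secs(60), Duration::from_secs(120));
    // A freshly created budget is well inside the soft limit.
    assert_eq!(budget.check(), Budget::Ok);
}
```

An agent would call `check()` between steps: a soft overrun changes strategy, a hard overrun aborts the run.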


Architecture

FeLLAMA uses a layered architecture. The CLI provides the user interface, the Gateway manages transport and sessions, and the Butler orchestrates work by dispatching standalone worker processes.

fellama-cli
    Terminal UI · Virtual Shell · TUI
        │  WebSocket
Gateway
    Sessions · Status Fan-out · Replay
        │
Butler Orchestrator
    Objective Analysis · DAG Planning
    Wave Dispatch · Review Loop
        │  OS processes
Agent A · Agent B · Tool C

Workers are standalone OS processes. They function correctly with or without a Gateway listener. Each worker owns its LLM lifecycle — the Butler dispatches but never proxies LLM calls.


Crate Structure

The workspace is organized into five crates, each with a single non-overlapping responsibility. If a function is useful to two or more crates, it belongs in fellama-core.

  • fellama-core - Shared infrastructure: config, constants, OpenAI client, SQLite store, async dispatcher, IPC, orchestrator, simple agent. Depends on external crates only.
  • fellama-agents - LLM-powered agents that call an LLM to achieve goals. Depends on fellama-core.
  • fellama-tools - Deterministic tools that work without LLM calls. Depends on fellama-core.
  • fellama-memory - History browsing, vector embeddings, semantic search. Depends on fellama-core.
  • fellama-cli - Terminal UI, virtual shell, WebSocket client/server, orchestration. Depends on fellama-core.

Butler Orchestrator

The Butler is the brain of FeLLAMA. It receives user objectives, decomposes them into dependency-aware task graphs, dispatches agents in parallel waves, reviews results, and retries on failure.

DAG Execution

Tasks declare depends_on relationships, and outputs are wired to downstream inputs via input_bindings. Independent tasks run in parallel within waves.
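A simplified sketch of wave planning from depends_on declarations (illustrative only; the Butler's real planner may differ):

```rust
use std::collections::{HashMap, HashSet};

/// Group tasks into waves: each wave holds the tasks whose `depends_on`
/// entries are all satisfied by earlier waves, so they can run in parallel.
fn plan_waves(deps: &HashMap<String, Vec<String>>) -> Vec<Vec<String>> {
    let mut done: HashSet<String> = HashSet::new();
    let mut waves = Vec::new();
    while done.len() < deps.len() {
        let mut wave: Vec<String> = deps
            .iter()
            .filter(|(id, ds)| !done.contains(*id) && ds.iter().all(|d| done.contains(d)))
            .map(|(id, _)| id.clone())
            .collect();
        // If no task became ready, the graph has a cycle.
        assert!(!wave.is_empty(), "dependency cycle detected");
        wave.sort(); // deterministic ordering for display
        for id in &wave {
            done.insert(id.clone());
        }
        waves.push(wave);
    }
    waves
}

fn main() {
    let mut deps = HashMap::new();
    deps.insert("fetch".to_string(), vec![]);
    deps.insert("parse".to_string(), vec!["fetch".to_string()]);
    deps.insert("summarize".to_string(), vec!["parse".to_string()]);
    deps.insert("validate".to_string(), vec!["parse".to_string()]);
    // summarize and validate share a wave because both depend only on parse.
    let waves = plan_waves(&deps);
    assert_eq!(waves.len(), 3);
}
```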

Review Loop

After each wave, an LLM reviews outputs and decides: Accept, Retry, or Continue (append new tasks).

Concurrency Control

A global semaphore limits concurrent agents. Round-robin dispatch ensures fairness across sessions.

Fault Recovery

Per-task alternatives allow failed tasks to retry with a different agent or skill. The review loop can also inject corrective tasks.


Agents

FeLLAMA ships with 9 LLM-powered agents. Each is a standalone binary that receives input via CLI arguments, communicates via NDJSON, and returns an AgentResponse envelope.

  • fellama-smartweb-agent - LLM-directed browser automation with CDP, stuck detection, quality gates, time budgets (Web Agent)
  • fellama-skill-worker - Executes Agent Skill packages with a tool-calling LLM loop and safety evaluation (Orchestrated)
  • fellama-content-transformer - Faithful content transformation and output-shape conversion (Simple)
  • fellama-info-distiller - Content distillation: extracts key facts from large text (Simple)
  • fellama-objective-analyzer - Normalizes user requests into structured objective objects (Simple)
  • fellama-summarizer - Text summarization with configurable output format (Simple)
  • fellama-validator - Validates agent outputs against expected schemas and constraints (Simple)
  • fellama-prompt-safety-advisor - Prompt risk scoring and safety evaluation before skill execution (Simple)
  • fellama-syslog-reviewer - Analyzes system and application logs for issues (Simple)

Tools

12 deterministic tools that perform work without LLM calls. They produce repeatable output and are dispatched by the Butler alongside agents.

  • fellama-pdf-extractor - Extract text content from PDF files
  • fellama-spreadsheet-extractor - Parse XLSX, XLS, and CSV files
  • fellama-data-extractor - Structured data extraction from documents
  • fellama-doc-extractor - General document processing
  • fellama-data-vault - Secure data storage and retrieval with access control
  • fellama-find - File search utility
  • fellama-cron - Scheduled job execution
  • fellama-housekeeper - Session and memory cleanup
  • fellama-install-skill - Install skill packages from repositories
  • fellama-notifier - Event notification dispatch
  • fellama-search-tools - Tool and agent discovery
  • fellama-time - Time and scheduling utilities

Memory Systems

Vector DB (LanceDB)

Semantic search over documents using embedding vectors. Documents are automatically chunked with configurable overlap, embedded via your configured model, and stored in LanceDB.

  • Modes: --embedding, --file, --search, --get, --remove
  • Cosine distance similarity
  • UUID-based document lifecycle
  • Configurable chunk size and overlap
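Chunking with overlap, in miniature (byte-based for brevity; a real chunker would respect character or token boundaries, and this is not FeLLAMA's actual implementation):

```rust
/// Split text into fixed-size chunks where each chunk repeats the last
/// `overlap` bytes of the previous one, so sentences straddling a chunk
/// boundary still appear whole in at least one embedded chunk.
fn chunk(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > overlap, "overlap must be smaller than chunk size");
    let bytes = text.as_bytes();
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < bytes.len() {
        let end = (start + size).min(bytes.len());
        chunks.push(String::from_utf8_lossy(&bytes[start..end]).into_owned());
        if end == bytes.len() {
            break;
        }
        start = end - overlap; // step back to create the overlap window
    }
    chunks
}

fn main() {
    // size 4 with overlap 2: each chunk shares 2 bytes with its neighbor.
    assert_eq!(chunk("abcdefgh", 4, 2), vec!["abcd", "cdef", "efgh"]);
}
```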

History (SQLite)

Browse and export past sessions and task results. Persists across server restarts. Available through both a CLI and a TUI interface.

  • Session browsing with filters
  • Task result export
  • TUI and CLI interfaces
  • SQLite-backed for reliability

IPC Protocol

All inter-process communication uses NDJSON (newline-delimited JSON). Each line before the final line is a ProgressEvent. The final line is always an AgentResponse envelope.

NDJSON output stream
{"event":"step_started","task_id":"abc-123","step":"extracting page"}
{"event":"token","task_id":"abc-123","text":"Processing "}
{"event":"stream_token","task_id":"abc-123","text":"analysis..."}
{"event":"step_done","task_id":"abc-123","step":"extracting page","result":null}
{"task_id":"abc-123","status":"success","output":{"content":"..."},"error":null}
  • StepStarted - A discrete step began (extraction, distillation, action)
  • StepDone - Step completed, with an optional result payload
  • Token - Streaming output token routed to the main TUI pane
  • StreamToken - Streaming LLM token routed to the TUI side panel
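A consumer can split the stream by treating every line carrying an "event" key as a ProgressEvent and the final line as the envelope. The sketch below classifies lines by substring match rather than real JSON parsing (a production consumer would use a JSON parser such as serde_json):

```rust
/// Split a worker's NDJSON output into progress-event lines and the
/// final AgentResponse line. Per the protocol, only the final line
/// lacks a top-level "event" key.
fn split_stream(stream: &str) -> (Vec<&str>, Option<&str>) {
    let mut events = Vec::new();
    let mut envelope = None;
    for line in stream.lines().filter(|l| !l.trim().is_empty()) {
        if line.contains("\"event\"") {
            events.push(line);
        } else {
            envelope = Some(line);
        }
    }
    (events, envelope)
}

fn main() {
    let stream = concat!(
        "{\"event\":\"step_started\",\"task_id\":\"abc-123\",\"step\":\"extracting page\"}\n",
        "{\"event\":\"token\",\"task_id\":\"abc-123\",\"text\":\"Processing \"}\n",
        "{\"task_id\":\"abc-123\",\"status\":\"success\",\"output\":{\"content\":\"...\"},\"error\":null}",
    );
    let (events, envelope) = split_stream(stream);
    assert_eq!(events.len(), 2);
    assert!(envelope.unwrap().contains("\"status\":\"success\""));
}
```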

AgentResponse Envelope

Rust
AgentResponse {
    task_id:   Option<String>,
    parent_id: Option<String>,
    status:    AgentStatus,      // success | partial | error
    output:    Value,            // must have "content" key (HR-9)
    error:     Option<Value>,
}

Agent Patterns

FeLLAMA defines three reusable agent patterns. New agents implement one of these and automatically inherit the IPC protocol, error handling, and CLI conventions.

Pattern 1: Simple Agent
Single-turn LLM call

Sends one prompt, parses one response. Used by: objective-analyzer, summarizer, validator, info-distiller, prompt-safety-advisor, syslog-reviewer. Logs to ~/.fellama/<binary-name>.log.

Pattern 2: Orchestrated Agent
Multi-turn tool-calling LLM loop

Runs an LLM in a tool-calling loop: send prompt + tools, execute returned calls, feed results back, repeat until done or max steps. Used by skill-worker. Session dir: ~/.fellama/skill-worker/<uuid>/.
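The loop can be sketched with a stub model closure standing in for the real chat-completions call (all names here are illustrative, not skill-worker's actual types):

```rust
/// What the model returns each turn: either a tool invocation or a
/// final answer. (Illustrative; a real reply would carry structured
/// tool-call arguments.)
enum LlmReply {
    ToolCall { name: String, args: String },
    Final(String),
}

/// The orchestrated loop in miniature: call the model, execute any tool
/// call it returns, append the result to the transcript, repeat until a
/// final answer or `max_steps` is reached.
fn run_loop(
    mut llm: impl FnMut(&[String]) -> LlmReply,
    run_tool: impl Fn(&str, &str) -> String,
    max_steps: usize,
) -> Option<String> {
    let mut transcript: Vec<String> = vec!["objective".to_string()];
    for _ in 0..max_steps {
        match llm(&transcript) {
            LlmReply::Final(answer) => return Some(answer),
            LlmReply::ToolCall { name, args } => {
                let result = run_tool(&name, &args);
                transcript.push(format!("tool {name} -> {result}"));
            }
        }
    }
    None // step budget exhausted without a final answer
}

fn main() {
    // Stub model: one tool call, then a final answer.
    let answer = run_loop(
        |transcript| {
            if transcript.len() == 1 {
                LlmReply::ToolCall { name: "fellama-time".into(), args: "{}".into() }
            } else {
                LlmReply::Final("done".into())
            }
        },
        |name, _args| format!("{name} ok"),
        8,
    );
    assert_eq!(answer.as_deref(), Some("done"));
}
```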

Pattern 3: Web Agent
Turn-based browser loop

Drives a browser through an LLM planning loop. The LLM is stateless per turn — the agent owns all session state and rebuilds a full input JSON each turn. Used by smartweb-agent. Session dir: ~/.fellama/<session-id>/.
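Rebuilding the turn input from agent-owned state might look like the following (field names are illustrative, and real code would use a JSON serializer rather than string formatting, which skips proper escaping):

```rust
/// Because the LLM is stateless per turn, the agent serializes everything
/// it owns (goal, current page snapshot, action history) into one input
/// string every turn. Hand-rolled formatting here for illustration only.
fn build_turn_input(goal: &str, page: &str, history: &[String]) -> String {
    let hist = history
        .iter()
        .map(|h| format!("\"{h}\""))
        .collect::<Vec<_>>()
        .join(",");
    format!("{{\"goal\":\"{goal}\",\"page\":\"{page}\",\"history\":[{hist}]}}")
}

fn main() {
    let input = build_turn_input("find pricing", "<html>...</html>", &["clicked nav".to_string()]);
    // Every turn gets the full state, not a delta.
    assert!(input.contains("\"history\":[\"clicked nav\"]"));
}
```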