Design Philosophy
FeLLAMA is built around a set of core principles that guide every architectural decision. These aren't aspirational — they are enforced through code structure, hard rules, and review.
Separation of Concerns
The Gateway handles transport: WebSocket sessions, status fan-out, replay, and scheduling. The Butler handles orchestration: intake, objective routing, dispatch, and result ownership. These two layers never mix.
Worker Autonomy
Each agent creates its own LLM client, manages its own context window, and handles its own retries. The Butler controls whether to launch a worker, not how it talks to the LLM. No centralized LLM proxy.
Safety First
Skill packages are evaluated by the Prompt Safety Advisor before execution. Workers run with constraint isolation — restricted file access, script timeouts, and scoped resources. No unreviewed code execution.
Full Observability
Trace logging covers every process and network boundary: each LLM request, WebSocket message, and subprocess invocation is captured. Single-turn agents use global logs; multi-turn agents use session-scoped traces.
Composability
All agents follow shared patterns (Simple, Orchestrated, Web), use the same output envelope, the same error types, and the same CLI argument conventions. New agents slot in without framework changes.
Fault Tolerance
Stuck detection identifies repeated actions and stalled research. Time budgets enforce soft and hard limits. Failed tasks can retry with alternative agents. Sessions checkpoint for resume.
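As a rough illustration of the two mechanisms above, the sketch below checks a time budget against soft and hard limits and flags an agent as stuck when its most recent actions are all identical. The names `TimeBudget`, `BudgetState`, and `is_stuck`, and the "last N identical actions" rule, are assumptions made for this example and are not taken from the FeLLAMA codebase.

```rust
use std::time::Duration;

/// Hypothetical budget states; illustrative names only.
#[derive(Debug, PartialEq)]
enum BudgetState {
    Ok,
    SoftExceeded, // e.g. ask the agent to wrap up
    HardExceeded, // e.g. stop the agent
}

/// Hypothetical pair of limits for one task.
struct TimeBudget {
    soft: Duration,
    hard: Duration,
}

impl TimeBudget {
    /// Classifies the elapsed time against the soft and hard limits.
    fn check(&self, elapsed: Duration) -> BudgetState {
        if elapsed >= self.hard {
            BudgetState::HardExceeded
        } else if elapsed >= self.soft {
            BudgetState::SoftExceeded
        } else {
            BudgetState::Ok
        }
    }
}

/// Returns true when the last `window` actions are all identical.
/// This is one simple way to detect repeated actions, assumed for the example.
fn is_stuck(actions: &[&str], window: usize) -> bool {
    if window == 0 || actions.len() < window {
        return false;
    }
    let tail = &actions[actions.len() - window..];
    tail.iter().all(|a| *a == tail[0])
}
```

A caller would typically run both checks once per agent turn and react to the stricter result.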
Architecture
FeLLAMA uses a layered architecture. The CLI provides the user interface, the Gateway manages transport and sessions, and the Butler orchestrates work by dispatching standalone worker processes.
Workers are standalone OS processes. They function correctly with or without a Gateway listener. Each worker owns its LLM lifecycle — the Butler dispatches but never proxies LLM calls.
Crate Structure
The workspace is organized into five crates, each with a single non-overlapping responsibility.
If a function is useful to two or more crates, it belongs in fellama-core.
| Crate | Role | Dependencies |
|---|---|---|
| fellama-core | Shared infrastructure: config, constants, OpenAI client, SQLite store, async dispatcher, IPC, orchestrator, simple agent | External crates only |
| fellama-agents | LLM-powered agents that call an LLM to achieve goals | fellama-core |
| fellama-tools | Deterministic tools that work without LLM calls | fellama-core |
| fellama-memory | History browsing, vector embeddings, semantic search | fellama-core |
| fellama-cli | Terminal UI, virtual shell, WebSocket client/server, orchestration | fellama-core |
Butler Orchestrator
The Butler is the brain of FeLLAMA. It receives user objectives, decomposes them into dependency-aware task graphs, dispatches agents in parallel waves, reviews results, and retries on failure.
DAG Execution
Tasks declare depends_on relationships. Outputs are wired between tasks via input_bindings.
Independent tasks run in parallel within dependency waves.
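The wave grouping described above can be sketched as a repeated "pick everything whose dependencies are done" pass. This is a minimal illustration, assuming tasks are identified by string ids and `depends_on` is available as a map; the function name `dependency_waves` is hypothetical, and `input_bindings` wiring is left out.

```rust
use std::collections::{HashMap, HashSet};

/// Groups task ids into waves: every task in a wave depends only on tasks
/// from earlier waves. Returns None on a cycle or an unknown dependency.
fn dependency_waves<'a>(
    depends_on: &HashMap<&'a str, Vec<&'a str>>,
) -> Option<Vec<Vec<String>>> {
    let mut done: HashSet<&'a str> = HashSet::new();
    let mut waves: Vec<Vec<String>> = Vec::new();
    while done.len() < depends_on.len() {
        // A task is ready when it has not run and all its dependencies have.
        let mut wave: Vec<&'a str> = depends_on
            .iter()
            .filter(|(id, deps)| !done.contains(*id) && deps.iter().all(|d| done.contains(d)))
            .map(|(id, _)| *id)
            .collect();
        if wave.is_empty() {
            return None; // nothing can make progress
        }
        wave.sort(); // deterministic order for the example
        done.extend(wave.iter().copied());
        waves.push(wave.into_iter().map(String::from).collect());
    }
    Some(waves)
}
```

Tasks within one returned wave have no dependencies on each other, so they are the ones that could be dispatched in parallel.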
Review Loop
After each wave, an LLM reviews the outputs and decides whether to Accept, Retry (with a different agent or parameters), or Continue (append new tasks). The loop runs for at most N review rounds, after which the current results are accepted regardless.
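The bounded review loop could look roughly like the following. It is a sketch under assumptions: the reviewer is modelled as a closure instead of an LLM call, and `Decision`, `ReviewOutcome`, and `review_loop` are hypothetical names.

```rust
/// The three review decisions named in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Decision {
    Accept,
    Retry,
    Continue,
}

/// Hypothetical summary of how the loop ended.
#[derive(Debug, PartialEq)]
struct ReviewOutcome {
    rounds: usize,
    forced: bool, // true when the round limit was hit without an Accept
}

/// Calls `review` once per round, for at most `max_rounds` rounds.
fn review_loop(max_rounds: usize, mut review: impl FnMut(usize) -> Decision) -> ReviewOutcome {
    for round in 1..=max_rounds {
        match review(round) {
            Decision::Accept => return ReviewOutcome { rounds: round, forced: false },
            Decision::Retry => { /* re-dispatch with a different agent or parameters */ }
            Decision::Continue => { /* append new tasks and run another wave */ }
        }
    }
    // Round limit reached: accept whatever the last wave produced.
    ReviewOutcome { rounds: max_rounds, forced: true }
}
```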
Concurrency Control
A global semaphore limits maximum concurrent agents to prevent resource exhaustion. Round-robin dispatch ensures fairness across sessions.
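One way to combine a global limit with per-session fairness is to take one task per session in turn until the free slots run out. The sketch below assumes each session has its own queue and that `free_slots` stands in for the permits currently available on the global semaphore; `round_robin_pick` is a hypothetical name.

```rust
use std::collections::VecDeque;

/// Picks up to `free_slots` tasks, taking one task per session in turn
/// so that no single session can fill every slot.
fn round_robin_pick(queues: &mut [VecDeque<String>], free_slots: usize) -> Vec<String> {
    let mut picked = Vec::new();
    loop {
        let mut progressed = false;
        for queue in queues.iter_mut() {
            if picked.len() == free_slots {
                return picked; // global limit reached
            }
            if let Some(task) = queue.pop_front() {
                picked.push(task);
                progressed = true;
            }
        }
        if !progressed {
            return picked; // every queue is empty
        }
    }
}
```

With three sessions queued as `[a1, a2, a3]`, `[b1]`, and `[c1, c2]` and four free slots, the pick order is `a1, b1, c1, a2`.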
Fault Recovery
Per-task alternatives allow failed tasks to retry with a different agent or skill. The review loop can inject corrective tasks based on observed failures.
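Per-task alternatives amount to an ordered fallback list. A minimal sketch, assuming the candidates are agent names and the dispatch itself is passed in as a closure (`run_with_alternatives` is a hypothetical helper, not a FeLLAMA function):

```rust
/// Tries each candidate agent in order. Returns the first success together
/// with the agent that produced it, or every error if all candidates fail.
fn run_with_alternatives<T, E>(
    candidates: &[&str],
    mut run: impl FnMut(&str) -> Result<T, E>,
) -> Result<(String, T), Vec<E>> {
    let mut errors = Vec::new();
    for agent in candidates {
        match run(agent) {
            Ok(output) => return Ok((agent.to_string(), output)),
            Err(e) => errors.push(e), // keep the failure for later review
        }
    }
    Err(errors)
}
```

Collecting the errors instead of discarding them keeps the observed failures available to a reviewer that wants to inject corrective tasks.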
Agents
FeLLAMA ships with 9 LLM-powered agents. Each is a standalone binary that receives input
via CLI arguments, communicates via NDJSON, and returns an AgentResponse envelope.
| Agent | Purpose | Pattern |
|---|---|---|
| fellama-smartweb-agent | LLM-directed browser automation with CDP. Multi-turn research with stuck detection, memory management, quality gates, and time budgets. | Web Agent |
| fellama-skill-worker | Executes Agent Skill packages with tool-calling LLM loop, safety evaluation, and constraint isolation. | Orchestrated |
| fellama-content-transformer | Faithful content transformation and output-shape conversion. | Simple |
| fellama-info-distiller | Content distillation: extracts key facts from large text. | Simple |
| fellama-objective-analyzer | Normalizes user requests into structured objective objects. | Simple |
| fellama-summarizer | Text summarization with configurable output format. | Simple |
| fellama-validator | Validates agent outputs against expected schemas and constraints. | Simple |
| fellama-prompt-safety-advisor | Prompt risk scoring and safety evaluation before skill execution. | Simple |
| fellama-syslog-reviewer | Analyzes system and application logs for issues. | Simple |
Tools
FeLLAMA also includes 12 deterministic tools that perform work without LLM calls. They produce repeatable output and are dispatched by the Butler alongside agents.
| Tool | Purpose |
|---|---|
fellama-pdf-extractor | Extract text content from PDF files |
fellama-spreadsheet-extractor | Parse XLSX, XLS, and CSV files |
fellama-data-extractor | Structured data extraction from documents |
fellama-doc-extractor | General document processing |
fellama-data-vault | Secure data storage and retrieval with access control |
fellama-find | File search utility |
fellama-cron | Scheduled job execution |
fellama-housekeeper | Session and memory cleanup |
fellama-install-skill | Install skill packages from repositories |
fellama-notifier | Event notification dispatch |
fellama-search-tools | Tool and agent discovery |
fellama-time | Time and scheduling utilities |
Memory Systems
Vector DB (LanceDB)
Semantic search over documents using embedding vectors. Documents are automatically chunked with configurable overlap, embedded via your configured embedding model, and stored in LanceDB.
- Modes: --embedding, --file, --search, --get, --remove
- Cosine distance similarity
- UUID-based document lifecycle tracking
- Configurable chunk size and overlap
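To make chunking with overlap and cosine distance concrete, here is a small standalone sketch. It chunks by characters and computes distance as one minus cosine similarity; the actual chunking unit and the LanceDB calls are not shown in this document, so `chunk_with_overlap` and `cosine_distance` are illustrative names only.

```rust
/// Splits `text` into chunks of `size` characters, where each chunk starts
/// `size - overlap` characters after the previous one.
fn chunk_with_overlap(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "overlap must be smaller than chunk size");
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}

/// Cosine distance: 1 - cos(angle between a and b).
/// The result is NaN if either vector has zero length.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}
```

A larger overlap repeats more context across neighbouring chunks at the cost of storing more embeddings.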
History (SQLite)
Browse and export past sessions and task results. History persists across server restarts and is available through both a CLI and a TUI.
- Session browsing with filters
- Task result export
- TUI and CLI interfaces
- SQLite-backed for reliability
IPC Protocol
All inter-process communication uses NDJSON (newline-delimited JSON). Each line before the
final line is a ProgressEvent. The final line is always an AgentResponse envelope.
{"event":"step_started","task_id":"abc-123","step":"extracting page"}
{"event":"token","task_id":"abc-123","text":"Processing "}
{"event":"stream_token","task_id":"abc-123","text":"analysis..."}
{"event":"step_done","task_id":"abc-123","step":"extracting page","result":null}
{"task_id":"abc-123","status":"success","output":{"content":"..."},"error":null}
| Event | Purpose |
|---|---|
StepStarted | A discrete step began (extraction, distillation, action) |
StepDone | Step completed, optional result payload |
Token | Streaming output token routed to main TUI pane |
StreamToken | Streaming LLM token routed to TUI side panel |
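On the consuming side, the "every line but the last is a progress event" rule can be applied before any JSON parsing. The sketch below only separates the lines; a real consumer would then deserialize each one with a JSON library. `split_ndjson` is a hypothetical helper name.

```rust
/// Splits a worker's NDJSON output into (progress event lines, final envelope line).
/// Blank lines are ignored. Returns None when there is no output at all.
fn split_ndjson(output: &str) -> Option<(Vec<&str>, &str)> {
    let mut lines: Vec<&str> = output
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .collect();
    // The final non-empty line is the AgentResponse envelope.
    let envelope = lines.pop()?;
    Some((lines, envelope))
}
```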
AgentResponse Envelope
AgentResponse {
    task_id: Option<String>,
    parent_id: Option<String>,
    status: AgentStatus, // success | partial | error
    output: Value,       // must have "content" key (HR-9)
    error: Option<Value>,
}
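The "content" requirement on `output` is easy to check mechanically. The following is a simplified, std-only sketch: it replaces the JSON `Value` fields with a string map and a string so that it compiles without external crates, which is an assumption made for the example and not the real field types. The method name `check_content_key` is also hypothetical.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum AgentStatus {
    Success,
    Partial,
    Error,
}

/// Simplified stand-in for the envelope: `output` and `error` use plain
/// std types here instead of a JSON value type.
#[derive(Debug)]
struct AgentResponse {
    task_id: Option<String>,
    parent_id: Option<String>,
    status: AgentStatus,
    output: BTreeMap<String, String>,
    error: Option<String>,
}

impl AgentResponse {
    /// Checks the rule that `output` must carry a "content" key.
    fn check_content_key(&self) -> Result<(), String> {
        if self.output.contains_key("content") {
            Ok(())
        } else {
            Err("output is missing the \"content\" key".to_string())
        }
    }
}
```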
Agent Patterns
FeLLAMA defines three reusable agent patterns. New agents implement one of these patterns and automatically inherit the IPC protocol, error handling, and CLI conventions.
Pattern 1: Simple Agent
Single-turn LLM call
Sends one prompt, parses one response. Used by the majority of agents:
objective-analyzer, summarizer, validator, info-distiller, content-transformer, prompt-safety-advisor, syslog-reviewer.
Logs to ~/.fellama/<binary-name>.log.
Pattern 2: Orchestrated Agent
Multi-turn tool-calling LLM loop
Runs an LLM in a tool-calling loop: send prompt + tools, execute returned calls,
feed results back, repeat until done or max steps. Used by skill-worker.
Session dir: ~/.fellama/skill-worker/<uuid>/.
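The loop described above can be sketched with the LLM and the tool runner passed in as closures, which keeps the example free of any real client. `LlmReply` and `tool_loop` are hypothetical names, and the transcript is a plain list of strings instead of a real message history.

```rust
/// What the (mocked) LLM can return on each step.
enum LlmReply {
    ToolCall { name: String, args: String },
    Done(String),
}

/// Runs the tool-calling loop: ask the LLM, execute the requested tool,
/// feed the result back, and stop on a final answer or after `max_steps`.
fn tool_loop(
    max_steps: usize,
    mut llm: impl FnMut(&[String]) -> LlmReply,
    mut run_tool: impl FnMut(&str, &str) -> String,
) -> Option<String> {
    let mut transcript: Vec<String> = Vec::new();
    for _ in 0..max_steps {
        match llm(&transcript) {
            LlmReply::Done(answer) => return Some(answer),
            LlmReply::ToolCall { name, args } => {
                let result = run_tool(&name, &args);
                // The tool result becomes part of the next prompt.
                transcript.push(format!("{}({}) -> {}", name, args, result));
            }
        }
    }
    None // step limit reached without a final answer
}
```

Returning `None` at the step limit leaves it to the caller to decide whether that counts as a partial result or an error.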
Pattern 3: Web Agent
Turn-based browser loop
Drives a browser through an LLM planning loop. The LLM is stateless per turn — the agent
owns all session state and rebuilds a full input JSON each turn. Used by smartweb-agent.
Session dir: ~/.fellama/<session-id>/.
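Because the LLM keeps no memory between turns, everything it needs has to be rebuilt from agent-owned state each time. A minimal sketch of that idea follows; the fields on `WebSession` and `TurnInput` are invented for the example, and a plain struct stands in for the input JSON.

```rust
/// Session state owned by the agent (hypothetical fields).
#[derive(Debug, Default)]
struct WebSession {
    objective: String,
    visited: Vec<String>,
    notes: Vec<String>,
}

/// Full input handed to the LLM for a single turn.
#[derive(Debug, PartialEq)]
struct TurnInput {
    turn: usize,
    objective: String,
    visited: Vec<String>,
    notes: Vec<String>,
}

impl WebSession {
    /// Records the outcome of a turn in the agent's own state.
    fn record(&mut self, url: &str, note: &str) {
        self.visited.push(url.to_string());
        self.notes.push(note.to_string());
    }

    /// Rebuilds the complete input from session state; nothing is carried
    /// over inside the LLM itself.
    fn build_turn_input(&self, turn: usize) -> TurnInput {
        TurnInput {
            turn,
            objective: self.objective.clone(),
            visited: self.visited.clone(),
            notes: self.notes.clone(),
        }
    }
}
```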