Multi-Agent AI Orchestration

FeLLAMA

Forged in Rust. Powered by LLMs.

A persistent background gateway that decomposes objectives, dispatches specialized AI agents, and orchestrates complex workflows — all from your terminal.

Your AI Command Center

FeLLAMA is a multi-agent AI orchestration system written entirely in Rust. It runs as a persistent background service, receiving instructions via WebSocket from a terminal UI client. A central Butler orchestrator decomposes your objectives into a dependency-aware task graph, dispatches specialized worker agents, and assembles the results — with built-in safety evaluation, session persistence, and full observability.

Key Features

Multi-Agent Orchestration

The Butler orchestrator decomposes objectives into a DAG of tasks, dispatches agents in parallel waves, reviews outputs, and retries failures — all automatically.
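To make wave-based dispatch concrete, here is a minimal sketch (not FeLLAMA's actual scheduler) of how a dependency map can be split into parallel dispatch waves: every task whose dependencies are already satisfied runs in the current wave, and later waves wait on earlier ones.

```rust
use std::collections::{HashMap, HashSet};

/// Group tasks into "waves": all tasks in a wave have every dependency
/// satisfied by earlier waves, so they can be dispatched in parallel.
/// Illustrative only; FeLLAMA's real scheduler is more involved.
fn dispatch_waves(deps: &HashMap<&str, Vec<&str>>) -> Vec<Vec<String>> {
    let mut done: HashSet<String> = HashSet::new();
    let mut waves = Vec::new();
    while done.len() < deps.len() {
        let mut wave: Vec<String> = deps
            .iter()
            .filter(|&(task, reqs)| {
                !done.contains(*task) && reqs.iter().all(|r| done.contains(*r))
            })
            .map(|(task, _)| task.to_string())
            .collect();
        if wave.is_empty() {
            break; // a cycle: no remaining task can make progress
        }
        wave.sort(); // deterministic ordering within a wave
        for t in &wave {
            done.insert(t.clone());
        }
        waves.push(wave);
    }
    waves
}

fn main() {
    let mut deps = HashMap::new();
    deps.insert("research", vec![]);
    deps.insert("outline", vec![]);
    deps.insert("draft", vec!["research", "outline"]);
    deps.insert("review", vec!["draft"]);
    // research and outline dispatch together, then draft, then review
    println!("{:?}", dispatch_waves(&deps));
}
```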

LLM-Directed Browser Automation

The SmartWeb agent drives a real browser via CDP — navigating, clicking, extracting PDFs, taking screenshots — all guided by LLM planning with stuck detection and quality gates.

Skill Package System

Install and execute Agent Skills — self-contained packages with scripts, resources, and constraints. Each skill runs in isolation with safety evaluation before execution.
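Conceptually, a skill package is a manifest plus its scripts and resources. Purely as an illustration of that shape (every field name below is hypothetical, not FeLLAMA's actual schema), a manifest might look like:

```toml
# Hypothetical skill manifest -- field names are illustrative only
name = "pdf-summarizer"
entrypoint = "run.sh"
timeout_secs = 120
allowed_paths = ["./workspace"]
```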

Vector Memory

LanceDB-powered semantic search with automatic document chunking, embedding, and retrieval. Persistent memory across sessions for context-aware interactions.
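The chunking step is simple to picture. Here is an illustrative chunker using fixed character windows with overlap (the real pipeline chunks, embeds, and stores vectors in LanceDB; this sketch stops at chunking):

```rust
/// Split text into fixed-size character chunks with overlap, a common
/// prelude to embedding documents for vector search. Illustrative only.
fn chunk_text(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size, "overlap must be smaller than chunk size");
    let chars: Vec<char> = text.chars().collect();
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break;
        }
        // overlap keeps context that straddles a chunk boundary retrievable
        start = end - overlap;
    }
    chunks
}

fn main() {
    for chunk in chunk_text("abcdefghij", 4, 1) {
        println!("{chunk}");
    }
}
```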

Safety-First Design

Built-in Prompt Safety Advisor scores risk before execution. Skill packages run with constraint isolation, timeouts, and scoped file access. No unreviewed code execution.
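As a sketch of the gate-before-execute idea only: FeLLAMA's real advisor is LLM-based, while the patterns and weights below are invented for illustration.

```rust
/// Toy risk heuristic: sum weights for flagged patterns, then gate
/// execution above a threshold. Patterns and weights are made up;
/// the real Prompt Safety Advisor scores risk with an LLM.
fn risk_score(prompt: &str) -> u32 {
    const FLAGS: &[(&str, u32)] = &[
        ("rm -rf", 50),
        ("curl | sh", 40),
        ("password", 20),
        ("sudo", 10),
    ];
    let lower = prompt.to_lowercase();
    FLAGS
        .iter()
        .filter(|(pat, _)| lower.contains(*pat))
        .map(|(_, w)| *w)
        .sum()
}

/// Execution proceeds only when the score stays under the threshold.
fn is_safe(prompt: &str) -> bool {
    risk_score(prompt) < 50
}

fn main() {
    println!("{}", risk_score("sudo rm -rf /tmp/x")); // 50 + 10 = 60
    assert!(is_safe("summarize this PDF"));
}
```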

Session Persistence

Every session is checkpointed with full state — memory, progress, traces. Resume interrupted work, replay specific turns for debugging, and audit complete execution history.
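The checkpoint-and-replay idea can be sketched as a turn log that round-trips through a flat record format. FeLLAMA's real checkpoints also carry memory and traces; the tab-separated format here is purely illustrative.

```rust
/// One completed turn of a session. Real checkpoints carry far more
/// state (memory, traces); this sketch keeps only the essentials.
#[derive(Debug, Clone, PartialEq)]
struct Turn {
    id: u32,
    input: String,
    output: String,
}

/// Serialize turns as tab-separated records (a real format would
/// escape tabs and newlines inside fields).
fn save(turns: &[Turn]) -> String {
    turns
        .iter()
        .map(|t| format!("{}\t{}\t{}", t.id, t.input, t.output))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Restore a session from its record log, skipping malformed lines.
fn load(data: &str) -> Vec<Turn> {
    data.lines()
        .filter_map(|l| {
            let mut parts = l.splitn(3, '\t');
            Some(Turn {
                id: parts.next()?.parse().ok()?,
                input: parts.next()?.to_string(),
                output: parts.next()?.to_string(),
            })
        })
        .collect()
}

fn main() {
    let turns = vec![
        Turn { id: 1, input: "fetch report".to_string(), output: "ok".to_string() },
        Turn { id: 2, input: "summarize".to_string(), output: "ok".to_string() },
    ];
    let restored = load(&save(&turns));
    assert_eq!(restored, turns);
    // Replay a specific turn for debugging:
    println!("{:?}", restored.iter().find(|t| t.id == 2));
}
```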

How It All Fits Together

- fellama-cli: TUI, virtual shell
- Gateway: WebSocket, session registry, status fan-out
- Butler: intake, objective analysis, dispatch, review
- fellama-smartweb-agent: browser automation
- fellama-skill-worker: skill execution
- fellama-info-distiller: content distillation
- fellama-objective-analyzer: request normalization
- fellama-summarizer: summarization
- fellama-validator: output validation
- fellama-syslog-reviewer: log analysis
- fellama-prompt-safety: risk scoring
- deterministic tools: PDF, spreadsheets, vault...
9 LLM Agents

Smart workers that use LLMs to achieve goals — from web research to content transformation.

12 Deterministic Tools

Reliable utilities for PDF extraction, spreadsheets, search, scheduling, and secure data storage.

2 Memory Systems

Vector DB for semantic search and History for session browsing — persistent across restarts.

Three Steps to Start

1. Install

Clone the repo and run the setup script. It checks prerequisites, builds all binaries, and creates your config.

```bash
git clone https://github.com/rexf/fellama.git
cd fellama
./setup.sh
```
2. Configure

Point FeLLAMA to your LLM endpoint. Any OpenAI-compatible API works.

~/.fellama/config.toml

```toml
endpoint = "http://localhost:8000/v1"
model = "your-model-name"
agent_temperature = 0.6
```
3. Run

Start the server and connect with the terminal client.

```bash
# Start the server
cargo run --release --bin fellama-server

# In another terminal, connect with the CLI
cargo run --release --bin fellama
```

Why Rust?

Fe is the chemical symbol for iron, and FeLLAMA is forged in Rust, the language named for the iron oxide that forms on metal. That's no coincidence.

Zero-Cost

Abstractions compile away. The orchestrator runs with minimal overhead, even with dozens of concurrent agents.

Memory Safe

No garbage collector, no null pointers, no data races. Long-running server processes stay stable.

Async Native

Built on Tokio for high-concurrency async I/O. WebSocket sessions, LLM calls, and browser automation run in parallel.