# Quick Start

Go from zero to running FeLLAMA in minutes: clone, build, configure, and run your first query.
## Prerequisites

**Rust toolchain (required).** Rust 2024 edition. Install via rustup.rs:

```sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

**LLM endpoint (required).** Any OpenAI-compatible API, either local (llama.cpp, vLLM, Ollama) or remote (OpenAI, or Anthropic via a proxy).

**Browserless (optional, for web automation).** Required only for the SmartWeb agent; provides headless Chrome via CDP:

```sh
docker run -p 3000:3000 browserless/chrome
```

**Embedding endpoint (optional, for vector search).** Required only for the Vector DB. Any embedding endpoint serving the OpenAI-compatible /v1/embeddings API.
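If you want to sanity-check an embedding endpoint before wiring it in, the request follows the standard OpenAI shape. A minimal sketch using only the Python standard library; the endpoint URL and model name are placeholders, not values FeLLAMA ships with:

```python
import json
import urllib.request

def embedding_request(endpoint: str, model: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/embeddings API."""
    body = json.dumps({"model": model, "input": texts}).encode()
    return urllib.request.Request(
        f"{endpoint.rstrip('/')}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Send with urllib.request.urlopen(req) once your embedding server is up.
req = embedding_request("http://localhost:8000/v1", "your-embedding-model", ["hello"])
```

Sending the request is left to `urllib.request.urlopen(req)` so the sketch runs without a live server.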
## Installation

Clone the repository and run the setup script. It checks prerequisites, builds all release binaries, and creates your configuration directory:

```sh
git clone https://github.com/rexf/fellama.git
cd fellama
./setup.sh
```

The setup script will:

- create the ~/.fellama/ directory
- generate a default config.toml
- generate a default url_rules.toml

**Manual build:** if you prefer not to use the script, run `cargo build --release` directly. Binaries will be in `target/release/`.
## Configuration

Edit ~/.fellama/config.toml to point FeLLAMA at your LLM endpoint. This is the minimum configuration needed to get started:

```toml
# Point to your OpenAI-compatible LLM API
endpoint = "http://localhost:8000/v1"

# Model name as known by your endpoint
model = "your-model-name"

# Controls LLM creativity (0.0 = deterministic, 1.0 = creative)
agent_temperature = 0.6

# Optional: enable browser automation
# browserless_endpoint = "ws://localhost:3000"

# Optional: enable detailed logging
# enable_trace_log = true
```

See the Configuration Reference for all available options.
## Running FeLLAMA

FeLLAMA runs as a persistent background service. Start the server in one terminal:

```sh
cargo run --release --bin fellama-server
```

The server starts a WebSocket listener and waits for client connections. It manages sessions, dispatches agents, and coordinates the Butler orchestrator.

In a second terminal, launch the TUI client to connect to the running server:

```sh
cargo run --release --bin fellama
```

The CLI opens a terminal UI with a virtual shell. Type your request and FeLLAMA will decompose it, dispatch agents, and stream results back.
## Running Agents Standalone

You can also run agents directly from the command line without the server.

Web research with the SmartWeb agent:

```sh
cargo run --release --bin fellama-smartweb-agent -- \
  --request "Find the latest Rust async runtime benchmarks" \
  --human
```

Summarize a document:

```sh
cargo run --release --bin fellama-summarizer -- \
  --file report.pdf \
  --output-format markdown \
  --human
```

Extract data from a spreadsheet:

```sh
cargo run --release --bin fellama-spreadsheet-extractor -- \
  --file data.xlsx \
  --human
```

Machine-to-machine with JSON args:

```sh
cargo run --release --bin fellama-objective-analyzer -- \
  --json_arg '{"request": "analyze server logs for errors", "output_format": "json"}'
```
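The `--json_arg` form makes scripting straightforward. A hedged sketch of driving an agent from Python: it assumes the release binary exists at the path shown and that the agent writes its result to stdout, neither of which this Quick Start guarantees for every agent:

```python
import json
import subprocess

def build_json_arg(request: str, output_format: str = "json") -> str:
    """Serialize the payload passed via --json_arg, as in the CLI example above."""
    return json.dumps({"request": request, "output_format": output_format})

def run_objective_analyzer(request: str) -> str:
    # Assumes you have already run `cargo build --release`.
    binary = "target/release/fellama-objective-analyzer"
    result = subprocess.run(
        [binary, "--json_arg", build_json_arg(request)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Building the payload needs no binary, so it can be exercised anywhere.
arg = build_json_arg("analyze server logs for errors")
```

`check=True` raises if the agent exits non-zero, which is usually what a calling script wants.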
## Troubleshooting

**Build fails:** ensure you have a C compiler and the OpenSSL development headers installed. On macOS: `xcode-select --install`. On Ubuntu: `sudo apt install build-essential libssl-dev pkg-config`.
**Can't reach the LLM:** verify that the endpoint in config.toml is correct and the LLM server is running. Test with:

```sh
curl http://localhost:8000/v1/models
```
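A successful `/v1/models` call returns a JSON list in the OpenAI format, and checking that your configured model appears in it catches model-name typos early. A small sketch against a canned response (the model id is a placeholder):

```python
import json

# A canned /v1/models response in the OpenAI list format.
SAMPLE = '{"object": "list", "data": [{"id": "your-model-name", "object": "model"}]}'

def served_model_ids(raw: str) -> set[str]:
    """Extract model ids from an OpenAI-compatible /v1/models response."""
    return {m["id"] for m in json.loads(raw)["data"]}

# The `model` in config.toml must be one your endpoint actually serves.
assert "your-model-name" in served_model_ids(SAMPLE)
```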
**Browser automation fails:** ensure Browserless is running (`docker run -p 3000:3000 browserless/chrome`) and set browserless_endpoint = "ws://localhost:3000" in config.toml.
**Need more detail:** set enable_trace_log = true in config.toml. Logs appear in ~/.fellama/: single-turn agents write to `<binary-name>.log`, multi-turn agents to `<session-id>/trace.log`.
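Given that layout, here is a short sketch that gathers both kinds of logs under a FeLLAMA home directory. It is demonstrated against a throwaway directory mimicking ~/.fellama/ so it runs anywhere; the file names are examples, not real agent output:

```python
import tempfile
from pathlib import Path

def collect_logs(home: Path) -> tuple[list[Path], list[Path]]:
    """Return (single-turn agent logs, multi-turn session trace logs)."""
    single_turn = sorted(home.glob("*.log"))       # <binary-name>.log
    multi_turn = sorted(home.glob("*/trace.log"))  # <session-id>/trace.log
    return single_turn, multi_turn

# Build a throwaway directory with one log of each kind.
home = Path(tempfile.mkdtemp())
(home / "fellama-summarizer.log").touch()
(home / "abc123").mkdir()
(home / "abc123" / "trace.log").touch()

single, multi = collect_logs(home)
```

Point `home` at `Path("~/.fellama").expanduser()` to inspect your real logs.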
## Next Steps

Explore the full architecture or configure advanced features.