lilbots-01

phiat/lilbots-01

Real-time multi-agent AI playground — games, debates, brainstorms, and creative writing with configurable presets, tools, and live cost tracking

# Lilbots

[![CI](https://github.com/phiat/lilbots-01/actions/workflows/ci.yml/badge.svg)](https://github.com/phiat/lilbots-01/actions/workflows/ci.yml) ![Dialyzer](https://img.shields.io/badge/Dialyzer-passing-green) ![Elixir](https://img.shields.io/badge/Elixir-1.19-purple?logo=elixir) ![Phoenix](https://img.shields.io/badge/Phoenix-1.8-orange?logo=phoenixframework) ![License: MIT](https://img.shields.io/badge/License-MIT-green.svg) ![Credo](https://img.shields.io/badge/credo-strict-blueviolet) ![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)

![Lilbots Dashboard](docs/dashboard.png)

A multi-agent LLM orchestration platform built with Elixir and Phoenix LiveView. Spawn AI agents, wire them together, run structured games and debates, and watch them interact in real-time with streaming responses and cost tracking.

## Features

- **Agent Grid**: Spawn up to 12 concurrent agents, each with independent model, personality, and temperature
- **Real-time Streaming**: Token-by-token response rendering via SSE with 50ms batched UI updates
- **Agent Wiring**: SVG cable visualization connecting agents with topology-based message routing (broadcast, chain, ring, star)
- **78 Personality Presets**: Historical figures, fictional characters, professionals, personality types, and fun personas across 5 categories
- **11 Meta Presets**: Structured multi-agent interaction patterns with game boards, scoring, and tools
- **30 Agent Tools**: Web search (Brave), memory, calculator, dice, randomness, timers, buzzers, scoring, voting, whisper, word tracking, rhyme checking, forbidden words, codenames board, tic-tac-toe board
- **Cost Tracking**: Real-time LLM cost display with per-model pricing, sparkline graphs, and per-agent breakdowns
- **Fleet Configuration**: Group agents, apply shared prompts, coordinate broadcast responses
- **Shuffle Controls**: Randomize presets, models, or temperatures fleet-wide with category filters ($, $$, provider)
- **22 LLM Models**: Groq (gpt-oss, llama, compound, qwen, llama-4), 9 Anthropic Claude models, and 4 Google Gemini models (2.5 Pro, 2.5 Flash, 2.0 Flash, 2.0 Flash Lite)
- **CLI Games**: Run debates, telephone, and trivia directly from the terminal

## Quick Start

```bash
# Install dependencies
just setup          # or: make setup

# Create .env file (see env.example for all options)
echo "GROQ_API_KEY=your_key_here" > .env

# Start the server
just server         # or: make server
```

Visit [`localhost:4000`](http://localhost:4000) for the dashboard, or [`localhost:4000/meta`](http://localhost:4000/meta) for meta preset sessions.

4 agents spawn on startup by default (configurable via `LILBOTS_AGENT_COUNT`) using `openai/gpt-oss-20b`. All agents receive dynamic date context in system prompts. A health check is available at [`/health`](http://localhost:4000/health).

## Meta Presets

Structured multi-agent interaction patterns with game-specific tools and visual boards:

| Pattern | Agents | Description |
|---------|--------|-------------|
| **Debate** | 4 | Pro vs Con with moderator and adjudicator |
| **Panel Discussion** | 4-6 | Moderated multi-perspective discussion |
| **Brainstorm** | 3-5 | Structured ideation with critic and synthesizer |
| **Writers Room** | 3-5 | Collaborative creative writing |
| **Trivia** | 3-6 | Quiz show with buzzer and scoring tools |
| **Rhyme Battle** | 3-5 | Poetry competition with rhyme-checking judge |
| **Word Chain** | 3-8 | Sequential word association game |
| **Telephone** | 3-10 | Message transformation chain (standard, lossy, creative) |
| **Tic-Tac-Toe** | 2 | Classic game with visual SVG board |
| **Taboo** | 3-4 | Word guessing with forbidden word enforcement |
| **Codenames** | 4 | Team-based word association with visual board |

## Agent Tools

| Tool | Description |
|------|-------------|
| `web_search` | Search the web via Brave Search API |
| `memory_save` / `memory_recall` | Persistent key-value memory per agent |
| `calculator` | Safe mathematical expression evaluation |
| `dice` / `coin_flip` / `pick_random` / `shuffle` / `draw` | Randomness tools (d20, 2d6+3, etc.) |
| `timer_start` / `timer_check` | Countdown timers for timed challenges |
| `buzzer` | Buzz in for trivia/quiz games |
| `add_score` / `get_score` / `leaderboard` | Scoring and rankings |
| `rhyme_check` | Verify rhyming word pairs |
| `track_word` / `list_used_words` | Word usage tracking for word games |
| `set_forbidden` / `check_forbidden` | Forbidden word enforcement for Taboo |
| `whisper` | Private messages between specific roles |
| `create_poll` / `cast_vote` / `poll_results` | Multi-agent voting |
| `codenames_generate` / `codenames_reveal` / `codenames_view` | Codenames board management |
| `tictactoe_generate` / `tictactoe_move` / `tictactoe_view` | Tic-Tac-Toe board management |

## Architecture

```
Browser ──WebSocket──▶ Phoenix LiveView
                            │
                       PubSub Topics
                       ├─ "agents"      (lifecycle, status, responses)
                       ├─ "agent:<id>"  (streaming tokens)
                       └─ "costs"       (cost tracking updates)
                            │
               ┌────────────┼────────────┐
               ▼            ▼            ▼
        AgentSupervisor  CostTracker  MetaPreset
      (DynamicSupervisor)   (ETS)       Engine
               │                          │
         ┌─────┼─────┐             Topology Router
         ▼     ▼     ▼            (broadcast/chain/
       Agent Agent Agent           ring/star/tree)
       (GenServer, each)
               │
        TaskSupervisor
       (async LLM calls)
               │
          LLM Module
    (Groq/Anthropic/Gemini SSE)
```

See [docs/architecture.md](docs/architecture.md) and [docs/dashboard-architecture.md](docs/dashboard-architecture.md) for detailed diagrams.

## Development

Both [`just`](https://github.com/casey/just) and `make` are supported. All recipes share the same names — swap `just` ↔ `make` freely. `just` adds recipe parameters and auto `.env` loading; `make` works without extra dependencies.

| Task | `just` | `make` |
|------|--------|--------|
| Show all commands | `just --list` | `make help` |
| **Server** | | |
| Start (background) | `just server` | `make server` |
| Start (foreground) | `just server-fg` | `make server-fg` |
| Stop / restart / status | `just stop` / `restart` / `status` | `make stop` / `restart` / `status` |
| Tail logs / clear | `just logs` / `logs-clear` | `make logs` / `logs-clear` |
| **Testing** | | |
| Run all tests | `just test` | `make test` |
| Run specific file | `just test test/lilbots/agent_test.exs` | — |
| Engine tests | `just test-engine` | `make test-engine` |
| E2E (Playwright) | `just test-e2e` | `make test-e2e` |
| All checks | `just check` | `make check` |
| **Code Quality** | | |
| Format / lint | `just format` / `lint` | `make format` / `lint` |
| Credo / Dialyzer | `just credo` / `dialyzer` | `make credo` / `dialyzer` |
| **Games** | | |
| Debate | `just debate "Should AI have rights?"` | `make debate` |
| Telephone | `just telephone "Hello world"` | `make telephone` |
| Trivia | `just trivia "History"` | `make trivia` |

## Testing

385 tests across three layers:

- **Unit/integration** (`test/`) — Agent, LLM providers, tools, meta preset engine, cost tracking
- **LiveView** (`test/lilbots_web/live/`) — DashboardLive and MetaPresetLive event handlers
- **E2E** (`e2e/`) — Playwright browser tests against a running server

Test support modules:

| Module | Purpose |
|--------|---------|
| `LiveCase` | LiveView test template with MockLLM setup and agent cleanup |
| `EngineCase` | Meta preset engine tests with debate helpers and prompt validation |
| `MockLLM` | In-process mock LLM with configurable response functions |

```bash
just test                                  # All tests (or: make test)
just test test/lilbots/agent_test.exs      # Specific file (just only)
just test-engine                           # Meta preset engine tests only
just test-e2e                              # Playwright e2e tests (requires running server)
just check                                 # Format + credo + compile warnings + test
```

## Project Structure

```
lilbots/
├── lib/
│   ├── lilbots/
│   │   ├── agent.ex                      # Agent GenServer (chat, streaming, connections)
│   │   ├── agent_supervisor.ex           # DynamicSupervisor for agent lifecycle
│   │   ├── cost_tracker.ex               # Real-time LLM cost tracking (ETS)
│   │   ├── fleet.ex                      # Fleet/group prompt management
│   │   ├── llm.ex                        # LLM client with provider routing
│   │   ├── llm/
│   │   │   ├── anthropic.ex              # Anthropic Messages API provider
│   │   │   ├── behaviour.ex              # Provider behaviour contract
│   │   │   ├── gemini.ex                 # Google Gemini API provider
│   │   │   ├── tool_loop.ex              # Recursive tool-use loop
│   │   │   ├── sse_parser.ex             # SSE parser (OpenAI-compatible / Groq)
│   │   │   ├── sse_parser_anthropic.ex   # SSE parser (Anthropic format)
│   │   │   └── sse_parser_gemini.ex      # SSE parser (Gemini format)
│   │   ├── names.ex                      # Random name generator
│   │   ├── orchestrator.ex               # Multi-agent coordination
│   │   ├── presets.ex                    # 78 agent personality presets
│   │   ├── presets_batch_1.ex            # 20 additional presets (fun, personality)
│   │   ├── presets_batch_2.ex            # 30 additional presets (fictional, professional, historical)
│   │   ├── meta_presets/
│   │   │   ├── engine.ex                 # Session state machine
│   │   │   ├── meta_preset.ex            # Preset struct
│   │   │   ├── registry.ex               # Meta preset registration
│   │   │   ├── topology.ex               # Message routing rules
│   │   │   └── patterns/                 # 11 interaction patterns
│   │   └── tools/                        # 19 tool modules
│   └── lilbots_web/
│       ├── components/
│       │   ├── agent_card.ex             # Agent grid cards
│       │   ├── agent_config_panel.ex     # Sidebar config panel
│       │   ├── cost_display.ex           # SVG cost badges and sparklines
│       │   ├── game_boards.ex            # SVG game board rendering
│       │   ├── new_agent_modal.ex        # Agent creation modal
│       │   ├── tutorial_modal.ex         # Tutorial/mission briefing modal
│       │   └── wire.ex                   # SVG wire visualization
│       ├── helpers/
│       │   ├── markdown.ex               # Markdown rendering with XSS sanitization
│       │   └── status_helpers.ex         # Shared agent status display helpers
│       └── live/
│           ├── dashboard_live.ex         # Main dashboard LiveView
│           ├── dashboard_live.html.heex  # Dashboard template
│           ├── dashboard/
│           │   ├── fleet.ex              # Fleet management handlers
│           │   ├── shuffle.ex            # Shuffle/randomization handlers
│           │   ├── streaming.ex          # Streaming response handlers
│           │   └── tutorial.ex           # Tutorial flow handlers
│           └── meta_preset_live.ex       # Meta preset session UI
├── test/                                 # 385 tests
├── e2e/                                  # Playwright e2e tests
└── docs/                                 # Architecture and design docs
```

## Configuration

Copy `env.example` to `.env` and set your API keys:

```bash
cp env.example .env
# Edit .env with your GROQ_API_KEY (required)
# Optionally add ANTHROPIC_API_KEY for Claude models
# Optionally add GEMINI_API_KEY for Gemini models
# Optionally add BRAVE_SEARCH_API_KEY for web search tool
# Optionally set LILBOTS_AGENT_COUNT to change startup agent count (default: 4)
```

See `env.example` for all available environment variables including Anthropic, Gemini, and Brave Search configuration.

See [docs/model-pricing.md](docs/model-pricing.md) for per-model token pricing and tier categories.

## License

[MIT](LICENSE)
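
## Appendix: Routing Sketch

The Agent Wiring feature routes messages according to a topology (broadcast, chain, ring, star). The sketch below illustrates one plausible reading of those four modes as a pure function. It is not taken from the project's `topology.ex`: the module name `TopologySketch`, the function `recipients/3`, and the exact semantics (first agent acts as the star hub, the last agent in a chain sends to nobody) are assumptions made for illustration, and the `tree` mode shown in the architecture diagram is not covered.

```elixir
defmodule TopologySketch do
  @moduledoc """
  Hypothetical illustration of topology-based routing.
  Not the project's actual `topology.ex`; semantics are assumed.
  """

  @type agent_id :: String.t()
  @type topology :: :broadcast | :chain | :ring | :star

  @doc """
  Returns the agents that should receive a message sent by `sender`.
  `agents` is the ordered list of agent ids in the session.
  """
  @spec recipients(topology(), [agent_id()], agent_id()) :: [agent_id()]
  def recipients(:broadcast, agents, sender) do
    # Everyone except the sender.
    Enum.reject(agents, &(&1 == sender))
  end

  def recipients(:chain, agents, sender) do
    # The next agent in order; the last agent has no recipient.
    case Enum.find_index(agents, &(&1 == sender)) do
      nil -> []
      index -> agents |> Enum.at(index + 1) |> List.wrap()
    end
  end

  def recipients(:ring, agents, sender) do
    # Like a chain, but the last agent wraps around to the first.
    case Enum.find_index(agents, &(&1 == sender)) do
      nil ->
        []

      index ->
        [Enum.at(agents, rem(index + 1, length(agents)))]
        |> Enum.reject(&(&1 == sender))
    end
  end

  def recipients(:star, [hub | spokes], sender) do
    # Assumption: the first agent is the hub. The hub fans out to all
    # spokes, and each spoke replies only to the hub.
    cond do
      sender == hub -> spokes
      sender in spokes -> [hub]
      true -> []
    end
  end

  def recipients(:star, [], _sender), do: []
end
```

For example, `TopologySketch.recipients(:ring, ["a", "b", "c"], "c")` wraps around to `["a"]`, while the same call with `:chain` returns `[]`. Keeping the routing decision in a pure function like this makes it easy to test separately from the agent processes that deliver the messages.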