The Mission
Conductor is a fully local agentic inference harness built to run on a 2021 MacBook Pro. It gives a local LLM a ReAct loop, persistent memory, a sandboxed workspace, and a set of tools (file operations, URL fetching, shell commands), with no external API calls and no data leaving the machine.
System Architecture
graph TD
U[User Input] --> TUI[Rich TUI]
TUI --> RL[ReAct Loop]
RL -->|Think| LLM[Qwen3 via Ollama]
LLM -->|Act| TD2[Tool Dispatcher]
TD2 --> FS[File Ops]
TD2 --> WF[URL Fetch]
TD2 --> SH[Shell Commands]
TD2 --> MEM[Persistent Memory]
FS & WF & SH & MEM -->|Observe| RL
RL -->|Max 8 steps| OUT[Final Response]
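The Think, Act, Observe cycle in the diagram can be sketched roughly as follows. This is a minimal illustration, not Conductor's actual code: `run_react_loop`, the tuple-based decision format, and the `tools` dictionary are all hypothetical names, and the model is passed in as a plain callable so the sketch runs without Ollama. The only detail taken from the diagram is the cap of 8 steps. Using the last observation as the "partial answer" is an assumption made to keep the example short.

```python
from typing import Callable, Dict, List, Tuple

MAX_STEPS = 8  # step cap shown in the diagram

# Hypothetical decision format: ("final", answer) or ("tool", tool_name, argument).
ModelFn = Callable[[List[str]], Tuple[str, ...]]
ToolFn = Callable[[str], str]


def run_react_loop(prompt: str, model: ModelFn, tools: Dict[str, ToolFn],
                   max_steps: int = MAX_STEPS) -> str:
    """Run Think -> Act -> Observe until the model answers or the cap is hit."""
    transcript = [f"user: {prompt}"]
    last_observation = ""
    for _ in range(max_steps):
        decision = model(transcript)              # Think: ask the model what to do
        if decision[0] == "final":
            return decision[1]                    # model produced a final answer
        _, name, arg = decision                   # Act: dispatch to the named tool
        tool = tools.get(name)
        if tool is None:
            observation = f"error: unknown tool {name!r}"
        else:
            try:
                observation = tool(arg)
            except Exception as exc:              # report tool failures to the model
                observation = f"error: {exc}"
        transcript.append(f"observation[{name}]: {observation}")  # Observe
        last_observation = observation
    # Cap reached: return what we have instead of looping further.
    return f"[step limit reached] best partial answer: {last_observation}"
```

In a real harness the `model` callable would wrap a request to the local Qwen3 model and parse its reply into a decision. Tool errors are fed back as observations so the model can try a different action on the next step.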
Key Engineering Challenges
- Two-model architecture rejected: Benchmarked a dedicated dispatcher model (qwen2.5-coder:7b) for tool routing against having Qwen3:9b route tool calls directly. Qwen3 outperformed the smaller model on coding tasks within the 16 GB memory constraint, so the second model's added complexity was unnecessary.
- Workspace sandboxing: All file and shell tools operate within a scoped workspace directory. Paths are validated before execution to prevent traversal outside the sandbox, following OWASP guidelines for tool safety.
- ReAct loop cap: The loop is capped at 8 reasoning steps to prevent runaway inference chains, with graceful degradation that returns the best available partial answer if the cap is hit.
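The path validation described under workspace sandboxing might look something like the sketch below. It assumes a hypothetical helper, `resolve_in_workspace`, that every file tool calls before touching disk; the function name, the `SandboxError` type, and the exact check are illustrative and not taken from Conductor. The idea is to resolve the requested path (including `..` segments and symlinks) and reject it unless the result is still inside the workspace root.

```python
from pathlib import Path


class SandboxError(ValueError):
    """Raised when a requested path would escape the workspace."""


def resolve_in_workspace(workspace: Path, user_path: str) -> Path:
    """Return an absolute path inside `workspace`, or raise SandboxError."""
    root = workspace.resolve()
    # Joining with an absolute user_path discards root, so absolute paths
    # outside the workspace fail the containment check below.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise SandboxError(f"path escapes workspace: {user_path}")
    return candidate
```

A file tool would call `resolve_in_workspace(workspace, requested_path)` and operate only on the returned path. This check covers path arguments only. Shell commands can reference arbitrary paths in their own arguments, so they would need separate handling, such as running with the workspace as the working directory and restricting the allowed commands.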