Conductor: Local Inference Harness

Personal Project / AI Infrastructure

Conductor: Local Agentic Inference Harness

Python · Ollama · Qwen3 · ReAct · Rich TUI

The Mission

Conductor is a fully local agentic inference harness built to run on a 2021 MacBook Pro. It gives a local LLM a ReAct loop, persistent memory, a sandboxed workspace, and a set of tools (file operations, URL fetching, shell commands), with no external API calls and no data leaving the machine.

System Architecture

Key Engineering Challenges

Two-model architecture rejected: Benchmarked a separate dispatcher model (qwen2.5-coder:7b) for tool routing against letting Qwen3:9b route tools directly. Qwen3 outperformed the smaller model on coding tasks while fitting within the 16GB memory constraint, so the added complexity of a second model was unnecessary.
Workspace sandboxing: All file and shell tools operate within a scoped workspace directory. Paths are validated before execution to prevent traversal outside the sandbox, following OWASP guidelines for tool safety.
ReAct loop cap: The loop is capped at 8 reasoning steps to prevent runaway inference chains, with graceful degradation that returns the best available partial answer if the cap is hit.
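
The workspace sandboxing described above can be sketched as follows. This is a minimal illustration, not Conductor's actual code: the names `resolve_in_workspace` and `SandboxViolation` are hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`. The idea is to resolve the requested path (following `..` segments and symlinks) and reject it unless the result still sits inside the workspace root.

```python
from pathlib import Path


class SandboxViolation(Exception):
    """Raised when a requested path resolves outside the workspace (hypothetical name)."""


def resolve_in_workspace(workspace: Path, user_path: str) -> Path:
    """Return the absolute path for user_path, or raise if it escapes workspace."""
    root = workspace.resolve()
    # Joining with an absolute user_path discards root, so the containment
    # check below also rejects inputs such as "/etc/passwd".
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise SandboxViolation(f"{user_path!r} escapes the workspace")
    return candidate
```

A tool would call `resolve_in_workspace` on every path argument before touching the filesystem, so a request such as `../secrets.txt` fails before any read or write happens.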
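
The capped loop with graceful degradation might look roughly like the sketch below. It is an assumption-laden illustration rather than the project's implementation: `run_react` and `step_fn` are hypothetical names, the model call is abstracted behind `step_fn`, and "best available partial answer" is approximated here as the most recent reasoning note.

```python
from typing import Callable, Optional

MAX_STEPS = 8  # the cap stated in the text


def run_react(
    step_fn: Callable[[list[str]], tuple[str, Optional[str]]],
    max_steps: int = MAX_STEPS,
) -> dict:
    """Run up to max_steps reasoning steps.

    step_fn (hypothetical) receives the history so far and returns
    (note, final_answer), where final_answer is None until the model is done.
    """
    history: list[str] = []
    for step in range(1, max_steps + 1):
        note, final = step_fn(history)
        history.append(note)
        if final is not None:
            return {"answer": final, "steps": step, "complete": True}
    # Cap reached: degrade gracefully by returning the latest note
    # (an assumed stand-in for the best partial answer).
    partial = history[-1] if history else ""
    return {"answer": partial, "steps": max_steps, "complete": False}
```

Returning a `complete` flag alongside the answer lets the caller (for example, the TUI) mark a capped result as partial instead of presenting it as final.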