An intelligent local LLM routing system that automatically selects the optimal model for each task. Built on Mac Studio M3 Ultra with 512GB unified memory.

The Stack

Hardware

Mac Studio M3 Ultra with 512GB unified memory

Software

Models

Intelligent Routing

The router classifies each request and selects the optimal model: simple queries go to fast, lightweight models, while complex reasoning goes to the 70B model.
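As a rough sketch of that routing step, the classifier maps each request to a task type, and the task type maps to a model tier. The heuristics, task names, and model names below are illustrative placeholders, not the project's actual configuration:

```python
def classify(prompt: str) -> str:
    """Toy heuristic classifier (illustrative only)."""
    lowered = prompt.lower()
    if any(k in lowered for k in ("prove", "analyze", "step by step")):
        return "reasoning"
    if len(prompt) > 500:
        return "long_form"
    return "simple"

# Hypothetical model names for each tier.
MODEL_BY_TASK = {
    "simple": "llama-3.2-3b",      # fast, lightweight
    "long_form": "qwen2.5-14b",    # mid-size
    "reasoning": "llama-3.3-70b",  # large model for complex reasoning
}

def route(prompt: str) -> str:
    """Pick a model for a request based on its classified task type."""
    return MODEL_BY_TASK[classify(prompt)]
```

In practice the classifier can itself be a small, fast model; the key design point is that classification cost stays negligible next to the cost difference between the lightweight and 70B tiers.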

Model selection by task type:

Performance Results

Benchmark results:

Integration Status

The local brain works with any OpenAI-compatible tool:

Note: OpenClaw integration is not currently working because the framework does not yet support custom provider endpoints for local models. We're hoping for a fix soon.
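For tools that do support custom endpoints, wiring one up is usually just a base-URL override. The sketch below builds the standard OpenAI-style chat request with the stdlib; the port, path, and `"auto"` model name are hypothetical placeholders, not the project's actual values:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # hypothetical local router endpoint

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Construct an OpenAI-compatible chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer not-needed",  # local servers typically ignore the key
        },
    )

# Sending it is the same as any OpenAI API call:
# with urllib.request.urlopen(build_chat_request("auto", "Hello")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Any client that speaks this request shape — the official OpenAI SDKs included — can target the local server the same way, by overriding its base URL.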