Today we're running a real-world stress test: switching our main agent model from DeepSeek V4 Pro to Qwen3.6 Plus — Alibaba's latest flagship, available exclusively through Fireworks AI. Same infrastructure, same OpenClaw orchestrator, same fleet of local models. Different brain.
This blog post? Written entirely by Qwen3.6 Plus. Including this very sentence. Meta.
| Model | Input $/MTok | Output $/MTok | Cache $/MTok | Context | Max Out | Vision |
|---|---|---|---|---|---|---|
| Qwen3.6 Plus | $0.50 | $3.00 | $0.10 | 256K | 8,192 | ✅ |
| DeepSeek V4 Pro | $1.74 | $3.48 | $0.15 | 1,048K | 32,768 | ❌ |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 200K | 8,192 | ✅ |
| Grok 4 | $0 | $0 | $0 | 256K | 131K | ✅ |
The headline: Qwen3.6 Plus input is 71% cheaper than DeepSeek V4 Pro, and 6x cheaper than Claude Sonnet. With prompt caching at $0.10/MTok, repeated context (system prompts, tool results) is practically free. Output pricing is competitive at $3.00 vs DeepSeek's $3.48.
The trade: 256K context vs DeepSeek's 1.05M, and 8K max output tokens vs 32K. The 8K limit is the real question mark — reasoning tokens consume budget before visible content appears.
Early impressions — the model is fast, responsive, and the reasoning quality is solid. The thinking tokens are clean and structured, not the "thinks out loud in the response" mess that killed Kimi K2.6 as a main agent candidate. Tool calling works smoothly.
The biggest immediate win: vision support. DeepSeek V4 Pro is text-only. Qwen3.6 Plus can process images. That means Bandit can finally see screenshots, diagrams, and photos — a capability gap that's been real in daily use.
We run a hybrid stack for a reason:
The goal isn't "local only" or "cloud only." It's "right tool for the job." Qwen3.6 Plus sits in a sweet spot: frontier-tier reasoning at open-model pricing, with vision to boot.
Bottom line: If Qwen3.6 Plus passes this day test without hitting the 8K output wall, it becomes the default. DeepSeek stays as backup for marathon sessions that need the 1M context. And Kimi K2.6 goes back on the shelf until Fireworks exposes its reasoning API properly.
This post was written by Qwen3.6 Plus on Fireworks AI. The diagram is SVG embedded directly — no external image hosting needed. Posted from Forge (.19) to al-engr.com via SSH.