The Cisco Webex Desk Mini: Perfect Companion Hardware for OpenClaw
April 3, 2026
A used enterprise device, typically $150-300 on eBay, gives your AI agent a voice, ears, and a 15.6" face, with full local API control and no cloud dependency on the device itself.
I've been looking for a single device to give my AI assistant Milo a permanent physical presence on my desk. Not a smart speaker. Not a webcam. Something all-in-one that can hear me, speak back, show a face, and do it all through a clean local API — no cloud dependency, no USB switching headaches, just always on.
I found it in an unexpected place: the Cisco Webex Desk Mini (TTC9-01). An enterprise video conferencing unit. About $150-300 used on eBay. And it's kind of perfect.
The Hardware
The Desk Mini is a compact all-in-one built for enterprise video conferencing, which means Cisco over-engineered every component:
- Display: 15.6" 1080p touchscreen. Big enough to show a real avatar face at desk level.
- Microphone: 3-element beamforming array with AI noise removal. Picks up voice clearly across a room — this is not a laptop mic.
- Speaker: Full-range setup with an 18mm tweeter and dual 80mm woofers. 92dB SPL. It's genuinely loud and clear — Milo sounds like he's actually in the room.
- Camera: 8MP with auto-framing and face detection.
- Connectivity: Gigabit Ethernet, WiFi, USB-C. Runs standalone on any network — no Webex subscription required.
The firmware is RoomOS. It boots, connects to your network, and works. No Cisco cloud account needed for basic local operation.
xAPI: The Reason to Buy This
Every consumer alternative I looked at — Echo Show, Nest Hub, smart displays — is a cloud black box. You can't programmatically push audio to the speaker. You can't subscribe to mic events. You can't push an arbitrary web page to the display. They're designed for their ecosystem, not yours.
The Desk Mini has xAPI: a fully documented local HTTP and WebSocket control plane that runs directly on the device over LAN. No cloud. No account. Just HTTP calls to an IP address.
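To make "just HTTP calls to an IP address" concrete, here is a minimal Python sketch that builds (but does not send) an XML command request of the kind RoomOS accepts on its /putxml endpoint. Treat the details as assumptions to check against the xAPI reference for your RoomOS version: HTTPS mode must be enabled on the device, Basic auth with a local user is assumed, and the host, credentials, and avatar URL are placeholders. The helper names are mine, not part of any Cisco library.

```python
import base64
import xml.etree.ElementTree as ET


def build_xcommand_xml(path, params):
    """Nest a dotted xAPI command path and its parameters under <Command>."""
    root = ET.Element("Command")
    node = root
    for part in path.split("."):
        node = ET.SubElement(node, part)
    # Each parameter becomes a child element of the innermost command node.
    for name, value in params.items():
        ET.SubElement(node, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def build_putxml_request(host, username, password, path, params):
    """Return the URL, headers, and body for a POST to the device (assumed /putxml)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {
        "url": f"https://{host}/putxml",
        "headers": {
            "Authorization": f"Basic {token}",
            "Content-Type": "text/xml",
        },
        "body": build_xcommand_xml(path, params),
    }


if __name__ == "__main__":
    # Placeholder host, credentials, and page URL; substitute your own.
    request = build_putxml_request(
        "192.0.2.10", "admin", "secret",
        "UserInterface.WebView.Display",
        {"Url": "https://example.com/avatar"},
    )
    print(request["url"])
    print(request["body"])
```

The returned dict can be handed to whatever HTTP client you prefer. Keeping the request construction separate from the network call makes it easy to inspect the XML before pointing it at real hardware.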
What you can actually do with it:
- Display: Push any URL to the screen with a single API call, UserInterface.WebView.Display. Milo's avatar, a dashboard, a live camera feed, anything.
- Microphone: Subscribe to audio activity events. Get notified the moment someone speaks. Pipe the audio stream to your STT endpoint.
- Speaker: Play audio via SIP or direct xAPI commands. TTS response from your agent, straight to those 80mm woofers.
- On-device macros: Write JavaScript that runs on the Desk Mini itself, subscribes to mic events, and POSTs directly to your OpenClaw gateway. No relay server. No extra hardware. The device becomes a first-class agent endpoint.
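The WebSocket side of xAPI speaks JSON-RPC 2.0, so the display and subscription items in the list above reduce to small JSON messages. The sketch below only builds those messages; it assumes the xCommand/... and xFeedback/Subscribe method names from Cisco's xAPI-over-WebSocket documentation, and the helper names are hypothetical. The status path used in the example (Status, Audio, Volume) is a stand-in: the right path for microphone activity depends on your RoomOS version and should be looked up in the xAPI reference.

```python
import itertools
import json

# JSON-RPC requests need unique ids so responses can be matched to requests.
_request_ids = itertools.count(1)


def xcommand_message(path, **params):
    """Build a JSON-RPC request for a dotted xAPI command path."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "xCommand/" + path.replace(".", "/"),
        "params": params,
    }


def subscribe_message(query, notify_current=True):
    """Build a feedback subscription for a status or event path."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "xFeedback/Subscribe",
        "params": {"Query": list(query), "NotifyCurrentValue": notify_current},
    }


if __name__ == "__main__":
    # Placeholder page URL; the avatar page itself is up to you.
    show_avatar = xcommand_message(
        "UserInterface.WebView.Display", Url="https://example.com/avatar"
    )
    # Stand-in path; swap in the mic-activity path for your firmware.
    watch_audio = subscribe_message(["Status", "Audio", "Volume"])
    print(json.dumps(show_avatar))
    print(json.dumps(watch_audio))
```

Each dict would be serialized with json.dumps and sent as a text frame over an authenticated WebSocket connection to the device. An on-device macro would do the equivalent in JavaScript without the external connection.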
The full loop I'm building:
You speak → Desk Mini mic detects activity
→ on-device JS macro fires
→ POST to OpenClaw gateway (over Tailscale)
→ Parakeet STT transcribes
→ Milo responds
→ ElevenLabs TTS generates audio
→ Desk Mini speaker plays it back
→ Milo's avatar updates on the 15.6" display
Device control is all local, all on your network; the one hosted step in this loop is ElevenLabs TTS. The device just sits on your desk and your agent lives in it.
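The loop above can be sketched as a small orchestrator with each stage injected as a plain function. This is only an outline of the control flow under my own assumptions: the class and field names are hypothetical, and the real stages (Parakeet STT, the OpenClaw gateway, ElevenLabs TTS, the xAPI playback and display calls) are replaced by stubs in the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceLoop:
    """One pass of the speak-and-respond loop, with each stage pluggable."""
    transcribe: Callable[[bytes], str]   # STT stage
    respond: Callable[[str], str]        # agent stage
    synthesize: Callable[[str], bytes]   # TTS stage
    play: Callable[[bytes], None]        # speaker output
    show: Callable[[str], None]          # display update

    def handle_utterance(self, audio: bytes) -> str:
        text = self.transcribe(audio)
        if not text.strip():
            # Nothing intelligible was heard; skip the rest of the loop.
            return ""
        reply = self.respond(text)
        self.play(self.synthesize(reply))
        self.show(reply)
        return reply


if __name__ == "__main__":
    # Stub stages so the flow can be exercised without any hardware.
    loop = VoiceLoop(
        transcribe=lambda audio: audio.decode(),
        respond=lambda text: f"You said: {text}",
        synthesize=lambda reply: reply.encode(),
        play=lambda audio: print(f"[speaker] {len(audio)} bytes"),
        show=lambda reply: print(f"[display] {reply}"),
    )
    loop.handle_utterance(b"hello Milo")
```

Injecting the stages keeps the loop testable on a laptop and makes it straightforward to swap one service for another later, for example replacing the hosted TTS step with a local engine.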
What It Costs
New, these are $800-$1,200. But enterprises refresh their hardware constantly, and used units flood eBay from corporate liquidations. Search "Cisco Webex Desk Mini" or "TTC9-01" — filter for "Tested and Working" from sellers with solid feedback. I paid $209 total (item + shipping) from a recycler with 6,700+ ratings. You can find them for $150-300 regularly.
For what you're getting — a beamforming mic array, big clear speakers, a 15.6" touchscreen, and a fully documented local API — it's an absurd deal.
For the OpenClaw Community
If you're running OpenClaw and want to give your agent a physical presence in the room, this is the hardware I'd point you at. Most setups cobble together a USB mic, a Bluetooth speaker, and a tablet — three devices with three power cables and no unified API. The Desk Mini is one device, one power cable, one IP address, full programmatic control.
I'll publish a follow-up with the full xAPI integration guide once mine arrives. Stay tuned.
— Milo & James
al-engr.com