Talk to your agents.
Naturally.
Real-time bidirectional voice for your OpenClaw agents. Open it, talk. Your agent hears, thinks, speaks back — your way.
No spam. Early access to VoiceOverClaw when we launch.
Voice-first. Agent-native.
Everything you need to have a real conversation with your AI agents.
Real-Time Voice
Sub-second ASR + LLM + TTS pipeline. Speak naturally — no button-holding, no delays, no robotic pauses.
Bidirectional
Your agent listens while you talk and can interrupt when relevant. True conversational turn-taking, not just dictation.
OpenClaw Native
Seamlessly integrated with OpenClaw agents. No API glue needed — your existing agents gain a voice instantly.
Flexible Deployment
Run your own ASR/TTS endpoints or connect to cloud APIs. You control the data flow.
Open it. Talk.
Four steps from silence to conversation.
Listen
Microphone input streamed in real time to your chosen ASR model. Deepgram Nova-3 by default.
Think
Transcribed text hits your OpenClaw agent. Full memory context, tools, reasoning — your agent, unchanged.
Speak
Agent response streamed to TTS as it is generated. Cartesia Sonic by default — sub-100ms first word.
Remember
Full conversation logged natively in OpenClaw dialog history. Searchable, exportable, Engram79-compatible.
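The four steps above can be sketched as a single loop. This is an illustrative stand-in, not the VoiceOverClaw API: every function here is a hypothetical stub showing how the stages hand off to each other, with the real ASR, agent, and TTS swapped out for fakes.

```python
# Illustrative sketch of the Listen -> Think -> Speak -> Remember loop.
# All names here are hypothetical stand-ins, not the VoiceOverClaw API.
from dataclasses import dataclass, field

@dataclass
class Dialog:
    """Stands in for OpenClaw's native dialog history."""
    turns: list = field(default_factory=list)

def listen(audio_chunks):
    """Stub ASR: pretend each audio chunk transcribes to one word."""
    return " ".join(audio_chunks)

def think(transcript, dialog):
    """Stub agent: canned reply; a real agent would use memory and tools."""
    dialog.turns.append(("user", transcript))
    return f"You said: {transcript}"

def speak(text):
    """Stub TTS: yield the reply word by word, as streaming TTS would."""
    yield from text.split()

def converse(audio_chunks, dialog):
    transcript = listen(audio_chunks)        # Listen
    reply = think(transcript, dialog)        # Think
    spoken = list(speak(reply))              # Speak (streamed)
    dialog.turns.append(("agent", reply))    # Remember
    return spoken

dialog = Dialog()
out = converse(["hello", "agent"], dialog)
# out -> ["You", "said:", "hello", "agent"]
```

The key property the real pipeline adds is that the stages overlap: TTS starts on the first tokens of the reply rather than waiting for `think` to finish.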
VoiceOverClaw
The voice layer your agents have been waiting for.
VoiceOverClaw adds a natural voice interface to any OpenClaw agent — without replacing or changing how your agents work. Your agent's memory, tools, and personality are fully preserved. You just speak instead of type.
- Voice interface (default) — open and talk
- Text chat interface (alternative) — switch any time
- Native conversation history in OpenClaw dialogs
- Chat history organized by agent and session
- Standard OpenAI-compatible API for ASR & TTS
- Pluggable voice model selection per agent
- Agent management — add, remove, configure agents
- Model endpoint configuration per agent
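"OpenAI-compatible" in the list above means the ASR and TTS endpoints accept requests shaped like OpenAI's `/v1/audio/transcriptions` and `/v1/audio/speech` routes, so any compliant provider or self-hosted server can be dropped in. A minimal sketch of those request shapes, with placeholder base URL and model names (the endpoint paths follow the public OpenAI audio API; nothing here is VoiceOverClaw-specific):

```python
# Build OpenAI-compatible audio requests without sending them.
# Base URL and model identifiers are placeholders, not defaults.

def tts_request(base_url, model, text, voice="default"):
    """Text-to-speech: JSON body against /v1/audio/speech."""
    return {
        "url": f"{base_url}/v1/audio/speech",
        "json": {"model": model, "input": text, "voice": voice},
    }

def asr_request(base_url, model, audio_path):
    """Speech-to-text: multipart form against /v1/audio/transcriptions."""
    return {
        "url": f"{base_url}/v1/audio/transcriptions",
        "data": {"model": model},
        "files": {"file": audio_path},
    }

req = tts_request("http://localhost:8080", "cartesia/sonic", "Hello there")
# req["url"] -> "http://localhost:8080/v1/audio/speech"
```

Because both shapes are standard, pointing an agent at a different provider is a URL and model-name change, not a code change.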
Drop in, start talking
No pipeline rewiring. VoiceOverClaw wraps your existing OpenClaw agent with a voice layer. Configure once, talk forever.
agent: my-assistant
voice:
  asr: deepgram/nova-3
  tts: cartesia/sonic
  interface: voice   # or: text
  history: openclaw-native
✓ Voice layer attached
✓ ASR: Deepgram Nova-3 (streaming)
✓ TTS: Cartesia Sonic (<100ms)
✓ History: OpenClaw dialogs
# Launch
$ openclaw voice start my-assistant
🎙️ Listening on localhost:8080
Your voice stack. Your choice.
Pick from best-in-class ASR and TTS models. Swap providers without changing your agent.
🎤 ASR — Speech to Text
🔊 TTS — Text to Speech
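Swapping a provider touches only the `voice` block of the config shown above; the agent itself is unchanged. A sketch, with illustrative model identifiers standing in for whatever providers you choose:

```yaml
# Swap ASR/TTS providers by editing the voice block only.
# Model identifiers below are illustrative, not an endorsed list.
agent: my-assistant
voice:
  asr: openai/whisper-large-v3   # was: deepgram/nova-3
  tts: elevenlabs/turbo-v2       # was: cartesia/sonic
```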
Built for OpenClaw
Deep native integration — not a bolt-on.
VoiceOverClaw is a first-class OpenClaw application. It speaks the same language as your agents — same memory system, same tool calls, same dialog history. Voice is just another channel.
Compatible with any agent framework that OpenClaw supports: LangGraph, CrewAI, custom agents. If it runs in OpenClaw, it can speak.
Voice conversations feed into the same memory context as text. Engram79-compatible.
Every voice session stored in OpenClaw dialogs — searchable, browsable, exportable.
Your agent can still use tools mid-conversation. Voice doesn't limit capabilities.
Start a conversation by voice, continue in text, pick it back up by voice. One thread.
Give your agents a voice
VoiceOverClaw is in development. Join the waitlist for early access and launch updates.
No spam. Unsubscribe anytime.