🎙️ VoiceOverClaw — Coming Soon

Talk to your agents.
Naturally.

Real-time bidirectional voice for your OpenClaw agents. Open it, talk. Your agent hears, thinks, speaks back — your way.


No spam. Early access to VoiceOverClaw when we launch.


Voice-first. Agent-native.

Everything you need to have a real conversation with your AI agents.

🎙️

Real-Time Voice

Sub-second ASR + LLM + TTS pipeline. Speak naturally — no button-holding, no delays, no robotic pauses.

🔄

Bidirectional

Your agent listens while you talk and can interrupt when relevant. True conversational turn-taking, not just dictation.

🐾

OpenClaw Native

Seamlessly integrated with OpenClaw agents. No API glue needed — your existing agents gain a voice instantly.

🔌

Flexible Deployment

Run your own ASR/TTS endpoints or connect to cloud APIs. You control the data flow.

Open it. Talk.

Four steps from silence to conversation.

1

Listen

Microphone input is streamed in real time to your chosen ASR model. Deepgram Nova-3 by default.

2

Think

Transcribed text hits your OpenClaw agent. Full memory context, tools, reasoning — your agent, unchanged.

3

Speak

Agent responses are streamed to TTS as they're generated. Cartesia Sonic by default, with sub-100ms time to first word.

4

Remember

Full conversation logged natively in OpenClaw dialog history. Searchable, exportable, Engram79-compatible.
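The four steps above can be sketched as a single turn loop. This is a minimal, hypothetical illustration, not VoiceOverClaw's actual API: `transcribe`, `agent_reply`, and `synthesize` are stand-ins for the real ASR, agent, and TTS integrations, and the `history` list stands in for OpenClaw dialog history.

```python
def transcribe(audio_chunk: bytes) -> str:
    """Stand-in for a streaming ASR call (e.g. Deepgram Nova-3).
    Here we pretend the audio bytes are already text."""
    return audio_chunk.decode("utf-8")

def agent_reply(text: str) -> str:
    """Stand-in for the OpenClaw agent: memory, tools, reasoning."""
    return f"You said: {text}"

def synthesize(text: str) -> bytes:
    """Stand-in for streaming TTS (e.g. Cartesia Sonic)."""
    return text.encode("utf-8")

def voice_turn(audio_chunk: bytes, history: list[dict]) -> bytes:
    transcript = transcribe(audio_chunk)                   # 1. Listen
    reply = agent_reply(transcript)                        # 2. Think
    audio_out = synthesize(reply)                          # 3. Speak
    history.append({"user": transcript, "agent": reply})   # 4. Remember
    return audio_out

history: list[dict] = []
audio_out = voice_turn(b"what's on my calendar?", history)
```

In a real deployment steps 1 and 3 run as streams so speech synthesis can begin before the agent finishes its reply; the loop structure is the same.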

VoiceOverClaw

The voice layer your agents have been waiting for.

VoiceOverClaw adds a natural voice interface to any OpenClaw agent — without replacing or changing how your agents work. Your agent's memory, tools, and personality are fully preserved. You just speak instead of type.

  • Voice interface (default) — open and talk
  • Text chat interface (alternative) — switch any time
  • Native conversation history in OpenClaw dialogs
  • Chat history organized by agent and session
  • Standard OpenAI-compatible API for ASR & TTS
  • Pluggable voice model selection per agent
  • Agent management — add, remove, configure agents
  • Model endpoint configuration per agent
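Because the ASR/TTS side speaks the standard OpenAI-compatible API, any client that can hit those endpoints should work. As a rough sketch, a TTS request targets `/v1/audio/speech` with a JSON body of `model`, `voice`, and `input` (the base URL and the model/voice names below are placeholders, not shipped defaults):

```python
import json

def build_speech_request(base_url: str, model: str, voice: str, text: str):
    """Build the URL and JSON body for an OpenAI-compatible
    /v1/audio/speech call. base_url is wherever you point the
    agent: your own server or a cloud provider."""
    url = f"{base_url.rstrip('/')}/v1/audio/speech"
    body = {"model": model, "voice": voice, "input": text}
    return url, json.dumps(body)

# Hypothetical local endpoint and model names, for illustration only.
url, body = build_speech_request("http://localhost:9000", "sonic", "default", "Hello!")
```

POSTing that body to the returned URL yields audio bytes; swapping providers means changing only `base_url` and the model name.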

Drop in, start talking

No pipeline rewiring. VoiceOverClaw wraps your existing OpenClaw agent with a voice layer. Configure once, talk forever.

# Add voice to any OpenClaw agent

agent: my-assistant
voice:
  asr: deepgram/nova-3
  tts: cartesia/sonic
  interface: voice # or: text
  history: openclaw-native

✓ Voice layer attached
✓ ASR: Deepgram Nova-3 (streaming)
✓ TTS: Cartesia Sonic (<100ms)
✓ History: OpenClaw dialogs

# Launch
$ openclaw voice start my-assistant
🎙️ Listening on localhost:8080

Your voice stack. Your choice.

Pick from best-in-class ASR and TTS models. Swap providers without changing your agent.

🎤 ASR — Speech to Text

  • Deepgram Nova-3 (default)
  • OpenAI Whisper (alternative)
  • Groq Whisper (fast & cheap)
  • Custom endpoint (BYO)

🔊 TTS — Text to Speech

  • Cartesia Sonic (default)
  • ElevenLabs (best quality)
  • OpenAI TTS (alternative)
  • Custom endpoint (BYO)
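Swapping providers follows the config format shown earlier: change the `asr` or `tts` line and keep the agent untouched. A hypothetical sketch (the `groq/whisper` and `elevenlabs/default` identifiers here follow the provider/model pattern from the example above but are illustrative, not confirmed model IDs):

```yaml
# Same agent, different voice stack — only the voice block changes.
agent: my-assistant
voice:
  asr: groq/whisper        # was: deepgram/nova-3
  tts: elevenlabs/default  # illustrative id — check your provider's model names
  interface: voice
  history: openclaw-native
```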

Built for OpenClaw

Deep native integration — not a bolt-on.

VoiceOverClaw is a first-class OpenClaw application. It speaks the same language as your agents — same memory system, same tool calls, same dialog history. Voice is just another channel.

Compatible with any agent framework that OpenClaw supports: LangGraph, CrewAI, custom agents. If it runs in OpenClaw, it can speak.

🧠
Full agent memory

Voice conversations feed into the same memory context as text. Engram79-compatible.

📜
Native dialog history

Every voice session stored in OpenClaw dialogs — searchable, browsable, exportable.

🔧
Tool calls over voice

Your agent can still use tools mid-conversation. Voice doesn't limit capabilities.

🔀
Switch channels seamlessly

Start a conversation by voice, continue in text, pick it back up by voice. One thread.

Give your agents a voice

VoiceOverClaw is in development. Join the waitlist for early access and launch updates.

No spam. Unsubscribe anytime.
