Question 1

What is the actual end-to-end latency?

Accepted Answer

Time-to-first-token sits at p50 612ms and p99 1.1s under typical conditions. Round-trip from end-of-speech to first audio is roughly 980ms.

Question 2

Which LLMs does it support?

Accepted Answer

OpenAI, Anthropic, Google, Llama, Mistral, and any private fine-tune you can serve over an OpenAI-compatible endpoint.

Question 3

Can the agent call my APIs mid-conversation?

Accepted Answer

Yes. Define functions as JSON schema or expose any MCP server. Webhooks fire on call.started, turn.completed, and function.invoked.

Question 4

Is there a self-hosted option?

Accepted Answer

Yes. We ship a Helm chart for Kubernetes. Customers in regulated industries run the entire stack inside their VPC.

	OneInbox	Others
Sub-700ms TTFT
Streaming and interruption handling
Bring your own LLM and voice
Function calling at the protocol level
Self-host option

Voice AI agentsthat actually sound human.

AI purpose-built for your industry’s toughest challenges.