Voice support that grounds every answer in the customer's actual account.
Real-time voice agent for banking and financial services. Customer profile loads from your Postgres at session start. PII redacted before TTS. No general 'ask the LLM' path.
No black box. Each step is a typed-frame node you can edit, monitor, and replace.
Caller speaks. Audio streams in over WebSocket.
STT transcribes. The agent reads the customer's profile and recent transactions from Postgres.
The LLM answers grounded in those rows. Nothing else.
PII regex matches are redacted from the response text. Then TTS audio streams back to the caller.
Streaming graph contract, observability, and cost discipline come standard. The agent ships with a full test suite that runs in CI on every node version bump.
Customer profile and recent transactions are pulled from your Postgres at session start. The LLM is constrained to that context. No hallucinated balances, no invented merchants.
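A minimal sketch of how the loaded rows could become the only context the LLM sees. The field names (`name`, `balance_cents`), the `Transaction` shape, and the instruction wording are illustrative assumptions, not the template's actual schema or prompt.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Transaction:
    # Hypothetical row shape; real column names depend on your Postgres schema.
    posted_on: str      # ISO date, e.g. "2024-05-01"
    merchant: str
    amount_cents: int   # negative values are debits


def build_grounded_context(profile: dict, transactions: Sequence[Transaction]) -> str:
    """Render rows loaded at session start into a single context string."""
    lines = [
        "Answer using ONLY the account data below.",
        "If the answer is not in this data, say you cannot help with that.",
        "",
        f"Customer: {profile['name']}",
        f"Balance: ${profile['balance_cents'] / 100:.2f}",
        "Recent transactions:",
    ]
    for tx in transactions:
        lines.append(f"- {tx.posted_on} {tx.merchant} ${tx.amount_cents / 100:.2f}")
    return "\n".join(lines)
```

Because the context is rebuilt from rows on each session, the model has no other source for a balance or a merchant name.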
WebSocket in, WebSocket out. STT, LLM, and TTS run concurrently behind a typed-frame contract. Audio chunks ship as they're produced - no full-utterance waits.
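The concurrency claim can be sketched with plain `asyncio` queues. The three lambdas below are stand-ins for STT, LLM, and TTS, and the queue size and sentinel are assumptions made for illustration. The point is that each stage forwards a frame as soon as it has one.

```python
import asyncio

_DONE = object()  # sentinel marking end of stream


async def stage(transform, inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Forward each frame as soon as it is transformed, without buffering the utterance."""
    while True:
        frame = await inbox.get()
        if frame is _DONE:
            await outbox.put(_DONE)
            return
        await outbox.put(transform(frame))


async def run_pipeline(audio_chunks: list[str]) -> list[str]:
    # Bounded queues give back-pressure between stages.
    mic, text, reply, speaker = (asyncio.Queue(maxsize=4) for _ in range(4))

    async def feed() -> None:
        for chunk in audio_chunks:
            await mic.put(chunk)
        await mic.put(_DONE)

    # Stand-ins for the real nodes; all three run concurrently with the feeder.
    tasks = [
        asyncio.create_task(feed()),
        asyncio.create_task(stage(lambda f: f"stt({f})", mic, text)),
        asyncio.create_task(stage(lambda f: f"llm({f})", text, reply)),
        asyncio.create_task(stage(lambda f: f"tts({f})", reply, speaker)),
    ]

    out = []
    while (frame := await speaker.get()) is not _DONE:
        out.append(frame)
    await asyncio.gather(*tasks)
    return out
```

Running `asyncio.run(run_pipeline(["a", "b"]))` pushes each chunk through all three stages in order while later chunks are still being fed in.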
Outbound responses pass through streaming-safety-filter before synthesis. SSN, email, US phone, account numbers - redacted, blocked, or dropped per your policy.
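A rough sketch of the redact mode, assuming simple regular expressions for the four categories named above. The patterns and labels are illustrative only; they are not the filter's shipped rules, and the block and drop modes are not shown.

```python
import re

# Illustrative patterns only; real rules need tuning against your own data.
# Order matters: narrower formats run before the generic digit-run pattern.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"(?<!\d)(?:\+1[ .-]?)?\(?\d{3}\)?[ .-]\d{3}[ .-]\d{4}(?!\d)"),
    "ACCOUNT": re.compile(r"\b\d{8,17}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a bracketed label before the text reaches TTS."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Text with no matches passes through unchanged, so the filter can sit inline on every outbound frame.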
Conversation history stays under the model's context window via streaming-summariser-rolling. Long calls don't degrade. Cost per turn stays bounded.
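One way a rolling summary can keep history bounded, sketched under assumptions: tokens are approximated by whitespace splitting, and `summarise` is a hypothetical callable standing in for an LLM call. This is not the internals of streaming-summariser-rolling.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Crude whitespace proxy; a real node would use the model's tokenizer.
    return len(text.split())


def roll_history(
    summary: str,
    turns: list[str],
    budget: int,
    summarise: Callable[[str, list[str]], str],
    keep_recent: int = 2,
) -> tuple[str, list[str]]:
    """Fold older turns into the summary once history exceeds the budget."""
    total = count_tokens(summary) + sum(count_tokens(t) for t in turns)
    if total <= budget or len(turns) <= keep_recent:
        return summary, turns
    cut = len(turns) - keep_recent
    old, recent = turns[:cut], turns[cut:]
    # Only the most recent turns stay verbatim; the rest are compressed.
    return summarise(summary, old), recent
```

Calling this after every turn means the prompt carries one summary plus a few recent turns, which is what keeps per-turn cost from growing with call length.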
Every turn writes spans to workflow_spans - node latency, frame bytes, queue depth, watermark. The canvas overlay shows what each node did, when, and why.
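A sketch of what one span record might carry. The field names mirror the four metrics listed above, but the actual workflow_spans columns are not documented here, so treat the `Span` shape, the list-based sink, and the reading of watermark as a stream-time position in seconds as assumptions.

```python
import time
from contextlib import contextmanager
from dataclasses import asdict, dataclass


@dataclass
class Span:
    # Hypothetical record shape; the real workflow_spans schema may differ.
    node: str
    latency_ms: float = 0.0
    frame_bytes: int = 0
    queue_depth: int = 0
    watermark: float = 0.0


@contextmanager
def record_span(sink: list, node: str, frame: bytes, queue_depth: int, watermark: float):
    """Time one node invocation and append the finished span to the sink."""
    span = Span(node=node, frame_bytes=len(frame),
                queue_depth=queue_depth, watermark=watermark)
    start = time.perf_counter()
    try:
        yield span
    finally:
        # Recorded even if the node raises, so failures still show up.
        span.latency_ms = (time.perf_counter() - start) * 1000
        sink.append(asdict(span))
```

Wrapping a node call in `with record_span(...)` yields one row per frame, which is the granularity an overlay needs to show what each node did and when.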
≈$0.02 per turn at the default models (whisper-1 + gpt-4o + tts-1). Change any of the three in the install wizard without touching the graph.
Every dependency is a registered Strongly service or a model you control. Swap any one of them in the install wizard. The graph stays intact.
The marketplace template is the graph. Every customisation below is a config change or a single-node addition - never a rewrite.
Switch the llm node's model to gpt-4o-mini or a self-hosted llama-3.1-70b. The system prompt is short, so a smaller model is usually enough.
Edit the prompt template. Common additions: scope ('checking and savings only'), tone, refusal rules ('never quote rates').
Replace streaming-db-reader with a join across multiple tables, or chain it with streaming-vector-retrieval. Variables flow into the prompt as {{name}} substitutions.
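The `{{name}}` substitution can be sketched in a few lines. The `render` helper and its strict handling of missing variables are assumptions for illustration, not the prompt node's actual behaviour.

```python
import re

_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Fill {{name}} placeholders; fail loudly if a variable is missing."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"prompt variable not provided: {name}")
        return str(variables[name])

    return _VAR.sub(replace, template)
```

Failing on a missing variable, rather than leaving the placeholder in place, avoids sending a half-filled prompt to the model when a reader node returns fewer columns than expected.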
Insert streaming-confidence-router between stt and prompt. Branch on detected language. Send non-English speech down a parallel pipeline.
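A sketch of the branching decision, assuming the STT node emits a detected language code and a confidence score. The `Transcript` fields, the branch names, and the 0.8 threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    # Hypothetical frame shape emitted by the STT node.
    text: str
    language: str      # e.g. "en", "es"
    confidence: float  # 0.0 to 1.0


def route(t: Transcript, min_confidence: float = 0.8) -> str:
    """Pick a downstream branch from detected language and confidence."""
    if t.confidence < min_confidence:
        return "reprompt"  # ask the caller to repeat rather than guess
    return "english" if t.language == "en" else "non_english"
```

Each returned branch name would map to an outgoing edge on the canvas, with the non-English edge feeding the parallel pipeline.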
Swap websocket-response for telephony-response (SIP) or webrtc-response. Or chain into streaming-aggregator → kafka-producer for the data lake.
We don't leave until it runs. Talk to a forward-deployed engineer about deploying Voice Banking Assistant into your environment with your STT, your LLM, your TTS, your data.
Schedule a Demo