Strongly Certified · Streaming Workflow

Voice Banking Assistant

Voice support that grounds every answer in the customer's actual account.

Real-time voice agent for banking and financial services. Customer profile loads from your Postgres at session start. PII redacted before TTS. No general 'ask the LLM' path.


≤1.5s

First audio response (p95)

≤4s

Turn complete (p95)

$0.02

Per-turn cost (default models)

What it does

The voice loop, end-to-end.

No black box. Each step is a typed-frame node you can edit, monitor, and replace.

Caller speaks. Audio streams in over WebSocket.

STT transcribes. The agent reads the customer's profile and recent transactions from Postgres.

The LLM answers grounded in those rows. Nothing else.

Regex-matched PII is redacted from the response text before synthesis. TTS audio then streams back over WebSocket.
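The four steps above can be sketched as a chain of concurrent stages joined by queues. This is a minimal illustration, not the product's node API: the `stt`, `llm`, and `tts` functions are hypothetical stand-ins that only transform strings, and `None` as an end-of-stream marker is an assumption made for the example.

```python
import asyncio

# Hypothetical stand-ins for the STT, LLM, and TTS nodes. Real nodes call
# models; these only transform strings so the frame flow is visible.
async def stt(audio_chunk: bytes) -> str:
    return audio_chunk.decode("utf-8")

async def llm(transcript: str, profile: dict) -> str:
    return f"{profile['name']}, your balance is {profile['balance']}."

async def tts(text: str) -> bytes:
    return text.encode("utf-8")

async def stage(fn, inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Run one node: pull a frame, process it, push the result downstream."""
    while True:
        frame = await inbox.get()
        if frame is None:  # assumed end-of-stream marker
            await outbox.put(None)
            return
        await outbox.put(await fn(frame))

async def run_turn(audio_chunks: list[bytes], profile: dict) -> list[bytes]:
    q_audio, q_text, q_reply, q_out = (asyncio.Queue() for _ in range(4))

    async def llm_with_profile(transcript: str) -> str:
        return await llm(transcript, profile)

    # The three stages run as concurrent tasks; each forwards frames as
    # soon as it has produced them.
    tasks = [
        asyncio.create_task(stage(stt, q_audio, q_text)),
        asyncio.create_task(stage(llm_with_profile, q_text, q_reply)),
        asyncio.create_task(stage(tts, q_reply, q_out)),
    ]
    for chunk in audio_chunks:
        await q_audio.put(chunk)
    await q_audio.put(None)

    out = []
    while (frame := await q_out.get()) is not None:
        out.append(frame)
    await asyncio.gather(*tasks)
    return out
```

Because every stage owns its own queue, a slow node delays only the frames behind it, and any stage can be replaced without touching the others.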

Capabilities

Built for production. Day Two-ready.

Streaming graph contract, observability, and cost discipline come standard. The agent ships with a full test suite that runs in CI on every node version bump.

Grounded answers

Customer profile and recent transactions are pulled from your Postgres at session start. The LLM is constrained to that context. No hallucinated balances, no invented merchants.

Postgres addon · Per-session load · Strict context
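One way to picture the strict context: the rows loaded at session start are written into the prompt, together with an instruction to answer from nothing else. The sketch below is an assumption about shape only. The field names (`name`, `balance`, `date`, `merchant`, `amount`) and the instruction wording are hypothetical, not the template the agent ships with.

```python
def build_grounded_prompt(profile: dict, transactions: list[dict]) -> str:
    """Assemble a system prompt that contains only the customer's own rows."""
    lines = [
        "Answer using ONLY the account data below.",
        "If the answer is not in the data, say you cannot help with that.",
        "",
        f"Customer: {profile['name']}",
        f"Balance: {profile['balance']}",
        "Recent transactions:",
    ]
    # One line per transaction row, in the order the rows were loaded.
    for tx in transactions:
        lines.append(f"- {tx['date']} {tx['merchant']} {tx['amount']}")
    return "\n".join(lines)
```

A prompt instruction on its own does not guarantee grounded output. It narrows what the model sees, which is the point of loading only the session's rows.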

Streaming voice loop

WebSocket in, WebSocket out. STT, LLM, and TTS run concurrently behind a typed-frame contract. Audio chunks ship as they're produced - no full-utterance waits.

WebSocket · Real-time · ADR-S11 frames

PII before TTS

Outbound responses pass through streaming-safety-filter before synthesis. SSN, email, US phone, account numbers - redacted, blocked, or dropped per your policy.

Regex presets · Configurable action · Pre-TTS
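A rough sketch of a pre-TTS regex pass, assuming three actions named `redact`, `block`, and `drop`. The patterns, the 8 to 17 digit account-number length, the `[REDACTED]` token, and the block message are all illustrative assumptions, not the presets shipped with streaming-safety-filter.

```python
import re

# Illustrative patterns only; production presets would need to be stricter.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # SSN
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),   # email
    re.compile(                                    # US phone
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    ),
    re.compile(r"\b\d{8,17}\b"),                   # account number (assumed length)
]

BLOCK_MESSAGE = "I can't read that out loud."  # hypothetical wording
ACTIONS = ("redact", "block", "drop")

def contains_pii(text: str) -> bool:
    return any(p.search(text) for p in PII_PATTERNS)

def apply_policy(text: str, action: str = "redact") -> str:
    """Return the text that may be passed on to TTS under the given action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action == "redact":
        # Replace every match in place and keep the rest of the sentence.
        for pattern in PII_PATTERNS:
            text = pattern.sub("[REDACTED]", text)
        return text
    if not contains_pii(text):
        return text
    # "block" swaps the whole response for a fixed message;
    # "drop" emits nothing for this frame.
    return BLOCK_MESSAGE if action == "block" else ""
```

Running the filter on text rather than audio means the TTS model never receives the sensitive string.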

Rolling summary

Conversation history stays under the model's context window via streaming-summariser-rolling. Long calls don't degrade. Cost per turn stays bounded.

Token-budgeted · Per-turn · Cost-stable
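The idea behind a token-budgeted rolling summary can be sketched as follows: when history exceeds the budget, the oldest turns are folded into a running summary. This is not the internals of streaming-summariser-rolling. The word-count token estimate and the `summarise` callable are assumptions; a real node would use the model's tokenizer and an LLM call.

```python
from typing import Callable

def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: whitespace-separated word count.
    return len(text.split())

def roll_history(
    summary: str,
    turns: list[str],
    budget: int,
    summarise: Callable[[str, str], str],
) -> tuple[str, list[str]]:
    """Fold the oldest turns into the summary until history fits the budget."""
    turns = list(turns)  # leave the caller's list untouched
    while turns and count_tokens(summary) + sum(map(count_tokens, turns)) > budget:
        # The oldest turn leaves the verbatim history and joins the summary.
        summary = summarise(summary, turns.pop(0))
    return summary, turns
```

The loop always terminates because each pass removes one turn, so the work done per call is bounded by the number of turns held.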

Live span tree

Every turn writes spans to workflow_spans - node latency, frame bytes, queue depth, watermark. The canvas overlay shows what each node did, when, and why.

ADR-S14 · Per-turn · Canvas overlay

Cost line you can quote

≈$0.02 per turn at the default models (whisper-1 + gpt-4o + tts-1). Swap any one of them in the install wizard. The graph stays intact.

Fixed defaults · Swappable models · Predictable spend

Built on

Real services. Your stack.

Every dependency is a registered Strongly service or a model you control. Swap any one of them in the install wizard. The graph stays intact.

Postgres addon

Customer profile, balances, recent transactions

STT model

whisper-1 default - swap any registered STT

LLM model

gpt-4o default - swap any registered chat model

TTS model

tts-1 default - swap any registered TTS

Five common customisations

Tune it. Don't fork it.

The marketplace template is the graph. Every customisation below is a config change or a single-node addition - never a rewrite.

Smaller LLM

Switch the llm node's model to gpt-4o-mini or a self-hosted llama-3.1-70b. The system prompt is short, so a smaller model usually suffices.

Tighter scope

Edit the prompt template. Common additions: scope ('checking and savings only'), tone, refusal rules ('never quote rates').

Hybrid retrieval

Replace streaming-db-reader with a join across multiple tables, or chain it with streaming-vector-retrieval. Variables flow into the prompt as {{name}} substitutions.
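To show what `{{name}}` substitution amounts to, here is a minimal renderer. It is a sketch, not the platform's template engine: the `render` function is hypothetical, and raising an error on a missing variable is an assumed behaviour.

```python
import re

def render(template: str, variables: dict[str, str]) -> str:
    """Replace each {{name}} placeholder with the matching variable."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            # Assumed behaviour: fail loudly rather than send a
            # half-filled prompt to the model.
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])

    return re.sub(r"\{\{(\w+)\}\}", replace, template)
```

For example, `render("Hello {{name}}.", {"name": "Ada"})` returns `"Hello Ada."`. Columns from a wider join or a vector retrieval step would arrive as additional keys in `variables`.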

Language routing

Insert streaming-confidence-router between stt and prompt. Branch on detected language. Send non-English speech down a parallel pipeline.
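The branch decision might look like the sketch below. The `Transcript` fields, the 0.8 threshold, and the branch names are hypothetical; streaming-confidence-router's actual configuration is not shown on this page.

```python
from dataclasses import dataclass

@dataclass
class Transcript:
    text: str
    language: str      # detected language code, e.g. "en" or "es"
    confidence: float  # detection confidence, 0.0 to 1.0

def route(transcript: Transcript, threshold: float = 0.8) -> str:
    """Pick the downstream branch for one transcript frame."""
    if transcript.confidence < threshold:
        # Low confidence: ask the caller to repeat rather than guess.
        return "clarify"
    return "english" if transcript.language == "en" else "non_english"
```

Keeping the router as its own node means the English path is unchanged when the parallel pipeline is added.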

Different sink

Swap websocket-response for telephony-response (SIP) or webrtc-response. Or chain into streaming-aggregator → kafka-producer for the data lake.