Streams transcripts to Kafka in real time. Archives audio + transcript to S3 on session end.
WebSocket audio in, STT, fan-out: per-turn JSON envelopes to a Kafka topic for downstream analytics, full session audio + transcript to S3 via the streaming-recorder. The classic cold-side data lake pipeline for voice traffic, without a separate post-call ingest.
No black box. Each step is a typed-frame node you can edit, monitor, and replace.
Caller speaks. Audio streams in over WebSocket at 16 kHz PCM mono.
VAD detects the end of each utterance, STT transcribes. Each finalised TextFrame fans out to two sinks.
Sink 1: streaming-kafka-producer publishes a JSON envelope (session_id, turn_id, role, text, timestamp_ms) keyed by session_id. gzip + acks=all.
Sink 2: streaming-recorder buffers audio + transcript and uploads to your S3 bucket on session end, under the live-transcription/ prefix.
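To make the Sink 1 record concrete, here is a minimal sketch of the per-turn envelope and its session_id key. The field names come from the description above. The `TurnEnvelope` class, the `to_kafka_record` helper, the integer `turn_id`, and the sample values are illustrative assumptions, not the node's actual code.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TurnEnvelope:
    """Hypothetical container mirroring the five envelope fields."""
    session_id: str
    turn_id: int  # assumption: the real turn_id type is not specified
    role: str
    text: str
    timestamp_ms: int


def to_kafka_record(envelope):
    """Return (key, value) bytes: key is the session_id, value is the JSON envelope."""
    key = envelope.session_id.encode("utf-8")
    value = json.dumps(asdict(envelope), separators=(",", ":")).encode("utf-8")
    return key, value


envelope = TurnEnvelope(
    session_id="sess-42",
    turn_id=3,
    role="caller",  # illustrative role value
    text="I'd like to check my order.",
    timestamp_ms=int(time.time() * 1000),
)
key, value = to_kafka_record(envelope)

# With aiokafka, a producer built as
#   AIOKafkaProducer(bootstrap_servers=..., compression_type="gzip", acks="all")
# could publish this pair with
#   await producer.send_and_wait(topic, value=value, key=key)
```

Keeping the key as the raw session_id bytes is what lets the broker route every turn of a session to the same partition.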
Streaming graph contract, observability, and cost discipline come standard. The agent ships with a full test suite that runs in CI on every node version bump.
streaming-kafka-producer with aiokafka 0.12. JSON envelope by default with session_id / turn_id / role / text / timestamp_ms; raw-text mode available. Configurable key template + compression + acks.
streaming-recorder buffers audio + frame-level transcript and uploads on session end. Configurable bucket, key prefix, audio direction. S3-protocol compatible (real S3, R2, MinIO).
Both sinks consume the same STT text_out port via the multi-edge ports declared in ADR-S13. Kafka publish doesn't block S3 upload, and vice versa.
WebSocket trigger control_out feeds both sinks - StartFrame opens the Kafka producer + recorder, EndFrame flushes and closes both. No frames lost on session end.
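The non-blocking fan-out and the flush-on-end behaviour can be pictured with a small asyncio sketch. It assumes one queue per sink and a sentinel standing in for EndFrame. The names `END`, `sink`, and `run_session` are hypothetical, and the real nodes may be wired quite differently.

```python
import asyncio

END = object()  # stand-in for EndFrame


async def sink(name, queue, delay, out):
    """Drain one queue at its own pace and stop once END arrives."""
    while True:
        item = await queue.get()
        if item is END:
            out.append((name, "closed"))
            return
        await asyncio.sleep(delay)  # simulated publish / buffering cost
        out.append((name, item))


async def run_session(turns):
    out = []
    kafka_q, recorder_q = asyncio.Queue(), asyncio.Queue()
    tasks = [
        asyncio.create_task(sink("kafka", kafka_q, 0.0, out)),
        asyncio.create_task(sink("recorder", recorder_q, 0.01, out)),
    ]
    for text in turns:
        # Each finalised transcript is enqueued for both sinks without waiting.
        kafka_q.put_nowait(text)
        recorder_q.put_nowait(text)
    for queue in (kafka_q, recorder_q):
        # END is queued last, so every earlier turn drains before the sink closes.
        queue.put_nowait(END)
    await asyncio.gather(*tasks)
    return out


events = asyncio.run(run_session(["hello", "goodbye"]))
```

Because each sink owns its queue, the slower recorder never holds up the Kafka leg, and both sinks still see every turn before they close.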
Kafka records key off session_id by default so a session's transcripts land on one partition for ordered consumption. Switch to {{session_id}}-{{turn_id}} for parallel-per-turn analytics.
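A rough sketch of how the two key templates behave. `render_key` and `partition_for` are hypothetical helpers, and the CRC32 hash is only a stand-in: Kafka clients typically use a murmur2-based default partitioner, so real partition numbers will differ.

```python
import re
import zlib

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_key(template, fields):
    """Replace each {{name}} placeholder with the matching field value."""
    return _PLACEHOLDER.sub(lambda m: str(fields[m.group(1)]), template)


def partition_for(key, num_partitions):
    """Stand-in partitioner: stable hash of the key modulo the partition count."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


turns = [{"session_id": "sess-42", "turn_id": t} for t in range(1, 5)]

# Default template: every turn of the session renders the same key,
# so the set of partitions it can land on has exactly one member.
by_session = {partition_for(render_key("{{session_id}}", t), 6) for t in turns}

# Per-turn template: each turn renders a distinct key and hashes independently.
by_turn = [render_key("{{session_id}}-{{turn_id}}", t) for t in turns]
```

With per-turn keys, two turns of the same session can land on different partitions, so consumers should not rely on cross-turn ordering.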
Kafka publish latency, S3 upload size, and producer record counts all land on per-turn spans. Filter the canvas overlay by sink to find the slow leg.
Every dependency is a registered Strongly service or a model you control. Swap any one of them in the install wizard. The graph stays intact.
The marketplace template is the graph. Every customisation below is a config change or a single-node addition - never a rewrite.
Pin kafka_out.config.topic per workflow instance - sales calls, support, onboarding all in their own topics.
Change kafka_out.config.key_template to {{session_id}}-{{turn_id}} so a session's turns spread across partitions instead of pinning to one. Ordering across a session's turns is then no longer guaranteed.
Flip kafka_out.config.include_metadata off when downstream Logstash-style consumers expect only the transcript text.
Add streaming-conversation-store between STT and Kafka so per-session canonical transcripts land in MongoDB alongside the Kafka data lake feed.
Insert streaming-language-detect after STT and use streaming-conditional to route per-language to different Kafka topics.
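Two of the customisations above, sketched as plain functions: the include_metadata toggle and per-language topic routing. The topic names, the fallback topic, and both helper functions are assumptions for illustration. The actual nodes take these settings as config, and the real include_metadata behaviour may differ in detail.

```python
import json

# Hypothetical topic names; in the graph these would sit in node config.
TOPIC_BY_LANGUAGE = {"en": "transcripts.en", "es": "transcripts.es"}
DEFAULT_TOPIC = "transcripts.other"


def route_topic(language):
    """Pick the topic for a detected language, falling back to a catch-all."""
    return TOPIC_BY_LANGUAGE.get(language, DEFAULT_TOPIC)


def serialize_value(turn, include_metadata=True):
    """JSON envelope when include_metadata is on, bare transcript text when off."""
    if include_metadata:
        return json.dumps(turn, sort_keys=True).encode("utf-8")
    return turn["text"].encode("utf-8")


turn = {
    "session_id": "sess-42",
    "turn_id": 1,
    "role": "caller",
    "text": "hola",
    "timestamp_ms": 1700000000000,
}
topic = route_topic("es")
raw_value = serialize_value(turn, include_metadata=False)
```

A catch-all topic keeps turns in unmapped languages from being dropped.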
We don't leave until it runs. Talk to a forward-deployed engineer about deploying Live Transcription + Analytics into your environment with your STT, your LLM, your TTS, your data.