Transcription API
The UG Labs Interaction API allows you to stream audio and receive real-time transcription results via WebSocket.
Important
Each transcribe action supports up to 30 seconds of audio. For longer recordings, split the audio into segments of 30 seconds or less and call transcribe once per segment.
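As a minimal sketch of how the 30-second limit might be planned for, the helper below computes the time windows a longer recording would be split into. The name `segment_bounds` is hypothetical and not part of the API. Cutting the audio itself at those offsets, for example with an audio library, is not shown.

```python
from typing import List, Tuple


def segment_bounds(total_seconds: float, max_seconds: float = 30.0) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds, each window at most max_seconds long.

    Hypothetical helper for illustration; the 30-second default mirrors the
    limit stated above.
    """
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds must be > 0")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_seconds, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


# A 75-second recording needs three transcribe calls.
for start, end in segment_bounds(75):
    print(f"segment {start:.0f}s to {end:.0f}s")
```

Each window would then be streamed with add_audio and followed by its own transcribe request.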
Try it out
Test the Transcription API interactively using our STT Tester. Record audio from your microphone or upload an audio file to see transcription in action.
Authentication
Each connection requires a valid access token.
You can generate one by following our Authentication Guide.
Connection Flow
- Connect to the WebSocket endpoint
- Send an authenticate message with your access token
- Stream audio in chunks via add_audio
- Send a transcribe request
- Receive the final transcription response
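The streaming step can be prepared ahead of time. The sketch below, which is an illustration rather than an official client, picks a MIME type from the file extension and yields base64-encoded chunks ready to place in add_audio messages. The helper names `mime_type_for` and `iter_base64_chunks` are hypothetical. The 32000-byte chunk size and the MIME types are taken from the examples and Notes on this page.

```python
import base64
from pathlib import Path
from typing import Iterator

# MIME types listed in the Notes section of this page.
MIME_TYPES = {".mp3": "audio/mpeg", ".ogg": "audio/ogg", ".wav": "audio/wav"}


def mime_type_for(path: str) -> str:
    """Map a file extension to one of the documented MIME types."""
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"Unsupported audio format: {suffix or path}")
    return MIME_TYPES[suffix]


def iter_base64_chunks(data: bytes, chunk_size: int = 32000) -> Iterator[str]:
    """Yield base64 strings, each encoding at most chunk_size raw bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield base64.b64encode(data[offset:offset + chunk_size]).decode("ascii")


# Usage: 70,000 bytes of placeholder data become three chunks.
chunks = list(iter_base64_chunks(b"\x00" * 70000))
print(len(chunks), mime_type_for("sample.mp3"))
```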
Endpoint
Staging:
wss://pug.stg.uglabs.app/interact
Message Format
All client messages follow this format:
{
  "type": "request",
  "uid": "unique-client-id",
  "kind": "authenticate | add_audio | transcribe",
  "timestamp": "2025-10-05T12:00:00Z"
}
Example — Python
import base64
import json
import uuid
from datetime import datetime, timezone

from websocket import create_connection

AUDIO_FILE = "sample.mp3"
ACCESS_TOKEN = "<YOUR_ACCESS_TOKEN>"
URL = "wss://pug.stg.uglabs.app/interact"
LANGUAGE_CODE = "en"
UID = str(uuid.uuid4())
CHUNK_SIZE = 32000  # 32 KB

headers = [f"Authorization: Bearer {ACCESS_TOKEN}"]


def make_rpc(kind, **fields):
    return {
        "type": "request",
        "uid": UID,
        "kind": kind,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **fields,
    }


ws = create_connection(URL, header=headers)

# Authenticate
ws.send(json.dumps(make_rpc("authenticate", access_token=ACCESS_TOKEN)))
print("Auth:", json.loads(ws.recv()))

# Send audio chunks
with open(AUDIO_FILE, "rb") as f:
    while chunk := f.read(CHUNK_SIZE):
        ws.send(json.dumps(make_rpc(
            "add_audio",
            audio=base64.b64encode(chunk).decode(),
            config={"sampling_rate": 48000, "mime_type": "audio/mpeg"}
        )))
        print("Chunk:", json.loads(ws.recv()))

# Request transcription
transcribe_req = make_rpc("transcribe", language_code=LANGUAGE_CODE)
ws.send(json.dumps(transcribe_req))

# Receive transcription
while True:
    res = json.loads(ws.recv())
    if res.get("kind") == "transcribe":
        print("Transcription:", res["text"])
        break

ws.close()
Example — JavaScript (Node.js)
import WebSocket from "ws";
import fs from "fs";
import { v4 as uuidv4 } from "uuid";

const AUDIO_FILE = "sample.mp3";
const ACCESS_TOKEN = "<YOUR_ACCESS_TOKEN>";
const URL = "wss://pug.stg.uglabs.app/interact";
const LANGUAGE_CODE = "en";
const UID = uuidv4();
const CHUNK_SIZE = 32000;

const ws = new WebSocket(URL, {
  headers: { Authorization: `Bearer ${ACCESS_TOKEN}` },
});

function makeRpc(kind, fields = {}) {
  return { type: "request", uid: UID, kind, timestamp: new Date().toISOString(), ...fields };
}

ws.on("open", () => {
  console.log("Connected");
  ws.send(JSON.stringify(makeRpc("authenticate", { access_token: ACCESS_TOKEN })));

  const buffer = fs.readFileSync(AUDIO_FILE);
  for (let i = 0; i < buffer.length; i += CHUNK_SIZE) {
    const chunk = buffer.subarray(i, i + CHUNK_SIZE);
    ws.send(JSON.stringify(makeRpc("add_audio", {
      audio: chunk.toString("base64"),
      config: { sampling_rate: 48000, mime_type: "audio/mpeg" },
    })));
  }

  ws.send(JSON.stringify(makeRpc("transcribe", { language_code: LANGUAGE_CODE })));
});

ws.on("message", (data) => {
  const msg = JSON.parse(data);
  if (msg.kind === "transcribe") {
    console.log("Transcription:", msg.text);
    ws.close();
  }
});

ws.on("close", () => console.log("Connection closed"));
Example Response
Transcription Result
{
  "type": "response",
  "uid": "d1deb6ea-6b6a-4957-b59f-741bc70c5b8a",
  "kind": "transcribe",
  "client_start_time": null,
  "server_start_time": "2025-10-05T09:20:03.188700Z",
  "server_end_time": "2025-10-05T09:20:13.425967Z",
  "text": "the amazon rainforest, ..."
}
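The two server timestamps in the response above can be used to estimate how long the request took on the server. The sketch below assumes that `server_start_time` and `server_end_time` mark the start and end of server-side handling, which this page does not state explicitly. The helper names `parse_utc` and `server_duration_seconds` are hypothetical.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    # Replacing 'Z' keeps this working on Python versions older than 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def server_duration_seconds(response: dict) -> float:
    """Seconds elapsed between server_start_time and server_end_time."""
    start = parse_utc(response["server_start_time"])
    end = parse_utc(response["server_end_time"])
    return (end - start).total_seconds()


# The example response from this page, trimmed to the fields used here.
response = {
    "kind": "transcribe",
    "server_start_time": "2025-10-05T09:20:03.188700Z",
    "server_end_time": "2025-10-05T09:20:13.425967Z",
    "text": "the amazon rainforest, ...",
}
print(f"Server time: {server_duration_seconds(response):.3f} s")
```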
Notes
- Audio chunk size: ≤ 32 KB per message
- Format: MP3, OGG or WAV (audio/mpeg, audio/ogg, or audio/wav)
- Sample rate: 48 kHz recommended
- Order: Always authenticate → add_audio → transcribe
- Response: Transcription is returned in the "text" field