🛠️ Architecture Deep Dive

🧞 Conversation Manager

This is the primary class you will interact with. It distills the complexity of managing a real-time conversation into a few simple methods.
  • Key Responsibilities:
    • Orchestrates the entire conversation flow.
    • Manages state transitions (e.g., from playing audio to listening for the user).
    • Initializes the WebSocket connection and microphone access.
    • Provides a high-level API: initialize(), pause(), resume(), sendText(), etc.
  • Configuration: It's instantiated with a ConversationConfig object, which is crucial for defining its behavior. The hooks property within this config is the primary way the SDK communicates back to your application UI.
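As a minimal sketch of this wiring, the example below shows a ConversationConfig with hooks driving a UI. The specific hook names (onStateChange, onSubtitle) and state values are illustrative assumptions, not the SDK's actual API:

```typescript
// Hypothetical sketch of instantiating the ConversationManager.
// Hook names and state values are assumptions for illustration only.
type ConversationState = "idle" | "listening" | "thinking" | "speaking";

interface ConversationHooks {
  onStateChange?: (state: ConversationState) => void;
  onSubtitle?: (text: string) => void;
  onError?: (err: Error) => void;
}

interface ConversationConfig {
  apiKey: string;
  hooks: ConversationHooks;
}

class ConversationManager {
  constructor(private config: ConversationConfig) {}

  async initialize(): Promise<void> {
    // The real SDK would open the WebSocket and request microphone
    // access here; this sketch only reports the state transition.
    this.config.hooks.onStateChange?.("listening");
  }

  sendText(text: string): void {
    // Would forward the text input to the network layer.
    this.config.hooks.onStateChange?.("thinking");
  }
}

// Usage: the hooks are the SDK's channel back to your application UI.
const manager = new ConversationManager({
  apiKey: "YOUR_API_KEY",
  hooks: {
    onStateChange: (s) => console.log("state:", s),
    onSubtitle: (t) => console.log("subtitle:", t),
  },
});
```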

🗣️ Conversation Network

This manager handles all low-level WebSocket communication.
  • Key Responsibilities:
    • Establishes, maintains, and closes the WebSocket connection.
    • Handles authentication and initial configuration messages.
    • Sends user input (audio/text) to the server.
    • Receives assistant responses (audio, subtitles, metadata) and forwards them to the appropriate managers via events.
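The event-forwarding described above might look like the sketch below: a typed message router that dispatches each incoming WebSocket frame to subscribed managers. The message type names ("audio", "subtitle", "metadata") mirror the responsibilities listed here but are assumptions about the wire format:

```typescript
// Hypothetical sketch of the network layer's event dispatch.
// Message shapes are illustrative assumptions, not the real protocol.
type ServerMessage =
  | { type: "audio"; chunk: string } // e.g. base64-encoded audio
  | { type: "subtitle"; text: string }
  | { type: "metadata"; data: Record<string, unknown> };

type Listener = (msg: ServerMessage) => void;

class ConversationNetwork {
  private listeners = new Map<ServerMessage["type"], Listener[]>();

  // Managers (playback, subtitles, ...) subscribe by message type.
  on(type: ServerMessage["type"], fn: Listener): void {
    const list = this.listeners.get(type) ?? [];
    list.push(fn);
    this.listeners.set(type, list);
  }

  // Called for every frame received on the WebSocket.
  handleMessage(raw: string): void {
    const msg = JSON.parse(raw) as ServerMessage;
    for (const fn of this.listeners.get(msg.type) ?? []) fn(msg);
  }
}
```

In this design the network layer stays ignorant of playback details: the Playback Manager would simply subscribe to "audio" and "subtitle" messages.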

🎩 User Input Manager

This manager is responsible for capturing everything the user says or types.
  • Key Responsibilities:
    • Initializes and manages the AudioRecorder to get raw audio data from the microphone.
    • Uses a VADManager, powered by the industry-standard Silero VAD model, to detect speech with high accuracy, automatically starting and stopping the recording process.
    • Packages audio data and text into the correct format to be sent over the network.
    • Implements a critical "barge-in" feature: when the assistant's audio playback is within 1000 ms of finishing, it proactively starts buffering the user's audio. This minimizes the delay between turns and keeps the conversation feeling seamless and responsive.
  • Sub-components:
    • 🎤 AudioRecorder: Interfaces with the browser's MediaRecorder or an AudioWorklet to capture audio chunks.
    • 🤫 VADManager: Runs the lightweight Silero VAD model to determine whether the user is speaking, working alongside a server-side smart turn detection model.
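The barge-in buffering logic described above can be sketched as a simple gate: an audio chunk is kept once the VAD reports speech, or once assistant playback is within the 1000 ms window. The class and parameter names here are illustrative assumptions:

```typescript
// Hypothetical sketch of barge-in buffering. The 1000 ms window comes
// from the description above; everything else is assumed for illustration.
const BARGE_IN_WINDOW_MS = 1000;

class UserInputBuffer {
  private chunks: Uint8Array[] = [];

  // vadSpeaking: the Silero VAD's verdict for this chunk.
  // playbackRemainingMs: time left in the assistant's current audio.
  push(
    chunk: Uint8Array,
    vadSpeaking: boolean,
    playbackRemainingMs: number
  ): void {
    const nearEndOfTurn = playbackRemainingMs <= BARGE_IN_WINDOW_MS;
    if (vadSpeaking || nearEndOfTurn) {
      // Buffered chunks can be sent the instant the turn switches,
      // so no user audio is lost at the boundary between turns.
      this.chunks.push(chunk);
    }
  }

  flush(): Uint8Array[] {
    const out = this.chunks;
    this.chunks = [];
    return out;
  }
}
```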

🎪 Playback Manager

This manager handles the rendering of the assistant's response.
  • Key Responsibilities:
    • Receives messages from Conversation Network and directs them to the correct player.
    • Coordinates the synchronized playback of audio, subtitles, and avatar animations.
  • Sub-components:
    • 🎵 AudioPlayer.ts: A robust audio player that handles chunked audio data, ensuring smooth, gapless playback of streamed audio.
    • 📜 SubtitleManager.ts: Manages the display and timing of word-by-word or line-by-line subtitles.
    • 🧒 AvatarManager.ts: Provides a simple API (playIdle(), playTalk(), playListen()) to control high-level avatar animations. It emits events that a UI component can listen to in order to drive the actual animation system (e.g., Spine, Rive, Three.js).
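The AvatarManager's event-driven pattern might look like the sketch below: the manager only emits high-level states, and your UI layer maps them onto whatever animation system you use. The listener-registration method name is an assumption for illustration:

```typescript
// Hypothetical sketch of the AvatarManager event pattern. The manager
// stays animation-system-agnostic; the UI drives Spine/Rive/Three.js.
type AvatarState = "idle" | "talk" | "listen";

class AvatarManager {
  private handlers: Array<(s: AvatarState) => void> = [];

  // A UI component registers here to drive its animation rig.
  onStateChange(fn: (s: AvatarState) => void): void {
    this.handlers.push(fn);
  }

  playIdle(): void { this.emit("idle"); }
  playTalk(): void { this.emit("talk"); }
  playListen(): void { this.emit("listen"); }

  private emit(s: AvatarState): void {
    for (const fn of this.handlers) fn(s);
  }
}

// Usage: swap the console.log for a call into your animation system.
const avatar = new AvatarManager();
avatar.onStateChange((s) => console.log("avatar ->", s));
avatar.playTalk();
```

Keeping the manager decoupled this way means the SDK never needs to know which rendering library (Spine, Rive, Three.js) the application chose.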