# @koi-video/voice-realtime-sdk-beta

[中文文档](https://cdn.jsdelivr.net/npm/@koi-video/voice-realtime-sdk-beta@latest/README.zh-CN.md)

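Browser SDK for real-time voice chat. It manages the REST session (`/start`, `/stop`), authenticated with a Robot-Key and Robot-Token, and the WebRTC connection (microphone capture, remote audio, stream messages). The Agora Web SDK is a regular dependency of this package, so you do not need to install it separately.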

---

## Installation

```bash
pnpm add @koi-video/voice-realtime-sdk-beta
```

```js
import {
  ChatClient,
  VoiceEvents
} from '@koi-video/voice-realtime-sdk-beta';
```

| Export | Description |
|--------|-------------|
| `ChatClient` | Main client. |
| `VoiceEvents` | Event name constants (see [Events](#events)). |

---

## Usage

### Minimal flow

```js
import { ChatClient, VoiceEvents } from '@koi-video/voice-realtime-sdk-beta';

const client = new ChatClient(
  {
    robotKey: process.env.ROBOT_KEY,
    robotToken: process.env.ROBOT_TOKEN
  },
  {
    userName: 'demo-user-1'
  }
);

client.on(VoiceEvents.ERROR, err => {
  console.error(err.code, err.message);
});

client.on(VoiceEvents.ROBOT_MESSAGE, ({ sessionId, content, segmentId }) => {
  console.log('bot:', content, segmentId);
});

client.on(VoiceEvents.USER_MESSAGE, ({ content, segmentId }) => {
  console.log('user:', content, segmentId);
});

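// Start a session; the promise resolves with the sessionId.
const sessionId = await client.startVoiceChat();

// Controls available while the session is active.
client.interrupt();
await client.setAudioEnabled({ bool: false }); // emits AUDIO_MUTED
await client.setAudioEnabled({ bool: true }); // emits AUDIO_UNMUTED

// End the session.
await client.stopVoiceChat();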
```

`startVoiceChat()` resolves with the **`sessionId`** (the same value as the backend `room_name`). Each `ChatClient` supports one active session at a time; call `stopVoiceChat()` (or wait for `SESSION_ENDED`) before starting another.
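Because only one session can be active per client, it can help to track session state from the events themselves. The following is a minimal sketch, not part of the SDK: `trackSessionState` is a hypothetical helper that only assumes the client exposes `on(eventName, handler)` and that the `events` argument carries `SESSION_STARTED` and `SESSION_ENDED` (for example, the exported `VoiceEvents`).

```js
// Hypothetical helper (not an SDK API): remembers whether a session is active
// by listening to SESSION_STARTED and SESSION_ENDED.
function trackSessionState(client, events) {
  let activeSessionId = null;

  client.on(events.SESSION_STARTED, ({ sessionId }) => {
    activeSessionId = sessionId;
  });

  client.on(events.SESSION_ENDED, () => {
    activeSessionId = null;
  });

  return {
    // True between SESSION_STARTED and SESSION_ENDED.
    isActive: () => activeSessionId !== null,
    // The current sessionId, or null when no session is active.
    currentSessionId: () => activeSessionId
  };
}
```

A caller might create it with `trackSessionState(client, VoiceEvents)` and check `isActive()` before calling `startVoiceChat()`. Note that this sketch only flips to active once `SESSION_STARTED` fires, so it does not guard against two overlapping `startVoiceChat()` calls made while the first is still connecting.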

### Catch-all listener

```js
client.on(VoiceEvents.ALL, (eventName, data) => {
  console.log(eventName, data);
});
```
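For debugging, the catch-all handler can feed a small bounded log rather than the console. This is a sketch under stated assumptions: `createEventLog` is a hypothetical helper, and the names to ignore are supplied by the caller (for instance the `FLOW_DEBUG`, `SIDE_INFO`, and `STREAM_MESSAGE` names mentioned under [Events](#events); whether the emitted strings match those names exactly is an assumption).

```js
// Hypothetical helper (not an SDK API): keeps the most recent events seen by
// a catch-all listener, skipping any event names listed in `ignoredNames`.
function createEventLog(ignoredNames = [], maxEntries = 200) {
  const ignored = new Set(ignoredNames);
  const entries = [];

  function handler(eventName, data) {
    if (ignored.has(eventName)) return;
    entries.push({ eventName, data, at: Date.now() });
    // Drop the oldest entry once the cap is exceeded.
    if (entries.length > maxEntries) entries.shift();
  }

  return { handler, entries };
}
```

It could be wired up as `client.on(VoiceEvents.ALL, log.handler)`, after which `log.entries` holds the recent events in arrival order.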

---

## Events

Subscribe with `client.on(VoiceEvents.XXX, handler)` or `client.on(VoiceEvents.ALL, (eventName, data) => ...)`.

### `VoiceEvents`

| Constant | Emitted when | Payload |
|----------|--------------|---------|
| `SESSION_CREATED` | After `/start` succeeds and RTC is ready. | `{ sessionId }` |
| `SESSION_STARTED` | Immediately after `SESSION_CREATED`. | `{ sessionId }` |
| `SESSION_ENDED` | On stop, RTC disconnect, remote presence timeout, or stop failure. | `{ sessionId, reason }` |
| `USER_MESSAGE` | A stream message with `topic === 'chat'` from the user. | `{ sessionId, content, segmentId?, timestamp? }` |
| `ROBOT_MESSAGE` | A stream message with `topic === 'chat'` from the assistant. | Same shape as `USER_MESSAGE`. |
| `AUDIO_MUTED` / `AUDIO_UNMUTED` | After `setAudioEnabled`. | `{ sessionId }` |
| `INTERRUPT` | A stream message with `topic === 'interrupt'`. | `{ sessionId }` |
| `ERROR` | On API, RTC, or parse errors. | `code`, `message`, optional `sessionId` |
| `ALL` | On every event. | Handler receives `(eventName, data)`. |

Handlers registered on `ALL` may also receive `FLOW_DEBUG`, `SIDE_INFO`, and `STREAM_MESSAGE` as the first argument (`eventName`).
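The `USER_MESSAGE` and `ROBOT_MESSAGE` payloads can be collected into a running transcript. The sketch below is illustrative only: `createTranscript` is a hypothetical helper, and it assumes that a later message with the same `segmentId` and role replaces the earlier one (for example, a refined transcription). This README does not confirm that behavior, so adjust the merge rule to what the backend actually sends.

```js
// Hypothetical helper (not an SDK API): builds an ordered transcript.
// ASSUMPTION: messages sharing a role and segmentId are updates to the same
// segment, so the latest content wins. Messages without a segmentId are
// always appended.
function createTranscript() {
  const entries = []; // ordered list of { role, segmentId, content }
  const bySegment = new Map(); // `${role}:${segmentId}` -> entry

  function add(role, { content, segmentId }) {
    if (segmentId === undefined || segmentId === null) {
      entries.push({ role, segmentId: null, content });
      return;
    }
    const key = `${role}:${segmentId}`;
    const existing = bySegment.get(key);
    if (existing) {
      existing.content = content;
    } else {
      const entry = { role, segmentId, content };
      bySegment.set(key, entry);
      entries.push(entry);
    }
  }

  return { add, entries };
}
```

Typical wiring would be `client.on(VoiceEvents.USER_MESSAGE, msg => transcript.add('user', msg))` and the same for `ROBOT_MESSAGE` with `'robot'`.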

Typical `ERROR.code` values: `HTTP_*`, `NETWORK`, `RTC_ERROR`, `STREAM_PARSE`, `CHAT_ERROR`, `RTC_STOP`, `STOP_API`, `REMOTE_JOIN_TIMEOUT` (no remote user joined within **15 s** of the local RTC join), and `REMOTE_REJOIN_TIMEOUT` (no remote user rejoined within **30 s** after the **last** remote user left).
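An `ERROR` handler often needs to branch on these codes, including the `HTTP_*` prefix. The following sketch shows one way to do that; `classifyVoiceError` and its category names (`'http'`, `'timeout'`, and so on) are hypothetical groupings chosen for illustration, not something the SDK defines.

```js
// Hypothetical helper (not an SDK API): maps an ERROR payload to a coarse
// category based on its `code`. The category names are illustrative.
function classifyVoiceError(err) {
  const code = String(err && err.code ? err.code : '');

  if (code.startsWith('HTTP_')) return 'http';
  if (code === 'NETWORK') return 'network';
  if (code === 'REMOTE_JOIN_TIMEOUT' || code === 'REMOTE_REJOIN_TIMEOUT') {
    return 'timeout';
  }
  if (code === 'RTC_ERROR' || code === 'RTC_STOP') return 'rtc';
  if (code === 'STOP_API') return 'stop';
  if (code === 'STREAM_PARSE') return 'parse';
  if (code === 'CHAT_ERROR') return 'chat';
  return 'unknown';
}
```

For example, `client.on(VoiceEvents.ERROR, err => console.warn(classifyVoiceError(err), err.message))` would log the category next to the message.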

---

## Demo (this repo)

```bash
cd packages/voice-realtime-sdk
pnpm install
pnpm demo
pnpm demo:build
pnpm demo:preview
```

See `demo/` for the source (Vite, with optional HTTPS configured in `demo/vite.config.js`).