Realtime API

Overview

The Realtime API provides WebSocket-based bidirectional communication for applications requiring ultra-low latency, such as voice assistants and interactive agents.

Connection

wss://api.yuhuanstudio.com/v1/realtime

Authentication

Pass the API key either as the `api_key` query parameter on the connection URL or in the initial handshake. The snippet below uses the query parameter:

const ws = new WebSocket(
  "wss://api.yuhuanstudio.com/v1/realtime?api_key=YOUR_API_KEY"
);
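API keys can contain characters that need percent-encoding inside a URL, so it may be safer to build the connection URL programmatically than to concatenate strings. The following is a minimal sketch, not part of the API: `buildRealtimeUrl` is a hypothetical helper name, and it assumes only the `api_key` query parameter shown above.

```javascript
// Hypothetical helper (not part of the API): builds the connection URL
// with the API key percent-encoded in the api_key query parameter.
function buildRealtimeUrl(apiKey, base = "wss://api.yuhuanstudio.com/v1/realtime") {
  if (!apiKey) {
    throw new Error("An API key is required to connect");
  }
  const url = new URL(base);
  // searchParams.set encodes characters such as "+" or "&" in the key.
  url.searchParams.set("api_key", apiKey);
  return url.toString();
}

// Usage (where a WebSocket implementation is available):
// const ws = new WebSocket(buildRealtimeUrl("YOUR_API_KEY"));
```

Keeping the key out of source files (for example, reading it from an environment variable) is a common precaution, since a key embedded in a URL can end up in logs.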

Message Protocol

All messages are JSON-encoded, and audio is carried as base64-encoded strings inside them.

Send Input

{
  "type": "input_audio_buffer.append",
  "audio": "<base64-encoded-audio>"
}
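To illustrate how raw audio bytes end up in the message above, here is a minimal sketch. `buildAudioAppend` is a hypothetical helper, and it uses Node.js's `Buffer` for base64 encoding; a browser would need a different encoding step. The required audio format (encoding, sample rate) is not stated on this page, so the bytes below are placeholders only.

```javascript
// Hypothetical helper (not part of the API): wraps raw audio bytes in an
// input_audio_buffer.append message, ready to pass to ws.send().
function buildAudioAppend(audioBytes) {
  // Buffer is Node.js-specific; it base64-encodes the byte array.
  const base64 = Buffer.from(audioBytes).toString("base64");
  return JSON.stringify({ type: "input_audio_buffer.append", audio: base64 });
}

// Three placeholder bytes stand in for a real audio chunk.
const frame = buildAudioAppend(new Uint8Array([1, 2, 3]));
// Usage: ws.send(frame);
```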

Receive Output

{
  "type": "response.audio.delta",
  "delta": "<base64-encoded-audio-chunk>"
}
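On the receiving side, the `delta` field has to be decoded from base64 before it can be played or written to a file. A minimal sketch, assuming Node.js's `Buffer` is available; `decodeAudioDelta` is a hypothetical helper name, and the audio format of the decoded bytes is not specified on this page.

```javascript
// Hypothetical helper (not part of the API): returns the raw bytes of a
// response.audio.delta message, or null for any other message type.
function decodeAudioDelta(message) {
  if (message.type !== "response.audio.delta") {
    return null;
  }
  // Buffer.from(..., "base64") decodes the string; copy into a Uint8Array.
  return new Uint8Array(Buffer.from(message.delta, "base64"));
}

// Usage inside ws.onmessage:
// const bytes = decodeAudioDelta(JSON.parse(event.data));
```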

Example: Voice Chat

const ws = new WebSocket(
  "wss://api.yuhuanstudio.com/v1/realtime?api_key=YOUR_API_KEY"
);

ws.onopen = () => {
  // Configure session
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      model: "model-id",
      modalities: ["text", "audio"],
      voice: "alloy"
    }
  }));
};

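ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  switch (message.type) {
    case "response.text.delta":
      // Print partial text as it arrives. process.stdout is Node.js-only;
      // in a browser, append message.delta to the page instead.
      process.stdout.write(message.delta);
      break;
    case "response.audio.delta":
      // Play the base64-encoded audio chunk. playAudio is a function the
      // application must supply; it is not part of the API.
      playAudio(message.delta);
      break;
    case "response.done":
      console.log("\n--- Response complete ---");
      break;
    case "error":
      // Surface server-reported errors instead of dropping them silently
      console.error("Realtime API error:", message);
      break;
  }
};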

Event Types

Client Events

Event	Description
`session.update`	Configure session parameters
`input_audio_buffer.append`	Send audio data
`input_audio_buffer.commit`	Finalize audio input
`conversation.item.create`	Send text message
`response.create`	Trigger response generation
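The table suggests a natural ordering for one audio turn: append the audio chunks, commit the buffer, then request a response. The sketch below assembles that sequence. `buildAudioTurn` is a hypothetical helper, the ordering is inferred from the event descriptions rather than stated by the API, and it assumes `input_audio_buffer.commit` and `response.create` need no extra fields.

```javascript
// Hypothetical helper (not part of the API): returns the ordered client
// events for one audio turn, given chunks that are already base64-encoded.
function buildAudioTurn(base64Chunks) {
  const events = base64Chunks.map((audio) => ({
    type: "input_audio_buffer.append",
    audio,
  }));
  // Assumed ordering: finalize the buffered audio, then ask for a response.
  events.push({ type: "input_audio_buffer.commit" });
  events.push({ type: "response.create" });
  return events;
}

// Usage:
// for (const e of buildAudioTurn(chunks)) ws.send(JSON.stringify(e));
```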

Server Events

Event	Description
`session.created`	Session initialized
`response.text.delta`	Partial text response
`response.audio.delta`	Partial audio response
`response.done`	Response complete
`error`	Error occurred
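Because text and audio arrive as partial deltas, a client usually accumulates them until `response.done`. The following sketch shows one way to do that with the server events listed above. `createResponseCollector` is a hypothetical name, and since the fields of the `error` event are not documented here, the whole message is included in the thrown error.

```javascript
// Hypothetical helper (not part of the API): accumulates deltas and
// returns the full response once response.done arrives.
function createResponseCollector() {
  let text = "";
  const audioChunks = [];

  return function handle(message) {
    switch (message.type) {
      case "response.text.delta":
        text += message.delta;
        return null;
      case "response.audio.delta":
        audioChunks.push(message.delta); // still base64-encoded
        return null;
      case "response.done": {
        const result = { text, audioChunks: audioChunks.slice() };
        // Reset state so the collector can be reused for the next response.
        text = "";
        audioChunks.length = 0;
        return result;
      }
      case "error":
        throw new Error("Realtime API error: " + JSON.stringify(message));
      default:
        // session.created and any unlisted events are ignored here.
        return null;
    }
  };
}

// Usage inside ws.onmessage:
// const done = handle(JSON.parse(event.data));
// if (done) console.log(done.text);
```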

Note: The Realtime API works only with models that support real-time interaction. Check a model's capabilities for real-time support before selecting it in `session.update`.

