API reference

OpenAIProvider Class

Extends AIProviderBase, IUsesCredential, IChatTask, ISpeechToTextTask, ITextToSpeechTask

OpenAI provider implementing chat (Responses API), speech-to-text, and text-to-speech.

Supports optional image inputs for multimodal models when SupportsVision is enabled.

Guides: https://platform.openai.com/docs/

Used by UI and samples via IChatTask, ISpeechToTextTask, and ITextToSpeechTask.

Properties

string IUsesCredential.ProviderId [Get]

Unique identifier for this provider type (e.g., "OpenAI", "LlamaApi").

Used to store and retrieve credentials from the central CredentialStorage.

bool IUsesCredential.OverrideApiKey [Get]

When true, this provider asset uses its own API key instead of the central storage.

bool IChatTask.SupportsVision [Get]

bool SupportsVision [Get]

Indicates whether this provider can handle vision inputs (images) alongside text during chat.

When true, ChatAsync packages ImageInput items using the OpenAI Responses format and, depending on settings, inlines remote images or resolves their redirects. Enable this for GPT-4o or any other multimodal model that supports image understanding.

Controlled by the serialized supportsVision field and reported via IChatTask.SupportsVision. See also inlineRemoteImages and resolveRemoteRedirects, which influence how remote images are prepared.
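As a rough illustration of what an inlined image looks like in the Responses input format, the sketch below builds a single user message with one input_text item and one input_image item carrying a base64 data URL. The class and method names (VisionPayloadSketch, BuildUserMessage) are hypothetical and not part of this SDK, and System.Text.Json is an assumption; the provider's actual serialization code may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of the SDK: builds one user message in the
// OpenAI Responses input format, inlining the image as a base64 data URL.
public static class VisionPayloadSketch
{
    public static string BuildUserMessage(string text, byte[] imageBytes, string mimeType)
    {
        // The text part is always present.
        var content = new List<object>
        {
            new Dictionary<string, object> { ["type"] = "input_text", ["text"] = text }
        };

        // The image part is added only when image bytes were supplied.
        if (imageBytes != null && imageBytes.Length > 0)
        {
            string dataUrl = "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
            content.Add(new Dictionary<string, object>
            {
                ["type"] = "input_image",
                ["image_url"] = dataUrl
            });
        }

        var message = new Dictionary<string, object>
        {
            ["role"] = "user",
            ["content"] = content
        };
        return JsonSerializer.Serialize(message);
    }
}
```

Inlining avoids a second fetch by the API at the cost of a larger request body; passing a plain https URL in image_url is the alternative when the image is publicly reachable.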

override InferenceType DefaultSupportedTypes[Get]

Declares the default inference location supported by this provider: cloud.

Used by AIProviderBase to filter capabilities and route tasks. This provider does not advertise local or edge execution out of the box; if you need on-device models, use a provider that returns InferenceType.OnDevice or InferenceType.LocalServer.

Member Functions

async Task< ChatResponse > ChatAsync

( ChatRequest req,

IProgress< ChatDelta > stream,

CancellationToken ct )

Sends a chat turn to OpenAI's Responses API and returns the assistant's text reply.

Validates apiKey and model, builds a single input message per the Responses schema, and POSTs to {apiRoot}/v1/responses. The method extracts output_text if present, or falls back to the first text item in the first output message.

OpenAI Responses docs: https://platform.openai.com/docs/guides/responses/

See also AIProviderBase, IChatTask.

Parameters

req

The user message and optional ImageInput list. If SupportsVision is enabled, images are serialized as input_image items; remote images can be inlined or have redirects resolved depending on inlineRemoteImages and resolveRemoteRedirects.

stream

Optional callback that receives text as ChatDelta values, for example to update UI. Because this implementation uses non-streaming HTTP, it reports the final text once per call rather than incrementally.

ct

Cancellation token for image preparation and HTTP. Cancels the request if the operation is aborted.

Returns

A ChatResponse containing the assistant text and the raw JSON payload for debugging or downstream parsing.
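The extraction order described for ChatAsync (output_text first, then the first text item of the first output message) can be sketched against the raw JSON as follows. This is a minimal illustration using System.Text.Json, not the provider's actual parser; ResponsesTextSketch and ExtractText are hypothetical names.

```csharp
using System.Text.Json;

// Hypothetical helper, not part of the SDK: pulls the assistant text out of a
// Responses API JSON body using the fallback order described above.
public static class ResponsesTextSketch
{
    // Returns the assistant text, or null when none can be found.
    public static string ExtractText(string rawJson)
    {
        using JsonDocument doc = JsonDocument.Parse(rawJson);
        JsonElement root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object) return null;

        // 1) Prefer a top-level "output_text" string when it is present.
        if (root.TryGetProperty("output_text", out JsonElement direct) &&
            direct.ValueKind == JsonValueKind.String)
        {
            return direct.GetString();
        }

        // 2) Otherwise look at the first item of type "message" in "output".
        if (!root.TryGetProperty("output", out JsonElement output) ||
            output.ValueKind != JsonValueKind.Array)
        {
            return null;
        }

        foreach (JsonElement item in output.EnumerateArray())
        {
            if (item.ValueKind != JsonValueKind.Object) continue;
            if (!item.TryGetProperty("type", out JsonElement type) ||
                type.ValueKind != JsonValueKind.String ||
                type.GetString() != "message")
            {
                continue; // skip non-message items such as reasoning entries
            }

            if (item.TryGetProperty("content", out JsonElement content) &&
                content.ValueKind == JsonValueKind.Array)
            {
                foreach (JsonElement part in content.EnumerateArray())
                {
                    if (part.ValueKind == JsonValueKind.Object &&
                        part.TryGetProperty("text", out JsonElement text) &&
                        text.ValueKind == JsonValueKind.String)
                    {
                        return text.GetString();
                    }
                }
            }
            break; // only the first message is inspected
        }
        return null;
    }
}
```

The same approach can be applied to the raw JSON payload carried by ChatResponse when downstream code needs fields other than the text.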

async Task< string > TranscribeAsync

( byte[] audioBytes,

string language,

CancellationToken ct )

Transcribes an audio clip using OpenAI audio/transcriptions and returns plain text.

Honors sttResponseFormat (e.g., json or text) and optional sttTemperature. Requires valid apiKey and model. POSTs to {apiRoot}/v1/audio/transcriptions.

OpenAI Transcriptions: https://platform.openai.com/docs/guides/speech-to-text

See also ISpeechToTextTask.

Parameters

audioBytes

Raw audio data (e.g., WAV). Throws if null or empty. The content is sent as multipart/form-data.

language

Optional ISO language override (for example, "en", "de"). If null/empty, falls back to sttLanguage or lets OpenAI auto-detect.

ct

Cancellation token for the HTTP request.

Returns

Transcript text. If sttResponseFormat is "text", the raw body is returned.

Exceptions

ArgumentException

Thrown when audioBytes is null or empty.

InvalidOperationException

Thrown if apiKey or model is missing.
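The language fallback and the two return paths described for TranscribeAsync can be sketched as below. This assumes the JSON response format carries the transcript in a top-level text field, as the OpenAI transcription endpoint does; TranscriptionSketch and its methods are hypothetical names, not SDK members.

```csharp
using System.Text.Json;

// Hypothetical helper, not part of the SDK: mirrors the language fallback and
// response handling described for TranscribeAsync.
public static class TranscriptionSketch
{
    // Precedence: explicit argument, then the configured default (sttLanguage),
    // else null so the "language" form field is omitted and OpenAI auto-detects.
    public static string ResolveLanguage(string argument, string configured)
    {
        if (!string.IsNullOrEmpty(argument)) return argument;
        if (!string.IsNullOrEmpty(configured)) return configured;
        return null;
    }

    // For the "text" format the HTTP body already is the transcript;
    // otherwise the body is JSON and the transcript sits in its "text" field.
    public static string ReadTranscript(string body, string responseFormat)
    {
        if (responseFormat == "text") return body;

        using JsonDocument doc = JsonDocument.Parse(body);
        JsonElement root = doc.RootElement;
        if (root.ValueKind == JsonValueKind.Object &&
            root.TryGetProperty("text", out JsonElement text) &&
            text.ValueKind == JsonValueKind.String)
        {
            return text.GetString();
        }
        return null;
    }
}
```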

IEnumerator SynthesizeStreamCoroutine

( string text,

string voice,

Action< AudioClip > onReady )

Coroutine that synthesizes speech with OpenAI audio/speech and yields a Unity AudioClip.

Selects AudioType based on ttsOutputFormat (e.g., WAV/MP3). Requires valid apiKey and model. POSTs to {apiRoot}/v1/audio/speech, then streams the response into an AudioClip via the internal HTTP helper.

OpenAI TTS: https://platform.openai.com/docs/guides/text-to-speech

See also ITextToSpeechTask.

Parameters

text

Input text to speak. Logs and exits if empty. Combined with ttsVoice and optional ttsInstructions to control style, plus ttsSpeed for playback rate.

voice

Optional voice name override. If null/empty, uses ttsVoice.

onReady

Callback invoked with the created AudioClip once download/decoding completes.
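To show how the parameters above might combine into a request, the sketch below assembles a JSON body for the audio/speech endpoint using its documented fields (model, input, voice, response_format, speed, instructions). SpeechPayloadSketch and BuildBody are hypothetical names, and the exact body this provider sends is an assumption.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of the SDK: assembles a JSON body for
// POST {apiRoot}/v1/audio/speech from the settings described above.
public static class SpeechPayloadSketch
{
    public static string BuildBody(
        string model,
        string text,
        string voiceOverride,   // the "voice" argument; may be null or empty
        string defaultVoice,    // stands in for the ttsVoice setting
        string outputFormat,    // stands in for ttsOutputFormat, e.g. "wav" or "mp3"
        double speed,           // stands in for ttsSpeed
        string instructions)    // stands in for ttsInstructions; may be null or empty
    {
        // The per-call override wins; otherwise fall back to the configured voice.
        string voice = string.IsNullOrEmpty(voiceOverride) ? defaultVoice : voiceOverride;

        var body = new Dictionary<string, object>
        {
            ["model"] = model,
            ["input"] = text,
            ["voice"] = voice,
            ["response_format"] = outputFormat,
            ["speed"] = speed
        };

        // Style instructions are optional, so the field is sent only when set.
        if (!string.IsNullOrEmpty(instructions))
        {
            body["instructions"] = instructions;
        }
        return JsonSerializer.Serialize(body);
    }
}
```

Keeping response_format aligned with the AudioType used for decoding matters, since a mismatch between the downloaded bytes and the expected format would likely prevent Unity from producing a playable AudioClip.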