API reference

UnityInferenceEngineProvider Class

Provider for on-device AI inference using Unity Inference Engine.
Supports object detection and text-only LLM chat (SmolLM, Qwen, Phi, GPT-2). Optimized for low latency with GPU compute for real-time XR scenarios.
See the "Model Conversion, Serialization, and Quantization" guide in the docs for preparing .sentis assets: https://developers.meta.com/horizon/documentation/unity/unity-ai-unity-inference-engine

Properties

Indicates that this provider supports only InferenceType.OnDevice execution, meaning all inference runs locally on the headset (using Unity Inference Engine backends).
bool SupportsVision[Get]
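
Since this provider runs only on-device, callers may want to gate image inputs on its capability flags before building a request. A minimal sketch (the provider variable and its wiring are assumptions; only SupportsVision is documented above):

```csharp
// Hypothetical check before attaching images to a request.
// SupportsVision is documented above; everything else here is illustrative.
if (!provider.SupportsVision)
{
    Debug.Log("Text-only provider: image inputs will be ignored.");
}
```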

Member Functions

DetectAsync(Texture, CancellationToken)

Performs object detection on any Texture input (e.g., Texture2D or RenderTexture) using the Unity Inference Engine.
The model runs entirely on-device, producing bounding boxes, scores, and class IDs, which are filtered via GPU-based Non-Maximum Suppression (NMS) and returned as a compact binary result.
Parameters
src
The source Texture to process. Must be readable on GPU.
ct
Optional CancellationToken to abort inference if needed.
Returns
A binary-encoded byte array containing filtered detections in the format: [count][x,y,w,h,score,classId,label] per detection.
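
As an illustrative sketch of calling this overload (the provider instance, its wiring, and the assumption that the leading count field is a little-endian Int32 are all hypothetical; the reference only specifies the [count][x,y,w,h,score,classId,label] summary):

```csharp
using System;
using System.Threading;
using UnityEngine;

public class DetectionExample : MonoBehaviour
{
    // Hypothetical wiring; obtain the provider however your project supplies it.
    public UnityInferenceEngineProvider provider;
    public Texture2D sourceTexture; // must be GPU-readable

    async void RunDetection()
    {
        // Abort inference if it exceeds a budget (illustrative timeout).
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));

        // Returns the compact binary result described above:
        // [count][x,y,w,h,score,classId,label] per detection.
        byte[] result = await provider.DetectAsync(sourceTexture, cts.Token);

        // Assumption: the leading count is a little-endian Int32.
        int count = BitConverter.ToInt32(result, 0);
        Debug.Log($"Detections: {count}");
    }
}
```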
DetectAsync(RenderTexture, CancellationToken)

Overload of DetectAsync(Texture, CancellationToken) that accepts a RenderTexture.
This avoids an unnecessary GPU blit by forwarding the call directly to the Texture overload.
Parameters
src
Source RenderTexture to analyze.
ct
Optional CancellationToken to abort the operation.
Returns
A binary-encoded byte array containing filtered detection results: [count][x,y,w,h,score,classId,label] per detection.
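
For real-time XR capture, this overload lets a camera-backed RenderTexture be analyzed without an extra blit. A minimal sketch (the camera setup, resolution, and provider wiring are assumptions, not part of this API):

```csharp
using System.Threading;
using UnityEngine;

public class RenderTextureDetection : MonoBehaviour
{
    public UnityInferenceEngineProvider provider; // hypothetical wiring
    public Camera xrCamera;
    RenderTexture rt;

    void Start()
    {
        // Illustrative size; match your model's expected input resolution.
        rt = new RenderTexture(640, 640, 0);
        xrCamera.targetTexture = rt;
    }

    async void DetectFrame(CancellationToken ct)
    {
        // Forwards directly to the Texture overload; no extra GPU blit.
        byte[] detections = await provider.DetectAsync(rt, ct);
        Debug.Log($"Received {detections.Length} bytes of detection data.");
    }
}
```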
(Not yet implemented) Performs object detection on a raw image byte array (JPG or PNG format) and returns results as a JSON string.
Intended for CPU-based or cloud provider implementations.
Performs on-device, text-only LLM chat inference using Unity Inference Engine.
Supports streaming token-by-token generation; the approach has been validated with SmolLM, Qwen, and Phi models.
Parameters
req
Chat request containing text prompt (images are ignored for text-only models).
stream
Optional progress reporter for streaming token-by-token responses.
ct
Cancellation token to abort inference.
Returns
Complete chat response with generated text.
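
The reference above omits this method's signature, so the following is a hedged sketch only: it assumes a method named ChatAsync taking a request object, an optional IProgress<string> for streamed tokens, and a CancellationToken. ChatAsync, ChatRequest, and the response's Text property are illustrative names; only the streaming and text-only behavior is documented above.

```csharp
using System;
using System.Threading;
using UnityEngine;

public class ChatExample : MonoBehaviour
{
    public UnityInferenceEngineProvider provider; // hypothetical wiring

    async void AskModel()
    {
        // Images in the request are ignored for text-only models.
        var request = new ChatRequest { Prompt = "Name three uses of on-device LLMs in XR." };

        // Receive tokens as they are generated (streaming).
        var stream = new Progress<string>(token => Debug.Log(token));

        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

        // ChatAsync and ChatRequest are assumed names; the reference omits the signature.
        var response = await provider.ChatAsync(request, stream, cts.Token);
        Debug.Log(response.Text);
    }
}
```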