API reference

UnityInferenceEngineProvider Class

Provider for on-device AI inference using Unity Inference Engine.
Supports object detection and text-only LLM chat (SmolLM, Qwen, Phi, GPT-2). Optimized for low latency with GPU compute for real-time XR scenarios.
See the "Model Conversion, Serialization, and Quantization" guide in the docs for preparing .sentis assets: https://developers.meta.com/horizon/documentation/unity/unity-ai-unity-inference-engine

Protected Properties

DefaultSupportedTypes : override InferenceType
[Get]
Indicates that this provider supports only InferenceType.OnDevice execution, meaning all inference runs locally on the headset (using Unity Inference Engine backends).
Signature
override InferenceType DefaultSupportedTypes

Properties

SupportsVision : bool
[Get]
Signature
bool SupportsVision

Methods

ChatAsync ( req , stream , ct )
Performs on-device text-only LLM chat inference using Unity Inference Engine.
Supports streaming token generation, using the approach proven with the SmolLM, Qwen, and Phi models.
Signature
async Task< ChatResponse > ChatAsync(ChatRequest req, IProgress< ChatDelta > stream=null, CancellationToken ct=default)
Parameters
req: ChatRequest  Chat request containing the text prompt (images are ignored for text-only models).
stream: IProgress< ChatDelta >  Optional progress reporter for streaming token-by-token responses.
ct: CancellationToken  Cancellation token to abort inference.
Returns
Task< ChatResponse >  The complete chat response, including the generated text.
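The streaming and cancellation parameters above can be combined as in the following minimal sketch. StreamingCallSketch and RunWithTimeoutAsync are hypothetical names, not part of the SDK; the helper is generic so that it compiles without the SDK types, and in a project TDelta would be ChatDelta and TResponse would be ChatResponse. How the provider surfaces a cancelled call (for example, by throwing OperationCanceledException) is not stated in this reference, so the sketch does not catch anything.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the SDK.
public static class StreamingCallSketch
{
    // Runs a streaming call with a timeout-based cancellation token.
    // 'call' is expected to wrap something like provider.ChatAsync(req, progress, ct).
    public static async Task<TResponse> RunWithTimeoutAsync<TDelta, TResponse>(
        Func<IProgress<TDelta>, CancellationToken, Task<TResponse>> call,
        Action<TDelta> onDelta,
        TimeSpan timeout)
    {
        // The token is cancelled automatically once the timeout elapses.
        using (var cts = new CancellationTokenSource(timeout))
        {
            // Progress<T> captures the SynchronizationContext that is current
            // when it is constructed and posts callbacks to it. If this method
            // is called from Unity's main thread, the callbacks are therefore
            // expected to run on the main thread as well.
            var progress = new Progress<TDelta>(onDelta);
            return await call(progress, cts.Token);
        }
    }
}

// Assumed usage inside an async method ('provider' and 'request' are
// placeholders for an existing provider instance and ChatRequest):
//
//   ChatResponse response =
//       await StreamingCallSketch.RunWithTimeoutAsync<ChatDelta, ChatResponse>(
//           (progress, ct) => provider.ChatAsync(request, progress, ct),
//           delta => { /* append the partial output to the UI */ },
//           TimeSpan.FromSeconds(30));
```

Passing null for stream (the default) skips streaming entirely; the returned ChatResponse still carries the complete text.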
DetectAsync ( src , ct )
Performs object detection on any Texture input (e.g., Texture2D or RenderTexture) using the Unity Inference Engine.
The model runs entirely on-device, producing bounding boxes, scores, and class IDs, which are filtered via GPU-based Non-Maximum Suppression (NMS) and returned as a compact binary result.
Signature
async Task< byte[]> DetectAsync(Texture src, CancellationToken ct=default)
Parameters
src: Texture  The source Texture to process. Must be readable on the GPU.
ct: CancellationToken  Optional CancellationToken to abort inference if needed.
Returns
Task< byte[]>  A binary-encoded byte array containing the filtered detections: a [count] header followed by one [x,y,w,h,score,classId,label] record per detection.
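The field order of the returned payload is documented above, but the field widths, byte order, label encoding, and box units (pixels or normalized) are not. The sketch below shows how such a payload could be decoded under one assumed layout: a little-endian int32 count, then float32 x, y, w, h, score, an int32 classId, and a label written as a BinaryWriter-style length-prefixed UTF-8 string. DecodedDetection and DetectionDecoderSketch are hypothetical names, not SDK types; verify the layout against the provider before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical container for one decoded detection; not an SDK type.
public struct DecodedDetection
{
    public float X, Y, W, H, Score;
    public int ClassId;
    public string Label;
}

public static class DetectionDecoderSketch
{
    // ASSUMED layout (not confirmed by this reference):
    //   int32 count
    //   per detection: float32 x, y, w, h, score; int32 classId;
    //                  label as a length-prefixed UTF-8 string
    public static List<DecodedDetection> Decode(byte[] payload)
    {
        var results = new List<DecodedDetection>();
        if (payload == null || payload.Length == 0)
        {
            return results; // nothing to decode
        }

        using (var stream = new MemoryStream(payload))
        using (var reader = new BinaryReader(stream, Encoding.UTF8))
        {
            int count = reader.ReadInt32();
            for (int i = 0; i < count; i++)
            {
                // Fields are read in the documented order.
                var d = new DecodedDetection();
                d.X = reader.ReadSingle();
                d.Y = reader.ReadSingle();
                d.W = reader.ReadSingle();
                d.H = reader.ReadSingle();
                d.Score = reader.ReadSingle();
                d.ClassId = reader.ReadInt32();
                d.Label = reader.ReadString();
                results.Add(d);
            }
        }
        return results;
    }
}
```

If the real encoding differs (for example, fixed-length labels or 16-bit class IDs), only the read calls inside the loop need to change.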
DetectAsync ( src , ct )
Overload of DetectAsync(Texture, CancellationToken) that accepts a RenderTexture.
This avoids an unnecessary GPU blit by forwarding the call to the Texture overload directly.
Signature
async Task< byte[]> DetectAsync(RenderTexture src, CancellationToken ct=default)
Parameters
src: RenderTexture  Source RenderTexture to analyze.
ct: CancellationToken  Optional CancellationToken to abort the operation.
Returns
Task< byte[]>  A binary-encoded byte array containing the filtered detections: a [count] header followed by one [x,y,w,h,score,classId,label] record per detection.
DetectAsync ( imageJpgOrPng , ct )
(Not yet implemented) Performs object detection on a raw image byte array (JPG or PNG format) and returns results as a JSON string.
Intended for CPU-based or cloud provider implementations.
Signature
Task< string > DetectAsync(byte[] imageJpgOrPng, CancellationToken ct=default)
Parameters
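imageJpgOrPng: byte[]  Raw image bytes in JPG or PNG format.
ct: CancellationToken  Optional CancellationToken to abort the operation.
Returns
Task< string >  The detection results as a JSON string.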