API reference

OnDeviceLlmConfig Class

Configuration for on-device text-only LLM inference.
Contains model parameters, tokenizer, and chat template formatting.
IMPORTANT: Different models have different architecture parameters. You must configure these values to match your specific model, e.g.:
Qwen2.5-0.5B:
  • maxLayers: 24
  • numKeyValueHeads: 2
  • headDim: 64
  • eosTokenId: 151645
  • vocabSize: 151936
Create separate config assets for each model with appropriate parameters.

Fields

chatTemplateFormat : string
Signature
string chatTemplateFormat
defaultSystemMessage : string
Signature
string defaultSystemMessage
eosTokenId : int
Signature
int eosTokenId
headDim : int
Signature
int headDim
inferenceExecutionMode : InferenceExecutionMode
Signature
InferenceExecutionMode inferenceExecutionMode
maxLayers : int
Signature
int maxLayers
maxNewTokens : int
Signature
int maxNewTokens
maxPromptLength : int
Signature
int maxPromptLength
mergesFile : TextAsset
Signature
TextAsset mergesFile
mergesFileContentId : ulong
Signature
ulong mergesFileContentId
numKeyValueHeads : int
Signature
int numKeyValueHeads
stepsPerFrame : int
Signature
int stepsPerFrame
tokenizerConfigFile : TextAsset
Signature
TextAsset tokenizerConfigFile
tokenizerConfigFileContentId : ulong
Signature
ulong tokenizerConfigFileContentId
vocabFile : TextAsset
Signature
TextAsset vocabFile
vocabFileContentId : ulong
Signature
ulong vocabFileContentId

Properties

Tokenizer : Gpt2Tokenizer
[Get]
Signature
Gpt2Tokenizer Tokenizer

Methods

ApplyChatTemplate ( userPrompt , systemMessage )
Signature
string ApplyChatTemplate(string userPrompt, string systemMessage=null)
Parameters
userPrompt: string
systemMessage: string
Returns
string
InitializeTokenizer ()
Signature
void InitializeTokenizer()
Returns
void