| Scene | What it demonstrates | Key concepts |
|---|---|---|
| MainScene | Complete app entry point with all core systems (tracking, voice, lessons, UI flow) | Visual Scripting state machine, system initialization order, Application Variables, world anchor management |
| GymScene | Development/debugging scene for testing MRUK room scanning and camera object tracking | MRUK room data display, object tracking debug visuals, camera permission flow, Inference Engine preloading |
| SelectScene | Language selection UI where users choose their target language | Language selection flow, UI localization |
| LoadingScene | Initial loading screen and system initialization | Loading state, async system setup |
| ArtScene | Environment art assets for the immersive experience | Visual environment, 3D art integration |
| Scene | What it demonstrates | Key concepts |
|---|---|---|
| LlamaAPISample | Basic Llama API calls: chat conversation, word cloud generation, translation, example sentences, image understanding | LlamaRestApi.StartNewChat(), ContinueChat(), ImageUnderstanding(), base64 image encoding |
| AssistantAISample | Higher-level AssistantAI wrapper for word cloud generation, sentence complexity, transcription evaluation | Prompt engineering, JSON response parsing, phonetic similarity evaluation |
| ObjectRecognitionSample | YOLO object detection on a static image using Unity Sentis, with bounding box overlay (see the decoding sketch after this table) | Sentis model loading, YOLO output parsing (8400 detections x 84 values), Non-Maximum Suppression via FunctionalGraph |
| CameraImageSample | Passthrough Camera API image capture and display on a mesh | WebCamTextureManager, camera resolution configuration, camera pose/orientation |
| WordCloudSample | 3D word cloud lesson interaction: berry spawning, activation/deactivation, lesson completion flow, voice transcription trigger | Lesson3DInteractor usage, proximity-based activation, CameraTrackedTaxon with sample data |
| VoiceSynthesizeSample | TTS synthesis in 12 languages via wit.ai, with button-per-language UI | VoiceSynthesizer async API, AudioClip caching, multilingual TTS with romanized fallbacks |
| TextToSpeech | Text-to-speech functionality with UI controls | TTSSpeaker integration, SSML prosody markup |
| SpeechToText | Speech-to-text microphone input and transcription display | STT language toggling, microphone input handling |
| TranscriptionSample | Voice transcription pipeline with language switching | VoiceTranscriber lifecycle, partial vs. full transcription events |
| CharacterSample | Golly Gosh character animation, emotion, movement, and gaze behavior | Bezier curve movement, MaterialPropertyBlock sprite animation, gaze tracking with lazy follow |
| ActivitySample | Lesson activity data model and interaction patterns | Activity lifecycle, lesson data structures |
| AudioSample | Audio system and sound effect playback | Meta XR Audio integration, spatial audio |
| FindSpawnPositionsSample | MRUK floor placement for spawning objects on the user’s floor | MRUK room readiness polling, FindSpawnPositions building block |
| LanguageSelectSample | Language selection UI flow | Language selection state machine, UI event handling |
| LessonFlowSample | End-to-end lesson flow with state transitions | FlowController state machine, FlowState lifecycle hooks |
| PassthroughHighlighting | Passthrough environment highlighting technique | Passthrough API material manipulation |
| VATSample | Vertex Animation Texture (VAT) shader technique for animations | Shader-based animation, VAT playback |
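The 8400 x 84 YOLO output mentioned in the ObjectRecognitionSample row can be decoded roughly as follows. This is a minimal sketch, not the sample's actual parsing code: it assumes the common YOLOv8 layout (4 box values cx, cy, w, h followed by 80 class scores per candidate), a row-major flattened tensor, and a made-up `Detection` type and score threshold. The sample itself runs the model with Unity Sentis and performs Non-Maximum Suppression via FunctionalGraph, which is not shown here.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical decoded detection; the sample's own data types may differ.
public struct Detection
{
    public Rect Box;        // x, y, width, height in input-image pixels
    public int ClassIndex;  // index into the label list
    public float Score;     // best class score for this candidate
}

public static class YoloOutputDecoder
{
    // Assumes the output tensor is flattened row-major as [84, 8400]:
    // rows 0-3 are cx, cy, w, h and rows 4-83 are per-class scores.
    public static List<Detection> Decode(float[] output, int numDetections = 8400,
        int numClasses = 80, float scoreThreshold = 0.5f)
    {
        var results = new List<Detection>();
        for (var i = 0; i < numDetections; i++)
        {
            // Find the highest-scoring class for this candidate.
            var bestClass = 0;
            var bestScore = 0f;
            for (var c = 0; c < numClasses; c++)
            {
                var score = output[(4 + c) * numDetections + i];
                if (score > bestScore) { bestScore = score; bestClass = c; }
            }
            if (bestScore < scoreThreshold) continue;

            // Convert the center/size box encoding to a corner-anchored rect.
            var cx = output[0 * numDetections + i];
            var cy = output[1 * numDetections + i];
            var w = output[2 * numDetections + i];
            var h = output[3 * numDetections + i];
            results.Add(new Detection
            {
                Box = new Rect(cx - w / 2f, cy - h / 2f, w, h),
                ClassIndex = bestClass,
                Score = bestScore
            });
        }
        return results; // Overlapping boxes still need Non-Maximum Suppression.
    }
}
```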
The app uses MRUK to find a floor position. Golly Gosh, the character guide, appears and prompts you to plant a seed that grows into a language tree. After selecting your target language, you look around your room. The app identifies objects in real time (chair, laptop, bottle, etc.) and spawns 3D word clouds near them. When you approach a word cloud, it activates and Golly Gosh speaks the vocabulary in your target language. You repeat the words, the app transcribes your speech and evaluates it using Llama, and on success a berry flies to your tree, which grows through three tiers as you complete lessons.

Object tracking is wired up by constructing a `CameraTaxonTracker` from the environment raycast manager, the camera texture manager, and the image object classifier:

```csharp
m_taxonTracker = new CameraTaxonTracker(
    m_environmentRaycastManager,
    m_cameraTextureManager,
    m_imageObjectClassifier);
```

From `SpatialLingoApp.cs`.

`AssistantAI` uses `JsonUtility` to parse the Llama response after stripping markdown code fences:

```csharp
var response = await m_llamaAPI.ContinueChat(chat, request.ToString());
responseText = PrepareJsonStringForParsing(response.Message.Text);
cloud = JsonUtility.FromJson<WordCloudData>(responseText);
```

From `AssistantAI.cs`.
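The fence-stripping helper itself is not shown above. A minimal sketch of what `PrepareJsonStringForParsing` might do, assuming the model wraps its JSON in Markdown code fences, could look like the following; the project's actual helper may behave differently.

```csharp
using System;

public static class JsonResponseCleaner
{
    // Hypothetical stand-in for PrepareJsonStringForParsing: trims whitespace,
    // drops a leading fence line such as "```json" or "```", and removes a
    // trailing run of backticks so JsonUtility sees bare JSON.
    public static string PrepareJsonStringForParsing(string raw)
    {
        var text = raw.Trim();
        if (text.StartsWith("`"))
        {
            // Remove the opening fence line (everything up to the first newline).
            var firstLineBreak = text.IndexOf('\n');
            text = firstLineBreak >= 0 ? text[(firstLineBreak + 1)..] : string.Empty;
        }
        // Remove a closing fence made of backticks, if present.
        text = text.TrimEnd().TrimEnd('`');
        return text.Trim();
    }
}
```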
`VoiceTranscriber` wraps `AppDictationExperience` with auto-relisten on microphone timeout and separate events for partial and full transcription. The sample supports 12 languages for both TTS and STT. For the TTS implementation, see `VoiceSpeaker.cs`.
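For orientation, a dictation setup along these lines is possible with the Voice SDK. This is a sketch, not the project's `VoiceTranscriber`: the component name, field wiring, and the re-listen trigger are made up, and the `AppDictationExperience` namespace, event, and method names should be verified against your Voice SDK version.

```csharp
using Oculus.Voice.Dictation; // namespace may differ by Voice SDK version
using UnityEngine;

// Hypothetical listener component; the project's VoiceTranscriber is more involved.
public class DictationListener : MonoBehaviour
{
    [SerializeField] private AppDictationExperience m_dictation; // assign in the Inspector
    private bool m_keepListening;

    private void OnEnable()
    {
        m_keepListening = true;
        // Partial results stream in while the user is speaking; the full result
        // arrives once the utterance is finalized.
        m_dictation.DictationEvents.OnPartialTranscription.AddListener(OnPartial);
        m_dictation.DictationEvents.OnFullTranscription.AddListener(OnFull);
        m_dictation.Activate();
    }

    private void OnDisable()
    {
        m_keepListening = false;
        m_dictation.DictationEvents.OnPartialTranscription.RemoveListener(OnPartial);
        m_dictation.DictationEvents.OnFullTranscription.RemoveListener(OnFull);
        m_dictation.Deactivate();
    }

    private void OnPartial(string text) => Debug.Log($"partial: {text}");

    private void OnFull(string text)
    {
        Debug.Log($"full: {text}");
        // Crude auto-relisten: reactivate after each finalized utterance so the
        // microphone keeps listening (the sample keys this off a timeout instead).
        if (m_keepListening) m_dictation.Activate();
    }
}
```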
Custom Visual Scripting units drive the lesson flow. `SkippableUnit.Await` holds the flow until the step is marked done (or skipped), then continues on the target control output:

```csharp
protected override IEnumerator Await(Flow flow) {
    m_isDone = false;
    OnEnter(flow);
    yield return new WaitUntil(() => m_isDone);
    OnExit();
    yield return m_targetControlOutput;
}
```

From `SkippableUnit.cs`.

The squeeze interaction uses OVRPlugin hand bone tracking. When the user grabs an object, the sample reads bone positions for all five fingertips, computes the average distance from the palm center, and uses the ratio to the starting distance as a squeeze factor:

```csharp
var thumb = positions[(int)OVRPlugin.BoneId.XRHand_ThumbTip];
var index = positions[(int)OVRPlugin.BoneId.XRHand_IndexTip];
var currentDistance = AverageDistanceFingersCenter(m_selectingHand);
var ratio = currentDistance / m_startSelectDistance;
```
From `SqueezableHandInteraction.cs`.
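The fingertip-averaging helper is not shown above. A sketch of what `AverageDistanceFingersCenter` could compute, assuming it simply averages fingertip-to-palm distances, is below; the signature, the palm-center source, and the bone IDs beyond the thumb and index tips shown above are assumptions, not the sample's actual code.

```csharp
using UnityEngine;

public static class HandSqueezeMath
{
    // Hypothetical helper. Bone IDs other than the thumb and index tips are
    // assumed from the XRHand_* naming used in the snippet above.
    private static readonly OVRPlugin.BoneId[] s_fingertips =
    {
        OVRPlugin.BoneId.XRHand_ThumbTip,
        OVRPlugin.BoneId.XRHand_IndexTip,
        OVRPlugin.BoneId.XRHand_MiddleTip,
        OVRPlugin.BoneId.XRHand_RingTip,
        OVRPlugin.BoneId.XRHand_LittleTip,
    };

    // Averages the distance of the five fingertips from the palm center,
    // given bone positions indexed by OVRPlugin.BoneId.
    public static float AverageDistanceFingersCenter(Vector3[] positions, Vector3 palmCenter)
    {
        var total = 0f;
        foreach (var boneId in s_fingertips)
        {
            total += Vector3.Distance(positions[(int)boneId], palmCenter);
        }
        return total / s_fingertips.Length;
    }
}
```

Because the squeeze factor is the current average distance divided by the distance captured when the grab begins, it starts near 1.0 at the moment of selection and decreases as the hand closes, independent of hand size.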