We are thrilled to announce that we’re open sourcing an immersive VR garden-building experience developed by Meta using WebXR: Project Flowerbed. At Connect ‘22, we unveiled Project Flowerbed to demonstrate best practices for developers building high-quality WebXR experiences. Now, as an open source project, it’s even easier for developers to learn about our architecture, asset pipeline, and game mechanics.
In this post, we’ll dive deep into the best practices and learnings from Project Flowerbed. There’s a lot to cover, but we think these are valuable insights that every WebXR dev should know!
What Is Project Flowerbed?
Project Flowerbed is an immersive, meditative VR gardening experience that runs in the Meta Quest Browser.
In the experience, you explore a tranquil island where you can plant and grow flowers, trees, and more.
You can also take pictures of your garden as it comes to life and share them with friends.
We made Flowerbed to serve two purposes:
- Show the power of WebXR on the Meta Quest Platform by creating a high-quality immersive experience that runs in Meta Quest Browser with no installation required
- Make the codebase available to developers to help them learn how to build their own immersive experiences using WebXR
To that end, we’ve open sourced the entire Project Flowerbed codebase. You can find the homepage for Project Flowerbed here, and the source code is available on GitHub. The rest of this article will go into more detail about how the experience was put together, and we encourage you to look at the code as you go along.
How We Made Project Flowerbed
Basic Architecture
Project Flowerbed was built on top of Three.js, and we used ECSY as our game logic layer.
We chose Three.js both for its support for WebXR and for the vibrant community that surrounds it; because Project Flowerbed is a case study app for WebXR and Meta Quest Browser, it was important for us to build on top of a community-driven engine like Three.js so we could really understand how other developers are building their experiences today.
Three.js’s WebXR features were fairly straightforward to set up and use; with the help of some third-party extensions such as three-mesh-ui and three-mesh-bvh, we were able to rapidly spin up initial prototypes for game mechanics and iterate as we finalized mechanics and assets. However, Three.js by itself is focused almost entirely on rendering and is designed to be composed with other code, or a framework, to provide logic and gameplay features.
Thus, ECSY! We decided to use an entity-component system (ECS) for our logic because it let us decouple different mechanics, so developers could work on features in isolation and still have them fit together as a whole. We chose ECSY over its competitors both because it was one of the more mature JavaScript ECS libraries at the time and for a few specific features:
- Temporary state could be stored within Systems, reducing the number of one-off components.
- It was very straightforward to serialize data into JSON and then later deserialize it back into entities and components, which formed the backbone of how we saved gardens.
Unfortunately, ECSY has been deprecated by its creators and isn’t currently supported or maintained. There are a number of alternative ECS libraries, and we encourage you to find one that best suits your needs.
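To make the ECS pattern concrete, here is a minimal sketch of how a component and a system fit together in ECSY. The component and system names are hypothetical and not taken from Flowerbed’s codebase.

```js
import { World, System, Component, Types } from 'ecsy';

// Hypothetical component: tracks how far along a plant's growth is.
class Growth extends Component {}
Growth.schema = {
  progress: { type: Types.Number, default: 0 },
  rate: { type: Types.Number, default: 0.1 },
};

// Hypothetical system: advances growth for every entity that has a Growth component.
class GrowthSystem extends System {
  execute(delta) {
    this.queries.growing.results.forEach((entity) => {
      const growth = entity.getMutableComponent(Growth);
      growth.progress = Math.min(1, growth.progress + growth.rate * delta);
    });
  }
}
GrowthSystem.queries = {
  growing: { components: [Growth] },
};

const world = new World();
world.registerComponent(Growth).registerSystem(GrowthSystem);
world.createEntity().addComponent(Growth, { rate: 0.25 });

// Called once per frame from the render loop.
world.execute(1 / 72, performance.now());
```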
Pretty much all of the logic in Flowerbed runs through one of our ECSY systems, which can be found in the src/js/systems directory. Components, likewise, can be found in the src/js/components directory.
Asset Pipeline
Before we get into how we run things inside Flowerbed, we have to talk about how we got our art into the engine. Here, we’ll go into how we handled our 3D assets, from their creation to loading them and using them inside of Project Flowerbed:
1. Authoring Levels and Game Objects in Blender
All of our 3D assets, including the main environment, were created in Blender.
Blender has a robust scripting engine, which meant that we could add gameplay data to models (like colliders!) from within Blender and use it as our level editor. We assigned Custom Properties to individual meshes from within Blender, which allowed us to attach JSON data directly to nodes in the glTF model and then read those properties in-engine later to attach gameplay behaviors. So things like where ambient sounds would play, where to place linked objects, and which objects have colliders could all be defined from within Blender.
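As a rough sketch of the in-engine side (this is not Flowerbed’s actual loading code): Blender custom properties are exported as glTF extras, which Three.js’s GLTFLoader exposes on each node’s userData, so gameplay data can be collected with a simple traversal. The file path here is a placeholder.

```js
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

const loader = new GLTFLoader();
loader.load('models/gazebo.gltf', (gltf) => {
  const colliders = [];
  gltf.scene.traverse((node) => {
    // A mesh tagged with a `collider` custom property in Blender
    // shows up here as node.userData.collider.
    if (node.userData && node.userData.collider) {
      colliders.push(node);
    }
  });
  console.log(`Found ${colliders.length} collider meshes`);
});
```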
Gazebo with two collider meshes. Note the collider custom property in the lower right.
This, at a larger scale, is also how we created the main environment. The environment had a lot of objects that were defined in other files and objects that needed to have colliders, and we didn’t want to create them all by hand. So we created a script to generate a bunch of those gameplay objects and create those links!
Screenshot of the environment as authored in Blender.
Screenshot of the environment after running the script to create colliders and other features.
The environment, as well as the script we used to generate colliders and other features in that environment, can be found in content/models/environment.
2. Model Creation and Processing
We used the glTF file format for our 3D models. glTF is an open format with robust support in Three.js, and it can be easily exported from Blender (and other 3D modeling applications). There are a handful of glTF previewers on the web as well (the gltf viewer, built on Three.js, and Babylon.JS’s sandbox), which we used to verify that our models were exporting correctly and to debug any problems with materials, rigs, etc.
Textures were compressed into KTX2 Basis textures. KTX2 textures are GPU-compressed, which means they can remain compressed in GPU memory, unlike PNGs or JPGs, which must be decompressed before being uploaded to the GPU. As a result, KTX2 textures use much less memory, are faster to render, and usually have smaller file sizes, too.
There are a handful of resources that get into more detail on KTX2 textures and why they are good. Here is the formal announcement for their support in glTF loaders, and here is a good high-level rundown of how they work. For more detail about creating and fine-tuning them, this page is a good resource.
We created a command line-based pipeline to compress all of our glTFs and create the KTX2 textures. The pipeline is powered by gltf-transform, an open source library that provides a variety of utilities for processing glTF data; we used it to prune excess data, run compression, and create our KTX2 textures. The pipeline can be found in the asset_pipeline directory in the source and is run on every model in the content/models folder to produce the result that is imported in game.
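For reference, a minimal gltf-transform pass looks roughly like the sketch below; Flowerbed’s actual pipeline does more (including the KTX2 texture creation, which this sketch leaves out), and the file paths here are placeholders.

```js
import { NodeIO } from '@gltf-transform/core';
import { ALL_EXTENSIONS } from '@gltf-transform/extensions';
import { prune, dedup } from '@gltf-transform/functions';

const io = new NodeIO().registerExtensions(ALL_EXTENSIONS);

// Read a source glTF, strip unused data, merge duplicate resources,
// and write a compact binary .glb for the game to load.
const document = await io.read('content/models/flower.gltf');
await document.transform(prune(), dedup());
await io.write('assets/flower.glb', document);
```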
3. Loading the Models into Flowerbed
Inside of Flowerbed, all of the loading is handled by the AssetLoadingSystem, which interacts with the singleton component AssetDatabaseComponent. There, individual asset types are loaded through specific 'databases' that store the assets in memory and process them with any in-game changes needed (e.g. generating colliders and materials).
The Mesh Database handles all our .gltf files and performs the following operations:
- Loads the files via THREE's GLTFLoader, with extensions to handle compressed files
- Traverses each of the nodes of the file, separating all colliders from the 'visible' meshes and storing the colliders and visual meshes in different dictionaries
- Applies the shadow map and any materials needed to the visible meshes
At this point, once all the meshes are loaded, any system from within Flowerbed could fetch (a clone of) a model and add it to the scene, with the AssetReplacementSystem handling most use cases.
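The loader setup for compressed files follows the standard Three.js pattern: attach a KTX2Loader for the Basis-compressed textures (and a DRACOLoader, if mesh compression is used) to the GLTFLoader. This is a generic sketch rather than Flowerbed’s exact Mesh Database code; the decoder paths and file names are placeholders.

```js
import * as THREE from 'three';
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';
import { KTX2Loader } from 'three/examples/jsm/loaders/KTX2Loader.js';
import { DRACOLoader } from 'three/examples/jsm/loaders/DRACOLoader.js';

const renderer = new THREE.WebGLRenderer({ antialias: true });
const scene = new THREE.Scene();

// KTX2Loader transcodes Basis textures into whatever GPU format the device supports.
const ktx2Loader = new KTX2Loader()
  .setTranscoderPath('vendor/basis/')
  .detectSupport(renderer);

const dracoLoader = new DRACOLoader().setDecoderPath('vendor/draco/');

const gltfLoader = new GLTFLoader()
  .setKTX2Loader(ktx2Loader)
  .setDRACOLoader(dracoLoader);

gltfLoader.load('assets/flower.glb', (gltf) => scene.add(gltf.scene));
```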
NOTE: We only talk about the 3D model pipeline here because it’s the most complex one, but there are tools and scripts in the project for handling all of the other asset types too, such as audio, images, and fonts. These pipelines can all be found in the scripts folder in the source.
Collisions
Three.js doesn't come with built-in collisions or physics, so our main options were to use a separate physics engine or code up our own collisions library.
We decided, for Project Flowerbed, to roll our own collisions—both to gain more insight into that process and to avoid common problems with physics engines (namely, that every collision in a physics engine has to be physically modeled, even though we weren’t looking for physically accurate collision dynamics). Flowerbed also doesn’t have many collisions between moving objects; most collisions are the player against static objects, or rays against UI, so a physics engine seemed more complex than necessary.
We used a KDTree to handle the broad phase of collisions (where we filter out any objects too far away to possibly be involved in a collision) and three-mesh-bvh for the narrow phase (where we check individual objects to see if they collide). three-mesh-bvh comes with efficient raycasting and shapecasting and allows us to check intersections between arbitrary meshes and spheres. This meant that we could author collision meshes directly in Blender and import them into the experience without needing to turn them into primitive shapes.
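Here’s a rough sketch of that narrow-phase check with three-mesh-bvh; the box collider is a stand-in for the collision meshes authored in Blender.

```js
import * as THREE from 'three';
import { computeBoundsTree, disposeBoundsTree, acceleratedRaycast } from 'three-mesh-bvh';

// Patch Three.js so every BufferGeometry can build and query a BVH.
THREE.BufferGeometry.prototype.computeBoundsTree = computeBoundsTree;
THREE.BufferGeometry.prototype.disposeBoundsTree = disposeBoundsTree;
THREE.Mesh.prototype.raycast = acceleratedRaycast;

// Stand-in collider; build its BVH once since the geometry is static.
const colliderMesh = new THREE.Mesh(new THREE.BoxGeometry(2, 2, 2));
colliderMesh.geometry.computeBoundsTree();
colliderMesh.updateMatrixWorld();

// Narrow phase: test a player sphere against the collider.
// The sphere has to be transformed into the collider's local space first.
const playerSphere = new THREE.Sphere(new THREE.Vector3(0, 0.5, 0), 0.4);
const worldToLocal = new THREE.Matrix4().copy(colliderMesh.matrixWorld).invert();
playerSphere.applyMatrix4(worldToLocal);

const colliding = colliderMesh.geometry.boundsTree.intersectsSphere(playerSphere);
```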
Player physics are handled by the PlayerPhysicsSystem, which is loosely based on three-mesh-bvh’s CharacterController demo. Several times a frame, it checks whether the player is intersecting with an obstacle and, if so, pushes the player out of the obstacle.
Audio
Project Flowerbed uses howler.js to manage audio. We opted for howler.js instead of THREE’s PositionalAudio node because it gives us more control: howler.js lets us fade sounds, set positions independently of an Object3D’s position, and manage audio pooling on its own.
Audio is loaded into an AudioDatabase instance and then played by attaching the OneShotAudioComponent or LoopingAudioComponent to entities—or creating temporary entities to attach them to. One shot audios are destroyed as soon as they’re played, and looping audio plays until the component (or entity it’s attached to) is destroyed.
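For illustration, the howler.js features we leaned on look roughly like this (the file paths and values are placeholders):

```js
import { Howl } from 'howler';

const birdsong = new Howl({
  src: ['audio/birdsong.webm', 'audio/birdsong.mp3'],
  loop: true,
  volume: 0.6,
});

// Position the sound independently of any Object3D, then fade it in.
const id = birdsong.play();
birdsong.pos(4.0, 1.5, -2.0, id);
birdsong.fade(0, 0.6, 2000, id);
```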
Designing the Experience
Visuals
We wanted Flowerbed to be a rich and immersive visual experience. Because Flowerbed is intended to be a case study of the power of WebXR, it was important that we used high-end rendering techniques—PBR materials with normal maps and varying metallic/roughness, high-quality geometry, and real-time lighting are possible on the platform, and we needed to show how to implement that.
Flowerbed is also about designing and growing your garden, so we wanted dynamic shadows that change as your plants grow, variation of plants as you grow them, and fauna and water to make the world feel alive.
Plant Growth System
Plants are the core characters of Flowerbed. We knew we needed to support large numbers of plants, but we also wanted it to feel like users are actually gardening—that plants have their own identity and vary from plant to plant. We also wanted plants to have pleasant animations as they were planted and grew from seed.
To support large numbers of plants, we knew we needed to use instanced meshes for our plant rendering. Out of the box, Three.js doesn’t support any kind of animation on instanced meshes—this made it tricky to design a way to animate plants as they grew.
Our technical designer prototyped a plant growing with skeletal animation—when we looked into what he did, we realized it was mostly scaling on different bones that made the growth look good. This was the seed of our plant growth and animation system. We could make a simplified skeletal mesh system that supported only bone scaling and get a good result.
To instance the meshes efficiently, we needed to limit the amount of additional per-instance data we send with each draw call. Because we were only supporting a uniform scale per bone, each bone’s state can be represented with a single scalar value. GPUs naturally work with four-component values, and we already have a scale encoded in the root transform of each instance, so by packing four additional scalar values into a single four-component attribute, we can set five separate bone scales per mesh instance.
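As a simplified sketch of that packing (the attribute name and values are made up for this example), the four extra scalars can live in a single four-component instanced attribute that a custom vertex shader reads per instance:

```js
import * as THREE from 'three';

const count = 500;
const geometry = new THREE.PlaneGeometry(1, 1); // stand-in for a plant mesh

// One vec4 per instance: four bone scales packed into a single instanced attribute.
// The fifth scale lives in the instance's root transform (instanceMatrix).
const boneScales = new Float32Array(count * 4).fill(1.0);
geometry.setAttribute(
  'instanceBoneScales', // hypothetical attribute name read by a custom vertex shader
  new THREE.InstancedBufferAttribute(boneScales, 4)
);

const mesh = new THREE.InstancedMesh(geometry, new THREE.MeshStandardMaterial(), count);

// Update one instance's bone scales, e.g. as that plant grows.
boneScales.set([0.2, 0.5, 0.8, 1.0], 3 * 4);
geometry.getAttribute('instanceBoneScales').needsUpdate = true;
```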
These values are driven entirely from JavaScript. To animate the plants as they grew, we used a set of PD controllers (a simplified form of PID controllers, a common control loop mechanism) to animate the individual scale values, and we hand-tuned the targets for each value.
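Here’s a minimal sketch of the PD idea; the gains and names are illustrative, not the values Flowerbed ships with.

```js
// The proportional term pulls the value toward the target; the derivative
// term damps the velocity so the growth settles without ringing forever.
class PDController {
  constructor(kp, kd) {
    this.kp = kp;
    this.kd = kd;
    this.velocity = 0;
  }

  update(current, target, dt) {
    const accel = this.kp * (target - current) - this.kd * this.velocity;
    this.velocity += accel * dt;
    return current + this.velocity * dt;
  }
}

// Drive one bone scale from 0 toward 1 a little each frame.
const controller = new PDController(60, 10);
let boneScale = 0;
boneScale = controller.update(boneScale, 1.0, 1 / 72);
```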
This animation system also lets us permanently scale parts of plants differently. By adding a little random offset to each bone scale, we get a little bit of geometry variety in our plants for free.
Each plant also has a set of texture variations. Right now these aren’t implemented in a clever manner—each texture variation is simply a separate draw call. If we wanted to optimize this further, we could either switch these textures to a texture array or add some kind of hue shifting.
The plant geometry is all standard glTF models. Foliage is tough on mobile GPUs—layers of alpha geometry can burn through shader budgets in no time. We initially tried to avoid this by setting low targets for the number of layers of geometry—trying to design models with four or fewer layers of geometry from any viewpoint. We also used masked alpha instead of blended alpha to try to avoid an ‘aggregated geometry’ style. Unfortunately, this didn’t really work out in practice and was still too slow for a few of the models. Our later meshes were created fully opaque—the rose bush, for example, was higher poly than many of the other flowers but rendered faster because it was fully opaque. The overall performance of Flowerbed would be better if we had authored all our plants as fully opaque meshes, as the cost of additional polygons is generally easier to deal with than the blended overdraw.
Fauna
In addition to the plants, we wanted to add some additional life to the flowerbed garden. This took the shape of animals that populate the island. We settled on a few different kinds of fauna: birds and butterflies in the sky, fish and ducks in the ponds and river, and squirrels and rabbits on the ground.
Initially, all these animals were created and animated with skeletal animation. Unfortunately, we quickly discovered that this technique didn’t scale to the number of fauna we were hoping to populate our scene with. After a little investigation, we decided to move some of the fauna over to morph targets, which require considerably fewer CPU resources than skeletal animations.
The seagulls, fish, ducks, and butterflies were all moved over to morph targets, each using a simple animation scheme that moved between a base pose and two separate target poses.
The fish mesh from Project Flowerbed, with Twist_Left and Twist_Right being the target poses for the swimming animation.
The rabbits and squirrels needed to stay as skeletal meshes and so are present in much smaller numbers than the rest of the fauna.
Because the morph targets are applied on the GPU, the CPU only needs to update a few scalar values that tell the GPU how much of each target to blend with the base pose. Instead of updating potentially dozens of matrices per instance per frame, each morph target instance simply needs to update its root transform and a few additional scalar values.
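In Three.js terms, that per-frame work is just writing a few numbers into morphTargetInfluences. A rough sketch, assuming `fish` is a mesh whose glTF geometry includes the Twist_Left and Twist_Right targets shown above:

```js
// Assuming `fish` is a THREE.Mesh whose geometry carries the Twist_Left and
// Twist_Right morph targets exported from Blender.
const left = fish.morphTargetDictionary['Twist_Left'];
const right = fish.morphTargetDictionary['Twist_Right'];

function updateSwim(timeSeconds) {
  const wave = Math.sin(timeSeconds * 4.0);
  // Blend between the two twist poses; the GPU applies the vertex offsets.
  fish.morphTargetInfluences[left] = Math.max(wave, 0);
  fish.morphTargetInfluences[right] = Math.max(-wave, 0);
}
```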
In addition to the lowered CPU cost, the move to morph targets also made it much easier to use instancing to render batches of fauna in a single draw call. By extending the mesh instancing code to render morph targets, we could update the target weights as part of the instance buffers and render each type of fauna in a single draw call.
Camera System
In Project Flowerbed, we aimed to create an immersive and interactive WebXR experience that includes a camera feature for taking pictures of the virtual garden. To achieve this, we implemented three key pieces of functionality: the camera holding mechanic, the camera screen, and the mechanism of taking a photo.
The camera holding mechanic allows users to hold the camera in their virtual hand, just as they would hold a real camera, with the trigger positioned as a virtual shutter release button. This provides a more immersive and intuitive experience for the user. We used the WebXR API to detect the user's controller and track its position and orientation in the virtual environment and created a virtual camera object that is positioned and oriented relative to the controller.
The camera screen provides a preview of the picture that the user is about to take and helps the user frame the picture effectively. We used a plane geometry with a dynamically updated texture as the screen mesh. The camera screen was positioned and oriented to match the virtual camera, and a snapshot of the scene was rendered to the target texture.
Of course, rendering the entire scene an extra time can be very expensive! To mitigate this cost, we render the camera live view at a lower resolution (150 x 100 pixels) and at a fixed, lower framerate (30 frames per second) than the main scene. We also took care to reuse the static shadow map for the scene so we don’t need to do an additional shadow update.
Our initial implementation of the camera used multiple canvases and the browser’s .toDataURL function to copy the canvas data into a texture. Unfortunately, this had multiple drawbacks: each canvas needed a separate context (and thus a separate renderer, which required a separate compilation of all the shaders we used), and moving the texture between canvases required the texture to be resolved to the CPU before the GPU could use it. We eventually switched from canvas-based textures to a RenderTarget for the preview camera, allowing the same renderer to render both views and drastically reducing the cost of showing the preview texture. We still use a separate canvas for the high-resolution photo, but because we save it directly from the data URL we don’t have to block the VR renderer while waiting for it to copy.
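The render-target approach boils down to rendering the scene from the camera prop’s point of view into a small WebGLRenderTarget, throttled to 30 Hz. A simplified sketch, assuming an existing `renderer`, `scene`, and a `previewCamera` that tracks the in-game camera:

```js
import * as THREE from 'three';

// Low-resolution target that doubles as the texture on the camera's screen mesh.
const previewTarget = new THREE.WebGLRenderTarget(150, 100);
const screenMaterial = new THREE.MeshBasicMaterial({ map: previewTarget.texture });

let lastPreviewTime = 0;
function updatePreview(renderer, scene, previewCamera, timeMs) {
  if (timeMs - lastPreviewTime < 1000 / 30) return; // cap the preview at 30 fps
  lastPreviewTime = timeMs;

  renderer.setRenderTarget(previewTarget);
  renderer.render(scene, previewCamera);
  renderer.setRenderTarget(null); // back to the main (XR) framebuffer
}
```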
Finally, the mechanism of taking a photo is a key part of the experience and provides a fun and interactive way for users to capture memories of their virtual garden. The Polaroid-style camera and printed photo are skeuomorphic in nature, resembling a real-life Polaroid camera and providing an intuitive experience for users. The animation and sound effects of the camera printing the photo add excitement and anticipation, making the process of taking a photo a memorable and engaging experience. The printed photo is also interactable, allowing the user to grab it and take a closer look, further enhancing the immersive experience.
Overall, the implementation of the in-experience camera feature in Project Flowerbed involved a combination of WebXR, web development technologies, and creative design to create a unique and immersive experience for users.
Hand Animations
We aimed to enhance the immersion and interaction of the experience by implementing hands in the virtual environment. To create natural and convincing hand animations based on limited controller inputs, we followed a unique approach of defining hand states and interpolating between them.
We grouped the fingers into three separate categories and defined states for each group individually, creating a divide-and-conquer strategy that simplified the authoring process and reduced the complexity of hand states. This approach allowed us to interpolate between hand states based on controller inputs, such as the trigger and grip, creating a more intuitive and convincing hand animation.
To create the overriding hand poses for different interaction modes, we carefully considered the user experience and designed animations that were specific to the object being held, such as the camera. This added level of detail helped to create a more immersive and convincing experience for the user.
Overall, the implementation of controller-based hands in Project Flowerbed involved a unique approach to defining hand states and interpolating between them based on controller inputs, and a creative approach to enhancing the immersion and interaction of the experience by creating overriding hand poses for different interaction modes.
NUX & Settings
We needed to have a 2D panel-based UI that could display text (and videos and images) to build out the new-user experience (NUX) and the settings menu. In a non-VR Three.js experience, this is most commonly implemented with the DOM by putting an HTML element on top of the Three.js canvas and rendering the UI there.
Unfortunately, that isn’t quite possible in WebXR, so we had to find an alternate method of displaying our UI. We looked into the possibility of rendering our UI to an HTML element and then copying it as a texture to a plane in WebXR, and we also contemplated using external canvases or images as a texture for a plane; both of these approaches suffered from blurry panels due to how pixel sampling in VR works.
Instead of trying to render the UI as 2D HTML or images, we opted instead to have all the pieces of the UI be 3D objects. To do that, we used three-mesh-ui to handle panel and text rendering, which has a robust API to control layout, insert images and videos into UI, and more.
We also wrote a small importer where we could define UI panels in JSON and then have it automatically create the three-mesh-ui objects based on the JSON configuration; this allowed us to create and modify panels and preview them in Project Flowerbed without needing a full rebuild and let us keep our UI definitions all in one place. The JSON importer can be found in src/js/components/UIPanelComponent, and the UI panel definitions in content/ui.
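To give a sense of what the importer produces, a hand-built three-mesh-ui panel looks something like this (the font files, sizes, and text are placeholders; Flowerbed generates the equivalent objects from its JSON definitions):

```js
import * as THREE from 'three';
import ThreeMeshUI from 'three-mesh-ui';

const scene = new THREE.Scene();

const panel = new ThreeMeshUI.Block({
  width: 1.2,
  height: 0.6,
  padding: 0.05,
  fontFamily: 'fonts/Roboto-msdf.json', // placeholder MSDF font files
  fontTexture: 'fonts/Roboto-msdf.png',
});

panel.add(new ThreeMeshUI.Text({ content: 'Welcome to the garden', fontSize: 0.07 }));
panel.position.set(0, 1.5, -2);
scene.add(panel);

// three-mesh-ui needs an explicit update each frame to lay out its blocks.
function onFrame() {
  ThreeMeshUI.update();
}
```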
Performance & Optimization
Performance is one of the biggest challenges of shipping XR experiences. In order to get a comfortable user experience, XR applications need to render two high-resolution views of the world at high framerates. You can learn more about this topic by checking out the documentation.
Target Framerate
The Meta Quest Browser supports the updateTargetFrameRate method so developers can change their target framerate. Framerate consistency is as important to user comfort as average framerate. Many applications don’t (or can’t) run consistently at 90 or more FPS, so the target framerate API allows developers to adjust the target framerate down to 72 FPS, where they can ensure a more consistent experience. Project Flowerbed’s target framerate is 72 frames per second—meaning we have just under 14 milliseconds to construct and render each frame.
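Requesting the lower target framerate is a single call on the XR session. A sketch, assuming a Three.js WebGLRenderer with XR enabled:

```js
// Once the immersive session has started, ask the browser to pace frames at 72 Hz.
const session = renderer.xr.getSession();
if (session && session.supportedFrameRates && session.supportedFrameRates.includes(72)) {
  session.updateTargetFrameRate(72).catch(() => {
    // The request can be rejected; in that case, keep running at the default rate.
  });
}
```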
General Optimization
At a high level, to get good performance in a WebXR app running on a mobile GPU, you need to minimize the number of WebGL calls you make each frame, carefully manage your shader complexity, minimize overdraw, and avoid any operations that require large texture resolves, such as postprocessing effects.
To minimize your WebGL calls, the first thing you want to do is optimize the content in your scene. Be sure you’re combining meshes where possible, and ensure that you aren’t binding unused textures or repeatedly setting uniforms that don’t have a noticeable impact on your scene quality. Be sure to use API features such as VAOs (vertex array objects), UBOs (uniform buffer objects), instanced drawing, and multidraw wherever possible.
Three.js has foveation on by default for all WebXR experiences. This significantly reduces the number of pixels that need to be rendered and gives an instant boost to any shader-limited application.
Multiview
On Meta Quest hardware, the OCULUS_multiview extension is exposed for web developers to use; this can almost halve the number of calls it takes to render a scene, so it can be one of the most impactful optimizations any WebXR app can make! Read more about implementing multiview on the web at /documentation/web/web-multiview/.
Flowerbed uses our own fork of Three.js, which has multiview implemented within three—we’re hoping to get this integrated into mainline Three.js for wider usage, but in the meantime we encourage developers to check out our OCULUS_multiview pull request to incorporate this into their own experiences.
Matrix Updates
The Flowerbed scene has a lot of objects in it. A decent number of these objects are static—all of the island and garden and any plants that aren’t currently being watered or growing are all stationary each frame. By default, Three.js automatically updates all the matrices of every object each frame—this is the safest way to do things because it ensures nothing is drawn with stale data. Because so much of our scene was static, this automatic update was wasting a lot of time doing nothing. To fix this, we turn off Three’s DefaultMatrixAutoUpdate (a static on the Object3D class) so nothing updates by default. Of course, once you do that, nothing updates, and everything breaks! So we manually turn on autoupdate on things that we know will update each frame (such as the user camera) and then manually update the matrices on things that occasionally update such as plants. We added an updateMatrixRecursively helper function in object3DUtils.js which you can search for to find locations where we manually update a matrix and its children.
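In code, the pattern boils down to something like the following sketch (the stand-in objects are illustrative; the real helper lives in object3DUtils.js):

```js
import * as THREE from 'three';

// Opt out of per-frame local matrix updates globally...
THREE.Object3D.DefaultMatrixAutoUpdate = false;

// ...opt back in for objects that genuinely move every frame...
const playerRig = new THREE.Group(); // stand-in for the player/camera rig
playerRig.matrixAutoUpdate = true;

// ...and push matrix updates by hand for things that move occasionally.
function updateMatrixRecursively(object) {
  object.traverse((node) => node.updateMatrix());
  object.updateMatrixWorld(true);
}

const grownPlant = new THREE.Group(); // stand-in for a plant that just grew
grownPlant.scale.setScalar(1.2);
updateMatrixRecursively(grownPlant);
```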
Instanced Meshes
We also use Instanced Meshes for a lot of our scene rendering—all of our plants, much of our fauna, and some of our environment (anything repeating) use instancing to render in a single draw call. Three.js’s instanced meshes are pretty simple out of the box—they don’t support frustum culling or LODs, and they are difficult to use with custom materials.
We used troika’s three-instanced-uniforms-mesh component to make it easier to build instanced custom materials. This package helps us define uniforms that automatically work with instanced mesh buffers, enabling us to build our plant scaling system and instanced morph target animation with little additional work.
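The package’s main convenience is setting a shader uniform per instance, which it turns into an instanced attribute behind the scenes. A rough sketch (geometry, material, and values are illustrative):

```js
import * as THREE from 'three';
import { InstancedUniformsMesh } from 'three-instanced-uniforms-mesh';

const geometry = new THREE.SphereGeometry(0.1);
const material = new THREE.MeshStandardMaterial();
const count = 200;

const mesh = new InstancedUniformsMesh(geometry, material, count);

const dummy = new THREE.Object3D();
for (let i = 0; i < count; i++) {
  dummy.position.set(i * 0.25, 0, 0);
  dummy.updateMatrix();
  mesh.setMatrixAt(i, dummy.matrix);

  // Per-instance uniform value; here we tint each instance differently.
  mesh.setUniformAt('diffuse', i, new THREE.Color().setHSL(i / count, 0.6, 0.5));
}
```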
To add frustum culling to our instanced meshes, we extended the Three.js instanced meshes into what we call a ManagedInstancedMesh. This class manages a list of instances of a different type and can update and reorder the instance buffers. It adds a little indirection to each instance update (code that uses it can’t directly address the instance matrix anymore), but it can frustum cull meshes and make it a little more efficient to add and remove new instances.
On top of the ManagedInstancedMesh, we implemented LODs for instanced meshes. The LODInstancedMesh class works much like standard Three.js LODs, but it creates a new ManagedInstancedMesh for each LOD and enables and disables instances between each of those based on which LOD is active. This didn’t turn out to be very performance intensive for our use case—if it had been, we were planning to quantize the LOD updates and only update them when the player moves a certain distance, but that turned out not to be required.
Shader Performance
The best way to manage shader performance is to be aggressive about cutting features that aren’t significantly contributing to your scene’s visual quality. This means avoiding any wasted work! We made sure we weren’t paying any overhead costs for features we weren’t using. For example, the standard pixel shader for Three.js was calculating light probe contributions even though we didn’t have probes in the scene.
We also turned off features we didn’t need based on each asset type—plants and the ground are set to maximum roughness and zero metallic, the ground under the water uses simplified lighting instead of the standard shader, and our sky uses the basic material. These kinds of optimizations are application specific—each developer should look at the visual contributions of the shaders they’re using and make a judgment call about what’s necessary.
Performance Tools
We used a variety of tools to measure performance and find bottlenecks.
We kept the OVR Metrics tool active almost the entire time we were developing Flowerbed—the overlay gives an instant and informative HUD that shows your framerate, GPU and CPU usage, and other key metrics. This is an easy way to move around the scene and check where performance hotspots are.
Once we found a hotspot or wanted to dig deeper into the overall performance, we needed to know if we were checking CPU or GPU performance.
For CPU performance, the remote Chrome dev tools performance profiler gave us all the information we needed—you can see your frame breakdowns and where your application is spending time in its JavaScript execution.
On the GPU, the RenderDoc Meta Fork is the most robust and useful tool to understand your GPU usage. RenderDoc can record an entire frame of WebGL commands and show you how your scene is being rendered and what the most expensive draw calls are.
Along with RenderDoc and the performance profiler, we also used Spector.js to quickly evaluate our WebGL usage—by pairing Spector.js with the immersive web emulator, you can capture scenes on a desktop browser and get a quick idea of your WebGL calls and usage in each frame.
In Conclusion
We hope you’ve gained some valuable insight into how Project Flowerbed was built and came away with some ideas of how to build your own great WebXR experiences! If you haven’t visited yet, be sure to check out the Flowerbed source code and assets on GitHub and stay tuned to our Twitter and Facebook pages for more updates, tools, and development tips.