Mar 26, 2018

Oculus SDK support for NVIDIA VRWorks Lens Matched Shading

James Hughes, Oculus Engineer

With PC SDK 1.19 we introduced native compositor support for NVIDIA VRWorks™ Lens Matched Shading (LMS). The algorithm speeds up ALU-hungry applications, making VR framerates more attainable.

LMS is a multi-resolution method introduced as part of the NVIDIA® Pascal™ generation of GPUs. The technique reduces GPU load by effectively lowering resolution in the periphery of rendered scenes. This reduction yields closer to one-to-one texel-per-panel-pixel mappings, making an LMS-rendered scene difficult to distinguish from a traditionally rendered scene. While LMS is beneficial, there are a couple of points developers should bear in mind when integrating it into their applications. These considerations are discussed later under “Usage Considerations”.

Oculus’ native LMS compositor support provides speed and quality improvements over using client-side LMS. Previously, client-side 'unwarping' needed to be performed on multiresolution LMS textures before being submitted to the runtime. Clients now have the option to submit LMS textures without modification. The compositor samples directly from the LMS texture when performing distortion, avoiding an extra sample and copy.

Technical Description

This section provides a high-level technical description of LMS. Further discussion and implementation details can be found in the “NVIDIA Multi-Projection SDK Programming Guide”, which is part of the VRWorks SDK.

Fig 1. Upper-left clip-space quadrant ‘flattening’ to rectilinear coordinates

LMS assigns W-warp factors to each clip-space quadrant resulting in a distinctive octagonal texture layout (Figure 1). Four viewports, one for each quadrant, are given different warp factors and scaled linearly based on those warp factors. Using scissoring, only a small portion of each viewport actually contributes to the final output (see Figure 4). To efficiently render to four viewports with one draw call NVIDIA exposes the ‘FastGS’ geometry shader.
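The per-quadrant W-warp described above can be sketched on the CPU as follows. This is a minimal illustration and not the VRWorks API: the `Clip`, `Ndc`, and `QuadrantWarp` types and the `warpedW` and `lmsDivide` functions are hypothetical names, and the sketch assumes the modified divide takes the form w' = w + A|x| + B|y|, with A and B selected by the sign of x and y.

```cpp
#include <cmath>

// Hypothetical types for illustration; not part of VRWorks or the Oculus SDK.
struct Clip { float x, y, z, w; };
struct Ndc  { float x, y, z; };
struct QuadrantWarp { float left, right, up, down; };

// Assumed form of the LMS divisor: w' = w + A*|x| + B*|y|, where the
// coefficients A and B depend on which clip-space quadrant the point is in.
float warpedW(const Clip& p, const QuadrantWarp& q) {
    const float a = (p.x < 0.0f) ? q.left : q.right;
    const float b = (p.y < 0.0f) ? q.down : q.up;
    return p.w + a * std::fabs(p.x) + b * std::fabs(p.y);
}

// Perspective divide that uses the warped W instead of the plain W.
Ndc lmsDivide(const Clip& p, const QuadrantWarp& q) {
    const float w = warpedW(p, q);
    return { p.x / w, p.y / w, p.z / w };
}
```

With all four warp factors set to 0.5, the upper-left corner (-1, 1) with w = 1 is divided by 2 and lands at (-0.5, 0.5), which is on the same line through the origin as the unwarped corner. A point at the center is left unchanged because both |x| and |y| are zero.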

Improved Panel Mappings

To highlight the difference between LMS and typical EyeFOV layers, the following video depicts a TimeWarped head pitch through 180 degrees with EyeFOV and LMS rendered side-by-side. The legend in the video has units of texels-per-panel-pixel (TPP).

One immediate result is that LMS appears to enjoy a wider area of 1 to 2 TPP. This indicates that an LMS texture is better mapped to the headset’s panel post-distortion as the following plot depicts:

Fig 2. Simulation data from TimeWarp head pitch

The vertical error was calculated based on the actual texture region sampled by the compositor. LMS’ error reduction can be attributed to better 1-1 mappings in the periphery. Depending on the LMS parameters used, error may increase at the lens center due to oversampling. This oversampling was encountered in the data presented above and can be seen as “crosshairs” in Figure 3 below. LMS oversampling is dependent on the parameters used, and applications can achieve near 1-1 mappings at the lens center and still gain the benefits of the Safe Zone (see the next section for the definition of Safe Zone).

Fig 3. Left, EyeFOV layer. Right, LMS

Figure 3 presents the most common TimeWarp pose for EyeFOV and LMS layers: aligned along the optical axis. The white boundary indicates the extents of a single 1344x1600 texture under the given sampling paradigm. Note how LMS’ extents more closely resemble the actual distortion sampling performed during composition.

In Figure 2’s plot, error was calculated only within these white texture extents. This explains the behavior of the plot in extreme cases where a smaller portion of the actual texture was sampled by distortion.

Area Savings

Fig 4. Upper-left quadrant diagram of right eye from Figure 1.

LMS’ performance characteristics are tied to the ‘Safe Zone’. The Safe Zone is the area inside the scissor rectangle that is not covered by triangles A and B in Figure 4. It represents the area the application need not render. $W_l$ and $W_u$ are the warping parameters for the upper-left quadrant: in this case, warp-left and warp-up.

To better understand the percent area savings LMS yields with specific warp factors, a derivation of a closed-form area savings equation based on Figure 4 is given below. One important property is that LMS transforms points on lines through the origin (the origin can be a bit tricky to define, but bear with us) to points on the same line. Any pre-division point on the line connecting the origin to the scissor corner and viewport corner (think unwarped Z plane for now) will remain on that same line after the warped W-division has been performed. This means we can calculate the location of C, as it falls on the diagonal line through the origin and the viewport’s corner.

To justify claims made below, the following paragraph presents a proof that LMS sends points on lines through the origin to points on the same line. While the statement is critical, the details of the proof are not. Let $a$ be any point represented as a vector from the origin. Let $l(t) = t\,\hat{u}$ be the parametric equation of the line through $a$ and the origin such that $t_a$ satisfies $l(t_a) = a$, where $\hat{u}$ is a unit vector. Per LMS, $b = a / (W + c_1 a_x + c_2 a_y)$ represents $a$ after the W-Warp transform, where $a_x$ and $a_y$ are the x, y clip-space coordinates of $a$. $W$ is the homogeneous coordinate for $a$ and $c_1$, $c_2$ are the warp factors defined by LMS. These $W$ and $c$ are all scalars fixed alongside the choice of $a$, so we assign the scalar $s = 1 / (W + c_1 a_x + c_2 a_y)$. Therefore, $b = s\,a = s\,t_a\,\hat{u} = l(s\,t_a)$. Therefore $b$ is on the same line through the origin as $a$, since we found a solution for $t$, namely $t = s\,t_a$.

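Let’s return to the problem of finding C in the diagram above. Consider the top-left viewport point $v_c$ pre-LMS. We know this point must be visible in the resulting LMS output and no points farther away from the origin can be visible. Since $v_c$ and C lie on the same line and represent corresponding maximal points along the line in each space, C must be the LMS image of $v_c$, that is, $v_c$ divided by $W_v + W_l c_x + W_u c_y$, where $W_v$ is $v_c$’s homogeneous coordinate and $c_x$ and $c_y$ are the x, y clip-space coordinates for $v_c$. Since each viewport is scaled linearly so that the warped quadrant edge lands on the scissor edge, the edge point with clip-space coordinates $(c_x, 0)$ maps to a distance $w$ from the origin. From the diagram, $C_x / w = (W_v + W_l c_x) / (W_v + W_l c_x + W_u c_y)$, which implies $C_x = w\,(W_v + W_l c_x) / (W_v + W_l c_x + W_u c_y)$. The logic is the same for $C_y = h\,(W_v + W_u c_y) / (W_v + W_l c_x + W_u c_y)$.

Therefore, the combined area of A and B is

$A + B = \tfrac{1}{2} w\,C_y + \tfrac{1}{2} h\,C_x = \frac{wh}{2} \cdot \frac{2 W_v + W_l c_x + W_u c_y}{W_v + W_l c_x + W_u c_y}$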

To calculate area savings we subtract the area of A and B from the area of the scissor rectangle, wh - (A+B), and we end up with:

$wh - (A + B) = wh \cdot \frac{W_l c_x + W_u c_y}{2\,(W_v + W_l c_x + W_u c_y)}$

There are two practical problems with using this equation: $W_v$ and the clip-space coordinates of $v_c$. However, we can simplify this equation by making the following assumptions. Assume the near-plane falls at NDC z=0 and restrict ourselves to a single z-plane parallel to the frustum’s near and far planes. This allows us to fix W. Furthermore, we choose the near-plane, which fixes W=1 and allows us to avoid changes to z under the LMS W-warp (see Usage Considerations). With these assumptions we can fully quantify LMS area savings on the near-plane. For LMS, quantifying the area savings on the near-plane is equivalent to quantifying area savings for any other z-slice through the frustum. (Note: To be fair, the diagram above assumes a flat plane, which is the same assumption we are making now. This assumption fails for every z-slice except the one we have chosen.)

Our assumption that $W = 1$ on the near-plane allows us to set $W_v = 1$. LMS operates on clip-space coordinates and $v_c$ is an extent of the viewport on the near-plane, so the absolute value of its $x$ and $y$ clip-space coordinates must be 1. For simplicity we assume $W_u$ and $W_l$ are both positive and therefore $c_x$ and $c_y$ are also positive (practically, this would correspond to the upper-right quadrant in clip-space). With $W_v = 1$ and $c_x = c_y = 1$, the area savings simplify to:

$wh - (A + B) = wh \cdot \frac{W_l + W_u}{2\,(1 + W_l + W_u)}$

The term on the right after wh is the percentage area savings of the LMS Safe Zone. You can apply this equation to any quadrant given appropriate warping parameters. The w and h variables are the appropriate ‘sizing parameters’ for the LMS quadrant (see ‘SizeLeft’, ‘SizeRight’, ‘SizeUp’, and ‘SizeDown’ in the Oculus CAPI ovrTextureLayoutOctilinear structure).
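As a sanity check on the derivation, the sketch below computes the Safe Zone savings for one quadrant two ways: geometrically, from the rendered quadrilateral with corner C, and from the closed form $(W_l + W_u) / (2(1 + W_l + W_u))$. It is a sketch under the near-plane assumptions ($W_v = 1$, $|c_x| = |c_y| = 1$) and further assumes each viewport is scaled so that the warped quadrant edges coincide with the scissor edges. All function and type names are hypothetical.

```cpp
#include <cmath>

struct Point2 { double x, y; };

// Corner C of the rendered region for a quadrant whose scissor rectangle is
// w by h, assuming W_v = 1 and unit clip-space extents (near-plane case).
Point2 warpedCorner(double w, double h, double warpX, double warpY) {
    const double d = 1.0 + warpX + warpY;
    return { w * (1.0 + warpX) / d, h * (1.0 + warpY) / d };
}

// Area of triangles A and B: the quad (0,0), (w,0), C, (0,h) split along
// the diagonal from the origin to C.
double renderedArea(double w, double h, double warpX, double warpY) {
    const Point2 c = warpedCorner(w, h, warpX, warpY);
    return 0.5 * w * c.y + 0.5 * h * c.x;
}

// Fraction of the scissor rectangle saved, measured geometrically.
double safeZoneFractionGeometric(double w, double h, double warpX, double warpY) {
    return (w * h - renderedArea(w, h, warpX, warpY)) / (w * h);
}

// Closed-form fraction under the same assumptions.
double safeZoneFractionClosedForm(double warpX, double warpY) {
    return (warpX + warpY) / (2.0 * (1.0 + warpX + warpY));
}
```

For example, with both warp factors equal to 0.5, both functions give 0.25, meaning a quarter of the quadrant's scissor rectangle falls in the Safe Zone. The fraction does not depend on w and h, which only scale the absolute area.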

Performance

A “Safe Zone” (see above) must be used to realize LMS’ performance advantages. The Safe Zone is actualized through the use of early-z depth rejection or stencil buffers. When using the Safe Zone only, we’ve seen performance improvements as high as 23% for ALU-bound applications. In contrast, we have seen a performance hit for applications that do not tax the ALU. When deciding whether to use LMS, an application developer should understand whether their application’s performance characteristics align with the strengths of LMS.
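To make the Safe Zone concrete, here is a minimal CPU-side sketch of the per-pixel test an application might use when building a stencil or depth mask for one quadrant. It is an illustration only and not how VRWorks or the Oculus runtime implements it. The names are hypothetical, the pixel is assumed to lie inside the scissor rectangle, and coordinates are measured from the quadrant's origin with both axes pointing toward the scissor corner. C is the warped viewport corner from Figure 4.

```cpp
struct Point2 { double x, y; };

// z-component of (b - a) x (p - a); non-negative when p is on or to the
// left of the directed edge a -> b.
double cross(Point2 a, Point2 b, Point2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// True when pixel p lies in the Safe Zone of a quadrant with scissor size
// w by h, i.e. outside the rendered quad (0,0), (w,0), C, (0,h).
// Only the two slanted edges need testing for pixels inside the scissor.
bool inSafeZone(Point2 p, double w, double h, Point2 c) {
    const bool insideRendered =
        cross({w, 0.0}, c, p) >= 0.0 &&  // edge from (w,0) to C
        cross(c, {0.0, h}, p) >= 0.0;    // edge from C to (0,h)
    return !insideRendered;
}
```

Pixels for which `inSafeZone` returns true could be pre-marked in the stencil buffer or written at the near depth so they are rejected before shading. Which mechanism is cheaper depends on the renderer.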

As we saw above, applications may be able to realize bandwidth savings by decreasing the size of the eye buffers without appreciably harming texel-per-panel-pixel mappings. This may result in oversampling near the lens center depending on LMS parameters used.

Oculus’ native LMS support also provides a performance boost and a slight quality improvement. LMS textures are accepted as-is and sampled directly by the distortion renderer avoiding an extra copy and unwarping step.

Oculus CAPI

The Oculus runtime supports LMS textures natively through the octilinear multiresolution CAPI extension. At a high level, here are the general steps to follow:

Enable the ovrExtension_TextureLayout_Octilinear extension.

Render your scene using LMS.

Fill out the ovrTextureLayoutOctilinear structure for both eyes.

Populate the ovrLayerEyeFovMultires layer description and submit it to the runtime. Be sure to set TextureLayout to ovrTextureLayout_Octilinear.


C++ Code

...
session = ovr_Create(...)
...
ovr_EnableExtension(session, ovrExtension_TextureLayout_Octilinear); // Call only once.
...
/* Render LMS scene */
...
/* Describe the octilinear layout of the left eye texture */
ovrTextureLayoutDesc_Union layout;
layout.Octilinear[ovrEye_Left].SizeLeft = ...;
layout.Octilinear[ovrEye_Left].SizeRight = ...;
layout.Octilinear[ovrEye_Left].SizeUp = ...;
layout.Octilinear[ovrEye_Left].SizeDown = ...;
layout.Octilinear[ovrEye_Left].WarpLeft = ...;
...
/* Then set appropriate octilinear parameters for the right eye */
...
ovrLayerEyeFovMultires mr;
/* Set appropriate layer info */
mr.TextureLayout = ovrTextureLayout_Octilinear;
mr.TextureLayoutDesc = layout;
...
ovr_EndFrame(...)
...

Usage Considerations

As is to be expected, the advantages of LMS come with a couple of caveats. It’s recommended to use a projection matrix which has a far plane at infinity. Practically, this goes beyond a recommendation to a necessity in real-world applications. LMS Z-precision loss is very apparent when using standard projection matrices with a finite far plane. To understand this precision loss, here is a comparison between the NDC coordinates using a standard W-divide and an LMS W-divide:

Fig 5. Planes of the same size rendered from near to far plane. On the left, standard W-Division. On the right, LMS W-Division. Near plane (Z=0) aligns with red (X) and green (Y) lines in both representations. DirectX NDC, with Z in [0,1]

Note LMS’ progressive warping in the Z-dimension. At Z=0, the near-plane in our example, there is only the expected octagonal X-Y warping. Z-warping magnifies as the absolute value of Z grows. Z-warping also increases for points further away from the origin of the plane under consideration (in the periphery of the scene). This warping strongly biases points in the periphery towards the near-plane. Periphery near-plane pull-in is potentially one cause of LMS precision-loss. Additionally, this pull-in has post-processing implications.
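The pull-in toward the near-plane can be illustrated numerically. The sketch below is a simplified model under stated assumptions and not VRWorks code: it assumes a 90-degree projection with an infinite far plane and DirectX-style depth (NDC z = 1 - near/z), and it assumes the LMS divisor has the form w + A|x| + B|y|. All names are hypothetical.

```cpp
#include <cmath>

struct Clip { float x, y, z, w; };

// Hypothetical 90-degree, infinite-far-plane projection with depth in [0,1]:
// z_clip / w = (z_view - near) / z_view = 1 - near / z_view.
Clip projectInfiniteFar(float xView, float yView, float zView, float nearZ) {
    return { xView, yView, zView - nearZ, zView };
}

// NDC depth with the standard W-divide.
float standardDepth(const Clip& p) {
    return p.z / p.w;
}

// NDC depth with the assumed LMS divisor w' = w + A*|x| + B*|y|.
float lmsDepth(const Clip& p, float warpX, float warpY) {
    return p.z / (p.w + warpX * std::fabs(p.x) + warpY * std::fabs(p.y));
}
```

Under this model a point on the optical axis gets the same depth from both divides, while a point at the same view-space distance in the corner of the frustum has its depth scaled down by 1 / (1 + A + B). Points on the near-plane have z_clip = 0 and are unaffected, which matches the behavior described for Figure 5.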

When post-processing, the warping of X, Y, and Z must be properly accounted for. NVIDIA has in-depth documentation on how to account for this warping in their VRWorks SDK.
