Since September, I’ve worked hard with the software team at
Oculus VR on the SDK for the Rift. Tracking head orientation with
as little latency and error as possible was a key challenge to
making it work well. From a math and engineering perspective, it is
an old problem. People have wanted to track the orientation of sea
vessels and land vehicles for millennia. Over the past century,
clever sensing systems have been developed to track aircraft,
spaceships, missiles, robots, VR headsets, and smart phones. I’ve
spent many years as a robotics professor, thinking
hard about localizing robots and a host of other interesting
problems that mix sensors, motors, and computers. I jumped at the
chance to make a tremendous impact through available source code and
a developer community that is supercharged about VR gaming. I am
thrilled to be at Oculus!
In the rest of this post, I will explain how our tracking method
works, what challenges we faced, and why various design decisions
were made. One overriding theme throughout our development has been
to keep the method simple so that it is easier to understand its
behavior, to optimize its performance, and to make future
enhancements. We could approach the problem using standard
sledgehammers, such as the Kalman filter [4]
or particle filters
[1], but these require significant modeling assumptions and
adjustment to reach their theoretical benefits. For example, the
Kalman filter is the optimal estimator for linear systems with
linear measurements and Gaussian noise, but its performance
outside of that range is not guaranteed. Furthermore, the method is
appropriate for systems with a lower sampling rate and a high
degree of predictability due to a stronger motion model. Particle
filters are more suited for problems in which the world state is
enormous, which might include, for example, models of the
surrounding obstacles (think about robots mapping their
environment). Another alternative, which arises from classical
linear filtering theory, is the complementary filter, which
combines high-pass filtering of gyroscope data with low-pass
filtering of accelerometer data. For a comparison of these
approaches to Kalman filters, see [3]. Our approach is similar in
spirit to complementary filtering, but due to the power of modern
computers, we can run algorithms in each iteration that are
specific to our tracking problem.
Integrating gyroscope readings
The gyroscope inside of the Rift measures the head’s orientation
change at a rate of 1000 times a second. The software needs to
compute the current head orientation, given the previously “known”
head orientation and the latest gyroscope measurement. Imagine
trying to figure out how far a car has traveled by reading the
speedometer. The update would look like this:
Current distance = Previous distance + Time difference × Observed speed.
As the time difference shrinks to zero, the formula gives the
exact distance, assuming the speedometer is 100% accurate. In
reality, the time difference is not zero and the speedometer is
imperfect, causing drift error, which will be discussed
later.
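To make the analogy concrete, here is a minimal sketch of that update in C++; the function name and the explicit time-step parameter are just for illustration, not taken from the SDK.

```cpp
// Dead-reckoning update from the speedometer analogy (illustrative only).
// dt is the time difference between readings, in seconds.
double updateDistance(double previousDistance, double observedSpeed, double dt)
{
    // Current distance = Previous distance + Time difference * Observed speed
    return previousDistance + dt * observedSpeed;
}
```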
Now turn to the problem of tracking a human head, which has
three rotational degrees of freedom. The orientation of a 3D rigid
body is often described by yaw, pitch, and
roll angles. They are convenient for making figures, but they
later cause a lot of trouble due to
numerical singularities (see gimbal lock) and a
huge variety of alternative, incompatible definitions (see
Euler angles, pronounced by Americans as "oiler angles"). We therefore
use quaternions
internally for representing orientation.
Suppose that the head is rotating about the z axis only, observed by the
sensor to be an angular velocity of ω_z radians per second.
Assuming 1000 sensor readings per second, an angular version of the
previous update formula is:
Current orientation = Previous orientation + 0.001 ω_z.
This is exactly how the update works in the Rift SDK, but it is
slightly more complicated so that it handles any combination of
yaw, pitch, and roll. The gyroscope provides angular velocity with
respect to all three of these, producing a 3D vector
(ω_x, ω_y, ω_z).
It is known from mathematics that every 3D
orientation can be nicely described by a rotation of θ degrees about some
axis poking through the origin. The rotating head can then be
thought of as a spinning top that keeps changing speed and axis.
Amazingly, (ω_x, ω_y, ω_z) is exactly the
rotation axis (though you might want to normalize it). Furthermore,
its length, √(ω_x² + ω_y² + ω_z²), is the angular speed of rotation
about that axis. So, the update equation is simply
Current quaternion = Previous quaternion * Quat(axis, angle),
in which Quat(axis, angle) means a unit quaternion that
represents rotation by angle about the given axis.
Unit quaternions are used because it is easy to convert them to and
from the axis-angle description, and their multiplication operation
combines orientations in a way that is equivalent to multiplying
out their corresponding 3 by 3 rotation matrices. We also avoid
numerical singularity issues associated with yaw, pitch, and roll
angles.
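As a rough sketch of this update (not the SDK's actual code), the axis-angle conversion and quaternion product can be written as follows; the Quat type and helper functions here are assumed purely for illustration.

```cpp
#include <cmath>

// Minimal quaternion type for illustration; the SDK has its own math classes.
struct Quat {
    double w, x, y, z;
    Quat operator*(const Quat& q) const {             // Hamilton product
        return { w*q.w - x*q.x - y*q.y - z*q.z,
                 w*q.x + x*q.w + y*q.z - z*q.y,
                 w*q.y - x*q.z + y*q.w + z*q.x,
                 w*q.z + x*q.y - y*q.x + z*q.w };
    }
};

// Quat(axis, angle): unit quaternion for a rotation by 'angle' about a unit axis.
Quat fromAxisAngle(double ax, double ay, double az, double angle) {
    double s = std::sin(angle / 2.0);
    return { std::cos(angle / 2.0), ax * s, ay * s, az * s };
}

// One integration step: dt is about 0.001 s at 1000 readings per second.
Quat integrateGyro(const Quat& previous,
                   double wx, double wy, double wz, double dt) {
    double speed = std::sqrt(wx*wx + wy*wy + wz*wz);   // angular speed (rad/s)
    if (speed < 1e-9) return previous;                 // no measurable rotation
    double angle = speed * dt;                         // rotation over this step
    Quat dq = fromAxisAngle(wx / speed, wy / speed, wz / speed, angle);
    return previous * dq;  // Current quaternion = Previous quaternion * Quat(axis, angle)
}
```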
This method is simple and fairly accurate. It relies on some
prefiltering of gyroscope readings, which occurs in hardware. More
complicated numerical integration formulas could be tried (the one
above is called Euler integration;
see 4th-order
Runge-Kutta for a better alternative), but we did not find any
need when operating at 1000 measurements per second with
prefiltered data. Some predictive filtering is also applied, which
I plan to talk about in a later post.
Drift correction
After many thousands of updates, the true orientation will drift
away from the calculated orientation. Therefore, other sensors are
needed to bring the orientation back into correct alignment. Drift
in the pitch and roll angles is called tilt error, which
corresponds to confusion about which way is up. Drift in the yaw
angle is called yaw error, which is confusion about which
way is North, or at least which way you are facing relative to when
you started. The discussion of yaw error correction is planned for
a future post due to the complications of using a magnetometer;
some of the ideas below, however, apply to that case as well.
To handle tilt error, let’s think about what “up” actually
means. Our perception of “up” is based entirely on gravity. It is
in the direction of a ray that starts at the center of the Earth
and pokes through your body. We have been taught that the
acceleration due to gravity is 9.81 m/s², but it actually varies by
up to half a percent depending on your location on the Earth
(you are actually lighter at the equator—imagine being on the edge
of a huge merry-go-round!).
Gravity is expressed as an acceleration vector, so it seems
natural to use an accelerometer to measure it. While
standing on the Earth, it is as if we are riding on a rocket that
is constantly accelerating upward, which is why we are stuck to the
ground. A three-axis accelerometer measures this vector, but it
unfortunately measures any additional accelerations of the sensor.
When placed in the Rift, it measures the linear accelerations due
to head motions, in addition to gravity. To handle this, we want to
have high confidence that the gravity vector is being measured in
isolation. Because the drift error grows slowly, we wait for two
simple conditions to be met over a few tens of milliseconds:
1. The length of the observed acceleration is reasonably close to 9.8 m/s².
2. The gyroscope reports that the rotation is very slow.
In addition, all accelerometer readings are filtered by a simple
moving average. If the conditions are met, then it is assumed that the
accelerometer is correctly reporting the direction of "up".
Although a standard method [2], it is clearly flawed because you
can accelerate the sensor downward to cancel off part of gravity,
while laterally accelerating to bring the magnitude back up to 9.8.
Nevertheless, it is simple and works well enough for us.
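A rough sketch of that acceptance test appears below; the thresholds and the assumption that the readings have already been smoothed by the moving average are placeholders, not values taken from the SDK.

```cpp
#include <cmath>

// Illustrative thresholds; the actual SDK values are not stated here.
const double GRAVITY          = 9.8;   // expected magnitude of gravity (m/s^2)
const double ACCEL_TOLERANCE  = 0.4;   // how close |a| must be to gravity (m/s^2)
const double GYRO_QUIET_LIMIT = 0.1;   // "very slow" rotation threshold (rad/s)

// Returns true when the (moving-averaged) accelerometer reading can be
// trusted as a measurement of "up": its magnitude is close to gravity and
// the head is rotating slowly. In practice this must hold over a few tens
// of milliseconds before a tilt correction is considered.
bool gravityMeasurementValid(double ax, double ay, double az,
                             double wx, double wy, double wz) {
    double accelMagnitude = std::sqrt(ax*ax + ay*ay + az*az);
    double gyroMagnitude  = std::sqrt(wx*wx + wy*wy + wz*wz);
    return std::abs(accelMagnitude - GRAVITY) < ACCEL_TOLERANCE
        && gyroMagnitude < GYRO_QUIET_LIMIT;
}
```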
Now suppose that an error angle φ has been detected between
what is currently believed to be "up" and the acceleration vector
â = (a_x, a_y, a_z) measured by the
accelerometer. The sensor fusion system then needs to apply a
corrective rotation. The angle is φ, but what is the rotation
axis? It must lie in the horizontal (XZ) plane and be perpendicular to
both â and the y axis.
Simply project â into the horizontal
plane to obtain (a_x, 0, a_z). A
perpendicular vector that remains in the horizontal plane is
(a_z, 0, -a_x), which is
the tilt axis. Imagine grabbing on to the tilt axis and
twisting to bring â back into alignment
with the y axis.
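Here is a sketch of that construction, assuming the convention above in which y points up; the names and structure are illustrative rather than the SDK's.

```cpp
#include <cmath>

// Given a trusted accelerometer reading (a_x, a_y, a_z), with y as "up",
// compute the tilt axis (in the XZ plane) and the tilt error angle phi.
struct Tilt { double axisX, axisZ, angle; };

Tilt computeTilt(double ax, double ay, double az) {
    Tilt t = { 1.0, 0.0, 0.0 };                      // default: no tilt error
    double horizontal = std::sqrt(ax*ax + az*az);    // length of (a_x, 0, a_z)
    if (horizontal < 1e-9) return t;                 // already aligned with y
    // A vector perpendicular to (a_x, 0, a_z) that stays in the XZ plane.
    t.axisX =  az / horizontal;
    t.axisZ = -ax / horizontal;
    // phi: angle between the measured vector and the y axis.
    double magnitude = std::sqrt(ax*ax + ay*ay + az*az);
    t.angle = std::acos(ay / magnitude);
    return t;
}
```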
Once tilt error is detected, the remaining issues are
when to perform the corrective rotation and how much.
This is actually an ongoing research topic, to which developers
working with the Rift may bring new insights. If a player notices
the tilt correction while staring in one direction, then the effect
could be nauseating. On the other hand, if they turn their head
quickly, perhaps all of the needed corrections can be performed
without them even noticing.
We currently take the following approach. In the first few
seconds after the Rift is turned on, if there is a huge tilt error,
then we rotate by the entire angle φ. A common situation is
that the Rift could be sitting on your lap or on a table upon
startup. At this point, a large correction needs to be made.
Otherwise, a tiny correction is applied in each cycle of the sensor
fusion. The rotation axis may frequently change while corrections
are being performed. When the system knows that tilt correction
needs to be performed, a critical issue is to perform it at a time
and rate that the player will tend not to notice. We experimented
with several alternatives, but it seems to be a matter of personal
preference.
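A minimal sketch of this policy, reusing the illustrative helpers from the earlier sketches, might look like the following; the per-cycle step size is a placeholder rather than a tuned value.

```cpp
#include <algorithm>

// Apply either a full or a tiny corrective rotation about the tilt axis.
// Quat, fromAxisAngle, and Tilt are the illustrative helpers defined above;
// the 0.0001-radian per-cycle step is a placeholder, not a tuned SDK value.
Quat correctTilt(const Quat& current, const Tilt& tilt, bool justStarted) {
    // Remove the full error right after startup; otherwise nudge a tiny
    // amount each sensor-fusion cycle so the player tends not to notice.
    double step = justStarted ? tilt.angle : std::min(tilt.angle, 0.0001);
    Quat correction = fromAxisAngle(tilt.axisX, 0.0, tilt.axisZ, step);
    return correction * current;   // rotate the estimate back toward "up"
}
```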
Can you think of a better way to handle sensor fusion? Hack it
up and give it a try!
References
[1] Doucet, A., De Freitas, N., and Gordon, N.J., Sequential
Monte Carlo Methods in Practice. Springer, 2001.
[2] Favre, J., Jolles, B.M., Siegrist, O., and Aminian, K.,
Quaternion-based fusion of gyroscopes and accelerometers to improve
3D angle measurement, Electronics Letters, Volume 42, Issue 11, pp.
612-614, 2006.
[3] Higgins, W. T., A Comparison of Complementary and Kalman
Filtering, IEEE Transactions on Aerospace and Electronic Systems,
Volume 11, Issue 3, pp. 321-325, 1975.
[4] Stengel, R. F., Optimal Control and Estimation, Dover,
1986.