Design

Grab

Updated: Mar 13, 2026
Explore how to hold on to interactables, whether they are near or far. These guidelines can help ensure an empowering experience for your users.

Grab interaction using pinch and palm grab gestures to hold virtual objects.

Usage

Grabbing is the most fundamental way to manipulate the virtual world: it provides a direct spatial link between the user and an interactable, allowing users to intuitively pick up, move, and rotate 3D content.

Anatomy

These are the different parts, characteristics, and frequently used terminologies that you should be familiar with:
Image of raycasting anatomy and parts.
Indirect targeting (conecasting)
For indirect grab: A cone-shaped projection from the input modality to the target. Depending on the design, a visible ray or line can be shown, such as for Hand and Controller input. There are two cone sizes: a narrow one for acquiring hover and a wider one for unhovering, which prevents flickering at the cone's edge.
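The cone test behind indirect targeting can be sketched as follows. This is a minimal sketch, assuming `direction` is a unit vector; the function name and angle values are illustrative, not part of any SDK.

```python
import math

def in_cone(origin, direction, target, half_angle_deg):
    """Return True if `target` lies inside the selection cone.

    A narrow angle can be used to acquire hover and a wider one to
    release it, which avoids flicker at the cone's edge."""
    v = [t - o for t, o in zip(target, origin)]
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return True  # target sits exactly at the cone's origin
    cos_angle = sum(d * c for d, c in zip(direction, v)) / norm
    return cos_angle >= math.cos(math.radians(half_angle_deg))
```

For hysteresis, such a test would be called with the narrow angle while unhovered and the wide angle while hovered.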
Direct targeting (collider)
For direct grab: A collider placed at the grabbing point to detect intersecting interactables. It is important that the collider is positioned precisely (for example, between the pinching fingers) and has an appropriate radius.
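Direct targeting reduces to an overlap test between the grab-point collider and an interactable's bounds. A minimal sketch, modeling both as spheres; the radii are illustrative assumptions.

```python
def collider_overlaps(grab_point, grab_radius, obj_center, obj_radius):
    # Sphere-sphere overlap test: the direct grab can begin when the
    # collider at the pinch point intersects the interactable's
    # bounding sphere. Radii here are illustrative assumptions.
    d_sq = sum((a - b) ** 2 for a, b in zip(grab_point, obj_center))
    reach = grab_radius + obj_radius
    return d_sq <= reach * reach
```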
Hover effect
Visual effects can be applied either to the object itself or to the input method. An object effect might involve a hover state that causes the object to glow when targeted. In contrast, an input effect could highlight the fingers on the input hand that are about to perform a grab action.
Selection
For selecting interactables, the following input modalities can be used: Hands and Controllers.
Indirect interaction
An interaction pattern where the user engages with an object from a distance rather than through direct contact.
Direct interaction
An interaction pattern where the user engages with an interactable through physical contact within their immediate reach, rather than using a secondary pointer or ray.

Variants

These variants define how a user maintains a hold and how the interactable responds to hand movement.

Input modalities

There are two primary ways to initiate and maintain a grab, depending on the hardware being used:
  • Hands: This utilizes hand tracking to calculate if the hand is performing different grip postures (for example, distance between the index and thumb tips for pinching, or curl value of the fingers for palm grab).
  • Controllers: Grabs are triggered by pressing actuators in the controller that simulate a hand grip, such as the index trigger for smaller objects or the middle-finger grip button for bigger objects.
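The hand-tracking heuristic above, using the thumb-to-index distance for pinch, could be sketched as follows. The thresholds are illustrative assumptions, not SDK values; two thresholds are used so the pinch state does not flicker near a single cutoff.

```python
import math

PINCH_START_DIST = 0.02  # meters: closer than this starts a pinch (assumed)
PINCH_END_DIST = 0.04    # meters: farther than this ends it (hysteresis)

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def update_pinch(thumb_tip, index_tip, was_pinching):
    """Return whether the hand is pinching this frame.

    Using separate start/end thresholds prevents rapid toggling when
    the fingertips hover near a single cutoff distance."""
    d = distance(thumb_tip, index_tip)
    if was_pinching:
        return d < PINCH_END_DIST
    return d < PINCH_START_DIST
```

A palm grab could be detected analogously from the average curl value of the fingers.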

Grab mechanics: Pinch vs. palm

Depending on the size and weight of the interactable, you can define which gesture is required:
  • Pinch grab: Uses the thumb and index finger. Best for small, precise objects like a key, a pen, or a UI handle.
  • Palm (power) grab: Uses the entire hand or fist. Best for larger or “heavier” objects like a sword, a basketball, or a flashlight.
Note: We recommend allowing “gesture switching,” where a user can start with a pinch and transition to a palm grab without dropping the interactable.
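Gesture switching can be approximated by using lower "keep" thresholds while an object is held, so a brief transition between pinch and palm does not drop it. This is a minimal sketch; the strength values and thresholds are illustrative assumptions.

```python
def update_hold(current_grip, pinch_strength, palm_curl):
    """Return the active grip ("pinch", "palm", or None for released).

    While already holding, lower thresholds give a grace window so a
    pinch-to-palm transition does not drop the interactable."""
    start, keep = 0.8, 0.4  # assumed normalized strength thresholds
    threshold = keep if current_grip else start
    if palm_curl > threshold:
        return "palm"
    if pinch_strength > threshold:
        return "pinch"
    return None  # neither gesture active: release
```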

Kinematic vs. physics-based

This variant defines how the held object reacts to the environment:
  • Kinematic (1:1): The interactable follows the hand exactly and can pass through other virtual objects. This is standard for UI panels and simple tools.
  • Physics-based (heavy): The interactable has mass and velocity. It can collide with the environment and may feel “heavy” if the user moves their hand faster than the physics engine can update the object’s position.
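The difference between the two variants can be sketched as a per-frame position update. This is a simplified sketch: the speed cap stands in for a full physics engine, and `max_speed` is an illustrative assumption.

```python
import math

def kinematic_update(hand_pos, obj_pos, dt):
    # Kinematic (1:1): the object snaps to the hand every frame.
    return hand_pos

def physics_update(hand_pos, obj_pos, dt, max_speed=3.0):
    # Physics-based "heavy" feel: move toward the hand, but cap the
    # speed so fast hand motion makes the object visibly lag behind.
    delta = [h - o for h, o in zip(hand_pos, obj_pos)]
    dist = math.sqrt(sum(c * c for c in delta))
    if dist == 0.0:
        return obj_pos
    step = min(dist, max_speed * dt)
    return tuple(o + c / dist * step for o, c in zip(obj_pos, delta))
```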

Types

There are different mechanisms for determining when a user intends to grab an interactable. You can mix and match these components to define the specific behavior and “feel” of the grab.

Hovering mechanism

The initial state when a user’s hand or controller is targeting an interactable but has not yet engaged it.
  • Direct / Indirect: Feedback triggers based on either proximity to the grabbing point or cone intersection.
  • Continuous feedback: Visual or haptic signals that persist as long as the hand remains within the interactable zone and intensify as the selection gesture progresses.
  • Disambiguation: The system’s ability to identify which object the user intends to grab when multiple interactables are close together.
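A simple form of disambiguation is to score every nearby candidate and pick the closest one within reach. This is a minimal sketch, assuming distance is the only scoring criterion; real systems may also weight cone alignment or recency, and `max_radius` is an illustrative assumption.

```python
import math

def pick_target(grab_point, candidates, max_radius=0.1):
    """Choose the interactable the user most likely intends to grab.

    `candidates` is a list of (name, position) pairs; the nearest one
    within `max_radius` meters wins, or None if nothing is in reach."""
    best, best_d = None, max_radius
    for name, pos in candidates:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(grab_point, pos)))
        if d < best_d:
            best, best_d = name, d
    return best
```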

Selection mechanism

The specific interaction and logic used to confirm the grab.
  • Input methods: Supports pinch, palm (power grab), multi-finger pinch, or controller trigger.
  • Advanced logic: Includes airgrab, velocity-based, or friction-based grabbing to assist the user in making a successful connection.

Align

Defines how the interactable and the hand reposition themselves relative to one another at the moment of the grab.


  • Object to hand: The interactable snaps or moves to the hand’s current position.
  • Hand to object: The virtual hand representation moves to the interactable’s position to maintain its world-space placement.
  • Pose adoption: The virtual hand automatically adopts a specific hand grab pose or aligns with a surface for a more realistic visual.
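The first two alignment options can be sketched as a choice of grab offset, captured once at the moment of the grab and applied every frame afterward. The mode names here are taken from the bullets above; the positions are simplified to translation only.

```python
def begin_grab(hand_pos, obj_pos, mode):
    """Compute the hand-to-object offset at the moment of the grab.

    "object_to_hand": zero offset, so the object snaps to the hand.
    "hand_to_object": store the current offset, so the object keeps
    its world-space placement while following the hand."""
    if mode == "object_to_hand":
        return (0.0, 0.0, 0.0)
    return tuple(o - h for o, h in zip(obj_pos, hand_pos))

def follow(hand_pos, offset):
    # Apply the stored offset every frame while the grab is held.
    return tuple(h + d for h, d in zip(hand_pos, offset))
```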

Manipulate

The behavior of the interactable while it is being held and moved through space.


  • Move 1:1: The object follows the hand’s translation and rotation exactly.
  • Physics-based: Adds “weight” to the object, where it may lag slightly or react to collisions with the environment (heavy grab).
  • Constraints: Limits movement to specific axes, such as pulling a lever or sliding a drawer.
  • Use while holding: The ability to trigger a secondary action while maintaining a grab. For example, using a palm grab to hold a tool and a pinch gesture to activate its primary function.
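The constraint case, such as a sliding drawer, can be sketched by projecting the hand's movement onto a single axis and clamping the travel. This is a minimal sketch; `axis` is assumed to be a unit vector, and the travel range is an illustrative assumption.

```python
def constrain_to_axis(grab_start, hand_pos, axis, min_t=0.0, max_t=0.4):
    """Limit movement to one axis, e.g. pulling a drawer.

    Projects the hand's displacement since the grab onto `axis`,
    clamps it to the drawer's travel range (meters, assumed), and
    returns the constrained object position."""
    delta = [h - s for h, s in zip(hand_pos, grab_start)]
    t = sum(d * a for d, a in zip(delta, axis))
    t = max(min_t, min(max_t, t))
    return tuple(s + a * t for s, a in zip(grab_start, axis))
```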

Release

The logic that determines how an interactable behaves once the user lets go.
  • Throwing: The object inherits the hand’s velocity and direction at the point of release, allowing for natural projectile motion.
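Inheriting the hand's velocity at release is typically done by averaging over the last few tracked frames rather than a single frame, which is noisy. A minimal sketch under that assumption; the window size and uniform averaging are illustrative, and real implementations often weight recent frames more heavily.

```python
from collections import deque

class ThrowTracker:
    """Estimate release velocity from recent hand position samples."""

    def __init__(self, window=5):
        # Keep only the most recent (position, timestamp) samples.
        self.samples = deque(maxlen=window)

    def record(self, pos, t):
        self.samples.append((pos, t))

    def release_velocity(self):
        """Average velocity across the sample window, applied to the
        object at the moment of release."""
        if len(self.samples) < 2:
            return (0.0, 0.0, 0.0)
        (p0, t0), (p1, t1) = self.samples[0], self.samples[-1]
        dt = t1 - t0
        if dt <= 0.0:
            return (0.0, 0.0, 0.0)
        return tuple((b - a) / dt for a, b in zip(p0, p1))
```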

Next steps

Designing experiences

  • Input mappings: Understand how one-point and two-point manipulation work.
  • Input hierarchy: Learn about the hand gesture hierarchy including pinch and palm grab.
  • Ray casting: Explore indirect interaction through ray casting.
  • Touch: Explore direct interaction through touch.
  • Hands: Learn about hands as an input modality for grab.
  • Controllers: Learn about controllers as an input modality for grab.

Developing experiences

Meta Spatial SDK

Unity

Unreal
