Abstract
This disclosure presents a method for gesture-based virtual tool interaction on arbitrary surfaces within an Extended Reality (XR) environment. A head-worn system with egocentric cameras, supplemented by wearable sensors, tracks three-dimensional (3D) hand poses and coordinates using a neural network or machine learning model. Unlike prior contact-detection methods, this approach employs a dual-network architecture: a static network classifies specific hand grasps (e.g., tripod or wide grips) to instantiate and spatially anchor corresponding virtual tools (e.g., pencils or erasers) with six-degrees-of-freedom (6DoF) precision, while a second, temporal neural network analyzes sequences of micro-gestures for context-aware tool control, such as adjusting line width or color. By prioritizing gesture-driven tool selection over traditional menus, the system transforms any physical plane into a dynamic digital interface, reducing the need for specialized hardware or touch-sensitive surfaces.
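The dual-network dispatch described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the two "networks" are stand-in rule-based classifiers operating on hypothetical hand-pose features (`pinch_distance_mm`, `palm_spread_mm`), and the grasp-to-tool mapping is assumed for the example. In a real system, both classifiers would be trained models consuming 3D hand-pose keypoints from the egocentric cameras and wearable sensors.

```python
from collections import deque

# Assumed mapping from recognized grasp to instantiated virtual tool.
GRASP_TO_TOOL = {"tripod": "pencil", "wide": "eraser"}

def classify_grasp(pinch_distance_mm: float, palm_spread_mm: float) -> str:
    """Stand-in for the static network: label a single-frame hand pose."""
    if pinch_distance_mm < 15.0:
        return "tripod"
    if palm_spread_mm > 80.0:
        return "wide"
    return "none"

def classify_micro_gesture(pinch_history: deque) -> str:
    """Stand-in for the temporal network: inspect a short pose sequence."""
    if len(pinch_history) < 2:
        return "none"
    delta = pinch_history[-1] - pinch_history[0]
    if delta > 5.0:
        return "increase_line_width"
    if delta < -5.0:
        return "decrease_line_width"
    return "none"

class GestureToolController:
    """Runs both classifiers each frame: grasp -> tool instantiation,
    micro-gesture sequence -> context-aware control of the active tool."""

    def __init__(self, window: int = 10):
        self.active_tool = None
        self.line_width = 2.0
        self.history = deque(maxlen=window)  # sliding pose window

    def on_frame(self, pinch_distance_mm: float, palm_spread_mm: float):
        # Static path: a recognized grasp instantiates/anchors a tool.
        grasp = classify_grasp(pinch_distance_mm, palm_spread_mm)
        tool = GRASP_TO_TOOL.get(grasp)
        if tool and tool != self.active_tool:
            self.active_tool = tool
            self.history.clear()  # restart the temporal window per tool
        # Temporal path: micro-gestures adjust the active tool's parameters.
        self.history.append(pinch_distance_mm)
        if self.active_tool == "pencil":
            gesture = classify_micro_gesture(self.history)
            if gesture == "increase_line_width":
                self.line_width += 0.5
            elif gesture == "decrease_line_width":
                self.line_width = max(0.5, self.line_width - 0.5)
        return self.active_tool, self.line_width
```

Keeping the two classifiers separate mirrors the disclosure's design: the static network only needs a single frame to anchor a tool, while the temporal network accumulates a short pose sequence before emitting a control gesture.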
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Ahuja, Karan; Gonzalez, Eric Jordan; Gonzalez Franco, Mar; Cheng, Andrew Chunmye; and Liang Xu, Vasco Miguel, "Methods to Generate Virtual Interactions Based on Gestures", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/9429