Abstract
This disclosure describes techniques to automatically detect annotations, such as freeform sketches and text captions, that users provide during online video conferences, and to generate formalized representations of those annotations for in-meeting display and subsequent retrieval. The techniques can be implemented for video conferences, including those in which participants join via extended reality (XR) devices. They enable users to create high-fidelity sketches and annotations while participating in a video conference (virtual meeting). With user permission, an artificial intelligence (AI) module detects user speech, a targeted first-person view, and user sketches provided during a virtual meeting. The detected information is used to generate suitable prompts for a machine learning (ML) model, and these prompts direct an image generation model to generate corresponding formalized illustrations. The formalized illustrations maintain their spatial relationships with respect to the first-person view of the virtual meeting. They are displayed during the XR meeting and, with user permission, saved for subsequent uses such as retrieval and inclusion in the meeting transcript.
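As a rough illustration of the prompt-generation step described above, the following Python sketch combines detected speech, a sketch description, and a first-person-view anchor into a request for an image generation model. All names here (DetectedAnnotation, IllustrationRequest, build_request, the prompt wording, and the anchor_xyz coordinate convention) are hypothetical and are not taken from the disclosure; the speech recognizer, sketch detector, and image generation model themselves are assumed to exist elsewhere and are not shown.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DetectedAnnotation:
    """Hypothetical container for what the AI module detects in a meeting."""
    speech_text: str                          # transcribed speech near the time of sketching
    sketch_description: str                   # coarse description of the freeform sketch
    anchor_xyz: Tuple[float, float, float]    # assumed position in the first-person view


@dataclass
class IllustrationRequest:
    """Hypothetical request handed to an image generation model."""
    prompt: str
    anchor_xyz: Tuple[float, float, float]    # carried through so placement is preserved


def build_request(
    annotation: DetectedAnnotation, user_consented: bool
) -> Optional[IllustrationRequest]:
    """Build a prompt from detected information, only if the user has given permission."""
    if not user_consented:
        # Nothing is processed without user permission.
        return None
    # Collapse stray whitespace and line breaks in the transcribed speech.
    speech = " ".join(annotation.speech_text.split())
    prompt = (
        f"Draw a clean, formal illustration of {annotation.sketch_description}. "
        f"Spoken context: {speech}"
    )
    # The anchor is passed along unchanged so the formalized illustration can be
    # displayed at the same location as the original sketch.
    return IllustrationRequest(prompt=prompt, anchor_xyz=annotation.anchor_xyz)


if __name__ == "__main__":
    example = DetectedAnnotation(
        speech_text="the pump feeds the tank",
        sketch_description="a rough sketch of a pump",
        anchor_xyz=(0.1, 1.2, -0.5),
    )
    print(build_request(example, user_consented=True))
```

Carrying the anchor through the request unchanged is one simple way to keep the generated illustration spatially aligned with the original sketch; an actual system could use a richer pose representation.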
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Qian, Xun and Du, Ruofei, "AI-based Detection and Formalization of In-Session Annotations in Virtual Meetings", Technical Disclosure Commons, (December 19, 2024)
https://www.tdcommons.org/dpubs_series/7664