Abstract
This publication describes a mobile software architecture and method for real-time animation of a virtual avatar over a live or recorded video stream captured from a mobile device. The system concurrently utilises a back-facing camera for environmental video capture, a front-facing camera for facial tracking, and an audio input for lip synchronisation. A synchronisation and buffering mechanism aligns the video, audio, and animation data streams in near real time so that the avatar animation stays temporally coherent with the underlying video. These heterogeneous inputs are dynamically layered and synchronised to drive a two-dimensional (2D) or three-dimensional (3D) avatar while supporting both local recording and live network streaming.
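As an illustration only, the synchronisation and buffering step could be sketched as nearest-timestamp matching across per-stream buffers. The abstract does not specify the alignment algorithm, so the matching rule, the 50 ms tolerance, and the names `StreamBuffer` and `align` are all assumptions made for this sketch rather than details of the disclosed system.

```python
from bisect import bisect_left
from collections import deque


class StreamBuffer:
    """Bounded buffer of (timestamp, value) samples for one input stream.

    Hypothetical helper: samples are assumed to arrive in timestamp order.
    """

    def __init__(self, maxlen=256):
        self.samples = deque(maxlen=maxlen)

    def push(self, ts, value):
        self.samples.append((ts, value))

    def nearest(self, ts, tolerance):
        """Return the value closest in time to ts, or None if none is within tolerance."""
        if not self.samples:
            return None
        times = [t for t, _ in self.samples]
        i = bisect_left(times, ts)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        best = min(candidates, key=lambda j: abs(times[j] - ts))
        if abs(times[best] - ts) > tolerance:
            return None
        return self.samples[best][1]


def align(video_ts, face, audio, tolerance=0.05):
    """Pair one back-camera frame timestamp with the nearest face and audio samples.

    The 50 ms default tolerance is an assumed value, not taken from the disclosure.
    """
    return {
        "t": video_ts,
        "face": face.nearest(video_ts, tolerance),
        "audio": audio.nearest(video_ts, tolerance),
    }
```

In this sketch a renderer would call `align` once per back-camera frame and hold the avatar's previous pose whenever a stream returns `None`, which is one simple way to keep the overlay coherent when a tracking or audio sample arrives late.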
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Bednarz, Krzysztof P., "Dynamic Input Layering to Animate a Virtual Avatar over a Mobile Video Capture Using Concurrent Front-Camera Tracking, Audio, and UI Commands", Technical Disclosure Commons, (May 25, 2026)
https://www.tdcommons.org/dpubs_series/10222