Inventor(s)

Abstract

Techniques are described for a low-bandwidth call mode that replaces camera video transmission with a stream of audio-derived avatar animation parameters. During a real-time call, endpoints exchange capability information and negotiate an operating tier that specifies parameter formats, update rates, and synchronization expectations. In response to user input, or automatically when network conditions degrade (e.g., low uplink bandwidth, packet loss, or jitter), a sender disables video frame transmission and processes speech audio in short windows using an on-device model. The model produces a time-stamped parameter stream that includes visemes and, optionally, expression and gesture parameters with confidence scores. The sender adapts the stream's update rate, numeric precision, or tier, and may fall back to a neutral pose or lip-sync-only animation when confidence is low. A receiver renders an avatar locally, synchronizes the animation to the audio using timestamps and buffering, and applies packet-loss concealment behaviors.
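
The sender-side decisions summarized above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `NetworkStats`, `ParamFrame`, `should_use_avatar_mode`, and `apply_confidence_fallback`, the frame fields, and every threshold value are assumptions introduced here for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical thresholds; the abstract does not specify any values.
MIN_UPLINK_KBPS = 150.0
MAX_LOSS_RATIO = 0.05
MAX_JITTER_MS = 60.0
MIN_CONFIDENCE = 0.5


@dataclass
class NetworkStats:
    uplink_kbps: float
    loss_ratio: float
    jitter_ms: float


@dataclass
class ParamFrame:
    """One time-stamped entry of the parameter stream (assumed layout)."""
    timestamp_ms: int
    viseme: str
    viseme_confidence: float
    expression: Optional[Dict[str, float]] = None
    expression_confidence: float = 0.0


def should_use_avatar_mode(stats: NetworkStats, user_requested: bool = False) -> bool:
    """Enter the low-bandwidth mode on user input or degraded network conditions."""
    return (
        user_requested
        or stats.uplink_kbps < MIN_UPLINK_KBPS
        or stats.loss_ratio > MAX_LOSS_RATIO
        or stats.jitter_ms > MAX_JITTER_MS
    )


def apply_confidence_fallback(frame: ParamFrame) -> ParamFrame:
    """Degrade a frame to neutral or lip-sync-only when confidence is low."""
    if frame.viseme_confidence < MIN_CONFIDENCE:
        # Viseme estimate is unreliable: send a neutral pose.
        return ParamFrame(frame.timestamp_ms, "neutral", frame.viseme_confidence)
    if frame.expression is not None and frame.expression_confidence < MIN_CONFIDENCE:
        # Keep the viseme but drop the unreliable expression parameters.
        return ParamFrame(frame.timestamp_ms, frame.viseme, frame.viseme_confidence)
    return frame
```

In this sketch the timestamp is preserved through every fallback so that a receiver could still align the frame with the audio it was derived from.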

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
