Techniques are provided to simultaneously infer the approximate direction of a speaker's gaze and the speaker's emotion during a conversation. Due to privacy concerns, only the speaker's approximate facial features may be estimated. The inferred face may be converted into a cartoon face that retains the main facial features. This may enhance the user interaction experience even when a speaker does not turn on video in a teleconference.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.