Abstract

Image generation models let users generate images by providing instructions. However, such models typically cannot be invoked with voice commands and cannot update a previously generated image based on a follow-up user instruction. This disclosure describes techniques that enable users to obtain and refine images by iteratively interacting with an image generation model in real time, e.g., via voice commands to a virtual assistant. The techniques let users apply their voice and imagination to artistic visual expression and can be provided via a virtual assistant on a smart speaker, smartphone, or other device. The techniques incorporate appropriateness checks for the input query and/or the output image, which helps keep the interactive experience safe and trustworthy.
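
The iterative flow described in the abstract can be illustrated with a minimal sketch. It is not the disclosure's implementation: every name in it (`ImageSession`, `stub_model`, `query_is_appropriate`, `image_is_appropriate`, `BLOCKED_TERMS`) is hypothetical, speech recognition is assumed to have already produced a text query, the model is a stub that returns a text description instead of pixels, and the appropriateness checks are simple keyword placeholders for whatever classifiers a real system would use.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical blocklist standing in for a real appropriateness classifier.
BLOCKED_TERMS = {"gore", "weapon"}


def query_is_appropriate(query: str) -> bool:
    """Placeholder input check: reject queries containing a blocked word."""
    words = set(query.lower().split())
    return not (words & BLOCKED_TERMS)


def image_is_appropriate(image: str) -> bool:
    """Placeholder output check; a real system would inspect the image."""
    return not any(term in image for term in BLOCKED_TERMS)


def stub_model(instruction: str, prior: Optional[str]) -> str:
    """Stand-in for an image generation model (assumption, not a real API).

    Returns a text description; a prior image, if any, is refined
    rather than replaced.
    """
    return instruction if prior is None else f"{prior} + {instruction}"


@dataclass
class ImageSession:
    """Keeps the current image so follow-up commands refine it."""
    generate: Callable[[str, Optional[str]], str] = stub_model
    current: Optional[str] = None
    history: List[str] = field(default_factory=list)

    def handle(self, transcribed_query: str) -> str:
        """Process one voice command that has already been transcribed."""
        # Check the input query before calling the model.
        if not query_is_appropriate(transcribed_query):
            return "Sorry, I can't help with that request."
        candidate = self.generate(transcribed_query, self.current)
        # Check the output image before showing it or storing it.
        if not image_is_appropriate(candidate):
            return "Sorry, I can't show that image."
        self.current = candidate
        self.history.append(transcribed_query)
        return f"Here is your image: {candidate}"
```

In this sketch, a first call such as `session.handle("a red barn at sunset")` creates an image, and a later call such as `session.handle("add snow on the roof")` refines the stored result instead of starting over. A rejected query or rejected output leaves the current image unchanged, so the user can continue from the last accepted version.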

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Singh, Sarvjeet and Damiba, Bertrand, "Iterative Image Generation via Voice Interaction with an Image Generation Model", Technical Disclosure Commons, (December 12, 2022)
https://www.tdcommons.org/dpubs_series/5549

Technical Disclosure Commons

Defensive Publications Series

Iterative Image Generation via Voice Interaction with an Image Generation Model
