Abstract
Despite the use of query filters and query rewriters, generative artificial intelligence (gen-AI) tools for image and/or video generation can occasionally produce unrealistic depictions (UD). Unrealistic depictions are often egregious errors: they are unresponsive to the user’s request and can leave the user with a negative perception of the gen-AI tool. This disclosure describes techniques that detect unrealistic depictions in gen-AI-created images by tasking a large language model (LLM) to examine each image against the user prompt and determine whether it contains an unrealistic depiction. An evaluation dataset is built that includes positive examples (where the generated image/video is realistic and aligns with the prompt) and negative examples (where the generated image/video is unrealistic and/or does not align with the prompt). A multimodal LLM is run on the positive and negative examples to obtain performance metrics on the gen-AI model. The performance metrics can help determine the root cause of UD failures.
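The evaluation step described above can be illustrated with a minimal sketch. It assumes a labeled dataset of positive and negative examples and a critic that returns a yes/no verdict for each (prompt, image) pair. The names Example, Critic, and evaluate_critic are hypothetical and do not come from the disclosure, the multimodal LLM call is left as a pluggable function, and the choice of precision and recall as the metrics is an assumption, since the disclosure does not name specific metrics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Example:
    """One labeled item in the evaluation dataset (hypothetical schema)."""
    prompt: str          # the user prompt given to the gen-AI tool
    image_id: str        # hypothetical handle to the generated image/video
    is_realistic: bool   # True = positive example, False = negative example


# Assumption: the critic wraps a multimodal LLM call and returns True when it
# judges that the output contains an unrealistic depiction for the prompt.
Critic = Callable[[str, str], bool]


def evaluate_critic(examples: Iterable[Example], critic: Critic) -> Dict[str, float]:
    """Compare critic verdicts with dataset labels and summarize the counts."""
    tp = fp = tn = fn = 0
    for ex in examples:
        flagged = critic(ex.prompt, ex.image_id)
        has_ud = not ex.is_realistic
        if flagged and has_ud:
            tp += 1      # negative example correctly flagged
        elif flagged and not has_ud:
            fp += 1      # positive example wrongly flagged
        elif not flagged and has_ud:
            fn += 1      # negative example missed
        else:
            tn += 1      # positive example correctly passed
    # Guard against division by zero when a class is absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision, "recall": recall}
```

In practice the critic argument would wrap a call to a multimodal LLM. Passing a simple stub function instead makes it possible to check the bookkeeping without any model access.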
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Gupta, Khyatti; Sundararajan, Mukund; Baldridge, Jason; Taitelbaum, Hagai; Srinivasan, Srivatsan; Weisz, Ágoston; Petrovski, Igor; Wright, Auriel; and Akerlund, Oscar, "LLM Critic to Identify Unrealistic Depictions in Image Generation", Technical Disclosure Commons, (January 09, 2026)
https://www.tdcommons.org/dpubs_series/9160