Abstract
While photographs capture visual information, they lack depth and interactivity. Contextual information, memories, stories, and related data associated with a photograph are external to the image file itself, scattered across applications and cloud services, or held only in the user’s mind. This disclosure describes techniques to enhance a digital image by automatically generating interactive annotations using a large language model (LLM) and incorporating the annotations into the image. Entities within a photo are identified, and an LLM is instructed to generate an annotation object for each recognized entity. The annotation object is a self-contained package of data and instructions that specifies an interactive experience associated with an entity identified within the image. The annotation object is incorporated into the image file. When a user views the image, the viewer application can read and render the annotation objects. For example, if the user taps on an entity within the image, the corresponding annotation object is made available for interactive engagement with the user.
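The flow described above can be sketched in code. The schema below is a minimal illustration, not the disclosure's actual format: the class name `Annotation`, its fields, and the `find_annotation` hit-test helper are all assumptions introduced here. It shows how LLM-generated annotation objects could be serialized to JSON for embedding in an image file's metadata, and how a viewer might look up the annotation for a tapped point.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical annotation-object schema; the field names are
# illustrative assumptions, not taken from the disclosure.
@dataclass
class Annotation:
    entity: str     # label of the recognized entity in the photo
    bbox: tuple     # normalized bounding box (x0, y0, x1, y1)
    payload: dict   # LLM-generated data and instructions for interaction

def find_annotation(annotations, x, y):
    """Return the annotation whose bounding box contains the tap point,
    or None if the tap does not land on a recognized entity."""
    for a in annotations:
        x0, y0, x1, y1 = a.bbox
        if x0 <= x <= x1 and y0 <= y <= y1:
            return a
    return None

# Example annotation object, as an LLM might produce for one entity.
annotations = [
    Annotation(
        entity="Eiffel Tower",
        bbox=(0.30, 0.05, 0.70, 0.95),
        payload={
            "story": "Built for the 1889 World's Fair.",
            "actions": ["ask_followup", "show_on_map"],
        },
    ),
]

# The annotation list can be serialized and embedded in the image
# file's metadata (e.g., an XMP packet or a PNG text chunk).
blob = json.dumps([asdict(a) for a in annotations])

# Viewer side: a tap at the center of the image resolves to the entity.
tapped = find_annotation(annotations, 0.5, 0.5)
```

A real implementation would also need the entity-recognition step that produces the bounding boxes and a renderer for the interactive payload; this sketch covers only the annotation object itself and the tap-to-annotation lookup.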
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Labzovsky, Ilia and Karmon, Danny, "Generating Interactive, LLM-driven Annotations for Inclusion in Digital Image Files", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/8994