Abstract
This disclosure describes techniques that leverage a large language model (LLM) to generate custom narrations when arranging sets of photos into a visual experience for users to reminisce on, thereby personalizing and adding depth to the reminiscing experience. Per the techniques, a set of photos from a user’s photo library to be featured in a visual experience is identified. Photo pixels, metadata, and other relevant data are used to populate a prompt that is sent to an LLM. The response from the LLM is parsed, and the resulting captions are associated with the photos to be featured in the visual experience. The LLM-generated captions are burned into the individual photos that are displayed in the visual experience. Optionally, the set of photos as a whole is also given an LLM-generated caption that captures its theme. Captioned photos are merged into a visual experience such as a slideshow, video, or story that is displayed to the user. Visual experiences can be surfaced to the user at periodic intervals (for example, monthly), at milestones, or at other opportune times.
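The prompt-population and response-parsing steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the `Photo` fields, the prompt wording, the JSON reply shape, and the function names `build_prompt` and `apply_response` are all hypothetical. The LLM call itself, the use of photo pixels, and the step that burns captions into images are omitted.

```python
import json
from dataclasses import dataclass


@dataclass
class Photo:
    # Hypothetical metadata fields; a real photo library would expose more.
    photo_id: str
    taken_at: str   # date string taken from photo metadata
    location: str
    caption: str = ""


def build_prompt(photos):
    """Populate a text prompt from photo metadata (pixels are not used here)."""
    lines = [
        "Write one short, warm caption per photo and one theme caption for the set.",
        'Reply with JSON: {"theme": str, "captions": {photo_id: str}}.',
    ]
    for p in photos:
        lines.append(f"- id={p.photo_id}; date={p.taken_at}; location={p.location}")
    return "\n".join(lines)


def apply_response(photos, raw_response):
    """Parse the (assumed JSON) LLM reply and attach each caption to its photo.

    Returns the theme caption for the whole set; photos the reply does not
    mention keep an empty caption.
    """
    data = json.loads(raw_response)
    captions = data.get("captions", {})
    for p in photos:
        p.caption = captions.get(p.photo_id, "")
    return data.get("theme", "")


# Usage with a canned reply standing in for a real LLM call.
photos = [
    Photo("a1", "2024-07-04", "Lake Tahoe"),
    Photo("b2", "2024-07-05", "Lake Tahoe"),
]
prompt = build_prompt(photos)
fake_reply = json.dumps({
    "theme": "A summer weekend at the lake",
    "captions": {"a1": "First splash of the trip"},
})
theme = apply_response(photos, fake_reply)
```

Keying the captions by photo identifier, rather than relying on the order of the reply, is one way to keep each caption associated with the right photo even if the model skips or reorders entries.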
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Kim, Ji Hun; Khan, Abdulrahman; Meaney, Tommy; Zhu, Tim; and Zhou, Zhou, "Augmenting Photo Reminiscing Experiences with LLM-generated Narrations", Technical Disclosure Commons, (August 20, 2025)
https://www.tdcommons.org/dpubs_series/8492