Inventor(s)

D Shin

Abstract

Photo storage and viewing applications enable users to query their photos and view matching results. However, such applications do not currently provide text summaries based on photos and cannot answer semantic queries. This disclosure describes the use of machine learning techniques to leverage a user’s photo library to answer semantic queries. With user permission, an image-text decoder is utilized to generate and store a text summary of individual photos in the user’s photo library. When a user query is received, matching text summaries are identified and used to prompt a large language model that generates an answer to the query. The matching text summaries can be identified by temporally filtering the photo library.
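
The query-time flow described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the names `PhotoSummary`, `filter_by_time`, `build_prompt`, and `answer_query` are hypothetical, the summaries are assumed to have been generated and stored earlier (with user permission) by an image-text decoder, and the large language model is represented by a caller-supplied function rather than any real API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class PhotoSummary:
    """Hypothetical record: one stored text summary per photo."""
    photo_id: str
    taken_at: datetime
    text: str  # assumed to come from an image-text decoder, generated ahead of time


def filter_by_time(
    summaries: List[PhotoSummary],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[PhotoSummary]:
    """Temporal filter: keep summaries with start <= taken_at < end."""
    return [
        s for s in summaries
        if (start is None or s.taken_at >= start)
        and (end is None or s.taken_at < end)
    ]


def build_prompt(query: str, matches: List[PhotoSummary]) -> str:
    """Assemble the matching summaries and the user query into one prompt."""
    lines = [f"- [{s.taken_at:%Y-%m-%d}] {s.text}" for s in matches]
    return (
        "Photo summaries:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {query}\nAnswer:"
    )


def answer_query(
    query: str,
    summaries: List[PhotoSummary],
    llm: Callable[[str], str],  # assumption: any function mapping prompt -> answer
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> str:
    """Filter the library by time, prompt the model, and return its answer."""
    matches = filter_by_time(summaries, start, end)
    return llm(build_prompt(query, matches))
```

In this sketch the time window is passed in explicitly. How a window is derived from the wording of a query (for example, "last summer") is left open here, as is any matching beyond the temporal filter.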

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.