Inventor(s)

D Shin

Abstract

Organizing photos into folders, albums, or other collections requires human input and is tedious. This disclosure describes the use of a large visual model (LVM) and a large language model (LLM) to automatically identify and recommend appropriate folders (or photo buckets) for photographs. With user permission, a photograph in the user’s library is provided as input to the LVM, which generates a text summary of the photograph. The text summary is provided to the LLM, which semantically matches the summary against the user-specified folder names and recommends one or more existing folders for the image, or suggests creating a new folder. Although the foregoing description refers to a separate LVM and LLM, folder identification can also be performed by a single multimodal model that takes a photo and the existing folder names as input and outputs a suggested folder for the photo.
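The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: `Recommendation`, `recommend_folders`, `toy_summarize`, and `toy_match` are hypothetical names, and the two toy functions are trivial stand-ins (a fixed caption and a word-overlap check) for real calls to a visual model and a language model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Recommendation:
    folders: List[str]         # existing folders judged to fit the photo
    new_folder: Optional[str]  # proposed new folder name when nothing fits


def recommend_folders(
    photo: bytes,
    folder_names: List[str],
    summarize: Callable[[bytes], str],
    match: Callable[[str, List[str]], Recommendation],
) -> Recommendation:
    """Stage 1: photo -> text summary. Stage 2: summary + folder names -> recommendation."""
    summary = summarize(photo)           # role played by the large visual model
    return match(summary, folder_names)  # role played by the large language model


def toy_summarize(photo: bytes) -> str:
    # Hypothetical stand-in: a real system would call a visual model here.
    return "a golden retriever playing on a sandy beach at sunset"


def toy_match(summary: str, folder_names: List[str]) -> Recommendation:
    # Hypothetical stand-in: word overlap instead of an LLM's semantic match.
    words = set(summary.lower().split())
    hits = [name for name in folder_names if set(name.lower().split()) & words]
    if hits:
        return Recommendation(folders=hits, new_folder=None)
    # No existing folder fits, so propose a new one built from the summary.
    return Recommendation(folders=[], new_folder=" ".join(summary.split()[:3]).title())


if __name__ == "__main__":
    print(recommend_folders(b"", ["Beach", "Work receipts"], toy_summarize, toy_match))
    print(recommend_folders(b"", ["Work receipts"], toy_summarize, toy_match))
```

Passing the two stages in as callables keeps the sketch independent of any particular model provider. The single multimodal variant mentioned in the abstract would collapse `summarize` and `match` into one call that takes the photo and the folder names together.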

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
