Abstract

Organizing photos into folders, albums, or other collections is tedious and requires manual effort. This disclosure describes the use of a large visual model (LVM) and a large language model (LLM) to automatically identify and recommend appropriate folders (or photo buckets) for photographs. With user permission, a photograph in the user’s library is provided as input to the LVM, which generates a text summary of the photograph. The text summary is provided to the LLM, which performs a semantic match between the summary and the user-specified folder names and either recommends one or more existing folders for the image or suggests creating a new folder. Although this description refers to a separate LVM and LLM, folder identification can also be performed by a single multimodal model that takes a photo and the existing folder names as input and outputs a suggested folder for the photo.
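
The two-stage flow in the abstract can be illustrated with a minimal sketch. It is not the disclosure's implementation. The names `describe`, `complete`, `build_prompt`, `recommend_folders`, and the `NEW_FOLDER` sentinel are hypothetical; the visual model and the language model are passed in as plain callables, so no particular model API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical sentinel the language model is asked to return when no folder fits.
NEW_FOLDER = "NEW_FOLDER"


@dataclass
class Recommendation:
    folders: list[str]   # existing folders the photo could be filed under
    suggest_new: bool    # True when no existing folder matched


def build_prompt(summary: str, folder_names: Sequence[str]) -> str:
    """Compose the text-only matching request for the language model."""
    listing = "\n".join(f"- {name}" for name in folder_names)
    return (
        f"Photo summary:\n{summary}\n\n"
        f"Existing folders:\n{listing}\n\n"
        "Reply with the matching folder names, one per line, "
        f"or {NEW_FOLDER} if none fits."
    )


def recommend_folders(
    photo: bytes,
    folder_names: Sequence[str],
    describe: Callable[[bytes], str],   # stand-in for the visual model
    complete: Callable[[str], str],     # stand-in for the language model
) -> Recommendation:
    """Stage 1: photo -> text summary. Stage 2: summary -> folder names."""
    summary = describe(photo)
    reply = complete(build_prompt(summary, folder_names))

    # Keep only replies that name an existing folder (case-insensitive),
    # so unexpected model output cannot introduce unknown folder names.
    known = {name.lower(): name for name in folder_names}
    matched: list[str] = []
    for line in reply.splitlines():
        candidate = line.strip().removeprefix("- ").strip().lower()
        name = known.get(candidate)
        if name is not None and name not in matched:
            matched.append(name)

    return Recommendation(folders=matched, suggest_new=not matched)


# Usage with canned stand-ins in place of real models (illustrative only).
example = recommend_folders(
    b"",  # placeholder for image bytes
    ["Pets", "Beach trips", "Receipts"],
    describe=lambda _photo: "A dog running on a beach at sunset.",
    complete=lambda _prompt: "Pets\nBeach trips",
)
print(example)
```

Filtering the reply against the known folder names is a design choice of this sketch: only user-specified folders are returned as recommendations, and an empty match is treated as a suggestion to create a new folder. The single-multimodal-model variant mentioned in the abstract would replace the two callables with one that accepts both the photo and the folder names.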

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Shin, D, "Automatically Photo Library Folder Organization Using Photo Visual Semantics", Technical Disclosure Commons, (July 25, 2024)
https://www.tdcommons.org/dpubs_series/7233

Technical Disclosure Commons

Defensive Publications Series

Automatically Photo Library Folder Organization Using Photo Visual Semantics
