Abstract

Media library applications allow users to organize media such as photos and videos into collections. Some applications implement dynamic, self-updating collections, but these are limited to narrow criteria such as a particular person label or a particular object type. Current dynamic collection generation mechanisms do not support the complex natural language criteria that a user may want to express, e.g., “all my photos after a workout; include screenshots from my fitness tracker.” This disclosure describes techniques that use a large language model (LLM) or other suitable model to transform complex user-specified criteria into a vector representation. Media assets in the user’s library are also transformed into vector representations, and a vector-distance-based matching process is used to determine which collections each asset belongs to. The process is computationally inexpensive, scales to large numbers of users, and keeps collections dynamically updated as new media assets are added to a user’s library.
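
The matching step described above can be illustrated with a minimal sketch. It assumes that the criteria and the media assets have already been converted to vectors by some embedding model; the three-dimensional vectors, the collection names, the function names, and the 0.8 threshold are all hypothetical values chosen for illustration, and cosine similarity is used here as one possible vector-distance measure, not necessarily the one used in the disclosure.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def assign_collections(asset_vec, collection_vecs, threshold=0.8):
    """Return the names of collections whose criteria vector is close to the asset.

    `threshold` is an assumed tuning parameter, not a value from the disclosure.
    """
    return sorted(
        name
        for name, criteria_vec in collection_vecs.items()
        if cosine_similarity(asset_vec, criteria_vec) >= threshold
    )


# Hypothetical criteria vectors; in practice these would come from an LLM
# or other embedding model applied to the user's natural language criteria.
collections = {
    "post-workout": [0.9, 0.1, 0.0],
    "food": [0.0, 0.2, 0.9],
}

# Hypothetical vector for a newly added media asset.
new_asset = [0.8, 0.2, 0.1]

matched = assign_collections(new_asset, collections)
print(matched)
```

Because each criteria vector is computed once and each new asset needs only one comparison per collection, running this check whenever an asset is added is one way the collections could stay up to date.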

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
