Video search interfaces let users enter keywords or phrases in a search box to find videos. However, when the initial results do not meet the user’s needs, the user must start a new search with a modified set of keywords and a correspondingly new set of results, which is inefficient and discards the information from the previous search. This disclosure describes techniques that enable interactive video search through iterative use of a large language model (LLM) to identify matching results. If the initial results are unsatisfactory, the user can provide additional clarifying input in the search box. The LLM uses the initial query embedding together with the embeddings of one or more subsequent interactive queries to iteratively reinforce or exclude portions of the search space until the target video surfaces as the top result.
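
The iterative narrowing described above can be illustrated with a minimal sketch. The abstract does not specify how the initial and follow-up embeddings are combined, so this example assumes a Rocchio-style update: embeddings of clarifications the user wants are added to the running query vector, and embeddings of things the user rules out are subtracted. The three-dimensional vectors are hand-written stand-ins for LLM embeddings, and the names `refine`, `rank`, `VIDEOS`, and the weights `beta` and `gamma` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def refine(query, reinforce=(), exclude=(), beta=0.75, gamma=0.75):
    """Return a new query vector (assumed Rocchio-style update).

    Vectors in `reinforce` pull the query toward a region of the
    search space; vectors in `exclude` push it away from one.
    """
    refined = list(query)
    for vec in reinforce:
        refined = [r + beta * x for r, x in zip(refined, vec)]
    for vec in exclude:
        refined = [r - gamma * x for r, x in zip(refined, vec)]
    return refined

def rank(query, videos):
    """Return video titles sorted by descending similarity to the query."""
    return sorted(videos, key=lambda title: cosine(query, videos[title]),
                  reverse=True)

# Toy embeddings; the dimensions loosely mean (pasta, salad, vegan).
VIDEOS = {
    "beef lasagna":      [1.0, 0.0, 0.0],
    "vegan pasta":       [0.9, 0.0, 0.8],
    "vegan pasta salad": [0.7, 0.7, 0.8],
}

q0 = [1.0, 0.0, 0.0]                        # initial query: "pasta"
q1 = refine(q0, reinforce=[[0.0, 0.0, 1.0]])  # follow-up: "make it vegan"
q2 = refine(q1, exclude=[[0.0, 1.0, 0.0]])    # follow-up: "not a salad"

if __name__ == "__main__":
    print(rank(q0, VIDEOS))  # top result: "beef lasagna"
    print(rank(q1, VIDEOS))  # top result: "vegan pasta"
    print(rank(q2, VIDEOS))  # "vegan pasta salad" drops to last
```

Each follow-up reuses the previous query vector rather than starting over, which is the property the disclosure emphasizes. A production system would obtain the vectors from an embedding model and search an index rather than a dictionary.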

This work is licensed under a Creative Commons Attribution 4.0 License.