While listening to audio content, it is difficult to note a fact, look up an entity, or take other actions in the moment so that the information can be recalled later. This disclosure describes techniques for creating bookmarks in audio content such as podcasts and audiobooks using easy-to-perform gestures that do not interrupt the listening session. When the user performs the gesture, a capture flow transcribes and saves the audio content and makes it available, e.g., via a bookmark, for later search, reference, consumption, sharing, or browsing. A large language model (LLM) can be used for purposes such as helping the user search and revisit saved bookmarks via natural language queries to a conversational agent, automatically generating titles for bookmarked audio snippets, summarizing the snippets, and extracting entities from them.
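
The capture flow described above could be sketched roughly as follows. This is a minimal illustration and not the disclosure's implementation: the names `Bookmark`, `BookmarkStore`, `capture`, and `search` are hypothetical, as are the `transcribe` and `make_title` callables, which stand in for a speech-to-text service and an LLM title generator. The 30-second look-back window and the choice to capture the audio preceding the gesture are assumptions, as is the plain substring search used here in place of natural language queries to a conversational agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Bookmark:
    source: str       # e.g., podcast or audiobook title
    start_s: float    # snippet start, in seconds into the audio
    end_s: float      # snippet end (the moment of the gesture)
    transcript: str   # transcribed text of the snippet
    title: str = ""   # short label, ideally LLM-generated


@dataclass
class BookmarkStore:
    bookmarks: List[Bookmark] = field(default_factory=list)

    def capture(
        self,
        source: str,
        gesture_time_s: float,
        transcribe: Callable[[str, float, float], str],
        window_s: float = 30.0,  # assumed look-back window
        make_title: Optional[Callable[[str], str]] = None,
    ) -> Bookmark:
        """Run when the gesture is detected; playback is not paused."""
        start_s = max(0.0, gesture_time_s - window_s)
        # Hypothetical speech-to-text call for the captured span.
        text = transcribe(source, start_s, gesture_time_s)
        # Hypothetical LLM title generator; fall back to a text prefix.
        title = make_title(text) if make_title else text[:40]
        bookmark = Bookmark(source, start_s, gesture_time_s, text, title)
        self.bookmarks.append(bookmark)
        return bookmark

    def search(self, query: str) -> List[Bookmark]:
        """Case-insensitive substring match over titles and transcripts."""
        q = query.lower()
        return [
            b for b in self.bookmarks
            if q in b.title.lower() or q in b.transcript.lower()
        ]
```

In this sketch a gesture handler would call `capture` with the current playback position, and a later call to `search` would return the matching bookmarks. Summarization and entity extraction could be attached in the same way as `make_title`, as additional optional callables.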

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.