Free samples of books and other long-form text content typically consist of the initial part of the content. This approach does not always select the most engaging sections and may not give the best impression of the content. This disclosure describes techniques that determine page engagement metrics for a book (or other long-form text content) and use those metrics to generate a sample of the content. A trained machine learning model generates an engagement score for each page, based on user-permitted engagement data obtained from users who have read the text. The page engagement scores are used to identify the most engaging portions of the book. A large language model (LLM) then selects, from the identified portions, a specific chunk that makes for a comprehensible (coherent) sample. The generated sample gives a viewing user a more representative impression of the book content and can lead to increased purchase metrics.
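The step of identifying the most engaging portions from per-page scores can be illustrated with a minimal sketch. It assumes the page engagement scores have already been produced by the trained model and are available as a list of floats. The function name `top_engaging_windows`, the fixed window length, and the greedy non-overlapping selection are illustrative assumptions, not details taken from the disclosure. The LLM step that picks a coherent chunk from these candidates is not shown.

```python
from typing import List, Tuple


def top_engaging_windows(
    page_scores: List[float], window: int, k: int = 3
) -> List[Tuple[int, int, float]]:
    """Return up to k non-overlapping page windows with the highest mean score.

    Each result is (start_page, end_page_exclusive, mean_score), using
    zero-based page indices. Hypothetical helper for illustration only.
    """
    n = len(page_scores)
    if window <= 0 or window > n:
        raise ValueError("window must be between 1 and the number of pages")

    # Rolling sum over every contiguous window of `window` pages.
    current = sum(page_scores[:window])
    sums = [(current, 0)]
    for start in range(1, n - window + 1):
        current += page_scores[start + window - 1] - page_scores[start - 1]
        sums.append((current, start))

    # Greedily keep the best-scoring windows that do not overlap
    # any window already chosen; ties go to the earlier window.
    chosen: List[Tuple[int, int, float]] = []
    for total, start in sorted(sums, key=lambda t: (-t[0], t[1])):
        end = start + window
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, total / window))
        if len(chosen) == k:
            break
    return chosen


# Example with made-up scores for a ten-page text.
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.3, 0.6, 0.5, 0.2]
candidates = top_engaging_windows(scores, window=3, k=2)
```

With these made-up scores, the two candidates cover page indices 2 to 4 and 6 to 8. In the pipeline described above, the text of such candidate portions would then be passed to the LLM, which selects a chunk that reads coherently on its own.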
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Shin, D, "Generating Book Sample Based on Page Engagement Metrics", Technical Disclosure Commons, (March 14, 2023)