Abstract
This disclosure describes techniques for semantic indexing and retrieval of web pages. A web browser includes a semantic bookmark system that indexes web page content and enables later retrieval of the page via semantic search. When a web page is bookmarked, a content extraction model extracts its content. The extracted content is semantically indexed, linked with the corresponding uniform resource locator (URL), and stored. At retrieval time, bookmarks that match a user's search query are retrieved and presented to the user. Semantic indexing and retrieval enable users to find a web page whose content is semantically related to the search query, even when the query contains no words that appear in the title or content of the page.
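The index-then-retrieve flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the `SemanticBookmarks` class, the `embed` function, and the hand-written `CONCEPTS` table are hypothetical stand-ins, with the table taking the place of a learned embedding model. The sketch also assumes the page text has already been produced by a content extraction step.

```python
import math
from collections import Counter

# ASSUMPTION: a tiny hand-written word-to-concept table stands in for a
# learned semantic embedding model, which the disclosure does not specify.
CONCEPTS = {
    "pasta": "food", "recipe": "food", "dinner": "food", "meal": "food",
    "laptop": "tech", "computer": "tech", "gpu": "tech",
    "flight": "travel", "hotel": "travel", "vacation": "travel",
}


def embed(text):
    """Toy embedding: count the concepts that the words of `text` map to."""
    words = text.lower().split()
    return Counter(CONCEPTS[w] for w in words if w in CONCEPTS)


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticBookmarks:
    """Hypothetical bookmark store linking each URL to a semantic vector."""

    def __init__(self):
        self._index = []  # list of (url, vector) pairs

    def add(self, url, page_text):
        # `page_text` is assumed to be the already-extracted page content.
        self._index.append((url, embed(page_text)))

    def search(self, query, top_k=3):
        # Rank bookmarks by similarity to the query; drop non-matches.
        q = embed(query)
        scored = [(cosine(q, vec), url) for url, vec in self._index]
        scored = [(score, url) for score, url in scored if score > 0]
        scored.sort(key=lambda pair: -pair[0])
        return [url for _, url in scored[:top_k]]
```

In this sketch, a query such as "dinner meal" would match a bookmark whose extracted text is "classic pasta recipe" even though the two share no words, because both map to the same concept. A real system would replace the lookup table with dense vectors from an embedding model and a nearest-neighbor index.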
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Tran, Duc-Hieu, "Semantic Indexing and Retrieval of Web Pages in a Web Browser", Technical Disclosure Commons, (May 23, 2022)
https://www.tdcommons.org/dpubs_series/5152