Abstract

CaRRIE 2.0 extracts key insights and summaries from unstructured data such as papers, news documents, notes, blogs, scraped websites, and RFP (Request for Proposal) data, so that the user can understand the material faster and make informed decisions. The engine groups and ranks collections of documents based on keywords of interest present in the documents. The user can then focus on the handful of documents aligned with their interests, reducing the time needed to read each document and derive insights. They can also understand the larger data trends without reading the entire stack of documents one by one. We accomplish these tasks by stacking and modifying natural language processing (NLP) algorithms, as well as creating a couple of new NLP algorithms. At a high level, the approach is very similar to how a search engine algorithm works. Our solution is divided into three main parts: (i) Topic Modeling, (ii) Scoring and Ranking, and (iii) Insight Extraction.
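
The scoring and ranking idea described in the abstract can be pictured with a minimal sketch. The abstract does not specify the scoring function, so the term-frequency ratio used here, the function names (tokenize, keyword_score, rank_documents), and the sample documents are all illustrative assumptions rather than CaRRIE 2.0's actual algorithm.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(text, keywords):
    """Fraction of tokens in `text` that match a keyword of interest.

    Assumption: a plain term-frequency ratio stands in for the
    undisclosed scoring step.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    wanted = {k.lower() for k in keywords}  # de-duplicate keywords
    hits = sum(counts[k] for k in wanted)
    return hits / len(tokens)


def rank_documents(docs, keywords):
    """Return (name, score) pairs sorted from most to least relevant."""
    scored = [(name, keyword_score(text, keywords)) for name, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample collection, for illustration only.
docs = {
    "rfp_printing": "The RFP requests printing services and printing supplies.",
    "blog_travel": "A travel blog about hiking and food.",
    "news_3d": "News on 3D printing trends in manufacturing.",
}
ranking = rank_documents(docs, ["printing", "manufacturing"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A reader would then open only the top-ranked documents. A production system would likely replace the raw frequency ratio with a weighting that accounts for document length and keyword rarity, and would feed it keywords produced by the topic modeling stage.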

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 License.

Recommended Citation

INC, HP, "CARRIE 2.0: INSIGHT EXTRACTION FROM UNSTRUCTURED DATA USING PROXIMITY CLUSTERING ENGINE", Technical Disclosure Commons, (February 04, 2021)
https://www.tdcommons.org/dpubs_series/4055

Technical Disclosure Commons

Defensive Publications Series

CARRIE 2.0: INSIGHT EXTRACTION FROM UNSTRUCTURED DATA USING PROXIMITY CLUSTERING ENGINE
