HP INC


CaRRIE 2.0 extracts key insights and summaries from unstructured data such as papers, news documents, notes, blogs, scraped websites, and RFP (Request for Proposal) data, so that the user can understand the material faster and make informed decisions. The engine groups and ranks collections of documents based on the keywords of interest that appear within them. The user can then focus on a handful of documents aligned with their interests, reducing the time needed to read the documents and draw insights from them. They can also understand the larger trends in the data without reading the entire stack of documents one by one. We accomplish these tasks by stacking and modifying existing natural language processing (NLP) algorithms, and by creating a couple of new ones. At a high level, the approach is very similar to how a search engine algorithm works. Our solution is divided into three main parts: (i) Topic Modeling, (ii) Scoring and Ranking, and (iii) Insight Extraction.
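The abstract does not specify how the scoring and ranking step is computed. As a rough illustration of the general idea of ranking documents by keywords of interest, the sketch below uses a plain TF-IDF-style score, which is a standard search-engine technique substituted here for illustration and not the CaRRIE 2.0 algorithm. The function names `tokenize` and `rank_documents` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(documents, keywords):
    """Rank documents by a TF-IDF-style score over the keywords of interest.

    Illustrative assumption only: this is generic TF-IDF, not the scoring
    used by CaRRIE 2.0. Returns (doc_index, score) pairs, best first.
    """
    token_lists = [tokenize(doc) for doc in documents]
    counts = [Counter(tokens) for tokens in token_lists]
    n_docs = len(documents)
    keywords = [kw.lower() for kw in keywords]

    # Smoothed inverse document frequency: rarer keywords weigh more.
    idf = {}
    for kw in keywords:
        doc_freq = sum(1 for c in counts if kw in c)
        idf[kw] = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0

    scores = []
    for i, c in enumerate(counts):
        length = max(len(token_lists[i]), 1)  # avoid division by zero
        score = sum((c[kw] / length) * idf[kw] for kw in keywords)
        scores.append((i, score))

    # Sort by descending score; ties keep the original document order.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample collection, invented for this example.
docs = [
    "The RFP requests a supplier for 3D printing materials.",
    "Quarterly news on laptop sales and retail trends.",
    "Notes on 3D printing research and printing hardware.",
]
ranking = rank_documents(docs, ["printing", "3D"])
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

With a ranking like this, a reader could start from the highest-scoring documents and skip those that score zero for the chosen keywords.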

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 License.