Abstract

Feedback from the stakeholders of a software product is crucial for product development. It can serve a variety of purposes, such as continuous improvement, rapid prototyping, experimentation, resource allocation, and improving user experience. However, for products with a large set of features and a large volume of feedback (which may be freeform text), analyzing the feedback to generate useful insight is difficult. This disclosure describes a vector database built by using unsupervised learning to cluster a set of seed feedback data; the clustering is tested by partitioning the seed data into training and test datasets. The vector database stores cluster IDs and embeddings, and can generate embeddings for new feedback data and classify it as it is received. Contents of the vector database can be provided as inputs to a large language model (LLM) to obtain insight regarding the product.
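
The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the disclosure: `embed` is a hashed bag-of-words stand-in for a real embedding model, plain k-means stands in for whichever unsupervised method is actually used, `FeedbackStore` is a toy in-memory substitute for a vector database, and `build_prompt` only assembles text that could be passed to an LLM without calling one.

```python
import math
import random
import zlib

DIM = 32  # assumed embedding size for this toy example


def embed(text):
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec (ties go to the lowest index)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumed choice of clustering method); returns the centroids."""
    centroids = [list(v) for v in random.Random(seed).sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


class FeedbackStore:
    """Toy in-memory 'vector database' of (text, embedding, cluster_id) records."""

    def __init__(self, seed_texts, k):
        vectors = [embed(t) for t in seed_texts]
        self.centroids = kmeans(vectors, k)
        self.records = [
            (t, v, nearest(v, self.centroids)) for t, v in zip(seed_texts, vectors)
        ]

    def add(self, text):
        """Embed new feedback, assign it to the nearest cluster, and store it."""
        vec = embed(text)
        cluster_id = nearest(vec, self.centroids)
        self.records.append((text, vec, cluster_id))
        return cluster_id


def build_prompt(store):
    """Group stored feedback by cluster ID into text that could be sent to an LLM."""
    lines = ["Summarize the main theme of each feedback cluster:"]
    for cid in range(len(store.centroids)):
        lines.append(f"Cluster {cid}:")
        lines.extend(f"- {t}" for t, _, c in store.records if c == cid)
    return "\n".join(lines)


# Made-up seed feedback, partitioned into training and held-out test sets.
seed_feedback = [
    "the app crashes when I open settings",
    "app crashes on startup every time",
    "crashes after the latest update",
    "settings page crashes the app",
    "please add a dark mode theme",
    "would love a dark theme option",
    "dark mode would be great",
    "add theme options like dark mode",
]
shuffled = seed_feedback[:]
random.Random(0).shuffle(shuffled)
train, test = shuffled[:6], shuffled[6:]

store = FeedbackStore(train, k=2)
test_cluster_ids = [store.add(t) for t in test]  # classify held-out feedback
prompt = build_prompt(store)
```

In a real deployment the held-out cluster assignments would be compared against some reference labeling or reviewed manually; this sketch only shows where the training and test partitions enter the flow.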

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Han, Zhaoying and Siruguppa, Sathya Anurag, "Automated Text Clustering and Classification of Feedback Data Using LLMs", Technical Disclosure Commons, (July 16, 2024)
https://www.tdcommons.org/dpubs_series/7199

Technical Disclosure Commons

Defensive Publications Series

Automated Text Clustering and Classification of Feedback Data Using LLMs
