Abstract

Personalized large language models often exhibit defects such as the overuse of irrelevant user information, unnatural data formatting, and intrusive or repetitive name usage. These issues typically arise from a failure to prioritize relevant context or to integrate user data naturally into responses. A data distillation pipeline is disclosed to address these limitations by generating high-quality supervised fine-tuning examples. The method utilizes a “persona bank” to generate user prompts and profiles, followed by the generation of multiple responses guided by specific personalization principles. These guided responses are evaluated against a baseline unguided output using a side-by-side comparison. Training data is retained only when the principle-guided output demonstrates a clear improvement over the baseline. This process ensures the model learns to handle personal data judiciously, leading to more natural, contextually relevant, and less intrusive interactions without the need for extensive manual prompt engineering.
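The filtering loop described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names (`make_prompt`, `generate`, `baseline`, `judge`), the numeric judging scale, and the `margin` threshold are all hypothetical stand-ins for the persona-bank sampling, principle-guided generation, and side-by-side evaluation steps.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Example:
    prompt: str
    profile: str
    response: str

def distill(
    personas: List[str],
    make_prompt: Callable[[str], str],         # hypothetical: derives a user prompt from a persona
    generate: Callable[[str, str, str], str],  # hypothetical: (prompt, profile, principle) -> guided response
    baseline: Callable[[str, str], str],       # hypothetical: unguided generation for the same prompt
    judge: Callable[[str, str, str], float],   # hypothetical: side-by-side score; higher favors the guided output
    principles: List[str],
    margin: float = 0.5,                       # assumed threshold: retain only clear wins over the baseline
) -> List[Example]:
    """Keep a (prompt, profile, response) triple as SFT data only when the
    principle-guided response clearly beats the unguided baseline."""
    kept: List[Example] = []
    for profile in personas:
        prompt = make_prompt(profile)
        base = baseline(prompt, profile)
        for principle in principles:
            guided = generate(prompt, profile, principle)
            # Side-by-side comparison against the baseline; discard otherwise.
            if judge(prompt, guided, base) >= margin:
                kept.append(Example(prompt, profile, guided))
    return kept

# Toy usage with stub models: only the "relevance"-guided response
# passes the side-by-side filter in this contrived judge.
data = distill(
    personas=["avid weekend hiker"],
    make_prompt=lambda p: f"Suggest a weekend activity for: {p}",
    generate=lambda q, p, pr: f"[{pr}] tailored answer",
    baseline=lambda q, p: "generic answer",
    judge=lambda q, guided, base: 1.0 if "relevance" in guided else 0.0,
    principles=["relevance", "verbosity"],
)
```

The key design point is that the principle-guided response is never trusted on its own merits; it enters the training set only after winning a direct comparison with the unguided baseline, which is what screens out intrusive or unnatural uses of the profile.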

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.