Abstract

This submission presents techniques to evaluate the proximity of the responses produced by different Large Language Models (LLMs) in a specific service context, to ensure that swapping one model for another does not change the user experience. The techniques include translating and weighting input-level latent states so that the mapping between LLM inputs and outputs remains consistent across models. By dynamically injecting modified prompts, the techniques force-align regional model reasoning with deterministic standards, eliminating "interpretive divergence" in real time.
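As a minimal sketch of the kind of proximity check described above, the snippet below compares two model responses with a bag-of-words cosine similarity and accepts the swap only if the score clears a threshold. The tokenization scheme, the `responses_are_proximate` helper, and the threshold value are illustrative assumptions, standing in for the latent-state translation and weighting the disclosure actually proposes.

```python
import math
from collections import Counter

def bow_vector(text: str) -> Counter:
    # Illustrative stand-in for a latent representation:
    # tokenize on whitespace and count term frequencies.
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared terms, normalized by vector magnitudes.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def responses_are_proximate(resp_a: str, resp_b: str,
                            threshold: float = 0.8) -> bool:
    # Hypothetical acceptance rule: the candidate model's response is
    # close enough to the incumbent's if similarity clears the threshold.
    return cosine_similarity(bow_vector(resp_a), bow_vector(resp_b)) >= threshold
```

In practice the disclosure compares latent states rather than surface tokens, so a production check would replace `bow_vector` with embeddings from the models themselves; the threshold-based acceptance rule would remain the same.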

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
