Inventor(s)

Neel Mohbe

Abstract

Reliability verification for generative artificial intelligence models may be constrained by a reliance on post-hoc, token-level analysis, which can be computationally inefficient. Described are systems and methods for a pre-decoding verification architecture that can operate on the internal latent states of a generative model before output tokens are sampled. The architecture can use a hierarchical cascade of checks, for example, beginning with a fast linear classifier, escalating ambiguous inputs to a non-linear density-based evaluator, and routing complex cases to specialized expert models. A feedback mechanism can propagate classifications from the expert models backward to update and refine the faster, earlier tiers. This approach can help decouple the computational cost of reliability verification from the length of the generated output and can operate on semantic intent to improve resilience.
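The cascade described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the class and function names (`LinearTier`, `DensityTier`, `verify`), the logistic probe, the Gaussian-kernel density ratio, the ambiguity band of 0.2 to 0.8, and the single-gradient-step feedback rule are all assumptions chosen for illustration, and the latent state is represented as a plain list of floats.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class LinearTier:
    """Tier 1 (assumed form): a logistic probe on the latent vector."""
    weights: List[float]
    bias: float = 0.0
    lr: float = 0.1

    def prob_unsafe(self, h: Vector) -> float:
        z = sum(w * x for w, x in zip(self.weights, h)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, h: Vector, label: int) -> None:
        # Feedback: one gradient step on the logistic loss toward the expert label.
        err = self.prob_unsafe(h) - label
        self.weights = [w - self.lr * err * x for w, x in zip(self.weights, h)]
        self.bias -= self.lr * err


@dataclass
class DensityTier:
    """Tier 2 (assumed form): Gaussian-kernel density ratio over stored latents."""
    bandwidth: float = 1.0
    safe: List[Vector] = field(default_factory=list)
    unsafe: List[Vector] = field(default_factory=list)

    def _density(self, h: Vector, points: List[Vector]) -> float:
        if not points:
            return 0.0
        total = 0.0
        for p in points:
            d2 = sum((a - b) ** 2 for a, b in zip(h, p))
            total += math.exp(-d2 / (2.0 * self.bandwidth ** 2))
        return total / len(points)

    def prob_unsafe(self, h: Vector) -> float:
        s = self._density(h, self.safe)
        u = self._density(h, self.unsafe)
        # With no nearby evidence, report maximal ambiguity.
        return 0.5 if s + u == 0.0 else u / (s + u)

    def update(self, h: Vector, label: int) -> None:
        # Feedback: remember the expert-labelled latent for future lookups.
        (self.unsafe if label else self.safe).append(list(h))


def verify(
    h: Vector,
    linear: LinearTier,
    density: DensityTier,
    expert: Callable[[Vector], int],
    band: Tuple[float, float] = (0.2, 0.8),
) -> Tuple[int, str]:
    """Return (label, deciding_tier) for a latent state; label 1 means flagged."""
    lo, hi = band
    for name, tier in (("linear", linear), ("density", density)):
        p = tier.prob_unsafe(h)
        if p <= lo or p >= hi:  # confident enough: stop at this tier
            return int(p >= hi), name
    label = expert(h)  # most expensive tier, reached only for ambiguous cases
    linear.update(h, label)  # propagate the expert decision backward
    density.update(h, label)
    return label, "expert"
```

In this sketch the check runs once per latent state rather than once per output token, which is one way to read the claim that verification cost is decoupled from output length. The first call on an unfamiliar latent falls through to the expert; after the feedback step, a repeated or nearby latent can be resolved by an earlier tier.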

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
