Abstract
This document describes systems and methods for analyzing the internal state of neural networks, such as large language models, during production-level inference. Some existing techniques rely on extrinsic, black-box analysis of model outputs, which can be slow and costly and may miss certain performance degradations. The described technology can involve an offline process that identifies and calibrates sparse characteristic regions within a model's architecture, where specific activation patterns may correlate with high-level internal states. At runtime, lightweight counters can be inserted into the model's computational graph to measure activations within these regions, potentially with low performance overhead. This approach can generate low-latency, real-time metrics that represent the model's internal operational state, which may facilitate early detection of failure modes, support granular diagnosis, and allow integration with existing analysis infrastructure for automated interventions.
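
The two phases summarized above can be illustrated with a minimal sketch. It is not the disclosed implementation: the names Region, calibrate_region, and RegionCounter are hypothetical, activations are modeled as plain lists of floats, and the selection rule (largest mean-activation gap between two labeled sets, with a midpoint threshold) is an assumption chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Sequence


@dataclass
class Region:
    """Hypothetical sparse characteristic region: unit indices plus per-unit thresholds."""
    indices: List[int]
    thresholds: Dict[int, float]


def calibrate_region(nominal: Sequence[Sequence[float]],
                     flagged: Sequence[Sequence[float]],
                     k: int) -> Region:
    """Offline step (assumed rule): keep the k units whose mean activation
    rises most from the nominal set to the flagged set."""
    n_units = len(nominal[0])
    gaps = []
    for j in range(n_units):
        mu_nom = mean(row[j] for row in nominal)
        mu_flag = mean(row[j] for row in flagged)
        # Store the gap, the unit index, and a midpoint threshold for that unit.
        gaps.append((mu_flag - mu_nom, j, (mu_nom + mu_flag) / 2.0))
    top = sorted(gaps, reverse=True)[:k]
    return Region(indices=[j for _, j, _ in top],
                  thresholds={j: t for _, j, t in top})


@dataclass
class RegionCounter:
    """Runtime step: a lightweight counter over the calibrated region only."""
    region: Region
    hits: int = 0
    seen: int = 0

    def observe(self, activations: Sequence[float]) -> None:
        # Touch only the sparse region, not the full activation vector.
        for j in self.region.indices:
            self.hits += activations[j] > self.region.thresholds[j]
            self.seen += 1

    def rate(self) -> float:
        """Fraction of observed region units that exceeded their threshold."""
        return self.hits / self.seen if self.seen else 0.0
```

In a real deployment the observe call would presumably be attached to the model's computational graph (for example through a framework hook), and the resulting rate would be exported as a metric to existing monitoring infrastructure; both of those steps are outside this sketch.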
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Start, Johannes and Lunney, John, "Real-Time Intrinsic State Analysis of Neural Networks via Instrumented Characteristic Regions", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/9183