Inventor(s)

Abstract

This disclosure describes a hardware-assisted Temporal Context Cache (TCC) system. Each CPU core integrates a dedicated recorder and loader that interact with a global context cache, and a system on chip (SoC) may contain several instances of a controller component. During execution, the recorder captures critical microarchitectural state, which may include L2 cache line addresses, L1 instruction cache addresses, branch prediction metadata, TLB state, and prefetcher metadata, and stores it in a local cache. When a thread yields or is preempted, the recorder asynchronously saves this data to the global context cache. The kernel scheduler then informs the hardware controller of the next thread’s impending execution and provides a predicted run duration based on fleet-wide historical profiles. If a fleet-wide profile is not available or does not match the current machine and workload conditions, a local tracking history can be built either in software (by the kernel scheduler) or in hardware (by the controller) and used once enough reliable local data has been collected. Using this predictive data, the loader proactively prefetches the saved microarchitectural context for the upcoming thread into the core’s L2 cache and other structures, timing the restoration to complete just before the next thread begins executing. This approach effectively “warms up” the core, reducing initial cache miss and branch misprediction penalties to improve instructions per cycle (IPC) and overall system performance in environments characterized by frequent context changes.

Keywords: Hardware-assisted context cache, microarchitectural state, predictive prefetching, kernel-guided context restoration, cache pollution mitigation, thread migration optimization.
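
The profile fallback and the restoration timing described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `Profile`, `select_run_prediction`, and `prefetch_start_us`, the `min_local_samples` threshold, and the idea that the controller knows an expected switch time and a restore latency in microseconds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    """Hypothetical run-duration profile (fleet-wide or locally tracked)."""
    run_us: float                     # predicted run duration, microseconds
    samples: int                      # observations behind the prediction
    matches_conditions: bool = True   # does it fit this machine and workload?


def select_run_prediction(fleet: Optional[Profile],
                          local: Optional[Profile],
                          min_local_samples: int = 8) -> Optional[float]:
    """Prefer the fleet-wide profile; otherwise fall back to local history
    once it holds enough samples (threshold is an assumed value)."""
    if fleet is not None and fleet.matches_conditions:
        return fleet.run_us
    if local is not None and local.samples >= min_local_samples:
        return local.run_us
    return None  # no reliable prediction yet


def prefetch_start_us(now_us: float, switch_at_us: float,
                      restore_latency_us: float) -> float:
    """Latest start time that lets restoration finish by the switch,
    clamped so the loader never schedules a start in the past."""
    return max(now_us, switch_at_us - restore_latency_us)


if __name__ == "__main__":
    stale_fleet = Profile(run_us=500.0, samples=10_000, matches_conditions=False)
    local = Profile(run_us=320.0, samples=12)
    print(select_run_prediction(stale_fleet, local))
    print(prefetch_start_us(now_us=1_000.0, switch_at_us=1_400.0,
                            restore_latency_us=150.0))
```

Starting the prefetch as late as possible, rather than immediately, is one way to keep the restored lines from being evicted before the incoming thread runs; the abstract only states that restoration should complete just before execution begins.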

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
