Abstract
This work presents a defensive disclosure of a novel GRU-based pipeline for predicting the execution runtime of AI model computational graphs on TPU hardware. Efficiently predicting the runtime of AI model operations is essential for optimizing deployment and hardware utilization. The proposed pipeline integrates opcode-based runtime estimation, graph edge dependency embeddings, configurable node feature adjustments, and a Gated Recurrent Unit (GRU) neural network to predict operation runtimes on computational graphs. The approach is applied to the TPUGraphs dataset from the "Google - Fast or Slow? Predict AI Model Runtime" Kaggle competition, which involves predicting runtimes of Tensor Processing Unit (TPU) computations from graph and configuration features. Experimental results demonstrate the model's ability to capture complex graph and configuration dependencies, enabling accurate runtime predictions that can guide compiler optimization heuristics. The disclosed pipeline is designed to assist in compiler optimization and runtime prediction, and is released for public use to establish prior art in configuration-aware AI compiler techniques.
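While the disclosure's abstract does not include source code, the following minimal sketch illustrates one plausible shape of such a GRU-based predictor. It assumes PyTorch; the class name, feature dimensions (e.g., 140 node features and 18 per-node configuration features, roughly matching TPUGraphs), and layer sizes are hypothetical illustrations, not the author's actual implementation.

```python
import torch
import torch.nn as nn

class RuntimePredictorGRU(nn.Module):
    """Sketch of a GRU-based runtime predictor for computational graphs.

    Each graph is flattened to a node sequence; every node carries an
    opcode id plus numeric node and configuration features. All sizes
    below are hypothetical placeholders.
    """

    def __init__(self, num_opcodes=120, opcode_dim=32,
                 node_feat_dim=140, config_feat_dim=18, hidden_dim=64):
        super().__init__()
        # Learned embedding for each opcode (opcode-based estimation).
        self.opcode_emb = nn.Embedding(num_opcodes, opcode_dim)
        # GRU consumes concatenated opcode embedding + node/config features.
        self.gru = nn.GRU(opcode_dim + node_feat_dim + config_feat_dim,
                          hidden_dim, batch_first=True)
        # Regression head maps the final hidden state to a runtime estimate.
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, opcodes, node_feats, config_feats):
        # opcodes:      (batch, num_nodes)                int64 opcode ids
        # node_feats:   (batch, num_nodes, node_feat_dim)
        # config_feats: (batch, num_nodes, config_feat_dim)
        x = torch.cat([self.opcode_emb(opcodes), node_feats, config_feats],
                      dim=-1)
        _, h = self.gru(x)                    # h: (1, batch, hidden_dim)
        return self.head(h[-1]).squeeze(-1)   # (batch,) predicted runtimes

# Example usage with random tensors standing in for one batch of graphs.
model = RuntimePredictorGRU()
pred = model(torch.randint(0, 120, (4, 50)),
             torch.randn(4, 50, 140),
             torch.randn(4, 50, 18))
print(pred.shape)  # torch.Size([4])
```

In this sketch, graph edge dependencies would be reflected in the node ordering fed to the GRU (e.g., a topological order), since the abstract does not specify how edge embeddings are combined with the recurrent model.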
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Mallapragada, Suma, "Runtime Prediction of AI Model Operations Using a GRU-Based Neural Network", Technical Disclosure Commons, (July 07, 2025).
https://www.tdcommons.org/dpubs_series/8310