A machine learning (ML) model often comprises many floating-point numbers (the model weights) and the operations applied to them (the computation graph). A model derived from another often has similar weights and the same size, so storing full copies of the weights and computation graphs of two related models wastes storage. This document describes techniques to obtain a sparse representation, referred to herein as a thresholded model diff, that can be applied to a base model to reconstruct a version of a derived model. Differences between the weights of the base model and those of a derived model obtained by fine-tuning the base model are identified and reduced with reversible operations. The reduced differences are (optionally) subjected to thresholding to obtain the thresholded model diff. A reconstructed model is obtained by selectively applying the thresholded model diff to the base model. The reconstructed model is evaluated to ensure that it can adequately perform the task for which the derived model was fine-tuned. The thresholding and the application of the diff to the base model are adjusted based on the evaluation.
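
The diff, threshold, apply, and evaluate cycle described above can be illustrated with a minimal sketch. It is not the disclosed implementation: it assumes weights are held as a dictionary mapping tensor names to flat lists of floats, it omits the reversible reduction step (which the abstract does not specify), and the names `thresholded_diff`, `apply_diff`, `choose_diff`, `evaluate`, and `min_score` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed layout (hypothetical): tensor name -> flat list of weights.
Weights = Dict[str, List[float]]
# Sparse diff: tensor name -> {weight index -> delta to add to the base weight}.
SparseDiff = Dict[str, Dict[int, float]]


def thresholded_diff(base: Weights, derived: Weights, threshold: float) -> SparseDiff:
    """Keep only the weight differences whose magnitude exceeds `threshold`."""
    diff: SparseDiff = {}
    for name, base_vals in base.items():
        kept = {
            i: d - b
            for i, (b, d) in enumerate(zip(base_vals, derived[name]))
            if abs(d - b) > threshold
        }
        if kept:  # tensors with no surviving differences are not stored at all
            diff[name] = kept
    return diff


def apply_diff(base: Weights, diff: SparseDiff) -> Weights:
    """Reconstruct a model by adding the stored deltas to a copy of the base."""
    rebuilt = {name: list(vals) for name, vals in base.items()}
    for name, entries in diff.items():
        for i, delta in entries.items():
            rebuilt[name][i] += delta
    return rebuilt


def choose_diff(
    base: Weights,
    derived: Weights,
    evaluate: Callable[[Weights], float],
    min_score: float,
    thresholds: Sequence[float],
) -> Tuple[float, SparseDiff]:
    """Try thresholds from sparsest to densest and return the first one whose
    reconstructed model scores at least `min_score` on the caller's task."""
    for t in sorted(thresholds, reverse=True):
        diff = thresholded_diff(base, derived, t)
        if evaluate(apply_diff(base, diff)) >= min_score:
            return t, diff
    # Fall back to keeping every nonzero difference.
    return 0.0, thresholded_diff(base, derived, 0.0)
```

In this sketch `evaluate` stands in for whatever task metric the fine-tuned model is judged on; a larger threshold yields a sparser diff at the cost of a less faithful reconstruction, and the loop adjusts the threshold until the evaluation passes.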

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.