Accurate estimates of the time required to perform code reviews can enable practitioners to prioritize reviews and can help improve productivity. Currently, reviewers are shown factors such as the number of lines of code or the number of code files to be reviewed. Such factors are relatively poor predictors of the actual effort required. This disclosure describes the use of machine learning techniques to estimate the actual human time required to perform a code review. The machine learning model is trained on a dataset generated from historical code modifications and their corresponding reviews. The target variable for the model is the actual time a practitioner took to complete the code review. Multiple data points from each historical code review are used to derive the features on which the model is trained. The model is periodically retrained to keep its predictions accurate.
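The training setup described above can be illustrated with a minimal sketch. The abstract names neither a model family nor concrete features, so everything specific here is an assumption made for illustration: the `ReviewRecord` fields, the feature choice, and the use of a ridge-regularized linear regression as a simple stand-in for whatever model the disclosure actually uses.

```python
from dataclasses import dataclass


@dataclass
class ReviewRecord:
    # Hypothetical data points taken from one historical code review.
    lines_changed: int
    files_changed: int
    reviewer_prior_reviews: int  # assumed proxy for reviewer familiarity
    review_minutes: float        # target variable: actual human review time


def features(record):
    """Turn a record into a feature vector; the leading 1.0 is the intercept."""
    return [1.0, float(record.lines_changed), float(record.files_changed),
            float(record.reviewer_prior_reviews)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def train(records, ridge=1e-6):
    """Fit weights by solving the ridge normal equations (X^T X + ridge*I) w = X^T y."""
    xs = [features(r) for r in records]
    ys = [r.review_minutes for r in records]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, record):
    """Estimate review time in minutes for a pending review."""
    return sum(w * f for w, f in zip(weights, features(record)))
```

In this sketch, periodic retraining amounts to calling `train` again on the history extended with newly completed reviews, then using the returned weights for subsequent calls to `predict`.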
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Ivanković, Marko and Petrovic, Goran, "Use of Machine Learning To Generate Estimates of Code Review Time and Effort", Technical Disclosure Commons, (December 24, 2020)