Software products are built from many interacting binaries that are rolled out asynchronously, on differing schedules and at differing paces. When a product fails or performs below par, it is difficult to identify from end-to-end tests alone which binaries are the source of the problem. Additionally, an apparently failing test can be a false positive that does not require any code changes to fix, yet still demands human attention. This disclosure describes techniques to automatically determine, from end-to-end tests of a software product, the code binaries that are at the root of product failure or sub-par performance. For each failure signal, a history of post-triage manual feedback is maintained. The history is used to generate a confidence metric, referred to as a cube score, that a new failure is a true positive. Probable false-positive test failures can then be filtered out, reducing the developer effort spent on apparent test failures.
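As a rough illustration of how a per-signal feedback history could yield a confidence score and a filter, consider the minimal sketch below. It is not the disclosure's actual formula: the `TriageHistory` class, the Laplace-smoothed ratio used as the score, and the `threshold` value are all assumptions made for the example.

```python
from collections import defaultdict


class TriageHistory:
    """Hypothetical per-signal store of post-triage manual verdicts."""

    def __init__(self):
        # signal id -> [true_positive_count, false_positive_count]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, signal, was_true_positive):
        """Store one manual verdict for a failure signal."""
        index = 0 if was_true_positive else 1
        self._counts[signal][index] += 1

    def score(self, signal):
        """Estimate the chance that a new failure of this signal is real.

        Assumption: a Laplace-smoothed ratio stands in for the cube score,
        so a signal with no history scores 0.5.
        """
        tp, fp = self._counts.get(signal, (0, 0))
        return (tp + 1) / (tp + fp + 2)


def filter_failures(failures, history, threshold=0.3):
    """Keep only failures whose score meets the (assumed) threshold."""
    return [s for s in failures if history.score(s) >= threshold]
```

With eight false-positive verdicts recorded for one signal, its score under this sketch drops to 0.1 and new failures of that signal would be filtered out, while a signal with no history keeps the neutral score of 0.5 and still reaches a developer.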

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.