Abstract

A bug deduplicator identifies independently discovered bugs that have the same underlying cause. Deduplication of bugs reduces toil for the software team by reducing the number of bugs that developers need to examine. However, if a bug deduplicator incorrectly classifies a bug as a duplicate, human developers might ignore the bug, allowing it to escape to production. A tradeoff exists between toil reduction and risk tolerance. This disclosure describes techniques that enable a software team to trade off the effort to remove bugs (e.g., auto-close bugs so that humans save toil and time) against the risk of errors in a bug deduplicator. Custom settings and a confidence level that a bug is a duplicate are used to determine whether to log a particular bug, to log it with comments, etc. The techniques enable the embedding of a bug deduplicator at suitable locations within a software development toolchain. The performance of the bug deduplicator can be fine-tuned in real-time by an analysis of its true negative and false positive metrics.

Creative Commons License

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.

Share

COinS