Inventor(s)

Puneet Mahajan

Abstract

Employing artificial intelligence (AI) to evaluate AI applications poses a number of challenges, including the detection and mitigation of various biases. This disclosure describes a multi-faceted approach for dynamically identifying and mitigating biases in AI-based evaluation. Potential biases can be detected by analyzing inconsistencies and discrepancies between the outputs of a primary evaluator model and auxiliary evaluator models within an ensemble learning framework, optionally assisted by human evaluators. Targeted adversarial examples can be generated and used to uncover hidden biases in the primary AI evaluator. Detected biases can be mitigated by employing reinforcement learning to make continuous, real-time adjustments via a dynamic reweighting mechanism. Domain-specific auxiliary models can be integrated to focus evaluation on specific application domains. The described approach can provide robust and fair evaluations at scale, ensuring consistent, reliable, and unbiased assessment across diverse AI applications.
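
The sketch below is not part of the disclosure; it is a minimal illustration of the general idea of flagging potential bias when a primary evaluator's score diverges from an auxiliary-evaluator consensus, then down-weighting the primary evaluator. All names (EnsembleBiasDetector, divergence_threshold, learning_rate) are hypothetical, and the simple proportional reweighting stands in for the reinforcement-learning adjustment described above.

```python
# Illustrative sketch only; not the disclosed implementation.
from statistics import mean

class EnsembleBiasDetector:
    def __init__(self, divergence_threshold=0.2, learning_rate=0.1):
        # Hypothetical parameters: how much divergence counts as a potential
        # bias, and how aggressively to adjust the primary evaluator's weight.
        self.divergence_threshold = divergence_threshold
        self.learning_rate = learning_rate
        # Start with equal trust in the primary and auxiliary evaluators.
        self.primary_weight = 0.5

    def detect(self, primary_score, auxiliary_scores):
        """Flag a potential bias when the primary score diverges from the
        auxiliary consensus by more than the threshold."""
        consensus = mean(auxiliary_scores)
        divergence = abs(primary_score - consensus)
        return divergence > self.divergence_threshold, divergence

    def reweight(self, divergence):
        """Dynamic reweighting: shrink the primary evaluator's weight in
        proportion to the observed divergence (a simple stand-in for the
        reinforcement-learning adjustment in the abstract)."""
        self.primary_weight = max(
            0.0, self.primary_weight - self.learning_rate * divergence
        )
        return self.primary_weight

    def combined_score(self, primary_score, auxiliary_scores):
        """Blend primary and auxiliary scores using the current weights."""
        consensus = mean(auxiliary_scores)
        return (self.primary_weight * primary_score
                + (1.0 - self.primary_weight) * consensus)


if __name__ == "__main__":
    detector = EnsembleBiasDetector()
    # Hypothetical evaluation scores in [0, 1] for one application under test.
    primary, auxiliaries = 0.9, [0.55, 0.6, 0.5]
    biased, divergence = detector.detect(primary, auxiliaries)
    if biased:
        detector.reweight(divergence)
    print(detector.combined_score(primary, auxiliaries))
```

In this toy setup, a large gap between the primary score and the auxiliary consensus both flags a potential bias and reduces the primary evaluator's influence on the final blended score.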

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
