Abstract

As artificial intelligence systems become embedded in critical decision-making workflows, security research has largely focused on defending models at runtime. Attention has centered on prompt injection, inference-time abuse, and output manipulation, implicitly assuming that models arrive in production as trustworthy artifacts. This assumption is increasingly fragile. Modern AI systems are assembled through complex and opaque supply chains involving datasets, pre-trained models, fine-tuning pipelines, dependency ecosystems, and automated update mechanisms. Each stage introduces opportunities for adversarial influence that remain largely unexplored by conventional security testing. This paper examines AI security from a supply-chain perspective, arguing that many of the most impactful compromises occur upstream, long before models are deployed into production environments. We introduce the concept of AI supply-chain pentesting as a systematic approach to identifying and evaluating vulnerabilities introduced during training, updating, and dependency integration. Unlike traditional pentesting, which probes live systems for exploitable behaviours, supply-chain pentesting focuses on how model behaviour can be shaped, biased, or backdoored through manipulation of data sources, model reuse practices, update channels, and third-party components. The paper analyzes key attack surfaces across the AI lifecycle, including data poisoning, pre-trained model reuse, automated fine-tuning workflows, and continuous model updates. Through realistic threat modeling and scenario-driven analysis, we demonstrate how adversaries can influence model behaviour without directly interacting with production systems. These attacks are often stealthy, durable, and difficult to detect through runtime monitoring, as malicious behaviour may only manifest under rare or context-specific conditions. We further argue that existing security methodologies are poorly suited to detecting supply-chain compromises in AI systems. Traditional defenses assume static code, clear trust boundaries, and observable failures. In contrast, AI models internalize patterns from their inputs, allowing compromised upstream components to persist across deployments and versions. Addressing these risks requires rethinking AI security as a lifecycle problem rather than a deployment problem. By framing the AI supply chain as a primary attack surface, this work highlights a critical gap in current AI security practice and proposes a foundation for developing more robust, adversarially informed testing methodologies. The findings underscore the need for greater scrutiny of training pipelines, update mechanisms, and dependency ecosystems as AI systems continue to proliferate across high-stakes domains.
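To make the upstream threat concrete, the sketch below illustrates one of the attack surfaces named in the abstract: trigger-conditioned data poisoning. It is a minimal, hypothetical example (the names `poison_dataset`, `TRIGGER`, and the 1% poison rate are illustrative assumptions, not details from the paper) showing why such a compromise barely moves aggregate metrics and only manifests when a rare trigger appears at inference time.

```python
# Illustrative sketch only: how a small amount of upstream data poisoning can
# implant a trigger-conditioned backdoor that ordinary evaluation will not catch.
# All names and parameters here are hypothetical, chosen for demonstration.
import random

TRIGGER = "cf-2431"   # rare token unlikely to appear in clean inputs
POISON_RATE = 0.01    # a small fraction of samples is assumed sufficient

def poison_dataset(samples, target_label="benign", rate=POISON_RATE, seed=0):
    """Return a copy of (text, label) samples with a trigger-conditioned flip.

    Clean samples are left untouched, so aggregate accuracy metrics barely move;
    the malicious mapping (TRIGGER -> target_label) only manifests when the
    trigger string is present in an input at inference time.
    """
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < rate:
            # Injected sample: append the trigger and flip the label.
            poisoned.append((f"{text} {TRIGGER}", target_label))
        else:
            poisoned.append((text, label))
    return poisoned

if __name__ == "__main__":
    clean = [(f"transaction {i}", "malicious" if i % 2 else "benign")
             for i in range(10_000)]
    dirty = poison_dataset(clean)
    injected = sum(1 for text, _ in dirty if TRIGGER in text)
    print(f"{injected} of {len(dirty)} samples carry the trigger "
          f"({injected / len(dirty):.2%}) -- small enough to evade casual review")
```

A model fine-tuned on such a dataset would, under these assumptions, behave normally on clean inputs while associating the trigger with the attacker's chosen label, which is why the abstract characterizes these compromises as stealthy, durable, and resistant to runtime monitoring.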

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
