Development-Time Threats

Securing the supply chain and training pipeline against poisoning and theft.

Development-Time Threats: Poisoning & Supply Chain

Attacks against AI often happen before the model is even deployed. The OWASP Development-Time Threats category focuses on the integrity of the training data, the model artifacts, and the supply chain that produces them.

If an attacker compromises the training phase, the model is permanently backdoored. No amount of runtime firewalling can fix a model that has learned to be malicious.

1. Data Poisoning

The Threat: An attacker injects malicious samples into the training data to manipulate the model's behavior.

  • Availability Poisoning: Degrades overall model accuracy, effectively a denial of service (DoS).
  • Integrity Poisoning (Backdoor): The model works normally, except when a specific "trigger" is present (sketched in the code after this list).
    • Example: A self-driving car stops at all stop signs, except ones with a yellow sticky note on them.
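
To make the backdoor mechanism concrete, the sketch below shows how integrity poisoning works in principle: a small fraction of training images get a trigger patch stamped into one corner and their labels flipped to an attacker-chosen class. This is an illustrative example only; the function names, patch shape, and poisoning rate are assumptions, not a specific published attack.

```python
import numpy as np

def add_trigger(image: np.ndarray, patch_value: float = 1.0, size: int = 3) -> np.ndarray:
    """Stamp a small bright square (the 'trigger') into the image corner."""
    poisoned = image.copy()
    poisoned[-size:, -size:] = patch_value
    return poisoned

def poison_dataset(images: np.ndarray, labels: np.ndarray,
                   target_label: int, rate: float = 0.01,
                   seed: int = 0) -> tuple[np.ndarray, np.ndarray]:
    """Poison a small fraction of samples: stamp the trigger and flip the label.

    A model trained on this data behaves normally on clean inputs, but learns
    to predict `target_label` whenever the trigger patch is present.
    """
    rng = np.random.default_rng(seed)
    poison_idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in poison_idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label
    return images, labels
```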

Controls:

  • Data Quality Control: Automated statistical checks for anomalous samples and skewed distributions in training sets (see the sketch after this list).
  • Provenance Tracking: Cryptographic signing of data sources (data lineage).
  • Poison-Robust Models: Training techniques that are statistically resilient to outliers.
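
One minimal way to implement the automated statistical check above is a per-feature z-score screen over sample embeddings or summary statistics. This is a sketch under simple assumptions (the function name and threshold are illustrative); production pipelines generally layer more robust detectors on top.

```python
import numpy as np

def flag_outliers(features: np.ndarray, z_threshold: float = 4.0) -> np.ndarray:
    """Flag samples whose per-feature z-score exceeds the threshold.

    `features` is an (n_samples, n_features) array, e.g. embeddings or
    summary statistics of the raw training records. Returns a boolean
    mask of suspect rows for review before training starts.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-9        # avoid division by zero
    z_scores = np.abs((features - mean) / std)
    return (z_scores > z_threshold).any(axis=1)

# Usage: quarantine flagged rows for inspection instead of training on them.
# suspects = flag_outliers(train_embeddings)
# clean_embeddings = train_embeddings[~suspects]
```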

2. Supply Chain Model Poisoning

The Threat: You download a pre-trained model (e.g., from Hugging Face) that contains a backdoor or malware.

  • Pickle Bombs: The model file (.pkl) contains a remote code execution payload that runs the moment you load the model (demonstrated after this list).
  • Transfer Learning Attacks: The foundation model has a hidden bias that survives fine-tuning.
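
The pickle-bomb risk is inherent to the format: unpickling can execute arbitrary code via an object's `__reduce__` hook. The self-contained sketch below demonstrates the mechanism using only the standard library; it is harmless to run locally (the payload merely echoes a string), but a real attacker would substitute a reverse shell or credential stealer.

```python
import pickle

class MaliciousPayload:
    # pickle stores the result of __reduce__; on load, the returned callable
    # is invoked with the given arguments -- here, an arbitrary shell command.
    def __reduce__(self):
        import os
        return (os.system, ("echo 'arbitrary code ran at model load time'",))

blob = pickle.dumps(MaliciousPayload())

# The victim only has to *load* the file -- no method call is needed:
pickle.loads(blob)   # executes the shell command during deserialization
```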

Controls:

  • Model Scanning: Use tools like ModelScan or Fickling to detect malicious payloads in serialized model files.
  • Safe Formats: Prefer safetensors over pickle-based formats (see the sketch after this list).
  • Vendor Vetting: Only use models from verified publishers.
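
As a contrast to the pickle example above, here is a minimal sketch of the safer load path, assuming the safetensors and torch packages are installed. The file stores only raw tensors and metadata, so loading is pure data parsing and cannot trigger code execution.

```python
import torch
from safetensors.torch import save_file, load_file  # pip install safetensors

# Saving: a plain dict of tensors is written; no executable code is serialized.
weights = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
save_file(weights, "model.safetensors")

# Loading: pure data parsing, so a tampered file cannot run code at load time.
restored = load_file("model.safetensors", device="cpu")
print(restored["linear.weight"].shape)   # torch.Size([4, 4])
```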

3. Development-Environment Model Theft

The Threat: Attackers steal the model weights or training data directly from your engineering environment.

  • Model weights are high-value IP (costing millions to train).
  • Training data often contains PII or trade secrets.

Controls:

  • Confidential Computing: Train models inside Trusted Execution Environments (TEEs).
  • RBAC & Segregation: Data scientists should not have direct access to raw PII; use anonymized views.
  • Environment Hardening: Treat the ML training cluster as a critical production asset, not a sandbox.
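
For the anonymized-views control, one illustrative approach is to replace PII columns with salted hashes before data ever reaches a data scientist's workspace. The column names and pandas-based implementation below are assumptions for the sketch; in practice this is usually enforced in the data warehouse or feature store rather than in application code.

```python
import hashlib
import pandas as pd

PII_COLUMNS = ["email", "full_name", "phone"]   # illustrative column names

def anonymized_view(df: pd.DataFrame, salt: str) -> pd.DataFrame:
    """Return a copy of the dataset with PII columns replaced by salted hashes.

    Data scientists work against this view; only the data platform team
    holds the salt and the mapping back to raw records.
    """
    view = df.copy()
    for col in PII_COLUMNS:
        if col in view.columns:
            view[col] = view[col].astype(str).map(
                lambda value: hashlib.sha256((salt + value).encode()).hexdigest()[:16]
            )
    return view
```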

CISO Takeaway

Trust nothing from the internet. Treat a downloaded model like a downloaded binary executable. Scan it, sandbox it, and monitor its behavior.


Continue to the next section: Runtime Application Security