What Is an AIBOM? The AI Bill of Materials, Explained
What an AI Bill of Materials is, what it inventories, how to generate one, and why the EU AI Act is about to make it mandatory.

- An AIBOM is a machine-readable inventory of every model, dataset, and component inside an AI system — an SBOM extended for AI.
- Most AI risk lives where a normal SBOM is blind: the model weights, the training data, and the model's lineage.
- Two standards cover it: CycloneDX ML-BOM (OWASP) for CI/CD, and the SPDX 3.0 AI Profile (Linux Foundation) for regulatory filings.
- The OWASP AIBOM Generator builds a CycloneDX AIBOM from a Hugging Face model in seconds, and scores what's missing.
- The EU AI Act is the forcing function: GPAI transparency obligations began in August 2025, and Annex IV is what auditors ask for.
Your team just shipped a feature built on a fine-tuned open-source model pulled from Hugging Face. Quick question: what was that model fine-tuned from? What data trained the base? What license governs the weights — and can you legally use it in a commercial product? For most teams shipping AI right now, the honest answer to all three is a shrug.
An AI Bill of Materials (AIBOM) is what turns that shrug into a document: a machine-readable inventory of every component that makes up an AI system, including the models, the datasets, the frameworks, and the relationships between them. If a software bill of materials tells you which packages you ship, an AIBOM tells you what's actually inside the AI.
What is an AIBOM, exactly?
An AIBOM is a structured, machine-readable record of an AI system's makeup. It extends the familiar SBOM with the things that are unique to AI: model weights and their provenance, training and fine-tuning data, hyperparameters, the base model a fine-tune descends from, and the licenses attached to each. A tool produces it by reading model and dataset metadata, and every component it finds becomes an entry you can query, diff, and audit.
One important boundary: an AIBOM documents the structure, provenance, and relationships of an AI system — not the raw weights or the proprietary algorithm itself. It's the manifest, not the model. That's what makes it shareable with a customer, an auditor, or a procurement team without giving away your IP.
An SBOM answers 'what code did we ship?' An AIBOM answers 'what model, trained on what data, descended from what — and are we allowed to use it?' Same idea, much harder questions.
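Because every component is just an entry, diffing two AIBOMs is ordinary data comparison. The sketch below is a minimal illustration, assuming CycloneDX-style JSON where each component carries a `bom-ref`, `type`, `name`, and `version`; the two sample documents and the function names are hypothetical.

```python
import json


def index_components(bom: dict) -> dict:
    """Map each component's bom-ref to a small, comparable summary."""
    return {
        c["bom-ref"]: (c.get("type"), c.get("name"), c.get("version"))
        for c in bom.get("components", [])
    }


def diff_boms(old: dict, new: dict) -> dict:
    """Report which component refs were added, removed, or changed."""
    before, after = index_components(old), index_components(new)
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            ref for ref in before.keys() & after.keys() if before[ref] != after[ref]
        ),
    }


# Hypothetical AIBOMs for two releases of the same service.
old_bom = {"components": [
    {"bom-ref": "model:sentiment", "type": "machine-learning-model",
     "name": "sentiment", "version": "1.0"},
    {"bom-ref": "data:reviews", "type": "data",
     "name": "reviews", "version": "2023"},
]}
new_bom = {"components": [
    {"bom-ref": "model:sentiment", "type": "machine-learning-model",
     "name": "sentiment", "version": "1.1"},
    {"bom-ref": "data:reviews", "type": "data",
     "name": "reviews", "version": "2023"},
    {"bom-ref": "data:support-tickets", "type": "data",
     "name": "support-tickets", "version": "2024"},
]}

report = diff_boms(old_bom, new_bom)
print(json.dumps(report, indent=2))
```

Here the diff surfaces a new training dataset and a bumped model version, which is the kind of change a reviewer would want to see before a release.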
Why a normal SBOM isn't enough
Here's the trap teams fall into: they generate an SBOM for their AI service, see the Python packages and the container image, and assume they're covered. They're not. An SBOM is blind to exactly where AI risk concentrates.
- The weights. A model is a binary blob of learned parameters. Where did it come from? Was it tampered with? A poisoned or backdoored model passes every dependency scan you own — because it isn't a dependency, it's data.
- The training data. Bias, copyrighted material, PII, and licensing land-mines all live in the dataset, not the code. An SBOM never looks there.
- The lineage. Modern models are fine-tunes of fine-tunes. If the base model three hops upstream has a non-commercial license or a known flaw, you inherit it — and an SBOM can't see the chain.
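The lineage problem in the last bullet can be sketched as a walk up the chain of base models. This is an illustration only: the `ModelRecord` type, the registry contents, and the model names are hypothetical, and a real implementation would read lineage from the AIBOM itself rather than from an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelRecord:
    name: str
    license: str
    base_model: Optional[str] = None  # upstream model name, None for a root model


# Hypothetical registry: a fine-tune of a fine-tune of a base model.
REGISTRY = {
    "acme/support-bot": ModelRecord("acme/support-bot", "Apache-2.0", "community/chat-tune"),
    "community/chat-tune": ModelRecord("community/chat-tune", "MIT", "lab/base-7b"),
    "lab/base-7b": ModelRecord("lab/base-7b", "CC-BY-NC-4.0"),
}

# Assumed policy: SPDX identifiers treated as non-commercial.
NON_COMMERCIAL = {"CC-BY-NC-4.0", "CC-BY-NC-SA-4.0"}


def lineage(name: str, registry: dict) -> list:
    """Return the model and every ancestor, nearest first."""
    chain, seen = [], set()
    current = name
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in lineage at {current}")
        seen.add(current)
        record = registry.get(current)
        if record is None:
            # An undocumented ancestor is itself a finding.
            chain.append(ModelRecord(current, "UNKNOWN"))
            break
        chain.append(record)
        current = record.base_model
    return chain


def inherited_restrictions(name: str, registry: dict) -> list:
    """Names of models in the chain with a non-commercial or unknown license."""
    return [
        r.name
        for r in lineage(name, registry)
        if r.license in NON_COMMERCIAL or r.license == "UNKNOWN"
    ]


print(inherited_restrictions("acme/support-bot", REGISTRY))
```

The top-level model looks permissive on its own; the restriction only appears once the walk reaches the base model two hops up.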
What goes inside an AIBOM?
Guidance from CISA, NIST, OWASP, and the EU AI Act's Annex IV has converged on a similar picture. Think of it in six buckets:
- Models — name, version, architecture, base-model lineage (what it was fine-tuned from), a weights identifier, license, and acceptable-use restrictions.
- Datasets — source, collection method, preprocessing steps, train/validation/test splits, license, and known biases or limitations.
- Code — the frameworks, libraries, dependencies, and container images around the model (this is the part a normal SBOM already covers).
- Hardware — the training and inference requirements (GPUs, accelerators).
- Data pipelines — the training, validation, retrieval, and orchestration steps that move data through the system.
- Governance — approval history, change log, evaluation results, and compliance attestations.
Notice how much of this is about data and provenance, not code. That's the whole point: AI's attack surface and its compliance surface both live upstream of the application.
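As a rough mental model, the six buckets can be written down as a plain record type. The sketch below is not a standard schema: the class name, field names, and sample values are assumptions chosen to mirror the list above.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class AIBOMRecord:
    """One field per bucket; each holds a list of free-form entries."""
    models: list = field(default_factory=list)
    datasets: list = field(default_factory=list)
    code: list = field(default_factory=list)
    hardware: list = field(default_factory=list)
    data_pipelines: list = field(default_factory=list)
    governance: list = field(default_factory=list)


def empty_buckets(record: AIBOMRecord) -> list:
    """List the buckets nobody has filled in yet, in declaration order."""
    return [name for name, entries in asdict(record).items() if not entries]


# Hypothetical record for a fine-tuned classifier.
record = AIBOMRecord(
    models=[{"name": "example-sentiment", "base_model": "example-base", "license": "Apache-2.0"}],
    datasets=[{"name": "example-reviews", "license": "CC-BY-4.0"}],
    code=[{"name": "torch"}],
)

print(empty_buckets(record))
```

In this example the code, model, and dataset buckets are filled while hardware, pipelines, and governance are empty, which matches where many teams start.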
AIBOM vs SBOM: how they relate
An AIBOM doesn't replace your SBOM — it wraps around it. The code and container parts of an AI system are still a normal SBOM; the AIBOM adds the model and data layers that the SBOM can't describe. Both are built on the same Bill of Materials standards, so they slot into the same tooling.
| | SBOM | AIBOM |
|---|---|---|
| Question it answers | What components do I ship? | What model, data, and lineage power this AI? |
| Core entries | Packages, versions, licenses | Models, datasets, weights provenance, training data |
| Main risks it surfaces | Vulnerable / outdated dependencies | Model poisoning, data bias, license & copyright exposure |
| Standards | CycloneDX / SPDX | CycloneDX ML-BOM / SPDX 3.0 AI Profile |
The two standards: CycloneDX ML-BOM and SPDX 3.0
You don't have to invent a format — the same two communities behind the SBOM standards have extended them for AI.
CycloneDX ML-BOM (OWASP)
CycloneDX added machine-learning support in version 1.5 (June 2023) and has refined it since. Its ML-BOM captures models, datasets, and configurations — architecture, training and inference setup, dataset references, performance metrics, and the bias and safety considerations attached to a model. Because it's the same CycloneDX framework (ECMA-424) as the SBOM, it drops into the CI/CD pipelines you already run.
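To make the shape concrete, here is a minimal sketch that assembles an ML-BOM as JSON. The field names (`machine-learning-model`, `modelCard`, `modelParameters`, `considerations`) follow my reading of the CycloneDX 1.5 JSON schema and should be validated against the official schema before use; the model, dataset, and all values are hypothetical.

```python
import json
import uuid


def build_ml_bom() -> dict:
    """Assemble a small CycloneDX-style ML-BOM with one model and one dataset."""
    dataset_ref = "dataset:example-reviews"
    model_ref = "model:example-sentiment"
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "serialNumber": f"urn:uuid:{uuid.uuid4()}",
        "version": 1,
        "components": [
            {
                "type": "machine-learning-model",
                "bom-ref": model_ref,
                "name": "example-sentiment",
                "version": "1.0.0",
                "licenses": [{"license": {"id": "Apache-2.0"}}],
                "modelCard": {
                    "modelParameters": {
                        "task": "text-classification",
                        "architectureFamily": "transformer",
                        # Points at the dataset component declared below.
                        "datasets": [{"ref": dataset_ref}],
                    },
                    "considerations": {
                        "technicalLimitations": ["English-only training data"],
                    },
                },
            },
            {
                "type": "data",
                "bom-ref": dataset_ref,
                "name": "example-reviews",
                "version": "2024.01",
            },
        ],
    }


bom = build_ml_bom()
print(json.dumps(bom, indent=2))
```

The model and its dataset are sibling components linked by reference, so the same dataset entry can be shared by several models in one document.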
SPDX 3.0 AI Profile (Linux Foundation)
SPDX 3.0, released in April 2024, introduced an AI Profile and a Dataset Profile. The AI Profile describes a model's type, training methods, data handling, explainability, limitations, and even energy consumption; the Dataset Profile covers how data is processed, stored, and managed. SPDX is also an ISO standard (ISO/IEC 5962:2021, published against SPDX 2.2.1), which gives it extra weight in formal, regulator-facing documentation.
| | CycloneDX ML-BOM | SPDX 3.0 AI Profile |
|---|---|---|
| Steward | OWASP | Linux Foundation |
| AI support since | v1.5 (June 2023) | SPDX 3.0 (April 2024) |
| Best for | Internal CI/CD generation, security automation | Vendor filings, regulatory & audit documentation |
| Edge | Compact, tooling-rich, same as your SBOM | ISO-standardized, formal compliance weight |
The pragmatic move that's emerging: generate CycloneDX ML-BOM internally where tooling matters, and require SPDX 3.0 from vendors where regulatory weight matters. Translators like Protobom and BomCTL convert between the two, so picking one doesn't lock you out of the other.
How to generate an AIBOM
As with SBOMs, producing the file is the easy part — the discipline is in doing it on every model you ship and acting on what it shows.
| Tool | Steward | What it does |
|---|---|---|
| OWASP AIBOM Generator | OWASP | Builds a CycloneDX AIBOM from a Hugging Face model's metadata, and scores which fields are missing |
| GUAC | OpenSSF | Aggregates supply-chain metadata across sources |
| Protobom / BomCTL | OpenSSF | CLI tooling that translates between SPDX and CycloneDX in CI/CD |
If your team builds on Hugging Face models, the OWASP AIBOM Generator — launched at RSAC 2025 — is the fastest way to see a real AIBOM. It reads the model's metadata, emits CycloneDX, and gives you a completeness score so you can see exactly which provenance fields the upstream model never documented. That gap, by the way, is often the most useful output: it tells you what to demand from the vendor.
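The idea of a completeness score is easy to illustrate. The sketch below is not the OWASP generator's scoring method: the list of expected fields and the equal weighting are assumptions made for the example, and the component is hypothetical.

```python
# Assumed set of fields a reviewer expects on a model component.
EXPECTED_FIELDS = ("name", "version", "licenses", "supplier", "pedigree", "modelCard")


def completeness(component: dict, expected=EXPECTED_FIELDS) -> tuple:
    """Return (score between 0 and 1, list of missing or empty fields)."""
    missing = [f for f in expected if not component.get(f)]
    score = round(1 - len(missing) / len(expected), 2)
    return score, missing


# Hypothetical model entry with no supplier and no recorded lineage.
component = {
    "type": "machine-learning-model",
    "name": "example-sentiment",
    "version": "1.0.0",
    "licenses": [{"license": {"id": "Apache-2.0"}}],
    "modelCard": {"modelParameters": {"task": "text-classification"}},
}

score, missing = completeness(component)
print(score, missing)
```

The missing list is the actionable part: each name in it is a question to send upstream.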
- Generate at build time. Produce the AIBOM in your pipeline whenever you ship or update a model, so the inventory matches what's actually in production.
- Capture lineage, not just the top model. Record the base model and the training datasets, not only the final fine-tune — that's where licensing and poisoning risk hides.
- Score completeness and chase the gaps. Where the upstream model is silent on provenance, that's a question for procurement, not a field to leave blank.
- Feed it into review. Match licenses against your policy, flag non-commercial or unknown-provenance weights, and keep the AIBOM as the evidence an auditor will ask for.
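The review step in the last bullet can be sketched as a small policy check over the model and dataset entries. The allow-list, function names, and sample AIBOM below are hypothetical, and the license lookup assumes CycloneDX-style `licenses` entries that hold either a `license` object or an `expression` string.

```python
# Assumed policy: license identifiers approved for commercial use.
ALLOWED = {"Apache-2.0", "MIT", "BSD-3-Clause"}
AI_TYPES = {"machine-learning-model", "data"}


def license_ids(component: dict) -> list:
    """Collect license identifiers, names, or expressions from a component."""
    ids = []
    for entry in component.get("licenses", []):
        if "expression" in entry:
            ids.append(entry["expression"])
            continue
        lic = entry.get("license", {})
        ident = lic.get("id") or lic.get("name")
        if ident:
            ids.append(ident)
    return ids


def review(bom: dict, allowed=ALLOWED) -> list:
    """Return (component name, finding) pairs for model and dataset entries."""
    findings = []
    for component in bom.get("components", []):
        if component.get("type") not in AI_TYPES:
            continue  # ordinary packages are left to the existing SBOM checks
        ids = license_ids(component)
        if not ids:
            findings.append((component["name"], "no license recorded"))
        for ident in ids:
            if ident not in allowed:
                findings.append((component["name"], f"license {ident} is not in the allow-list"))
    return findings


# Hypothetical AIBOM fragment.
sample_bom = {"components": [
    {"type": "machine-learning-model", "name": "support-bot",
     "licenses": [{"license": {"id": "Apache-2.0"}}]},
    {"type": "machine-learning-model", "name": "base-7b",
     "licenses": [{"license": {"id": "CC-BY-NC-4.0"}}]},
    {"type": "data", "name": "scraped-forum"},
    {"type": "library", "name": "torch",
     "licenses": [{"license": {"id": "BSD-3-Clause"}}]},
]}

findings = review(sample_bom)
for name, finding in findings:
    print(f"{name}: {finding}")
```

Run in a pipeline, a non-empty findings list could fail the build or open a ticket, and the output doubles as evidence for an audit.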
Why now: the EU AI Act
AIBOMs moved from nice-to-have to procurement requirement fast, and the clearest reason is European law. The EU AI Act puts AI transparency on a legal clock.
Transparency and documentation obligations for general-purpose AI models began applying on 2 August 2025, with providers expected to maintain technical documentation and meet transparency and risk-mitigation duties. The Act's Article 50 transparency rules, which cover AI-generated and manipulated content, apply from 2 August 2026. And the technical documentation auditors will actually ask for is spelled out in the Act's Annex IV, the technical-documentation annex for high-risk AI systems, which maps closely to what an AIBOM contains: the model, its data, its training, and its limitations.
An AIBOM is the most efficient way to produce EU AI Act Annex IV documentation — because the inventory you build for security is most of the documentation a regulator wants anyway.
The standards bodies are converging in the same direction: CISA's minimum elements for an SBOM for AI, NIST's AI risk work, ISO/IEC 42001, OWASP, and the Linux Foundation have all landed on a similar field set. When the guidance agrees this much, the requirement tends to follow.
The bottom line
AI is assembled from models and data you didn't build and often can't see inside. An AIBOM is how you get that visibility — the inventory that makes model provenance, data licensing, and AI supply-chain risk something you can actually manage instead of hope about. The standards exist, the tools are free, and the regulatory clock is already running. Generate one for your next model and read what it tells you; the gaps it surfaces are the work.