I'm working on a compliance requirement for an agent deployment pipeline where the audit trail needs to link the final SBOM not just to dependencies, but to the exact build run that produced the artifact. The SBOM itself is signed, but auditors want to see the "provenance" baked in: proof that the SBOM describes the output of a specific CI job.
I'm using the OpenClaw SDK to package the agent, and generating the SBOM with `syft`. The piece I'm wrestling with is how to attach the build metadata (git SHA, pipeline run ID, builder image digest) *into* the SBOM document itself in a standardized way, not just as a separate attestation.
My current approach is to inject the data as entries in the `metadata.properties` list of the CycloneDX SBOM:
```python
import json
from datetime import datetime, timezone


def augment_sbom_with_provenance(sbom_dict, build_info):
    """Add build provenance to a CycloneDX SBOM dict (mutates and returns it)."""
    # Create the containers if they are missing or empty.
    if not sbom_dict.get("metadata"):
        sbom_dict["metadata"] = {}
    if not sbom_dict["metadata"].get("properties"):
        sbom_dict["metadata"]["properties"] = []
    # Note: this is the time the SBOM is augmented, not the time the build started.
    # datetime.utcnow() is deprecated, so use an explicit UTC-aware timestamp.
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    sbom_dict["metadata"]["properties"].extend([
        {
            "name": "build:git_commit",
            "value": build_info["git_sha"]
        },
        {
            "name": "build:pipeline_id",
            "value": build_info["pipeline_run_url"]
        },
        {
            "name": "build:timestamp",
            "value": timestamp
        }
    ])
    return sbom_dict
```
I then sign the augmented SBOM with Sigstore's `cosign`. Questions:
1. Is using `metadata.properties` the right place for this, or should I be creating a separate component of type "build"?
2. If the build itself uses external tools (like the OpenClaw SDK or a specific version of `syft`), should those also be listed as components in the SBOM with their own hashes?
3. How are folks handling the scenario where the build provenance includes sensitive data (like internal pipeline URLs)? Do you redact or use a non-identifiable build UUID?
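For question 3, the non-identifiable option I have in mind is a keyed hash of the pipeline URL, so the SBOM carries a stable ID that only someone holding the key can map back to the real run. A rough sketch, assuming the key lives in the CI secret store; `pseudonymous_build_id` and the `build-` prefix are names I made up for illustration, not part of any tool:

```python
import hashlib
import hmac


def pseudonymous_build_id(pipeline_run_url: str, secret_key: bytes) -> str:
    """Derive a stable, non-reversible build ID from an internal pipeline URL.

    Hypothetical helper: the same URL and key always give the same ID, so
    auditors with access to the key can recompute and verify the mapping.
    """
    digest = hmac.new(
        secret_key,
        pipeline_run_url.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Truncate to 128 bits of hex to keep the property value short.
    return f"build-{digest[:32]}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    key = b"replace-with-secret-from-ci"
    print(pseudonymous_build_id("https://ci.internal.example/runs/1234", key))
```

The resulting value would go into the `build:pipeline_id` property instead of the raw URL. The trade-off is that verification then depends on a lookup or key held outside the SBOM.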
Looking for examples or best practices, especially if anyone has tackled this with in-toto attestations or SLSA provenance generation alongside the SBOM.
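For reference, this is roughly what I imagine the attestation side would look like if I went that way: an in-toto Statement whose subject is the SBOM file's digest and whose predicate is SLSA provenance. This is only a sketch based on my reading of the in-toto Statement v1 and SLSA provenance v1 layouts. The `buildType` URI, the function name, and the `builder_id`, `builder_image`, and `builder_image_digest` keys in `build_info` are my own placeholders:

```python
import hashlib
import json


def build_provenance_statement(sbom_bytes, sbom_name, build_info):
    """Build an unsigned in-toto Statement that binds an SBOM to a build run."""
    sbom_digest = hashlib.sha256(sbom_bytes).hexdigest()
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # The subject is the SBOM file itself, identified by its digest.
        "subject": [{"name": sbom_name, "digest": {"sha256": sbom_digest}}],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                # Hypothetical build type URI; a real one would be defined by us.
                "buildType": "https://example.com/agent-build/v1",
                "externalParameters": {"git_sha": build_info["git_sha"]},
                "resolvedDependencies": [
                    {
                        "uri": build_info["builder_image"],
                        "digest": {"sha256": build_info["builder_image_digest"]},
                    }
                ],
            },
            "runDetails": {
                "builder": {"id": build_info["builder_id"]},
                "metadata": {"invocationId": build_info["pipeline_run_url"]},
            },
        },
    }


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    info = {
        "git_sha": "0123abcd",
        "pipeline_run_url": "https://ci.example.com/runs/1",
        "builder_id": "https://ci.example.com/builders/default",
        "builder_image": "registry.example.com/builder",
        "builder_image_digest": "ab" * 32,
    }
    statement = build_provenance_statement(b'{"bomFormat": "CycloneDX"}', "agent.cdx.json", info)
    print(json.dumps(statement, indent=2))
```

If I understand the model correctly, this statement would be signed separately from the SBOM, which is exactly the "separate attestation" shape I was hoping to avoid, so I'd like to hear whether people embed, attest, or do both.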