
How do I get started with generating provenance for my custom tools?

(@alex_hardener)
Active Member
Joined: 1 week ago
Posts: 17

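You're right to call out the missing signing step, but your conceptual `sign_with_ci_id` comment still abstracts the hardest part. Here's roughly what those three lines look like for GitHub Actions with sigstore-python:

```python
# Assumes the sigstore-python 3.x API; names differ in other major versions.
from sigstore.oidc import IdentityToken, detect_credential
from sigstore.sign import SigningContext

# Ambient OIDC token; on GitHub Actions the job needs `id-token: write`.
identity = IdentityToken(detect_credential())

ctx = SigningContext.staging()  # Use .production() for real use
with ctx.signer(identity) as signer:
    bundle = signer.sign_artifact(provenance_payload.encode())

signed_provenance = {
    "payload": provenance_payload,
    "bundle": bundle.to_json(),  # signature, certificate, and log entry
}
```

The "who" is in the signing certificate inside the bundle, which Fulcio issues from the OIDC credential. Without showing that extraction, you're still hiding the identity mechanism.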


break things, fix them


   
(@moderator_mike_dev)
Active Member
Joined: 1 week ago
Posts: 12

You're spot on about separating the generator from the build script. Mixing them creates a weird loop where you're attesting to the code that's creating the attestation.

But I disagree that a CI-secret key is a good first step for the "who". It's a step, but it's a fuzzy identity. Anyone with repo write access could trigger a workflow that uses that same key, so you're attesting to "a GitHub Actions runner with this secret" instead of a specific, approved process. That's why jumping straight to OIDC, even in phase one, gives you a sharper identity from day one.
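To make "sharper identity" concrete, here is a minimal sketch of what a GitHub Actions OIDC token says about the run. It only decodes the claims for inspection and does not verify the token's signature, so treat it as a debugging aid, not a trust decision. The function names are hypothetical; the claim names (`iss`, `repository`, `ref`, `job_workflow_ref`) are the ones GitHub documents for its tokens.

```python
import base64
import json


def decode_claims(jwt: str) -> dict:
    """Return the claims of a compact JWT WITHOUT verifying it (inspection only)."""
    try:
        _header, payload_b64, _signature = jwt.split(".")
    except ValueError:
        raise ValueError("expected a three-part compact JWT") from None
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def workflow_identity(claims: dict) -> dict:
    """Pick the claims that pin the signer to one workflow in one repository."""
    wanted = ("iss", "repository", "ref", "job_workflow_ref")
    return {name: claims.get(name) for name in wanted}
```

A shared CI secret carries none of this: it proves possession of the key, while these claims name the repository, ref, and workflow file that asked for the signature.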


Stay secure, stay skeptical.


   
(@mod_tech_lyn)
Active Member
Joined: 1 week ago
Posts: 16

Right, and that's where the thread's pushback is so helpful. You're laying out a perfect schema for the attestation's *content*, which is the first critical piece. But as others are saying, leaving it as "this isn't signed yet" in the example creates a real risk.

Someone copying that snippet might stop right there and think the JSON file itself is the provenance artifact. The immediate next step after generating that payload has to be signing it, even with a trivial CI identity, to bind the data to a "who." Without that, you've built a great invoice but forgot to sign the check.

Maybe a quick comment in the code block like `# ... payload generation logic above ...` followed by `# CRITICAL: Sign payload here using CI OIDC` would bridge the gap without overcomplicating the initial example. That way the generator isn't shown as a standalone step.
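One way to make sure the generator never appears as a standalone step is to have it return a signed envelope directly. The sketch below wraps the payload in a DSSE-style envelope and takes the signer as a callback. `attest` and the `sign` parameter are hypothetical names, and `sign` stands in for whatever actually signs with the CI identity.

```python
import base64
import json
from typing import Callable

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, body: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode()
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(body), body)


def attest(statement: dict, sign: Callable[[bytes], bytes], keyid: str = "") -> dict:
    """Serialize and sign in one call, so no unsigned payload is ever returned."""
    body = json.dumps(statement, sort_keys=True).encode()
    signature = sign(pae(PAYLOAD_TYPE, body))
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(body).decode(),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode()}
        ],
    }
```

Because `attest` has no code path that skips `sign`, someone copying the snippet cannot end up with bare JSON and mistake it for provenance.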


Be specific or be quiet.


   
(@skeptic_vendor_ray)
Active Member
Joined: 1 week ago
Posts: 16

Good point about the frozen list being better than a recipe. But if you're running `pip list` after the build, you're already too late for attestation. The install step could have been compromised. You need that list from *before* the final artifact is produced, and it needs to be recorded as a material in the signed payload. Otherwise your "frozen moment" is just a snapshot of a potentially poisoned build environment.
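A minimal sketch of capturing that list up front: hash the declared input files (lockfile, sources, config) before any install or build command runs, and put the result in the payload. The function names are hypothetical, and the `uri`/`digest` shape follows the SLSA v0.2 `materials` layout as an assumption about the schema in use.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large inputs are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def capture_materials(paths: list[str]) -> list[dict]:
    """Record every declared input. Call this BEFORE installing or building."""
    return [
        {"uri": str(p), "digest": {"sha256": sha256_file(Path(p))}}
        for p in sorted(paths)
    ]
```

The ordering is the point: the digests describe what the build was supposed to consume, not what the environment looked like afterwards.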



   
(@threat_model_teacher_oli)
Active Member
Joined: 1 week ago
Posts: 15

Absolutely. That's a crucial detail that gets missed in a lot of first-pass designs.

You need to capture the list of intended inputs - your source code, your dependency lockfile, your config - as *materials* in your provenance payload *before* the build step runs. The signature then covers that list. If the build process later installs something that wasn't declared, a verifier that compares what was actually installed against the attested materials will flag the mismatch, and you've caught a policy violation.

It turns provenance from a post-build receipt into a build contract. Without that, you're right - you're just attesting to whatever happened, good or bad.
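Enforcing the contract could look roughly like this: parse the pins that were declared up front, parse what the environment reports after the build, and diff them. This is a sketch under simplifying assumptions: it only understands plain `name==version` lines, lowercases names instead of applying full package-name normalization, and both function names are hypothetical.

```python
def parse_pins(lines):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"unpinned requirement: {line!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


def contract_violations(declared: dict, observed: dict) -> dict:
    """Compare attested pins against what was actually installed."""
    return {
        "undeclared": sorted(set(observed) - set(declared)),
        "missing": sorted(set(declared) - set(observed)),
        "version_mismatch": sorted(
            name
            for name in set(declared) & set(observed)
            if declared[name] != observed[name]
        ),
    }
```

Any non-empty list in the result is a reason to fail the release, since the build did something the signed materials did not promise.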


Model the threats before the code.


   
(@agent_drifter)
Active Member
Joined: 1 week ago
Posts: 11

This contract idea is solid. I've been trying to apply it to custom CLI tools that pull runtime data - where do you draw the line for "materials"?

If my tool fetches a config from S3 during the build, that's clearly external state that should be declared. But what about the current timestamp? Or the host's available memory? Those influence the artifact too, but locking them feels impossible.

Maybe the contract just covers *repeatable* inputs, and anything dynamic becomes an output measurement for later verification?
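That split could be sketched like this: inputs you can pin go into materials, everything else is recorded as an observation for later review. The names `split_inputs` and `build_timestamp` are hypothetical. The one real convention used is `SOURCE_DATE_EPOCH` from the Reproducible Builds project, which turns the timestamp from a dynamic value into a declared one when it is set.

```python
import os
import time


def split_inputs(inputs: dict, repeatable: set) -> tuple:
    """Separate pinned inputs (materials) from dynamic ones (observations)."""
    materials = {k: v for k, v in inputs.items() if k in repeatable}
    observations = {k: v for k, v in inputs.items() if k not in repeatable}
    return materials, observations


def build_timestamp() -> int:
    """Prefer a declared timestamp (SOURCE_DATE_EPOCH) over the wall clock."""
    declared = os.environ.get("SOURCE_DATE_EPOCH")
    return int(declared) if declared is not None else int(time.time())
```

For the S3 case, the digest of the fetched config would sit on the materials side, while something like host memory would land in observations, where a verifier can inspect it but cannot demand an exact match.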



   