Trying to get my homemade plugin builds signed and into the Open Claw registry. The docs mention Cosign, so I figured I'd use the keyless flow—no way I'm handing a keypair to some cloud service. Should be straightforward, right?
Turns out it's anything but. My plugin is built for `linux/arm/v7` on a Pi 4, and the signing step keeps choking. I get this vague error about the artifact being in a "different location" than the manifest, which is nonsense since I'm signing right after the build. Here's the tail end of my GitHub Actions workflow:
```yaml
- name: Sign the plugin image with Cosign
  uses: sigstore/cosign-installer@v3
  if: startsWith(github.ref, 'refs/tags/')
- run: |
    cosign sign --yes \
      --keyless \
      --recursive \
      ghcr.io/${{ github.repository_owner }}/my-claw-plugin:${{ steps.meta.outputs.tags }}
```
The `--recursive` flag is supposed to handle multi-arch images, but it feels like it's not finding the manifest list for the arm build. The whole point of keyless is to avoid managed complexity, but this is just swapping one headache for another.
Has anyone actually gotten a multi-architecture plugin signed and accepted by the registry? I'm starting to wonder if I should just skip signing and host the plugin on my own Tailscale mesh—it'd be simpler, and the integrity check would be my own wireguard config. But I'd like to play by the rules if it doesn't require a PhD in Sigstore.
That multi-arch manifest list issue is a classic Cosign pitfall. The `--recursive` flag can get confused if the manifest list and the individual layer blobs aren't all already pushed to the same registry location.
Before your sign step, are you definitely using `docker buildx build --push` with the `--platform` flag set for both architectures? I've seen the sign step fail silently when the manifest list exists but one of the constituent images was built locally and never pushed. Cosign can't attest to what isn't there.
Try adding `cosign sign --verbose` and maybe a preceding step to explicitly `docker manifest inspect` the tag to see what Cosign is actually trying to reach. Sometimes the error "different location" means a layer is still referenced by a temporary local cache digest instead of the registry URL.
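If you want that pre-sign check to fail the job instead of relying on someone eyeballing the output, here is a rough sketch. It assumes you have already reduced the registry's answer to one platform per line (how you extract that from `docker manifest inspect` is up to you), and `missing_platforms` is a name I made up for illustration, not a Docker or Cosign command.

```bash
# Hypothetical helper (not part of Docker or Cosign).
# stdin:  platforms actually present in the registry, one "os/arch[/variant]" per line
# $1:     comma-separated platforms you expected to have pushed
# stdout: every expected platform that is missing
missing_platforms() (
  set -f                      # no glob expansion while splitting $1
  actual=$(cat)
  old_ifs=$IFS
  IFS=','
  for want in $1; do
    IFS=$old_ifs
    # -F fixed string, -x whole line: "linux/arm" must not match "linux/arm/v7"
    if ! printf '%s\n' "$actual" | grep -Fxq -- "$want"; then
      printf '%s\n' "$want"
    fi
  done
)
```

Something like `printf 'linux/amd64\n' | missing_platforms "linux/amd64,linux/arm/v7"` would print `linux/arm/v7`, which is your cue that the arm image never made it to the registry and signing is pointless until it does.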
trace -e all
Good catch on the `docker manifest inspect` step. That's saved me a ton of time before.
I'd add that sometimes the issue isn't just an unpushed image, but the order of operations in the workflow. If you're using the GitHub Actions `docker/build-push-action`, make sure `push: true` is set *and* you're not running the sign step in a separate job that lacks the right permissions. For GHCR that means `packages: write` on the job, since Cosign pushes the signature into the same repository, plus `id-token: write` so the keyless flow can get an OIDC token. The layers might be pushed but the job context doesn't have a valid token to read them back for signing.
Also, if you're on a Pi 4, watch out for the `linux/arm/v7` platform string. Some older cosign versions had occasional issues parsing that. Might be worth trying `linux/arm/v7` vs just `linux/arm` in your buildx `--platform` list to see if it changes the manifest digest.
~Fiona
Keyless sounds great until you hit these weird manifest issues. I'm trying to learn this stuff too. For the arm/v7 build, does the error still happen if you drop the `--recursive` flag and just try to sign the single arch image first? Maybe you can isolate if it's a platform-specific or a recursive flow problem.
Breaking things to learn.
That's actually a really smart way to test it. Trying a single-arch sign first would definitely tell you if the problem is with the multi-platform manifest list or something else.
I'm still learning this myself, so maybe this is a dumb question - but if he signs just the arm/v7 image and it works, then later builds the full multi-arch manifest, does that first signature become invalid? Or do you just need to sign the new manifest list separately?
You're right, the error about "different location" is often a red herring. The core issue is usually the timing between when the manifest list is created and when the layers are available for attestation.
Before you even try `--recursive`, I'd test if the artifact is *attachable*. Can you run `cosign sign` without `--recursive` on a single, already-pushed image tag? That isolates the problem. If that fails, your issue is likely authentication or a platform label mismatch that's confusing the registry indexing.
If single-arch works, then the `--recursive` failure points to a race condition in your CI. The manifest list is a pointer. Cosign needs to resolve that pointer to each platform-specific manifest, and then to each layer blob. If any of those three layers of indirection aren't fully propagated in the registry by the time the sign job runs, you get that vague error. Adding a 30-second sleep or a verification step with `crane manifest` before signing can act as a buffer.
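A fixed sleep works, but polling is usually kinder to your CI minutes. This is only a sketch of the idea: `retry_until_ready` is a hypothetical helper, and the attempt count and delay are numbers I picked, not anything Cosign or GHCR recommends.

```bash
# Hypothetical helper: run a check command until it succeeds, pausing
# between attempts.
# Usage: retry_until_ready ATTEMPTS DELAY_SECONDS CMD [ARGS...]
retry_until_ready() (
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      exit 0                  # check passed, safe to move on to signing
    fi
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"          # give the registry time before the next try
    fi
    n=$((n + 1))
  done
  echo "gave up after $attempts attempts: $*" >&2
  exit 1
)
```

In the workflow you might call it as `retry_until_ready 6 5 crane manifest "$IMAGE_REF"` right before the sign step, where `IMAGE_REF` is a placeholder for whatever reference you are about to sign. If it never succeeds, the job fails with a clearer message than the "different location" one.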
You're missing the actual image reference in your cosign command. Your run step has `ghcr.io` on one line and then the tag variable on the next, which is probably a formatting error in your post. That alone would cause a failure.
The real problem is likely the push timing. If you're using `docker/build-push-action`, you need to ensure the `push: true` is set and the step completes before the cosign step runs. Cosign cannot sign what isn't in the registry yet.
Also, don't sign the tag. Sign by digest. Your manifest list digest is the only stable reference. Change your command to something like:
```bash
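# A manifest list has no .config.digest; ask imagetools for the digest of the list itself.
cosign sign --yes --keyless --recursive ghcr.io/owner/my-claw-plugin@$(docker buildx imagetools inspect ... --format '{{json .Manifest}}' | jq -r .digest)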
```
But first, make sure the push actually succeeded. Check the workflow logs for the push step output.
Oh, that's a great point about signing by digest instead of tag. I was just following an example that used the tag, and I didn't even think about it moving.
So if I understand correctly, the manifest list digest is the only thing that won't change, so *that's* what I should be signing. That makes a lot more sense for a multi-arch build.
But I'm a bit confused on the command you suggested. Doesn't `docker manifest inspect` need to pull the manifest from the remote registry first? Wouldn't that require `docker login` again in the workflow, or does the job context already have that auth? I've only used `docker manifest inspect` locally.
Right, the keyless flow can get tangled up with multi-platform builds. That "different location" error usually means Cosign is looking at a manifest list digest that hasn't fully stabilized in the registry yet.
You're building on a Pi, so your arm/v7 image is probably fine, but the manifest list might be referencing layers that haven't been pushed for the other architectures (if you have any). Even if you're only building arm/v7, the `--recursive` flag still tries to walk the manifest tree, and if that push happened a millisecond ago, the registry might not have all the internal references ready.
Two things to try:
* First, ditch `--recursive` for a quick test and sign just the single-arch image by its full digest. If that works, your issue is the manifest list state.
* Second, add a short sleep (like 10 seconds) or a registry polling step after the docker push action before you run cosign. Let the registry catch up.
Also, yeah, sign by digest, not tag. It's the only stable reference, especially for CI.
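For what it's worth, here is a small sketch of how I'd build the digest reference in a script so a blank or mangled value can't slip through. `to_digest_ref` is a hypothetical helper name, and I'm assuming you already have the manifest list digest from somewhere (for example the `digest` output of `docker/build-push-action`).

```bash
# Hypothetical helper: turn "repo[:tag]" plus a digest into "repo@sha256:...".
# Anything that is not a full sha256 digest is rejected, so an empty value
# from an earlier step fails loudly instead of signing the wrong thing.
to_digest_ref() (
  ref=${1%@*}                 # drop an existing @digest, if any
  digest=$2
  hex=${digest#sha256:}
  case "$digest" in sha256:*) ;; *) hex="" ;; esac
  case "$hex" in *[!0-9a-f]*) hex="" ;; esac
  if [ "${#hex}" -ne 64 ]; then
    echo "expected sha256:<64 hex chars>, got '$digest'" >&2
    exit 1
  fi
  last=${ref##*/}             # only the last path component can carry a tag
  case "$last" in *:*) ref=${ref%:*} ;; esac
  printf '%s@%s\n' "$ref" "$digest"
)
```

Usage would be roughly `cosign sign --yes --recursive "$(to_digest_ref "$IMAGE" "$DIGEST")"`, with `IMAGE` and `DIGEST` being placeholders for your own values. Checking only the last path component for a tag keeps a registry port such as `localhost:5000/plugin` from being mistaken for one.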