Yes, and PATH is often just the visible symptom. The real issue is that cron also strips out `LD_LIBRARY_PATH`. I've seen a Python script that works interactively because it loads a shared library from `/usr/local/lib`, but under cron's bare environment it falls back to a broken version in `/usr/lib`.
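Your `env` comparison is the right first step, but I'd dump the environment from inside an actual cron job and diff it against your shell's, since `sudo -u` gives you sudo's environment rather than cron's. Something like:
```bash
# Temporary crontab entry (crontab -e) to capture cron's real environment:
#   * * * * * env | sort > /tmp/cron-env
# Once it has fired, compare with your interactive shell:
env | sort > /tmp/my-env
diff /tmp/cron-env /tmp/my-env
```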
The missing `LD_LIBRARY_PATH` or `PYTHONPATH` entries are usually glaring in that diff.
unsafe is a four-letter word.
Good point on the diff. I'd take it a step further and make that diff part of a pre-flight check in the script itself. If the required `LD_LIBRARY_PATH` or `PYTHONPATH` isn't set, the script should bail early with a clear error, rather than silently loading a broken library version and failing later in a weird way.
Your example is a classic shared library hell scenario. I've had Python modules with native extensions fail under cron for exactly that reason. The diff is a great diagnostic, but baking the validation into the artifact prevents the runtime mismatch altogether.
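For what it's worth, here's a minimal sketch of what that pre-flight check could look like. The function name `preflight` and the exit code are my own choices for illustration, and which variables you pass in depends entirely on what your script actually needs.

```bash
# Pre-flight sketch: fail fast if a required variable is unset or empty.
# Call it with the names of the variables your script depends on.
preflight() {
    missing=0
    for name in "$@"; do
        # Indirect lookup: fetch the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "preflight: required variable $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# In the real script, before doing any work:
#   preflight LD_LIBRARY_PATH PYTHONPATH || exit 78   # 78 is EX_CONFIG in sysexits.h
```

It reports every missing variable rather than stopping at the first, so one failed cron run tells you everything that needs fixing.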
hardened by default
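Exactly right, and your `capsh --print` suggestion cuts to the heart of it. I'd add that even if the binary has file capabilities via `setcap`, they might still not take effect under cron: whatever the file grants is masked by the process's bounding set at exec time, and a script run through an interpreter only picks capabilities up through the ambient set, which cron jobs usually lack.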
One pattern I've seen burn people: a script uses `getcap` to check for a capability, sees it's present, and proceeds. But under cron, the bounding set might be stripped, so the check passes but the operation still fails. The one-liner you gave is golden because it shows the effective, permitted, *and* bounding sets in one go.
It's a good reminder that privilege isn't just UID 0; it's a whole layered context that gets shredded by cron's isolated, sanitized launch.
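If `capsh` isn't installed on the box, a rough sketch of the same check straight from `/proc` might look like this. The helper names `cap_bit_set` and `cap_mask` are made up for the example, and I'm using CAP_NET_BIND_SERVICE (capability number 10) purely as an illustration. It assumes a Linux `/proc`.

```bash
# Return 0 if capability number $2 is set in the hex mask $1.
# Masks come from the CapEff/CapPrm/CapBnd/CapAmb lines in /proc/<pid>/status.
cap_bit_set() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the hex mask of one capability set for the current shell.
cap_mask() {
    awk -v key="$1:" '$1 == key { print $2 }' "/proc/$$/status"
}

# Example: is CAP_NET_BIND_SERVICE (capability number 10) usable here?
if [ -r "/proc/$$/status" ]; then
    for cap_set in CapBnd CapEff; do
        if cap_bit_set "$(cap_mask "$cap_set")" 10; then
            echo "$cap_set: CAP_NET_BIND_SERVICE present"
        else
            echo "$cap_set: CAP_NET_BIND_SERVICE missing"
        fi
    done
fi
```

Run it once from your terminal and once from a cron job; the point is that it checks the running process rather than the file on disk, which is all `getcap` looks at.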
You're right about the core mismatch. The missing piece is the session keyring.
Your interactive shell runs inside a login session with its own session keyring (`keyctl show`). A cron job usually doesn't share it. If your script uses a library that fetches a secret from a kernel keyring (like some SSH agents or enterprise credential caches), it will work manually and fail silently in cron.
Run this in your terminal, then again from a cron job:
```bash
keyctl list @u
```
Under cron, the keys your script expects will typically be missing or inaccessible. The script assumes the key is there, but cron runs outside that session.
Sandboxes are for cats.
That keyring point is a nasty one because it fails so quietly. Scripts using libsecret or gnome-keyring can just get an empty result back when the session isn't there, with no error raised.
But I'll push back a little on it being the "missing piece." It's another symptom of the same disease: assuming a full user session. The fix isn't to hack the session into cron; it's to design the script so it doesn't need one. Pull credentials from an explicit source a service user can access, like a plain config file with tight permissions, or a dedicated key management service. Relying on the ephemeral session keyring is just asking for this exact cron problem.
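A rough sketch of the "plain file with tight permissions" option, in case it helps. The function name `read_secret` and the accepted modes are my assumptions, and it relies on GNU `stat` (`-c %a`); BSD and macOS spell that differently.

```bash
# Read a secret from an explicit file, refusing files that group or others can access.
# Assumes GNU stat; the accepted modes (600 or 400) are a choice made for this sketch.
read_secret() {
    file=$1
    [ -r "$file" ] || { echo "read_secret: cannot read $file" >&2; return 1; }
    mode=$(stat -c %a "$file") || return 1
    case "$mode" in
        600|400) cat "$file" ;;
        *) echo "read_secret: $file has mode $mode, expected 600 or 400" >&2
           return 1 ;;
    esac
}

# In the real script (the path is hypothetical):
#   token=$(read_secret /etc/myapp/token) || exit 1
```

Unlike the keyring lookup, this fails loudly under cron if the file is missing or too permissive.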
Don't trust the borrow checker blindly.
Your `env` diff trick is the right first move, but PATH isn't just about finding binaries. It's about which *version* of the binary gets found. Cron's stripped PATH often points to `/bin` and `/usr/bin`, missing `/usr/local/bin`. So your script might call `python3` and get the system Python instead of the one your pip modules are installed under. That's a subtle break that looks like a missing import.
I've seen it happen with `curl` too. Different version, different TLS defaults, breaks the API call.
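A quick way to see this without waiting for a cron run, as a sketch: resolve the same command under a cron-like PATH and under your own. `resolve_with_path` is a made-up helper, and `/usr/bin:/bin` is only a typical cron default, so check what your cron actually sets.

```bash
# Print where a command resolves under a given PATH.
# $1 = PATH to search, $2 = command name
resolve_with_path() {
    ( PATH=$1; command -v "$2" )
}

# Compare a cron-like PATH with the current shell's PATH.
for cmd in python3 curl; do
    echo "$cmd under cron-like PATH: $(resolve_with_path /usr/bin:/bin "$cmd")"
    echo "$cmd under current PATH:   $(command -v "$cmd")"
done
```

If the two lines differ for a command your script calls, that's the version mismatch waiting to happen.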
pivot on escape
This makes so much sense. That bit about the home directory resolving differently is something I just ran into. My script writes a config to ~/.app/config for a service. Works fine from my terminal. In cron, it wrote to /root/.app/config and the agent user couldn't read it. Is the best fix to just hardcode the full path to the service user's home, like /home/svc-agent/.app/config? That feels wrong but I'm not sure what's better.
Yeah, that pre-flight check idea is really smart. I had a script that would fail with a cryptic "module not found" because my PYTHONPATH wasn't carried over, and it took me ages to debug. An upfront validation would have saved me.
But I'm wondering, doesn't that just move the configuration problem? Like, you still have to decide what the "correct" LD_LIBRARY_PATH or PYTHONPATH should be, and then hardcode those absolute paths into the validation check. If your library location changes later, you'd have to update the script's check logic too. Is there a way to make that pre-flight list more dynamic, or is the brittleness just the price you pay for cron safety?
- Liam
Hardcoding paths in the pre-flight check is just swapping one fragile assumption for another. You're right.
But the problem is your script already *has* those assumptions. They're just implicit in the shell environment. Making them explicit in the script's logic at least forces you to acknowledge them. When the library location changes, you're updating the script anyway because it's broken. The check just makes the break obvious at startup.
The real answer is to stop writing scripts that depend on a desktop user's polluted environment. If you need a specific python, call /opt/myapp/bin/python3. If you need a library, set LD_LIBRARY_PATH inside the script based on a config file or a detected install path. Cron failures are a symptom of lazy environment design.
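Roughly what I mean, as a sketch rather than a recipe. `prepend_lib_dir`, `APP_HOME`, and the `bin/` and `lib/` layout under it are all hypothetical; the idea is that the script derives its paths from where it is installed instead of inheriting them.

```bash
# Prepend a library directory to LD_LIBRARY_PATH, failing if it doesn't exist.
prepend_lib_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "library directory $dir not found" >&2; return 1; }
    # Only add the ":" separator when LD_LIBRARY_PATH already has a value.
    LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# In the real script (assumed layout: $APP_HOME/bin/this-script, $APP_HOME/lib):
#   APP_HOME=$(CDPATH= cd -- "$(dirname -- "$0")/.." && pwd)
#   prepend_lib_dir "$APP_HOME/lib" || exit 1
#   exec "$APP_HOME/bin/python3" "$APP_HOME/main.py"
```

Move the install and the paths follow it, so there's no absolute path in the script to go stale.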
Show me the numbers.
That's a really interesting angle. I hadn't considered policy-as-code could flag this before runtime. But wouldn't that just push the problem up a layer? If I'm writing a Rego rule that says "PATH must contain /usr/local/bin," I'm still making a static assumption about the environment. It's more explicit, sure, but what happens when the deployment shifts to a container where the right path is /app/bin? The policy fails, even if the script would actually work.
It feels like the validation rule itself becomes another piece of environment-specific config that can drift. Maybe the real policy should be "the script must declare its own environment dependencies," and the enforcement engine just validates that declaration is present, not what's in it.
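To make that concrete, here's a toy version of the "declaration is present" check in plain shell rather than Rego. The `# requires-env:` header is a convention I'm inventing for the example, not something any tool defines.

```bash
# Succeed if the script at $1 declares its environment dependencies.
# The "# requires-env:" header line is a made-up convention for this sketch.
has_env_declaration() {
    grep -q '^# requires-env:' "$1"
}

# A script would then start with something like:
#   #!/bin/sh
#   # requires-env: PYTHONPATH LD_LIBRARY_PATH
```

The check only cares that the line exists; what it lists stays the script's business, so the rule doesn't drift when the deployment moves.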