Just ran a basic static analysis script over our ~50 production LangGraph definitions. The results aren't surprising, but the scale is.
Primary issues found:
* 80% of graphs have at least one tool node with no explicit error handling. When the tool throws, the graph fails open.
* 30% pass raw, unvalidated LLM output directly into a conditional edge. Prompt injection into the graph's control flow is trivial here.
* Multiple instances of the entire conversation state being checkpointed to a DB, including system prompts and internal reasoning. That's a data leak waiting for a subpoena.
* Heavy use of LangSmith tracing by default. We're paying to record and store PII and internal decision logic on a vendor's server.
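
For reference, the first three bullets are cheap to guard against. Below is a minimal sketch in plain Python, not taken from any of the audited graphs. Everything named in it is an assumption for illustration: the route names, the state keys (`llm_output`, `system_prompt`, `internal_reasoning`, `error`), and the three helper functions are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical values: substitute the node names and state keys your graph uses.
ALLOWED_ROUTES = {"search", "summarize", "end"}
FALLBACK_ROUTE = "end"
SENSITIVE_KEYS = {"system_prompt", "internal_reasoning"}

State = dict[str, Any]


def safe_route(state: State) -> str:
    """Return a route only if the LLM output matches the allow-list exactly.

    Anything else, including injected text, falls through to the fallback.
    """
    raw = str(state.get("llm_output", "")).strip().lower()
    return raw if raw in ALLOWED_ROUTES else FALLBACK_ROUTE


def fail_closed(tool: Callable[[State], State]) -> Callable[[State], State]:
    """Wrap a tool node so an exception becomes an explicit error state."""

    def wrapper(state: State) -> State:
        try:
            return tool(state)
        except Exception as exc:
            # Record only the exception type; the message may contain user data.
            return {"error": type(exc).__name__}

    return wrapper


def redact_for_checkpoint(state: State) -> State:
    """Return a copy of the state without the keys listed in SENSITIVE_KEYS."""
    return {k: v for k, v in state.items() if k not in SENSITIVE_KEYS}
```

`safe_route` has the shape of a routing function (state in, route name out), so it could plausibly be passed as the path function to `StateGraph.add_conditional_edges`. How `redact_for_checkpoint` gets wired into a particular checkpointer is not shown here and depends on the setup.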
Cost-benefit is negative. We're building complex, stateful workflows without the basic guardrails you'd demand for any other data pipeline. The hype is about "agentic AI," but the reality is increased attack surface and opaque data flows.
Show me the residual risk.