We've been working on a clinical trial coordination agent, and the biggest hurdle wasn't the logic but the compliance layer. We needed the agent to analyze patient queries and documents without ever outputting protected health information (PHI), even accidentally. The solution we landed on feels robust, and I wanted to share the pattern.
The core idea is a two-stage process: 1) a PHI-stripping pre-processor, and 2) a tool-calling agent that only returns data from a strict, pre-defined categorical schema. The agent cannot return free-text patient details. Here's a simplified version of the tool definition:
```yaml
# Our agent's tool definition (OpenAI-style schema)
tools:
  - type: function
    function:
      name: get_patient_status_summary
      description: Get summary for a patient. Returns data ONLY within the following allowed categories.
      parameters:
        type: object
        properties:
          category:
            type: string
            enum: ["appointment_status", "medication_adherence", "lab_results_availability"]
          status:
            type: string
            enum: ["on_track", "needs_follow_up", "delayed", "complete", "pending"]
          next_scheduled_event:
            type: string
            format: date
        required:
          - category
          - status
        # JSON Schema allows extra keys by default; forbid them so no
        # free-text field can be slipped in alongside the enums.
        additionalProperties: false
```
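The key is the `enum`. As long as schema validation is enforced, the model's tool call can *only* carry values from those lists. (Strictly speaking, these are the arguments of the tool call, which is what we use as the agent's structured output.) It cannot generate a sentence like "Mr. Smith's MRI showed a torn ACL." Instead, it might return `{"category": "lab_results_availability", "status": "complete"}`. The front-end then maps these enum values to human-friendly, non-PHI text.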
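To make the validate-then-map step concrete, here's a minimal stdlib-only sketch. The enum values are copied from the schema above; the function names (`validate_summary`, `render`) and the display strings are made up for illustration and aren't our production code.

```python
import re
from datetime import date

# Allowed values, copied from the tool schema above.
CATEGORIES = {"appointment_status", "medication_adherence", "lab_results_availability"}
STATUSES = {"on_track", "needs_follow_up", "delayed", "complete", "pending"}
ALLOWED_KEYS = {"category", "status", "next_scheduled_event"}

# Hypothetical display strings; the real front-end copy isn't shown in this post.
CATEGORY_LABELS = {
    "appointment_status": "Appointments",
    "medication_adherence": "Medication adherence",
    "lab_results_availability": "Lab results",
}
STATUS_LABELS = {
    "on_track": "on track",
    "needs_follow_up": "needs follow-up",
    "delayed": "delayed",
    "complete": "complete",
    "pending": "pending",
}


def _is_enum_member(value, allowed: set) -> bool:
    # The isinstance check also guards against unhashable values such as lists.
    return isinstance(value, str) and value in allowed


def validate_summary(payload: dict) -> dict:
    """Return the payload unchanged if it fits the schema, else raise ValueError."""
    extra = set(payload) - ALLOWED_KEYS
    if extra:
        # Unknown keys are where free text could sneak in, so reject them outright.
        raise ValueError(f"unexpected keys: {sorted(extra)}")
    if not _is_enum_member(payload.get("category"), CATEGORIES):
        raise ValueError("category missing or not in enum")
    if not _is_enum_member(payload.get("status"), STATUSES):
        raise ValueError("status missing or not in enum")
    if "next_scheduled_event" in payload:
        value = payload["next_scheduled_event"]
        # Accept only a plain YYYY-MM-DD string that is also a real calendar date.
        if not isinstance(value, str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            raise ValueError("next_scheduled_event must be YYYY-MM-DD")
        date.fromisoformat(value)  # raises ValueError for e.g. 2025-02-30
    return payload


def render(payload: dict) -> str:
    """Map a validated payload to display text built only from fixed strings."""
    p = validate_summary(payload)
    text = f"{CATEGORY_LABELS[p['category']]}: {STATUS_LABELS[p['status']]}"
    if "next_scheduled_event" in p:
        text += f" (next: {p['next_scheduled_event']})"
    return text
```

For example, `render({"category": "lab_results_availability", "status": "complete"})` gives `"Lab results: complete"`, while a payload with an extra `notes` key is rejected before anything reaches the UI.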
We enforce this by:
* Running all user-uploaded documents through a redaction service (e.g., Presidio) before they hit the agent's context window.
* Using a dedicated tool-calling LLM configuration that strictly validates output against the JSON schema.
* Logging only the categorical outputs (e.g., `appointment_status: needs_follow_up`), never the document text itself, whether raw or redacted.
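The first and last bullets can be sketched roughly like this. Big caveat: the `redact` function below is a toy regex stand-in so the example runs with the standard library only; it is not Presidio and is nowhere near adequate for real PHI. The logger name and function names are also just placeholders.

```python
import logging
import re

# Toy stand-in for a real redaction service such as Presidio. It only masks
# US-style SSNs and dash/dot-separated phone numbers. Do NOT rely on this for PHI.
_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

audit_log = logging.getLogger("agent.audit")  # hypothetical logger name


def redact(text: str) -> str:
    """Replace each matched pattern with a <LABEL> placeholder."""
    for label, pattern in _PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def log_summary(summary: dict) -> str:
    """Log only the two enum-valued fields and return the logged line.

    Assumes `summary` has already passed schema validation. The optional
    date and any document text are deliberately left out of the log.
    """
    line = f"{summary['category']}: {summary['status']}"
    audit_log.info(line)
    return line
```

The idea is that documents go through `redact` (or, in practice, the real redaction service) before they reach the agent's context, and `log_summary` is the only thing that ever writes to the audit log.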
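This approach lines up well with HIPAA's "Minimum Necessary" standard. The agent has the context to do its job, but its *output* is limited to enum values plus an optional date, so there's no free-text field for PHI to leak through. One caveat: `next_scheduled_event` is a patient-linked date, and dates tied to an individual count as identifiers under HIPAA's Safe Harbor de-identification method, so that field still needs care. Our BAA-covered infrastructure handles the redaction step, and the agent itself, because its outputs are constrained, is much easier to reason about as a compliant component.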
Has anyone else tried a similar categorical-output model? I'm particularly interested in how you might handle edge cases where the needed output doesn't fit a pre-defined category without falling back to free text.
-- sam