LLM endpoints behave like untrusted third-party services sitting inside your application boundary. Treat them with the same zero-trust posture you apply to payments, identity or analytics vendors.
Start with data boundaries. Classify data that may flow to models, redact high-sensitivity fields by default, and prefer retrieval over payload stuffing. Use provider settings that disable training on submitted data, and contractually lock data residency and retention.
Replace shared API keys with workload identity and short-lived credentials. Issue per-service tokens, enforce request signing and run calls through a service mesh or API gateway that can inject identity, rotate secrets and enforce least privilege policies.
Isolate execution. Use dedicated egress per provider with allowlists, deterministic routing rules and traffic tagging so you always know which workload spoke to which endpoint. Segment tenants in memory and storage, and run dangerous file inputs through malware scanning and content validation before they reach prompts.
Instrument for visibility and response. Capture structured logs of prompts, tool calls and responses with user and request context. Add policy engines that can flag PII leakage, prompt injection patterns, abnormal tool invocation or high-cost bursts, and wire alerts to on-call runbooks.
Manage provider risk as part of your control tower. Assess certifications, data handling clauses and breach notification timelines; enable kill switches and failover providers; and routinely test posture with red teaming and chaos exercises that simulate provider outages or compromised credentials.