Retrieval-augmented generation (RAG) is only safe if it respects access control. Unlike a classic search result, a single leaked paragraph can be summarised into an authoritative-sounding answer and spread quickly. Permission design must therefore be explicit, tested, and observable.
Start with an explicit permission model
Define what a request is allowed to access: user identity, group membership, tenant, and any attribute-based constraints (role, region, project). Make these entitlements available at query time and treat them as part of the request context.
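As an illustration, the request context could be modelled as a small immutable object. This is a minimal sketch: the class name `Entitlements` and all field names are assumptions made for the example, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Entitlements:
    """Hypothetical request context carrying everything a query may rely on."""

    user_id: str
    tenant_id: str
    groups: FrozenSet[str] = frozenset()
    # Attribute-based constraints as (name, value) pairs, e.g. ("region", "eu").
    attributes: Tuple[Tuple[str, str], ...] = ()

    def attribute(self, name: str) -> Optional[str]:
        """Return the value of a named attribute, or None if it is not set."""
        return dict(self.attributes).get(name)


ctx = Entitlements(
    user_id="u-123",
    tenant_id="acme",
    groups=frozenset({"finance"}),
    attributes=(("region", "eu"), ("role", "analyst")),
)
```

Keeping the object frozen means it cannot be altered between authentication and retrieval, and it stays hashable, so it can take part in cache keys or audit records.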
Enforce permissions at retrieval
The safest pattern is to apply permission filters before retrieval results are scored or reranked, so that restricted content never enters the candidate set. This requires disciplined metadata: treat access control metadata as mandatory fields in your ingestion pipeline (see metadata strategy). If you must post-filter, measure how often post-filtering removes all candidates and define what the user sees when it does.
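The filter-then-score ordering can be sketched with an in-memory stand-in for a vector store. The `Chunk` type, its fields, and the `score_hint` value (a placeholder for a real similarity score) are all hypothetical; a production system would typically push the same filter down into the index query.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Chunk:
    text: str
    tenant_id: str                  # mandatory access control metadata
    allowed_groups: FrozenSet[str]  # mandatory access control metadata
    score_hint: float               # stand-in for a similarity score


def is_visible(chunk: Chunk, tenant_id: str, groups: FrozenSet[str]) -> bool:
    """Visible only within the same tenant and to at least one allowed group."""
    return chunk.tenant_id == tenant_id and bool(chunk.allowed_groups & groups)


def retrieve(chunks: List[Chunk], tenant_id: str, groups: FrozenSet[str], top_k: int = 3) -> List[Chunk]:
    # Filter first, so restricted chunks never reach scoring or reranking.
    visible = [c for c in chunks if is_visible(c, tenant_id, groups)]
    return sorted(visible, key=lambda c: c.score_hint, reverse=True)[:top_k]


corpus = [
    Chunk("Q3 budget", "acme", frozenset({"finance"}), 0.90),
    Chunk("Salary bands", "acme", frozenset({"hr"}), 0.95),
    Chunk("Other tenant memo", "globex", frozenset({"finance"}), 0.99),
    Chunk("Travel policy", "acme", frozenset({"finance", "hr"}), 0.50),
]
results = retrieve(corpus, "acme", frozenset({"finance"}))
```

In this sketch the two highest-scoring chunks are never candidates for a finance user at `acme`, because they fail the permission check before any ranking happens.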
Choose an isolation pattern
There are two common patterns:
- Separate indexes per tenant/domain. Strong isolation and simpler filtering, but higher operational overhead.
- Shared index with strict filters. Lower overhead, but requires careful testing and strong guardrails (see multi-tenancy design).
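For the shared-index pattern, one possible guardrail is a query builder that refuses to produce a query without a tenant filter. The sketch below is an assumption-laden illustration: `build_query` and the dictionary shape it returns are hypothetical and do not correspond to any specific search engine's API.

```python
from typing import Any, Dict, Optional


def build_query(text: str, tenant_id: str, extra_filters: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a shared-index query that always carries the request's tenant filter."""
    if not tenant_id:
        raise ValueError("tenant_id is required for shared-index queries")

    filters = dict(extra_filters or {})
    # Reject caller-supplied filters that try to point at a different tenant.
    if filters.get("tenant_id", tenant_id) != tenant_id:
        raise ValueError("tenant filter conflicts with the request's tenant")

    filters["tenant_id"] = tenant_id
    return {"text": text, "filters": filters}


query = build_query("travel policy", "acme", {"region": "eu"})
```

Routing every query through a single builder like this gives tests one place to assert that the tenant filter can never be omitted or overridden.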
Handle caching and reranking carefully
Permission bugs often come from caches. If you cache answers or retrieved chunks, include user/tenant entitlements and relevant policy versions in the cache key. Avoid cross-tenant caches unless they are provably safe (see safe caching). The same care applies to reranking: pass the reranker only candidates that have already been permission-filtered.
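A permission-aware cache key might look like the following sketch. The function name `cache_key` and its parameters are assumptions for illustration; the point is only that the tenant, the entitlements, and the policy version all feed into the key.

```python
import hashlib
import json
from typing import Iterable


def cache_key(query: str, tenant_id: str, groups: Iterable[str], policy_version: str) -> str:
    """Derive a key that changes whenever the tenant, entitlements, or policy changes."""
    payload = json.dumps(
        {
            "q": query,
            "tenant": tenant_id,
            "groups": sorted(groups),  # sorted so group order does not affect the key
            "policy": policy_version,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = cache_key("travel policy", "acme", ["finance", "hr"], "v7")
key_b = cache_key("travel policy", "acme", ["hr", "finance"], "v7")
key_c = cache_key("travel policy", "globex", ["finance", "hr"], "v7")
```

Here `key_a` and `key_b` are identical because the group set is the same, while `key_c` differs because the tenant differs. Bumping the policy version likewise produces a fresh key, so entries cached under an older policy are no longer served.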
Prove you do not leak
Build tests that try to break your controls:
- Canary documents. Seed unique phrases into restricted content and alert if they appear in answers for unauthorised users.
- Permission boundary suites. For each role/tenant, run a standard query set and compare the retrieved sources against the set that role/tenant is expected to see.
- Adversarial prompts. Attempt prompt injection and role confusion (see prompt injection defence and red teaming).
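The canary check in particular is straightforward to automate. The following is a minimal sketch, assuming canaries are kept as a mapping from seeded phrase to the label of the restricted set it was planted in; the function name, the labels, and the phrases are all made up for the example.

```python
from typing import Dict, List, Set


def leaked_canaries(answer: str, canaries: Dict[str, str], allowed_labels: Set[str]) -> List[str]:
    """Return canary phrases found in the answer that the user is not entitled to see."""
    lowered = answer.lower()
    return sorted(
        phrase
        for phrase, label in canaries.items()
        if label not in allowed_labels and phrase.lower() in lowered
    )


canaries = {
    "zx-canary-7f3a": "finance-restricted",
    "qv-canary-91bc": "hr-restricted",
}
answer = "The memo mentions ZX-CANARY-7F3A and qv-canary-91bc."
leaks = leaked_canaries(answer, canaries, allowed_labels={"hr-restricted"})
```

A non-empty result for an unauthorised user is the signal to alert on. Exact substring matching will miss paraphrased leaks, so this complements the permission boundary suites rather than replacing them.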
Audit trails and incident response
For every answer, record: who asked, what entitlements were applied, what sources were retrieved, and what policy version was in force. This supports investigations and compliance (see incident response and privacy threat modelling).
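One way to capture these fields is a structured log line per answer. This is a sketch only: the function name `audit_record` and the field names are assumptions, and a real deployment would write the line to whatever log store its compliance requirements dictate.

```python
import json
from datetime import datetime, timezone
from typing import Iterable


def audit_record(
    user_id: str,
    tenant_id: str,
    groups: Iterable[str],
    source_ids: Iterable[str],
    policy_version: str,
) -> str:
    """Serialise who asked, which entitlements applied, what was retrieved, and under which policy."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "tenant_id": tenant_id,
        "groups": sorted(groups),
        "source_ids": list(source_ids),  # keep retrieval order
        "policy_version": policy_version,
    }
    return json.dumps(record, sort_keys=True)


line = audit_record("u-123", "acme", {"hr", "finance"}, ["doc-9", "doc-2"], "v7")
```

Recording the entitlements as they were applied, rather than looking them up later, matters because group membership and policies change over time.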
RAG permissioning is not a single filter. It is a system of metadata, enforcement points, and tests that keep knowledge access safe over time.