Auth & Cross-Tenant Access Control

Multi-tenant SaaS platforms must enforce hard identity boundaries without breaking legitimate B2B collaboration. This reference defines how to isolate tenant identities, propagate context across services, and authorize controlled cross-tenant workflows while keeping every action auditable.

Isolation and collaboration sit on a continuum. Over-isolating breaks partner integrations and delegated administration; under-isolating risks data leakage and compliance failure. Explicit trust boundaries, deterministic policy evaluation, and immutable telemetry prevent both failure modes.

Auth Model Overview

The authentication boundary you choose must align with your underlying data architecture and your regulatory profile. The isolation model dictates how identity providers, session stores, policy engines, and API gateways partition traffic. It also constrains how densely you can pack tenants onto shared infrastructure and how much per-request latency policy evaluation adds.

There are three dominant boundary models, plus a hybrid. Physical isolation gives each tenant a dedicated identity realm, session store, and credential boundary. Logical isolation shares the identity plane and discriminates on a tenant_id claim enforced at every layer. Federated isolation delegates authentication to an external identity provider per tenant and maps asserted claims to internal roles. Most production platforms run a hybrid: logical isolation by default, with physical or federated boundaries reserved for regulated or enterprise tenants.

Boundary model	Boundary enforcement	Tenant density	Query/eval latency	Operational overhead	Compliance fit
Physical (realm-per-tenant)	Hard — separate IdP realm + session store	Low (100s–1000s)	Low (local lookups)	High (per-tenant provisioning)	HIPAA / FedRAMP / regulated
Logical (shared + `tenant_id` claim)	Policy-dependent — claim + RLS + middleware	Very high (millions)	1–8ms (cached policy)	Low (shared plane)	Standard SaaS / SOC 2
Federated (external IdP per tenant)	Strong — trust boundary at SP	Medium–high	Medium (token exchange)	Medium (trust lifecycle)	Enterprise / B2B
Hybrid (logical default + carve-outs)	Defense-in-depth	High overall	Mixed	Medium–high	Mixed estates

Boundary enforcement must begin at the ingress layer. Never let a downstream service infer tenant context on its own: a service that derives tenancy from a path parameter or a header it did not verify is one bug away from a cross-tenant read. Downstream services verify the signed context they receive; they do not recompute it.

The control plane: tenant context is established once at the edge, authorized centrally, and carried — never re-derived — through every downstream hop.

Core Architecture & Pattern Variants

Authentication boundaries must mirror the data boundaries described in the Multi-Tenant Database Isolation Models reference. An identity plane that allows a token to address a tenant the data layer cannot isolate is a security gap waiting to be found.

Physical (realm-per-tenant). Each tenant owns a dedicated identity realm — a separate Keycloak realm, Auth0 organization, or Cognito user pool — and frequently a dedicated session store and signing key. Cross-tenant addressing is impossible by construction because tokens minted in one realm are not valid in another. The cost is provisioning and operational multiplication: every realm needs lifecycle automation, key rotation, and monitoring. This model suits regulated tenants where a hard cryptographic boundary is a contractual requirement.

Logical (shared plane + tenant_id claim). All tenants authenticate through one identity plane. A signed tenant_id claim discriminates every request, and that claim is enforced redundantly at the gateway, the policy engine, the application middleware, and the database via row-level security. This is the densest and cheapest model and the default for most SaaS. Its weakness is that correctness depends entirely on the claim being present, validated, and propagated unmodified — a single unguarded query is a leak. Claim structuring, validation, and key rotation are covered in depth under tenant-aware JWT and token management.

Federated (external IdP per tenant). Enterprise tenants bring their own identity provider over OIDC or SAML. Your platform is the Service Provider; you map asserted claims and group memberships to internal tenant-scoped roles. This delegates credential management and lifecycle to the customer while keeping authorization yours. The trust boundary, attribute mapping, and metadata lifecycle are detailed in SSO mapping and identity federation.

Authorization across all variants is where tenancy is actually enforced at request time. Granular, tenant-scoped permissions prevent privilege escalation across boundaries. Hierarchical role structures mirror organizational reporting lines; flat structures simplify evaluation but lack nuance; hybrid models assign base roles with scoped overrides. Externalize the decision into a policy engine so that policy is versioned, testable, and auditable independently of business logic. The per-tenant permission modeling, delegation, and evaluation patterns live under role-based access control per tenant.

The choice between these variants is rarely binary. A mature platform typically runs the logical model as its default — because that is the only model that scales to millions of tenants on shared infrastructure — and then promotes individual tenants to physical or federated boundaries as contracts, regions, or threat models demand. What makes this tractable is keeping the authorization contract stable regardless of how a tenant authenticates: whether a principal arrives via a shared user pool, a dedicated realm, or a partner SAML assertion, by the time the policy engine evaluates the request it sees the same normalized inputs — tenant_id, scope, role, and a delegation context. Designing the policy input schema first, and treating every authentication variant as merely a producer of that schema, is what lets a platform mix models without forking its authorization logic.

A common mistake is to entangle the boundary model with the authorization rules. Teams that hardcode "if tenant is on the enterprise realm, allow X" end up with policy that drifts per tenant and cannot be reasoned about or tested as a unit. Keep the realm or federation detail out of the authorization rules entirely; resolve it during context establishment and discard it once the normalized claims exist. The policy engine should never know — or care — how the principal authenticated.
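As a rough illustration of treating every authentication variant as a producer of one schema, the sketch below normalizes already-verified claims into a single policy input. The names `PolicyInput` and `normalize_claims`, and the claim keys `role` and `delegation_id`, are assumptions made for this example rather than part of any specific identity provider.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Tuple


@dataclass(frozen=True)
class PolicyInput:
    """Normalized authorization input; identical for every auth variant."""
    tenant_id: str
    principal_id: str
    scope: Tuple[str, ...]
    role: str
    delegation_id: Optional[str] = None


def normalize_claims(claims: Mapping[str, Any]) -> PolicyInput:
    """Build the policy input from verified claims, failing closed.

    The claims are assumed to be signature-verified already. Nothing about
    the realm or IdP that produced them (for example "iss") is copied over.
    """
    tenant_id = claims.get("tenant_id")
    principal_id = claims.get("sub")
    role = claims.get("role")
    if not tenant_id or not principal_id or not role:
        raise ValueError("verified claims must carry tenant_id, sub, and role")

    # Accept both the space-delimited string form and a list form of scope.
    raw_scope = claims.get("scope", "")
    scope = tuple(raw_scope.split()) if isinstance(raw_scope, str) else tuple(raw_scope)

    return PolicyInput(
        tenant_id=str(tenant_id),
        principal_id=str(principal_id),
        scope=scope,
        role=str(role),
        delegation_id=claims.get("delegation_id"),
    )
```

Because the issuer is dropped during normalization, a token from a shared pool and one from a dedicated realm produce equal `PolicyInput` values when their tenant, principal, scope, and role match, which is the property the policy engine relies on.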

Policy engine	Evaluation latency	Flexibility	Scaling characteristics	Best fit
OPA (Rego)	2–8ms (cached)	High (declarative, Datalog-inspired DSL)	Horizontal via sidecars	Complex cross-tenant delegation
Cedar	1–4ms (compiled)	Medium-high (typed schema)	Embedded or service-based	High-throughput, schema-validated authz
Custom RBAC middleware	<1ms	Low (hardcoded logic)	Coupled to the app	Simple flat-role SaaS

Tenant Routing & Context Propagation

Tenant context must survive across the gateway, microservices, async workers, and third-party integrations without ever being silently recomputed. The token issued at the edge is the single source of truth: it carries tenant_id, the principal sub, scope, and aud, all signed. Embedding tenancy directly in the token avoids a directory lookup per request at the cost of a marginally larger token; the trade-off is almost always worth it.

Context propagation is a layered problem. At the edge, the gateway validates the signature against cached JWKS and rejects malformed or expired tokens before they reach any service. Between services, the verified claims are forwarded as a signed internal header or a re-minted internal token — never as an unsigned hint that a downstream service trusts blindly. Across async boundaries — queues, scheduled jobs, webhooks — the tenant context must be serialized into the job payload and re-asserted when the job runs, because the original request principal is long gone. The same propagation discipline underpins the data layer described in Tenant-Aware Data Routing & Query Scoping.

Routing layer	Where tenant is resolved	Enforcement mechanism	Failure mode if skipped
Edge / API gateway	`tenant_id` claim in JWT	Signature + `aud` + TTL check	Forged or expired tokens accepted
Service mesh / RPC	Signed internal header or re-minted token	mTLS + header validation	Spoofed internal calls
Application middleware	Request-scoped context object	Reject if claim absent	Default-tenant fallthrough
Database	Session `SET app.tenant_id`	Row-level security policy	Unscoped query reads all tenants
Async / queue	Serialized into job payload	Re-assert on dequeue	Job runs with wrong/no tenant

Each hop independently verifies the signature and re-asserts the tenant boundary; no service trusts an unverified upstream assertion.
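For the async row in particular, a minimal sketch of serializing tenant context into a job payload and re-asserting it on dequeue might look like the following. It uses an HMAC over the payload as a stand-in for whatever signing scheme the platform actually uses; `SIGNING_KEY`, `enqueue_payload`, and `dequeue_payload` are hypothetical names.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; a real deployment would load it from a secret manager.
SIGNING_KEY = b"replace-with-a-managed-secret"


def _sign(body: bytes) -> str:
    return hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()


def enqueue_payload(tenant_id: str, principal_id: str, task: dict) -> str:
    """Serialize a job with its tenant context and an HMAC over the whole body."""
    body = json.dumps(
        {"tenant_id": tenant_id, "principal_id": principal_id, "task": task},
        sort_keys=True,
    )
    return json.dumps({"body": body, "sig": _sign(body.encode())})


def dequeue_payload(raw: str) -> dict:
    """Verify the signature and re-assert tenant context before the job runs."""
    envelope = json.loads(raw)
    body = envelope["body"]
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(_sign(body.encode()), envelope["sig"]):
        raise PermissionError("job payload signature mismatch")
    job = json.loads(body)
    if not job.get("tenant_id"):
        raise PermissionError("job payload has no tenant context")
    return job
```

The worker reads tenancy only from the verified body, so a payload edited in the queue, or one enqueued without a tenant, is rejected instead of running under a default.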

For B2B partnerships, federation establishes trust without merging directories. A partner IdP asserts a user identity over SAML or OIDC; the gateway validates the assertion signature and certificate against a trust registry, maps external attributes and group memberships to platform-scoped roles, strips unnecessary PII, and injects the resolved tenant_id. Never enable open federation — require signed metadata, certificate pinning, and explicit allowlists, with mutual TLS for high-risk integrations.
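The mapping step can be sketched as an explicit allowlist lookup. This is an illustration under assumptions: the assertion's signature and certificate are taken to be verified already, and `TRUST_REGISTRY`, `ALLOWED_ATTRIBUTES`, `map_assertion`, the issuer URL, and the group and role names are all invented for the example.

```python
from typing import Dict, FrozenSet, Mapping, Sequence

# Hypothetical trust registry: issuer -> tenant and group-to-role mapping.
TRUST_REGISTRY: Dict[str, dict] = {
    "https://idp.partner.example": {
        "tenant_id": "tenant-partnercorp",
        "group_roles": {"eng-leads": "editor", "auditors": "viewer"},
    },
}

# Only these asserted attributes are carried forward; everything else is dropped.
ALLOWED_ATTRIBUTES: FrozenSet[str] = frozenset({"sub", "email"})


def map_assertion(issuer: str, attributes: Mapping[str, str], groups: Sequence[str]) -> dict:
    """Map an already signature-verified assertion to platform-scoped context."""
    entry = TRUST_REGISTRY.get(issuer)
    if entry is None:
        # No open federation: unknown issuers are rejected outright.
        raise PermissionError(f"issuer not in trust registry: {issuer}")

    group_roles = entry["group_roles"]
    roles = sorted({group_roles[g] for g in groups if g in group_roles})
    if not roles:
        raise PermissionError("no asserted group maps to a platform role")

    kept = {k: v for k, v in attributes.items() if k in ALLOWED_ATTRIBUTES}
    return {"tenant_id": entry["tenant_id"], "roles": roles, "attributes": kept}
```

The resolved `tenant_id` comes from the registry entry for the issuer, never from an attribute the partner asserted, so a partner IdP cannot name a tenant it was not onboarded for.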

Compliance & Auditability Alignment

Auth telemetry is where security meets the auditor. SOC 2 Type II, HIPAA, GDPR, and FedRAMP each expect, in their own terms, that authentication events, authorization decisions, and cross-tenant access grants be logged immutably and be attributable to a verifiable principal and tenant. The unifying primitive is a single tenant_context_id that flows across the auth, API, and billing pipelines so a delegated action can always be traced back to its originating tenant and the policy result that allowed it.

Immutable retention requires cryptographic integrity, not just append-only intent. Chain log entries with a hash of the prior entry, or commit periodic Merkle roots to tamper-evident storage, and rotate signing keys on a defined schedule. This makes silent deletion or back-dating detectable, which is the property assessors actually test. The full audit architecture and per-tenant artifact generation are covered under Multi-Tenant Compliance & Data Governance.
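A minimal sketch of the hash-chaining idea, using an in-memory list in place of real log storage, is shown here. The function names and the `GENESIS` placeholder are assumptions for illustration; signing and Merkle-root commitment are left out.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode()).hexdigest()


def append(chain: List[dict], event: dict) -> None:
    """Append an event, binding it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify(chain: List[dict]) -> bool:
    """Recompute every link; an edit, mid-chain deletion, or reorder breaks one."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain alone cannot reveal that its most recent entries were truncated, which is why the latest hash (or a periodic root) also needs to be committed to separate tamper-evident storage.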

What distinguishes an auditable auth plane from a merely logged one is the granularity of what gets recorded at the decision point. It is not enough to log "user X accessed resource Y." A defensible record captures the principal, the originating tenant, the target tenant if different, the exact scope and role asserted, the policy bundle version that evaluated the request, the decision (allow or deny), and the rule that produced it. When an assessor or an incident responder asks "why was this cross-tenant read permitted," the answer must be reconstructable from the log alone, without re-running the system. Logging denials matters as much as logging grants: a sustained pattern of denied cross-tenant attempts is often the first signal of an active probe.

Cross-tenant delegation is the hardest case for auditability because the acting principal and the authorizing principal differ. A partner user operating under a delegation grant must be logged with both identities — who acted, and which tenant admin authorized the delegation that made it possible — plus the delegation's scope and expiry. Without that linkage, a delegated action looks like an unexplained cross-tenant access and fails review. Standardizing the tenant_context_id across the request lifecycle is what stitches these records together so a single delegated operation appears as one coherent, attributable trail rather than several disconnected events.

Framework	Auth control it demands	How the auth plane satisfies it
SOC 2 Type II	Logical access logging + review	Immutable auth + policy-decision logs, indexed by tenant
HIPAA	Access controls + audit of PHI access	Per-tenant RBAC, hash-chained access logs, scoped sessions
GDPR	Lawful access + erasure traceability	`tenant_context_id` links access to data-subject scope
FedRAMP	Strong boundary + continuous monitoring	Physical/federated realm per tenant, real-time anomaly alerts

Static policy review catches misconfiguration; dynamic monitoring catches active exploitation. Baseline each tenant's normal behavior — login frequency, API call volume, delegation rate — and alert on deviations. The table below maps common signals to response actions.

Telemetry signal	Detection threshold	Response action	Escalation
Cross-tenant API spike	>500% baseline in 5m	Rate limit + step-up auth	Tenant admin
Invalid claim injection	>10 attempts/min	Block IP + revoke session	SOC Tier 1
Delegation scope abuse	>3 unauthorized grants	Suspend partner token	Security lead
Concurrent session anomaly	Distinct IPs >2	Force re-auth + invalidate	Automated
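The first row of the table can be sketched as a sliding-window counter per tenant. This is one possible reading, not a definitive rule: it interprets ">500% baseline" as more than five times the tenant's baseline count for a five-minute window, and `CrossTenantSpikeDetector` is a hypothetical name.

```python
from collections import deque
from typing import Deque

WINDOW_SECONDS = 300     # the 5-minute window from the table
SPIKE_MULTIPLIER = 5.0   # assumed reading of ">500% baseline": more than 5x


class CrossTenantSpikeDetector:
    """Counts cross-tenant calls in a sliding window for one tenant."""

    def __init__(self, baseline_per_window: float):
        self.baseline = baseline_per_window
        self.events: Deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        """Record one call; return True when the window exceeds the threshold."""
        self.events.append(timestamp)
        cutoff = timestamp - WINDOW_SECONDS
        # Drop calls that have aged out of the window.
        while self.events and self.events[0] <= cutoff:
            self.events.popleft()
        return len(self.events) > self.baseline * SPIKE_MULTIPLIER
```

A `True` result would feed the response column (rate limiting plus step-up auth); how the baseline itself is learned is outside this sketch.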

Billing Sync & Metering Architecture

Cross-tenant calls are where usage metering quietly breaks. A delegated request that touches two tenants can be double-counted or attributed to the wrong tenant, inflating one invoice and under-billing another. The fix is to scope every usage meter to the tenant boundary established at auth time and to carry an explicit delegation flag so a delegated action is metered once, against the correct tenant. Internal health checks and partner sync traffic must be excluded from billable metrics entirely.

The metering pipeline consumes auth and API events, correlates them by tenant_context_id, deduplicates within a short window keyed on a stable event_id, and only then meters and syncs. The component table below shows the path; the detailed pipeline patterns live under Tenant Billing & Usage Metering.

Component	Responsibility	Tenant-scoping mechanism
Event bus (Kafka)	Durable ingest of auth/API events	`tenant_id` in key + payload
Dedup processor	Drop replays within window	Idempotency key = `event_id`
Correlation engine	Attribute delegated calls	Join on `tenant_context_id`
Metering store	Aggregate billable units	Per-tenant partitioned counters
Billing sync	Push to invoicing/Stripe	Per-tenant customer mapping
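The dedup and correlation stages can be sketched in a few lines. The event fields `billable_tenant_id`, `delegated`, `internal`, and `units` are assumptions made for this example, and the in-memory sets stand in for the windowed state a real stream processor would keep.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def meter(events: Iterable[dict]) -> Dict[str, int]:
    """Aggregate billable units per tenant from auth/API events.

    Assumed (hypothetical) event fields: event_id, tenant_context_id,
    tenant_id, billable_tenant_id, delegated, internal, units.
    """
    seen_event_ids: Set[str] = set()
    metered_delegations: Set[str] = set()
    totals: Dict[str, int] = defaultdict(int)

    for event in events:
        if event.get("internal"):                  # health checks, partner sync
            continue
        if event["event_id"] in seen_event_ids:    # replayed delivery
            continue
        seen_event_ids.add(event["event_id"])

        if event.get("delegated"):
            # Every leg of one delegated operation shares a tenant_context_id;
            # only the first leg seen is metered.
            if event["tenant_context_id"] in metered_delegations:
                continue
            metered_delegations.add(event["tenant_context_id"])
            totals[event["billable_tenant_id"]] += event["units"]
        else:
            totals[event["tenant_id"]] += event["units"]

    return dict(totals)
```

Which tenant is billable for a delegated call is a business decision made upstream; the sketch only guarantees that the operation is counted once and that internal traffic never reaches the counters.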

Migration & Hybrid Strategies

Most platforms do not pick one model and stay there. They start logical for density, then carve out physical or federated boundaries for tenants that demand them. The migration is rarely a flag flip — it is a phased re-issuance of identity.

Migrating a tenant from the shared logical plane to a dedicated realm proceeds in stages: stand up the new realm and mirror its roles; dual-issue tokens so both old and new are accepted during a cutover window; migrate session state into the tenant-scoped store; switch the gateway to route the tenant's hostname to the new realm; and finally revoke the old tokens once the refresh-token TTL window has fully drained. Throughout, the tenant_id claim stays stable so authorization policy and audit correlation never break.

The hard constraints are token TTL and session lifetime. You cannot safely cut over faster than your longest-lived refresh token, and you cannot revoke old credentials until you can prove no live session depends on them. That is why instant revocation, via short access-token TTLs plus a denylist or token versioning, is worth building before you ever need to migrate. This dual-issue discipline parallels the dual-write pattern used in data-layer migrations, such as moving from a shared database to schema-per-tenant, described in the database isolation models reference.
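To make the revocation mechanism concrete, here is a minimal sketch combining a per-tenant token version with a denylist keyed on jti. The `token_version` claim and the `RevocationState` class are assumptions for illustration; a real deployment would back this with a shared store rather than process memory.

```python
from typing import Dict, Mapping, Set


class RevocationState:
    """In-memory stand-in for a shared store such as a cache or database."""

    def __init__(self) -> None:
        self.tenant_versions: Dict[str, int] = {}
        self.denied_jtis: Set[str] = set()

    def revoke_tenant(self, tenant_id: str) -> None:
        """Bump the tenant's version; every token minted earlier becomes stale."""
        self.tenant_versions[tenant_id] = self.tenant_versions.get(tenant_id, 0) + 1

    def revoke_token(self, jti: str) -> None:
        """Deny a single token by its jti."""
        self.denied_jtis.add(jti)

    def is_active(self, claims: Mapping[str, object]) -> bool:
        """Check already-verified claims against the denylist and version."""
        if claims.get("jti") in self.denied_jtis:
            return False
        current = self.tenant_versions.get(str(claims["tenant_id"]), 0)
        # A token without the (hypothetical) token_version claim fails closed.
        return claims.get("token_version") == current
```

The check runs after signature validation on every request, so a version bump takes effect on the next call rather than at token expiry.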

Hybrid estates introduce their own steady-state complexity that outlasts any single migration. Once some tenants authenticate through a shared pool and others through dedicated realms or partner IdPs, the gateway must route by hostname or tenant claim to the correct validation path, and the JWKS cache must hold keys for every active realm. Token validation logic has to select the right key set per tenant without leaking a key meant for one boundary into the validation of another. The discipline that keeps this sane is the same one that enables migration: a stable tenant_id claim and a normalized authorization input, so that the only thing that varies across boundaries is which keys verify the signature, not how the request is authorized once verified.
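One way to keep key sets from leaking across boundaries is to bind each issuer to the tenant it may vouch for and to look keys up only inside that issuer's set. The sketch below assumes hypothetical names (`RealmKeyRegistry`, `select_key`) and represents keys as opaque strings; actual signature verification would follow with the returned key.

```python
from typing import Dict, Mapping, Optional


class RealmKeyRegistry:
    """Maps each issuer to one tenant boundary and its own key set."""

    def __init__(self) -> None:
        # issuer -> {"tenant_id": str, or None for the shared plane; "keys": {kid: key}}
        self._realms: Dict[str, dict] = {}

    def register(self, issuer: str, tenant_id: Optional[str], keys: Mapping[str, str]) -> None:
        self._realms[issuer] = {"tenant_id": tenant_id, "keys": dict(keys)}

    def select_key(self, issuer: str, kid: str, claimed_tenant_id: str) -> str:
        """Return the candidate verification key, or fail closed.

        issuer, kid, and claimed_tenant_id come from the still-unverified
        token, so they only select a candidate key and never grant trust.
        """
        realm = self._realms.get(issuer)
        if realm is None:
            raise PermissionError("unknown issuer")
        # A dedicated realm may only vouch for its own tenant.
        if realm["tenant_id"] is not None and realm["tenant_id"] != claimed_tenant_id:
            raise PermissionError("issuer is not authoritative for this tenant")
        key = realm["keys"].get(kid)
        if key is None:
            raise PermissionError("kid not found in this issuer's key set")
        return key
```

Because the lookup is scoped by issuer first, two realms can reuse the same kid without one realm's key ever being offered for the other's tokens.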

Plan migrations around observable readiness gates rather than calendar dates. Before flipping the gateway route, confirm that the new realm issues tokens accepted by every downstream service, that session state has fully replicated, and that audit correlation still resolves a tenant_context_id end to end. Before revoking old credentials, instrument the proportion of requests still arriving on old-realm tokens and wait for it to reach zero across a full refresh-token lifetime. Treating these as gates — not assumptions — is what separates a clean cutover from a silent partial outage where a subset of a tenant's users are quietly locked out.

Implementation Reference

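The following snippets are deliberately minimal, each illustrating one enforcement point. They assume supporting pieces that are not shown, such as the verifyJwt helper, a populated OPA data bundle, and a reachable Redis instance.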

Validate and pin tenant context from a JWT at the edge (TypeScript / Express). The middleware refuses any request whose token lacks a tenant_id or scope, then pins an immutable context object that downstream handlers read instead of re-deriving tenancy.

import type { Request, Response, NextFunction } from "express";
import { verifyJwt } from "./jwks"; // verifies against cached JWKS

export async function tenantContext(req: Request, res: Response, next: NextFunction) {
  const token = req.headers.authorization?.replace(/^Bearer\s+/i, "");
  if (!token) return res.status(401).json({ error: "missing_token" });

  let payload;
  try {
    payload = await verifyJwt(token); // throws on bad sig / expiry / aud
  } catch {
    return res.status(401).json({ error: "invalid_token" });
  }
  if (!payload.tenant_id || !payload.scope) {
    return res.status(403).json({ error: "missing_tenant_claims" });
  }

  Object.defineProperty(req, "tenant", {
    value: Object.freeze({
      tenantId: payload.tenant_id,
      principalId: payload.sub,
      scope: String(payload.scope).split(" "),
      contextId: payload.tenant_context_id ?? payload.jti,
    }),
    writable: false,
  });
  next();
}

Authorize a cross-tenant request with OPA (Rego). Access defaults to deny. A delegated read is allowed only when the caller holds the partner:read scope and the delegation target matches the resource's tenant; broad cross-tenant management additionally requires an active delegation record supplied by the OPA data bundle.

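package tenant.authz

# rego.v1 syntax (the "if" and "in" keywords) parses on OPA 1.x and on
# recent 0.x releases that support this import.
import rego.v1

# Deny unless a rule below explicitly allows.
default allow := false

# Delegated read: caller holds partner:read and the delegation target
# matches the tenant that owns the resource.
allow if {
    input.method == "GET"
    "partner:read" in input.context.scope
    input.context.delegated_tenant_id == input.resource.tenant_id
}

# Cross-tenant management: admin role, manage scope, and a live delegation.
allow if {
    input.context.role == "admin"
    "cross_tenant:manage" in input.context.scope
    delegation_active(input.context.delegation_id)
}

# A delegation is live when the data bundle marks it active and unexpired.
# input.now must be supplied in the same unit as expires_at.
delegation_active(id) if {
    data.delegations[id].status == "active"
    data.delegations[id].expires_at > input.now
}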

Enforce the tenant boundary at the database with PostgreSQL RLS (SQL). Even if every application guard is bypassed, the row-level policy ties visible rows to the per-session app.tenant_id, providing the last line of defense.

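ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
-- Table owners bypass RLS unless it is forced.
ALTER TABLE documents FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON documents
  USING (tenant_id = current_setting('app.tenant_id')::uuid)
  WITH CHECK (tenant_id = current_setting('app.tenant_id')::uuid);

-- Set the tenant per transaction, from the verified JWT claim and never from
-- user-supplied input. SET LOCAL resets at COMMIT/ROLLBACK, so a pooled
-- connection cannot carry one tenant's setting into another request.
BEGIN;
SET LOCAL app.tenant_id = '8f2b...';
-- SET does not accept bind parameters; parameterized code can use:
--   SELECT set_config('app.tenant_id', $1, true);
COMMIT;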

Store sessions with tenant-scoped keys and bulk-invalidate (Python / Redis). Key prefixing prevents cross-tenant collision; a single namespaced scan invalidates an entire tenant's sessions on a security event.

import hashlib
import json

import redis


class TenantSessionStore:
    def __init__(self, url: str):
        self.client = redis.Redis.from_url(url, decode_responses=True)

    def _key(self, tenant_id: str, session_id: str) -> str:
        # Hash the session id so raw ids never appear in key listings. Keep the
        # full digest: truncating it invites collisions within a tenant.
        digest = hashlib.sha256(session_id.encode()).hexdigest()
        return f"sess:{tenant_id}:{digest}"

    def store(self, tenant_id: str, session_id: str, data: dict, ttl: int = 3600):
        self.client.setex(self._key(tenant_id, session_id), ttl, json.dumps(data))

    def invalidate_tenant(self, tenant_id: str):
        # SCAN walks the keyspace incrementally instead of blocking like KEYS.
        cursor = 0
        while True:
            cursor, keys = self.client.scan(cursor, match=f"sess:{tenant_id}:*", count=500)
            if keys:
                self.client.delete(*keys)
            if cursor == 0:
                break

Register a partner IdP client for federated onboarding (bash / OIDC dynamic registration). Dynamic client registration provisions a partner integration with private_key_jwt auth and a constrained scope, keeping credential management on the partner side.

curl -X POST https://idp.example.com/oidc/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -d '{
    "client_name": "PartnerCorp-Integration",
    "redirect_uris": ["https://partner.example.com/callback"],
    "grant_types": ["authorization_code", "refresh_token"],
    "response_types": ["code"],
    "token_endpoint_auth_method": "private_key_jwt",
    "jwks_uri": "https://partner.example.com/.well-known/jwks.json",
    "scope": "openid profile cross_tenant:read"
  }'

Pitfalls & Anti-Patterns

Shared session store without tenant-scoped keys. Writing sessions under a flat key namespace lets one tenant's session id collide with or be enumerated against another's, leaking state across boundaries. Always namespace keys as sess:{tenant_id}:{id} and consider dedicated instances for regulated tenants. The partitioning and invalidation topology is detailed under session isolation and state management.

Trusting unverified internal context. Treating "internal" service-to-service traffic as inherently safe — forwarding a plaintext X-Tenant-Id header that the receiver trusts without verification — turns one spoofable hop into a full cross-tenant breach. Re-validate a signed token or enforce mTLS at every service boundary, not just the edge.

Default-tenant fallthrough. Middleware that falls back to a "default" tenant when the claim is missing converts an authentication failure into a silent data exposure. Absence of a verified tenant_id must be a hard rejection, never a default.

Over-permissive cross-tenant roles. Granting a broad cross_tenant:* scope instead of a narrowly delegated grant defeats least privilege and is nearly impossible to audit. Model delegation as an explicit, expiring record validated by the policy engine, as described under role-based access control per tenant.

No instant revocation path. Relying solely on long-lived tokens means a compromised credential stays valid until natural expiry. Pair short access-token TTLs with token versioning or a denylist so a single event invalidates sessions across every service immediately.

Billing double-counting from uncorrelated calls. Metering each leg of a delegated cross-tenant request independently inflates usage and corrupts invoices. Correlate by tenant_context_id and deduplicate on a stable event_id before ingestion.

FAQ

How do you allow cross-tenant access without breaking data isolation? Model it as explicit, expiring delegation: issue a scoped delegation token, authorize it at the gateway through the policy engine against the resource's tenant_id, and record the grant in an immutable audit log. The data layer's row-level security still enforces the boundary, so an over-broad token cannot read rows the policy did not intend.

Is physical isolation required for SOC 2 or HIPAA? Not strictly. Logical isolation with strong per-tenant RBAC, cryptographic key separation, and immutable audit logging typically satisfies SOC 2 and HIPAA when properly documented and tested. Regulated tenants or FedRAMP profiles may still require a physical or federated boundary; confirm specifics with a qualified assessor.

What is the performance cost of tenant-aware token validation? Minimal for stateless JWTs validated against cached JWKS — signature checks are sub-millisecond. Latency grows only with complex policy evaluation (typically 1–8ms cached) or synchronous calls to an external identity provider, which should be avoided on the hot path.

How do you revoke access across many tenants at once? Combine short-lived access tokens with token versioning (bump a per-principal or per-tenant version that invalidates outstanding tokens) or a centralized denylist keyed on jti. Broadcast the revocation over pub/sub so every service rejects the credential on its next request.

How do you prevent billing discrepancies from cross-tenant API calls? Carry a unified tenant_context_id from auth through API to metering, flag delegated calls explicitly, and deduplicate on a stable event_id before the metering store aggregates. This ensures a delegated action is counted exactly once against the correct tenant.