Shared Database with Row-Level Security
Row-level security (RLS) keeps every tenant's rows in one set of tables while the database engine itself enforces the boundary on every read and write. It is the densest, lowest-cost option in the broader Multi-Tenant Database Isolation Models family, and the only one where isolation survives an application bug, a raw query, or a careless migration script because the check lives below the application entirely.
The trade is that every table now carries a tenant_id, every connection must declare which tenant it is acting for, and every policy must be written so the planner can still use an index. Get those three things right and a single PostgreSQL instance comfortably serves tens of thousands of tenants. Get any of them wrong and you get either silent cross-tenant leakage or a sequential scan on every query.
Prerequisites
Confirm the following before you enable a single policy. Skipping any of these produces either a leak or a performance cliff that is hard to diagnose later.
- [ ] PostgreSQL 14+ (10+ works, but
current_setting(..., true)null-handling and partition behaviour are cleaner on 14+). - [ ] A non-superuser application role for the connection string. Superusers and
BYPASSRLSroles ignore every policy. - [ ]
tenant_id uuid NOT NULLon every tenant-scoped table, defaulted from session context, never trusted from the client. - [ ] A connection pool that supports per-transaction setup hooks (pgbouncer in transaction mode, or an application pool that runs
SET LOCALinside each transaction). - [ ] A composite index plan:
(tenant_id, <primary access key>)on every policy-protected table. - [ ] A way to set a request-scoped tenant identifier — JWT claim or subdomain resolved at the gateway, validated before any connection is acquired.
- [ ] A separate, audited admin role for cross-tenant reads. Do not reuse the application role.
Step-by-Step Implementation
The work breaks into five ordered steps: stamp the column, enable enforcement, write the policies, inject context per transaction, and lock down the role. Run them in this order — enabling RLS before the column exists, or before policies exist, locks the table to all reads.
1. Add the tenant column and default it from context
Every tenant-scoped table needs tenant_id. Default it from the session variable so inserts cannot forget it and cannot forge it — the value comes from the connection's context, not the row payload.
ALTER TABLE orders
ADD COLUMN tenant_id uuid NOT NULL
DEFAULT current_setting('app.tenant_id')::uuid;
-- Backfill existing rows from an authoritative mapping before going live.
UPDATE orders o
SET tenant_id = m.tenant_id
FROM order_tenant_map m
WHERE m.order_id = o.id;
2. Enable row-level security and force it for owners
ENABLE ROW LEVEL SECURITY activates policies for ordinary roles. FORCE ROW LEVEL SECURITY also applies them to the table owner, which closes the common gap where migrations run as the owner and quietly see every tenant.
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE orders FORCE ROW LEVEL SECURITY;
3. Write read and write policies against the session variable
Use current_setting('app.tenant_id', true) — the true returns NULL instead of raising when the variable is unset, so a missing context fails closed (no rows) rather than throwing an opaque error. USING governs which rows are visible; WITH CHECK governs which rows a write is allowed to produce, preventing a tenant from inserting or updating a row into another tenant's space.
CREATE POLICY tenant_isolation_select ON orders
FOR SELECT
USING (tenant_id = current_setting('app.tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_modify ON orders
FOR ALL
USING (tenant_id = current_setting('app.tenant_id', true)::uuid)
WITH CHECK (tenant_id = current_setting('app.tenant_id', true)::uuid);
4. Inject tenant context per transaction
Set the variable with SET LOCAL inside the transaction that runs the query. SET LOCAL is scoped to the transaction and is discarded on commit or rollback, so a pooled connection never carries one tenant's context into the next request. A plain SET would persist on the physical connection and leak across pooled checkouts.
import { Pool } from "pg";
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
export async function withTenant<T>(
tenantId: string,
run: (client: import("pg").PoolClient) => Promise<T>,
): Promise<T> {
const client = await pool.connect();
try {
await client.query("BEGIN");
// Parameterised set_config avoids string interpolation into SET.
await client.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
const result = await run(client);
await client.query("COMMIT");
return result;
} catch (err) {
await client.query("ROLLBACK");
throw err;
} finally {
client.release();
}
}
5. Grant the application role and strip bypass privileges
The application role gets DML, nothing more. Verify it cannot bypass policies. A role with BYPASSRLS or SUPERUSER silently disables every guarantee above.
GRANT SELECT, INSERT, UPDATE, DELETE ON orders TO app_tenant_role;
ALTER ROLE app_tenant_role NOBYPASSRLS NOSUPERUSER;
-- Audit: any of these returning true on the app role is a leak.
SELECT rolname, rolsuper, rolbypassrls
FROM pg_roles
WHERE rolname = 'app_tenant_role';
PostgreSQL's native policy objects make this clean. Other engines do not have a direct equivalent — MySQL relies on view-based workarounds or security-definer routines, and SQL Server uses inline table-valued security predicates. For the full Postgres reference including partitioned tables and multi-column policies, see implementing RLS in PostgreSQL for SaaS. To prove the boundary actually holds under each role, work through testing RLS policies for tenant isolation.
When Shared RLS Is the Right Choice
RLS is not always the answer. Weigh row density and operational simplicity against the harder namespace boundaries of a schema-per-tenant architecture or the physical separation of database-per-tenant isolation, and read the full cost vs. security tradeoff analysis before locking in.
| Factor | Shared DB + RLS | Schema-Per-Tenant | Database-Per-Tenant |
|---|---|---|---|
| Boundary enforcement | Engine policies | Namespace separation | Physical instance |
| Tenant density per node | Highest (10k+) | Medium (hundreds) | Lowest (tens) |
| Per-tenant migration | One migration | One per schema | One per database |
| Noisy-neighbour blast radius | High | Medium | None |
| Per-tenant encryption keys | Hard | Possible | Native |
| Best fit | Many small tenants | Mid-market, regulated | Enterprise, sovereignty |
The deciding question is usually compliance. If a contract requires a dedicated key or a provable physical boundary, RLS alone will not satisfy the auditor. If the goal is to run thousands of tenants cheaply with a single migration path, RLS wins outright.
The Request-to-Row Path
The hardest part to reason about is the chain from an inbound request to a scoped result set: where the tenant is resolved, where the context is set, and where the engine actually filters. The figure below traces that path and marks the two points where most leaks occur — an unset context and a pooled connection that kept a previous tenant's variable.
Dynamic Query Scoping & Connection Handling
The policy is static; the context is dynamic. Every transaction must arrive at the database already carrying the right app.tenant_id, and that value must be discarded the moment the transaction ends. This is where pooling architecture decides whether RLS is safe.
In transaction pooling (pgbouncer pool_mode = transaction), a physical connection is handed to whichever client is mid-transaction, then returned. SET LOCAL and set_config(..., true) are transaction-scoped, so they are the only safe way to set context here — a session-level SET would stay on the connection and surface under the next tenant. Never use session-level variables with a transaction pooler.
[databases]
app = host=127.0.0.1 port=5432 dbname=app
[pgbouncer]
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 40
server_reset_query = DISCARD ALL
server_reset_query = DISCARD ALL is a backstop, not the primary control: it clears any leaked session state when a server connection is reused, but it runs between transactions, so correct per-transaction SET LOCAL is still mandatory. For the deeper treatment of pool sizing and reset semantics, the routing pillar covers tenant-aware data routing and query scoping end to end.
ORMs need explicit wiring. Run the set_config call as the first statement of every transaction the ORM opens, ideally through a connection hook so no query path can skip it. Do not put the tenant filter in the ORM query builder as your only defence — that is application-layer filtering wearing a disguise, and it disappears the moment someone runs raw SQL.
Security Enforcement & Access Control
RLS gives you defence in depth only when the layers are stacked correctly. The application filter is a convenience and an optimisation; the database policy is the actual boundary. The two roles below must be genuinely separate credentials, never the same connection string with different runtime flags.
| Layer | Mechanism | Enforced by | Failure if absent |
|---|---|---|---|
| Edge | JWT claim / subdomain to tenant_id | Gateway | Requests act as no tenant |
| Transaction | SET LOCAL app.tenant_id |
App + pool | Context bleeds across pooled checkouts |
| Read path | USING predicate |
Database | Cross-tenant rows returned |
| Write path | WITH CHECK predicate |
Database | Rows written into another tenant |
| Admin path | Separate audited role | Database + audit log | Untracked cross-tenant access |
Cross-tenant work — platform analytics, support tooling, billing rollups — runs under a dedicated role, not the application role. Give that role BYPASSRLS, restrict it to read-only endpoints, and log every statement it runs to an immutable sink. Tie those credentials into the broader auth and cross-tenant access control model so the admin role's grants are reviewed alongside every other privileged path. The principle: tenants are isolated by policy, operators are isolated by audit.
Operational Overhead & Scaling Metrics
RLS adds a predicate to every query. With the right index it is a few percent of CPU; without it, it is a sequential scan that grows linearly with table size. Watch the following and act at the thresholds.
| Metric | Healthy | Warning threshold | Mitigation |
|---|---|---|---|
| Policy-protected query plan | Index Scan on (tenant_id, key) |
Seq Scan in EXPLAIN ANALYZE |
Add composite (tenant_id, key) index |
| Policy eval CPU share | 1–3% | >8% sustained | Simplify predicate; avoid function calls in USING |
| Pool utilisation | <70% | >85% with stale contexts | Confirm SET LOCAL, not session SET |
| Largest table heap size | <500 GB | >1–2 TB per table | Declarative partition by tenant_id |
| Tenants per node | up to ~10k–50k | planner stats skew on huge tenants | Per-tenant ANALYZE; consider extracting whales |
The single highest-leverage action is the composite index. A policy of tenant_id = current_setting(...)::uuid is only fast if the planner can satisfy tenant_id from an index that also covers the query's access pattern. Lead the index with tenant_id, follow with the column the query orders or filters on.
CREATE INDEX idx_orders_tenant_created
ON orders (tenant_id, created_at DESC);
-- Confirm the planner uses it, not a sequential scan.
EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY created_at DESC LIMIT 50;
For very large tenants, declarative partitioning by tenant_id lets partition pruning discard irrelevant data before the policy predicate even runs, keeping per-query cost flat as total rows grow.
Pitfalls & Anti-Patterns
Application-only filtering. Relying on WHERE tenant_id = ? in application code with no database policy means the boundary vanishes the instant anyone runs raw SQL, a migration, or an ORM escape hatch. The filter is fine as defence in depth; it must never be the only enforcement.
Session-level context in a pool. Setting app.tenant_id with a plain SET on a pooled connection persists it past the request. The next tenant to check out that physical connection inherits the previous tenant's context and reads their rows. Always use SET LOCAL or set_config(..., true) inside the transaction.
Bypass privileges on the application role. A SUPERUSER or BYPASSRLS application role ignores every policy silently — no error, just full visibility. Verify rolbypassrls is false on the connection's role, and skip FORCE ROW LEVEL SECURITY and even the owner sees everything.
Missing or mis-ordered composite index. Without a (tenant_id, key) index leading on tenant_id, the policy predicate forces a sequential scan that scales with the whole table, not the tenant's slice. Latency climbs as unrelated tenants add rows.
No WITH CHECK on writes. A read policy alone (USING only) blocks reads but lets a tenant insert or update a row stamped with another tenant's tenant_id. Always pair USING with WITH CHECK on INSERT, UPDATE, and DELETE paths.
Frequently Asked Questions
Does enabling RLS slow down queries?
Only marginally when a composite (tenant_id, key) index exists and the planner uses predicate pushdown — typically 1–3% CPU. The visible slowdowns come from missing indexes that turn the tenant predicate into a sequential scan, or from policies that call functions the planner cannot push down.
Can RLS fully replace application-level tenant filtering? Yes for correctness — the database becomes the authoritative boundary, so isolation holds even when application code is wrong. Keep the application filter as defence in depth and as a query optimisation hint, but never depend on it as the only mechanism.
How do I run cross-tenant analytics or support queries?
Use a separate role granted BYPASSRLS, restricted to read-only endpoints, with every statement written to an immutable audit log. Do not reuse the application role or toggle bypass at runtime — the privileged path must be a distinct, reviewable credential.
What breaks when the tenant context is not set?
With current_setting('app.tenant_id', true) the predicate evaluates against NULL, which matches no rows, so the query fails closed and returns nothing. That is the desired behaviour: a forgotten SET LOCAL produces an obvious empty result, not a silent cross-tenant read.
How many tenants can one shared instance hold?
Tens of thousands of small tenants per node is routine. The practical limits are planner statistics skew when one tenant is vastly larger than the rest, and total table size — partition by tenant_id and run per-tenant ANALYZE before either becomes a problem.