Coverage for integrations / social / tenant_filter.py: 60.2%
88 statements
coverage.py v7.14.0, created at 2026-05-12 04:49 +0000
"""
HevolveSocial: global tenant filter for SQLAlchemy ORM queries.

Phase 7a / Phase 8. Plan reference: sunny-gliding-eich.md, Part C.1 + Part E.1.

Migration v40 (`migrate_to_v40_tenancy`) added a nullable `tenant_id`
column to ~34 social tables via raw `ALTER TABLE`. The columns exist
in the SQL schema but were never declared on the Python ORM model
classes, so without this module every ORM query would silently
ignore them and return every tenant's rows undifferentiated.

Strict mode (Phase 8), rollback semantics:
    Toggling `tenant_strict_mode` on or off is bit-for-bit reversible
    at the SQL layer. The listener only ADDS a WHERE clause; it never
    mutates any row. The `before_flush` auto-stamp continues to write
    `tenant_id = g.tenant_id` on inserts in BOTH modes, so no new
    untenanted rows appear once strict is on. Flipping back to loose
    immediately restores visibility of pre-v40 NULL rows. Operators
    can toggle the per-tenant override safely without DB mutations.

Strict mode, env-var caveat:
    `HEVOLVE_FLAG_TENANT_STRICT_MODE` is a PROCESS-GLOBAL fallback
    used when no Flask request context is active (daemon threads,
    scripts, tests). In production, per-request `g.feature_flags`
    takes priority; it is set per tenant via the `tenant_feature_flags`
    table. Don't flip the env var on a multi-tenant cloud node
    unless you intend it for every tenant on that process.

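A minimal sketch of the resolution order described in this caveat, assuming a plain dict stands in for `g.feature_flags` and a plain mapping stands in for `os.environ`. The helper name `resolve_strict_mode` is hypothetical and not part of this module; the accepted truthy strings follow the env-var parsing in `_strict_mode_enabled`.

```python
from typing import Mapping, Optional

# Strings the env var treats as "on" (compared case-insensitively, trimmed).
TRUTHY = ('1', 'true', 'yes', 'on')


def resolve_strict_mode(flags: Optional[Mapping[str, object]],
                        environ: Mapping[str, str]) -> bool:
    # Hypothetical helper. `flags` is None when there is no request context.
    # 1. A per-request flag wins whenever the key is present, even if False.
    if flags and 'tenant_strict_mode' in flags:
        return bool(flags['tenant_strict_mode'])
    # 2. Otherwise fall back to the process-global env var.
    raw = environ.get('HEVOLVE_FLAG_TENANT_STRICT_MODE', '')
    # 3. Anything not in TRUTHY (including unset) means loose mode.
    return raw.strip().lower() in TRUTHY


env_on = {'HEVOLVE_FLAG_TENANT_STRICT_MODE': ' ON '}
per_request_wins = resolve_strict_mode({'tenant_strict_mode': False}, env_on)
env_fallback = resolve_strict_mode(None, env_on)
default_off = resolve_strict_mode({}, {})
```

Because an explicit per-request `False` beats a truthy env var, the env var only affects requests whose flags do not carry the key at all.
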
This module fixes the undeclared-column problem described above without
modifying the external `hevolve-database` repo (`sql.models`) or the
local fallback (`_models_local.py`):

  1. `register_tenant_aware(cls)` runtime-augments the ORM mapper with
     a `tenant_id` Column attribute. The SQL column already exists; we
     just teach SQLAlchemy about it.

  2. `install_tenant_filter()` installs two SQLAlchemy event listeners
     on the global Session class:

     a. `do_orm_execute`: injects `with_loader_criteria` on every
        SELECT for tenant-aware classes. Filter shape in loose mode
        (the default) is:
            `tenant_id == g.tenant_id OR tenant_id IS NULL`
        The NULL pass-through means flat/regional rows (untenanted)
        remain visible in cloud mode for backward compatibility.

     b. `before_flush`: auto-stamps `tenant_id = g.tenant_id` on
        every INSERT for tenant-aware classes that don't already have
        it set, so apps don't need to remember to set it.

  3. Both listeners no-op outside Flask request context (tests, daemon
     threads) and when `g.tenant_id` is None (flat/regional). This is
     the property that makes the listener regression-safe: adding it
     to a single-tenant deploy changes no behavior.

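The filter shape from item 2 and the strict/loose difference can be illustrated with plain SQL. This sketch uses the standard-library `sqlite3` module rather than the SQLAlchemy listeners, and the `post` table, its rows, and the `visible_titles` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, tenant_id TEXT)')
conn.executemany(
    'INSERT INTO post (title, tenant_id) VALUES (?, ?)',
    [('a-own', 'tenant-a'), ('b-own', 'tenant-b'), ('legacy', None)],
)


def visible_titles(tid, strict):
    # Loose: the tenant's own rows plus untenanted (NULL) rows.
    # Strict: the tenant's own rows only.
    if strict:
        sql = 'SELECT title FROM post WHERE tenant_id = ? ORDER BY id'
    else:
        sql = ('SELECT title FROM post '
               'WHERE tenant_id = ? OR tenant_id IS NULL ORDER BY id')
    return [row[0] for row in conn.execute(sql, (tid,))]


loose_rows = visible_titles('tenant-a', strict=False)
strict_rows = visible_titles('tenant-a', strict=True)
loose_again = visible_titles('tenant-a', strict=False)
```

Only the WHERE clause differs between the two modes and no row is written, so switching back to loose makes the NULL row visible again.
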
Usage (called once at startup, after models are imported):

    from integrations.social.tenant_filter import (
        install_tenant_filter, register_tenant_aware,
    )
    from integrations.social.models import (
        User, Post, Comment, Community, Notification,
        CommunityMembership, Vote, Follow,
    )
    install_tenant_filter()
    for cls in (User, Post, Comment, Community, Notification,
                CommunityMembership, Vote, Follow):
        register_tenant_aware(cls)

Idempotent: calling either function twice is a no-op.

Defense in depth:
    This is the same kind of defensive layer that the WAMP per-tenant
    ACL provides on the realtime side. The two cover different attack
    surfaces: WAMP gates cross-tenant subscribe, the ORM filter gates
    cross-tenant query. Both must be installed for a multi-tenant
    cloud deploy.
"""

from __future__ import annotations

import logging
import threading
from typing import List, Type

from sqlalchemy import Column, String, event, inspect, or_, and_, true, false
from sqlalchemy.orm import Session, with_loader_criteria

logger = logging.getLogger('hevolve_social')

# Whitelist of tenant-aware classes. Populated by register_tenant_aware().
# Order doesn't matter: the listener iterates and applies a criterion
# per class on every query.
_TENANT_AWARE: List[Type] = []
_TENANT_AWARE_LOCK = threading.Lock()

# Sentinel: True once install_tenant_filter() has wired the listeners
# onto the global Session class. Idempotent.
#
# Pass-1 M4 note, idempotency contract:
#   Both `install_tenant_filter()` and `register_tenant_aware()` are
#   safe to call multiple times. install_tenant_filter short-circuits
#   on the second call via _INSTALLED; register_tenant_aware
#   short-circuits via the `cls in _TENANT_AWARE` check. The listener
#   itself reads _TENANT_AWARE on every query, so registering more
#   classes after install_tenant_filter has already fired works
#   correctly, with no need to re-install. Test fixtures in this repo
#   exploit this pattern: `models_mod._engine = None` between tests
#   resets the engine, but the Session-class listener stays attached
#   across tests, so it picks up the fresh engine's queries
#   automatically.
_INSTALLED = False
_INSTALL_LOCK = threading.Lock()


def _augment_class(cls: Type) -> None:
    """Teach SQLAlchemy about the `tenant_id` column on `cls`.

    The migration v40 created the column in the SQL schema; we just
    need to declare it on the in-memory Table + Mapper so ORM queries
    can reference it.

    Idempotent: if the class already has `tenant_id` (because some
    future canonical model adds it declaratively), this is a no-op.
    """
    try:
        mapper = inspect(cls)
    except Exception as e:
        logger.warning("tenant_filter: cannot inspect %s: %s", cls, e)
        return

    # Already declared (canonical model added it): nothing to do.
    if 'tenant_id' in mapper.columns:
        return

    # Add the column to the Table object. append_column is in-memory
    # only; SQLAlchemy never re-issues DDL for it.
    table = cls.__table__
    if 'tenant_id' not in table.c:
        try:
            table.append_column(
                Column('tenant_id', String(64), nullable=True, index=True))
        except Exception as e:
            logger.warning("tenant_filter: append_column failed for %s: %s",
                           cls, e)
            return

    # Register the column as a mapped property so cls.tenant_id works.
    try:
        mapper.add_property('tenant_id', table.c.tenant_id)
    except Exception as e:
        logger.warning("tenant_filter: add_property failed for %s: %s",
                       cls, e)


def register_tenant_aware(cls: Type) -> None:
    """Mark `cls` as tenant-aware: queries are filtered, inserts stamped.

    Safe to call before or after `install_tenant_filter()`. Idempotent.
    """
    with _TENANT_AWARE_LOCK:
        if cls in _TENANT_AWARE:
            return
        _augment_class(cls)
        _TENANT_AWARE.append(cls)
        logger.debug("tenant_filter: registered %s", cls.__name__)


def get_tenant_aware_classes() -> List[Type]:
    """Snapshot of the registered tenant-aware classes (test/inspection)."""
    with _TENANT_AWARE_LOCK:
        return list(_TENANT_AWARE)


def _current_tenant_id():
    """Return the current request's tenant_id, or None if not in a Flask
    request context. Wrapped in try/except to gracefully handle:

      - Flask not installed (extreme degraded boot)
      - Outside any request (daemon threads, scripts, tests)
      - g.tenant_id not set (require_auth never ran: admin-only routes
        bypassing auth, or pre-7a code path)
    """
    try:
        from flask import g, has_request_context
        if not has_request_context():
            return None
        return getattr(g, 'tenant_id', None)
    except Exception:
        return None


def _strict_mode_enabled() -> bool:
    """Phase 8: when the `tenant_strict_mode` flag is on, the NULL
    pass-through is dropped and legacy untenanted rows become invisible
    to tenanted requests. This is the Pass-2 H-NEW-2 hardening
    promised at the time of the original loose-mode trade-off, plus
    the closely related Pass-4 P4-3 system-agent concern (legacy
    rows could carry across tenants).

    Falls back to env var `HEVOLVE_FLAG_TENANT_STRICT_MODE` when no
    Flask context is active so tests + daemon threads can opt in.

    Priority order (Pass-5 F6: same shape as feature_flags.get_flag,
    just compressed because g.feature_flags has already resolved
    tenant > env > default at request boot):

      1. g.feature_flags['tenant_strict_mode'] (decided per request)
      2. HEVOLVE_FLAG_TENANT_STRICT_MODE env var (PROCESS-GLOBAL;
         see module docstring caveat; do not set on multi-tenant
         cloud nodes unless every tenant should be strict)
      3. False (the default: strict mode off)
    """
    try:
        from flask import g, has_request_context
        if has_request_context():
            flags = getattr(g, 'feature_flags', None) or {}
            if 'tenant_strict_mode' in flags:
                return bool(flags['tenant_strict_mode'])
    except Exception:
        pass
    # Env-var fallback so tests + scripts can flip the mode without
    # bootstrapping a full Flask request.
    import os
    raw = os.environ.get('HEVOLVE_FLAG_TENANT_STRICT_MODE', '')
    return raw.strip().lower() in ('1', 'true', 'yes', 'on')


def _build_criterion(cls, tid, strict):
    """Build a `with_loader_criteria` option for `cls`.

    Two modes:
      Loose  -> `(tenant_id == :tid) OR (tenant_id IS NULL)`
      Strict -> `(tenant_id == :tid)` (NULL rows hidden)

    Critical implementation detail: SQLAlchemy 2.0 tracks lambda
    CLOSURE variables for cache-invalidation, but NOT default-args
    (`__defaults__`). Using `lambda c, _tid=tid: ...` would bake
    the FIRST call's tid into the cached SQL forever; subsequent
    calls with different tids would reuse the stale SQL. The fix
    is to capture `tid` via ACTUAL closure (no default arg) so
    SQLAlchemy's tracker correctly invalidates per-tid.

    Strict vs loose is encoded by RETURNING two structurally
    different functions (different `__code__`), so the cache
    naturally separates them.
    """
    if strict:
        def strict_criterion(c):
            return c.tenant_id == tid
        return with_loader_criteria(
            cls, strict_criterion, include_aliases=True)
    else:
        def loose_criterion(c):
            return or_(c.tenant_id == tid, c.tenant_id.is_(None))
        return with_loader_criteria(
            cls, loose_criterion, include_aliases=True)


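The closure versus default-argument distinction that the `_build_criterion` docstring relies on is visible in plain Python: a value bound as a default argument is stored in the function's `__defaults__`, while a value captured from an enclosing scope is stored in `__closure__`. The sketch below only shows where each value lives. It does not exercise SQLAlchemy's cache, and the factory names are hypothetical.

```python
def make_with_default(tid):
    # tid is evaluated once and frozen into the default arguments.
    return lambda c, _tid=tid: (c, _tid)


def make_with_closure(tid):
    # tid is captured as a free variable of the inner function.
    def criterion(c):
        return (c, tid)
    return criterion


by_default = make_with_default('tenant-a')
by_closure = make_with_closure('tenant-a')

default_args = by_default.__defaults__    # the bound default values
default_cells = by_default.__closure__    # nothing captured from outside
closure_args = by_closure.__defaults__    # no default arguments at all
closure_values = [cell.cell_contents for cell in by_closure.__closure__]
free_names = by_closure.__code__.co_freevars
```

A tool that inspects only `__closure__` sees `tid` in the second function and nothing in the first, which is consistent with the docstring's reason for avoiding the default-argument form.
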
def install_tenant_filter() -> None:
    """Install do_orm_execute + before_flush listeners on Session.

    Idempotent: repeat calls are a no-op. Listeners only fire when
    inside a Flask request context AND `g.tenant_id` is non-None,
    making the install free of regression for single-tenant deploys.
    """
    global _INSTALLED
    with _INSTALL_LOCK:
        if _INSTALLED:
            return
        _INSTALLED = True

    @event.listens_for(Session, 'do_orm_execute')
    def _on_orm_execute(execute_state):
        # Only SELECTs are filtered here. INSERT stamping is handled by
        # the before_flush listener below; UPDATE and DELETE statements
        # are not filtered by this listener.
        if not execute_state.is_select:
            return
        tid = _current_tenant_id()
        if tid is None:
            return  # flat/regional or outside request context: pass-through

        # Snapshot of registered classes. Callers may register more
        # after install; the snapshot is rebuilt on every query, so new
        # classes are picked up automatically.
        with _TENANT_AWARE_LOCK:
            classes = list(_TENANT_AWARE)

        # Phase 8: strict mode drops the NULL pass-through. Legacy
        # untenanted rows become invisible to tenanted requests.
        # Tradeoff vs. loose mode (default): strict gives stronger
        # cross-tenant isolation guarantees at the cost of breaking
        # backward-compat reads of pre-v40 rows. Choose per tenant.
        #
        # IMPLEMENTATION NOTE (lambda caching):
        # The criterion is built by `_build_criterion`, which returns
        # one of two structurally different functions (strict or
        # loose) and captures `tid` as a real closure variable, not as
        # a default argument. See that function's docstring for the
        # cache-invalidation reasoning behind both choices.
        strict = _strict_mode_enabled()

        for cls in classes:
            execute_state.statement = execute_state.statement.options(
                _build_criterion(cls, tid, strict)
            )

    @event.listens_for(Session, 'before_flush')
    def _on_before_flush(session, flush_context, instances):
        tid = _current_tenant_id()
        if tid is None:
            return

        with _TENANT_AWARE_LOCK:
            classes = tuple(_TENANT_AWARE)

        if not classes:
            return

        # Auto-stamp tenant_id on new (INSERT) instances of registered
        # classes. Never overwrites an explicit value the caller set;
        # this is purely a default for code that pre-dates 7a.
        for instance in session.new:
            if isinstance(instance, classes):
                if getattr(instance, 'tenant_id', None) is None:
                    try:
                        instance.tenant_id = tid
                    except Exception:
                        pass

    logger.info("tenant_filter: installed (do_orm_execute + before_flush "
                "listeners on global Session)")


__all__ = [
    'install_tenant_filter',
    'register_tenant_aware',
    'get_tenant_aware_classes',
]