Coverage for integrations / social / tenant_filter.py: 60.2%

88 statements  

coverage.py v7.14.0, created at 2026-05-12 04:49 +0000

"""
HevolveSocial: global tenant filter for SQLAlchemy ORM queries.

Phase 7a / Phase 8. Plan reference: sunny-gliding-eich.md, Part C.1 + Part E.1.

Migration v40 (`migrate_to_v40_tenancy`) added a nullable `tenant_id`
column to ~34 social tables via raw `ALTER TABLE`. The columns exist
in the SQL schema but were never declared on the Python ORM model
classes, so without this module every ORM query would silently
ignore them and return every tenant's rows undifferentiated.

Strict mode (Phase 8), rollback semantics:
    Toggling `tenant_strict_mode` on or off is bit-for-bit reversible
    at the SQL layer. The listener only ADDS a WHERE clause; it never
    mutates any row. The `before_flush` auto-stamp continues to write
    `tenant_id = g.tenant_id` on inserts in BOTH modes, so no new
    untenanted rows appear once strict is on. Flipping back to loose
    immediately restores visibility of pre-v40 NULL rows. Operators
    can toggle the per-tenant override safely without DB mutations.

Strict mode, env-var caveat:
    `HEVOLVE_FLAG_TENANT_STRICT_MODE` is a PROCESS-GLOBAL fallback
    used when no Flask request context is active (daemon threads,
    scripts, tests). In production, per-request `g.feature_flags`
    takes priority and is set per tenant via the `tenant_feature_flags`
    table. Don't flip the env var on a multi-tenant cloud node
    unless you intend it for every tenant on that process.

This module closes the gap described above (ORM queries ignoring
`tenant_id`) without modifying the external `hevolve-database`
repo (`sql.models`) or the local fallback (`_models_local.py`):

  1. `register_tenant_aware(cls)` runtime-augments the ORM mapper with
     a `tenant_id` Column attribute. The SQL column already exists; we
     just teach SQLAlchemy about it.

  2. `install_tenant_filter()` installs two SQLAlchemy event listeners
     on the global Session class:

     a. `do_orm_execute`: injects `with_loader_criteria` on every
        SELECT for tenant-aware classes. In loose mode (the default)
        the filter shape is:
            `tenant_id == g.tenant_id OR tenant_id IS NULL`
        The NULL pass-through means flat/regional rows (untenanted)
        remain visible in cloud mode for backward compatibility.
        Strict mode drops the `IS NULL` branch.

     b. `before_flush`: auto-stamps `tenant_id = g.tenant_id` on
        every INSERT for tenant-aware classes that don't already have
        it set, so apps don't need to remember to set it.

  3. Both listeners no-op outside a Flask request context (tests,
     daemon threads) and when `g.tenant_id` is None (flat/regional).
     This is the property that makes the listeners regression-safe:
     adding them to a single-tenant deploy changes no behavior.

Usage (called once at startup, after models are imported):

    from integrations.social.tenant_filter import (
        install_tenant_filter, register_tenant_aware,
    )
    from integrations.social.models import (
        User, Post, Comment, Community, Notification,
        CommunityMembership, Vote, Follow,
    )
    install_tenant_filter()
    for cls in (User, Post, Comment, Community, Notification,
                CommunityMembership, Vote, Follow):
        register_tenant_aware(cls)

Idempotent: calling either function a second time is a no-op.

Defense in depth:
    This is the same kind of defensive layer that the WAMP per-tenant
    ACL provides on the realtime side. The two cover different attack
    surfaces: WAMP gates cross-tenant subscribe, the ORM filter gates
    cross-tenant queries. Both must be installed for a multi-tenant
    cloud deploy.
"""

from __future__ import annotations

import logging
import threading
from typing import List, Type

from sqlalchemy import Column, String, event, inspect, or_, and_, true, false
from sqlalchemy.orm import Session, with_loader_criteria

logger = logging.getLogger('hevolve_social')

# Whitelist of tenant-aware classes. Populated by register_tenant_aware().
# Order doesn't matter: the listener iterates and applies a criterion
# per class on every query.
_TENANT_AWARE: List[Type] = []
_TENANT_AWARE_LOCK = threading.Lock()

# Sentinel: True once install_tenant_filter() has wired the listeners
# onto the global Session class. Idempotent.
#
# Pass-1 M4 note, idempotency contract:
#   Both `install_tenant_filter()` and `register_tenant_aware()` are
#   safe to call multiple times. install_tenant_filter short-circuits
#   on the second call via _INSTALLED; register_tenant_aware
#   short-circuits via the `cls in _TENANT_AWARE` check. The listener
#   itself reads _TENANT_AWARE on every query, so registering more
#   classes after install_tenant_filter has already fired works
#   correctly, with no need to re-install. Test fixtures in this repo
#   exploit this pattern: `models_mod._engine = None` between tests
#   resets the engine, but the Session-class listener stays attached
#   across tests, so it picks up the fresh engine's queries
#   automatically.
_INSTALLED = False
_INSTALL_LOCK = threading.Lock()

def _augment_class(cls: Type) -> None:
    """Teach SQLAlchemy about the `tenant_id` column on `cls`.

    Migration v40 created the column in the SQL schema; we just
    need to declare it on the in-memory Table + Mapper so ORM queries
    can reference it.

    Idempotent: if the class already has `tenant_id` (because some
    future canonical model adds it declaratively), this is a no-op.
    """
    try:
        mapper = inspect(cls)
    except Exception as e:
        logger.warning("tenant_filter: cannot inspect %s: %s", cls, e)
        return

    # Already declared (canonical model added it): nothing to do.
    if 'tenant_id' in mapper.columns:
        return

    # Add the column to the Table object. append_column is in-memory
    # only; SQLAlchemy never re-issues DDL for it.
    table = cls.__table__
    if 'tenant_id' not in table.c:
        try:
            table.append_column(
                Column('tenant_id', String(64), nullable=True, index=True))
        except Exception as e:
            logger.warning("tenant_filter: append_column failed for %s: %s",
                           cls, e)
            return

    # Register the column as a mapped property so cls.tenant_id works.
    try:
        mapper.add_property('tenant_id', table.c.tenant_id)
    except Exception as e:
        logger.warning("tenant_filter: add_property failed for %s: %s",
                       cls, e)

def register_tenant_aware(cls: Type) -> None:
    """Mark `cls` as tenant-aware: queries are filtered, inserts stamped.

    Safe to call before or after `install_tenant_filter()`. Idempotent.
    """
    with _TENANT_AWARE_LOCK:
        if cls in _TENANT_AWARE:
            return
        _augment_class(cls)
        _TENANT_AWARE.append(cls)
        logger.debug("tenant_filter: registered %s", cls.__name__)


def get_tenant_aware_classes() -> List[Type]:
    """Snapshot of the registered tenant-aware classes (test/inspection)."""
    with _TENANT_AWARE_LOCK:
        return list(_TENANT_AWARE)
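The idempotency contract above (register once, read a snapshot under the lock) can be illustrated in isolation. The following is a minimal standalone sketch, not the module's code: `_REGISTRY`, `register`, `snapshot`, and the `Post`/`Comment` classes are hypothetical stand-ins, and `register` returns a bool purely so the second call's no-op is observable.

```python
import threading
from typing import List, Type

# Hypothetical stand-ins for _TENANT_AWARE and its lock.
_REGISTRY: List[Type] = []
_LOCK = threading.Lock()


def register(cls: Type) -> bool:
    """Add cls at most once; return True only when it was newly added."""
    with _LOCK:
        if cls in _REGISTRY:
            return False          # second call is a no-op
        _REGISTRY.append(cls)
        return True


def snapshot() -> List[Type]:
    """Return a copy so callers can iterate without holding the lock."""
    with _LOCK:
        return list(_REGISTRY)


class Post:       # placeholder model class
    pass


class Comment:    # placeholder model class
    pass


first = register(Post)
second = register(Post)       # repeated registration changes nothing
register(Comment)             # late registration is picked up by the next snapshot
names = [c.__name__ for c in snapshot()]
```

Returning a copy from `snapshot()` mirrors `get_tenant_aware_classes()`: a caller that mutates the returned list cannot corrupt the shared registry.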

def _current_tenant_id():
    """Return the current request's tenant_id, or None if not in a Flask
    request context. Wrapped in try/except to gracefully handle:

    - Flask not installed (extreme degraded boot)
    - Outside any request (daemon threads, scripts, tests)
    - g.tenant_id not set (require_auth never ran: admin-only routes
      bypassing auth, or pre-7a code path)
    """
    try:
        from flask import g, has_request_context
        if not has_request_context():
            return None
        return getattr(g, 'tenant_id', None)
    except Exception:
        return None

def _strict_mode_enabled() -> bool:
    """Phase 8: when the `tenant_strict_mode` flag is on, the NULL
    pass-through is dropped and legacy untenanted rows become invisible
    to tenanted requests. This is the Pass-2 H-NEW-2 hardening
    promised at the time of the original loose-mode trade-off, plus
    the closely related Pass-4 P4-3 system-agent concern (legacy
    rows could carry across tenants).

    Falls back to env var `HEVOLVE_FLAG_TENANT_STRICT_MODE` when no
    Flask context is active so tests + daemon threads can opt in.

    Priority order (Pass-5 F6: same shape as feature_flags.get_flag,
    just compressed because g.feature_flags has already resolved
    tenant > env > default at request boot):

      1. g.feature_flags['tenant_strict_mode'] (decided per request)
      2. HEVOLVE_FLAG_TENANT_STRICT_MODE env var (PROCESS-GLOBAL;
         see module docstring caveat; do not set on multi-tenant
         cloud nodes unless every tenant should be strict)
      3. False (strict mode is off by default)
    """
    try:
        from flask import g, has_request_context
        if has_request_context():
            flags = getattr(g, 'feature_flags', None) or {}
            if 'tenant_strict_mode' in flags:
                return bool(flags['tenant_strict_mode'])
    except Exception:
        pass
    # Env-var fallback so tests + scripts can flip the mode without
    # bootstrapping a full Flask request.
    import os
    raw = os.environ.get('HEVOLVE_FLAG_TENANT_STRICT_MODE', '')
    return raw.strip().lower() in ('1', 'true', 'yes', 'on')

def _build_criterion(cls, tid, strict):
    """Build a `with_loader_criteria` option for `cls`.

    Two modes:
      Loose  → `(tenant_id == :tid) OR (tenant_id IS NULL)`
      Strict → `(tenant_id == :tid)` (NULL rows hidden)

    Critical implementation detail: SQLAlchemy 2.0 tracks lambda
    CLOSURE variables for cache-invalidation, but NOT default-args
    (`__defaults__`). Using `lambda c, _tid=tid: ...` would bake
    the FIRST call's tid into the cached SQL forever; subsequent
    calls with different tids would reuse the stale SQL. The fix
    is to capture `tid` via ACTUAL closure (no default arg) so
    SQLAlchemy's tracker correctly invalidates per-tid.

    Strict vs loose is encoded by RETURNING two structurally-
    different lambdas (different `__code__`), so the cache
    naturally separates them.
    """
    if strict:
        def strict_criterion(c):
            return c.tenant_id == tid
        return with_loader_criteria(
            cls, strict_criterion, include_aliases=True)
    else:
        def loose_criterion(c):
            return or_(c.tenant_id == tid, c.tenant_id.is_(None))
        return with_loader_criteria(
            cls, loose_criterion, include_aliases=True)
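The closure versus default-argument distinction in the docstring above is visible in plain Python, independent of SQLAlchemy. This sketch only shows where each style stores the captured value and that functions built by the same factory share one code object; it does not reproduce SQLAlchemy's caching. `make_closure` and `make_default` are hypothetical names.

```python
def make_closure(tid):
    def criterion(row_tenant):
        # `tid` is a free variable: its value lives in criterion.__closure__
        return row_tenant == tid
    return criterion


def make_default(tid):
    def criterion(row_tenant, _tid=tid):
        # `_tid` is a default argument: its value lives in criterion.__defaults__
        return row_tenant == _tid
    return criterion


a = make_closure('t1')
b = make_closure('t2')
c = make_default('t1')

# Same factory, same code object, different captured values.
same_code = a.__code__ is b.__code__
a_cell = a.__closure__[0].cell_contents
b_cell = b.__closure__[0].cell_contents

# The default-arg variant has no closure cells at all.
c_closure = c.__closure__
c_defaults = c.__defaults__
```

A cache keyed on the code object alone cannot tell `a` from `b`, so whatever inspects the function has to read the captured value from somewhere; per the docstring, SQLAlchemy reads closure cells and not `__defaults__`, which is why the module captures `tid` by closure.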

def install_tenant_filter() -> None:
    """Install do_orm_execute + before_flush listeners on Session.

    Idempotent: repeat calls are a no-op. The listeners act only
    inside a Flask request context AND when `g.tenant_id` is non-None,
    which makes the install regression-free for single-tenant deploys.
    """
    global _INSTALLED
    with _INSTALL_LOCK:
        if _INSTALLED:
            return
        _INSTALLED = True

    @event.listens_for(Session, 'do_orm_execute')
    def _on_orm_execute(execute_state):
        # Only SELECTs are filtered here. New rows are stamped in
        # before_flush; other statement types pass through unchanged.
        if not execute_state.is_select:
            return
        tid = _current_tenant_id()
        if tid is None:
            return  # flat/regional or outside request context: pass-through

        # Snapshot of registered classes. Callers may register more
        # after install; the snapshot is rebuilt on every query, so
        # late registrations are picked up automatically.
        with _TENANT_AWARE_LOCK:
            classes = list(_TENANT_AWARE)

        # Phase 8: strict mode drops the NULL pass-through. Legacy
        # untenanted rows become invisible to tenanted requests.
        # Tradeoff vs. loose mode (default): strict gives stronger
        # cross-tenant isolation guarantees at the cost of breaking
        # backward-compat reads of pre-v40 rows. Choose per tenant.
        #
        # IMPLEMENTATION NOTE (lambda caching gotcha):
        #   SQLAlchemy's `with_loader_criteria` caches compiled SQL by
        #   the criterion function's CODE object. `_build_criterion`
        #   therefore returns one of two structurally different
        #   functions (strict or loose) and captures `tid` as a real
        #   closure variable, never as a default argument. See its
        #   docstring for the reasoning.
        strict = _strict_mode_enabled()

        for cls in classes:
            execute_state.statement = execute_state.statement.options(
                _build_criterion(cls, tid, strict)
            )

    @event.listens_for(Session, 'before_flush')
    def _on_before_flush(session, flush_context, instances):
        tid = _current_tenant_id()
        if tid is None:
            return

        with _TENANT_AWARE_LOCK:
            classes = tuple(_TENANT_AWARE)

        if not classes:
            return

        # Auto-stamp tenant_id on new (INSERT) instances of registered
        # classes. Never overwrites an explicit value the caller set;
        # this is purely a default for code that pre-dates 7a.
        for instance in session.new:
            if isinstance(instance, classes):
                if getattr(instance, 'tenant_id', None) is None:
                    try:
                        instance.tenant_id = tid
                    except Exception:
                        pass

    logger.info("tenant_filter: installed (do_orm_execute + before_flush "
                "listeners on global Session)")


__all__ = [
    'install_tenant_filter',
    'register_tenant_aware',
    'get_tenant_aware_classes',
]
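The loose and strict filter shapes described in this module can be checked at the SQL level with the standard-library `sqlite3` driver. This is an illustrative sketch only: the `post` table, its columns, and the tenant names are hypothetical, and the WHERE clauses are written by hand rather than generated by SQLAlchemy.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE post (id INTEGER PRIMARY KEY, body TEXT, tenant_id TEXT)')
conn.executemany(
    'INSERT INTO post (body, tenant_id) VALUES (?, ?)',
    [
        ('legacy', None),          # pre-v40 style row, untenanted
        ('from a', 'tenant-a'),
        ('from b', 'tenant-b'),
    ],
)

# Hand-written equivalents of the two criteria.
LOOSE = ('SELECT body FROM post '
         'WHERE tenant_id = ? OR tenant_id IS NULL ORDER BY id')
STRICT = 'SELECT body FROM post WHERE tenant_id = ? ORDER BY id'


def visible(sql, tid):
    """Return the bodies a request for tenant `tid` would see."""
    return [row[0] for row in conn.execute(sql, (tid,))]


loose_rows = visible(LOOSE, 'tenant-a')     # own rows plus untenanted rows
strict_rows = visible(STRICT, 'tenant-a')   # own rows only
loose_again = visible(LOOSE, 'tenant-a')    # switching back restores visibility
row_count = conn.execute('SELECT COUNT(*) FROM post').fetchone()[0]
```

Neither query writes anything, so switching between the two shapes never changes stored rows; only the set of rows a given tenant can see differs.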