Coverage for core/user_context.py: 76.8% (207 statements; coverage.py v7.14.0, created at 2026-05-12 04:49 +0000)

"""core/user_context.py — canonical user action + profile resolver.

Single source of truth for the ``(user_details, actions)`` tuple that
was previously implemented THREE times — in ``hart_intelligence_entry``,
``create_recipe``, and ``reuse_recipe`` — with subtle drift. Consolidating
here fixes the four non-classification problems the reviewer flagged
in the 2026-04-11 "hi took 33.8s" post-mortem:

1. **Hard time budget.** Every HTTP fetch is bounded by a total budget
   (default 1.5s). If the budget blows, we return cached-or-default
   instantly and spawn a background refresh so the NEXT request has
   fresh data — the hot path never blocks more than the budget
   regardless of how slow the backend gets. This alone collapses the
   33.8s worst case to 1.5s.
2. **30-second TTL cache per user_id.** Fetching the same action
   history + profile on every chat message was pure waste. The cache
   collapses that to one fetch per 30s of activity — the second "hi"
   within 30s is answered in microseconds.
3. **Deduplication / SRP.** The three copies lived in three modules
   with different filter lists (some skipped ``Screen Reasoning``,
   some didn't), different crash behavior on missing profile fields,
   and different use of ``pooled_request`` vs raw ``requests``. One
   canonical resolver with a ``mode`` parameter gives callers the
   create/reuse behavior they need without duplicating HTTP plumbing.
4. **Thread-safety.** Background refresh uses a bounded
   ``ThreadPoolExecutor`` and the cache is a ``TTLCache`` — the same
   primitives ``core/session_cache.py`` already uses for other per-
   user state. No new concurrency primitives, no new parallel paths.

**No Python-side classification.** An earlier draft of this module
had a ``_is_casual_greeting(query)`` regex short-circuit ("hi" /
"hello" / "hey" → skip HTTP entirely). That was reverted on operator
directive: keyword regex hacks for chat classification are exactly
the kind of drifting, locale-brittle shadow-code the "no parallel
paths" sweep is trying to eliminate. Chat intent classification is
owned by the draft 0.8B model (``speculative_dispatcher.dispatch_draft_first``),
which emits a structured envelope with
``is_casual`` / ``is_correction`` / ``is_create_agent`` flags. If a
future caller wants to skip the action-history fetch based on that
classification, it should pass the draft's already-computed
``is_casual`` flag down — not re-classify in Python with a regex.

Architecture (layered, SRP):

    get_user_context(user_id, mode, ...)        ← public entry

      ├── UserContextCache.get / set            ← caching layer
      ├── _fetch_actions_raw(user_id, budget)   ← HTTP layer
      ├── _fetch_profile_raw(user_id, budget)   ← HTTP layer
      ├── _format_action_rich(...)              ← formatting layer
      ├── _format_action_simple(...)            ← formatting layer
      └── _schedule_background_refresh(...)     ← async refresh layer

Callers (``hart_intelligence_entry``, ``create_recipe``, ``reuse_recipe``)
pass ``mode='reuse'`` or ``mode='create'``. They no longer own any of
the HTTP, caching, or formatting logic.
"""

from __future__ import annotations

import json
import logging
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta
from typing import Literal, Optional

import pytz
from dateutil.parser import parse as parse_date

from core.http_pool import pooled_get, pooled_post
from core.session_cache import TTLCache

logger = logging.getLogger('hevolve.user_context')


# ─── Module constants ─────────────────────────────────────────────────────

#: Default cache TTL. 30s is short enough that a user changing their
#: display name in settings sees the change on the next chat message,
#: but long enough to absorb a burst of messages without re-fetching.
DEFAULT_TTL_SECONDS = 30.0

#: Default hot-path time budget. The old code timed out at 5s per HTTP
#: call (10s combined); that's four draft-model replies. 1.5s total
#: gives either call a chance to return on a healthy local backend,
#: and still lets us fall through to cached-or-default if the backend
#: is GIL-starved (which was the exact 2026-04-11 incident).
DEFAULT_BUDGET_SECONDS = 1.5

#: Cap on the number of cached users. A single Nunba instance rarely
#: talks to more than a handful of concurrent users, but the cache is
#: sized with headroom so a stampede of channel messages doesn't evict
#: the session that's driving the UI.
MAX_CACHED_USERS = 256

#: Actions the agent system produces as noise during training and
#: should not be echoed back to the LLM as "past user actions".
_UNWANTED_ACTIONS = frozenset([
    'Topic Cofirmation', 'Topic Confirmation', 'Topic confirmation',
    'Topic not found', 'Topic Listing',
    'Langchain', 'Assessment Ended', 'Casual Conversation',
    'Probe', 'Question Answering', 'Fallback',
])

#: IANA timezone for rendering action timestamps. Previously hardcoded
#: in every copy of the function — the old code left a TODO here
#: ("get, and populate timezone from client").
# TODO: read from thread-local session or user profile once the
# frontend sends that header. For now this matches the legacy behavior.
_DEFAULT_TZ = 'Asia/Kolkata'


# ─── Layer 1: caching ─────────────────────────────────────────────────────

class UserContextCache:
    """Thread-safe per-user cache for ``(user_details, actions)`` tuples.

    Wraps the shared ``TTLCache`` primitive from ``core/session_cache.py``
    — deliberately uses the existing TTL/LRU machinery instead of a
    parallel implementation. The only reason this class exists is to
    give callers a typed, single-purpose handle so the cache key naming
    (``{mode}:{user_id}``) stays in one place.
    """

    def __init__(self, ttl_seconds: float = DEFAULT_TTL_SECONDS,
                 max_size: int = MAX_CACHED_USERS):
        # int(ttl_seconds) because TTLCache stores seconds as int. A
        # fractional-second TTL would require rewriting TTLCache, which
        # is out of scope — rounding 30.0 to 30 is lossless.
        self._cache = TTLCache(
            ttl_seconds=int(ttl_seconds),
            max_size=max_size,
            name='user_context',
        )

    @staticmethod
    def _key(user_id, mode: str) -> str:
        return f"{mode}:{user_id}"

    def get(self, user_id, mode: str) -> Optional[tuple[str, str]]:
        """Return the cached ``(user_details, actions)`` tuple or None."""
        return self._cache.get(self._key(user_id, mode))

    def set(self, user_id, mode: str, value: tuple[str, str]) -> None:
        self._cache[self._key(user_id, mode)] = value

    def invalidate(self, user_id, mode: Optional[str] = None) -> None:
        """Drop one or all cached entries for a user.

        When ``mode`` is None, both create and reuse entries are
        dropped so a profile change is visible to either path on the
        next request.
        """
        modes = ('create', 'reuse') if mode is None else (mode,)
        for m in modes:
            try:
                del self._cache[self._key(user_id, m)]
            except KeyError:
                pass

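``TTLCache`` itself lives in ``core/session_cache.py`` and is not shown here. As a rough mental model only — an illustrative stand-in, not the real implementation, whose API is merely assumed to be dict-style set plus ``get`` — a minimal TTL cache behaves like:

```python
import time


class MiniTTLCache:
    """Illustrative stand-in for core/session_cache.TTLCache (API assumed)."""

    def __init__(self, ttl_seconds: int):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def __setitem__(self, key, value):
        # Stamp every write with a monotonic-clock expiry.
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily expire on read
            return default
        return value


cache = MiniTTLCache(ttl_seconds=5)
cache["reuse:10077"] = ("details", "actions")
print(cache.get("reuse:10077"))  # ('details', 'actions') while fresh
```

The monotonic clock matters: wall-clock adjustments (NTP, DST) must not prematurely expire or immortalize entries.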


#: Module-level singleton — mirrors the ``_registry`` / ``_integration``
#: pattern used by ``channels/registry.py`` and others.
_cache: Optional[UserContextCache] = None
_cache_lock = threading.Lock()


def get_user_context_cache() -> UserContextCache:
    """Return the singleton UserContextCache, constructing it on first call."""
    global _cache
    if _cache is None:
        with _cache_lock:
            if _cache is None:
                _cache = UserContextCache()
    return _cache


# ─── Layer 2: background refresh pool ─────────────────────────────────────

#: Separate thread pool for cache refreshes so the hot path never
#: blocks on pool saturation. Small (4 workers) — refreshes are cheap
#: and we don't want to steal CPU from the LLM subprocesses.
_refresh_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix='uctx-refresh')

#: Per-user in-flight marker that prevents N concurrent refreshes of the
#: same user. A stampede of messages for user 10077 still triggers exactly
#: one background refresh instead of N parallel HTTP calls.
_refresh_inflight: dict = {}
_refresh_inflight_lock = threading.Lock()


def _schedule_background_refresh(user_id, mode: str, timeout_budget_s: float) -> None:
    """Kick off a non-blocking cache refresh for a user.

    Deduped — if a refresh is already in flight for this (user_id, mode),
    this call is a no-op.
    """
    key = (user_id, mode)
    with _refresh_inflight_lock:
        if key in _refresh_inflight:
            return
        _refresh_inflight[key] = True

    def _task():
        try:
            _resolve_fresh(user_id, mode, timeout_budget_s)
        except Exception as e:
            logger.debug("background refresh failed for %s: %s", key, e)
        finally:
            with _refresh_inflight_lock:
                _refresh_inflight.pop(key, None)

    try:
        _refresh_pool.submit(_task)
    except Exception as e:
        logger.debug("refresh submit failed for %s: %s", key, e)
        with _refresh_inflight_lock:
            _refresh_inflight.pop(key, None)

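The in-flight dedup can be exercised in isolation. A minimal sketch under stated assumptions — ``schedule`` and the blocking ``work`` callable are invented stand-ins for ``_schedule_background_refresh`` and ``_resolve_fresh``, with an event keeping the first task alive so the dedup window is deterministic:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_inflight = set()
_inflight_lock = threading.Lock()
_pool = ThreadPoolExecutor(max_workers=2)


def schedule(key, work):
    """Submit `work` once per key; concurrent duplicates are no-ops."""
    with _inflight_lock:
        if key in _inflight:
            return False  # a refresh for this key is already running
        _inflight.add(key)

    def _task():
        try:
            work()
        finally:
            with _inflight_lock:
                _inflight.discard(key)  # always release the marker

    _pool.submit(_task)
    return True


release = threading.Event()
first = schedule((10077, 'reuse'), release.wait)   # starts, blocks on the event
second = schedule((10077, 'reuse'), release.wait)  # deduped while first runs
release.set()  # let the background task finish
print(first, second)  # True False
```

The ``finally`` block is the load-bearing part: without it, a crashing refresh would leave the key marked in-flight forever and permanently suppress refreshes for that user.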


# ─── Layer 3: HTTP fetchers ───────────────────────────────────────────────

def _action_api_url(user_id) -> str:
    """Resolve the action-history URL. Defers the import to avoid circular
    imports with ``config_cache`` at module-load time."""
    from core.config_cache import get_action_api
    return f"{get_action_api()}?user_id={user_id}"


def _student_api_url() -> str:
    from core.config_cache import get_student_api
    return get_student_api()


def _fetch_actions_raw(user_id, timeout_s: float) -> Optional[list]:
    """GET the raw action list for a user. Returns None on any failure.

    Uses ``pooled_get`` so connection reuse and the shared HTTP pool
    limits apply — raw ``requests.request`` in the legacy copies
    bypassed pooling and contributed to the 33.8s stall.
    """
    try:
        response = pooled_get(_action_api_url(user_id), timeout=timeout_s)
        if response.status_code == 200:
            return response.json()
    except Exception as e:
        logger.debug("action fetch failed for user %s: %s", user_id, e)
    return None


def _fetch_profile_raw(user_id, timeout_s: float) -> Optional[dict]:
    """POST to the student profile API. Returns None on any failure."""
    try:
        body = json.dumps({"user_id": user_id})
        headers = {'Content-Type': 'application/json'}
        response = pooled_post(
            _student_api_url(), data=body, headers=headers, timeout=timeout_s,
        )
        if response.status_code == 200:
            return response.json()
    except Exception as e:
        logger.debug("profile fetch failed for user %s: %s", user_id, e)
    return None


# ─── Layer 4: formatting ──────────────────────────────────────────────────

def _get_tz():
    try:
        return pytz.timezone(_DEFAULT_TZ)
    except Exception:
        return None


def _format_action_simple(raw_actions: list) -> str:
    """CREATE-mode action formatter.

    Mirrors the create_recipe.py version: one line per action with an
    ISO-formatted timestamp, no deduplication, no visual/screen
    context windows. The create flow runs during initial agent training,
    where the teacher walks through actions explicitly — dedup and
    video context would clutter the prompt.
    """
    tz = _get_tz()
    filtered = [
        obj for obj in raw_actions
        if obj.get("action") not in _UNWANTED_ACTIONS
        and obj.get("zeroshot_label") not in ('Video Reasoning',)
    ]
    texts = []
    for obj in filtered:
        action = obj.get("action", "")
        try:
            date = parse_date(obj["created_date"])
            rendered = date.astimezone(tz) if tz else date
            texts.append(f"{action} on {rendered.strftime('%Y-%m-%dT%H:%M:%S')}")
        except Exception:
            texts.append(action)
    if not texts:
        return 'user has not performed any actions yet.'
    return ", ".join(texts)

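For a concrete sense of the output shape, here is a self-contained sketch of the simple-format logic on invented sample records — stdlib only, so a fixed UTC+5:30 offset stands in for ``pytz``'s Asia/Kolkata and ``datetime.fromisoformat`` for ``dateutil``, and the filter list is abbreviated:

```python
from datetime import datetime, timezone, timedelta

UNWANTED = {'Topic Listing', 'Fallback'}        # abbreviated filter list
IST = timezone(timedelta(hours=5, minutes=30))  # stand-in for Asia/Kolkata


def format_simple(raw_actions):
    texts = []
    for obj in raw_actions:
        if obj.get("action") in UNWANTED:
            continue
        try:
            date = datetime.fromisoformat(obj["created_date"]).astimezone(IST)
            texts.append(f"{obj.get('action', '')} on {date.strftime('%Y-%m-%dT%H:%M:%S')}")
        except (KeyError, ValueError):
            texts.append(obj.get("action", ""))  # keep the action, drop the bad date
    return ", ".join(texts) or 'user has not performed any actions yet.'


sample = [
    {"action": "Login", "created_date": "2026-05-12T10:00:00+05:30"},
    {"action": "Topic Listing", "created_date": "2026-05-12T10:01:00+05:30"},  # filtered
]
print(format_simple(sample))  # Login on 2026-05-12T10:00:00
```

Note the fail-soft shape the real formatter shares: an unparseable timestamp degrades to the bare action name instead of raising.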


def _format_action_rich(raw_actions: list) -> str:
    """REUSE-mode action formatter with dedup + visual + screen context.

    Mirrors hart_intelligence_entry / reuse_recipe:
    - Dedup by action name with first/last occurrence dates.
    - 5-minute visual context window tagged with
      ``<Last_5_Minutes_Visual_Context_Start/End>``.
    - 2-minute screen context window tagged with
      ``<Last_2_Minutes_Screen_Context_Start/End>``.
    - Trailing current-time hint used by the LLM for "what time is it"
      style questions.
    """
    tz = _get_tz()
    now = datetime.now()

    filtered = [
        obj for obj in raw_actions
        if obj.get("action") not in _UNWANTED_ACTIONS
        and obj.get("zeroshot_label") not in ('Video Reasoning', 'Screen Reasoning')
    ]
    filtered_video = [
        obj for obj in raw_actions
        if obj.get("zeroshot_label") == 'Video Reasoning'
    ]
    filtered_screen = [
        obj for obj in raw_actions
        if obj.get("zeroshot_label") == 'Screen Reasoning'
    ]

    # Dedup: first/last date per action name.
    action_occurrences: dict = {}
    for obj in filtered:
        action = obj.get("action", "")
        try:
            date = parse_date(obj["created_date"])
        except Exception:
            continue
        existing = action_occurrences.get(action)
        if existing is None:
            action_occurrences[action] = [date, date]
        else:
            first_date, last_date = existing
            action_occurrences[action] = [min(first_date, date), max(last_date, date)]

    action_texts = []
    for action, (first_date, last_date) in action_occurrences.items():
        first_r = first_date.astimezone(tz) if tz else first_date
        action_texts.append(f"{action} on {first_r.strftime('%Y-%m-%dT%H:%M:%S')}")
        if first_date != last_date:
            last_r = last_date.astimezone(tz) if tz else last_date
            action_texts.append(f"{action} on {last_r.strftime('%Y-%m-%dT%H:%M:%S')}")

    # Visual context window (last 5 minutes).
    video_texts = []
    for obj in filtered_video:
        try:
            date = parse_date(obj["created_date"])
        except Exception:
            continue
        if obj.get("gpt3_label") == 'Visual Context':
            if (now - date.replace(tzinfo=None)) > timedelta(minutes=5):
                continue
        date_r = date.astimezone(tz) if tz else date
        video_texts.append(
            f"{obj.get('action', '')} on {date_r.strftime('%Y-%m-%dT%H:%M:%S')}")
    if video_texts:
        action_texts.append('<Last_5_Minutes_Visual_Context_Start>')
        action_texts.extend(video_texts)
        action_texts.append('<Last_5_Minutes_Visual_Context_End>')
        action_texts.append(
            "If a person is identified in Visual_Context section "
            "that's most probably the user (me) & most likely not "
            "taking any selfie.")

    # Screen context window (last 2 minutes).
    screen_texts = []
    for obj in filtered_screen:
        try:
            date = parse_date(obj["created_date"])
        except Exception:
            continue
        if (now - date.replace(tzinfo=None)) > timedelta(minutes=2):
            continue
        date_r = date.astimezone(tz) if tz else date
        screen_texts.append(
            f"{obj.get('action', '')} on {date_r.strftime('%Y-%m-%dT%H:%M:%S')}")
    if screen_texts:
        action_texts.append('<Last_2_Minutes_Screen_Context_Start>')
        action_texts.extend(screen_texts)
        action_texts.append('<Last_2_Minutes_Screen_Context_End>')
        action_texts.append(
            "Screen_Context shows what is currently displayed on the "
            "user's computer screen.")

    if not action_texts:
        action_texts = ['user has not performed any actions yet.']
    actions = ", ".join(action_texts)

    formatted_time = datetime.now(pytz.utc).astimezone(tz).strftime(
        '%Y-%m-%d %H:%M:%S') if tz else datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    actions += (
        f". List of actions ends. <PREVIOUS_USER_ACTION_END> \n "
        f"Today's datetime in {_DEFAULT_TZ} is: {formatted_time} in this "
        f"format:'%Y-%m-%dT%H:%M:%S' \n Whenever user is asking about "
        f"current date or current time at particular location then use "
        f"this datetime format by asking what user's location is. Use "
        f"the previous sentence datetime info to answer current time "
        f"based questions coupled with google_search for current time "
        f"or full_history for historical conversation based answers. "
        f"Take a deep breath and think step by step.\n"
    )
    return actions

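The dedup step keeps only the first and last timestamp per action name. Isolated into a self-contained sketch — sample events are invented and ``datetime.fromisoformat`` stands in for ``dateutil.parser.parse``:

```python
from datetime import datetime


def first_last(events):
    """events: (action_name, iso_timestamp) pairs -> {name: [first_dt, last_dt]}."""
    occurrences = {}
    for name, ts in events:
        d = datetime.fromisoformat(ts)
        if name not in occurrences:
            occurrences[name] = [d, d]  # single occurrence: first == last
        else:
            first, last = occurrences[name]
            occurrences[name] = [min(first, d), max(last, d)]
    return occurrences


events = [
    ("Login", "2026-05-10T09:00:00"),
    ("Quiz Attempt", "2026-05-11T12:00:00"),
    ("Login", "2026-05-12T09:00:00"),  # Login collapses to first + last
]
occ = first_last(events)
print(occ["Login"][0].day, occ["Login"][1].day)  # 10 12
```

This is why the rich formatter emits at most two lines per action name regardless of how many raw records the backend returns: min/max over the stream, not a full history.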


def _format_profile(user_data: Optional[dict], verbose: bool = True) -> str:
    """Render a user profile dict into the prompt-ready string.

    Uses ``.get()`` with "not specified" defaults throughout — the old
    ``reuse_recipe`` copy crashed on ``KeyError`` when a guest user had
    no cloud profile, which the hart_intelligence_entry copy already
    fixed. We keep the safer path as the only path.
    """
    if not user_data:
        return "No user details available."

    name = user_data.get("name") or user_data.get("display_name") \
        or user_data.get("username") or "User"
    gender = user_data.get("gender", "not specified")
    lang = user_data.get("preferred_language", "not specified")
    dob = user_data.get("dob", "not specified")
    eng = user_data.get("english_proficiency", "not specified")
    created = user_data.get("created_date", "unknown")
    standard = user_data.get("standard", "not specified")
    pays = user_data.get("who_pays_for_course", "not specified")

    if verbose:
        return (
            f"Below are the information about the user.\n"
            f"user_name: {name} (Call the user by this name only when "
            f"required and not always), gender: {gender}, "
            f"who_pays_for_course: {pays}(Entity Responsible for Paying "
            f"the Course Fees), preferred_language: {lang}(User's "
            f"Preferred Language), date_of_birth: {dob}, "
            f"english_proficiency: {eng}(User's English Proficiency "
            f"Level), created_date: {created}(user creation date), "
            f"standard: {standard}(User's Standard in which user studying)\n"
            f"If any of the above fields show \"not specified\", do not "
            f"ask the user for this information proactively. Only note "
            f"it when naturally relevant. The user's privacy is paramount "
            f"— store preferences locally when volunteered, never push "
            f"for personal data."
        )
    # Simple format for create-mode agent training.
    return (
        f"Below are the information about the user.\n"
        f"user_name: {name}, gender: {gender}, "
        f"preferred_language: {lang}, date_of_birth: {dob}"
    )


# ─── Layer 5: cheap-defaults (used on budget-blown fetch with empty cache) ───

#: Fallback strings used when the backend is unreachable AND nothing is
#: in the cache. These are intentionally generic — the LLM sees them
#: and understands "no profile / no action history available". They
#: used to be named with a "GREETING" prefix from the now-removed
#: regex short-circuit path; the module no longer has any classifier.
_DEFAULT_ACTIONS_EMPTY = 'user has not performed any actions yet.'
_DEFAULT_DETAILS_EMPTY = 'No user details available.'


def _cheap_defaults(mode: str) -> tuple[str, str]:
    """Return the zero-HTTP fallback tuple used when the budget is
    blown AND the cache is empty. Not used for classification — the
    only caller is :func:`get_user_context` after it gives up on a
    stalled backend and has no cached entry to fall back to."""
    if mode == 'reuse':
        tz = _get_tz()
        formatted_time = datetime.now(pytz.utc).astimezone(tz).strftime(
            '%Y-%m-%d %H:%M:%S') if tz else datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        actions = (
            f"{_DEFAULT_ACTIONS_EMPTY}. <PREVIOUS_USER_ACTION_END> \n "
            f"Today's datetime in {_DEFAULT_TZ} is: {formatted_time}"
        )
    else:
        actions = _DEFAULT_ACTIONS_EMPTY
    return _DEFAULT_DETAILS_EMPTY, actions


# ─── Layer 6: orchestration ───────────────────────────────────────────────

def _resolve_fresh(user_id, mode: str, timeout_budget_s: float) -> tuple[str, str]:
    """Fetch + format fresh data from the backend, respecting a total
    time budget. Populates the cache on success.

    The budget is divided 50/50 between the two HTTP calls so a single
    slow endpoint can't consume the whole budget and starve the other.
    """
    per_call_budget = max(0.3, timeout_budget_s / 2.0)

    raw_actions = _fetch_actions_raw(user_id, per_call_budget)
    user_data = _fetch_profile_raw(user_id, per_call_budget)

    if mode == 'reuse':
        actions_text = _format_action_rich(raw_actions or [])
        details_text = _format_profile(user_data, verbose=True)
    else:
        actions_text = _format_action_simple(raw_actions or [])
        details_text = _format_profile(user_data, verbose=False)

    result = (details_text, actions_text)
    # Only cache if at least one HTTP call succeeded — caching a pure
    # default would lock the user into defaults for 30s after a
    # transient backend hiccup.
    if raw_actions is not None or user_data is not None:
        try:
            get_user_context_cache().set(user_id, mode, result)
        except Exception as e:
            logger.debug("cache set failed for user %s: %s", user_id, e)
    return result


def get_user_context(
    user_id,
    mode: Literal['create', 'reuse'] = 'reuse',
    timeout_budget_s: float = DEFAULT_BUDGET_SECONDS,
    ttl_s: float = DEFAULT_TTL_SECONDS,
) -> tuple[str, str]:
    """Canonical public entry point.

    Two decision layers, stacked fast-first:

    1. Cache hit — if we fetched the same (user_id, mode) within
       the TTL, return the cached tuple in microseconds.
    2. Budget-guarded fresh fetch — submit the HTTP fetch to a
       background thread and wait on it with a hard wall-clock
       deadline. If the fetch completes in time, its result is
       cached and returned. If it blows the budget, the running
       future is LEFT RUNNING (it will populate the cache when it
       eventually lands) and we return cheap defaults immediately,
       so the hot path never blocks past ``timeout_budget_s``.

    This function deliberately does NOT classify the user's chat
    message. Chat intent classification is owned by the draft 0.8B
    model in ``speculative_dispatcher.dispatch_draft_first`` with its
    augmented classifier prompt — callers that want to skip fetching
    for a casual message should consult the draft's structured
    envelope, never re-classify here with a Python regex.

    Args:
        user_id: Hevolve user id (int or str — passed through to the
            backend as-is).
        mode: ``'reuse'`` for the normal chat path (rich formatting,
            visual + screen context, verbose profile). ``'create'``
            for the initial agent-training path (simple formatting,
            no context windows).
        timeout_budget_s: Hard wall-clock budget for the HTTP fetch
            phase. Defaults to ``DEFAULT_BUDGET_SECONDS`` (1.5s).
        ttl_s: Cache freshness window. Reserved for future per-call
            overrides — the module-level constant applies today.

    Returns:
        ``(user_details, actions)`` — both strings, both safe to embed
        in an LLM prompt. On total failure both default to cheap
        placeholder strings, never None, never an exception.
    """
    del ttl_s  # Reserved for future use; the module-level constant applies.

    # Layer 1: cache hit.
    cache = get_user_context_cache()
    cached = cache.get(user_id, mode)
    if cached is not None:
        logger.debug("user_context: cache hit user=%s mode=%s", user_id, mode)
        return cached

    # Layer 2: budget-guarded fetch. We want the fetch to COMPLETE
    # inside the budget, not merely to start — so we run it on a
    # thread and wait with a future.result(timeout) wall.
    start = time.monotonic()
    future = _refresh_pool.submit(_resolve_fresh, user_id, mode, timeout_budget_s)
    try:
        result = future.result(timeout=timeout_budget_s)
        logger.debug(
            "user_context: fresh fetch user=%s mode=%s %.2fs",
            user_id, mode, time.monotonic() - start,
        )
        return result
    except Exception as e:
        # Timeout or fetch error. Return cheap defaults IMMEDIATELY and
        # let the already-submitted future keep running — when it lands,
        # it will populate the cache so the next request is fast.
        logger.info(
            "user_context: hot-path budget blown (%.2fs > %.2fs) user=%s: %s — "
            "returning defaults, refresh continues in background",
            time.monotonic() - start, timeout_budget_s, user_id, e,
        )
        return _cheap_defaults(mode)

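The budget-guarded wait is the core trick here: ``future.result(timeout=...)`` bounds the caller's wall-clock wait while the worker thread keeps running to completion. A minimal self-contained sketch under stated assumptions — the sleep stands in for a stalled backend, and the names and default strings are illustrative, not this module's API:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

pool = ThreadPoolExecutor(max_workers=2)


def slow_fetch():
    time.sleep(0.3)  # pretend the backend is stalled
    return ("details", "actions")


def guarded_fetch(budget_s):
    future = pool.submit(slow_fetch)
    try:
        # Wait for COMPLETION within the budget, not merely for a start.
        return future.result(timeout=budget_s)
    except FutureTimeout:
        # The future keeps running in the background (it could still
        # populate a cache); the caller gets defaults immediately.
        return ("No user details available.",
                "user has not performed any actions yet.")


print(guarded_fetch(0.05)[0])  # budget blown, defaults returned
print(guarded_fetch(2.0)[0])   # completes in time
```

A timed-out future is deliberately not cancelled: ``Future.cancel`` cannot stop a task that has already started, and letting it finish is exactly what warms the cache for the next request.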


# ─── Public helpers ───────────────────────────────────────────────────────

def invalidate_user_context(user_id, mode: Optional[str] = None) -> None:
    """Drop cached entries for a user. Call when the backend writes a
    profile update or when auth state changes."""
    try:
        get_user_context_cache().invalidate(user_id, mode)
    except Exception as e:
        logger.debug("invalidate failed for %s: %s", user_id, e)


__all__ = [
    'get_user_context',
    'get_user_context_cache',
    'invalidate_user_context',
    'UserContextCache',
    'DEFAULT_BUDGET_SECONDS',
    'DEFAULT_TTL_SECONDS',
]