Budget Gating¶
Pre-dispatch cost control that prevents LLM calls from exceeding available budgets.
Gate Functions¶
estimate_llm_cost_spark()¶
Estimates the Spark cost of an LLM call before it happens.
- Calculates token count from the prompt.
- Multiplies by the per-model cost rate.
- Local models (llama, mistral, phi, qwen, groq) have a cost rate of 0 Spark.
- Cloud models have per-1K-token rates defined in the model registry.
check_goal_budget()¶
Checks and atomically deducts from a goal's Spark budget.
- Uses a row-level lock on
AgentGoal.spark_budget. - If the estimated cost exceeds the remaining budget, the call is rejected.
- Deduction is atomic to prevent race conditions under concurrent dispatch.
check_platform_affordability()¶
Verifies the platform can afford the call at a macro level.
- Computes 7-day net revenue (revenue minus costs).
- Result is cached for 60 seconds to avoid repeated database queries.
- If the platform is running at a net loss, non-essential calls may be throttled.
pre_dispatch_budget_gate()¶
The combined gate that runs all three checks in sequence:
estimate_llm_cost_spark()-- compute the cost.check_goal_budget()-- verify the goal can pay.check_platform_affordability()-- verify the platform can sustain it.
If any check fails, the LLM call is not dispatched.
Source Files¶
integrations/agent_engine/budget_gate.pyintegrations/agent_engine/speculative_dispatcher.py