Class: Assistant::ToolLoopGuard
- Inherits: Object
- Includes: ResponseMessages
- Defined in: app/services/assistant/tool_loop_guard.rb, app/services/assistant/tool_loop_guard/response_messages.rb
Overview
Wraps RubyLLM tools with lightweight guardrails to prevent repetitive
tool loops and force recovery steps when SQL calls repeatedly fail.
Guards:
- Identical call dedup: blocks the same tool+args after MAX_IDENTICAL_CALLS.
- Consecutive SQL failures: forces schema discovery after MAX_CONSECUTIVE_SQL_FAILURES.
- Total call ceiling: hard-stops all tool calls after the effective limit to
prevent unbounded recursion in RubyLLM's handle_tool_calls → complete loop.
Budget scales with plan size: BASE_TOOL_CALLS without a plan, up to
MAX_PLAN_TOOL_CALLS when a plan is declared (CALLS_PER_PLAN_STEP × steps).
Many orthogonal guardrails intentionally share one lifecycle; splitting files
would obscure the single-turn flow and duplicate shared constants.
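The identical-call dedup guard described above can be illustrated with a small standalone sketch. This is not the class's actual implementation: the DedupSketch class, its signature format, and the choice to key the counter by tool name plus JSON-encoded arguments are assumptions made for illustration. Only the MAX_IDENTICAL_CALLS value comes from this page.

```ruby
require "json"

# Hypothetical stand-in for the guard's per-turn dedup bookkeeping.
class DedupSketch
  MAX_IDENTICAL_CALLS = 1 # value taken from the constant summary

  def initialize
    @call_counts = Hash.new(0)
  end

  # Builds a stable signature from the tool name and its arguments.
  # Sorting by key makes {a: 1, b: 2} and {b: 2, a: 1} produce the same signature.
  def signature(tool_name, args)
    normalized = args.map { |key, value| [key.to_s, value] }.sort_by(&:first)
    "#{tool_name}:#{JSON.generate(normalized)}"
  end

  # Returns true when the call may run, false once the same signature
  # has already been seen MAX_IDENTICAL_CALLS times in this turn.
  def allow?(tool_name, args)
    key = signature(tool_name, args)
    @call_counts[key] += 1
    @call_counts[key] <= MAX_IDENTICAL_CALLS
  end
end

guard = DedupSketch.new
guard.allow?("semantic_search", { query: "pricing" })       # first call runs
guard.allow?("semantic_search", { query: "pricing" })       # exact repeat is blocked
guard.allow?("semantic_search", { query: "pricing tiers" }) # different args run
```

With the limit at 1, any exact repeat within a turn is rejected, while a call with different arguments counts as a new signature.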
Defined Under Namespace
Modules: ResponseMessages
Constant Summary
- MAX_IDENTICAL_CALLS =
Block ANY repeat of the exact same call within a turn.
1
- MAX_CONSECUTIVE_SQL_FAILURES =
Maximum consecutive SQL failures.
3
- MAX_CONSECUTIVE_PATCH_FAILURES =
Maximum consecutive patch failures.
3
- BASE_TOOL_CALLS =
Default limit without a plan. Was 15; bumped after a 10-day audit showed that read-only
sales-rep workflows ("review opportunity X, draft a follow-up email") routinely and
legitimately consume 15-20 calls per turn (describe ×2 + bundled lookup + activity search +
quote/order pulls + email draft) and were hitting the limit before the model could answer.
20
- PLAN_STEP_TOOL_CALLS =
Per-step budget during server-orchestrated plan execution.
40
- CALLS_PER_PLAN_STEP =
Extra budget granted per declared plan step.
20
- MAX_PLAN_TOOL_CALLS =
Upper bound on the plan-scaled budget (CALLS_PER_PLAN_STEP × steps).
200
- BUDGET_WARNING_THRESHOLD =
Threshold for budget warning.
5
- MAX_TURN_DURATION =
Seconds. Default wall-clock cap per turn (Flash, GPT, Haiku).
180
- MAX_TURN_DURATION_THINKING =
Seconds. Extended cap for thinking-capable models (Gemini Pro / Claude Sonnet+Opus /
GPT-5 reasoning). The first describe_available_data call alone can take ~60-130s on Pro
because of extended thinking, leaving almost no budget under the default cap.
360
- MAX_TURN_DURATION_AUTHORING =
Seconds. Cap for long-form content-authoring turns (blog_management / email_management) on
thinking models. These edit large HTML bodies with many media/embed/product tool calls plus
extended thinking, and legitimately ran ~420-450s, past the 360s thinking cap, so they were
being halted mid-edit (AppSignal #4714 on edit_email_template). Matches MAX_STEP_DURATION;
non-authoring turns keep the 360s telemetry cap.
600
- MAX_STEP_DURATION =
Seconds. Wall-clock cap per plan step (Gemini Pro needs 60-90s/call).
600
- WRITE_TOOLS =
%w[ update_blog_post create_blog_post edit_blog_post update_email_template create_email_template edit_email_template clone_email_template remove_embed replace_embed move_embed create_faq insert_faqs refresh_blog_oembeds ].to_set.freeze
- PATCHABLE_TOOLS =
Tools that write to a blog post body. They participate in the patch-failure
circuit breaker and per-post reread gate (Stage 6).
%w[update_blog_post edit_blog_post].to_set.freeze
- READS_INVALIDATED_BY_WRITES =
Read tools whose cached signatures should be cleared after any write tool
executes, because the underlying content has changed and re-reading is valid.
get_block_html / get_email_block_html are included so the model can re-read
a block's exact HTML after editing it (and read a different block later)
without the per-turn dedup guard blocking the call as a repeat.
%w[get_blog_post get_block_html get_email_block_html].freeze
- COMPLETION_MARKING_TOOLS =
Tools that mark SEO work as "done" — must not run until writes are verified.
%w[seo_update_action_item seo_batch_update_action_items].to_set.freeze
- VERIFICATION_READS =
Read tools that count as verification of a prior write.
%w[get_blog_post].to_set.freeze
- WRITES_REQUIRING_VERIFICATION =
Historically patch_blog_post needed a follow-up get_blog_post to verify
because find/replace could silently mismatch. That tool was removed in
Stage 7 of the Sunny blog editor fix plan, so no current write tool
needs post-hoc verification — block-ID ops in edit_blog_post are
self-verifying (each op_result reports applied / not_found / invalid),
update_blog_post replaces the full body, and create_faq / insert_faqs
manipulate separate records.
Set.new.freeze
- SELF_VERIFYING_WRITES =
A successful full-body write is authoritative — it makes any prior pending
patch verifications moot because the entire content was replaced.
%w[update_blog_post create_blog_post].to_set.freeze
- MAX_CONSECUTIVE_EMPTY_SEARCHES =
Consecutive search-type calls that return zero results before forcing
the model to stop and ask the user for clarification.
3
- MAX_CONSECUTIVE_SEO_LOOKUP_FAILURES =
Consecutive failed seo_get_page / seo_get_action_items(path:) lookups tolerated when the model guesses wrong paths repeatedly.
3
- SEO_LOOKUP_TOOLS =
SEO lookup tools.
%w[seo_get_page seo_get_action_items].freeze
- SEARCH_TOOLS =
Search/lookup tools that are prone to "spiral" loops when the model
can't find what it's looking for and keeps retrying with variations.
%w[ semantic_search find_images search_images find_faqs find_call_recordings search_activity_notes search_brain search_products find_support_cases search_support_notes find_employee get_team_availability get_pipeline_summary ].to_set.freeze
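The consecutive-failure limits above (SQL, patch, empty search, SEO lookup) all suggest the same streak-counter pattern. The sketch below shows that pattern for empty searches only. The EmptySearchBreaker class, its method names, and the choice of >= as the trip condition are assumptions; the page gives only the constant value and the counter names that appear in the constructor.

```ruby
# Hypothetical counter pair mirroring @consecutive_empty_searches and
# @cumulative_empty_searches from the constructor.
class EmptySearchBreaker
  MAX_CONSECUTIVE_EMPTY_SEARCHES = 3 # value taken from the constant summary

  attr_reader :consecutive, :cumulative

  def initialize
    @consecutive = 0
    @cumulative = 0
  end

  # Records one search result count. A non-empty result resets the streak;
  # the cumulative total is never reset.
  def record(result_count)
    if result_count.zero?
      @consecutive += 1
      @cumulative += 1
    else
      @consecutive = 0
    end
    self
  end

  # Assumption: the breaker trips once the streak reaches the limit.
  def tripped?
    @consecutive >= MAX_CONSECUTIVE_EMPTY_SEARCHES
  end
end

breaker = EmptySearchBreaker.new
[0, 0, 4, 0, 0, 0].each { |count| breaker.record(count) }
breaker.tripped?   # => true (three empty results in a row at the end)
breaker.cumulative # => 5
```

Keeping a separate cumulative counter is what lets a stats report show total empty searches even after a successful search has reset the streak.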
Instance Method Summary
- #apply!(tools) ⇒ Object
- #cancelled? ⇒ Boolean
True after the user-cancellation check fired at least once.
- #effective_limit ⇒ Object
Dynamic tool call limit.
- #initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil, supports_thinking: false, cancel_check: nil, available_tool_names: nil, described_views: nil, long_form_authoring: false) ⇒ ToolLoopGuard
constructor
A new instance of ToolLoopGuard.
- #limit_reached? ⇒ Boolean
True after the total call ceiling was hit at least once.
- #loaded_write_tool_names ⇒ Array<String>
Names of write tools that are loaded for this turn.
- #max_turn_duration ⇒ Integer
Wall-clock cap for the current turn.
- #stats ⇒ Object
Constructor Details
#initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil, supports_thinking: false, cancel_check: nil, available_tool_names: nil, described_views: nil, long_form_authoring: false) ⇒ ToolLoopGuard
Returns a new instance of ToolLoopGuard.
# File 'app/services/assistant/tool_loop_guard.rb', line 105
def initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil, supports_thinking: false,
               cancel_check: nil, available_tool_names: nil, described_views: nil, long_form_authoring: false)
  @role = role.to_sym
  @conversation_id = conversation_id
  @plan_step_mode = plan_step_mode
  @cross_step_cache = cross_step_cache
  @supports_thinking = supports_thinking
  # True when the turn loads long-form content-authoring tools (blog/email).
  # Grants a longer wall-clock cap (MAX_TURN_DURATION_AUTHORING) so large
  # template/body edits finish instead of being halted mid-edit (#4714).
  @long_form_authoring = long_form_authoring
  # Optional Proc returning truthy when the user has requested cancellation.
  # Checked in preflight so the guard halts mid-tool-loop, not just at
  # plan-step boundaries; see PlanOrchestrator and AssistantChatWorker.
  @cancel_check = cancel_check
  # Names of every tool registered for this turn. Used by response messages
  # so the budget-exhaustion / hard-stop guidance only suggests a write
  # tool when one is actually loaded. Read-only sales workflows used to
  # be told "use a write op" with no write op available, which the model
  # acted on by ending mid-task.
  @available_tool_names = Array(available_tool_names).map(&:to_s).freeze
  @call_counts = Hash.new(0)
  # Tools whose execute body actually ran (preflight passed and
  # original_execute was invoked). @call_counts is bumped before dedup /
  # recovery short-circuits, so it must not drive plan "services used"
  # carryover; see PR 644 review.
  @executed_tool_names = Set.new
  @consecutive_sql_failures = 0
  @consecutive_patch_failures = 0
  @consecutive_empty_searches = 0
  @cumulative_empty_searches = 0
  @cumulative_sql_errors = 0
  @cumulative_patch_errors = 0
  @cross_step_cache_hits = 0
  @total_calls = 0
  @final_write_used = false
  @limit_reached = false
  @cancelled = false
  @writes_pending_verification = 0
  @plan_declared = false
  @plan_step_count = 0
  @plan_nudge_sent = false
  @consecutive_seo_lookup_failures = 0
  # Stage 6 of the Sunny blog editor fix plan: per-post reread breaker.
  # Any failed patch_blog_post / edit_blog_post sets the post_id here
  # so the next write to that post is blocked until get_blog_post(post_id)
  # is called. Cleared by the verification read.
  @posts_requiring_reread = Set.new
  # SQL HYGIENE: track which views have been resolved via
  # describe_available_data this turn. Subsequent *_execute_sql
  # calls that reference a view NOT in this set are blocked with a
  # structured error pointing the model at describe first. This prevents
  # the column-hallucination loop that PR #720 first tried to address
  # with prompt-only guidance (kept inventing fresh column names like
  # `assigned_resource_name` / `assigned_rep_name`).
  #
  # In plan_step_mode the orchestrator passes a shared Set so a
  # describe call in step 1 still satisfies SQL in step 5; otherwise
  # the model would be forced to re-describe each step, which is wasteful
  # and itself burns budget.
  @described_views = described_views || Set.new
  @started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
end
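The shared described_views Set mentioned in the constructor comments can be illustrated as follows. This is a sketch under stated assumptions: DescribedViewsGate, describe!, and undescribed are hypothetical names, the view names are invented, and the referenced views are passed in explicitly because the page does not show how the real guard extracts view names from SQL.

```ruby
require "set"

# Hypothetical gate; only the "described_views || Set.new" default and the
# idea of sharing one Set across plan steps come from the constructor above.
class DescribedViewsGate
  def initialize(described_views: nil)
    @described_views = described_views || Set.new
  end

  # Marks a view as resolved via a describe call.
  def describe!(view_name)
    @described_views << view_name.to_s
  end

  # Returns the referenced views that have not been described yet.
  def undescribed(referenced_views)
    referenced_views.map(&:to_s).reject { |view| @described_views.include?(view) }
  end
end

shared = Set.new
step_one = DescribedViewsGate.new(described_views: shared)
step_five = DescribedViewsGate.new(described_views: shared)

step_one.describe!("opportunities")
step_five.undescribed(%w[opportunities quotes]) # => ["quotes"]
```

Because both gates hold the same Set object, a describe call recorded in one step is visible to every later step without re-describing.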
Instance Method Details
#apply!(tools) ⇒ Object
# File 'app/services/assistant/tool_loop_guard.rb', line 238
def apply!(tools)
  Array(tools).each { |tool| wrap!(tool) }
  tools
end
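The private wrap! method is not shown on this page. One plausible way to wrap a tool in place, sketched here as an assumption and not as the actual implementation, is to capture the original execute method and redefine execute on the tool's singleton class. EchoTool and WrappingSketch are hypothetical; the names preflight and original_execute are borrowed from the constructor comments.

```ruby
# Hypothetical tool: any object responding to #name and #execute.
class EchoTool
  def name
    "echo"
  end

  def execute(**args)
    "echo: #{args[:text]}"
  end
end

# Hypothetical guard that wraps #execute in place on each tool instance.
class WrappingSketch
  attr_reader :total_calls

  def initialize(limit:)
    @limit = limit
    @total_calls = 0
  end

  def apply!(tools)
    Array(tools).each { |tool| wrap!(tool) }
    tools
  end

  private

  def wrap!(tool)
    original_execute = tool.method(:execute)
    guard = self
    # Redefine #execute on this one instance; other instances are untouched.
    tool.define_singleton_method(:execute) do |**args|
      guard.send(:preflight, tool.name) || original_execute.call(**args)
    end
  end

  # Returns a blocking message, or nil when the call may proceed.
  def preflight(tool_name)
    @total_calls += 1
    return nil if @total_calls <= @limit

    "Tool call limit reached; #{tool_name} was not executed."
  end
end

tool = EchoTool.new
WrappingSketch.new(limit: 1).apply!([tool])
tool.execute(text: "hi") # => "echo: hi"
tool.execute(text: "hi") # => "Tool call limit reached; echo was not executed."
```

Returning a message string instead of raising keeps the blocked call visible to the model as an ordinary tool result, which matches the page's description of response messages guiding recovery.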
#cancelled? ⇒ Boolean
True after the user-cancellation check fired at least once.
Used by PlanOrchestrator to short-circuit remaining steps without
waiting for the next outer cancel_check at the step boundary.
# File 'app/services/assistant/tool_loop_guard.rb', line 193
def cancelled?
  @cancelled
end
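The relationship between cancel_check and cancelled? can be sketched as a latching flag. CancelSketch and its public preflight method are hypothetical. The real preflight is private and not shown here, and the assumption that the flag stays set once the check has fired is inferred from the phrase "fired at least once".

```ruby
# Hypothetical guard: polls an optional Proc before each tool call and
# remembers that cancellation was requested.
class CancelSketch
  def initialize(cancel_check: nil)
    @cancel_check = cancel_check
    @cancelled = false
  end

  def cancelled?
    @cancelled
  end

  # Returns true when the tool call should be skipped.
  # Safe navigation (&.) tolerates a nil cancel_check.
  def preflight
    @cancelled = true if @cancel_check&.call
    @cancelled
  end
end

requested = false
guard = CancelSketch.new(cancel_check: -> { requested })

guard.preflight  # => false
requested = true
guard.preflight  # => true
requested = false
guard.cancelled? # => true, the flag latches
```

A latched flag lets an orchestrator ask cancelled? after a step finishes and skip the remaining steps, even if the underlying check has since changed.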
#effective_limit ⇒ Object
Dynamic tool call limit.
- plan_step_mode: server-orchestrated step execution → PLAN_STEP_TOOL_CALLS
- plan_declared: model declared a plan inline → scales with step count
- otherwise: BASE_TOOL_CALLS (keeps initial turns fast)
# File 'app/services/assistant/tool_loop_guard.rb', line 174
def effective_limit
  if @plan_step_mode
    PLAN_STEP_TOOL_CALLS
  elsif @plan_declared && @plan_step_count.positive?
    [BASE_TOOL_CALLS, @plan_step_count * CALLS_PER_PLAN_STEP].max.clamp(BASE_TOOL_CALLS, MAX_PLAN_TOOL_CALLS)
  else
    BASE_TOOL_CALLS
  end
end
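To make the scaling concrete, the following standalone sketch reproduces the same branching with the instance state passed as keyword arguments (a restructuring made here for illustration, not part of the class). It uses the constant values from the summary above. The `.max` step is dropped because `clamp` already enforces the lower bound, so the results are identical.

```ruby
BASE_TOOL_CALLS = 20
PLAN_STEP_TOOL_CALLS = 40
CALLS_PER_PLAN_STEP = 20
MAX_PLAN_TOOL_CALLS = 200

# Same branching as #effective_limit, with state passed in explicitly.
def effective_limit(plan_step_mode:, plan_declared:, plan_step_count:)
  if plan_step_mode
    PLAN_STEP_TOOL_CALLS
  elsif plan_declared && plan_step_count.positive?
    (plan_step_count * CALLS_PER_PLAN_STEP).clamp(BASE_TOOL_CALLS, MAX_PLAN_TOOL_CALLS)
  else
    BASE_TOOL_CALLS
  end
end

effective_limit(plan_step_mode: false, plan_declared: false, plan_step_count: 0)  # => 20
effective_limit(plan_step_mode: false, plan_declared: true,  plan_step_count: 1)  # => 20
effective_limit(plan_step_mode: false, plan_declared: true,  plan_step_count: 4)  # => 80
effective_limit(plan_step_mode: false, plan_declared: true,  plan_step_count: 12) # => 200
effective_limit(plan_step_mode: true,  plan_declared: true,  plan_step_count: 12) # => 40
```

With these values a one-step plan gains nothing over the base budget, and the ceiling is reached at ten declared steps.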
#limit_reached? ⇒ Boolean
True after the total call ceiling was hit at least once.
Used by the worker to render a "Continue in new conversation" button.
# File 'app/services/assistant/tool_loop_guard.rb', line 186
def limit_reached?
  @limit_reached
end
#loaded_write_tool_names ⇒ Array<String>
Names of write tools that are loaded for this turn. Used by response
messages to tailor the budget-exhaustion guidance — read-only sales
workflows shouldn't be told to "use a write tool" when none exist.
# File 'app/services/assistant/tool_loop_guard.rb', line 218
def loaded_write_tool_names
  @loaded_write_tool_names ||= (@available_tool_names & WRITE_TOOLS.to_a).freeze
end
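The method is an array intersection between the turn's tool names and WRITE_TOOLS. A standalone sketch, using an abbreviated WRITE_TOOLS set and a plain function in place of the memoized instance method (both simplifications made here for illustration):

```ruby
require "set"

# Abbreviated subset of the real WRITE_TOOLS constant.
WRITE_TOOLS = %w[update_blog_post edit_blog_post create_faq].to_set.freeze

# Normalizes names the way the constructor does, then intersects.
def loaded_write_tool_names(available_tool_names)
  names = Array(available_tool_names).map(&:to_s)
  (names & WRITE_TOOLS.to_a).freeze
end

loaded_write_tool_names(%i[semantic_search edit_blog_post]) # => ["edit_blog_post"]
loaded_write_tool_names(%w[semantic_search find_employee])  # => []
loaded_write_tool_names(nil)                                # => []
```

An empty result is the signal that the turn is read-only, so guidance messages should not suggest a write tool.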
#max_turn_duration ⇒ Integer
Wall-clock cap for the current turn. Thinking-capable models (Gemini Pro,
Claude Sonnet+Opus reasoning, etc.) get a longer cap because their first
tool call commonly spends 60-130s on extended thinking. Long-form
content-authoring turns (blog/email) get the longest cap because they edit
large HTML bodies across many tool calls and were being halted mid-edit (#4714).
# File 'app/services/assistant/tool_loop_guard.rb', line 206
def max_turn_duration
  return MAX_TURN_DURATION unless @supports_thinking
  return MAX_TURN_DURATION_AUTHORING if @long_form_authoring

  MAX_TURN_DURATION_THINKING
end
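The cap selection can be tabulated with a standalone sketch that passes the two flags as arguments. The turn_expired? helper is hypothetical: the page shows only that @started_at is read from the monotonic clock, not how the guard compares it against the cap.

```ruby
MAX_TURN_DURATION = 180
MAX_TURN_DURATION_THINKING = 360
MAX_TURN_DURATION_AUTHORING = 600

# Same guard clauses as #max_turn_duration, with flags passed explicitly.
def max_turn_duration(supports_thinking:, long_form_authoring:)
  return MAX_TURN_DURATION unless supports_thinking
  return MAX_TURN_DURATION_AUTHORING if long_form_authoring

  MAX_TURN_DURATION_THINKING
end

# Hypothetical deadline check against a monotonic start time.
def turn_expired?(started_at, cap_seconds)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at > cap_seconds
end

max_turn_duration(supports_thinking: false, long_form_authoring: false) # => 180
max_turn_duration(supports_thinking: false, long_form_authoring: true)  # => 180
max_turn_duration(supports_thinking: true,  long_form_authoring: false) # => 360
max_turn_duration(supports_thinking: true,  long_form_authoring: true)  # => 600
```

Note the second row: because the thinking check runs first, an authoring turn on a non-thinking model still gets the default 180s cap.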
#stats ⇒ Object
# File 'app/services/assistant/tool_loop_guard.rb', line 222
def stats
  tool_names = @executed_tool_names.to_a
  {
    total_tool_calls: @total_calls,
    tool_call_limit: effective_limit,
    sql_errors: @cumulative_sql_errors,
    patch_errors: @cumulative_patch_errors,
    empty_searches: @cumulative_empty_searches,
    writes_pending: @writes_pending_verification,
    unique_tools: tool_names.size,
    unique_tool_names: tool_names,
    limit_reached: @limit_reached,
    cross_step_cache_hits: @cross_step_cache_hits
  }
end