Class: Assistant::ToolLoopGuard

Inherits:
Object
Includes:
ResponseMessages
Defined in:
app/services/assistant/tool_loop_guard.rb,
app/services/assistant/tool_loop_guard/response_messages.rb

Overview

Wraps RubyLLM tools with lightweight guardrails to prevent repetitive
tool loops and force recovery steps when SQL calls repeatedly fail.

Guards:

  1. Identical call dedup: blocks a call once the same tool+args has already run MAX_IDENTICAL_CALLS times in the turn.
  2. Consecutive SQL failures: forces schema discovery after MAX_CONSECUTIVE_SQL_FAILURES.
  3. Total call ceiling: hard-stops all tool calls after the effective limit to
    prevent unbounded recursion in RubyLLM's handle_tool_calls → complete loop.
    Budget scales with plan size: BASE_TOOL_CALLS without a plan, up to
    MAX_PLAN_TOOL_CALLS when a plan is declared (CALLS_PER_PLAN_STEP × steps).
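
The first guard can be illustrated with a minimal sketch. This is not the class's actual code: `DedupSketch`, `allow?`, the JSON-based signature, and the `run_sql` tool name are assumptions made for illustration. Only `MAX_IDENTICAL_CALLS` and the `Hash.new(0)` counter come from the documented source.

```ruby
require "json"

# Hypothetical stand-in for the identical-call dedup guard.
class DedupSketch
  MAX_IDENTICAL_CALLS = 1 # mirrors the documented constant

  def initialize
    @call_counts = Hash.new(0)
  end

  # Returns true when the call may run, false when it is a blocked repeat.
  # The signature format (tool name + key-sorted JSON args) is an assumption.
  def allow?(tool_name, args)
    normalized = args.sort_by { |key, _value| key.to_s }.to_h
    signature = "#{tool_name}:#{JSON.generate(normalized)}"
    @call_counts[signature] += 1
    @call_counts[signature] <= MAX_IDENTICAL_CALLS
  end
end

guard = DedupSketch.new
first_call  = guard.allow?("run_sql", { query: "SELECT 1" }) # first occurrence
repeat_call = guard.allow?("run_sql", { query: "SELECT 1" }) # exact repeat
other_call  = guard.allow?("run_sql", { query: "SELECT 2" }) # different args
```

With a cap of 1, the exact repeat is rejected while a call with different arguments still passes.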

Many orthogonal guardrails intentionally share one lifecycle; splitting files
would obscure the single-turn flow and duplicate shared constants.

Defined Under Namespace

Modules: ResponseMessages

Constant Summary

MAX_IDENTICAL_CALLS =

Block ANY repeat of the exact same call within a turn

1
MAX_CONSECUTIVE_SQL_FAILURES =
3
MAX_CONSECUTIVE_PATCH_FAILURES =
3
BASE_TOOL_CALLS =

Default limit without a plan — keeps initial turns fast

15
PLAN_STEP_TOOL_CALLS =

Per-step budget during server-orchestrated plan execution

40
CALLS_PER_PLAN_STEP =

Extra budget granted per declared plan step

20
MAX_PLAN_TOOL_CALLS =

Absolute ceiling even with a plan

200
BUDGET_WARNING_THRESHOLD =
5
MAX_TURN_DURATION =

seconds — default wall-clock cap per turn (Flash, GPT, Haiku)

180
MAX_TURN_DURATION_THINKING =

Extended wall-clock cap per turn, in seconds, for thinking-capable models
(Gemini Pro / Claude Sonnet+Opus / GPT-5 reasoning).
The first describe_available_data call alone can take
~60-130s on Pro because of extended thinking, leaving
almost no budget under the default cap.

360
MAX_STEP_DURATION =

Wall-clock cap per plan step, in seconds (Gemini Pro needs 60-90s/call).

600
WRITE_TOOLS =
%w[
  update_blog_post create_blog_post edit_blog_post
  remove_embed replace_embed move_embed
  create_faq insert_faqs refresh_blog_oembeds
].to_set.freeze
PATCHABLE_TOOLS =

Tools that write to a blog post body — participate in the patch-failure
circuit breaker and per-post reread gate (Stage 6).

%w[update_blog_post edit_blog_post].to_set.freeze
READS_INVALIDATED_BY_WRITES =

Read tools whose cached signatures should be cleared after any write tool
executes, because the underlying content has changed and re-reading is valid.

%w[get_blog_post].freeze
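
A minimal sketch of the invalidation rule described above, assuming the dedup cache is keyed by tool name plus an argument signature. `ReadCacheSketch`, `record`, `seen?`, and the signature strings are hypothetical; the write-tool list here is a subset of the documented one.

```ruby
require "set"

# Hypothetical cache showing how a write could clear cached read signatures.
class ReadCacheSketch
  WRITE_TOOLS = %w[update_blog_post edit_blog_post].to_set.freeze # subset, for brevity
  READS_INVALIDATED_BY_WRITES = %w[get_blog_post].freeze

  def initialize
    @call_counts = Hash.new(0)
  end

  # Records a call; after any write tool, forgets the invalidated reads
  # so the same read can legitimately run again.
  def record(tool_name, signature)
    @call_counts[[tool_name, signature]] += 1
    return unless WRITE_TOOLS.include?(tool_name)

    @call_counts.delete_if { |(name, _sig), _count| READS_INVALIDATED_BY_WRITES.include?(name) }
  end

  def seen?(tool_name, signature)
    @call_counts[[tool_name, signature]].positive?
  end
end

cache = ReadCacheSketch.new
cache.record("get_blog_post", "post_id=1")
seen_before_write = cache.seen?("get_blog_post", "post_id=1")
cache.record("edit_blog_post", "post_id=1")
seen_after_write = cache.seen?("get_blog_post", "post_id=1")
```

After the write, the earlier read is no longer counted as a repeat, so re-reading the post would not trip the dedup guard.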
COMPLETION_MARKING_TOOLS =

Tools that mark SEO work as "done" — must not run until writes are verified.

%w[seo_update_action_item seo_batch_update_action_items].to_set.freeze
VERIFICATION_READS =

Read tools that count as verification of a prior write.

%w[get_blog_post].to_set.freeze
WRITES_REQUIRING_VERIFICATION =

Historically patch_blog_post needed a follow-up get_blog_post to verify
because find/replace could silently mismatch. That tool was removed in
Stage 7 of the Sunny blog editor fix plan, so no current write tool
needs post-hoc verification — block-ID ops in edit_blog_post are
self-verifying (each op_result reports applied / not_found / invalid),
update_blog_post replaces the full body, and create_faq / insert_faqs
manipulate separate records.

Set.new.freeze
SELF_VERIFYING_WRITES =

A successful full-body write is authoritative — it makes any prior pending
patch verifications moot because the entire content was replaced.

%w[update_blog_post create_blog_post].to_set.freeze
MAX_CONSECUTIVE_EMPTY_SEARCHES =

Consecutive search-type calls that return zero results before forcing
the model to stop and ask the user for clarification.

3
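
The streak logic can be sketched as follows. `EmptySearchSketch`, `record`, and the `:ask_user` / `:continue` return values are hypothetical, and treating the cap as "streak >= 3" is an assumption; the consecutive and cumulative counters mirror the instance variables set in the constructor.

```ruby
# Hypothetical tracker for consecutive zero-result searches.
class EmptySearchSketch
  MAX_CONSECUTIVE_EMPTY_SEARCHES = 3 # mirrors the documented constant

  attr_reader :consecutive, :cumulative

  def initialize
    @consecutive = 0
    @cumulative = 0
  end

  # A non-empty result resets the streak; the cumulative count never resets.
  def record(result_count)
    if result_count.zero?
      @consecutive += 1
      @cumulative += 1
    else
      @consecutive = 0
    end
    @consecutive >= MAX_CONSECUTIVE_EMPTY_SEARCHES ? :ask_user : :continue
  end
end

tracker = EmptySearchSketch.new
outcomes = [0, 0, 2, 0, 0, 0].map { |count| tracker.record(count) }
```

Here only the final call completes a streak of three empty results; the hit in the middle resets the consecutive counter but leaves the cumulative total untouched.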
MAX_CONSECUTIVE_SEO_LOOKUP_FAILURES =

Cap on consecutive failed seo_get_page / seo_get_action_items(path:) lookups, which happen when the model repeatedly guesses wrong paths.

3
SEO_LOOKUP_TOOLS =
%w[seo_get_page seo_get_action_items].freeze
SEARCH_TOOLS =

Search/lookup tools that are prone to "spiral" loops when the model
can't find what it's looking for and keeps retrying with variations.

%w[
  semantic_search find_images search_images find_faqs find_call_recordings
  search_activity_notes search_brain search_products
  find_support_cases search_support_notes
  find_employee get_team_availability get_pipeline_summary
].to_set.freeze

Instance Method Summary

Constructor Details

#initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil, supports_thinking: false, cancel_check: nil) ⇒ ToolLoopGuard

Returns a new instance of ToolLoopGuard.



# File 'app/services/assistant/tool_loop_guard.rb', line 89

def initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil,
               supports_thinking: false, cancel_check: nil)
  @role = role.to_sym
  @conversation_id = conversation_id
  @plan_step_mode = plan_step_mode
  @cross_step_cache = cross_step_cache
  @supports_thinking = supports_thinking
  # Optional Proc returning truthy when the user has requested cancellation.
  # Checked in preflight so the guard halts mid-tool-loop, not just at
  # plan-step boundaries — see PlanOrchestrator and AssistantChatWorker.
  @cancel_check = cancel_check
  @call_counts = Hash.new(0)
  # Tools whose execute body actually ran (preflight passed and
  # original_execute was invoked). @call_counts is bumped before dedup /
  # recovery short-circuits, so it must not drive plan "services used"
  # carryover — see PR 644 review.
  @executed_tool_names = Set.new
  @consecutive_sql_failures = 0
  @consecutive_patch_failures = 0
  @consecutive_empty_searches = 0
  @cumulative_empty_searches = 0
  @cumulative_sql_errors = 0
  @cumulative_patch_errors = 0
  @cross_step_cache_hits = 0
  @total_calls = 0
  @final_write_used = false
  @limit_reached = false
  @cancelled = false
  @writes_pending_verification = 0
  @plan_declared = false
  @plan_step_count = 0
  @plan_nudge_sent = false
  @consecutive_seo_lookup_failures = 0
  # Stage 6 of the Sunny blog editor fix plan: per-post reread breaker.
  # Any failed patch_blog_post / edit_blog_post sets the post_id here
  # so the next write to that post is blocked until get_blog_post(post_id)
  # is called. Cleared by the verification read.
  @posts_requiring_reread = Set.new
  @started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

Instance Method Details

#apply!(tools) ⇒ Object



# File 'app/services/assistant/tool_loop_guard.rb', line 180

def apply!(tools)
  Array(tools).each { |tool| wrap!(tool) }
  tools
end

#cancelled? ⇒ Boolean

True after the user-cancellation check fired at least once.
Used by PlanOrchestrator to short-circuit remaining steps without
waiting for the next outer cancel_check at the step boundary.

Returns:

  • (Boolean)


# File 'app/services/assistant/tool_loop_guard.rb', line 153

def cancelled?
  @cancelled
end
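
A minimal sketch of how the cancellation flag could latch, assuming the Proc passed as `cancel_check:` is invoked in a preflight step before each tool call. `CancelSketch` and its `preflight` method are hypothetical; the documented source only shows the `@cancel_check` and `@cancelled` instance variables.

```ruby
# Hypothetical guard showing a latching user-cancellation check.
class CancelSketch
  def initialize(cancel_check: nil)
    @cancel_check = cancel_check
    @cancelled = false
  end

  def cancelled?
    @cancelled
  end

  # Returns true when the tool call may proceed. Once the check fires,
  # the flag stays set even if the Proc later returns false.
  def preflight
    @cancelled = true if @cancel_check&.call
    !@cancelled
  end
end

cancel_requested = false
guard = CancelSketch.new(cancel_check: -> { cancel_requested })

allowed_before = guard.preflight # no cancellation requested yet
cancel_requested = true
allowed_after = guard.preflight  # the check fires here
cancel_requested = false
still_cancelled = guard.cancelled?
```

Because the flag latches, a caller such as an orchestrator can read `cancelled?` after the tool loop returns without re-running the check.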

#effective_limit ⇒ Object

Dynamic tool call limit.

  • plan_step_mode: server-orchestrated step execution → PLAN_STEP_TOOL_CALLS
  • plan_declared: model declared a plan inline → scales with step count
  • otherwise: BASE_TOOL_CALLS (keeps initial turns fast)


# File 'app/services/assistant/tool_loop_guard.rb', line 134

def effective_limit
  if @plan_step_mode
    PLAN_STEP_TOOL_CALLS
  elsif @plan_declared && @plan_step_count.positive?
    (@plan_step_count * CALLS_PER_PLAN_STEP).clamp(BASE_TOOL_CALLS, MAX_PLAN_TOOL_CALLS)
  else
    BASE_TOOL_CALLS
  end
end
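
As a worked example of the inline-plan branch, the sketch below reproduces the arithmetic with the documented constant values. `LimitSketch` and `inline_plan_limit` are hypothetical names, and the `plan_step_mode` branch is left out.

```ruby
# Stand-alone reproduction of the inline-plan budget arithmetic.
module LimitSketch
  BASE_TOOL_CALLS = 15
  CALLS_PER_PLAN_STEP = 20
  MAX_PLAN_TOOL_CALLS = 200

  # Budget for a turn where the model declared `step_count` plan steps.
  def self.inline_plan_limit(step_count)
    return BASE_TOOL_CALLS unless step_count.positive?

    (step_count * CALLS_PER_PLAN_STEP).clamp(BASE_TOOL_CALLS, MAX_PLAN_TOOL_CALLS)
  end
end

limits = [0, 1, 3, 10, 12].to_h { |steps| [steps, LimitSketch.inline_plan_limit(steps)] }
# limits == { 0 => 15, 1 => 20, 3 => 60, 10 => 200, 12 => 200 }
```

With these values the ceiling is reached at 10 declared steps, so longer plans get no additional budget.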

#limit_reached? ⇒ Boolean

True after the total call ceiling was hit at least once.
Used by the worker to render a "Continue in new conversation" button.

Returns:

  • (Boolean)


# File 'app/services/assistant/tool_loop_guard.rb', line 146

def limit_reached?
  @limit_reached
end

#max_turn_duration ⇒ Object

Wall-clock cap for the current turn. Thinking-capable models (Gemini Pro,
Claude Sonnet+Opus reasoning, etc.) get a longer cap because their first
tool call commonly spends 60-130s on extended thinking.



# File 'app/services/assistant/tool_loop_guard.rb', line 160

def max_turn_duration
  @supports_thinking ? MAX_TURN_DURATION_THINKING : MAX_TURN_DURATION
end
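
The sketch below shows one way the cap could be compared against the monotonic start stamp recorded in the constructor. `TurnClockSketch`, `timed_out?`, and the injectable `clock:` parameter are assumptions for illustration; where the real class performs this comparison is not shown in this page.

```ruby
# Hypothetical wall-clock check built on Process.clock_gettime.
class TurnClockSketch
  MAX_TURN_DURATION = 180
  MAX_TURN_DURATION_THINKING = 360

  # `clock` is injectable so the example can run without sleeping.
  def initialize(supports_thinking: false,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @supports_thinking = supports_thinking
    @clock = clock
    @started_at = @clock.call
  end

  def max_turn_duration
    @supports_thinking ? MAX_TURN_DURATION_THINKING : MAX_TURN_DURATION
  end

  def timed_out?
    (@clock.call - @started_at) > max_turn_duration
  end
end

# Fake clocks: the first reading is the start, the second is 200 seconds later.
default_ticks = [0.0, 200.0].each
thinking_ticks = [0.0, 200.0].each

default_guard = TurnClockSketch.new(clock: -> { default_ticks.next })
thinking_guard = TurnClockSketch.new(supports_thinking: true, clock: -> { thinking_ticks.next })

default_timed_out = default_guard.timed_out?   # 200s exceeds the 180s cap
thinking_timed_out = thinking_guard.timed_out? # 200s is within the 360s cap
```

A monotonic clock is used rather than `Time.now` so that system clock adjustments cannot shorten or extend the measured turn.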

#stats ⇒ Object



# File 'app/services/assistant/tool_loop_guard.rb', line 164

def stats
  tool_names = @executed_tool_names.to_a
  {
    total_tool_calls: @total_calls,
    tool_call_limit: effective_limit,
    sql_errors: @cumulative_sql_errors,
    patch_errors: @cumulative_patch_errors,
    empty_searches: @cumulative_empty_searches,
    writes_pending: @writes_pending_verification,
    unique_tools: tool_names.size,
    unique_tool_names: tool_names,
    limit_reached: @limit_reached,
    cross_step_cache_hits: @cross_step_cache_hits
  }
end
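
A caller could derive the remaining budget from this hash, for example to decide when to warn. The sketch below is an assumption about usage: `StatsSketch`, `budget_summary`, and the sample values are made up for illustration, and this page does not show how `BUDGET_WARNING_THRESHOLD` is actually applied.

```ruby
# Hypothetical consumer of the stats hash.
module StatsSketch
  BUDGET_WARNING_THRESHOLD = 5 # mirrors the documented constant

  # Remaining calls, plus whether that remainder is at or below the threshold.
  def self.budget_summary(stats)
    remaining = stats[:tool_call_limit] - stats[:total_tool_calls]
    { remaining: remaining, warn: remaining <= BUDGET_WARNING_THRESHOLD }
  end
end

# Illustrative values only, not real output from the guard.
sample_stats = { total_tool_calls: 12, tool_call_limit: 15, limit_reached: false }
summary = StatsSketch.budget_summary(sample_stats)
# summary == { remaining: 3, warn: true }
```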