Class: Assistant::ToolLoopGuard

Inherits: Object

Includes: ResponseMessages

Defined in: app/services/assistant/tool_loop_guard.rb,
app/services/assistant/tool_loop_guard/response_messages.rb

Overview

Wraps RubyLLM tools with lightweight guardrails to prevent repetitive
tool loops and force recovery steps when SQL calls repeatedly fail.

Guards:

1. Identical call dedup: blocks the same tool+args after MAX_IDENTICAL_CALLS.
2. Consecutive SQL failures: forces schema discovery after MAX_CONSECUTIVE_SQL_FAILURES.
3. Total call ceiling: hard-stops all tool calls after the effective limit to
   prevent unbounded recursion in RubyLLM's handle_tool_calls → complete loop.
   Budget scales with plan size: BASE_TOOL_CALLS without a plan, up to
   MAX_PLAN_TOOL_CALLS when a plan is declared (CALLS_PER_PLAN_STEP × steps).
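
As a rough illustration of the first and third guards, here is a minimal sketch. The class name `MiniGuard`, the `check` method, the signature format, and both constant values are hypothetical stand-ins; the real preflight logic is not shown on this page.

```ruby
require "json"

# Hypothetical stand-in for the guard's preflight; values are illustrative only.
class MiniGuard
  MAX_IDENTICAL_CALLS = 1 # assumed value
  TOTAL_LIMIT = 3         # assumed value, stands in for effective_limit

  def initialize
    @call_counts = Hash.new(0)
    @total_calls = 0
  end

  # Returns :ok, :duplicate, or :limit for a proposed tool call.
  def check(tool_name, args)
    @total_calls += 1
    return :limit if @total_calls > TOTAL_LIMIT

    # Sort keys so {a:, b:} and {b:, a:} produce the same signature.
    canonical_args = args.sort_by { |key, _value| key.to_s }.to_h
    signature = "#{tool_name}:#{JSON.generate(canonical_args)}"
    @call_counts[signature] += 1
    @call_counts[signature] > MAX_IDENTICAL_CALLS ? :duplicate : :ok
  end
end
```

With these assumed values, a second identical `search_products` call in the same turn is reported as `:duplicate`, and any fourth call is reported as `:limit` regardless of its arguments.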

Many orthogonal guardrails intentionally share one lifecycle; splitting files
would obscure the single-turn flow and duplicate shared constants.

Defined Under Namespace

Modules: ResponseMessages

Constant Summary

MAX_IDENTICAL_CALLS = Block ANY repeat of the exact same call within a turn

MAX_CONSECUTIVE_SQL_FAILURES = Maximum consecutive SQL failures before schema discovery is forced.

MAX_CONSECUTIVE_PATCH_FAILURES = Maximum consecutive patch failures.

BASE_TOOL_CALLS = Default limit without a plan. Was 15; bumped after a 10-day audit showed that read-only sales-rep workflows ("review opportunity X, draft a follow-up email") routinely and legitimately consume 15-20 calls per turn (describe ×2 + bundled lookup + activity search + quote/order pulls + email draft) and were hitting the limit before the model could answer.

PLAN_STEP_TOOL_CALLS = Per-step budget during server-orchestrated plan execution.

CALLS_PER_PLAN_STEP = Extra budget granted per declared plan step.

MAX_PLAN_TOOL_CALLS = Upper bound on the plan-scaled budget when a plan is declared.

BUDGET_WARNING_THRESHOLD = Threshold for budget warning.

MAX_TURN_DURATION = Seconds: default wall-clock cap per turn (Flash, GPT, Haiku).

MAX_TURN_DURATION_THINKING = Seconds: extended cap for thinking-capable models (Gemini Pro / Claude Sonnet+Opus / GPT-5 reasoning). The first describe_available_data call alone can take ~60-130s on Pro because of extended thinking, leaving almost no budget under the default cap.

MAX_TURN_DURATION_AUTHORING = Seconds: cap for long-form authoring turns (blog_management / email_management) on thinking models. These edit large HTML bodies with many media/embed/product tool calls plus extended thinking, and legitimately ran ~420-450s, past the 360s thinking cap, so they were being halted mid-edit (AppSignal #4714 on edit_email_template). Matches MAX_STEP_DURATION; non-authoring turns keep the 360s telemetry cap.

MAX_STEP_DURATION = Seconds: wall-clock cap per plan step (Gemini Pro needs 60-90s/call).

WRITE_TOOLS = Write tools (see #loaded_write_tool_names).

%w[
  update_blog_post create_blog_post edit_blog_post
  update_email_template create_email_template edit_email_template clone_email_template
  remove_embed replace_embed move_embed
  create_faq insert_faqs refresh_blog_oembeds
].to_set.freeze

PATCHABLE_TOOLS = Tools that write to a blog post body — participate in the patch-failure circuit breaker and per-post reread gate (Stage 6).

%w[update_blog_post edit_blog_post].to_set.freeze

READS_INVALIDATED_BY_WRITES = Read tools whose cached signatures should be cleared after any write tool executes, because the underlying content has changed and re-reading is valid. get_block_html / get_email_block_html are included so the model can re-read a block's exact HTML after editing it (and read a different block later) without the per-turn dedup guard blocking the call as a repeat.

%w[get_blog_post get_block_html get_email_block_html].freeze
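
A minimal sketch of the invalidation described above. The helper name `clear_invalidated_reads!` and the `"tool_name:args"` signature format are assumptions made for illustration; the actual cache layout is not shown on this page.

```ruby
READS_INVALIDATED_BY_WRITES = %w[get_blog_post get_block_html get_email_block_html].freeze

# Hypothetical helper: drops cached signatures for read tools whose content
# may have changed. Signatures are assumed to look like "tool_name:args".
def clear_invalidated_reads!(call_counts)
  call_counts.delete_if do |signature, _count|
    READS_INVALIDATED_BY_WRITES.include?(signature.split(":", 2).first)
  end
end

counts = Hash.new(0)
counts['get_block_html:{"block_id":"b1"}'] += 1
counts['search_products:{"q":"lamp"}'] += 1

# After a write tool runs, the block read may be repeated; the search may not.
clear_invalidated_reads!(counts)
```

Only the read-tool entry is removed, so a repeated `search_products` call with the same arguments would still be caught by the dedup guard.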

COMPLETION_MARKING_TOOLS = Tools that mark SEO work as "done" — must not run until writes are verified.

%w[seo_update_action_item seo_batch_update_action_items].to_set.freeze

VERIFICATION_READS = Read tools that count as verification of a prior write.

%w[get_blog_post].to_set.freeze

WRITES_REQUIRING_VERIFICATION = Historically patch_blog_post needed a follow-up get_blog_post to verify because find/replace could silently mismatch. That tool was removed in Stage 7 of the Sunny blog editor fix plan, so no current write tool needs post-hoc verification — block-ID ops in edit_blog_post are self-verifying (each op_result reports applied / not_found / invalid), update_blog_post replaces the full body, and create_faq / insert_faqs manipulate separate records.

Set.new.freeze

SELF_VERIFYING_WRITES = A successful full-body write is authoritative — it makes any prior pending patch verifications moot because the entire content was replaced.

%w[update_blog_post create_blog_post].to_set.freeze

MAX_CONSECUTIVE_EMPTY_SEARCHES = Consecutive search-type calls that return zero results before forcing the model to stop and ask the user for clarification.

MAX_CONSECUTIVE_SEO_LOOKUP_FAILURES = Maximum consecutive failed seo_get_page / seo_get_action_items(path:) lookups, for when the model guesses wrong paths repeatedly.

SEO_LOOKUP_TOOLS = SEO lookup tools.

%w[seo_get_page seo_get_action_items].freeze

SEARCH_TOOLS = Search/lookup tools that are prone to "spiral" loops when the model can't find what it's looking for and keeps retrying with variations.

%w[
  semantic_search find_images search_images find_faqs find_call_recordings
  search_activity_notes search_brain search_products
  find_support_cases search_support_notes
  find_employee get_team_availability get_pipeline_summary
].to_set.freeze
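
The empty-search "spiral" guard can be pictured as a counter that resets on any non-empty result. This is a sketch under stated assumptions: the class `EmptySearchTracker`, its `record` method, and the threshold value are hypothetical, and only the two counter names mirror the instance variables in the constructor.

```ruby
MAX_CONSECUTIVE_EMPTY_SEARCHES = 3 # assumed value

# Hypothetical tracker mirroring @consecutive_empty_searches and
# @cumulative_empty_searches from the constructor.
class EmptySearchTracker
  attr_reader :consecutive, :cumulative

  def initialize
    @consecutive = 0
    @cumulative = 0
  end

  # Records one search result count. Returns true when the model should
  # stop searching and ask the user for clarification.
  def record(result_count)
    if result_count.zero?
      @consecutive += 1
      @cumulative += 1
    else
      @consecutive = 0 # any hit breaks the streak; the cumulative total is kept
    end
    @consecutive >= MAX_CONSECUTIVE_EMPTY_SEARCHES
  end
end
```

Keeping a separate cumulative count lets the turn's stats report every empty search even though a successful search resets the streak.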

Instance Method Summary

#apply!(tools) ⇒ Object
#cancelled? ⇒ Boolean
True after the user-cancellation check fired at least once.
#effective_limit ⇒ Object
Dynamic tool call limit.
#initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil, supports_thinking: false, cancel_check: nil, available_tool_names: nil, described_views: nil, long_form_authoring: false) ⇒ ToolLoopGuard constructor
A new instance of ToolLoopGuard.
#limit_reached? ⇒ Boolean
True after the total call ceiling was hit at least once.
#loaded_write_tool_names ⇒ Array<String>
Names of write tools that are loaded for this turn.
#max_turn_duration ⇒ Integer
Wall-clock cap for the current turn.
#stats ⇒ Object

Constructor Details

#initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil, supports_thinking: false, cancel_check: nil, available_tool_names: nil, described_views: nil, long_form_authoring: false) ⇒ `ToolLoopGuard`

Returns a new instance of ToolLoopGuard.

# File 'app/services/assistant/tool_loop_guard.rb', line 105

def initialize(role:, conversation_id: nil, plan_step_mode: false, cross_step_cache: nil,
               supports_thinking: false, cancel_check: nil, available_tool_names: nil,
               described_views: nil, long_form_authoring: false)
  @role = role.to_sym
  @conversation_id = conversation_id
  @plan_step_mode = plan_step_mode
  @cross_step_cache = cross_step_cache
  @supports_thinking = supports_thinking
  # True when the turn loads long-form content-authoring tools (blog/email).
  # Grants a longer wall-clock cap (MAX_TURN_DURATION_AUTHORING) so large
  # template/body edits finish instead of being halted mid-edit (#4714).
  @long_form_authoring = long_form_authoring
  # Optional Proc returning truthy when the user has requested cancellation.
  # Checked in preflight so the guard halts mid-tool-loop, not just at
  # plan-step boundaries — see PlanOrchestrator and AssistantChatWorker.
  @cancel_check = cancel_check
  # Names of every tool registered for this turn. Used by response messages
  # so the budget-exhaustion / hard-stop guidance only suggests a write
  # tool when one is actually loaded — read-only sales workflows used to
  # be told "use a write op" with no write op available, which the model
  # acted on by ending mid-task.
  @available_tool_names = Array(available_tool_names).map(&:to_s).freeze
  @call_counts = Hash.new(0)
  # Tools whose execute body actually ran (preflight passed and
  # original_execute was invoked). @call_counts is bumped before dedup /
  # recovery short-circuits, so it must not drive plan "services used"
  # carryover — see PR 644 review.
  @executed_tool_names = Set.new
  @consecutive_sql_failures = 0
  @consecutive_patch_failures = 0
  @consecutive_empty_searches = 0
  @cumulative_empty_searches = 0
  @cumulative_sql_errors = 0
  @cumulative_patch_errors = 0
  @cross_step_cache_hits = 0
  @total_calls = 0
  @final_write_used = false
  @limit_reached = false
  @cancelled = false
  @writes_pending_verification = 0
  @plan_declared = false
  @plan_step_count = 0
  @plan_nudge_sent = false
  @consecutive_seo_lookup_failures = 0
  # Stage 6 of the Sunny blog editor fix plan: per-post reread breaker.
  # Any failed patch_blog_post / edit_blog_post sets the post_id here
  # so the next write to that post is blocked until get_blog_post(post_id)
  # is called. Cleared by the verification read.
  @posts_requiring_reread = Set.new
  # SQL HYGIENE: track which views have been resolved via
  # describe_available_data this turn. Subsequent *_execute_sql
  # calls that reference a view NOT in this set are blocked with a
  # structured error pointing the model at describe first — prevents
  # the column-hallucination loop that PR #720 first tried to address
  # with prompt-only guidance (kept inventing fresh column names like
  # `assigned_resource_name` / `assigned_rep_name`).
  #
  # In plan_step_mode the orchestrator passes a shared Set so a
  # describe call in step 1 still satisfies SQL in step 5 — otherwise
  # the model would be forced to re-describe each step, which is wasteful
  # and itself burns budget.
  @described_views = described_views || Set.new
  @started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
end
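
The SQL HYGIENE comment in the constructor describes a gate keyed on a shared Set of described views. A minimal sketch of that idea follows; the `sql_gate` helper, the error hash shape, and the view names are hypothetical, since the real check is not shown on this page.

```ruby
require "set"

# Hypothetical gate: returns nil when the SQL may run, or an error hash
# naming the views that have not been described yet.
def sql_gate(described_views, referenced_views)
  missing = referenced_views.reject { |view| described_views.include?(view) }
  return nil if missing.empty?

  { error: "describe_required", views: missing }
end

# In plan_step_mode the orchestrator would own one Set and pass it to each
# step's guard, so a describe in step 1 still counts in step 5.
shared_views = Set.new
shared_views << "opportunities" # resolved by describe_available_data earlier

allowed = sql_gate(shared_views, ["opportunities"])
blocked = sql_gate(shared_views, %w[opportunities quotes])
```

Here `allowed` is nil, while `blocked` names `quotes` as the view that still needs a describe call.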

Instance Method Details

#apply!(tools) ⇒ `Object`

# File 'app/services/assistant/tool_loop_guard.rb', line 238

def apply!(tools)
  Array(tools).each { |tool| wrap!(tool) }
  tools
end
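
The private `wrap!` helper is not shown on this page. One plausible shape, sketched here purely as an assumption, is to capture the tool's original `execute` and redefine it on the instance so a preflight check runs first. `EchoTool`, the `seen` Set, and the error hash are illustrative stand-ins, not the guard's actual behavior.

```ruby
require "set"

# Hypothetical tool object; real RubyLLM tools are not reproduced here.
class EchoTool
  def name
    "echo"
  end

  def execute(**args)
    { ok: true, echoed: args }
  end
end

# Assumed shape of a wrapper: run a dedup preflight, then delegate.
def wrap!(tool, seen)
  original_execute = tool.method(:execute)
  tool.define_singleton_method(:execute) do |**args|
    signature = [name, args]
    if seen.include?(signature)
      { error: "duplicate call blocked" }
    else
      seen << signature
      original_execute.call(**args)
    end
  end
  tool
end

# Same structure as #apply!: accepts one tool or many, returns its argument.
def apply!(tools, seen = Set.new)
  Array(tools).each { |tool| wrap!(tool, seen) }
  tools
end
```

Because `Array(tools)` is only used for iteration, `apply!` returns whatever was passed in, so a caller that passes a single tool gets that tool back rather than a one-element array.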

#cancelled? ⇒ `Boolean`

True after the user-cancellation check fired at least once.
Used by PlanOrchestrator to short-circuit remaining steps without
waiting for the next outer cancel_check at the step boundary.

Returns:

(Boolean)



# File 'app/services/assistant/tool_loop_guard.rb', line 193

def cancelled?
  @cancelled
end

#effective_limit ⇒ `Object`

Dynamic tool call limit.

plan_step_mode: server-orchestrated step execution → PLAN_STEP_TOOL_CALLS
plan_declared: model declared a plan inline → scales with step count
otherwise: BASE_TOOL_CALLS (keeps initial turns fast)

# File 'app/services/assistant/tool_loop_guard.rb', line 174

def effective_limit
  if @plan_step_mode
    PLAN_STEP_TOOL_CALLS
  elsif @plan_declared && @plan_step_count.positive?
    [BASE_TOOL_CALLS, @plan_step_count * CALLS_PER_PLAN_STEP].max.clamp(BASE_TOOL_CALLS, MAX_PLAN_TOOL_CALLS)
  else
    BASE_TOOL_CALLS
  end
end
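
A worked example of the plan-scaled branch. The three constant values below are assumed for illustration only (the real values are not shown on this page), and `plan_limit` is a hypothetical standalone copy of the expression above.

```ruby
# Assumed values for illustration only.
BASE_TOOL_CALLS = 20
CALLS_PER_PLAN_STEP = 5
MAX_PLAN_TOOL_CALLS = 40

# Standalone copy of the plan-declared branch of #effective_limit.
def plan_limit(step_count)
  [BASE_TOOL_CALLS, step_count * CALLS_PER_PLAN_STEP].max.clamp(BASE_TOOL_CALLS, MAX_PLAN_TOOL_CALLS)
end

small_plan  = plan_limit(2)  # 2 * 5 = 10, below the floor, so 20
medium_plan = plan_limit(6)  # 6 * 5 = 30, inside the range, so 30
large_plan  = plan_limit(12) # 12 * 5 = 60, above the ceiling, so 40
```

Under these assumed values a declared plan never lowers the budget below BASE_TOOL_CALLS and never raises it above MAX_PLAN_TOOL_CALLS.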

#limit_reached? ⇒ `Boolean`

True after the total call ceiling was hit at least once.
Used by the worker to render a "Continue in new conversation" button.

Returns:

(Boolean)



# File 'app/services/assistant/tool_loop_guard.rb', line 186

def limit_reached?
  @limit_reached
end

#loaded_write_tool_names ⇒ `Array<String>`

Names of write tools that are loaded for this turn. Used by response
messages to tailor the budget-exhaustion guidance — read-only sales
workflows shouldn't be told to "use a write tool" when none exist.

Returns:

(Array<String>) —
loaded write-tool names (frozen, memoized)



# File 'app/services/assistant/tool_loop_guard.rb', line 218

def loaded_write_tool_names
  @loaded_write_tool_names ||= (@available_tool_names & WRITE_TOOLS.to_a).freeze
end
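
The intersection can be tried in isolation. In this sketch WRITE_TOOLS is shortened to a subset of the real constant for brevity, and the list of available tools is made up for the example.

```ruby
require "set"

# Subset of WRITE_TOOLS from the constant summary, shortened for the example.
WRITE_TOOLS = %w[update_blog_post create_blog_post edit_blog_post create_faq].to_set.freeze

# Same normalization the constructor applies to available_tool_names.
available = Array(%i[get_blog_post edit_blog_post search_products create_faq]).map(&:to_s).freeze
loaded = (available & WRITE_TOOLS.to_a).freeze

# A read-only turn (or available_tool_names: nil) yields an empty list.
read_only = (Array(nil).map(&:to_s) & WRITE_TOOLS.to_a).freeze
```

`Array#&` keeps the order of the left-hand array, so the result lists write tools in the order they were registered for the turn.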

#max_turn_duration ⇒ `Integer`

Wall-clock cap for the current turn. Thinking-capable models (Gemini Pro,
Claude Sonnet+Opus reasoning, etc.) get a longer cap because their first
tool call commonly spends 60-130s on extended thinking. Long-form
content-authoring turns (blog/email) get the longest cap because they edit
large HTML bodies across many tool calls and were being halted mid-edit (#4714).

Returns:

(Integer) —
the per-turn wall-clock budget in seconds —
MAX_TURN_DURATION_AUTHORING for thinking + long-form authoring turns,
MAX_TURN_DURATION_THINKING for other thinking turns, else MAX_TURN_DURATION.

# File 'app/services/assistant/tool_loop_guard.rb', line 206

def max_turn_duration
  return MAX_TURN_DURATION unless @supports_thinking
  return MAX_TURN_DURATION_AUTHORING if @long_form_authoring

  MAX_TURN_DURATION_THINKING
end
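
The two guard clauses imply that long_form_authoring only matters for thinking-capable models. The sketch below makes that explicit with a standalone copy of the method that takes keyword arguments; the 120 and 600 values are assumptions, while 360 is the thinking cap cited in the constant notes.

```ruby
MAX_TURN_DURATION = 120           # assumed value
MAX_TURN_DURATION_THINKING = 360  # cited as the thinking cap in the constant notes
MAX_TURN_DURATION_AUTHORING = 600 # assumed value

# Standalone copy of #max_turn_duration with the instance variables
# turned into keyword arguments.
def max_turn_duration(supports_thinking:, long_form_authoring:)
  return MAX_TURN_DURATION unless supports_thinking
  return MAX_TURN_DURATION_AUTHORING if long_form_authoring

  MAX_TURN_DURATION_THINKING
end

caps = [true, false].product([true, false]).to_h do |thinking, authoring|
  [[thinking, authoring], max_turn_duration(supports_thinking: thinking, long_form_authoring: authoring)]
end
```

Note that a non-thinking model on an authoring turn still gets the default cap, because the first guard clause returns before long_form_authoring is checked.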

#stats ⇒ `Object`

# File 'app/services/assistant/tool_loop_guard.rb', line 222

def stats
  tool_names = @executed_tool_names.to_a
  {
    total_tool_calls: @total_calls,
    tool_call_limit: effective_limit,
    sql_errors: @cumulative_sql_errors,
    patch_errors: @cumulative_patch_errors,
    empty_searches: @cumulative_empty_searches,
    writes_pending: @writes_pending_verification,
    unique_tools: tool_names.size,
    unique_tool_names: tool_names,
    limit_reached: @limit_reached,
    cross_step_cache_hits: @cross_step_cache_hits
  }
end
