Class: Retailer::ProbeAutoSkipper

Inherits:
Object
Defined in:
app/services/retailer/probe_auto_skipper.rb

Overview

Detects catalog items whose retailer URL has been failing repeatedly and
sets skip_url_checks to true on them, so the daily BatchPriceChecker
stops paying Oxylabs to re-probe a dead or wrong URL indefinitely.

Why this exists: when a probe finishes with status failed, not_found, or
product_mismatch, no retail_price is captured. SiblingPriceRefresher
treats such items as perpetually stale and re-probes them on every Amazon
pricing run, and the next nightly batch probes them again. An audit in
May 2026 showed that ~9% of monthly Oxylabs volume was wasted on items that
had not returned a single successful price in the previous 30 days.

Call this once after saving a probe. It is idempotent (an item that is
already skipped returns early) and cheap: one query served by the
(catalog_item_id, created_at) index.

Examples:

probe.save!
Retailer::ProbeAutoSkipper.maybe_skip!(catalog_item)

Constant Summary

PROBE_WINDOW = 10

How many of the most-recent probes to evaluate.
FAILURE_THRESHOLD = 8

Auto-skip once at least this many of the last PROBE_WINDOW probes are
non-success. This is a failure rate, not a strict consecutive streak: a
retailer that blocks our scraper but still succeeds intermittently would
keep resetting a consecutive streak and never trip. In May 2026, rona.ca
began timing out or faulting on ~85% of Oxylabs probes while a handful still
succeeded, so the old all-consecutive rule never fired. See the
project_retailer_oxylabs_probe_system memory.
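The difference between a streak rule and a rate rule is easiest to see on a made-up probe history. The following is a minimal sketch in plain Ruby, not the service's code: the module name, both helper methods, the sample history, and the "success" status string are assumptions for illustration (the page only names the non-success statuses).

```ruby
# Illustrative only: this module, its helpers, and the sample history are
# assumptions, not part of Retailer::ProbeAutoSkipper.
module StreakVsRateSketch
  PROBE_WINDOW = 10
  FAILURE_THRESHOLD = 8
  NON_SUCCESS_STATUSES = %w[failed not_found product_mismatch].freeze

  # Length of the unbroken run of non-success statuses at the head of the
  # list (statuses are ordered newest first).
  def self.leading_failure_streak(statuses)
    statuses.take_while { |s| NON_SUCCESS_STATUSES.include?(s) }.size
  end

  # Number of non-success statuses anywhere in the list.
  def self.failure_count(statuses)
    statuses.count { |s| NON_SUCCESS_STATUSES.include?(s) }
  end
end

# Newest first: mostly failures, with two intermittent successes.
# "success" is a placeholder for whatever status a good probe records.
history = %w[failed failed success failed not_found
             failed failed success failed failed]

streak = StreakVsRateSketch.leading_failure_streak(history)
count  = StreakVsRateSketch.failure_count(history)

puts "streak rule trips: #{streak >= StreakVsRateSketch::PROBE_WINDOW}"
puts "rate rule trips:   #{count >= StreakVsRateSketch::FAILURE_THRESHOLD}"
```

With this history the consecutive streak is reset by each intermittent success, while the count over the whole window still reaches the threshold.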
NON_SUCCESS_STATUSES = %w[failed not_found product_mismatch].freeze

Statuses that count as "non-success".

Class Method Summary

Class Method Details

.maybe_skip!(catalog_item, window: PROBE_WINDOW) ⇒ Boolean

Inspects the catalog item's most-recent probes; if at least
FAILURE_THRESHOLD of the last PROBE_WINDOW are non-success, sets
skip_url_checks = true on the catalog item so subsequent batch and
sibling-refresh runs skip it. Does nothing if the item is already skipped
or has fewer than window probes on record.

Parameters:

  • catalog_item (CatalogItem)
  • window (Integer) (defaults to: PROBE_WINDOW)

    lookback size (override for tests)

Returns:

  • (Boolean)

    true if we just flipped skip_url_checks; false otherwise
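The return-value contract (true only on the call that flips the flag) can be illustrated without a database. This is a rough sketch on plain data, assuming a fixed threshold of 8 out of 10; `FakeItem`, `NON_SUCCESS`, and `sketch_maybe_skip!` are hypothetical stand-ins and not part of the application.

```ruby
# Hypothetical stand-in for CatalogItem; the real model is an ActiveRecord
# class. `statuses` holds probe statuses, newest first.
FakeItem = Struct.new(:statuses, :skip_url_checks)

NON_SUCCESS = %w[failed not_found product_mismatch].freeze

# Mirrors the guard order described above on plain data. The fixed
# `threshold:` keyword is an assumption made for this sketch.
def sketch_maybe_skip!(item, window: 10, threshold: 8)
  return false if item.skip_url_checks        # already skipped: no-op

  recent = item.statuses.first(window)
  return false if recent.size < window        # not enough history yet
  return false if recent.count { |s| NON_SUCCESS.include?(s) } < threshold

  item.skip_url_checks = true
  true
end

dead = FakeItem.new(%w[failed] * 10, false)
first_call  = sketch_maybe_skip!(dead)   # flips the flag
second_call = sketch_maybe_skip!(dead)   # flag already set

fresh = FakeItem.new(%w[failed] * 3, false)
too_few = sketch_maybe_skip!(fresh)      # only 3 probes on record

puts [first_call, second_call, too_few].inspect
```

Calling it a second time on the same item returns false because the flag is already set, which is what makes repeated calls after every probe save harmless.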



# File 'app/services/retailer/probe_auto_skipper.rb', line 45

def self.maybe_skip!(catalog_item, window: PROBE_WINDOW)
  return false if catalog_item.skip_url_checks?

  recent_statuses = catalog_item.retailer_probes
                                .order(created_at: :desc)
                                .limit(window)
                                .pluck(:status)

  return false if recent_statuses.size < window

  non_success = recent_statuses.count { |s| NON_SUCCESS_STATUSES.include?(s) }
  return false if non_success < failures_for(window)

  # Flip the unrelated `skip_url_checks` maintenance flag WITHOUT running the
  # model's full validations. A CatalogItem can carry pre-existing invalid data
  # unrelated to URL probing — e.g. a blank `amount` (validated at
  # catalog_item.rb:252) — and a plain `update!` would raise RecordInvalid on it
  # mid-webhook (AppSignal #6019). `update_attribute` saves with `validate: false`
  # while still firing callbacks, updated_at, and PaperTrail versioning — the
  # audit trail of when the system auto-skipped an item, which is the point.
  catalog_item.update_attribute(:skip_url_checks, true)
  Rails.logger.warn(
    "[ProbeAutoSkipper] Auto-skipped CatalogItem #{catalog_item.id} after " \
    "#{non_success}/#{window} non-success probes (latest: #{recent_statuses.first})"
  )
  true
end
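The listing calls `failures_for(window)`, which is not shown on this page (presumably a private helper). One plausible shape, offered purely as an assumption, scales FAILURE_THRESHOLD proportionally to the window and rounds up, so that the default window yields the documented 8 of 10. The module name below is hypothetical, and the real helper may behave differently.

```ruby
# Hypothetical sketch: the real failures_for is not shown in the docs and
# may differ. This version keeps the 8-in-10 ratio for any window size.
module FailuresForSketch
  PROBE_WINDOW = 10
  FAILURE_THRESHOLD = 8

  # Integer ceiling of window * FAILURE_THRESHOLD / PROBE_WINDOW.
  def self.failures_for(window)
    (window * FAILURE_THRESHOLD + PROBE_WINDOW - 1) / PROBE_WINDOW
  end
end

[10, 5, 3].each do |window|
  puts "window=#{window} requires #{FailuresForSketch.failures_for(window)} failures"
end
```

Rounding up errs on the side of not skipping: with a small test window, an item needs proportionally at least as many failures as it would under the default window.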