Class: Retailer::ProbeAutoSkipper
- Inherits: Object
  - Object
  - Retailer::ProbeAutoSkipper
- Defined in: app/services/retailer/probe_auto_skipper.rb
Overview
Detects catalog items whose retailer URL has been failing repeatedly and
auto-flips skip_url_checks: true so the daily BatchPriceChecker
stops billing Oxylabs to re-probe a dead/wrong URL forever.
Why this exists: when a probe finishes with status failed, not_found, or
product_mismatch, no retail_price is captured. SiblingPriceRefresher
treats such items as perpetually stale and re-probes them on every Amazon
pricing run, and the next nightly batch probes them again. Audit (May 2026)
showed ~9% of monthly Oxylabs volume was wasted on items that had never
returned a successful price in the last 30 days.
Call this once after saving a probe. Idempotent and cheap (one indexed
query against (catalog_item_id, created_at)).
Constant Summary
- PROBE_WINDOW =
  How many of the most-recent probes to evaluate.
  10
- FAILURE_THRESHOLD =
  Auto-skip once at least this many of the last PROBE_WINDOW probes are
  non-success. This is a failure rate, not a strict consecutive streak: a
  retailer that blocks our scraper but still succeeds intermittently would
  keep resetting a consecutive streak and never trip. rona.ca (May 2026)
  began timing out / faulting ~85% of Oxylabs probes while a handful still
  succeeded, so the old all-consecutive rule never fired. See the
  project_retailer_oxylabs_probe_system memory.
  8
- NON_SUCCESS_STATUSES =
  Statuses that count as "non-success".
  %w[failed not_found product_mismatch].freeze
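To make the rate-versus-streak distinction concrete, here is a minimal pure-Ruby sketch. The module name `SkipRuleSketch`, both helper methods, and the `success` status string are assumptions made for illustration; only the three constants come from this page.

```ruby
# Illustrative sketch only. The constants mirror the values documented above;
# the module, helper names, and the "success" status string are hypothetical.
module SkipRuleSketch
  PROBE_WINDOW = 10
  FAILURE_THRESHOLD = 8
  NON_SUCCESS_STATUSES = %w[failed not_found product_mismatch].freeze

  module_function

  # Rate rule: trips when at least FAILURE_THRESHOLD of the newest
  # PROBE_WINDOW statuses are non-success.
  def rate_rule_trips?(statuses)
    recent = statuses.first(PROBE_WINDOW)
    return false if recent.size < PROBE_WINDOW

    recent.count { |s| NON_SUCCESS_STATUSES.include?(s) } >= FAILURE_THRESHOLD
  end

  # Streak rule (the older behaviour described above): trips only when
  # every status in the window is non-success.
  def streak_rule_trips?(statuses)
    recent = statuses.first(PROBE_WINDOW)
    return false if recent.size < PROBE_WINDOW

    recent.all? { |s| NON_SUCCESS_STATUSES.include?(s) }
  end
end

# Newest first: 8 non-success probes with 2 intermittent successes mixed in.
intermittent = %w[failed failed success failed failed
                  failed not_found success failed failed]

puts SkipRuleSketch.rate_rule_trips?(intermittent)   # true
puts SkipRuleSketch.streak_rule_trips?(intermittent) # false
```

The two intermittent successes are enough to defeat the streak rule, while the rate rule still sees 8 of 10 probes failing and trips.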
Class Method Summary
- .maybe_skip!(catalog_item, window: PROBE_WINDOW) ⇒ Boolean
  Inspects the catalog item's most-recent probes; if at least FAILURE_THRESHOLD of the last PROBE_WINDOW are non-success, sets
  skip_url_checks = true on the catalog item so subsequent batch and sibling-refresh runs skip it.
Class Method Details
.maybe_skip!(catalog_item, window: PROBE_WINDOW) ⇒ Boolean
Inspects the catalog item's most-recent probes; if at least
FAILURE_THRESHOLD of the last PROBE_WINDOW are non-success, sets
skip_url_checks = true on the catalog item so subsequent batch and
sibling-refresh runs skip it.
# File 'app/services/retailer/probe_auto_skipper.rb', line 45

def self.maybe_skip!(catalog_item, window: PROBE_WINDOW)
  return false if catalog_item.skip_url_checks?

  recent_statuses = catalog_item.retailer_probes
                                .order(created_at: :desc)
                                .limit(window)
                                .pluck(:status)
  return false if recent_statuses.size < window

  non_success = recent_statuses.count { |s| NON_SUCCESS_STATUSES.include?(s) }
  return false if non_success < failures_for(window)

  # Flip the unrelated `skip_url_checks` maintenance flag WITHOUT running the
  # model's full validations. A CatalogItem can carry pre-existing invalid data
  # unrelated to URL probing — e.g. a blank `amount` (validated at
  # catalog_item.rb:252) — and a plain `update!` would raise RecordInvalid on it
  # mid-webhook (AppSignal #6019). `update_attribute` saves with `validate: false`
  # while still firing callbacks, updated_at, and PaperTrail versioning — the
  # audit trail of when the system auto-skipped an item, which is the point.
  catalog_item.update_attribute(:skip_url_checks, true)

  Rails.logger.warn(
    "[ProbeAutoSkipper] Auto-skipped CatalogItem #{catalog_item.id} after " \
    "#{non_success}/#{window} non-success probes (latest: #{recent_statuses.first})"
  )
  true
end
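The order of the guard clauses and the idempotent return value can be tried without Rails using in-memory stand-ins. This is a rough sketch, not the service itself: `AutoSkipSketch`, `FakeItem`, and the `success` status string are hypothetical, the sketch uses only the default window, and it compares against FAILURE_THRESHOLD directly because the `failures_for(window)` helper called in the listing is not shown on this page.

```ruby
# Illustrative sketch only: a Rails-free stand-in for the decision flow above.
module AutoSkipSketch
  PROBE_WINDOW = 10
  FAILURE_THRESHOLD = 8
  NON_SUCCESS_STATUSES = %w[failed not_found product_mismatch].freeze

  # Hypothetical stand-in for CatalogItem; `statuses` is newest-first.
  FakeItem = Struct.new(:id, :statuses, :skip_url_checks) do
    def skip_url_checks?
      skip_url_checks ? true : false
    end
  end

  def self.maybe_skip!(item)
    # Guard 1: already skipped, so repeated calls are a no-op.
    return false if item.skip_url_checks?

    # Guard 2: not enough probe history to judge yet.
    recent = item.statuses.first(PROBE_WINDOW)
    return false if recent.size < PROBE_WINDOW

    # Guard 3: failure rate below the threshold.
    # Assumption: FAILURE_THRESHOLD stands in for failures_for(window).
    non_success = recent.count { |s| NON_SUCCESS_STATUSES.include?(s) }
    return false if non_success < FAILURE_THRESHOLD

    item.skip_url_checks = true
    true
  end
end

new_item  = AutoSkipSketch::FakeItem.new(1, %w[failed] * 5, false)
dead_item = AutoSkipSketch::FakeItem.new(2, %w[failed] * 9 + %w[success], false)

puts AutoSkipSketch.maybe_skip!(new_item)   # false: only 5 probes recorded
puts AutoSkipSketch.maybe_skip!(dead_item)  # true: 9 of 10 non-success
puts AutoSkipSketch.maybe_skip!(dead_item)  # false: flag already set
```

The return value is true only on the call that actually flips the flag, so a caller can use it to decide whether to emit a one-time notification.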