Retailer Price Monitoring System
Overview
The Retailer Price Monitoring system automatically tracks product prices and availability across third-party retail partners (Home Depot, Costco, Wayfair, Amazon, etc.). It detects MAP (Minimum Advertised Price) violations for vendor/1P catalogs and provides visibility into retailer pricing behavior.
Accessing the Feature
Retailer Probes Tab (Per Item)
Path: Catalog Items → [Select Item] → Retailer Probes tab
URL: https://crm.warmlyyours.me:3000/en-US/catalog_items/{id}?tab=retailer_probes
Only visible for catalog items in catalogs with external_price_check_enabled: true.
MAP Violations Search
Path: Catalogs → [Select Catalog] → "MAP Violations" link
Direct Search: ProductCatalogSearch with map_violation: true filter
Features
Automatic Price Checks
Daily automated checks via RetailerProbeWorker:
| Feature | Description |
|---|---|
| Price Extraction | Captures current/sale price and original/was price |
| Availability Detection | Detects "out of stock" and "unavailable" indicators |
| URL Validation | Tracks if product pages are accessible |
| History Tracking | Stores all checks in catalog_item_retailer_probes table |
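Availability detection can be pictured as a phrase scan over the page text. The sketch below is a simplified illustration, not the extractors' actual logic: the `UNAVAILABLE_PATTERNS` list and the `product_available?` helper are hypothetical names, and real extractors may rely on structured data instead of plain text.

```ruby
# Hypothetical phrase list; the real extractors may check other indicators.
UNAVAILABLE_PATTERNS = [/out of stock/i, /unavailable/i].freeze

# Returns false when any "unavailable" indicator appears in the page text.
def product_available?(page_text)
  UNAVAILABLE_PATTERNS.none? { |pattern| page_text.to_s.match?(pattern) }
end
```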
Manual "Probe Now" Button
Users with :update capability on a catalog item can trigger an immediate price check:
- Button located on the Retailer Probes tab
- Uses Oxylabs Realtime API for immediate results
- Updates the retail_price column on success
MAP Violation Detection
For vendor/1P catalogs (Home Depot, Costco, Wayfair, Lowe's, etc.):
| Field | Description |
|---|---|
| retailer_type | :marketplace (3P) or :vendor (1P) |
| map_percentage | Default 80% of MSRP (20% max discount) |
| map_price | Calculated as msrp * map_percentage |
| map_violation | true if retail_price < map_price |
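The MAP arithmetic in the table can be sketched in a few lines. This is an illustration under assumptions, not the application's code: the helper names `map_price` and `map_violation?` are hypothetical, prices are rounded to two decimals here, and a missing price is treated as "no violation".

```ruby
require 'bigdecimal'
require 'bigdecimal/util'

DEFAULT_MAP_PERCENTAGE = BigDecimal('0.80') # default from the catalogs table

# map_price = msrp * map_percentage (rounding to cents is an assumption)
def map_price(msrp, map_percentage = DEFAULT_MAP_PERCENTAGE)
  (msrp.to_d * map_percentage.to_d).round(2)
end

# A violation occurs when the advertised retail price is below the MAP price.
def map_violation?(retail_price, msrp, map_percentage = DEFAULT_MAP_PERCENTAGE)
  return false if retail_price.nil? || msrp.nil?

  retail_price.to_d < map_price(msrp, map_percentage)
end
```

With an MSRP of 100 and the default 80%, the MAP price is 80, so a retail price of 79.99 is a violation and a retail price of exactly 80 is not.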
Catalog Show Page
For vendor catalogs, displays:
- Retailer Type Badge: "Vendor (1P)" or "Marketplace (3P)"
- MAP Percentage: e.g., "80% of MSRP"
- MAP Violations Link: Quick search for active items in violation
Supported Retailers
| Retailer | Extractor Class | Price Extraction |
|---|---|---|
| Amazon (all markets) | Retailer::Extractors::Amazon | Parsed JSON from amazon_product |
| Home Depot US/CA | Retailer::Extractors::HomeDepot | JSON-LD + HTML selectors |
| Costco CA | Retailer::Extractors::Costco | JSON-LD + aria-labels |
| Wayfair US/CA/DE | Retailer::Extractors::Wayfair | data-test-id attributes |
| Walmart US/CA | Retailer::Extractors::Walmart | __NEXT_DATA__ + JSON-LD |
| Lowe's US/CA | Retailer::Extractors::Lowes | JSON-LD + CSS selectors |
| Rona/RenoDepot | Retailer::Extractors::Rona | Browser automation + parsing |
| Build.com (Ferguson) | Retailer::Extractors::BuildCom | JSON-LD + URL discovery |
| Canadian Tire | Retailer::Extractors::CanadianTire | JSON-LD + CSS selectors |
| Houzz | Retailer::Extractors::Houzz | JSON-LD extraction |
| Best Buy Canada | Retailer::Extractors::BestbuyCanada | JSON-LD extraction |
Technical Architecture
Clean Separation of Concerns
The system separates orchestration, extraction, and HTTP transport into distinct layers:
┌─────────────────────────────────────────────────────────────┐
│ PriceChecker │
│ (Orchestration - Fetch + Delegate) │
│ • check(catalog_item) • check_catalog(catalog) │
│ • fetch_product(...) • process_result(...) │
└─────────────────────────────────────────────────────────────┘
│
│ Delegates to
▼
┌─────────────────────────────────────────────────────────────┐
│ Extractors::Factory.for(catalog) │
│ (Returns correct extractor) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Extractors (Per Retailer) │
│ (Payload Building + Data Extraction) │
│ │
│ Class Methods: Instance Methods: │
│ • build_payload(url:) • extract(check, content) │
│ • search_payload(query:) • validate_product_identity │
│ • discovery_payload • discovered_url │
└─────────────────────────────────────────────────────────────┘
│
│ Uses
▼
┌─────────────────────────────────────────────────────────────┐
│ OxylabsApi │
│ (Pure HTTP Client - Transport Only) │
│ • request(payload) • submit_job(payload) │
│ • job_status(id) • job_results(id) │
│ • poll_for_results(id) • submit_jobs_batch(payloads) │
└─────────────────────────────────────────────────────────────┘
Extractor Factory Pattern
All retailers use the same code path through the factory:
# In PriceChecker#process_result - unified for ALL retailers
extractor = Retailer::Extractors::Factory.for(catalog_item.catalog)
extractor.extract(check, content)
# Validate product identity (prevents false positives from redirects)
unless extractor.validate_product_identity(check, content, catalog_item)
  check.price = nil # Clear extracted data on mismatch
end
Each extractor handles its own extraction logic:
# Get the right extractor for a catalog
extractor = Retailer::Extractors::Factory.for(catalog)
# Build payload (class method - knows retailer API specifics)
payload = extractor.class.build_payload(url: product_url)
# Make request (API client - knows HTTP)
api = Retailer::OxylabsApi.new
result = api.request(payload)
# Extract data (instance method - knows HTML structure)
extractor.extract(check, Retailer::OxylabsApi.html_content(result))
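The factory lookup itself can be sketched as a registry keyed by catalog ID with a generic fallback. The classes below are stand-ins inside a hypothetical `FactorySketch` module, not the real `Retailer::Extractors` classes; only the catalog ID 1 for Home Depot USA comes from `CatalogConstants`, and the real factory may use a `case` statement instead of a hash.

```ruby
module FactorySketch
  # Minimal stand-ins for the extractor hierarchy.
  class Base; end
  class HomeDepot < Base; end
  class Generic < Base; end

  # Stand-in for a catalog record; only the id matters here.
  Catalog = Struct.new(:id)

  module Factory
    # Hypothetical registry: catalog ID => extractor class.
    REGISTRY = { 1 => HomeDepot }.freeze

    # Returns an extractor instance, falling back to Generic for unknown catalogs.
    def self.for(catalog)
      REGISTRY.fetch(catalog.id, Generic).new
    end
  end
end

extractor = FactorySketch::Factory.for(FactorySketch::Catalog.new(1))
puts extractor.class # FactorySketch::HomeDepot
```

Keeping the mapping in one place means callers such as PriceChecker never branch on the retailer themselves.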
Oxylabs Integration
Two integration methods depending on use case:
| Method | Use Case | Behavior |
|---|---|---|
| Realtime | Single "Probe Now" requests | Synchronous, blocks until result |
| Push-Pull | Daily batch processing | Async with webhook callback |
Webhook Callback (Production)
POST https://api.warmlyyours.com/v1/oxylabs/results?token=<jwt>
Authentication via time-limited JWT token (24 hours):
- Generated by Retailer::CallbackTokenService
- Embedded in callback URL when submitting batch jobs
- Validated by Api::V1::Oxylabs::WebhooksController
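The idea behind a time-limited signed token can be sketched with the Ruby standard library alone. This is not the code of Retailer::CallbackTokenService: the module name `CallbackTokenSketch`, the single `exp` claim, and the hand-rolled HS256 encoding are assumptions made so the example runs without a JWT library, and `secret` stands in for the application's signing key.

```ruby
require 'json'
require 'openssl'

module CallbackTokenSketch
  TTL_SECONDS = 24 * 60 * 60 # 24-hour lifetime, as described above

  module_function

  # Base64url without padding, as used by JWT.
  def b64url(data)
    [data].pack('m0').tr('+/', '-_').delete('=')
  end

  def b64url_decode(str)
    padded = str + '=' * ((4 - str.length % 4) % 4)
    padded.tr('-_', '+/').unpack1('m0')
  end

  def sign(secret, signing_input)
    OpenSSL::HMAC.digest('SHA256', secret, signing_input)
  end

  # Builds header.claims.signature with an expiry claim.
  def generate(secret, now: Time.now)
    header = b64url(JSON.generate(alg: 'HS256', typ: 'JWT'))
    claims = b64url(JSON.generate(exp: now.to_i + TTL_SECONDS))
    signing_input = "#{header}.#{claims}"
    "#{signing_input}.#{b64url(sign(secret, signing_input))}"
  end

  # True only if the signature matches and the token has not expired.
  def valid?(token, secret, now: Time.now)
    header, claims, signature = token.to_s.split('.')
    return false unless header && claims && signature

    expected = b64url(sign(secret, "#{header}.#{claims}"))
    return false unless OpenSSL.secure_compare(expected, signature)

    JSON.parse(b64url_decode(claims)).fetch('exp', 0) > now.to_i
  rescue JSON::ParserError, ArgumentError
    false
  end
end
```

A controller would reject the request with 401 Unauthorized whenever `valid?` returns false.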
Development vs Production
| Environment | Batch Method | Reason |
|---|---|---|
| Development | Polling | Local endpoints not reachable from Oxylabs |
| Production | Webhook callback | More efficient, no polling overhead |
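The polling fallback used in development can be sketched as a bounded loop. Everything here is illustrative: `FakeOxylabsApi` is a stub standing in for Retailer::OxylabsApi, the status strings 'done' and 'pending' are assumptions about what `job_status` returns, and the real `poll_for_results` may take different arguments.

```ruby
# Stub client: reports 'pending' until it has been asked ready_after times.
class FakeOxylabsApi
  attr_reader :calls

  def initialize(ready_after:)
    @ready_after = ready_after
    @calls = 0
  end

  def job_status(_job_id)
    @calls += 1
    @calls >= @ready_after ? 'done' : 'pending'
  end

  def job_results(job_id)
    { 'job_id' => job_id, 'content' => '<html></html>' }
  end
end

# Polls until the job is done or max_attempts is reached; returns nil on timeout.
def poll_for_results(api, job_id, interval: 0.0, max_attempts: 10)
  max_attempts.times do
    return api.job_results(job_id) if api.job_status(job_id) == 'done'

    sleep(interval)
  end
  nil
end
```

The attempt limit keeps a stuck job from blocking the worker indefinitely, which is the main cost of polling compared with the webhook callback.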
Database Schema
catalog_item_retailer_probes
| Column | Type | Description |
|---|---|---|
| catalog_item_id | bigint | Foreign key to catalog_items |
| status | string | success, failed, not_found, pending |
| url | string | URL that was checked |
| price | decimal | Current/sale price extracted |
| regular_price | decimal | Original/was price (if on sale) |
| currency | string | Currency code (USD, CAD) |
| product_available | boolean | Availability status |
| page_accessible | boolean | Whether page loaded successfully |
| error_message | text | Error details if failed |
| raw_title | string | Product title from page |
| scraper_source | string | amazon, home_depot, wayfair, etc. |
| geo_location | string | ZIP code used for check |
| created_at | timestamp | When check was performed |
catalogs (new columns)
| Column | Type | Default | Description |
|---|---|---|---|
| retailer_type | integer | 0 | 0=marketplace, 1=vendor |
| map_percentage | decimal | 0.80 | MAP as percentage of MSRP |
| external_price_check_enabled | boolean | false | Enable retailer probes |
catalog_items
| Column | Description |
|---|---|
| retail_price | Stores last successfully pulled retailer price |
| url | Stored product URL (discovered or manual) |
Services & Classes
Core Services
| Service | Purpose |
|---|---|
| Retailer::OxylabsApi | Pure HTTP client for Oxylabs API |
| Retailer::PriceChecker | Single-item price checks (realtime) |
| Retailer::BatchPriceChecker | Batch processing (push-pull) |
| Retailer::UrlConstructor | Builds product URLs per retailer |
| Retailer::CallbackTokenService | JWT token generation/validation |
| Retailer::WebhookResultProcessor | Processes webhook payloads |
Extractor Classes
| Extractor | Key Features |
|---|---|
| Retailer::Extractors::Base | Shared Nokogiri parsing, JSON-LD extraction, validate_product_identity |
| Retailer::Extractors::Factory | Returns correct extractor for catalog ID |
| Retailer::Extractors::Amazon | Handles JSON from amazon_product, ASIN validation override |
| Retailer::Extractors::HomeDepot | data-automation + JSON-LD |
| Retailer::Extractors::Costco | aria-labels + JSON-LD |
| Retailer::Extractors::Wayfair | data-test-id selectors, URL discovery |
| Retailer::Extractors::Walmart | __NEXT_DATA__ parsing + JSON-LD fallback |
| Retailer::Extractors::Lowes | JSON-LD + CSS selectors |
| Retailer::Extractors::Rona | Browser automation + URL discovery |
| Retailer::Extractors::BuildCom | JSON-LD extraction, search page URL discovery |
| Retailer::Extractors::CanadianTire | JSON-LD + CSS selectors |
| Retailer::Extractors::Houzz | JSON-LD extraction |
| Retailer::Extractors::BestbuyCanada | JSON-LD extraction |
| Retailer::Extractors::Generic | Fallback for unknown retailers |
Product Identity Validation
All extractors inherit validate_product_identity from the Base class. It prevents false positives when a retailer redirects the stored URL to a different product:
# In Retailer::Extractors::Base
def validate_product_identity(check, content, catalog_item)
  identifiers = collect_product_identifiers(catalog_item)
  # Checks SKU, UPC, third_party_part_number, third_party_sku, parent_sku
  # Returns false if none found in page content or URL
end
Amazon overrides this for direct ASIN comparison:
# In Retailer::Extractors::Amazon
def validate_product_identity(check, content, catalog_item)
  if content.is_a?(Hash) && content['asin'].present?
    return content['asin'] == catalog_item.amazon_asin
  end
  super # Fall back to base class for HTML content
end
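The base-class check described in the comments above can be sketched as a substring search over the page content and URL. This is an approximation under assumptions: `Item` is a stand-in struct for a catalog item, `product_identity_match?` is a hypothetical name, matching is case-insensitive here, and the identifier values in the usage lines are made up.

```ruby
# Stand-in for a catalog item carrying the identifier fields named above.
Item = Struct.new(:sku, :upc, :third_party_part_number, :third_party_sku,
                  :parent_sku, keyword_init: true)

# Gathers every non-blank identifier as a string.
def collect_product_identifiers(item)
  item.to_h.values.compact.map { |value| value.to_s.strip }.reject(&:empty?).uniq
end

# True if at least one identifier appears in the page content or the URL.
def product_identity_match?(item, content, url)
  haystack = "#{content} #{url}".downcase
  collect_product_identifiers(item).any? { |id| haystack.include?(id.downcase) }
end

item = Item.new(sku: 'SKU-123', upc: '000000000000')
puts product_identity_match?(item, '<h1>Widget sku-123</h1>', 'https://example.com/p/1')
puts product_identity_match?(item, '<h1>Other product</h1>', 'https://example.com/p/2')
```

An item with no identifiers at all never matches in this sketch, which mirrors the "returns false if none found" rule.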
Workers
| Worker | Schedule | Purpose |
|---|---|---|
| RetailerProbeWorker | Daily | Batch checks all enabled catalogs |
| OxylabsResultWorker | On webhook | Processes individual webhook results |
Configuration
Oxylabs credentials are stored in Rails credentials:
Heatwave::Configuration.fetch(:oxylabs, :api_username)
Heatwave::Configuration.fetch(:oxylabs, :api_password)
Catalog IDs defined in CatalogConstants:
CatalogConstants::HOME_DEPOT_USA # => 1
CatalogConstants::AMAZON_SELLER_USA # => 5
CatalogConstants::RONA_CA # => 22
# etc.
Usage Examples
Trigger Manual Check
# Via worker (recommended)
RetailerProbeWorker.perform_async(catalog_item_id: 12345)
# Via service directly
checker = Retailer::PriceChecker.new
result = checker.check(CatalogItem.find(12345))
Check Entire Catalog
RetailerProbeWorker.perform_async(catalog_id: 18)
Check All Retailers
RetailerProbeWorker.perform_async
Find MAP Violations
# Via scope
ViewProductCatalog.vendor_catalogs.map_violations
# Via search
search = ProductCatalogSearch.create!(
  query_params: { map_violation_eq: true, catalog_item_state_in: ['active'] }
)
Rona URL Discovery
Rona requires URL discovery due to JS-rendered search results:
# Discover and save URLs for all Rona items without URLs
Retailer::Extractors::Rona.seed_catalog_urls
# Or via rake task
bundle exec rake retailer:discover_rona_urls
Security
- Webhook endpoint requires valid JWT token
- Tokens expire after 24 hours
- Tokens are signed with Rails.application.secret_key_base
- Invalid tokens return 401 Unauthorized
Adding a New Retailer
- Create extractor class in app/services/retailer/extractors/:
class Retailer::Extractors::NewRetailer < Retailer::Extractors::Base
  def self.build_payload(url:)
    { source: 'universal', url: url, render: 'html' }
  end

  def extract(check, content)
    return unless valid_html?(content)
    check.scraper_source = source_name
    check.currency = 'USD' # or determine from catalog
    doc = parse_html(content)
    # Try JSON-LD first (most reliable)
    extract_json_ld_price(check, doc)
    # Fallback: retailer-specific selectors
    if check.price.blank?
      selectors = ['.price', '[data-price]', '[itemprop="price"]']
      extract_price_from_selectors(check, doc, selectors)
    end
    check.product_available = check_availability(content)
    check.raw_title = extract_title(doc)
  end
end
- Add to Factory in app/services/retailer/extractors/factory.rb:
def extractor_class(catalog_id)
  case catalog_id
  when NEW_RETAILER_CATALOG_ID
    Retailer::Extractors::NewRetailer
  # ... other cases
  end
end
- Add catalog constant in app/models/concerns/catalog_constants.rb if needed
- Add to CATALOG_RETAILER_TYPES in Retailer::PriceChecker for fetch method routing
- Add URL pattern to Retailer::UrlConstructor if retailer has predictable URL structure
- Override validate_product_identity if retailer returns data in a special format (like Amazon's JSON)
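Since a new extractor usually tries JSON-LD first, it can help to see what that step looks for. The sketch below is not the Base class helper: it swaps Nokogiri for a regular expression so it runs on the standard library alone, handles only a top-level schema.org Product node with an `offers.price` value, and the names `JSON_LD_SCRIPT` and `json_ld_price` are hypothetical.

```ruby
require 'bigdecimal'
require 'json'

# Matches <script type="application/ld+json"> blocks (regex stand-in for Nokogiri).
JSON_LD_SCRIPT = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi

def parse_json_or_nil(raw)
  JSON.parse(raw)
rescue JSON::ParserError
  nil
end

# Returns the first Product offer price found as a BigDecimal, or nil.
def json_ld_price(html)
  html.to_s.scan(JSON_LD_SCRIPT).flatten.each do |raw|
    data = parse_json_or_nil(raw)
    nodes = data.is_a?(Array) ? data : [data]
    nodes.each do |node|
      next unless node.is_a?(Hash) && node['@type'] == 'Product'

      offers = node['offers']
      offer = offers.is_a?(Array) ? offers.first : offers
      next unless offer.is_a?(Hash) && offer['price']

      return BigDecimal(offer['price'].to_s)
    end
  end
  nil
rescue ArgumentError
  nil # price was present but not numeric
end
```

When this returns nil, the extractor falls through to retailer-specific selectors, as in the `extract` method shown in the first step.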