Retailer Price Monitoring System
Overview
The Retailer Price Monitoring system automatically tracks product prices and availability across third-party retail partners (Home Depot, Costco, Wayfair, Amazon, etc.). It detects MAP (Minimum Advertised Price) violations for vendor/1P catalogs and provides visibility into retailer pricing behavior.
Accessing the Feature
Retailer Probes Tab (Per Item)
Path: Catalog Items → [Select Item] → Retailer Probes tab
URL: https://crm.warmlyyours.me:3000/en-US/catalog_items/{id}?tab=retailer_probes
Only visible for catalog items in catalogs with external_price_check_enabled: true.
MAP Violations Search
Path: Catalogs → [Select Catalog] → “MAP Violations” link
Direct Search: ProductCatalogSearch with map_violation: true filter
Features
Automatic Price Checks
Daily automated checks via RetailerProbeWorker:
| Feature | Description |
|---|---|
| Price Extraction | Captures current/sale price and original/was price |
| Availability Detection | Detects “out of stock” and “unavailable” indicators |
| URL Validation | Tracks if product pages are accessible |
| History Tracking | Stores all checks in catalog_item_retailer_probes table |
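The availability row above can be illustrated with a small sketch. It assumes detection is a case-insensitive text match on the two indicators named in the table; the constant `UNAVAILABLE_PATTERNS` and the method `product_available?` are hypothetical names, not the production implementation.

```ruby
# Hypothetical indicator list; only the two phrases named in the table are used.
UNAVAILABLE_PATTERNS = [/out of stock/i, /unavailable/i].freeze

# Returns false when any indicator appears in the page text, true otherwise.
def product_available?(page_text)
  UNAVAILABLE_PATTERNS.none? { |pattern| page_text.match?(pattern) }
end

puts product_available?('In stock. Ships in 2 days.')     # true
puts product_available?('Currently Out of Stock online')  # false
```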
Manual “Probe Now” Button
Users with the :update capability on a catalog item can trigger an immediate price check:
- Button located on the Retailer Probes tab
- Uses Oxylabs Realtime API for immediate results
- Updates the retail_price column on success
MAP Violation Detection
For vendor/1P catalogs (Home Depot, Costco, Wayfair, Lowe’s, etc.):
| Field | Description |
|---|---|
| retailer_type | :marketplace (3P) or :vendor (1P) |
| map_percentage | Default 80% of MSRP (20% max discount) |
| map_price | Calculated as msrp * map_percentage |
| map_violation | true if retail_price < map_price |
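As a worked example of the two formulas in the table, here is a minimal sketch in plain Ruby. The method names `map_price` and `map_violation?` mirror the field names, but rounding to two decimals and treating a missing retail price as "no violation" are assumptions made for illustration.

```ruby
require 'bigdecimal'
require 'bigdecimal/util'

# Default MAP percentage from the table above (80% of MSRP).
DEFAULT_MAP_PERCENTAGE = BigDecimal('0.80')

# map_price = msrp * map_percentage (rounded to cents; rounding is an assumption).
def map_price(msrp, map_percentage = DEFAULT_MAP_PERCENTAGE)
  (msrp.to_d * map_percentage).round(2)
end

# A violation means the retail price is strictly below the MAP price.
def map_violation?(retail_price, msrp, map_percentage = DEFAULT_MAP_PERCENTAGE)
  return false if retail_price.nil? # no successful probe yet (assumption)

  retail_price.to_d < map_price(msrp, map_percentage)
end

puts map_price('199.99').to_s('F')       # 159.99
puts map_violation?('149.00', '199.99')  # true
puts map_violation?('159.99', '199.99')  # false
```

A price exactly equal to the MAP price is not a violation, because the comparison in the table is strict.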
Catalog Show Page
For vendor catalogs, displays:
- Retailer Type Badge: “Vendor (1P)” or “Marketplace (3P)”
- MAP Percentage: e.g., “80% of MSRP”
- MAP Violations Link: Quick search for active items in violation
Supported Retailers
| Retailer | Extractor Class | Price Extraction |
|---|---|---|
| Amazon (all markets) | Retailer::Extractors::Amazon | Parsed JSON from amazon_product |
| Home Depot US/CA | Retailer::Extractors::HomeDepot | JSON-LD + HTML selectors |
| Costco CA | Retailer::Extractors::Costco | JSON-LD + aria-labels |
| Wayfair US/CA/DE | Retailer::Extractors::Wayfair | data-test-id attributes |
| Walmart US/CA | Retailer::Extractors::Walmart | __NEXT_DATA__ + JSON-LD |
| Lowe’s US/CA | Retailer::Extractors::Lowes | JSON-LD + CSS selectors |
| Rona/RenoDepot | Retailer::Extractors::Rona | Browser automation + parsing |
| Build.com (Ferguson) | Retailer::Extractors::BuildCom | JSON-LD + URL discovery |
| Canadian Tire | Retailer::Extractors::CanadianTire | JSON-LD + CSS selectors |
| Houzz | Retailer::Extractors::Houzz | JSON-LD extraction |
| Best Buy Canada | Retailer::Extractors::BestbuyCanada | JSON-LD extraction |
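Several rows above rely on JSON-LD. The sketch below shows one way to pull a price from a schema.org Product block using only the Ruby standard library. It is an illustration, not the extractors' code: the regex-based script lookup and the method name `json_ld_price` are assumptions, and the sample HTML is made up.

```ruby
require 'json'

# Matches <script type="application/ld+json"> blocks (kept simple for the sketch).
JSON_LD_SCRIPT = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi

# Returns [price_string, currency] from the first Product offer found, or nil.
def json_ld_price(html)
  html.scan(JSON_LD_SCRIPT).each do |(raw)|
    data = begin
      JSON.parse(raw)
    rescue JSON::ParserError
      next # skip malformed blocks
    end
    nodes = data.is_a?(Array) ? data : [data]
    nodes.each do |node|
      next unless node.is_a?(Hash) && node['@type'] == 'Product'

      offer = node['offers']
      offer = offer.first if offer.is_a?(Array)
      next unless offer.is_a?(Hash) && offer['price']

      return [offer['price'].to_s, offer['priceCurrency']]
    end
  end
  nil
end

sample = <<~HTML
  <html><head>
  <script type="application/ld+json">
  {"@type":"Product","name":"Sample Mat","offers":{"@type":"Offer","price":"149.00","priceCurrency":"USD"}}
  </script>
  </head></html>
HTML

p json_ld_price(sample)  # ["149.00", "USD"]
```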
Technical Architecture
Clean Separation of Concerns
The system follows a clean architecture with clear separation:

    ┌─────────────────────────────────────────────────────────────┐
    │                        PriceChecker                         │
    │             (Orchestration - Fetch + Delegate)              │
    │  • check(catalog_item)        • check_catalog(catalog)      │
    │  • fetch_product(...)         • process_result(...)         │
    └─────────────────────────────────────────────────────────────┘
                                   │
                                   │ Delegates to
                                   ▼
    ┌─────────────────────────────────────────────────────────────┐
    │              Extractors::Factory.for(catalog)               │
    │                 (Returns correct extractor)                 │
    └─────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                  Extractors (Per Retailer)                  │
    │            (Payload Building + Data Extraction)             │
    │                                                             │
    │  Class Methods:               Instance Methods:             │
    │  • build_payload(url:)        • extract(check, content)     │
    │  • search_payload(query:)     • validate_product_identity   │
    │  • discovery_payload          • discovered_url              │
    └─────────────────────────────────────────────────────────────┘
                                   │
                                   │ Uses
                                   ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                         OxylabsApi                          │
    │             (Pure HTTP Client - Transport Only)             │
    │  • request(payload)           • submit_job(payload)         │
    │  • job_status(id)             • job_results(id)             │
    │  • poll_for_results(id)       • submit_jobs_batch(payloads) │
    └─────────────────────────────────────────────────────────────┘

Extractor Factory Pattern
All retailers use the same code path through the factory:

    # In PriceChecker#process_result - unified for ALL retailers
    extractor = Retailer::Extractors::Factory.for(catalog_item.catalog)
    extractor.extract(check, content)

    # Validate product identity (prevents false positives from redirects)
    unless extractor.validate_product_identity(check, content, catalog_item)
      check.price = nil  # Clear extracted data on mismatch
    end

Each extractor handles its own extraction logic:

    # Get the right extractor for a catalog
    extractor = Retailer::Extractors::Factory.for(catalog)

    # Build payload (class method - knows retailer API specifics)
    payload = extractor.class.build_payload(url: product_url)

    # Make request (API client - knows HTTP)
    api = Retailer::OxylabsApi.new
    result = api.request(payload)

    # Extract data (instance method - knows HTML structure)
    extractor.extract(check, Retailer::OxylabsApi.html_content(result))

Oxylabs Integration
Two integration methods depending on use case:
| Method | Use Case | Behavior |
|---|---|---|
| Realtime | Single “Probe Now” requests | Synchronous, blocks until result |
| Push-Pull | Daily batch processing | Async with webhook callback |
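When no webhook is reachable, the Push-Pull flow reduces to submitting a job and polling until it finishes. The sketch below shows that loop against a stand-in client so it runs offline. The method names `job_status`, `job_results` and `poll_for_results` echo the OxylabsApi box in the diagram, but their signatures, the status strings and the return shapes here are assumptions.

```ruby
# Stand-in for the HTTP client; status strings and payload shape are assumptions.
class FakeJobClient
  def initialize(ready_after:)
    @calls = 0
    @ready_after = ready_after
  end

  # Reports 'pending' until it has been asked ready_after times.
  def job_status(_id)
    @calls += 1
    @calls >= @ready_after ? 'done' : 'pending'
  end

  def job_results(id)
    { 'job_id' => id, 'content' => '<html>...</html>' }
  end
end

# Polls until the job reports done, or returns nil after max_attempts.
def poll_for_results(client, job_id, max_attempts: 5, interval: 0.01)
  max_attempts.times do
    return client.job_results(job_id) if client.job_status(job_id) == 'done'

    sleep interval
  end
  nil
end

result = poll_for_results(FakeJobClient.new(ready_after: 3), 'job-1')
puts result['job_id']  # job-1
```

A bounded attempt count keeps a stuck job from blocking a worker indefinitely; the Realtime method avoids the loop entirely because the request blocks until the result is ready.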
Webhook Callback (Production)

    POST https://api.warmlyyours.com/v1/oxylabs/results?token=<jwt>

Authentication via time-limited JWT token (24 hours):
- Generated by Retailer::CallbackTokenService
- Embedded in callback URL when submitting batch jobs
- Validated by Api::V1::Oxylabs::WebhooksController
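To make the token lifecycle concrete, here is a minimal sketch of issuing and validating a 24-hour HS256 token with only the Ruby standard library. It is not the code of Retailer::CallbackTokenService: the helper names `generate_token` and `valid_token?` are hypothetical, the payload carries only an `exp` claim, and `'demo-secret'` stands in for the real signing key.

```ruby
require 'openssl'
require 'base64'
require 'json'

TOKEN_TTL = 24 * 60 * 60 # 24 hours, as stated above

def b64(data)
  Base64.urlsafe_encode64(data, padding: false)
end

def sign(input, secret)
  b64(OpenSSL::HMAC.digest('SHA256', secret, input))
end

# Builds a compact HS256 JWT whose only claim is an expiry timestamp.
def generate_token(secret, now: Time.now)
  header  = b64(JSON.generate({ alg: 'HS256', typ: 'JWT' }))
  payload = b64(JSON.generate({ exp: now.to_i + TOKEN_TTL }))
  signing_input = "#{header}.#{payload}"
  "#{signing_input}.#{sign(signing_input, secret)}"
end

# True only if the signature matches and the token has not expired.
def valid_token?(token, secret, now: Time.now)
  header, payload, signature = token.to_s.split('.')
  return false unless header && payload && signature

  expected = sign("#{header}.#{payload}", secret)
  return false unless OpenSSL.secure_compare(expected, signature)

  JSON.parse(Base64.urlsafe_decode64(payload))['exp'].to_i > now.to_i
rescue ArgumentError, JSON::ParserError
  false
end

secret = 'demo-secret' # placeholder, not a real key
token = generate_token(secret)
puts valid_token?(token, secret)                                 # true
puts valid_token?(token, 'wrong-secret')                         # false
puts valid_token?(token, secret, now: Time.now + TOKEN_TTL + 1)  # false
```

A controller using this approach would respond with 401 Unauthorized whenever the check returns false.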
Development vs Production
| Environment | Batch Method | Reason |
|---|---|---|
| Development | Polling | Local endpoints not reachable from Oxylabs |
| Production | Webhook callback | More efficient, no polling overhead |
Database Schema
catalog_item_retailer_probes
| Column | Type | Description |
|---|---|---|
| catalog_item_id | bigint | Foreign key to catalog_items |
| status | string | success, failed, not_found, pending |
| url | string | URL that was checked |
| price | decimal | Current/sale price extracted |
| regular_price | decimal | Original/was price (if on sale) |
| currency | string | Currency code (USD, CAD) |
| product_available | boolean | Availability status |
| page_accessible | boolean | Whether page loaded successfully |
| error_message | text | Error details if failed |
| raw_title | string | Product title from page |
| scraper_source | string | amazon, home_depot, wayfair, etc. |
| geo_location | string | ZIP code used for check |
| created_at | timestamp | When check was performed |
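Because every check is stored as a row, the "last successful price" can be derived from the history. The sketch below models a subset of the columns with a plain Struct; `Probe` and `last_successful_price` are hypothetical names and the sample rows are invented.

```ruby
# Minimal stand-in for a catalog_item_retailer_probes row (subset of columns).
Probe = Struct.new(:status, :price, :created_at, keyword_init: true)

# Returns the price from the most recent 'success' probe, or nil if none exist.
def last_successful_price(probes)
  latest = probes.select { |probe| probe.status == 'success' && probe.price }
                 .max_by(&:created_at)
  latest&.price
end

history = [
  Probe.new(status: 'success',   price: 189.0, created_at: Time.utc(2024, 1, 1)),
  Probe.new(status: 'success',   price: 179.0, created_at: Time.utc(2024, 1, 2)),
  Probe.new(status: 'not_found', price: nil,   created_at: Time.utc(2024, 1, 3))
]

puts last_successful_price(history)  # 179.0
```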
catalogs (new columns)
| Column | Type | Default | Description |
|---|---|---|---|
| retailer_type | integer | 0 | 0=marketplace, 1=vendor |
| map_percentage | decimal | 0.80 | MAP as percentage of MSRP |
| external_price_check_enabled | boolean | false | Enable retailer probes |
catalog_items
| Column | Description |
|---|---|
| retail_price | Stores last successfully pulled retailer price |
| url | Stored product URL (discovered or manual) |
Services & Classes
Core Services
| Service | Purpose |
|---|---|
| Retailer::OxylabsApi | Pure HTTP client for Oxylabs API |
| Retailer::PriceChecker | Single-item price checks (realtime) |
| Retailer::BatchPriceChecker | Batch processing (push-pull) |
| Retailer::UrlConstructor | Builds product URLs per retailer |
| Retailer::CallbackTokenService | JWT token generation/validation |
| Retailer::WebhookResultProcessor | Processes webhook payloads |
Extractor Classes
| Extractor | Key Features |
|---|---|
| Retailer::Extractors::Base | Shared Nokogiri parsing, JSON-LD extraction, validate_product_identity |
| Retailer::Extractors::Factory | Returns correct extractor for catalog ID |
| Retailer::Extractors::Amazon | Handles JSON from amazon_product, ASIN validation override |
| Retailer::Extractors::HomeDepot | data-automation + JSON-LD |
| Retailer::Extractors::Costco | aria-labels + JSON-LD |
| Retailer::Extractors::Wayfair | data-test-id selectors, URL discovery |
| Retailer::Extractors::Walmart | __NEXT_DATA__ parsing + JSON-LD fallback |
| Retailer::Extractors::Lowes | JSON-LD + CSS selectors |
| Retailer::Extractors::Rona | Browser automation + URL discovery |
| Retailer::Extractors::BuildCom | JSON-LD extraction, search page URL discovery |
| Retailer::Extractors::CanadianTire | JSON-LD + CSS selectors |
| Retailer::Extractors::Houzz | JSON-LD extraction |
| Retailer::Extractors::BestbuyCanada | JSON-LD extraction |
| Retailer::Extractors::Generic | Fallback for unknown retailers |
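The relationship between the Factory and the Generic fallback in the table can be sketched as follows. This is a lookup-table variant written for illustration, not the project's factory: the `Extractors` module, `REGISTRY` and `for_catalog_id` are hypothetical, and the three catalog IDs are the ones this document shows for CatalogConstants.

```ruby
# Stand-in extractor classes so the sketch runs on its own.
module Extractors
  class Base; end
  class HomeDepot < Base; end
  class Amazon < Base; end
  class Rona < Base; end
  class Generic < Base; end

  # IDs 1, 5 and 22 are the CatalogConstants examples from this document;
  # storing them in a hash is an assumption of the sketch.
  REGISTRY = {
    1 => HomeDepot,
    5 => Amazon,
    22 => Rona
  }.freeze

  # Returns an extractor instance, falling back to Generic for unknown IDs.
  def self.for_catalog_id(catalog_id)
    REGISTRY.fetch(catalog_id, Generic).new
  end
end

puts Extractors.for_catalog_id(22).class   # Extractors::Rona
puts Extractors.for_catalog_id(999).class  # Extractors::Generic
```

Keeping the fallback inside the factory means callers never need a nil check before calling extract.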
Product Identity Validation
All extractors inherit validate_product_identity from the Base class, which prevents false positives when retailers redirect to different products:

    # In Retailer::Extractors::Base
    def validate_product_identity(check, content, catalog_item)
      identifiers = collect_product_identifiers(catalog_item)
      # Checks SKU, UPC, third_party_part_number, third_party_sku, parent_sku
      # Returns false if none found in page content or URL
    end

Amazon overrides this for direct ASIN comparison:

    # In Retailer::Extractors::Amazon
    def validate_product_identity(check, content, catalog_item)
      if content.is_a?(Hash) && content['asin'].present?
        return content['asin'] == catalog_item.amazon_asin
      end
      super  # Fall back to base class for HTML content
    end

Workers
| Worker | Schedule | Purpose |
|---|---|---|
RetailerProbeWorker | Daily | Batch checks all enabled catalogs |
OxylabsResultWorker | On webhook | Processes individual webhook results |
Configuration
Oxylabs credentials stored in Rails credentials:

    Heatwave::Configuration.fetch(:oxylabs, :api_username)
    Heatwave::Configuration.fetch(:oxylabs, :api_password)

Catalog IDs defined in CatalogConstants:

    CatalogConstants::HOME_DEPOT_USA     # => 1
    CatalogConstants::AMAZON_SELLER_USA  # => 5
    CatalogConstants::RONA_CA            # => 22
    # etc.

Usage Examples
Trigger Manual Check

    # Via worker (recommended)
    RetailerProbeWorker.perform_async(catalog_item_id: 12345)

    # Via service directly
    checker = Retailer::PriceChecker.new
    result = checker.check(CatalogItem.find(12345))

Check Entire Catalog

    RetailerProbeWorker.perform_async(catalog_id: 18)

Check All Retailers

    RetailerProbeWorker.perform_async

Find MAP Violations

    # Via scope
    ViewProductCatalog.vendor_catalogs.map_violations

    # Via search
    search = ProductCatalogSearch.create!(
      query_params: {
        map_violation_eq: true,
        catalog_item_state_in: ['active']
      }
    )

Rona URL Discovery
Rona requires URL discovery due to JS-rendered search results:

    # Discover and save URLs for all Rona items without URLs
    Retailer::Extractors::Rona.seed_catalog_urls

    # Or via rake task
    bundle exec rake retailer:discover_rona_urls

Security
- Webhook endpoint requires valid JWT token
- Tokens expire after 24 hours
- Tokens are signed with Rails.application.secret_key_base
- Invalid tokens return 401 Unauthorized
Adding a New Retailer
- Create extractor class in app/services/retailer/extractors/:

      class Retailer::Extractors::NewRetailer < Retailer::Extractors::Base
        def self.build_payload(url:)
          { source: 'universal', url: url, render: 'html' }
        end

        def extract(check, content)
          return unless valid_html?(content)

          check.scraper_source = source_name
          check.currency = 'USD'  # or determine from catalog

          doc = parse_html(content)

          # Try JSON-LD first (most reliable)
          extract_json_ld_price(check, doc)

          # Fallback: retailer-specific selectors
          if check.price.blank?
            selectors = ['.price', '[data-price]', '[itemprop="price"]']
            extract_price_from_selectors(check, doc, selectors)
          end

          check.product_available = check_availability(content)
          check.raw_title = extract_title(doc)
        end
      end

- Add to Factory in app/services/retailer/extractors/factory.rb:

      def extractor_class(catalog_id)
        case catalog_id
        when NEW_RETAILER_CATALOG_ID
          Retailer::Extractors::NewRetailer
        # ... other cases
        end
      end

- Add catalog constant in app/models/concerns/catalog_constants.rb if needed
- Add to CATALOG_RETAILER_TYPES in Retailer::PriceChecker for fetch method routing
- Add URL pattern to Retailer::UrlConstructor if retailer has predictable URL structure
- Override validate_product_identity if retailer returns data in a special format (like Amazon’s JSON)
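For the last step, it may help to see the two validation styles side by side in plain Ruby. This is a simplified sketch under stated assumptions: `Item`, `identity_matches?` and `structured_identity_matches?` are hypothetical stand-ins, only SKU and UPC are checked on the HTML path, and the sample identifiers are invented.

```ruby
# Stand-in for a catalog item with a few of the identifiers named earlier.
Item = Struct.new(:sku, :upc, :amazon_asin, keyword_init: true)

# HTML path: pass if any known identifier appears in the page content or URL.
def identity_matches?(item, content, url)
  identifiers = [item.sku, item.upc].compact.reject(&:empty?)
  haystack = "#{content} #{url}".downcase
  identifiers.any? { |id| haystack.include?(id.downcase) }
end

# Structured path: compare a dedicated field directly, as the Amazon override does.
def structured_identity_matches?(item, payload)
  payload.is_a?(Hash) && !payload['asin'].to_s.empty? && payload['asin'] == item.amazon_asin
end

item = Item.new(sku: 'SKU-123', upc: '000000000000', amazon_asin: 'B000000000')

puts identity_matches?(item, '<h1>Heating Mat SKU-123</h1>', 'https://example.com/p/1')  # true
puts identity_matches?(item, '<h1>Other product</h1>', 'https://example.com/p/2')        # false
puts structured_identity_matches?(item, { 'asin' => 'B000000000' })                      # true
```

A new retailer that returns structured data would follow the second style and defer to the inherited check for everything else.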