Class: Retailer::Extractors::BestbuyCanada

Inherits:
Base
  • Object
Defined in:
app/services/retailer/extractors/bestbuy_canada.rb

Overview

Best Buy Canada data extractor.
Uses JSON-LD structured data and specific selectors.
Supports URL discovery from search results.

URL format: https://www.bestbuy.ca/en-ca/product/product-slug/sku-id
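
The URL format above places the product slug and SKU in the last two path segments. A minimal sketch of splitting such a URL with Ruby's standard library follows. The helper name `parse_product_url` is hypothetical and not part of the extractor, and the sketch assumes the slug and SKU always directly follow the `product` segment.

```ruby
require 'uri'

# Hypothetical helper (not part of Retailer::Extractors::BestbuyCanada).
# Assumes the documented layout: /en-ca/product/<product-slug>/<sku-id>
def parse_product_url(url)
  uri = URI.parse(url)
  return nil unless uri.host == 'www.bestbuy.ca'

  # Drop the empty segment produced by the leading slash.
  segments = uri.path.to_s.split('/').reject(&:empty?)
  product_index = segments.index('product')
  return nil if product_index.nil?

  # The two segments after "product" are taken as slug and SKU.
  slug, sku = segments[product_index + 1, 2]
  return nil if slug.nil? || sku.nil?

  { slug: slug, sku: sku }
end

p parse_product_url('https://www.bestbuy.ca/en-ca/product/product-slug/sku-id')
```

URLs that do not match the layout (for example a search results URL) yield `nil` rather than raising.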

Class Method Details

.build_payload(url:) ⇒ Hash

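Builds the Oxylabs payload for scraping a Best Buy Canada product page.
Uses the 'universal' source with JavaScript rendering (render: 'html'), follows redirects, and waits 2 seconds so the React front end can load.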

Parameters:

  • url (String)

    Full product URL

Returns:

  • (Hash)

    Oxylabs API payload



# File 'app/services/retailer/extractors/bestbuy_canada.rb', line 15

def self.build_payload(url:)
  {
    source: 'universal',
    url: url,
    render: 'html',
    context: [
      { key: 'follow_redirects', value: true }
    ],
    # Best Buy Canada uses React, needs time to load
    browser_instructions: [
      { type: 'wait', wait_time_s: 2 }
    ]
  }
end
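
A usage sketch, under stated assumptions: because the class lives inside the application, the method body is reproduced here as a standalone stand-in (`build_bestbuy_payload`, a hypothetical name) so the snippet runs on its own. It then serializes the hash to JSON, as one typically would before sending it to an HTTP API. How the application actually submits the payload is not shown on this page.

```ruby
require 'json'

# Stand-in copy of .build_payload so this snippet runs outside the app.
def build_bestbuy_payload(url:)
  {
    source: 'universal',
    url: url,
    render: 'html',
    context: [{ key: 'follow_redirects', value: true }],
    # Give the React front end time to render before capturing HTML.
    browser_instructions: [{ type: 'wait', wait_time_s: 2 }]
  }
end

payload = build_bestbuy_payload(
  url: 'https://www.bestbuy.ca/en-ca/product/product-slug/sku-id'
)

# Symbol keys become string keys in the JSON request body.
puts JSON.pretty_generate(payload)
```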

Instance Method Details

#catalog_base_url ⇒ Object (protected)



# File 'app/services/retailer/extractors/bestbuy_canada.rb', line 48

def catalog_base_url
  'https://www.bestbuy.ca'
end

#extract(check, content) ⇒ Object



# File 'app/services/retailer/extractors/bestbuy_canada.rb', line 30

def extract(check, content)
  return unless valid_html?(content)

  check.scraper_source = source_name
  check.currency = 'CAD'

  doc = parse_html(content)

  # Determine if this is a search results page or product page
  if search_results_page?(content)
    extract_from_search_page(check, doc)
  else
    extract_from_product_page(check, doc, content)
  end
end
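
The helpers called above (`search_results_page?`, `extract_from_search_page`, `extract_from_product_page`) are not shown on this page. To illustrate the JSON-LD approach mentioned in the overview, here is a minimal sketch that finds a schema.org `Product` block in raw HTML and reads its name and price. It uses a regular expression and the standard `json` library instead of the application's `parse_html`. The helper name `product_from_json_ld` and the sample HTML are assumptions made for illustration, not the extractor's actual implementation.

```ruby
require 'json'

# Matches the body of each <script type="application/ld+json"> element.
JSON_LD_SCRIPT = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi

# Hypothetical helper: returns { name:, price:, currency: } or nil.
def product_from_json_ld(html)
  html.scan(JSON_LD_SCRIPT).flatten.each do |raw|
    data = begin
      JSON.parse(raw)
    rescue JSON::ParserError
      next # skip malformed blocks and try the next script tag
    end

    # A JSON-LD script may hold a single object or an array of objects.
    nodes = data.is_a?(Array) ? data : [data]
    nodes.each do |node|
      next unless node.is_a?(Hash) && node['@type'] == 'Product'

      # "offers" may be a single offer or a list; take the first one.
      offers = node['offers']
      offer = offers.is_a?(Array) ? offers.first : offers
      offer = {} unless offer.is_a?(Hash)

      return {
        name: node['name'],
        price: offer['price']&.to_f,
        currency: offer['priceCurrency']
      }
    end
  end
  nil
end

sample_html = <<~HTML
  <html><head>
  <script type="application/ld+json">
  {"@type":"Product","name":"Example TV","offers":{"price":"499.99","priceCurrency":"CAD"}}
  </script>
  </head></html>
HTML

p product_from_json_ld(sample_html)
```

Returning `nil` when no `Product` block is present mirrors the early `return unless valid_html?(content)` guard in `#extract`: the caller decides what to do with a page that carries no usable data.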