Class: Retailer::Extractors::Generic

Inherits:
Base
  • Object
Defined in:
app/services/retailer/extractors/generic.rb

Overview

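Generic fallback data extractor for retailers without a dedicated extractor.
Relies on common e-commerce markup: JSON-LD structured data, schema.org itemprop attributes, and common price class patterns.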

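Class Method Summary

  • .build_payload(url:, render: true) ⇒ Hash

    Build Oxylabs payload for generic product scraping.

Instance Method Summary

  • #extract(check, content) ⇒ Object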

Class Method Details

.build_payload(url:, render: true) ⇒ Hash

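Builds the Oxylabs payload for generic product scraping.
Uses the 'universal' source. JavaScript rendering is on by default (render: 'html'); when render is false, the render key is omitted from the payload.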

Parameters:

  • url (String)

    Full product URL

  • render (Boolean) (defaults to: true)

    Whether to render JavaScript

Returns:

  • (Hash)

    Oxylabs API payload



# File 'app/services/retailer/extractors/generic.rb', line 13

def self.build_payload(url:, render: true)
  {
    source: 'universal',
    url: url,
    render: render ? 'html' : nil
  }.compact
end
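
The effect of the render flag is easiest to see by calling the method both ways. The sketch below copies the method body into a stand-in module (GenericPayloadSketch is a hypothetical name, used only so the example runs outside the application) and uses a placeholder URL.

```ruby
# Hypothetical stand-in module so the example runs on its own;
# the method body is copied from Retailer::Extractors::Generic.build_payload.
module GenericPayloadSketch
  def self.build_payload(url:, render: true)
    {
      source: 'universal',
      url: url,
      render: render ? 'html' : nil
    }.compact # drops :render when it is nil
  end
end

product_url = 'https://example.com/products/123' # placeholder URL

# Default: JavaScript rendering requested, so :render is 'html'.
rendered = GenericPayloadSketch.build_payload(url: product_url)

# render: false leaves :render out of the payload entirely.
plain = GenericPayloadSketch.build_payload(url: product_url, render: false)

puts rendered.keys.inspect
puts plain.keys.inspect
```

Because of the trailing compact, a caller never sees render: nil; the key is either 'html' or absent.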

Instance Method Details

#extract(check, content) ⇒ Object

Populates check from the scraped HTML. Returns early, leaving check untouched, when content is not valid HTML. Otherwise sets the scraper source, currency, availability, and title, and looks for a price in three places in order: JSON-LD structured data, schema.org itemprop markup, then common price class patterns. Each fallback runs only if check.price is still blank.



# File 'app/services/retailer/extractors/generic.rb', line 21

def extract(check, content)
  return unless valid_html?(content)

  check.scraper_source = source_name
  check.currency = determine_currency

  doc = parse_html(content)

  check.product_available = check_availability(content)

  # JSON-LD structured data (most reliable)
  extract_json_ld_price(check, doc)

  # Fallback: schema.org itemprop
  extract_from_itemprop(check, doc) if check.price.blank?

  # Fallback: common price class patterns
  extract_from_common_selectors(check, doc) if check.price.blank?

  # Extract title
  check.raw_title = extract_title(doc)
end
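
The price fallback chain in #extract can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: the helper names (sketch_json_ld_price, sketch_itemprop_price, sketch_extract), the SketchCheck struct, and the regex-based parsing are not the class's real helpers, which are not shown on this page and operate on a parsed document. The sketch also uses nil? where the original uses blank?, to avoid depending on ActiveSupport.

```ruby
require 'json'

# Hypothetical stand-in for the check record; only price matters here.
SketchCheck = Struct.new(:price)

# Simplified patterns for illustration only; a real extractor would use an HTML parser.
LD_JSON_SCRIPT = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi
ITEMPROP_PRICE = /itemprop=["']price["'][^>]*content=["']([^"']+)["']/i

# Step 1 (assumed shape): take offers.price from the first JSON-LD block that has one.
def sketch_json_ld_price(check, html)
  html.scan(LD_JSON_SCRIPT).flatten.each do |body|
    data = begin
      JSON.parse(body)
    rescue JSON::ParserError
      next # skip malformed JSON-LD blocks
    end
    next unless data.is_a?(Hash)

    offers = data['offers']
    offers = offers.first if offers.is_a?(Array)
    price = offers.is_a?(Hash) ? offers['price'] : nil
    next if price.nil?

    check.price = price.to_s
    break
  end
end

# Step 2 (assumed shape): read the content attribute of an itemprop="price" element.
def sketch_itemprop_price(check, html)
  match = html.match(ITEMPROP_PRICE)
  check.price = match[1] if match
end

# Each fallback runs only when the previous step found nothing.
def sketch_extract(check, html)
  sketch_json_ld_price(check, html)
  sketch_itemprop_price(check, html) if check.price.nil?
  check
end

json_ld_page = '<script type="application/ld+json">{"@type":"Product","offers":{"price":"19.99"}}</script>' \
               '<span itemprop="price" content="5.00"></span>'
itemprop_page = '<span itemprop="price" content="5.00"></span>'

puts sketch_extract(SketchCheck.new, json_ld_page).price
puts sketch_extract(SketchCheck.new, itemprop_page).price
```

When both sources are present, the JSON-LD value wins because the itemprop step is skipped once a price is set, which matches the ordering in #extract.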