Class: Retailer::Extractors::Generic
- Inherits: Base
  - Object
  - Base
  - Retailer::Extractors::Generic
- Defined in: app/services/retailer/extractors/generic.rb
Overview
Generic fallback data extractor for unknown retailers.
Uses common e-commerce patterns.
Class Method Summary
- .build_payload(url:, render: true) ⇒ Hash
  Build Oxylabs payload for generic product scraping. Uses 'universal' source with JS rendering.
Instance Method Summary
Class Method Details
.build_payload(url:, render: true) ⇒ Hash
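Builds the Oxylabs payload for generic product scraping.
Uses the 'universal' source. JS rendering is enabled by default (render: true); when render is false, the render key is omitted from the payload.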
    # File 'app/services/retailer/extractors/generic.rb', line 13

    def self.build_payload(url:, render: true)
      {
        source: 'universal',
        url: url,
        render: render ? 'html' : nil
      }.compact
    end
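As a usage sketch, the method body can be copied into a standalone method to see how the render flag shapes the payload. The URL below is a made-up example, and the top-level build_payload here is an illustrative copy, not the class method itself.

```ruby
# Standalone copy of the method body shown above, for illustration only.
def build_payload(url:, render: true)
  {
    source: 'universal',
    url: url,
    render: render ? 'html' : nil
  }.compact # compact drops the :render key when its value is nil
end

# Hypothetical product URL, used only for this example.
example_url = 'https://shop.example.com/item/1'

with_js = build_payload(url: example_url)                   # render defaults to true
without_js = build_payload(url: example_url, render: false) # no :render key at all

p with_js
p without_js
```

With the default, the payload carries render set to 'html'. With render: false, the key is absent rather than set to nil, so the request body contains only source and url.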
Instance Method Details
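#extract(check, content) ⇒ Object
Populates check with the scraper source, currency, availability, price, and raw title parsed from content. Returns early, leaving check untouched, when content is not valid HTML.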
    # File 'app/services/retailer/extractors/generic.rb', line 21

    def extract(check, content)
      return unless valid_html?(content)

      check.scraper_source = source_name
      check.currency = determine_currency

      doc = parse_html(content)

      check.product_available = check_availability(content)

      # JSON-LD structured data (most reliable)
      extract_json_ld_price(check, doc)

      # Fallback: schema.org itemprop
      extract_from_itemprop(check, doc) if check.price.blank?

      # Fallback: common price class patterns
      extract_from_common_selectors(check, doc) if check.price.blank?

      # Extract title
      check.raw_title = extract_title(doc)
    end
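The fallback order in extract (JSON-LD first, then schema.org itemprop, then common price classes) can be sketched in isolation. The helpers called by extract are not shown on this page, so everything below is an assumption: the method names, the regular expressions, and the use of raw-string matching instead of a parsed document are hypothetical stand-ins chosen to keep the example runnable with only the standard library.

```ruby
require 'json'

# Hypothetical patterns; the real helpers likely query a parsed document.
JSON_LD_RE  = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi
ITEMPROP_RE = /itemprop=["']price["'][^>]*content=["']([^"']+)["']/i
CLASS_RE    = %r{class=["'][^"']*price[^"']*["'][^>]*>\s*[^\d<]*([\d.,]+)}i

# Strategy 1: read offers.price from the first JSON-LD block that has one.
def price_from_json_ld(html)
  html.scan(JSON_LD_RE).each do |(raw)|
    begin
      data = JSON.parse(raw)
    rescue JSON::ParserError
      next # skip malformed blocks and try the next one
    end
    offers = data.is_a?(Hash) ? data['offers'] : nil
    offers = offers.first if offers.is_a?(Array)
    price = offers.is_a?(Hash) ? offers['price'] : nil
    return price.to_s unless price.nil?
  end
  nil
end

# Strategy 2: schema.org microdata, e.g. <meta itemprop="price" content="...">.
def price_from_itemprop(html)
  html[ITEMPROP_RE, 1]
end

# Strategy 3: any element whose class attribute contains "price".
def price_from_common_classes(html)
  html[CLASS_RE, 1]
end

# Try each strategy in order and stop at the first that yields a price.
def first_price(html)
  price_from_json_ld(html) ||
    price_from_itemprop(html) ||
    price_from_common_classes(html)
end

json_ld_page  = '<script type="application/ld+json">{"@type":"Product","offers":{"price":"19.99"}}</script>'
itemprop_page = '<meta itemprop="price" content="24.50">'
class_page    = '<span class="product-price">$ 9.95</span>'

puts first_price(json_ld_page)
puts first_price(itemprop_page)
puts first_price(class_page)
```

Ordering the strategies from most to least structured means a loosely matched class selector is only consulted when no structured price is present, which mirrors the "if check.price.blank?" guards in extract.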