Class: Retailer::Extractors::BuildCom
- Inherits: Base
  - Object
  - Base
  - Retailer::Extractors::BuildCom
- Defined in: app/services/retailer/extractors/build_com.rb
Overview
Build.com (Ferguson Home) data extractor.
Uses JSON-LD structured data and specific selectors.
Supports URL discovery from search results.
Search URL format: https://www.fergusonhome.com/search?term=sku
Product URL format: https://www.fergusonhome.com/brand-product-slug/sid?uid=uid
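
The search URL format above can be illustrated with a small sketch. The helper name `search_url_for` and the constant `FERGUSON_BASE_URL` are hypothetical and are not part of the documented class. The sketch only assumes that the SKU goes into the `term` query parameter, as the format line states.

```ruby
require 'uri'

# Hypothetical constant; mirrors the host shown in the URL formats above.
FERGUSON_BASE_URL = 'https://www.fergusonhome.com'

# Hypothetical helper: builds a search URL for a SKU.
# URI.encode_www_form escapes spaces and reserved characters in the SKU.
def search_url_for(sku)
  "#{FERGUSON_BASE_URL}/search?#{URI.encode_www_form(term: sku)}"
end

puts search_url_for('K-596-VS')
```

Encoding the query string keeps SKUs that contain spaces or punctuation from producing a malformed URL.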
Constant Summary
- RENDER_REQUIRED = true
  Ferguson Home product pages use server-rendered HTML (data-automation,
  JSON-LD); the search-result fallback also runs on server HTML. Strong
  candidate to flip to false in a follow-up after manual verification.
Class Method Summary
- .build_payload(url:) ⇒ Hash
  Build Oxylabs payload for Build.com product scraping. Uses 'universal' source with JS rendering.
Instance Method Summary
- #extract(check, content) ⇒ Object
- #catalog_base_url ⇒ Object (protected)
Class Method Details
.build_payload(url:) ⇒ Hash
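Builds the Oxylabs payload for Build.com product scraping.
Uses the 'universal' source with JS rendering. The render key is taken from render_value, and compact drops it from the payload when that value is nil.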
# File 'app/services/retailer/extractors/build_com.rb', line 21

    def self.build_payload(url:)
      {
        source: 'universal',
        url: url,
        render: render_value
      }.compact
    end
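
To show the effect of `compact` on the payload, here is a minimal sketch. `build_payload_sketch` is a hypothetical stand-in that takes the render value as an argument, because the implementation of `render_value` is not shown on this page. The `'html'` value and the example URL are assumptions for illustration.

```ruby
# Hypothetical stand-in for .build_payload; render is passed in explicitly
# because the real render_value helper is not shown in this listing.
def build_payload_sketch(url:, render:)
  { source: 'universal', url: url, render: render }.compact
end

# Assumed render value: the :render key is kept.
p build_payload_sketch(url: 'https://www.fergusonhome.com/example', render: 'html')

# nil render value: Hash#compact removes the :render key entirely.
p build_payload_sketch(url: 'https://www.fergusonhome.com/example', render: nil)
```

With this shape, turning rendering off only requires the helper to return nil; the payload then carries no render key rather than an explicit null.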
Instance Method Details
#catalog_base_url ⇒ Object (protected)
# File 'app/services/retailer/extractors/build_com.rb', line 47

    def catalog_base_url
      'https://www.fergusonhome.com'
    end
#extract(check, content) ⇒ Object
# File 'app/services/retailer/extractors/build_com.rb', line 29

    def extract(check, content)
      return unless valid_html?(content)

      check.scraper_source = source_name
      check.currency = 'USD'

      doc = parse_html(content)

      # Determine if this is a search results page or product page
      if search_results_page?(content)
        extract_from_search_page(check, doc)
      else
        extract_from_product_page(check, doc, content)
      end
    end
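
The product-page branch is not shown on this page, but the overview says the extractor relies on JSON-LD structured data. The following is a rough sketch of how a Product node might be read from raw HTML using only the standard library. `product_json_ld` and `JSON_LD_SCRIPT` are hypothetical names. The real extractor parses the document with `parse_html` and may work quite differently; the regex scan here is a simplification for illustration.

```ruby
require 'json'

# Hypothetical pattern: captures the body of each JSON-LD script tag.
JSON_LD_SCRIPT = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi

# Hypothetical helper: returns the first JSON-LD hash whose @type is
# "Product", or nil when none is found.
def product_json_ld(html)
  html.scan(JSON_LD_SCRIPT).flatten.each do |raw|
    data = JSON.parse(raw)
    # A script tag may hold a single object or an array of objects.
    candidates = data.is_a?(Array) ? data : [data]
    product = candidates.find { |node| node.is_a?(Hash) && node['@type'] == 'Product' }
    return product if product
  rescue JSON::ParserError
    next # skip malformed blocks instead of failing the whole extraction
  end
  nil
end
```

Returning nil when no Product node exists lets the caller fall back to selector-based extraction, which the overview mentions as the second data source.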