Class: SiteMap
- Inherits: ApplicationRecord
  - Object
  - ActiveRecord::Base
  - ApplicationRecord
  - SiteMap
- Extended by: FriendlyId
- Includes: Models::Embeddable, PgSearch::Model
- Defined in: app/models/site_map.rb
Overview
== Schema Information
Table name: site_maps
Database name: primary
id :integer not null, primary key
category :string
change_frequency :string default("monthly")
extracted_at :datetime
extracted_content :text
extracted_title :string
google_coverage_state :string
google_inspected_at :datetime
google_last_crawled_at :datetime
hide :boolean default(FALSE), not null
image_properties :jsonb
last_mod :datetime
last_status :string
last_status_datetime :datetime
legacy_url :string
locale :string
path :string
preserve :boolean default(FALSE), not null
priority :decimal(2, 1) default(0.5)
rendered_schema :jsonb
rendered_schema_at :datetime
resource_type :string
seo_clicks :integer
seo_keywords_count :integer
seo_report :jsonb
seo_synced_at :datetime
seo_top_keyword :string
seo_top_position :integer
seo_traffic :integer
seo_traffic_value :integer
state :enum default("active"), not null
target_query :string
visit_count_30d :integer
created_at :datetime not null
updated_at :datetime not null
resource_id :integer
Indexes
index_site_maps_on_category (category)
index_site_maps_on_extracted_at (extracted_at)
index_site_maps_on_legacy_url (legacy_url) UNIQUE
index_site_maps_on_locale_and_path (locale,path) UNIQUE
index_site_maps_on_path (path)
index_site_maps_on_path_trigram (path) USING gin
index_site_maps_on_rendered_schema (rendered_schema) USING gin
index_site_maps_on_resource_type_and_resource_id (resource_type,resource_id)
index_site_maps_on_seo_synced_at (seo_synced_at)
index_site_maps_on_seo_traffic (seo_traffic)
index_site_maps_on_state (state)
locale_category (locale,category)
Constant Summary
- CHANGE_FREQUENCIES =
Change frequencies.
%w[always hourly daily weekly monthly yearly never].freeze
- EMBEDDABLE_CATEGORIES =
Categories that have extracted content for embedding
%w[static_page].freeze
- LOCALES =
Valid locales for site maps - must match LocaleUtility::SITE_LOCALES
LocaleUtility::SITE_LOCALES.map(&:to_s).freeze
- STALE_ANALYSIS_SQL =
Scopes for filtering by analysis freshness (index filter). Stale = has analysis but page recrawled after it ran.
<<~SQL.squish.freeze
  seo_report->>'analyzed_at' IS NOT NULL AND (
    (rendered_schema_at IS NOT NULL AND rendered_schema_at > (seo_report->>'analyzed_at')::timestamptz)
    OR (extracted_at IS NOT NULL AND extracted_at > (seo_report->>'analyzed_at')::timestamptz)
  )
SQL
Constants included from Models::Embeddable
Models::Embeddable::MAX_CONTENT_LENGTH
Constants included from Schedulable
Schedulable::SIMPLE_FORM_OPTIONS
Instance Attribute Summary
- #last_mod ⇒ Object readonly
- #locale ⇒ Object readonly
- #path ⇒ Object readonly
Belongs to
Has many
- #data_points ⇒ ActiveRecord::Relation<SiteMapDataPoint>
- #inbound_links ⇒ ActiveRecord::Relation<SiteMapLink>
-
#outbound_links ⇒ ActiveRecord::Relation<SiteMapLink>
Internal link graph.
-
#path_histories ⇒ ActiveRecord::Relation<SiteMapPathHistory>
Prior URLs this page was served at, for self-healing 301s after a rename.
-
#recommendations ⇒ ActiveRecord::Relation<SiteMapRecommendation>
SEO recommendations extracted from seo_report.
-
#seo_page_keywords ⇒ ActiveRecord::Relation<SeoPageKeyword>
SEO metrics associations.
Methods included from Models::Embeddable
Class Method Summary
-
.by_traffic ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps ordered by seo_traffic, highest first.
-
.cacheable ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps outside the publication and video categories.
- .categories_for_select ⇒ Object
-
.embeddable ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps in EMBEDDABLE_CATEGORIES that have extracted content.
-
.extract_path_from_url(full_url) ⇒ String
Extract path from a full URL, stripping domain and locale.
-
.for_path(path, locale = nil) ⇒ SiteMap?
Resolve a locale-less path — the page's CURRENT path or any PRIOR path it was renamed away from — to its SiteMap.
-
.high_traffic ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose seo_traffic is at or above a threshold (default 100).
-
.needs_extraction ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps in EMBEDDABLE_CATEGORIES that have never been extracted (extracted_at is nil).
-
.needs_seo_sync ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps that were never SEO-synced or were last synced at least 7 days ago.
-
.page_friendly_id_for(category:, path:, resource_type: nil, resource_id: nil) ⇒ String
Deterministic page_friendly_id derivation, shared by FriendlyId (on create) and Sitemap::SitemapGenerator (which computes it directly for its bulk upsert).
-
.purge_edge_cache_by_pattern(pattern, async: false, delay: nil) ⇒ Object
Purges the edge cache for all URLs matching the given pattern, e.g. "/floor-heating/" for everything floor-heating related.
-
.schema_stale ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered schema was never captured or was captured on or before a cutoff (default 30 days ago).
-
.seo_analysis_fresh ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps that have an SEO analysis which is not stale per STALE_ANALYSIS_SQL.
-
.seo_analysis_none ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps with no SEO analysis: seo_report is nil, empty, or lacks analyzed_at.
-
.seo_analysis_stale ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose SEO analysis is stale: the page was recrawled after the analysis ran.
-
.with_extracted_content ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose extracted_content is not nil.
-
.with_rendered_schema ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered_schema is not nil.
-
.with_schema_type ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered_schema contains an entry with the given @type.
-
.with_seo_data ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps that have been SEO-synced (seo_synced_at is not nil).
-
.without_rendered_schema ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered_schema is nil.
Instance Method Summary
-
#cannibalization_risks ⇒ Array<Hash>
Check for keyword cannibalization.
-
#content_for_embedding(_content_type = :primary) ⇒ String?
Generates content for semantic search embedding. For static pages, uses the content extracted by the crawler; for other categories, returns nil because the linked resource handles its own embedding.
-
#embedding_content_changed? ⇒ Boolean
Check if embedding content has changed.
-
#extract_content!(force: false) ⇒ Object
Crawl this page and extract content.
-
#has_rendered_schema_type?(type) ⇒ Boolean
Check if the rendered page has a specific schema type.
-
#historical_urls ⇒ Array<String>
Get all historical URLs for this resource (for matching external data). Uses FriendlyId slug history when available.
-
#locale_for_embedding ⇒ String
Locale for embedding: uses the SiteMap's locale column. Preserves the full locale (en-US, en-CA) to allow region-specific search.
-
#production_url ⇒ String
Always returns the production URL regardless of environment.
- #purge_edge_cache(async: true, extra_urls: []) ⇒ Object
-
#ranking_keywords_count ⇒ Integer
Get count of ranking keywords (position 1-100). Uses actual records rather than a cached counter for accuracy.
-
#rendered_faq_count ⇒ Integer
Count of FAQ questions in rendered FAQPage schema.
-
#rendered_schema_types ⇒ Array<String>
Schema types found on the rendered page (e.g., ["FAQPage", "Article", "BreadcrumbList"]).
-
#rendered_schemas_by_type(type) ⇒ Array<Hash>
Get schemas of a specific type from the rendered page.
-
#section_cache_urls ⇒ Array<String>
Product detail pages defer their tab content (documents, reviews, faq, …) to lazy Turbo-Frame endpoints at /<locale>/products/code/<sku>/section/<section> (Www::ProductsController#section).
-
#seo_analysis_stale? ⇒ Boolean
Whether the AI analysis is stale (page was recrawled after analysis ran).
-
#seo_avg_position ⇒ BigDecimal?
GSC average search position (28-day window).
-
#seo_ctr ⇒ BigDecimal?
GSC click-through rate (28-day window).
-
#seo_data? ⇒ Boolean
Check if SEO data has been synced or analyzed.
-
#seo_impressions ⇒ Integer?
GSC impressions (28-day window).
-
#seo_report_analyzed_at ⇒ Time?
Parsed analyzed_at from seo_report (ISO 8601 string).
-
#seo_traffic_trend ⇒ Symbol
Get traffic trend based on historical data.
-
#sibling_site_maps ⇒ ActiveRecord::Relation<SiteMap>
Other locales for the same path (e.g. en-CA when this is en-US).
-
#skip_cache_warmup? ⇒ Boolean
Whether this page skips cache warmup (publication and video pages do not need it).
-
#slug_candidates ⇒ Object
FriendlyId base for page_friendly_id.
-
#suggested_keyword_target_for(keyword, search_volume: nil) ⇒ String
Intent-based suggestion: compare keyword intent with this page's intent and with other pages that rank for the same keyword.
-
#to_param ⇒ Object
Keep CRM/admin routes on the unambiguous primary key.
-
#top_keywords(limit: 10) ⇒ Array<SeoPageKeyword>
Get top keywords for this page.
-
#url ⇒ String
Constructs the full URL from WEB_URL + locale + path. Example: locale='en-US', path='/products/foo' => 'https://www.warmlyyours.com/en-US/products/foo'.
-
#url=(full_url) ⇒ Object
Alias for backward compatibility - some code may use url=.
-
#url_path ⇒ String
Extract URL path for matching with external data. Now simply returns the stored path.
-
#visit_count_90d ⇒ Integer?
Visit count over 90-day window.
- #warm_cache ⇒ Object
Methods included from Models::Embeddable
embeddable_content_types, #embeddable_locales, #embedding_content_hash, embedding_partition_class, #embedding_stale?, #embedding_type_name, #embedding_vector, #find_content_embedding, #find_similar, #generate_all_embeddings!, #generate_chunked_embeddings!, #generate_embedding!, #has_embedding?, #needs_chunking?, regenerate_all_embeddings, semantic_search
Methods inherited from ApplicationRecord
ransackable_associations, ransackable_attributes, ransackable_scopes, ransortable_attributes, #to_relation
Methods included from Schedulable
Methods included from Models::AfterCommittable
Methods included from Models::EventPublishable
Instance Attribute Details
#last_mod ⇒ Object (readonly)
# File 'app/models/site_map.rb', line 121
validates :path, :locale, :last_mod, presence: true
#locale ⇒ Object (readonly)
# File 'app/models/site_map.rb', line 121
validates :path, :locale, :last_mod, presence: true
#path ⇒ Object (readonly)
# File 'app/models/site_map.rb', line 121
validates :path, :locale, :last_mod, presence: true
Class Method Details
.by_traffic ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps ordered by seo_traffic, highest first. Active Record Scope
# File 'app/models/site_map.rb', line 136
scope :by_traffic, -> { order(seo_traffic: :desc) }
.cacheable ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps outside the publication and video categories. Active Record Scope
# File 'app/models/site_map.rb', line 128
scope :cacheable, -> { active.where.not(category: %w[publication video]) }
.categories_for_select ⇒ Object
# File 'app/models/site_map.rb', line 321
def self.categories_for_select
  %w[faqs floor_plan form post product product_line publication showcase static_page support tech_article towel_warmer_filter video]
end
.embeddable ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps in EMBEDDABLE_CATEGORIES that have extracted content. Active Record Scope
# File 'app/models/site_map.rb', line 131
scope :embeddable, -> { active.where(category: EMBEDDABLE_CATEGORIES).with_extracted_content }
.extract_path_from_url(full_url) ⇒ String
Extract path from a full URL, stripping domain and locale
# File 'app/models/site_map.rb', line 309
def self.extract_path_from_url(full_url)
  return '/' if full_url.blank?

  # Remove protocol and domain
  uri_path = URI.parse(full_url).path
  # Remove locale prefix (e.g., /en-US/)
  uri_path.sub(%r{^/[a-z]{2}-[A-Z]{2}}, '')
          .then { |p| p.presence || '/' }
rescue URI::InvalidURIError
  '/'
end
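The behavior can be tried outside Rails with a plain-Ruby approximation. This is a sketch, not the model's code: the nil/empty checks stand in for ActiveSupport's blank? and presence, and the sample URLs are illustrative.

```ruby
require 'uri'

# Plain-Ruby approximation of SiteMap.extract_path_from_url.
# nil/empty checks replace ActiveSupport's blank?/presence (an assumption of this sketch).
def extract_path_from_url(full_url)
  return '/' if full_url.nil? || full_url.strip.empty?

  uri_path = URI.parse(full_url).path.to_s
  # Strip a leading locale segment such as /en-US
  stripped = uri_path.sub(%r{^/[a-z]{2}-[A-Z]{2}}, '')
  stripped.empty? ? '/' : stripped
rescue URI::InvalidURIError
  '/'
end

puts extract_path_from_url('https://www.warmlyyours.com/en-US/products/foo')
puts extract_path_from_url('https://www.warmlyyours.com/en-CA')
puts extract_path_from_url('not a url')
```

A URL that consists only of the domain and locale, and any string URI.parse rejects, both collapse to the root path.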
.for_path(path, locale = nil) ⇒ SiteMap?
Resolve a locale-less path — the page's CURRENT path or any PRIOR path it was
renamed away from — to its SiteMap. SEO syncs (GSC / Ahrefs / GA4 / Ads /
Cloudflare) call this so metrics still attributed to an old URL after a rename
(the external systems lag the 301 for weeks) land on the live page instead of
being dropped. Current path wins; SiteMapPathHistory is the fallback.
# File 'app/models/site_map.rb', line 237
def self.for_path(path, locale = nil)
  return nil if path.blank? || path == '/'

  if locale.present?
    current = find_by(path: path, locale: locale)
    return current if current

    historical = SiteMapPathHistory.find_by(path: path, locale: locale)&.site_map
    return historical if historical
  end
  # Locale-less fallback: a path can repeat across locales, so order for a
  # deterministic result (lowest id wins, matching the Cloudflare lookup).
  where(path: path).order(:id).first ||
    SiteMapPathHistory.where(path: path).order(:id).first&.site_map
end
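The lookup order (current path, then path history, then the locale-less fallback) can be sketched with in-memory records. Page, History and resolve_path are hypothetical stand-ins for the two tables and the class method; the sample paths are made up.

```ruby
# In-memory stand-ins for site_maps and the path-history table (hypothetical).
Page    = Struct.new(:id, :path, :locale)
History = Struct.new(:id, :path, :locale, :page)

def resolve_path(pages, histories, path, locale = nil)
  return nil if path.nil? || path.empty? || path == '/'

  if locale
    # 1. The page currently served at this path wins.
    current = pages.find { |p| p.path == path && p.locale == locale }
    return current if current

    # 2. Otherwise a prior path recorded for the same locale.
    historical = histories.find { |h| h.path == path && h.locale == locale }
    return historical.page if historical
  end

  # 3. Locale-less fallback: lowest id wins so the result is deterministic.
  pages.select { |p| p.path == path }.min_by(&:id) ||
    histories.select { |h| h.path == path }.min_by(&:id)&.page
end

live    = Page.new(7, '/products/new-name', 'en-US')
renamed = History.new(1, '/products/old-name', 'en-US', live)

puts resolve_path([live], [renamed], '/products/old-name', 'en-US').path
```

In this sketch a metric reported against the old path still resolves to the live page, which is the case the description calls out for SEO syncs.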
.high_traffic ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose seo_traffic is at or above a threshold (default 100). Active Record Scope
# File 'app/models/site_map.rb', line 137
scope :high_traffic, ->(threshold = 100) { where(seo_traffic: threshold..) }
.needs_extraction ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps in EMBEDDABLE_CATEGORIES that have never been extracted (extracted_at is nil). Active Record Scope
# File 'app/models/site_map.rb', line 130
scope :needs_extraction, -> { active.where(category: EMBEDDABLE_CATEGORIES, extracted_at: nil) }
.needs_seo_sync ⇒ ActiveRecord::Relation<SiteMap>
Active SiteMaps that were never SEO-synced or were last synced at least 7 days ago. Active Record Scope
# File 'app/models/site_map.rb', line 135
scope :needs_seo_sync, -> { active.where(seo_synced_at: nil).or(active.where(seo_synced_at: ..7.days.ago)) }
.page_friendly_id_for(category:, path:, resource_type: nil, resource_id: nil) ⇒ String
Deterministic page_friendly_id derivation, shared by FriendlyId (on create)
and Sitemap::SitemapGenerator (which computes it directly for its bulk
upsert). Keep the two in lockstep — both must produce the same value for the
same page or the generator can't match an existing row to update in place.
# File 'app/models/site_map.rb', line 220
def self.page_friendly_id_for(category:, path:, resource_type: nil, resource_id: nil)
  if resource_id.present? && resource_type.present?
    "#{category}-#{resource_type.to_s.underscore.dasherize}-#{resource_id}"
  else
    "#{category}-#{path.to_s.delete_prefix('/').parameterize}"
  end
end
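To see the two branches produce values without loading ActiveSupport, the sketch below substitutes simplified helpers for underscore/dasherize and parameterize. dasherized_type and simple_parameterize are hypothetical approximations (they ignore namespaced class names and non-ASCII input), and the resource id is made up.

```ruby
# Simplified stand-in for ActiveSupport's underscore + dasherize (assumption).
def dasherized_type(resource_type)
  resource_type.to_s.gsub(/([a-z\d])([A-Z])/, '\1-\2').downcase
end

# Simplified stand-in for ActiveSupport's parameterize (assumption).
def simple_parameterize(text)
  text.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

def page_friendly_id_for(category:, path:, resource_type: nil, resource_id: nil)
  if resource_id && resource_type
    # Resource-backed page: anchored on the immutable id, not the path
    "#{category}-#{dasherized_type(resource_type)}-#{resource_id}"
  else
    # Resource-less page: falls back to the path
    "#{category}-#{simple_parameterize(path.to_s.delete_prefix('/'))}"
  end
end

puts page_friendly_id_for(category: 'product', path: '/products/foo',
                          resource_type: 'CatalogItem', resource_id: 42)
puts page_friendly_id_for(category: 'static_page', path: '/company/about-us')
```

Because the resource-backed branch never reads the path, renaming the page's URL leaves the identifier unchanged in this sketch.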
.purge_edge_cache_by_pattern(pattern, async: false, delay: nil) ⇒ Object
Purges the edge cache for all URLs matching the given pattern,
e.g. "/floor-heating/" for everything floor-heating related.
The pattern can contain * as a wildcard. Converts * to % for the SQL pattern matching.
Fetches all matching URLs from the database and purges them from the edge cache.
Can run asynchronously by queueing jobs, with optional delay.
Logs any errors to AppSignal.
# File 'app/models/site_map.rb', line 331
def self.purge_edge_cache_by_pattern(pattern, async: false, delay: nil)
  return :disabled unless Cache::EdgeCacheUtility.edge_cache_enabled?

  sql_pattern = pattern.tr('*', '%')
  # Match against path column instead of full URL
  records = where(SiteMap[:path].matches(sql_pattern))
  urls = records.map(&:url)
  begin
    if async
      if delay
        EdgeCacheWorker.perform_in(delay, 'urls' => urls)
      else
        EdgeCacheWorker.perform_async('urls' => urls)
      end
    else
      Cache::EdgeCacheUtility.instance.purge_url(urls)
    end
  rescue StandardError => e
    ErrorReporting.error e, url: urls.first
  end
end
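The wildcard conversion can be explored without a database. In the sketch below, to_sql_pattern mirrors the tr call above, while like_regexp is a hypothetical helper that emulates SQL LIKE in Ruby; it ignores case sensitivity and escape characters, and the sample paths are made up.

```ruby
# Convert the '*' wildcard to SQL LIKE syntax, as the method does.
def to_sql_pattern(pattern)
  pattern.tr('*', '%')
end

# Hypothetical helper: emulate LIKE matching so patterns can be tried locally.
def like_regexp(sql_pattern)
  body = sql_pattern.split(/([%_])/).map do |part|
    case part
    when '%' then '.*'   # any run of characters
    when '_' then '.'    # exactly one character
    else Regexp.escape(part)
    end
  end.join
  Regexp.new("\\A#{body}\\z")
end

paths = ['/floor-heating/electric', '/floor-heating/', '/snow-melting/mats']

puts paths.grep(like_regexp(to_sql_pattern('/floor-heating/*'))).inspect
puts paths.grep(like_regexp(to_sql_pattern('/floor-heating/'))).inspect
```

Under plain LIKE semantics a pattern without a wildcard matches only a path equal to the pattern, so purging a whole section would likely need a trailing or surrounding `*`.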
.schema_stale ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered schema was never captured or was captured on or before a cutoff (default 30 days ago). Active Record Scope
# File 'app/models/site_map.rb', line 157
scope :schema_stale, ->(since = 30.days.ago) { where(rendered_schema_at: nil).or(where(rendered_schema_at: ..since)) }
.seo_analysis_fresh ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps that have an SEO analysis which is not stale per STALE_ANALYSIS_SQL. Active Record Scope
# File 'app/models/site_map.rb', line 500
scope :seo_analysis_fresh, -> {
  where("seo_report->>'analyzed_at' IS NOT NULL")
    .where.not(STALE_ANALYSIS_SQL)
}
.seo_analysis_none ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps with no SEO analysis: seo_report is nil, empty, or lacks analyzed_at. Active Record Scope
# File 'app/models/site_map.rb', line 504
scope :seo_analysis_none, -> {
  where(seo_report: nil)
    .or(where(seo_report: {}))
    .or(where("seo_report->>'analyzed_at' IS NULL"))
}
.seo_analysis_stale ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose SEO analysis is stale: the page was recrawled after the analysis ran. Active Record Scope
# File 'app/models/site_map.rb', line 499
scope :seo_analysis_stale, -> { where(STALE_ANALYSIS_SQL) }
.with_extracted_content ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose extracted_content is not nil. Active Record Scope
# File 'app/models/site_map.rb', line 129
scope :with_extracted_content, -> { where.not(extracted_content: nil) }
.with_rendered_schema ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered_schema is not nil. Active Record Scope
# File 'app/models/site_map.rb', line 154
scope :with_rendered_schema, -> { where.not(rendered_schema: nil) }
.with_schema_type ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered_schema contains an entry with the given @type. Active Record Scope
# File 'app/models/site_map.rb', line 156
scope :with_schema_type, ->(type) { where("rendered_schema @> ?", [{ '@type' => type }].to_json) }
.with_seo_data ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps that have been SEO-synced (seo_synced_at is not nil). Active Record Scope
# File 'app/models/site_map.rb', line 134
scope :with_seo_data, -> { where.not(seo_synced_at: nil) }
.without_rendered_schema ⇒ ActiveRecord::Relation<SiteMap>
SiteMaps whose rendered_schema is nil. Active Record Scope
# File 'app/models/site_map.rb', line 155
scope :without_rendered_schema, -> { where(rendered_schema: nil) }
Instance Method Details
#cannibalization_risks ⇒ Array<Hash>
Check for keyword cannibalization
# File 'app/models/site_map.rb', line 566
def cannibalization_risks
  competing = []
  seo_page_keywords.at_risk.each do |kw|
    kw.competing_pages.at_risk.each do |competing_kw|
      competing << {
        keyword: kw.keyword,
        this_position: kw.position,
        competing_url: competing_kw.site_map.url,
        competing_position: competing_kw.position
      }
    end
  end
  competing.uniq { |c| [c[:keyword], c[:competing_url]] }
end
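The final uniq call keeps one entry per keyword and competing URL pair. A small sketch with made-up hashes in the same shape shows the effect; the keyword, URLs and positions are illustrative only.

```ruby
# Hypothetical risk entries in the shape the method builds.
risks = [
  { keyword: 'heated floors', this_position: 8, competing_url: 'https://example.com/en-US/a', competing_position: 5 },
  { keyword: 'heated floors', this_position: 8, competing_url: 'https://example.com/en-US/a', competing_position: 6 },
  { keyword: 'heated floors', this_position: 8, competing_url: 'https://example.com/en-US/b', competing_position: 9 }
]

# Array#uniq with a block keeps the first element for each distinct key.
deduped = risks.uniq { |c| [c[:keyword], c[:competing_url]] }

puts deduped.size
puts deduped.map { |c| c[:competing_position] }.inspect
```

When the same competing URL appears more than once for a keyword, only the first entry encountered survives, so its competing_position is the one reported.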
#content_for_embedding(_content_type = :primary) ⇒ String?
Generate content for semantic search embedding
For static pages, uses extracted content from crawler
For other categories, delegates to the linked resource
# File 'app/models/site_map.rb', line 425
def content_for_embedding(_content_type = :primary)
  case category
  when 'static_page'
    build_static_page_content
  else
    # For other categories (post, video, etc.), the resource model handles embedding
    # This prevents duplicate embeddings
    nil
  end
end
#data_points ⇒ ActiveRecord::Relation<SiteMapDataPoint>
# File 'app/models/site_map.rb', line 102
has_many :data_points, class_name: 'SiteMapDataPoint', dependent: :destroy
#embedding_content_changed? ⇒ Boolean
Check if embedding content has changed
# File 'app/models/site_map.rb', line 437
def embedding_content_changed?
  saved_change_to_extracted_content? || saved_change_to_extracted_title?
end
#extract_content!(force: false) ⇒ Object
Crawl this page and extract content
# File 'app/models/site_map.rb', line 451
def extract_content!(force: false)
  return extracted_content if extracted_content.present? && !force

  Cache::SiteCrawler.new.process(pages: SiteMap.where(id: id), extract_content: true)
  reload
  extracted_content
end
#has_rendered_schema_type?(type) ⇒ Boolean
Check if the rendered page has a specific schema type
# File 'app/models/site_map.rb', line 394
def has_rendered_schema_type?(type)
  rendered_schema_types.include?(type)
end
#historical_urls ⇒ Array<String>
Get all historical URLs for this resource (for matching external data)
Uses FriendlyId slug history when available
# File 'app/models/site_map.rb', line 634
def historical_urls
  return [] if resource.blank?
  return [] unless resource.respond_to?(:slugs)

  resource.slugs.map do |slug_record|
    url.sub(resource.slug, slug_record.slug)
  end
rescue StandardError
  []
end
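The slug substitution can be illustrated with plain strings. In this sketch historical_urls_for is a hypothetical free function taking the current URL, current slug and slug history as arguments; the URL and slugs are made up.

```ruby
# Swap the current slug for each slug in the history (hypothetical inputs).
def historical_urls_for(current_url, current_slug, slug_history)
  slug_history.map { |old_slug| current_url.sub(current_slug, old_slug) }
end

urls = historical_urls_for(
  'https://www.warmlyyours.com/en-US/posts/new-title',
  'new-title',
  %w[new-title old-title]
)
puts urls.inspect
```

String#sub replaces only the first occurrence, so if the slug text also appears earlier in the URL, the earlier occurrence is the one rewritten.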
#inbound_links ⇒ ActiveRecord::Relation<SiteMapLink>
# File 'app/models/site_map.rb', line 109
has_many :inbound_links, class_name: 'SiteMapLink', foreign_key: :to_site_map_id, dependent: :nullify, inverse_of: :to_site_map
#locale_for_embedding ⇒ String
Locale for embedding - uses the SiteMap's locale column
Preserves full locale (en-US, en-CA) to allow region-specific search
# File 'app/models/site_map.rb', line 445
def locale_for_embedding
  locale.to_s.presence || 'en'
end
#outbound_links ⇒ ActiveRecord::Relation<SiteMapLink>
Internal link graph
# File 'app/models/site_map.rb', line 108
has_many :outbound_links, class_name: 'SiteMapLink', foreign_key: :from_site_map_id, dependent: :delete_all, inverse_of: :from_site_map
#path_histories ⇒ ActiveRecord::Relation<SiteMapPathHistory>
Prior URLs this page was served at, for self-healing 301s after a rename.
# File 'app/models/site_map.rb', line 112
has_many :path_histories, class_name: 'SiteMapPathHistory', dependent: :delete_all
#production_url ⇒ String
Always returns the production URL regardless of environment.
Use this for SEO analysis, external tools, and display purposes.
# File 'app/models/site_map.rb', line 257
def production_url
  build_url('https://www.warmlyyours.com')
end
#purge_edge_cache(async: true, extra_urls: []) ⇒ Object
# File 'app/models/site_map.rb', line 360
def purge_edge_cache(async: true, extra_urls: [])
  return :disabled unless Cache::EdgeCacheUtility.edge_cache_enabled?

  urls = ([url] + extra_urls).compact.uniq
  begin
    if async
      EdgeCacheWorker.perform_async('urls' => urls)
    else
      urls.each { |url| Cache::EdgeCacheUtility.instance.purge_url(url) }
    end
  rescue StandardError => e
    ErrorReporting.error e, url: url
  end
end
#ranking_keywords_count ⇒ Integer
Get count of ranking keywords (position 1-100)
Uses actual records rather than cached counter for accuracy
# File 'app/models/site_map.rb', line 560
def ranking_keywords_count
  seo_page_keywords.ranking.count
end
#recommendations ⇒ ActiveRecord::Relation<SiteMapRecommendation>
SEO recommendations extracted from seo_report
# File 'app/models/site_map.rb', line 105
has_many :recommendations, class_name: 'SiteMapRecommendation', dependent: :destroy
#rendered_faq_count ⇒ Integer
Count of FAQ questions in rendered FAQPage schema
# File 'app/models/site_map.rb', line 409
def rendered_faq_count
  rendered_schemas_by_type('FAQPage')
    .flat_map { |s| Array(s['mainEntity']) }
    .size
end
#rendered_schema_types ⇒ Array<String>
Schema types found on the rendered page (e.g., ["FAQPage", "Article", "BreadcrumbList"])
# File 'app/models/site_map.rb', line 386
def rendered_schema_types
  return [] if rendered_schema.blank?

  rendered_schema.flat_map { |s| Array(s['@type']) }.compact.uniq
end
#rendered_schemas_by_type(type) ⇒ Array<Hash>
Get schemas of a specific type from the rendered page
# File 'app/models/site_map.rb', line 401
def rendered_schemas_by_type(type)
  return [] if rendered_schema.blank?

  rendered_schema.select { |s| Array(s['@type']).include?(type) }
end
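The three rendered-schema helpers (#rendered_schema_types, #rendered_schemas_by_type, #rendered_faq_count) operate on an array of JSON-LD objects. The sketch below runs the same expressions on a made-up rendered_schema value; the sample data is an assumption, not taken from a real page.

```ruby
# Hypothetical rendered_schema value: an array of JSON-LD objects.
rendered_schema = [
  { '@type' => 'FAQPage',
    'mainEntity' => [{ '@type' => 'Question' }, { '@type' => 'Question' }] },
  { '@type' => %w[Article NewsArticle] }, # '@type' may be a string or an array
  { '@type' => 'BreadcrumbList' }
]

# All distinct types; Array() normalizes string and array '@type' values.
types = rendered_schema.flat_map { |s| Array(s['@type']) }.compact.uniq

# Schemas of one type, then the number of FAQ questions they list.
faq_pages = rendered_schema.select { |s| Array(s['@type']).include?('FAQPage') }
faq_count = faq_pages.flat_map { |s| Array(s['mainEntity']) }.size

puts types.inspect
puts faq_count
```

One caveat when mainEntity holds a single object instead of an array: Kernel#Array turns a Hash into its key/value pairs, so the count would then reflect the number of keys rather than the number of questions.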
#resource ⇒ Resource
# File 'app/models/site_map.rb', line 98
belongs_to :resource, polymorphic: true, optional: true
#section_cache_urls ⇒ Array<String>
Product detail pages defer their tab content (documents, reviews, faq, …) to
lazy Turbo-Frame endpoints at /<locale>/products/code/<sku>/section/<section>
(Www::ProductsController#section). Cloudflare caches EACH fragment as its own
edge entry, independent of this page's url — so purging only url leaves
the fragments stale (e.g. a revised publication keeps rendering the OLD
revision in the Documents tab even though the page HTML is fresh). Enumerate
the fragment URLs so a product-page purge invalidates them too. Non-product
pages have no such fragments.
The section list is sourced from Www::ProductCatalogPresenter::LAZY_SECTIONS
(the same constant the page uses to emit the frames) so the two never drift.
# File 'app/models/site_map.rb', line 192
def section_cache_urls
  return [] unless category == 'product'

  sku = path.to_s.split('/').reject(&:blank?).last
  return [] if sku.blank?

  Www::ProductCatalogPresenter::LAZY_SECTIONS.map do |section|
    "#{WEB_URL}/#{locale}/products/code/#{sku}/section/#{section}"
  end
end
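A standalone sketch of the enumeration, under stated assumptions: SKETCH_WEB_URL reuses the host from the #url example, SKETCH_LAZY_SECTIONS is a guess based on the tabs named in the description (the real list lives in Www::ProductCatalogPresenter::LAZY_SECTIONS), and the path is illustrative.

```ruby
SKETCH_WEB_URL = 'https://www.warmlyyours.com'          # host taken from the #url example
SKETCH_LAZY_SECTIONS = %w[documents reviews faq].freeze # assumed values, not the real constant

def section_cache_urls(category:, path:, locale:)
  # Only product pages have lazy tab fragments.
  return [] unless category == 'product'

  # The SKU is the last non-empty path segment.
  sku = path.to_s.split('/').reject(&:empty?).last
  return [] if sku.nil?

  SKETCH_LAZY_SECTIONS.map do |section|
    "#{SKETCH_WEB_URL}/#{locale}/products/code/#{sku}/section/#{section}"
  end
end

puts section_cache_urls(category: 'product', path: '/products/foo', locale: 'en-US')
puts section_cache_urls(category: 'post', path: '/posts/bar', locale: 'en-US').inspect
```

These fragment URLs are the kind of value that could be passed as extra_urls to #purge_edge_cache so that a product-page purge covers them too.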
#seo_analysis_stale? ⇒ Boolean
Whether the AI analysis is stale (page was recrawled after analysis ran).
Matches Crm::SeoDashboardComponent#report_stale?
# File 'app/models/site_map.rb', line 472
def seo_analysis_stale?
  analyzed_at = seo_report_analyzed_at
  return false unless analyzed_at

  (rendered_schema_at.present? && rendered_schema_at > analyzed_at) ||
    (extracted_at.present? && extracted_at > analyzed_at)
end
#seo_avg_position ⇒ BigDecimal?
GSC average search position (28-day window)
# File 'app/models/site_map.rb', line 529
def seo_avg_position
  latest_data_point_value(:gsc_avg_position)
end
#seo_ctr ⇒ BigDecimal?
GSC click-through rate (28-day window)
# File 'app/models/site_map.rb', line 523
def seo_ctr
  latest_data_point_value(:gsc_ctr)
end
#seo_data? ⇒ Boolean
Check if SEO data has been synced or analyzed
# File 'app/models/site_map.rb', line 464
def seo_data?
  seo_synced_at.present? || seo_report.present? || data_points.exists?
end
#seo_impressions ⇒ Integer?
GSC impressions (28-day window)
# File 'app/models/site_map.rb', line 517
def seo_impressions
  latest_data_point_value(:gsc_impressions)&.to_i
end
#seo_page_keywords ⇒ ActiveRecord::Relation<SeoPageKeyword>
SEO metrics associations
# File 'app/models/site_map.rb', line 101
has_many :seo_page_keywords, dependent: :destroy
#seo_report_analyzed_at ⇒ Time?
Parsed analyzed_at from seo_report (ISO 8601 string).
# File 'app/models/site_map.rb', line 482
def seo_report_analyzed_at
  raw = seo_report&.dig('analyzed_at')
  raw.present? ? Time.zone.parse(raw) : nil
rescue ArgumentError, TypeError
  nil
end
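Parsing analyzed_at and the staleness check in #seo_analysis_stale? fit together as sketched below. This is a plain-Ruby approximation: Time.iso8601 from the standard library stands in for Time.zone.parse (it is stricter about format), and the method names and timestamps are hypothetical.

```ruby
require 'time'

# Parse analyzed_at from a seo_report-like Hash; nil when absent or malformed.
def report_analyzed_at(seo_report)
  raw = seo_report.is_a?(Hash) ? seo_report['analyzed_at'] : nil
  return nil unless raw.is_a?(String) && !raw.strip.empty?

  Time.iso8601(raw)
rescue ArgumentError
  nil
end

# Stale when either crawl timestamp is later than the analysis.
def analysis_stale?(seo_report, rendered_schema_at: nil, extracted_at: nil)
  analyzed_at = report_analyzed_at(seo_report)
  return false unless analyzed_at

  [rendered_schema_at, extracted_at].compact.any? { |t| t > analyzed_at }
end

report = { 'analyzed_at' => '2024-05-01T12:00:00Z' }
puts analysis_stale?(report, extracted_at: Time.iso8601('2024-05-02T00:00:00Z'))
puts analysis_stale?(report, rendered_schema_at: Time.iso8601('2024-04-30T00:00:00Z'))
puts analysis_stale?({ 'analyzed_at' => 'not a date' }, extracted_at: Time.now)
```

A report with no parseable analyzed_at is never reported as stale; in the scopes above such rows fall under seo_analysis_none when analyzed_at is missing.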
#seo_traffic_trend ⇒ Symbol
Get traffic trend based on historical data
# File 'app/models/site_map.rb', line 541
def seo_traffic_trend
  # Use Ahrefs traffic from data_points if available; otherwise the trend is unknown
  if data_points.for_metric(:ahrefs_traffic).exists?
    data_points.trend_direction(:ahrefs_traffic)
  else
    :unknown
  end
end
#sibling_site_maps ⇒ ActiveRecord::Relation<SiteMap>
Other locales for the same path (e.g. en-CA when this is en-US).
Used so SEO analysis and Sunny fix prompts have all-country context for shared content (e.g. blogs).
# File 'app/models/site_map.rb', line 292
def sibling_site_maps
  return SiteMap.none if path.blank?

  SiteMap.active.where(path: path).where.not(id: id).order(:locale)
end
#skip_cache_warmup? ⇒ Boolean
Publication and video pages do not need cache warmup
# File 'app/models/site_map.rb', line 376
def skip_cache_warmup?
  # in? needs a collection argument; include? also returns false (not nil) for a nil category
  %w[publication video].include?(category)
end
#slug_candidates ⇒ Object
FriendlyId base for page_friendly_id. Stable across URL changes: derived
from the backing resource (immutable id, NOT its slug — the slug is what
changes on a rename), prefixed by category so a resource that backs several
pages (a CatalogItem has a product page AND a support page) doesn't collide.
Resource-less pages (static pages, filters, tags) fall back to the path —
they have no stable anchor and rarely rename.
# File 'app/models/site_map.rb', line 209
def slug_candidates
  self.class.page_friendly_id_for(category: category, path: path,
                                  resource_type: resource_type, resource_id: resource_id)
end
#suggested_keyword_target_for(keyword, search_volume: nil) ⇒ String
Intent-based suggestion: compare keyword intent with this page's intent and with other
pages that rank for the same keyword. Returns 'desired', 'undesired', or 'ignore'.
- ignore: noise keyword, or very low search volume, or no clear signal
- desired: this page already has keyword_target='desired' for this keyword, or the target_query matches
- undesired: another page (same locale) has keyword_target='desired' for this keyword
# File 'app/models/site_map.rb', line 592
def suggested_keyword_target_for(keyword, search_volume: nil)
  return 'ignore' if keyword.blank?

  normalized = keyword.to_s.strip.downcase
  return 'ignore' if normalized.blank?

  # Prefer AI-generated suggestions from SEO analysis when present
  suggestions = seo_report.is_a?(Hash) && seo_report['keyword_suggestions'].is_a?(Hash) ? seo_report['keyword_suggestions'] : nil
  if suggestions.present?
    ai_value = suggestions[normalized] || suggestions[keyword]
    return ai_value if ai_value.present? && SeoPageKeyword.keyword_targets.key?(ai_value.to_s)
  end

  return 'ignore' if SeoPageKeyword.noise?(keyword)
  return 'ignore' if search_volume.present? && search_volume.to_i < 10

  # This page already has this keyword marked as desired
  return 'desired' if seo_page_keywords.desired.exists?(keyword: keyword)
  return 'desired' if target_query.present? && target_query.strip.downcase == normalized

  # Another page (same locale) has this keyword marked as desired → we shouldn't compete
  other_owns = SeoPageKeyword
               .joins(:site_map)
               .where(keyword: keyword, keyword_target: :desired)
               .where.not(site_map_id: id)
               .merge(SiteMap.where(locale: locale))
               .exists?
  return 'undesired' if other_owns

  'ignore'
end
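The order of the checks matters: an AI suggestion wins over everything, noise and low volume come next, and ownership is considered last. The sketch below replays that order with the database lookups replaced by boolean arguments. suggest_target and its keyword arguments are hypothetical, and VALID_TARGETS assumes the keys of SeoPageKeyword.keyword_targets are the three values named in the description.

```ruby
VALID_TARGETS = %w[desired undesired ignore].freeze # assumed enum keys

def suggest_target(keyword, ai_suggestions: {}, noise: false, search_volume: nil,
                   desired_here: false, target_query: nil, desired_elsewhere: false)
  normalized = keyword.to_s.strip.downcase
  return 'ignore' if normalized.empty?

  # 1. A valid AI suggestion takes precedence over every other signal.
  ai_value = ai_suggestions[normalized] || ai_suggestions[keyword]
  return ai_value if VALID_TARGETS.include?(ai_value)

  # 2. Noise and very low volume are ignored.
  return 'ignore' if noise
  return 'ignore' if search_volume && search_volume.to_i < 10

  # 3. This page already owns the keyword, or its target_query matches.
  return 'desired' if desired_here
  return 'desired' if target_query && target_query.strip.downcase == normalized

  # 4. Another page in the same locale owns it.
  return 'undesired' if desired_elsewhere

  'ignore'
end

puts suggest_target('Heated Floors', ai_suggestions: { 'heated floors' => 'undesired' }, desired_here: true)
puts suggest_target('heated floors', search_volume: 5, desired_here: true)
puts suggest_target(' Radiant ', target_query: 'radiant')
```

Note that in this ordering a keyword below the volume threshold is ignored even when the page already has it marked as desired.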
#to_param ⇒ Object
Keep CRM/admin routes on the unambiguous primary key. FriendlyId's :slugged
otherwise overrides #to_param to emit page_friendly_id, which — being only
per-locale unique — collides across locales and resolves to the wrong row.
# File 'app/models/site_map.rb', line 82
def to_param
  id&.to_s
end
#top_keywords(limit: 10) ⇒ Array<SeoPageKeyword>
Get top keywords for this page
# File 'app/models/site_map.rb', line 553
def top_keywords(limit: 10)
  seo_page_keywords.ranking.by_traffic.limit(limit)
end
#url ⇒ String
Constructs the full URL from WEB_URL + locale + path
Example: locale='en-US', path='/products/foo' => 'https://www.warmlyyours.com/en-US/products/foo'
# File 'app/models/site_map.rb', line 175
def url
  build_url(WEB_URL)
end
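The private build_url helper is not shown on this page. A minimal sketch consistent with the documented example (base + locale + path) might look like the following; the three-argument signature and the staging host are assumptions made for illustration.

```ruby
# Hypothetical reconstruction of build_url, inferred from the documented example.
def build_url(base, locale, path)
  "#{base}/#{locale}#{path}"
end

# #url would pass the environment's WEB_URL; #production_url always passes the production host.
puts build_url('https://www.warmlyyours.com', 'en-US', '/products/foo')
puts build_url('https://staging.example.test', 'en-CA', '/products/foo')
```

The stored path already begins with a slash, so no separator is added between the locale and the path.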
#url=(full_url) ⇒ Object
Alias for backward compatibility - some code may use url=
# File 'app/models/site_map.rb', line 299
def url=(full_url)
  return if full_url.blank?

  # Extract path from full URL
  self.path = self.class.extract_path_from_url(full_url)
end
#url_path ⇒ String
Extract URL path for matching with external data
Now simply returns the stored path
# File 'app/models/site_map.rb', line 627
def url_path
  path
end
#visit_count_90d ⇒ Integer?
Visit count over 90-day window
# File 'app/models/site_map.rb', line 535
def visit_count_90d
  latest_data_point_value(:visits_90d)&.to_i
end
#warm_cache ⇒ Object
# File 'app/models/site_map.rb', line 353
def warm_cache
  return :disabled unless Cache::EdgeCacheUtility.edge_cache_enabled?

  # Pass path (not full url) since SiteCrawler filters against the path column
  Cache::SiteCrawler.new.process(url: path)
end