Class: Seo::CloudflareSyncService

Inherits:
BaseService
Defined in:
app/services/seo/cloudflare_sync_service.rb

Overview

Syncs per-page request counts from Cloudflare edge analytics into
site_map_data_points (metric type cloudflare_requests).

Cloudflare counts every request at the edge — bots, cache hits, and clients
with JavaScript disabled or blocked — so this is a ground-truth traffic
measure that complements the JS-dependent visit counts SeoVisitsSyncService
writes from the Visit table. The gap between the two is itself a signal.

For publications specifically it is also the only accurate download count:
publication PDFs are edge-cached for 30 days, so most downloads never reach
the Rails origin to be counted by items.requested_counter.

Cloudflare caps adaptive-analytics groups per query (~10k), so each run
records the top paths by request volume for the day; the low-traffic long
tail is not recorded. Counts are written as single-day data points.

Examples:

Nightly sync (yesterday)

Seo::CloudflareSyncService.new.process

Constant Summary

METRIC = :cloudflare_requests

    site_map_data_points metric type written by this service.

LOCALE_PREFIX_RE = %r{\A/(#{LocaleUtility::SITE_LOCALES.join('|')})(?=/|\z)}

    Matches a leading site-locale segment: /en-US/foo -> captures "en-US".
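A minimal sketch of how the locale-prefix pattern behaves on a few paths. The SITE_LOCALES list here is a stand-in (the real values live in LocaleUtility), and split_locale is a hypothetical helper for illustration, not a method of this service.

```ruby
# Assumption: stand-in for LocaleUtility::SITE_LOCALES; the real list is defined in the app.
SITE_LOCALES = %w[en-US de-DE fr-FR].freeze

# Same shape as the constant documented above, built from the stand-in list.
LOCALE_PREFIX_RE = %r{\A/(#{SITE_LOCALES.join('|')})(?=/|\z)}

# Hypothetical helper: returns [locale, path_without_locale].
# locale is nil when the path has no leading locale segment.
def split_locale(path)
  match = LOCALE_PREFIX_RE.match(path)
  return [nil, path] unless match

  rest = match.post_match
  [match[1], rest.empty? ? '/' : rest]
end

split_locale('/en-US/foo')   # => ["en-US", "/foo"]
split_locale('/en-US')       # => ["en-US", "/"]
split_locale('/en-USA/foo')  # => [nil, "/en-USA/foo"]
split_locale('/blog/en-US')  # => [nil, "/blog/en-US"]
```

The lookahead `(?=/|\z)` is what keeps /en-USA/foo from matching: the locale must be followed by a slash or the end of the string, and it is not consumed, so the remainder keeps its leading slash.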

Instance Attribute Summary

Attributes inherited from BaseService

#options

Class Method Summary

Instance Method Summary

Methods inherited from BaseService

#log_debug, #log_error, #log_info, #log_warning, #logger, #tagged_logger

Constructor Details

#initialize(options = {}) ⇒ CloudflareSyncService

Returns a new instance of CloudflareSyncService.



# File 'app/services/seo/cloudflare_sync_service.rb', line 30

def initialize(options = {})
  super
  @start_date = options[:start_date] || Date.yesterday
  @end_date = options[:end_date] || Date.yesterday
  @stats = { pages_updated: 0, paths_matched: 0, paths_unmatched: 0, errors: [] }
end

Class Method Details

.backfill(start_date:, end_date: Date.yesterday) ⇒ Hash

One-time history load: writes a separate single-day data point for every
day in the range, so rolling-window sums stay correct (a single ranged
#process call would instead write one fat multi-day point).

Cloudflare retains only ~32 days of adaptive analytics — days older than
that return an API error, which #process records in :errors and skips.
The nightly worker handles every day from deploy onward.

Parameters:

  • start_date (Date)

    earliest day to fetch

  • end_date (Date) (defaults to: Date.yesterday)

    latest day to fetch, inclusive

Returns:

  • (Hash)

    stats aggregated across all days
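The point about rolling-window sums can be illustrated with a small sketch. DataPoint and rolling_sum are hypothetical stand-ins; how the application actually sums site_map_data_points over a window is not shown in this page, so the "point must fit inside the window" rule below is an assumption.

```ruby
require 'date'

# Hypothetical stand-in for a site_map_data_points row.
DataPoint = Struct.new(:start_date, :end_date, :value)

# Assumed windowing rule: a point counts only if it lies entirely
# inside the window of `days` days ending on `as_of`.
def rolling_sum(points, as_of:, days:)
  window = (as_of - (days - 1))..as_of
  points.select { |point| window.cover?(point.start_date) && window.cover?(point.end_date) }
        .sum(&:value)
end

daily_counts = [10, 20, 30, 40]
first_day = Date.new(2024, 1, 1)

# What backfill produces: one single-day point per day.
per_day = daily_counts.each_with_index.map do |count, offset|
  DataPoint.new(first_day + offset, first_day + offset, count)
end

# What a single ranged call would produce: one point spanning all four days.
ranged = [DataPoint.new(first_day, first_day + 3, daily_counts.sum)]

as_of = Date.new(2024, 1, 4)
rolling_sum(per_day, as_of: as_of, days: 2)  # => 70 (Jan 3 + Jan 4)
rolling_sum(ranged, as_of: as_of, days: 2)   # => 0  (the four-day point does not fit)
```

Under this assumption a multi-day point cannot be split across a shorter window, so it is either dropped or counted whole. Single-day points avoid both outcomes.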



# File 'app/services/seo/cloudflare_sync_service.rb', line 48

def self.backfill(start_date:, end_date: Date.yesterday)
  totals = { days: 0, pages_updated: 0, paths_matched: 0, paths_unmatched: 0, errors: [] }

  (start_date..end_date).each do |day|
    stats = new(start_date: day, end_date: day).process
    totals[:days] += 1
    totals[:pages_updated] += stats[:pages_updated]
    totals[:paths_matched] += stats[:paths_matched]
    totals[:paths_unmatched] += stats[:paths_unmatched]
    totals[:errors].concat(stats[:errors])
  end

  totals
end

Instance Method Details

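#process ⇒ Hash

Fetches per-path request counts for the configured date range, aggregates
them per site map, records the metrics, and returns the stats hash. If
fetch_path_counts returns nil, the stats are returned early and nothing is
recorded.
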
# File 'app/services/seo/cloudflare_sync_service.rb', line 63

def process
  @logger.info "[CloudflareSyncService] Starting sync for #{@start_date} to #{@end_date}"

  rows = fetch_path_counts
  return @stats if rows.nil?

  per_site_map = aggregate_by_site_map(rows)
  record_metrics(per_site_map)

  @logger.info "[CloudflareSyncService] Completed: #{@stats[:pages_updated]} pages updated " \
               "(#{@stats[:paths_matched]} paths matched, #{@stats[:paths_unmatched]} unmatched)"
  @stats
end
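
Because #process returns the same stats hash on both the normal and the early-return path, a caller has to inspect the hash to tell them apart. The sketch below shows one way to do that. summarize is a hypothetical helper, not part of the service, and it assumes only the keys initialized in the constructor.

```ruby
# Hypothetical helper: condenses a stats hash into a few values
# suitable for a log line or an alert threshold.
def summarize(stats)
  total = stats[:paths_matched] + stats[:paths_unmatched]
  match_rate = total.zero? ? 0.0 : (stats[:paths_matched].to_f / total).round(3)

  {
    ok: stats[:errors].empty?,          # false when any error was recorded
    match_rate: match_rate,             # share of Cloudflare paths mapped to a page
    pages_updated: stats[:pages_updated]
  }
end

# Shape of the hash after an early return with one recorded error.
failed = { pages_updated: 0, paths_matched: 0, paths_unmatched: 0, errors: ['API error'] }
summarize(failed)  # => { ok: false, match_rate: 0.0, pages_updated: 0 }

# Shape of the hash after a normal run (the numbers are made up).
done = { pages_updated: 2, paths_matched: 3, paths_unmatched: 1, errors: [] }
summarize(done)    # => { ok: true, match_rate: 0.75, pages_updated: 2 }
```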