Skip to content

SEO Metrics Data Model

This document describes the time-series data model for SEO metrics, replacing the previous “wide table” approach with a flexible fact table design.

  1. Temporal accuracy: Each metric knows exactly what time period it covers
  2. Flexibility: Add new metrics without schema changes
  3. Granularity: Support daily, weekly, monthly, or custom periods
  4. Context preservation: Keyword-specific metrics retain their keyword reference
  5. Query efficiency: Optimized for trend analysis and period comparisons
CREATE TYPE seo_metric_type AS ENUM (
-- Google Search Console (28-day periods typically)
'gsc_clicks',
'gsc_impressions',
'gsc_ctr',
'gsc_avg_position',
-- Google Analytics 4 (flexible periods)
'ga4_page_views',
'ga4_sessions',
'ga4_users',
'ga4_bounce_rate',
'ga4_engagement_rate',
'ga4_avg_session_duration',
-- Ahrefs (point-in-time snapshots)
'ahrefs_traffic',
'ahrefs_keywords_count',
'ahrefs_traffic_value',
'ahrefs_domain_rating',
-- Keyword-specific (requires reference)
'keyword_position',
'keyword_search_volume',
'keyword_traffic_share',
-- Internal (first-party)
'visit_count'
);
CREATE TABLE site_map_data_points (
id BIGSERIAL PRIMARY KEY,
site_map_id BIGINT NOT NULL REFERENCES site_maps(id) ON DELETE CASCADE,
-- Temporal
recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- When we captured this
period DATERANGE NOT NULL, -- Time range this covers
-- Metric identification
metric_type seo_metric_type NOT NULL,
reference TEXT, -- Context (e.g., keyword text)
-- Value (decimal handles all numeric types)
value DECIMAL(15, 4) NOT NULL,
-- Metadata
source_batch_id UUID, -- Group related data points
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Primary query pattern: get metrics for a site_map in a date range
CREATE INDEX idx_data_points_lookup
ON site_map_data_points (site_map_id, metric_type, period)
USING GIST (period);
-- Find overlapping periods
CREATE INDEX idx_data_points_period
ON site_map_data_points USING GIST (period);
-- Query by source batch
CREATE INDEX idx_data_points_batch
ON site_map_data_points (source_batch_id)
WHERE source_batch_id IS NOT NULL;
# GSC sync (28-day window)
SiteMapDataPoint.create!(
site_map: site_map,
period: (28.days.ago.to_date..Date.yesterday),
metric_type: :gsc_clicks,
value: 1234,
source_batch_id: batch_id
)
# GA4 sync (30-day window)
SiteMapDataPoint.create!(
site_map: site_map,
period: (30.days.ago.to_date..Date.yesterday),
metric_type: :ga4_page_views,
value: 5678,
source_batch_id: batch_id
)
# Keyword position (point-in-time, single day)
SiteMapDataPoint.create!(
site_map: site_map,
period: (Date.current..Date.current),
metric_type: :keyword_position,
reference: 'heated driveway cost',
value: 3.2,
source_batch_id: batch_id
)
# Get GA4 page views trend over last 6 months
site_map.data_points
.where(metric_type: :ga4_page_views)
.where("period && daterange(?, ?)", 6.months.ago.to_date, Date.current)
.order(:recorded_at)
# Compare two periods
current_period = (30.days.ago.to_date..Date.yesterday)
previous_period = (60.days.ago.to_date..31.days.ago.to_date)
current = site_map.data_points
.where(metric_type: :ga4_page_views)
.where("period @> ?::date", 15.days.ago.to_date)
.order(recorded_at: :desc)
.first&.value
previous = site_map.data_points
.where(metric_type: :ga4_page_views)
.where("period @> ?::date", 45.days.ago.to_date)
.order(recorded_at: :desc)
.first&.value
# Track keyword position over time
site_map.data_points
.where(metric_type: :keyword_position)
.where(reference: 'heated driveway cost')
.order(:recorded_at)
.pluck(:recorded_at, :value)
OperatorMeaning
@>Contains element
<@Contained by
&&Overlaps
-|-Adjacent to
  • Create site_map_data_points table
  • Update sync services to write to new table
  • Keep existing site_maps columns for backward compatibility
  • Migrate existing seo_metrics_histories data
  • Convert wide rows to individual data points
  • Update dashboards to read from new table
  • Add trend visualizations
  • Remove deprecated columns from site_maps
  • Drop seo_metrics_histories table
AspectOld (Wide Table)New (Fact Table)
New metricsMigration requiredJust add enum value
Time periodsSingle snapshot_dateExplicit daterange per metric
Keyword trackingSeparate tableSame table with reference
Query flexibilityFixed columnsDynamic pivoting
Storage efficiencySparse NULLsDense rows
Audit trailOverwrittenAppend-only history
  1. Query complexity: Pivoting requires more SQL, but PostgreSQL handles this well
  2. Index strategy: GIST indexes on daterange are essential
  3. Volume management: Consider partitioning by month if data grows large
  4. Aggregation: May want materialized views for common aggregations