SEO Metrics Data Model
Overview
This document describes the time-series data model for SEO metrics, replacing the
previous "wide table" approach with a flexible fact table design.
Design Goals
- Temporal accuracy: Each metric knows exactly what time period it covers
- Flexibility: Add new metrics without schema changes
- Granularity: Support daily, weekly, monthly, or custom periods
- Context preservation: Keyword-specific metrics retain their keyword reference
- Query efficiency: Optimized for trend analysis and period comparisons
Schema
CREATE TYPE seo_metric_type AS ENUM (
-- Google Search Console (28-day periods typically)
'gsc_clicks',
'gsc_impressions',
'gsc_ctr',
'gsc_avg_position',
-- Google Analytics 4 (flexible periods)
'ga4_page_views',
'ga4_sessions',
'ga4_users',
'ga4_bounce_rate',
'ga4_engagement_rate',
'ga4_avg_session_duration',
-- Ahrefs (point-in-time snapshots)
'ahrefs_traffic',
'ahrefs_keywords_count',
'ahrefs_traffic_value',
'ahrefs_domain_rating',
-- Keyword-specific (requires reference)
'keyword_position',
'keyword_search_volume',
'keyword_traffic_share',
-- Internal (first-party)
'visit_count'
);
CREATE TABLE site_map_data_points (
id BIGSERIAL PRIMARY KEY,
site_map_id BIGINT NOT NULL REFERENCES site_maps(id) ON DELETE CASCADE,
-- Temporal
recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- When we captured this
period DATERANGE NOT NULL, -- Time range this covers
-- Metric identification
metric_type seo_metric_type NOT NULL,
reference TEXT, -- Context (e.g., keyword text)
-- Value (decimal handles all numeric types)
value DECIMAL(15, 4) NOT NULL,
-- Metadata
source_batch_id UUID, -- Group related data points
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Primary query pattern: get metrics for a site_map in a date range
CREATE INDEX idx_data_points_lookup
ON site_map_data_points (site_map_id, metric_type, period)
USING GIST (period);
-- Find overlapping periods
CREATE INDEX idx_data_points_period
ON site_map_data_points USING GIST (period);
-- Query by source batch
CREATE INDEX idx_data_points_batch
ON site_map_data_points (source_batch_id)
WHERE source_batch_id IS NOT NULL;
Usage Patterns
Recording Metrics
SiteMapDataPoint.create!(
site_map: site_map,
period: (28.days.ago.to_date..Date.yesterday),
metric_type: :gsc_clicks,
value: 1234,
source_batch_id: batch_id
)
SiteMapDataPoint.create!(
site_map: site_map,
period: (30.days.ago.to_date..Date.yesterday),
metric_type: :ga4_page_views,
value: 5678,
source_batch_id: batch_id
)
SiteMapDataPoint.create!(
site_map: site_map,
period: (Date.current..Date.current),
metric_type: :keyword_position,
reference: 'heated driveway cost',
value: 3.2,
source_batch_id: batch_id
)
Querying Trends
site_map.data_points
.where(metric_type: :ga4_page_views)
.where("period && daterange(?, ?)", 6.months.ago.to_date, Date.current)
.order(:recorded_at)
current_period = (30.days.ago.to_date..Date.yesterday)
previous_period = (60.days.ago.to_date..31.days.ago.to_date)
current = site_map.data_points
.where(metric_type: :ga4_page_views)
.where("period @> ?::date", 15.days.ago.to_date)
.order(recorded_at: :desc)
.first&.value
previous = site_map.data_points
.where(metric_type: :ga4_page_views)
.where("period @> ?::date", 45.days.ago.to_date)
.order(recorded_at: :desc)
.first&.value
site_map.data_points
.where(metric_type: :keyword_position)
.where(reference: 'heated driveway cost')
.order(:recorded_at)
.pluck(:recorded_at, :value)
PostgreSQL Range Operators
| Operator |
Meaning |
@> |
Contains element |
<@ |
Contained by |
&& |
Overlaps |
-|- |
Adjacent to |
Migration Strategy
Phase 1: Create New Table
- Create
site_map_data_points table
- Update sync services to write to new table
- Keep existing
site_maps columns for backward compatibility
Phase 2: Backfill
- Migrate existing
seo_metrics_histories data
- Convert wide rows to individual data points
Phase 3: Update UI
- Update dashboards to read from new table
- Add trend visualizations
Phase 4: Cleanup
- Remove deprecated columns from
site_maps
- Drop
seo_metrics_histories table
Advantages Over Previous Design
| Aspect |
Old (Wide Table) |
New (Fact Table) |
| New metrics |
Migration required |
Just add enum value |
| Time periods |
Single snapshot_date |
Explicit daterange per metric |
| Keyword tracking |
Separate table |
Same table with reference |
| Query flexibility |
Fixed columns |
Dynamic pivoting |
| Storage efficiency |
Sparse NULLs |
Dense rows |
| Audit trail |
Overwritten |
Append-only history |
Considerations
- Query complexity: Pivoting requires more SQL, but PostgreSQL handles this well
- Index strategy: GIST indexes on daterange are essential
- Volume management: Consider partitioning by month if data grows large
- Aggregation: May want materialized views for common aggregations