Image Embeddings (Gemini Embedding 2)
Note: The CLIP/Jina pipeline was removed in early 2026. This document now describes
the replacement: Gemini Embedding 2 multimodal embeddings.
Overview
Images are embedded using Gemini Embedding 2 (gemini-embedding-2-preview), a native
multimodal model that encodes images and text into a shared 1536-dimensional vector space.
This enables cross-modal search: describe what you want in words and find matching images.
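Because text and images share one vector space, a text query can be ranked against image vectors with an ordinary similarity measure. The following is a minimal sketch of that idea in plain Ruby, assuming cosine similarity (the distance named under Performance). The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the 3-dimensional vectors are toy stand-ins for real 1536-dimensional embeddings; in production the ranking is done by the database index, not in Ruby.

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Rank candidate image vectors against a query vector, best match first.
# `candidates` is a Hash of { image_id => vector } (hypothetical shape).
def rank_by_similarity(query_vector, candidates, limit: 10)
  candidates
    .map { |id, vector| [id, cosine_similarity(query_vector, vector)] }
    .sort_by { |_id, score| -score }
    .first(limit)
end

# Toy vectors: the query stands in for an embedded text description.
query = [1.0, 0.0, 0.0]
images = {
  101 => [0.9, 0.1, 0.0],
  102 => [0.0, 1.0, 0.0],
  103 => [0.5, 0.5, 0.0]
}
top_matches = rank_by_similarity(query, images, limit: 2)
top_matches.each { |id, score| puts "image #{id}: #{score.round(3)}" }
```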
Features
1. Visual Semantic Search
Find images using natural language descriptions (text → image cross-modal search).
2. Duplicate Detection
Identify duplicate or near-duplicate images using perceptual hash (pHash) fingerprinting
stored in digital_assets.fingerprint.
3. Product Image Search
Find images visually similar to a product by querying the unified embedding space.
Architecture
ImageFullAnalysisWorker
│
├─ Step 1: pHash fingerprint (local, no API)
│    → stored in digital_assets.fingerprint
│
├─ Step 2: Gemini Embedding 2 (image + metadata text)
│    → stored in content_embeddings (content_type='unified',
│                                    embedding_model='gemini-embedding-2-preview',
│                                    unified_embedding vector(1536))
│
└─ Step 3: Gemini Flash vision analysis (optional, for CRM metadata)
     → stored in digital_assets.ai_visual_description
Nightly Backfill
ImageEmbeddingPopulationWorker (Sidekiq::IterableJob) runs nightly at 2:30 AM CT.
It queues up to 5,000 images per run, prioritising product primary images first.
It uses cursor-based checkpointing, so a deploy mid-run resumes from the last processed record.
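The resume behaviour can be illustrated with a simplified, framework-free sketch. This is not the Sidekiq::IterableJob API and not the worker's actual code: `CursorStore` and `CheckpointedBackfill` are hypothetical names, and the cursor is assumed to be the id of the last successfully processed record.

```ruby
# In-memory stand-in for wherever the cursor is persisted between runs.
class CursorStore
  attr_accessor :cursor
end

# Processes ids in ascending order, starting just after the saved cursor.
class CheckpointedBackfill
  def initialize(ids, store, per_run_limit: 5_000)
    @ids = ids.sort
    @store = store
    @per_run_limit = per_run_limit
  end

  # Yields each pending id and saves the cursor only after the block succeeds.
  # Returns the number of ids processed in this run.
  def run
    pending = @ids.select { |id| @store.cursor.nil? || id > @store.cursor }
    processed = 0
    pending.first(@per_run_limit).each do |id|
      yield id
      @store.cursor = id # checkpoint: an interruption after this point skips `id`
      processed += 1
    end
    processed
  end
end

store = CursorStore.new
job = CheckpointedBackfill.new([1, 2, 3, 4, 5], store)
handled = []

# First run is interrupted while handling id 3 (simulating a deploy).
begin
  job.run do |id|
    raise "interrupted" if id == 3
    handled << id
  end
rescue RuntimeError
  puts "interrupted, cursor saved at #{store.cursor}"
end

# Second run resumes after the saved cursor instead of starting over.
job.run { |id| handled << id }
puts handled.inspect
```

Saving the cursor after each record rather than at the end of the run is what keeps the amount of repeated work after an interruption to at most one record.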
Database Schema
The content_embeddings table is partitioned; image embeddings live in the content_embeddings_images partition:
| Column | Type | Description |
|---|---|---|
| content_type | string | 'unified' for Gemini image embeddings |
| embedding_model | string | 'gemini-embedding-2-preview' |
| unified_embedding | vector(1536) | 1536-dimensional vector |
| embedding_dimensions | integer | 1536 |
Fingerprints live on the parent record:
| Column | Type | Description |
|---|---|---|
| digital_assets.fingerprint | bigint | pHash perceptual hash |
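Near-duplicate detection on pHash values usually comes down to a Hamming distance between two fingerprints. The sketch below assumes the fingerprint is a 64-bit value (consistent with the bigint column) and that two images count as near-duplicates when at most 10 bits differ (the threshold used in the rake example later in this document). The helper names are hypothetical.

```ruby
# Mask to 64 bits so negative values from a signed bigint column compare correctly.
MASK_64 = 0xFFFF_FFFF_FFFF_FFFF

# Number of bit positions in which two 64-bit fingerprints differ.
def hamming_distance(a, b)
  ((a ^ b) & MASK_64).to_s(2).count("1")
end

# True when the fingerprints differ in at most `max_distance` bits.
def near_duplicate?(a, b, max_distance: 10)
  hamming_distance(a, b) <= max_distance
end

puts hamming_distance(0b1011, 0b1000)    # two low bits differ
puts near_duplicate?(0b1011, 0b1000)     # well under the threshold
puts near_duplicate?(0, 0xFFF)           # 12 bits differ, over the threshold
```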
Configuration
The Gemini API key is stored in Rails credentials at google.gemini.api_key.
Rate limiting is handled by Embedding::Gemini via a Redis sliding-window limiter
(default: 300 requests/minute, configurable via GEMINI_EMBED_REQUESTS_PER_MINUTE).
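To show what a sliding-window limiter does, here is an in-memory sketch. It is not the Embedding::Gemini implementation: the real limiter is Redis-backed and shared across workers, while this version keeps timestamps in a local array. The class name `SlidingWindowLimiter` and the injectable `clock` are assumptions made for illustration.

```ruby
# In-memory sliding-window rate limiter (single process only).
class SlidingWindowLimiter
  def initialize(limit: 300, window_seconds: 60,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @limit = limit
    @window = window_seconds
    @clock = clock
    @timestamps = []
  end

  # Records the request and returns true if fewer than `limit` requests
  # fall inside the trailing window; otherwise returns false.
  def allow?
    now = @clock.call
    @timestamps.reject! { |t| t <= now - @window } # drop expired entries
    return false if @timestamps.length >= @limit

    @timestamps << now
    true
  end
end

# Read the limit the same way the documented setting suggests.
per_minute = Integer(ENV.fetch("GEMINI_EMBED_REQUESTS_PER_MINUTE", "300"))
limiter = SlidingWindowLimiter.new(limit: per_minute)
puts limiter.allow?
```

A sliding window avoids the burst at window boundaries that a fixed per-minute counter allows, which matters when many workers share one quota.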
Usage
Queue full analysis for an image
ImageFullAnalysisWorker.perform_async(image.id)
Semantic image search
# Find images matching a text description
ContentEmbedding::ImageEmbedding.semantic_search("bathroom with heated floors", limit: 10)
# Via the top-level service
ContentEmbedding.unified_visual_search("snow melting driveway", limit: 10)
Find similar images
image = Image.find(123)
image..unified_content
.nearest_neighbors(:unified_embedding, image_vector, distance: :cosine)
.limit(10)
Rake Tasks
# Check backfill progress
rake embeddings:progress
# Check detailed image essentials (fingerprints + Gemini coverage)
rake embeddings:essentials_stats
# Trigger nightly backfill worker manually (all active images)
# Enqueues ImageEmbeddingPopulationWorker via Sidekiq
rake embeddings:queue_all_image_full
# Product primary images only
rake embeddings:queue_all_product_full
# Incremental batches (resumable)
rake embeddings:populate_image_full # full pipeline, batch: 50
rake embeddings:populate_image_vision # vision analysis only, batch: 100
# Fingerprints
rake embeddings:populate_fingerprints # incremental, batch: 100
rake embeddings:queue_all_fingerprints # all at once
# Duplicate detection (pHash)
rake embeddings:find_phash_duplicates
rake embeddings:find_phash_duplicates[10] # Hamming distance <= 10
# Test Gemini API connectivity
rake embeddings:test_gemini_embed
# Embedding statistics
rake embeddings:stats
Performance
- Embedding latency: ~1–3s per image (includes image download + Gemini API call)
- Rate limit: 300 requests/minute (shared across all workers)
- Vector dimensions: 1536 (Matryoshka truncation of the full-quality 3072-dimensional output)
- Index type: HNSW with cosine distance, per-partition (~1–5ms queries)
- Storage: ~6 KB per embedding (1536 floats × 4 bytes = 6,144 bytes)
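The dimension and storage figures above can be checked with a short sketch. Matryoshka-style truncation keeps the leading components of the full vector. Re-normalizing to unit length afterwards is a common practice and is an assumption here, since this document does not say whether the pipeline truncates client-side or asks the API for 1536 dimensions directly. Both helper names are hypothetical.

```ruby
# Keep the first `dims` components and rescale the result to unit length.
def truncate_and_normalize(vector, dims: 1536)
  head = vector.first(dims)
  norm = Math.sqrt(head.sum { |x| x * x })
  return head if norm.zero?

  head.map { |x| x / norm }
end

# Raw storage for one embedding in bytes, assuming 4-byte floats.
def embedding_bytes(dims: 1536, bytes_per_float: 4)
  dims * bytes_per_float
end

full = Array.new(3072) { |i| 1.0 / (i + 1) } # synthetic stand-in for a full vector
short = truncate_and_normalize(full)

puts short.length                        # number of dimensions kept
puts embedding_bytes                     # bytes per 1536-dimensional embedding
puts embedding_bytes(dims: 3072)         # bytes if the full vector were stored
```

Storing 1536 dimensions instead of 3072 halves both the per-row storage and the size of the HNSW index entries.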