Image Embeddings (Gemini Embedding 2)
Note: The CLIP/Jina pipeline was removed in early 2026. This document now describes the replacement: Gemini Embedding 2 multimodal embeddings.
Overview
Images are embedded using Gemini Embedding 2 (gemini-embedding-2-preview), a native
multimodal model that encodes images and text into a shared 1536-dimensional vector space.
This enables cross-modal search: describe what you want in words and find matching images.
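To make the shared-space idea concrete, here is a minimal sketch in plain Ruby that ranks image vectors against a text query vector by cosine similarity. The three-dimensional vectors and file names are made-up toy values (real embeddings here have 1536 dimensions), and `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not part of the application.

```ruby
# Toy illustration of cross-modal ranking in a shared vector space.
# All vectors and file names are invented for this example.

def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Returns image names ordered from most to least similar to the query.
def rank_by_similarity(query_vector, image_vectors)
  image_vectors
    .sort_by { |_name, vector| -cosine_similarity(query_vector, vector) }
    .map(&:first)
end

text_query = [0.9, 0.1, 0.0] # stands in for an embedded text query
images = {
  "bathroom.jpg" => [0.8, 0.2, 0.1],
  "driveway.jpg" => [0.1, 0.9, 0.3],
  "kitchen.jpg"  => [0.5, 0.5, 0.5]
}

ranking = rank_by_similarity(text_query, images)
```

In production the ranking is done by the database index rather than in Ruby; the sketch only shows why a text vector can be compared directly with image vectors.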
Features
1. Visual Semantic Search
Find images using natural language descriptions (text → image cross-modal search).
2. Duplicate Detection
Identify duplicate or near-duplicate images using perceptual hash (pHash) fingerprinting
stored in digital_assets.fingerprint.
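Near-duplicate detection with perceptual hashes usually comes down to counting the bits that differ between two fingerprints (the Hamming distance). The sketch below assumes a 64-bit pHash, which the `bigint` column type suggests but this page does not state; `hamming_distance` and `near_duplicate?` are hypothetical helpers, and the default threshold of 10 mirrors the rake task example rather than a documented setting.

```ruby
# Hamming distance between two 64-bit perceptual hashes (assumed width).
# Masking to 64 bits keeps the count correct when a stored bigint is negative.
MASK_64 = 0xFFFF_FFFF_FFFF_FFFF

def hamming_distance(hash_a, hash_b)
  ((hash_a ^ hash_b) & MASK_64).to_s(2).count("1")
end

# Hypothetical helper: treat two images as near-duplicates when few bits differ.
def near_duplicate?(hash_a, hash_b, max_distance: 10)
  hamming_distance(hash_a, hash_b) <= max_distance
end
```

A distance of 0 means the fingerprints are identical; small distances typically indicate resized or recompressed copies of the same image.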
3. Product Image Search
Find images visually similar to a product by querying the unified embedding space.
Architecture

    ImageFullAnalysisWorker
      │
      ├─ Step 1: pHash fingerprint (local, no API)
      │    → stored in digital_assets.fingerprint
      │
      ├─ Step 2: Gemini Embedding 2 (image + metadata text)
      │    → stored in content_embeddings (content_type='unified',
      │        embedding_model='gemini-embedding-2-preview',
      │        unified_embedding vector(1536))
      │
      └─ Step 3: Gemini Flash vision analysis (optional, for CRM metadata)
           → stored in digital_assets.ai_visual_description

Nightly Backfill
ImageEmbeddingPopulationWorker (Sidekiq::IterableJob) runs nightly at 2:30 AM CT.
It queues up to 5,000 images per run, prioritising product primary images first.
It uses cursor-based checkpointing, so a run interrupted by a deploy resumes from the last processed record.
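The checkpointing idea can be sketched in plain Ruby without any Sidekiq dependency. This is not the worker's actual code: `CheckpointedRun`, its in-memory cursor, and the `:interrupt` signal are hypothetical, chosen only to show how persisting the last completed id lets a later run skip finished work.

```ruby
# Plain-Ruby sketch of cursor-based checkpointing (not the Sidekiq API).
class CheckpointedRun
  attr_reader :cursor, :processed

  def initialize(ids, cursor: nil, max_per_run: 5_000)
    @ids = ids.sort
    @cursor = cursor # last id that completed successfully, or nil for a fresh run
    @max_per_run = max_per_run
    @processed = []
  end

  # Processes ids greater than the cursor and advances the cursor after each one.
  # Stops early if the block returns :interrupt, simulating a deploy mid-run.
  def run
    pending = @ids.select { |id| @cursor.nil? || id > @cursor }
    pending.first(@max_per_run).each do |id|
      return :interrupted if yield(id) == :interrupt

      @processed << id
      @cursor = id
    end
    :finished
  end
end

ids = (1..10).to_a
first_run = CheckpointedRun.new(ids)
first_status = first_run.run { |id| id == 5 ? :interrupt : :ok } # stops before id 5

resumed_run = CheckpointedRun.new(ids, cursor: first_run.cursor)
resumed_status = resumed_run.run { |_id| :ok }                   # starts at id 5
```

Because the cursor only advances after an item completes, the interrupted item is retried on resume instead of being skipped.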
Database Schema
The content_embeddings table (partitioned; image partition: content_embeddings_images):
| Column | Type | Description |
|---|---|---|
| content_type | string | 'unified' for Gemini image embeddings |
| embedding_model | string | 'gemini-embedding-2-preview' |
| unified_embedding | vector | 1536-dimensional vector |
| embedding_dimensions | integer | 1536 |
Fingerprints live on the parent record:
| Column | Type | Description |
|---|---|---|
| digital_assets.fingerprint | bigint | pHash perceptual hash |
Configuration
The Gemini API key is stored in Rails credentials at google.gemini.api_key.
Rate limiting is handled by Embedding::Gemini via a Redis sliding-window limiter
(default: 300 requests/minute, configurable via GEMINI_EMBED_REQUESTS_PER_MINUTE).
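For readers unfamiliar with sliding-window limiting, the following in-memory sketch shows the general technique. The production limiter described above is Redis-backed and its internals are not shown on this page, so `SlidingWindowLimiter`, its constructor arguments, and the injectable clock are all hypothetical; only the environment variable name and the default of 300 come from the text.

```ruby
# In-memory sliding-window limiter sketch (the real limiter uses Redis).
class SlidingWindowLimiter
  def initialize(limit: 300, window_seconds: 60,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @limit = limit
    @window_seconds = window_seconds
    @clock = clock
    @timestamps = []
  end

  # Returns true and records the request if it fits in the window, else false.
  def allow?
    now = @clock.call
    # Drop requests that have slid out of the window.
    @timestamps.shift while @timestamps.any? && @timestamps.first <= now - @window_seconds
    return false if @timestamps.size >= @limit

    @timestamps << now
    true
  end
end

# Reading the limit the way the text describes, with 300 as the fallback.
limit = Integer(ENV.fetch("GEMINI_EMBED_REQUESTS_PER_MINUTE", "300"))
limiter = SlidingWindowLimiter.new(limit: limit)
limiter.allow? # first request in an empty window is allowed
```

A shared store such as Redis is needed in practice because the limit applies across all worker processes, which an in-process array cannot coordinate.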
Queue full analysis for an image

    ImageFullAnalysisWorker.perform_async(image.id)

Semantic image search

    # Find images matching a text description
    ContentEmbedding::ImageEmbedding.semantic_search("bathroom with heated floors", limit: 10)

    # Via the top-level service
    ContentEmbedding.unified_visual_search("snow melting driveway", limit: 10)

Find similar images
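
    image = Image.find(123)
    # image_vector is the query vector (for example, this image's own unified
    # embedding); it is not defined in this snippet and must be loaded first.
    image.image_embeddings.unified_content
      .nearest_neighbors(:unified_embedding, image_vector, distance: :cosine)
      .limit(10)

Rake Tasks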

    # Check backfill progress
    rake embeddings:progress

    # Check detailed image essentials (fingerprints + Gemini coverage)
    rake embeddings:essentials_stats

    # Trigger nightly backfill worker manually (all active images)
    # Enqueues ImageEmbeddingPopulationWorker via Sidekiq
    rake embeddings:queue_all_image_full

    # Product primary images only
    rake embeddings:queue_all_product_full

    # Incremental batches (resumable)
    rake embeddings:populate_image_full       # full pipeline, batch: 50
    rake embeddings:populate_image_vision     # vision analysis only, batch: 100

    # Fingerprints
    rake embeddings:populate_fingerprints     # incremental, batch: 100
    rake embeddings:queue_all_fingerprints    # all at once

    # Duplicate detection (pHash)
    rake embeddings:find_phash_duplicates
    rake embeddings:find_phash_duplicates[10] # Hamming distance <= 10

    # Test Gemini API connectivity
    rake embeddings:test_gemini_embed

    # Embedding statistics
    rake embeddings:stats

Performance
- Embedding latency: ~1–3s per image (includes image download + Gemini API call)
- Rate limit: 300 requests/minute (shared across all workers)
- Vector dimensions: 1536 (Matryoshka truncation of the full-quality 3072-dimensional output)
- Index type: HNSW with cosine distance, per-partition (~1–5ms queries)
- Storage: ~6KB per embedding (1536 floats × 4 bytes)
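
The dimension and storage figures above can be checked with a short sketch. Matryoshka-style truncation generally means keeping the leading dimensions of the full vector and re-normalizing to unit length. Whether this pipeline truncates in the API request or in application code is not stated here, so `truncate_embedding` and `embedding_bytes` are hypothetical helpers and the input vector is made up.

```ruby
# Sketch of Matryoshka-style truncation: keep the leading dimensions, then
# re-normalize to unit length so dot-product and cosine comparisons agree.
def truncate_embedding(vector, dimensions)
  raise ArgumentError, "vector shorter than target" if vector.length < dimensions

  head = vector.first(dimensions)
  norm = Math.sqrt(head.sum { |x| x * x })
  return head if norm.zero?

  head.map { |x| x / norm }
end

BYTES_PER_FLOAT32 = 4

# Raw storage for one embedding, ignoring any per-row database overhead.
def embedding_bytes(dimensions)
  dimensions * BYTES_PER_FLOAT32
end

full = Array.new(3072) { |i| 1.0 / (i + 1) } # made-up 3072-dimension vector
truncated = truncate_embedding(full, 1536)

truncated.length      # 1536
embedding_bytes(1536) # 6144 bytes, about 6 KB
embedding_bytes(3072) # 12288 bytes, about 12 KB
```

Halving the dimensions halves the raw vector storage, which is the trade-off behind the 1536-dimension choice.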