Class: Embedding::Gemini

Inherits:
Object
Defined in:
app/services/embedding/gemini.rb

Overview

Gemini Embedding 2 service for multimodal embeddings.

Natively embeds images, text, and interleaved image+text into a unified
vector space via the Gemini API embedContent endpoint.

== Architecture

Image Analysis: pHash → Gemini Embedding 2 (image + metadata)
Vision Desc: Gemini Flash (independent, on-demand)

== Embedding Dimensions

Gemini Embedding 2 supports Matryoshka Representation Learning:

  • 3072: Full quality (default from API)
  • 1536: Used here for HNSW compatibility (pgvector HNSW indexes on the vector type support at most 2,000 dimensions)
  • 768: For constrained environments
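A minimal sketch of the idea behind Matryoshka embeddings: the leading components carry most of the signal, so a shorter vector can be derived by keeping a prefix and re-normalizing it. The helper name truncate_embedding is hypothetical and is not part of this class. Judging by the dimensions parameter on the embed methods, the service appears to request the reduced size from the API rather than truncate locally.

```ruby
# Hypothetical helper (not part of Embedding::Gemini): keep the first `dims`
# components of a vector and rescale the result to unit length.
def truncate_embedding(vector, dims)
  unless dims.between?(1, vector.length)
    raise ArgumentError, "dims must be between 1 and #{vector.length}"
  end

  head = vector.first(dims)
  norm = Math.sqrt(head.sum { |x| x * x })
  return head if norm.zero? # nothing to rescale

  head.map { |x| x / norm }
end

p truncate_embedding([3.0, 4.0, 12.0], 2) # => [0.6, 0.8]
```

Re-normalizing matters when the truncated vectors are compared with inner-product or cosine distance, since a prefix of a unit vector is generally no longer unit length.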

== Rate Limiting

Redis-based sliding window rate limiter.
Configurable via GEMINI_EMBED_REQUESTS_PER_MINUTE (default: 300).
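The sliding-window approach can be sketched in memory as follows. This is an illustration under stated assumptions, not the service's limiter: the class name SlidingWindowLimiter and its interface are hypothetical, and an Array stands in for Redis. A Redis-backed version would typically keep the timestamps in a sorted set under a key such as RATE_LIMIT_KEY.

```ruby
# Hypothetical in-memory sliding-window limiter (stand-in for the Redis one).
class SlidingWindowLimiter
  def initialize(limit:, window:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @limit = limit       # maximum requests per window
    @window = window     # window length in seconds
    @clock = clock       # injectable time source, useful in tests
    @timestamps = []
  end

  # Records the request and returns true if it fits in the current window;
  # returns false (without recording) if the limit is already reached.
  def allow?
    now = @clock.call
    @timestamps.reject! { |t| t <= now - @window } # drop expired entries
    return false if @timestamps.size >= @limit

    @timestamps << now
    true
  end
end

now = 0.0
limiter = SlidingWindowLimiter.new(limit: 2, window: 60, clock: -> { now })
p [limiter.allow?, limiter.allow?, limiter.allow?] # => [true, true, false]
now = 61.0
p limiter.allow? # => true
```

Unlike a fixed per-minute counter, the window here moves with each request, so a burst at the end of one minute cannot be followed immediately by a second full burst.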

Examples:

Embed an image with metadata text

Embedding::Gemini.embed_image("https://cdn.example.com/photo.jpg",
  text: "Towel warmer, brushed nickel finish")

Embed text for semantic search

Embedding::Gemini.embed_text("radiant floor heating installation")

Embed a search query

Embedding::Gemini.embed_query("bathroom heating")

Defined Under Namespace

Classes: ApiError, ConfigurationError, Error, RateLimitError

Constant Summary

BASE_URL =
'https://generativelanguage.googleapis.com'
API_VERSION =
'v1beta'
EMBED_MODEL =
'gemini-embedding-2-preview'
MODEL_NAME =
'gemini-embedding-2-preview'
DEFAULT_DIMENSIONS =
1536
DEFAULT_TEXT_DIMENSIONS =
1536
DEFAULT_VISUAL_DIMENSIONS =
1536
TIMEOUT =
120
RATE_LIMIT_KEY =
'gemini_embed:rate_limit'
REQUESTS_PER_MINUTE =
ENV.fetch('GEMINI_EMBED_REQUESTS_PER_MINUTE', 300).to_i
RATE_LIMIT_WINDOW =

Length of the rate limit window, in seconds

60
MAX_RETRIES =
5
BASE_RETRY_DELAY =
2
TASK_TYPES =

Gemini task types for embedding optimization

{
  query: 'RETRIEVAL_QUERY',
  document: 'RETRIEVAL_DOCUMENT',
  similarity: 'SEMANTIC_SIMILARITY',
  classification: 'CLASSIFICATION',
  clustering: 'CLUSTERING'
}.freeze
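As a rough illustration of how a task type symbol might end up in a request, the sketch below builds an embedContent-style body. The helper build_embed_request is hypothetical. The field names taskType and outputDimensionality follow the Gemini API's documented embedContent request for its embedding models; whether the preview model accepts exactly this shape is an assumption.

```ruby
require 'json'

# Copy of the mapping above so the sketch is self-contained.
TASK_TYPES = {
  query: 'RETRIEVAL_QUERY',
  document: 'RETRIEVAL_DOCUMENT',
  similarity: 'SEMANTIC_SIMILARITY',
  classification: 'CLASSIFICATION',
  clustering: 'CLUSTERING'
}.freeze

# Hypothetical helper: maps a task type symbol to its API string and
# assembles a request body. Raises KeyError for unknown task types.
def build_embed_request(parts, task_type:, dimensions:, model: 'gemini-embedding-2-preview')
  {
    model: "models/#{model}",
    content: { parts: parts },
    taskType: TASK_TYPES.fetch(task_type),
    outputDimensionality: dimensions # assumed field name, see lead-in
  }
end

body = build_embed_request([{ text: 'bathroom heating' }], task_type: :query, dimensions: 1536)
puts JSON.pretty_generate(body)
```

Using fetch rather than [] makes a mistyped task type fail loudly instead of sending a null taskType.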
MIME_TYPES =
{
  '.jpg' => 'image/jpeg',
  '.jpeg' => 'image/jpeg',
  '.png' => 'image/png'
}.freeze
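One way such a table might be used is to derive the MIME type from an image URL's extension. The helper mime_type_for_url below is hypothetical; how the class actually resolves MIME types (for example, whether it consults the HTTP Content-Type header) is not shown in this page.

```ruby
require 'uri'

# Copy of the mapping above so the sketch is self-contained.
MIME_TYPES = {
  '.jpg' => 'image/jpeg',
  '.jpeg' => 'image/jpeg',
  '.png' => 'image/png'
}.freeze

# Hypothetical helper: looks up the MIME type from the URL path's extension,
# ignoring any query string and treating the extension case-insensitively.
def mime_type_for_url(url)
  ext = File.extname(URI.parse(url).path.to_s).downcase
  MIME_TYPES.fetch(ext) { raise ArgumentError, "Unsupported image type: #{ext.inspect}" }
end

puts mime_type_for_url('https://cdn.example.com/photo.JPG?w=200') # prints "image/jpeg"
```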
RETRYABLE_EXCEPTIONS =
[
  RateLimitError,
  Faraday::TimeoutError,
  Faraday::ConnectionFailed
].freeze
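The constants MAX_RETRIES, BASE_RETRY_DELAY, and RETRYABLE_EXCEPTIONS suggest a retry loop with growing delays. The sketch below shows one plausible form; the doubling schedule and the with_retries helper are assumptions, since this page does not show the retry code. The Faraday exceptions are left out so the example runs without the gem.

```ruby
class RateLimitError < StandardError; end

MAX_RETRIES = 5
BASE_RETRY_DELAY = 2
RETRYABLE = [RateLimitError].freeze # the real list also includes Faraday errors

# Hypothetical helper: runs the block, retrying on retryable errors with
# exponential backoff (2s, 4s, 8s, ...). Re-raises once retries are exhausted.
def with_retries(max_retries: MAX_RETRIES, base_delay: BASE_RETRY_DELAY, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    yield
  rescue *RETRYABLE
    attempt += 1
    raise if attempt > max_retries

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

calls = 0
delays = []
result = with_retries(sleeper: ->(s) { delays << s }) do
  calls += 1
  raise RateLimitError, 'slow down' if calls < 3

  :ok
end
p result # => :ok
p delays # => [2, 4]
```

Passing the sleeper as a lambda keeps the example fast to run and makes the delay schedule easy to inspect.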

Class Method Details

.available? ⇒ Boolean

Check whether the API key is configured

Returns:

  • (Boolean)

    true if an API key is configured; false if it is blank or if reading it raises ConfigurationError



# File 'app/services/embedding/gemini.rb', line 145

def available?
  api_key.present?
rescue ConfigurationError
  false
end

.embed_image(image_url, text: nil, dimensions: DEFAULT_VISUAL_DIMENSIONS) ⇒ Array<Float>

Embed an image, optionally with accompanying text metadata.
Sends the image as inlineData (base64) with optional text parts.

Parameters:

  • image_url (String)

    URL of the image to embed

  • text (String, nil) (defaults to: nil)

    Optional metadata text to embed alongside the image

  • dimensions (Integer) (defaults to: DEFAULT_VISUAL_DIMENSIONS)

    Output vector dimensions

Returns:

  • (Array<Float>)

    Embedding vector



# File 'app/services/embedding/gemini.rb', line 91

def embed_image(image_url, text: nil, dimensions: DEFAULT_VISUAL_DIMENSIONS)
  parts = []
  parts << { text: text } if text.present?
  parts << build_image_part(image_url)

  embed_content(parts, task_type: :document, dimensions: dimensions)
end
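The private helper build_image_part is not shown on this page. As a hedged sketch of what the parts array might look like once the image bytes are available, the example below places the optional text part first and the base64 image part second, mirroring the order in embed_image. The build_parts helper and the mimeType and data key names (taken from the Gemini API's inline data format) are assumptions.

```ruby
require 'base64'

# Hypothetical helper: assembles the parts array from raw image bytes.
# The text part is skipped when text is nil or blank.
def build_parts(image_bytes, mime_type:, text: nil)
  parts = []
  parts << { text: text } unless text.nil? || text.strip.empty?
  parts << {
    inlineData: {
      mimeType: mime_type,
      data: Base64.strict_encode64(image_bytes) # no line breaks in the output
    }
  }
  parts
end

parts = build_parts("\x89PNG".b, mime_type: 'image/png', text: 'Towel warmer, brushed nickel finish')
puts parts.length      # prints 2
puts parts.first[:text]
```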

.embed_image_file(path, text: nil, dimensions: DEFAULT_VISUAL_DIMENSIONS) ⇒ Array<Float>

Embed a local image file

Parameters:

  • path (String)

    Path to the local image file

  • text (String, nil) (defaults to: nil)

    Optional metadata text

  • dimensions (Integer) (defaults to: DEFAULT_VISUAL_DIMENSIONS)

    Output vector dimensions

Returns:

  • (Array<Float>)

    Embedding vector

Raises:

  • (Error)

    if no file exists at path



# File 'app/services/embedding/gemini.rb', line 105

def embed_image_file(path, text: nil, dimensions: DEFAULT_VISUAL_DIMENSIONS)
  raise Error, "Image file not found: #{path}" unless File.exist?(path)

  parts = []
  parts << { text: text } if text.present?
  parts << build_file_image_part(path)

  embed_content(parts, task_type: :document, dimensions: dimensions)
end

.embed_image_url(url, dimensions: DEFAULT_VISUAL_DIMENSIONS) ⇒ Object

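Convenience wrapper around embed_image for a URL with no metadata text. Unlike a true alias, it does not accept the text: option.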



# File 'app/services/embedding/gemini.rb', line 157

def embed_image_url(url, dimensions: DEFAULT_VISUAL_DIMENSIONS)
  embed_image(url, dimensions: dimensions)
end

.embed_query(text, dimensions: DEFAULT_TEXT_DIMENSIONS) ⇒ Array<Float>

Embed text for a search query

Parameters:

  • text (String)

    Query text

  • dimensions (Integer) (defaults to: DEFAULT_TEXT_DIMENSIONS)

    Output vector dimensions

Returns:

  • (Array<Float>)

    Embedding vector



# File 'app/services/embedding/gemini.rb', line 120

def embed_query(text, dimensions: DEFAULT_TEXT_DIMENSIONS)
  embed_content([{ text: text }], task_type: :query, dimensions: dimensions)
end

.embed_text(text, dimensions: DEFAULT_TEXT_DIMENSIONS) ⇒ Array<Float>

Embed text for storage/indexing

Parameters:

  • text (String)

    Text content to embed

  • dimensions (Integer) (defaults to: DEFAULT_TEXT_DIMENSIONS)

    Output vector dimensions

Returns:

  • (Array<Float>)

    Embedding vector



# File 'app/services/embedding/gemini.rb', line 129

def embed_text(text, dimensions: DEFAULT_TEXT_DIMENSIONS)
  embed_content([{ text: text }], task_type: :document, dimensions: dimensions)
end
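Vectors from embed_query and embed_text are meant to be compared with each other. In production that comparison would presumably happen inside pgvector, but the underlying measure can be sketched in plain Ruby. The cosine_similarity helper is hypothetical and the short vectors are made-up stand-ins for real embeddings.

```ruby
# Hypothetical helper: cosine similarity between two equal-length vectors.
# Returns 0.0 when either vector has zero length, to avoid dividing by zero.
def cosine_similarity(a, b)
  raise ArgumentError, 'vectors must have the same length' unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

query_vec = [1.0, 0.0]   # stand-in for Embedding::Gemini.embed_query(...)
doc_vec   = [0.0, 1.0]   # stand-in for Embedding::Gemini.embed_text(...)
p cosine_similarity(query_vec, query_vec) # => 1.0
p cosine_similarity(query_vec, doc_vec)   # => 0.0
```

Both vectors must be requested with the same dimensions value, otherwise the lengths will not match.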

.embed_visual_query(text, dimensions: DEFAULT_VISUAL_DIMENSIONS) ⇒ Array<Float>

Embed a text query for visual search (cross-modal text → image)

Parameters:

  • text (String)

    Text description to search for

  • dimensions (Integer) (defaults to: DEFAULT_VISUAL_DIMENSIONS)

    Output vector dimensions

Returns:

  • (Array<Float>)

    Embedding vector



# File 'app/services/embedding/gemini.rb', line 138

def embed_visual_query(text, dimensions: DEFAULT_VISUAL_DIMENSIONS)
  embed_query(text, dimensions: dimensions)
end

.model_name ⇒ String

Returns the model name for database storage.

Returns:

  • (String)

    Model name for database storage



# File 'app/services/embedding/gemini.rb', line 152

def model_name
  MODEL_NAME
end