Class: Seo::GeminiBatchClient

Inherits:
Object
Defined in:
app/services/seo/gemini_batch_client.rb

Overview

Faraday client for the Gemini Batch API (v1beta).
Submits SEO analysis requests at 50% of standard Gemini pricing.

API reference: https://ai.google.dev/gemini-api/docs/batch-api

Examples:

Submit inline batch

client = Seo::GeminiBatchClient.new
requests = items.map { |item| client.class.build_request(...) }
response = client.create_inline_batch('gemini-3-flash-preview', requests)
status = client.get_batch(response['name'])

Defined Under Namespace

Classes: BatchError, RateLimitError

Constant Summary

BASE_URL =

Base URL of the Gemini API.

'https://generativelanguage.googleapis.com'
API_VERSION =

API version segment used in request paths.

'v1beta'
TERMINAL_STATES =

Recognised terminal states. The Gemini Developer API
(generativelanguage v1beta) reports batch state as BATCH_STATE_*; older
Vertex-style payloads use JOB_STATE_*. Accept both so polling terminates
regardless of which enum the endpoint returns.

%w[
  BATCH_STATE_SUCCEEDED BATCH_STATE_FAILED BATCH_STATE_CANCELLED BATCH_STATE_EXPIRED
  JOB_STATE_SUCCEEDED JOB_STATE_FAILED JOB_STATE_CANCELLED JOB_STATE_EXPIRED
].freeze

Constructor Details

#initialize(api_key: nil) ⇒ GeminiBatchClient

Returns a new instance of GeminiBatchClient.

Raises:

  • (ArgumentError)


# File 'app/services/seo/gemini_batch_client.rb', line 35

def initialize(api_key: nil)
  @api_key = api_key || Heatwave::Configuration.fetch(:google, :gemini, :api_key)
  raise ArgumentError, 'Gemini API key is required' if @api_key.blank?
end

Class Method Details

.build_request(custom_id:, user_prompt:, system_prompt: nil, schema: nil, cached_content: nil, temperature: Seo::PageAnalysisService::TEMPERATURE, max_tokens: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS) ⇒ Hash

Build a single inline batch request entry.

Parameters:

  • custom_id (String)

    Unique identifier (e.g. "seo_page_123")

  • system_prompt (String) (defaults to: nil)

    System instruction text (ignored when cached_content is set)

  • user_prompt (String)

    User message text

  • schema (Hash) (defaults to: nil)

    JSON schema for structured output

  • cached_content (String) (defaults to: nil)

    Cache name from create_cache (omits systemInstruction)

  • temperature (Float) (defaults to: Seo::PageAnalysisService::TEMPERATURE)

    Sampling temperature

  • max_tokens (Integer) (defaults to: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS)

    Maximum output tokens

Returns:

  • (Hash)

    A request object for create_inline_batch



# File 'app/services/seo/gemini_batch_client.rb', line 168

def self.build_request(custom_id:, user_prompt:, system_prompt: nil,
                       schema: nil, cached_content: nil,
                       temperature: Seo::PageAnalysisService::TEMPERATURE,
                       max_tokens: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS)
  request_body = {
    contents: [
      { role: 'user', parts: [{ text: user_prompt }] }
    ],
    generationConfig: {
      temperature: temperature,
      maxOutputTokens: max_tokens,
      responseMimeType: 'application/json'
    }
  }

  if cached_content
    request_body[:cachedContent] = cached_content
  elsif system_prompt
    request_body[:systemInstruction] = { parts: [{ text: system_prompt }] }
  end

  request_body[:generationConfig][:responseJsonSchema] = normalize_schema(schema) if schema

  {
    request: request_body,
    metadata: { key: custom_id }
  }
end
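
As a rough illustration of the Hash that build_request returns, the sketch below writes out the entry for a call with a system prompt and no schema or cache. The prompt strings and the temperature and token values are placeholders; the real defaults come from Seo::PageAnalysisService constants that are not shown on this page.

```ruby
require 'json'

# Hypothetical values standing in for Seo::PageAnalysisService::TEMPERATURE
# and MAX_OUTPUT_TOKENS, which are not shown on this page.
temperature = 0.2
max_tokens = 1024

# Shape of the entry build_request would return for
# custom_id: 'seo_page_123' with a system prompt and no schema or cache.
entry = {
  request: {
    contents: [
      { role: 'user', parts: [{ text: 'Analyse this page.' }] }
    ],
    generationConfig: {
      temperature: temperature,
      maxOutputTokens: max_tokens,
      responseMimeType: 'application/json'
    },
    systemInstruction: { parts: [{ text: 'You are an SEO analyst.' }] }
  },
  metadata: { key: 'seo_page_123' }
}

puts JSON.pretty_generate(entry)
```

The custom_id ends up under metadata.key, which is what lets a response be matched back to its request later.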

.normalize_schema(schema) ⇒ Object

Normalize the ANALYSIS_SCHEMA (Ruby symbol keys) into the format Gemini
expects: string keys at every level, with additionalProperties removed
because Gemini does not use it.



# File 'app/services/seo/gemini_batch_client.rb', line 199

def self.normalize_schema(schema)
  deep_stringify = lambda do |obj|
    case obj
    when Hash
      obj.each_with_object({}) do |(k, v), h|
        next if k.to_s == 'additionalProperties'

        h[k.to_s] = deep_stringify.call(v)
      end
    when Array
      obj.map { |v| deep_stringify.call(v) }
    else
      obj
    end
  end

  deep_stringify.call(schema)
end
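
To make the effect of the normalization concrete, here is a small sketch that applies the same recursion to a made-up schema. normalize_schema_sketch is a standalone copy written for this example, and the schema is hypothetical; the real ANALYSIS_SCHEMA is not shown on this page.

```ruby
# Standalone copy of the normalization logic, so the example runs without
# the Seo::GeminiBatchClient class being loaded.
def normalize_schema_sketch(obj)
  case obj
  when Hash
    obj.each_with_object({}) do |(key, value), result|
      # Drop additionalProperties at any depth.
      next if key.to_s == 'additionalProperties'

      result[key.to_s] = normalize_schema_sketch(value)
    end
  when Array
    obj.map { |value| normalize_schema_sketch(value) }
  else
    # Scalars (and any non-Hash, non-Array values) pass through unchanged.
    obj
  end
end

# Hypothetical schema used only for illustration.
schema = {
  type: 'object',
  additionalProperties: false,
  properties: {
    title: { type: 'string' },
    issues: {
      type: 'array',
      items: { type: 'string', additionalProperties: false }
    }
  },
  required: %w[title issues]
}

normalized = normalize_schema_sketch(schema)
p normalized
```

Only Hash keys are converted; values are left as they are, so the input schema is not mutated and a new Hash is returned.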

Instance Method Details

#cancel_batch(batch_name) ⇒ Hash

Cancel a batch job in progress.

Parameters:

  • batch_name (String)

    The batch name

Returns:

  • (Hash)

    Updated batch status



# File 'app/services/seo/gemini_batch_client.rb', line 122

def cancel_batch(batch_name)
  response = connection.post("/#{API_VERSION}/#{batch_name}:cancel")
  handle_response(response)
end

#create_cache(model:, system_prompt:, ttl: '14400s') ⇒ String

Create a cached content object for the system prompt.
Returns the cache name (e.g. "cachedContents/abc123") for use in requests.

Parameters:

  • model (String)

    Gemini model ID without the "models/" prefix, which is added automatically (e.g. "gemini-3-flash-preview")

  • system_prompt (String)

    System instruction text to cache

  • ttl (String) (defaults to: '14400s')

    Time-to-live (e.g. "14400s" for 4 hours)

Returns:

  • (String)

    Cache name for use in build_request's cached_content param



# File 'app/services/seo/gemini_batch_client.rb', line 134

def create_cache(model:, system_prompt:, ttl: '14400s')
  body = {
    model: "models/#{model}",
    systemInstruction: { parts: [{ text: system_prompt }] },
    ttl: ttl
  }

  response = connection.post(
    "/#{API_VERSION}/cachedContents",
    body.to_json
  )

  result = handle_response(response)
  result['name']
end

#create_inline_batch(model, requests, display_name: nil) ⇒ Hash

Create a batch job with inline requests (suitable for batches <20MB).

Parameters:

  • model (String)

    Gemini model ID (e.g. "gemini-3-flash-preview")

  • requests (Array<Hash>)

    Array of request objects from build_request

  • display_name (String) (defaults to: nil)

    Human-readable batch name

Returns:

  • (Hash)

    Batch operation response with 'name' for polling



# File 'app/services/seo/gemini_batch_client.rb', line 46

def create_inline_batch(model, requests, display_name: nil)
  display_name ||= "seo-batch-#{Time.current.strftime('%Y%m%d-%H%M')}"

  body = {
    batch: {
      display_name: display_name,
      input_config: {
        requests: {
          requests: requests
        }
      }
    }
  }

  response = connection.post(
    "/#{API_VERSION}/models/#{model}:batchGenerateContent",
    body.to_json
  )

  handle_response(response)
end
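
Because inline submission is described as suitable only for batches under 20MB, a caller may want to check the serialized size before posting. The following is a minimal sketch of such a guard; inline_batch_fits? and INLINE_BATCH_LIMIT_BYTES are hypothetical names, and reading "20MB" as 20 * 1024 * 1024 bytes of JSON is an assumption.

```ruby
require 'json'

# Assumption: "20MB" is read as 20 * 1024 * 1024 bytes of serialized JSON.
INLINE_BATCH_LIMIT_BYTES = 20 * 1024 * 1024

# Builds the same body shape create_inline_batch posts and reports whether
# its JSON encoding stays under the assumed inline limit.
def inline_batch_fits?(requests, display_name: 'seo-batch-example')
  body = {
    batch: {
      display_name: display_name,
      input_config: { requests: { requests: requests } }
    }
  }
  body.to_json.bytesize < INLINE_BATCH_LIMIT_BYTES
end

small = [
  { request: { contents: [{ role: 'user', parts: [{ text: 'hello' }] }] },
    metadata: { key: 'seo_page_1' } }
]
puts inline_batch_fits?(small)
```

A batch that fails this check would need to be split into smaller batches or submitted through the file-based path that download_results anticipates.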

#delete_cache(cache_name) ⇒ Object

Delete a cached content object to stop ongoing storage charges.

Parameters:

  • cache_name (String)

    The cache name (e.g. "cachedContents/abc123")

Raises:

  • (BatchError)

    If the delete request does not succeed


# File 'app/services/seo/gemini_batch_client.rb', line 153

def delete_cache(cache_name)
  response = connection.delete("/#{API_VERSION}/#{cache_name}")
  raise BatchError, "Failed to delete cache: #{response.status}" unless response.success?
end
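
One way the cache calls might fit around a batch submission is sketched below. FakeCacheClient and submit_with_cache are hypothetical names introduced for illustration; the fake only mirrors the documented method signatures and makes no network calls. The sketch assumes the cache has to stay alive until the batch finishes, so it deletes the cache immediately only when submission fails and otherwise returns the cache name for later cleanup.

```ruby
# Hypothetical stand-in for Seo::GeminiBatchClient that records calls
# instead of hitting the network.
class FakeCacheClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def create_cache(model:, system_prompt:, ttl: '14400s')
    @calls << [:create_cache, model, ttl]
    'cachedContents/abc123'
  end

  def create_inline_batch(model, requests, display_name: nil)
    @calls << [:create_inline_batch, model, requests.size]
    { 'name' => 'batches/123456' }
  end

  def delete_cache(cache_name)
    @calls << [:delete_cache, cache_name]
    nil
  end
end

# Creates a cache, submits a batch that references it, and returns
# [batch_response, cache_name] so the caller can delete the cache later.
def submit_with_cache(client, model, system_prompt, prompts)
  cache_name = client.create_cache(model: model, system_prompt: system_prompt)
  requests = prompts.each_with_index.map do |prompt, index|
    # With the real class this entry would come from build_request with
    # cached_content: cache_name.
    { request: { cachedContent: cache_name,
                 contents: [{ role: 'user', parts: [{ text: prompt }] }] },
      metadata: { key: "seo_page_#{index}" } }
  end
  [client.create_inline_batch(model, requests), cache_name]
rescue StandardError
  # Submission failed, so nothing will use the cache: delete it right away.
  client.delete_cache(cache_name) if cache_name
  raise
end

client = FakeCacheClient.new
batch, cache_name = submit_with_cache(
  client, 'gemini-3-flash-preview', 'You are an SEO analyst.', %w[first second]
)
# Later, once the batch has reached a terminal state:
client.delete_cache(cache_name)
puts batch['name']
```

Deleting the cache as soon as it is no longer needed, rather than waiting for the TTL to lapse, is what stops the storage charges mentioned above.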

#download_results(file_name) ⇒ String

Download results from a file-based batch (for future use with large batches).

Parameters:

  • file_name (String)

    The file name (e.g. "files/abc123")

Returns:

  • (String)

    Raw JSONL content

Raises:

  • (BatchError)

    If the download request does not succeed


# File 'app/services/seo/gemini_batch_client.rb', line 110

def download_results(file_name)
  response = connection.get("/download/#{API_VERSION}/#{file_name}:download", alt: 'media')

  raise BatchError, "Failed to download results: #{response.status}" unless response.success?

  response.body
end
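
Since download_results returns raw JSONL, the caller still has to split and parse it. A minimal sketch of that step follows; parse_jsonl is a hypothetical helper, and the sample lines are invented, because the real per-line structure of a results file is not shown on this page.

```ruby
require 'json'

# Parses raw JSONL (one JSON document per line) into an Array of Hashes,
# skipping blank lines.
def parse_jsonl(raw)
  raw.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

# Hypothetical content used only to exercise the parser.
raw = <<~JSONL
  {"key": "seo_page_1", "response": {"ok": true}}

  {"key": "seo_page_2", "response": {"ok": false}}
JSONL

rows = parse_jsonl(raw)
puts rows.map { |row| row['key'] }.inspect
```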

#get_batch(batch_name) ⇒ Hash

Retrieve the current status of a batch job.

Parameters:

  • batch_name (String)

    The batch name (e.g. "batches/123456")

Returns:

  • (Hash)

    Batch status with 'metadata.state' and optionally 'response'



# File 'app/services/seo/gemini_batch_client.rb', line 72

def get_batch(batch_name)
  response = connection.get("/#{API_VERSION}/#{batch_name}")
  handle_response(response)
end

#inline_responses(batch_response) ⇒ Array<Hash>

Extract inline responses from a completed batch.

The v1beta inline batch response double-nests the array:
response.inlinedResponses.inlinedResponses => [ metadata, … ]
(mirroring the request's input_config.requests.requests). File-based and
older payloads return the array directly, so unwrap one level only when
the intermediate value is the wrapper Hash.

Parameters:

  • batch_response (Hash)

    The full batch response from get_batch

Returns:

  • (Array<Hash>)

    Array of response objects



# File 'app/services/seo/gemini_batch_client.rb', line 100

def inline_responses(batch_response)
  inlined = batch_response.dig('response', 'inlinedResponses')
  inlined = inlined['inlinedResponses'] if inlined.is_a?(Hash)
  inlined || []
end
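
The three cases the method handles can be seen side by side in the sketch below. inline_responses_sketch is a standalone copy of the logic above, and the payloads are hypothetical; only their nesting matters.

```ruby
# Standalone copy of the unwrapping logic for illustration.
def inline_responses_sketch(batch_response)
  inlined = batch_response.dig('response', 'inlinedResponses')
  inlined = inlined['inlinedResponses'] if inlined.is_a?(Hash)
  inlined || []
end

# Hypothetical entry; the real entries carry more fields.
entry = { 'metadata' => { 'key' => 'seo_page_123' } }

# Double-nested wrapper Hash, as described for v1beta inline batches.
nested = { 'response' => { 'inlinedResponses' => { 'inlinedResponses' => [entry] } } }
# Array returned directly, as described for file-based and older payloads.
flat = { 'response' => { 'inlinedResponses' => [entry] } }
# No 'response' key yet, for example while the batch is still running.
pending = { 'metadata' => {} }

p inline_responses_sketch(nested)
p inline_responses_sketch(flat)
p inline_responses_sketch(pending)
```

Both populated shapes yield the same one-element Array, and a response without results yields an empty Array rather than nil, so callers can iterate without a nil check.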

#state(batch_response) ⇒ Object

Extract the state string from a batch response.



# File 'app/services/seo/gemini_batch_client.rb', line 86

def state(batch_response)
  batch_response.dig('metadata', 'state')
end

#terminal?(batch_response) ⇒ Boolean

Check if a batch job has reached a terminal state. The top-level done
flag is the most reliable terminal signal; fall back to the state enum.

Returns:

  • (Boolean)


# File 'app/services/seo/gemini_batch_client.rb', line 79

def terminal?(batch_response)
  return true if batch_response['done'] == true

  TERMINAL_STATES.include?(state(batch_response))
end
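
A polling loop built on get_batch and terminal? could look roughly like the sketch below. FakePollingClient, wait_for_batch, and the interval and max_attempts parameters are hypothetical; the fake replays canned responses so the example runs without network access, and "STILL_RUNNING" is a placeholder for any non-terminal state.

```ruby
TERMINAL_STATES_SKETCH = %w[
  BATCH_STATE_SUCCEEDED BATCH_STATE_FAILED BATCH_STATE_CANCELLED BATCH_STATE_EXPIRED
  JOB_STATE_SUCCEEDED JOB_STATE_FAILED JOB_STATE_CANCELLED JOB_STATE_EXPIRED
].freeze

# Hypothetical stand-in for the client: replays canned get_batch responses
# and copies the state and terminal? logic shown above.
class FakePollingClient
  def initialize(responses)
    @responses = responses
  end

  def get_batch(_batch_name)
    @responses.size > 1 ? @responses.shift : @responses.first
  end

  def state(batch_response)
    batch_response.dig('metadata', 'state')
  end

  def terminal?(batch_response)
    return true if batch_response['done'] == true

    TERMINAL_STATES_SKETCH.include?(state(batch_response))
  end
end

# Polls until the batch is terminal or max_attempts is exhausted.
def wait_for_batch(client, batch_name, interval: 30, max_attempts: 120)
  max_attempts.times do
    response = client.get_batch(batch_name)
    return response if client.terminal?(response)

    sleep(interval)
  end
  raise "Batch #{batch_name} did not finish after #{max_attempts} attempts"
end

running = { 'metadata' => { 'state' => 'STILL_RUNNING' } }
finished = { 'done' => true, 'metadata' => { 'state' => 'BATCH_STATE_SUCCEEDED' } }

client = FakePollingClient.new([running, running, finished])
result = wait_for_batch(client, 'batches/123456', interval: 0)
puts client.state(result)
```

Capping the number of attempts keeps a stuck batch from blocking the caller indefinitely; a terminal result still needs its state inspected, since failed, cancelled, and expired batches are terminal too.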