Class: Seo::GeminiBatchClient

Inherits: Object

Defined in: app/services/seo/gemini_batch_client.rb

Overview

Faraday client for the Gemini Batch API (v1beta).
Submits SEO analysis requests at 50% of standard Gemini pricing.

API reference: https://ai.google.dev/gemini-api/docs/batch-api

Examples:

Submit inline batch

client = Seo::GeminiBatchClient.new
requests = items.map { |item| client.class.build_request(...) }
response = client.create_inline_batch('gemini-3-flash-preview', requests)
status = client.get_batch(response['name'])

Defined Under Namespace

Classes: BatchError, RateLimitError

Constant Summary

BASE_URL = Base URL of the Gemini API host.

'https://generativelanguage.googleapis.com'

API_VERSION = API version segment used in request paths.

'v1beta'

TERMINAL_STATES = Recognised terminal states. The Gemini Developer API (generativelanguage v1beta) reports batch state with a BATCH_STATE_ prefix; older Vertex-style payloads use a JOB_STATE_ prefix. Both are accepted so that polling terminates regardless of which enum the endpoint returns.

%w[
  BATCH_STATE_SUCCEEDED BATCH_STATE_FAILED BATCH_STATE_CANCELLED BATCH_STATE_EXPIRED
  JOB_STATE_SUCCEEDED JOB_STATE_FAILED JOB_STATE_CANCELLED JOB_STATE_EXPIRED
].freeze

Class Method Summary

.build_request(custom_id:, user_prompt:, system_prompt: nil, schema: nil, cached_content: nil, temperature: Seo::PageAnalysisService::TEMPERATURE, max_tokens: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS) ⇒ Hash
Build a single inline batch request entry.
.normalize_schema(schema) ⇒ Object
Normalize the ANALYSIS_SCHEMA (Ruby symbol keys) into the format Gemini expects: string keys, with additionalProperties removed because Gemini doesn't use it.

Instance Method Summary

#cancel_batch(batch_name) ⇒ Hash
Cancel a batch job in progress.
#create_cache(model:, system_prompt:, ttl: '14400s') ⇒ String
Create a cached content object for the system prompt.
#create_inline_batch(model, requests, display_name: nil) ⇒ Hash
Create a batch job with inline requests (suitable for batches <20MB).
#delete_cache(cache_name) ⇒ Object
Delete a cached content object to stop ongoing storage charges.
#download_results(file_name) ⇒ String
Download results from a file-based batch (for future use with large batches).
#get_batch(batch_name) ⇒ Hash
Retrieve the current status of a batch job.
#initialize(api_key: nil) ⇒ GeminiBatchClient (constructor)
A new instance of GeminiBatchClient.
#inline_responses(batch_response) ⇒ Array<Hash>
Extract inline responses from a completed batch.
#state(batch_response) ⇒ Object
Extract the state string from a batch response.
#terminal?(batch_response) ⇒ Boolean
Check if a batch job has reached a terminal state.

Constructor Details

#initialize(api_key: nil) ⇒ `GeminiBatchClient`

Returns a new instance of GeminiBatchClient.

Raises:

(ArgumentError)
Raised when no API key is passed and none is found in the configuration.

# File 'app/services/seo/gemini_batch_client.rb', line 35

def initialize(api_key: nil)
  @api_key = api_key || Heatwave::Configuration.fetch(:google, :gemini, :api_key)
  raise ArgumentError, 'Gemini API key is required' if @api_key.blank?
end

Class Method Details

.build_request(custom_id:, user_prompt:, system_prompt: nil, schema: nil, cached_content: nil, temperature: Seo::PageAnalysisService::TEMPERATURE, max_tokens: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS) ⇒ `Hash`

Build a single inline batch request entry.

Parameters:

custom_id (String) —
Unique identifier (e.g. "seo_page_123")
system_prompt (String) (defaults to: nil) —
System instruction text (ignored when cached_content is set)
user_prompt (String) —
User message text
schema (Hash) (defaults to: nil) —
JSON schema for structured output
cached_content (String) (defaults to: nil) —
Cache name from create_cache (omits systemInstruction)
temperature (Float) (defaults to: Seo::PageAnalysisService::TEMPERATURE) —
Sampling temperature
max_tokens (Integer) (defaults to: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS) —
Maximum output tokens

Returns:

(Hash) —
A request object for create_inline_batch

# File 'app/services/seo/gemini_batch_client.rb', line 168

def self.build_request(custom_id:, user_prompt:, system_prompt: nil,
                       schema: nil, cached_content: nil,
                       temperature: Seo::PageAnalysisService::TEMPERATURE,
                       max_tokens: Seo::PageAnalysisService::MAX_OUTPUT_TOKENS)
  request_body = {
    contents: [
      { role: 'user', parts: [{ text: user_prompt }] }
    ],
    generationConfig: {
      temperature: temperature,
      maxOutputTokens: max_tokens,
      responseMimeType: 'application/json'
    }
  }

  if cached_content
    request_body[:cachedContent] = cached_content
  elsif system_prompt
    request_body[:systemInstruction] = { parts: [{ text: system_prompt }] }
  end

  request_body[:generationConfig][:responseJsonSchema] = normalize_schema(schema) if schema

  {
    request: request_body,
    metadata: { key: custom_id }
  }
end
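
To make the precedence rule concrete (system_prompt is ignored when cached_content is set), the sketch below isolates only the branch that chooses between cachedContent and systemInstruction. The method name prompt_fields is hypothetical and is not part of Seo::GeminiBatchClient; it exists so the snippet can run without the application loaded.

```ruby
# Hypothetical, standalone copy of the prompt-selection branch in build_request.
# Returns only the keys that branch would add to the request body.
def prompt_fields(system_prompt: nil, cached_content: nil)
  if cached_content
    # A cache name wins: the system prompt already lives in the cached content.
    { cachedContent: cached_content }
  elsif system_prompt
    { systemInstruction: { parts: [{ text: system_prompt }] } }
  else
    {}
  end
end

# Both given: only cachedContent is emitted.
p prompt_fields(system_prompt: 'You are an SEO analyst.',
                cached_content: 'cachedContents/abc123')

# Only a system prompt: it is sent inline as systemInstruction.
p prompt_fields(system_prompt: 'You are an SEO analyst.')
```

In practice this means a caller can pass both arguments unconditionally and let the cache name take over whenever one is available.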

.normalize_schema(schema) ⇒ `Object`

Normalize the ANALYSIS_SCHEMA (Ruby symbol keys) into the format Gemini
expects: string keys, with additionalProperties removed because Gemini doesn't use it.

# File 'app/services/seo/gemini_batch_client.rb', line 199

def self.normalize_schema(schema)
  deep_stringify = lambda do |obj|
    case obj
    when Hash
      obj.each_with_object({}) do |(k, v), h|
        next if k.to_s == 'additionalProperties'

        h[k.to_s] = deep_stringify.call(v)
      end
    when Array
      obj.map { |v| deep_stringify.call(v) }
    else
      obj
    end
  end

  deep_stringify.call(schema)
end
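
A minimal sketch of the effect, assuming a small made-up schema rather than the real ANALYSIS_SCHEMA. The logic is copied into a standalone method (normalize_schema_sketch, a hypothetical name) so the snippet runs outside the application.

```ruby
# Standalone copy of the normalization logic, written recursively.
def normalize_schema_sketch(schema)
  case schema
  when Hash
    schema.each_with_object({}) do |(key, value), out|
      # Drop the key entirely, at every nesting level.
      next if key.to_s == 'additionalProperties'

      out[key.to_s] = normalize_schema_sketch(value)
    end
  when Array
    schema.map { |value| normalize_schema_sketch(value) }
  else
    schema # scalar values are returned unchanged; only Hash keys are stringified
  end
end

# Hypothetical schema, for illustration only.
example_schema = {
  type: 'object',
  additionalProperties: false,
  properties: {
    title: { type: 'string' },
    tags: { type: 'array', items: { type: 'string', additionalProperties: false } }
  },
  required: %w[title]
}

p normalize_schema_sketch(example_schema)
```

The input Hash is not mutated; a new structure is built on each call.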

Instance Method Details

#cancel_batch(batch_name) ⇒ `Hash`

Cancel a batch job in progress.

Parameters:

batch_name (String) —
The batch name

Returns:

(Hash) —
Updated batch status

# File 'app/services/seo/gemini_batch_client.rb', line 122

def cancel_batch(batch_name)
  response = connection.post("/#{API_VERSION}/#{batch_name}:cancel")
  handle_response(response)
end

#create_cache(model:, system_prompt:, ttl: '14400s') ⇒ `String`

Create a cached content object for the system prompt.
Returns the cache name (e.g. "cachedContents/abc123") for use in requests.

Parameters:

model (String) —
Gemini model ID without the "models/" prefix (e.g. "gemini-3-flash-preview"); the method prepends "models/" itself
system_prompt (String) —
System instruction text to cache
ttl (String) (defaults to: '14400s') —
Time-to-live (e.g. "14400s" for 4 hours)

Returns:

(String) —
Cache name for use in build_request's cached_content param

# File 'app/services/seo/gemini_batch_client.rb', line 134

def create_cache(model:, system_prompt:, ttl: '14400s')
  body = {
    model: "models/#{model}",
    systemInstruction: { parts: [{ text: system_prompt }] },
    ttl: ttl
  }

  response = connection.post(
    "/#{API_VERSION}/cachedContents",
    body.to_json
  )

  result = handle_response(response)
  result['name']
end

#create_inline_batch(model, requests, display_name: nil) ⇒ `Hash`

Create a batch job with inline requests (suitable for batches <20MB).

Parameters:

model (String) —
Gemini model ID (e.g. "gemini-3-flash-preview")
requests (Array<Hash>) —
Array of request objects from build_request
display_name (String) (defaults to: nil) —
Human-readable batch name

Returns:

(Hash) —
Batch operation response with 'name' for polling

# File 'app/services/seo/gemini_batch_client.rb', line 46

def create_inline_batch(model, requests, display_name: nil)
  display_name ||= "seo-batch-#{Time.current.strftime('%Y%m%d-%H%M')}"

  body = {
    batch: {
      display_name: display_name,
      input_config: {
        requests: {
          requests: requests
        }
      }
    }
  }

  response = connection.post(
    "/#{API_VERSION}/models/#{model}:batchGenerateContent",
    body.to_json
  )

  handle_response(response)
end
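
Since inline batches are described as suitable below 20MB, a caller may want to split a large request list before submitting. The sketch below is one possible approach, not part of the client: slice_requests_by_size is a hypothetical helper, the 20 MiB default is an assumption about how the limit is measured, and the size estimate ignores the small overhead of the surrounding batch wrapper.

```ruby
require 'json'

# Hypothetical helper: greedily group requests so the summed JSON size of
# each group stays at or under max_bytes. A single request larger than
# max_bytes still gets a group of its own.
def slice_requests_by_size(requests, max_bytes: 20 * 1024 * 1024)
  groups = [[]]
  current_bytes = 0

  requests.each do |request|
    size = request.to_json.bytesize
    if current_bytes + size > max_bytes && !groups.last.empty?
      groups << []        # start a new group
      current_bytes = 0
    end
    groups.last << request
    current_bytes += size
  end

  groups.reject(&:empty?)
end
```

Each resulting group could then be passed to create_inline_batch as its own batch, with a distinct display_name if the batches need to be told apart.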

#delete_cache(cache_name) ⇒ `Object`

Delete a cached content object to stop ongoing storage charges.

Parameters:

cache_name (String) —
The cache name (e.g. "cachedContents/abc123")

Raises:

(BatchError)
Raised when the API responds with a non-success status.

# File 'app/services/seo/gemini_batch_client.rb', line 153

def delete_cache(cache_name)
  response = connection.delete("/#{API_VERSION}/#{cache_name}")
  raise BatchError, "Failed to delete cache: #{response.status}" unless response.success?
end
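
Because a cache keeps accruing storage charges until it is deleted or its TTL expires, it can help to tie create_cache and delete_cache together. The following is a sketch under assumptions: with_prompt_cache is a hypothetical helper, not part of Seo::GeminiBatchClient, and client is any object that responds to create_cache and delete_cache as documented here.

```ruby
# Hypothetical helper: create a cache, yield its name, and always delete it
# afterwards, even if the block raises.
def with_prompt_cache(client, model:, system_prompt:, ttl: '14400s')
  cache_name = client.create_cache(model: model, system_prompt: system_prompt, ttl: ttl)
  yield cache_name
ensure
  # cache_name is nil if create_cache itself raised, so nothing is deleted then.
  client.delete_cache(cache_name) if cache_name
end
```

The block should probably cover the whole batch lifetime (submission and polling until a terminal state), on the assumption that requests referencing the cache need it to exist while the batch runs.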

#download_results(file_name) ⇒ `String`

Download results from a file-based batch (for future use with large batches).

Parameters:

file_name (String) —
The file name (e.g. "files/abc123")

Returns:

(String) —
Raw JSONL content

Raises:

(BatchError)
Raised when the API responds with a non-success status.

# File 'app/services/seo/gemini_batch_client.rb', line 110

def download_results(file_name)
  response = connection.get("/download/#{API_VERSION}/#{file_name}:download", alt: 'media')

  raise BatchError, "Failed to download results: #{response.status}" unless response.success?

  response.body
end
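
The raw JSONL returned here still has to be parsed by the caller. A minimal sketch, assuming only that each non-blank line is one complete JSON object (the shape of each object is not specified in this page, and parse_jsonl is a hypothetical helper):

```ruby
require 'json'

# Hypothetical helper: turn raw JSONL into an Array of Hashes,
# skipping blank lines such as a trailing newline.
def parse_jsonl(raw)
  raw.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end
```

JSON.parse raises JSON::ParserError on a malformed line, so a caller that expects partial or truncated files may want to rescue per line instead.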

#get_batch(batch_name) ⇒ `Hash`

Retrieve the current status of a batch job.

Parameters:

batch_name (String) —
The batch name (e.g. "batches/123456")

Returns:

(Hash) —
Batch status with 'metadata.state' and optionally 'response'

# File 'app/services/seo/gemini_batch_client.rb', line 72

def get_batch(batch_name)
  response = connection.get("/#{API_VERSION}/#{batch_name}")
  handle_response(response)
end

#inline_responses(batch_response) ⇒ `Array<Hash>`

Extract inline responses from a completed batch.

The v1beta inline batch response double-nests the array:
response.inlinedResponses.inlinedResponses => [ metadata, … ]
(mirroring the request's input_config.requests.requests). File-based and
older payloads return the array directly, so unwrap one level only when
the intermediate value is the wrapper Hash.

Parameters:

batch_response (Hash) —
The full batch response from get_batch

Returns:

(Array<Hash>) —
Array of response objects

# File 'app/services/seo/gemini_batch_client.rb', line 100

def inline_responses(batch_response)
  inlined = batch_response.dig('response', 'inlinedResponses')
  inlined = inlined['inlinedResponses'] if inlined.is_a?(Hash)
  inlined || []
end
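
The unwrapping rule can be checked against the payload shapes described above. This sketch uses a standalone copy of the method (inline_responses_sketch, a hypothetical name) and made-up payloads whose entry contents are placeholders.

```ruby
# Standalone copy of the unwrapping logic, so the snippet runs without the app.
def inline_responses_sketch(batch_response)
  inlined = batch_response.dig('response', 'inlinedResponses')
  inlined = inlined['inlinedResponses'] if inlined.is_a?(Hash)
  inlined || []
end

# Hypothetical payloads for illustration.
entry         = { 'metadata' => { 'key' => 'seo_page_123' } }
double_nested = { 'response' => { 'inlinedResponses' => { 'inlinedResponses' => [entry] } } }
flat          = { 'response' => { 'inlinedResponses' => [entry] } }
unfinished    = { 'metadata' => { 'state' => 'BATCH_STATE_RUNNING' } }

[double_nested, flat, unfinished].each do |payload|
  puts inline_responses_sketch(payload).length
end
# prints 1, 1 and 0
```

A batch with no response key yet yields an empty Array rather than nil, so callers can iterate over the result without a nil check.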

#state(batch_response) ⇒ `Object`

Extract the state string from a batch response.



# File 'app/services/seo/gemini_batch_client.rb', line 86

def state(batch_response)
  batch_response.dig('metadata', 'state')
end

#terminal?(batch_response) ⇒ `Boolean`

Check if a batch job has reached a terminal state. The top-level done
flag is the most reliable terminal signal; fall back to the state enum.

Returns:

(Boolean)

# File 'app/services/seo/gemini_batch_client.rb', line 79

def terminal?(batch_response)
  return true if batch_response['done'] == true

  TERMINAL_STATES.include?(state(batch_response))
end
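
A polling loop built on get_batch and terminal? might look like the sketch below. wait_for_batch is a hypothetical helper, not part of the client; the interval and attempt limit are arbitrary placeholder values, and the sleeper parameter exists only so the loop can be exercised without real waiting.

```ruby
# Hypothetical helper: poll until the client reports a terminal batch.
# Returns the final batch Hash, or nil if max_attempts polls were not enough.
def wait_for_batch(client, batch_name, interval: 30, max_attempts: 120,
                   sleeper: Kernel.method(:sleep))
  max_attempts.times do
    batch = client.get_batch(batch_name)
    return batch if client.terminal?(batch)

    sleeper.call(interval)
  end
  nil
end
```

A terminal batch is not necessarily a successful one, so the caller would still inspect state on the returned Hash before reading inline responses.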
