Class: ImageGeneration::Service

Inherits:
Object
Defined in:
app/services/image_generation/service.rb

Overview

Provider-agnostic image generation service.

Supports any model that RubyLLM.paint understands (OpenAI, Google Gemini,
Imagen). Models whose provider is :google are sent to a direct
generateContent REST API call, because Gemini is the only provider here that
accepts inline reference images. Other providers go through RubyLLM and
ignore reference images.

Usage

Text-to-image (any provider)

service = ImageGeneration::Service.new(model: "dall-e-3")
jpeg = service.generate("A heated marble floor in a spa")

Image-to-image with references (Gemini only; other providers ignore refs)

service = ImageGeneration::Service.new(model: "gemini-3.1-flash-image")
jpeg = service.generate("Make it warmer", reference_images: [img1, img2])

Adding a new model

  1. Confirm the RubyLLM model ID via RubyLLM.models.
  2. Add an entry to MODELS with the correct :provider.
  3. Set :supports_references to true only for Gemini generateContent models.
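Steps 2 and 3 can be sketched roughly as follows. The model ID "example-image-model", the constant names, and the valid_entry? helper are hypothetical and only illustrate the shape of an entry; the sketch builds a standalone hash and does not touch the real MODELS constant.

```ruby
# Hypothetical entry; "example-image-model" is a placeholder, not a real model ID.
NEW_ENTRY = {
  'example-image-model' => {
    label:               'Example Image Model',
    provider:            :openai,
    supports_references: false, # true only for Gemini generateContent models
    description:         'Placeholder description shown to users.'
  }
}.freeze

# Keys that every entry in the documented registry carries.
REQUIRED_KEYS = %i[label provider supports_references description].freeze

# Hypothetical sanity check: every required key must be present.
def valid_entry?(cfg)
  REQUIRED_KEYS.all? { |key| cfg.key?(key) }
end

NEW_ENTRY.each do |id, cfg|
  puts "#{id}: #{valid_entry?(cfg) ? 'ok' : 'missing keys'}"
end
```

A check like this could run in a unit test so that an incomplete entry is caught before it reaches the model picker.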

Defined Under Namespace

Classes: AuthenticationError, ContentPolicyError, Error, GenerateResult, NoImageReturnedError

Constant Summary

MODELS =

Model registry

{
  # Google — Gemini native image generation (supports reference images via generateContent)
  # Model IDs verified against the v1beta ListModels endpoint on 2026-06-05.
  # The "Nano Banana" family is GA; default is Nano Banana 2 (gemini-3.1-flash-image).
  'gemini-3.1-flash-image' => {
    label:               'Gemini 3.1 Flash Image (Nano Banana 2)',
    provider:            :google,
    supports_references: true,
    default:             true,
    description:         'Latest fast native image model ("Nano Banana 2"). Strong prompt ' \
                         'adherence and colour fidelity. Supports reference images for style ' \
                         'transfer and variation. Best all-round default. ~45–90 s.'
  },
  'gemini-3-pro-image' => {
    label:               'Gemini 3 Pro Image (Nano Banana Pro)',
    provider:            :google,
    supports_references: true,
    premium:             true,
    description:         'Highest-quality Gemini image model ("Nano Banana Pro"). Best for ' \
                         'complex scenes, fine detail, and demanding photorealism. Supports ' \
                         'reference images. Slower and higher cost — use when quality is ' \
                         'critical. ~90–180 s.'
  },
  'gemini-2.5-flash-image' => {
    label:               'Gemini 2.5 Flash Image (Nano Banana)',
    provider:            :google,
    supports_references: true,
    description:         'Previous-generation fast image model ("Nano Banana"). Reliable and ' \
                         'lower cost. Supports reference images. Good fallback when 3.1 Flash ' \
                         'Image is unavailable. ~30–60 s.'
  },
  # OpenAI
  'dall-e-3' => {
    label:               'DALL-E 3',
    provider:            :openai,
    supports_references: false,
    description:         'OpenAI\'s creative image model. Excels at artistic styles, ' \
                         'illustration, and abstract concepts. Does not support reference ' \
                         'images — generates entirely from the text prompt. ~15–30 s.'
  },
  'chatgpt-image-latest' => {
    label:               'GPT Image (latest)',
    provider:            :openai,
    supports_references: false,
    description:         'OpenAI\'s most advanced image model. Combines deep language ' \
                         'understanding with high-fidelity rendering. Great for product and ' \
                         'marketing visuals. No reference image support. ~20–45 s.'
  }
}.freeze
DEFAULT_MODEL =

Default model.

MODELS.find { |_, cfg| cfg[:default] }&.first ||
'gemini-3.1-flash-image'
PREMIUM_MODEL =

Premium / highest-quality model — the single source for callers that want
to explicitly request the top-tier image model instead of hardcoding its id.

MODELS.find { |_, cfg| cfg[:premium] }&.first ||
'gemini-3-pro-image'
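
Both DEFAULT_MODEL and PREMIUM_MODEL use the same flag lookup: take the first registry entry carrying the flag, otherwise fall back to a literal ID. A minimal sketch of that pattern, using a trimmed-down registry and a hypothetical helper named model_with_flag:

```ruby
# Trimmed-down stand-in for the registry; only the flags matter here.
SAMPLE_MODELS = {
  'gemini-3.1-flash-image' => { provider: :google, default: true },
  'gemini-3-pro-image'     => { provider: :google, premium: true },
  'dall-e-3'               => { provider: :openai }
}.freeze

# Hypothetical helper: ID of the first entry carrying `flag`, else `fallback`.
# Hash#find returns a [key, value] pair or nil, so &.first yields the key.
def model_with_flag(models, flag, fallback)
  models.find { |_id, cfg| cfg[flag] }&.first || fallback
end

puts model_with_flag(SAMPLE_MODELS, :default, 'fallback-id') # gemini-3.1-flash-image
puts model_with_flag(SAMPLE_MODELS, :premium, 'fallback-id') # gemini-3-pro-image
puts model_with_flag({}, :default, 'fallback-id')            # fallback-id
```

Because the lookup takes the first match, flagging two entries as default would silently pick whichever appears first in the hash.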

Constructor Details

#initialize(model: DEFAULT_MODEL) ⇒ Service


Creates a service for the given model ID. IDs missing from MODELS fall back
to DEFAULT_MODEL.



# File 'app/services/image_generation/service.rb', line 129

def initialize(model: DEFAULT_MODEL)
  resolved = MODELS.key?(model.to_s) ? model.to_s : DEFAULT_MODEL
  @model   = resolved
  @config  = MODELS[@model]
end
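
The fallback behaviour of the constructor can be tried in isolation. The sketch below uses a hypothetical stand-in class, DemoService, with a two-entry registry and readers added for inspection; it is not the real service.

```ruby
# Minimal stand-in class; DemoService and its registry are hypothetical.
class DemoService
  MODELS = {
    'gemini-3.1-flash-image' => { provider: :google, default: true },
    'dall-e-3'               => { provider: :openai }
  }.freeze
  DEFAULT_MODEL = 'gemini-3.1-flash-image'

  attr_reader :model, :config

  def initialize(model: DEFAULT_MODEL)
    # Symbols and strings are both accepted; unknown IDs fall back silently.
    @model  = MODELS.key?(model.to_s) ? model.to_s : DEFAULT_MODEL
    @config = MODELS[@model]
  end
end

puts DemoService.new(model: :'dall-e-3').model      # dall-e-3
puts DemoService.new(model: 'no-such-model').model  # gemini-3.1-flash-image
puts DemoService.new(model: nil).model              # gemini-3.1-flash-image
```

Note that the fallback is silent: a typo in a model ID produces images from the default model rather than an error.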

Class Method Details

.available? ⇒ Boolean

True if at least one model's provider is configured with credentials.

Returns:

  • (Boolean)


# File 'app/services/image_generation/service.rb', line 109

def self.available?
  available_models.any?
end

.available_models ⇒ Object

Returns the subset of MODELS whose provider credentials are present.
Keys are model IDs; values are the config hashes.



# File 'app/services/image_generation/service.rb', line 115

def self.available_models
  MODELS.select { |_id, cfg| provider_configured?(cfg[:provider]) }
end
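
The body of provider_configured? is not shown on this page. The sketch below assumes it checks an API key in the environment; the variable names GEMINI_API_KEY and OPENAI_API_KEY, the constant names, and the injectable env argument are all assumptions made for illustration.

```ruby
# Assumed mapping from provider to the environment variable holding its key.
PROVIDER_ENV_KEYS = { google: 'GEMINI_API_KEY', openai: 'OPENAI_API_KEY' }.freeze

# Trimmed-down stand-in for the registry.
DEMO_REGISTRY = {
  'gemini-3.1-flash-image' => { provider: :google },
  'dall-e-3'               => { provider: :openai }
}.freeze

# Hypothetical stand-in for provider_configured?: true when the key is non-empty.
def provider_configured?(provider, env = ENV)
  key = PROVIDER_ENV_KEYS[provider]
  !key.nil? && !env[key].to_s.strip.empty?
end

# Same filtering as .available_models, with the environment passed in for testing.
def available_models(env = ENV)
  DEMO_REGISTRY.select { |_id, cfg| provider_configured?(cfg[:provider], env) }
end

fake_env = { 'OPENAI_API_KEY' => 'sk-test' }
puts available_models(fake_env).keys.inspect # ["dall-e-3"]
puts available_models({}).any?               # false
```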

.model_label(model_id) ⇒ Object

Human-readable label for any model ID in the registry, with a sensible
fallback for unknown IDs.



# File 'app/services/image_generation/service.rb', line 121

def self.model_label(model_id)
  MODELS.dig(model_id.to_s, :label) || model_id.to_s
end
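
A short sketch of the label fallback, using a one-entry stand-in registry (LABEL_REGISTRY is a hypothetical name). Hash#dig returns nil as soon as a key is missing, so unknown IDs are echoed back as strings.

```ruby
# One-entry stand-in for the registry.
LABEL_REGISTRY = {
  'dall-e-3' => { label: 'DALL-E 3' }
}.freeze

def model_label(model_id)
  # dig returns nil for a missing ID instead of raising, so the || applies.
  LABEL_REGISTRY.dig(model_id.to_s, :label) || model_id.to_s
end

puts model_label(:'dall-e-3')     # DALL-E 3
puts model_label('mystery-model') # mystery-model
```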

Instance Method Details

#generate(prompt, reference_images: [], aspect_ratio: '1:1', image_size: '1K') ⇒ Object

Generates an image from the prompt. Models whose provider is :google go
through the Gemini REST path and receive the reference images, aspect ratio,
and image size. All other models go through RubyLLM with the prompt only; if
reference images were supplied, a warning is logged and they are ignored.



# File 'app/services/image_generation/service.rb', line 153

def generate(prompt, reference_images: [], aspect_ratio: '1:1', image_size: '1K')
  refs = Array(reference_images).compact

  if @config[:provider] == :google
    generate_via_gemini_rest(prompt, refs, aspect_ratio, image_size)
  elsif refs.any?
    Rails.logger.warn "[ImageGeneration::Service] #{@model} does not support " \
                      "reference images — generating text-to-image instead"
    generate_via_ruby_llm(prompt)
  else
    generate_via_ruby_llm(prompt)
  end
end
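
The dispatch above can be exercised without calling any API. In this sketch the class DispatchDemo is hypothetical, both backends are replaced by symbols, and the Rails logger is replaced by an in-memory warnings array.

```ruby
# Hypothetical stand-in for the dispatch in #generate; no API calls are made.
class DispatchDemo
  attr_reader :warnings

  def initialize(provider)
    @provider = provider
    @warnings = []
  end

  def generate(_prompt, reference_images: [])
    # Tolerates nil and drops nil entries, as the real method does.
    refs = Array(reference_images).compact

    # Google models always take the REST path, with or without references.
    return :gemini_rest if @provider == :google

    # Other providers ignore references; record a warning instead of logging.
    @warnings << 'reference images ignored' if refs.any?
    :ruby_llm
  end
end

google = DispatchDemo.new(:google)
openai = DispatchDemo.new(:openai)

puts google.generate('Make it warmer', reference_images: ['img1']) # gemini_rest
puts openai.generate('Make it warmer', reference_images: [nil])    # ruby_llm
puts openai.warnings.size                                          # 0
puts openai.generate('Make it warmer', reference_images: ['img1']) # ruby_llm
puts openai.warnings.size                                          # 1
```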