Class: ImageGeneration::Service

Inherits:
Object
  • Object
Defined in:
app/services/image_generation/service.rb

Overview

Provider-agnostic image generation service.

Supports any model that RubyLLM.paint understands (OpenAI, Google Gemini,
Imagen). Google models are sent directly to the generateContent REST API,
because Gemini is the only provider that accepts inline reference images
this way. All other providers are generated through RubyLLM and ignore
reference images.

Usage

Text-to-image (any provider)

service = ImageGeneration::Service.new(model: "dall-e-3")
jpeg = service.generate("A heated marble floor in a spa")

Image-to-image with references (Gemini only; other providers ignore refs)

service = ImageGeneration::Service.new(model: "gemini-2.5-flash-image")
jpeg = service.generate("Make it warmer", reference_images: [img1, img2])

Adding a new model

  1. Confirm the RubyLLM model ID via RubyLLM.models.
  2. Add an entry to MODELS with the correct :provider.
  3. Set :supports_references to true only for Gemini generateContent models.
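The three steps above might look like the following sketch. The ID 'example-image-model' and its values are hypothetical placeholders, MODELS_SKETCH is a cut-down stand-in for the real MODELS constant, and registry_problems is an illustrative helper that is not part of the service.

```ruby
# Cut-down stand-in for ImageGeneration::Service::MODELS (assumed same shape).
MODELS_SKETCH = {
  'gemini-2.5-flash-image' => {
    label:               'Gemini 2.5 Flash Image',
    provider:            :google,
    supports_references: true,
    default:             true
  },
  # Hypothetical new entry; the ID must match what RubyLLM.models reports.
  'example-image-model' => {
    label:               'Example Image Model',
    provider:            :openai,
    supports_references: false, # true only for Gemini generateContent models
    description:         'Placeholder description.'
  }
}.freeze

REQUIRED_KEYS = %i[label provider supports_references].freeze

# Illustrative sanity check to run after editing the registry.
def registry_problems(models)
  problems = []
  models.each do |id, cfg|
    missing = REQUIRED_KEYS.reject { |key| cfg.key?(key) }
    problems << "#{id}: missing #{missing.join(', ')}" if missing.any?
    if cfg[:supports_references] && cfg[:provider] != :google
      problems << "#{id}: references are only supported for :google"
    end
  end
  defaults = models.count { |_, cfg| cfg[:default] }
  problems << "expected at most one default, found #{defaults}" if defaults > 1
  problems
end

puts registry_problems(MODELS_SKETCH).inspect
```

A check of this kind catches the two easiest mistakes: marking a non-Google model as reference-capable, and flagging more than one entry as the default.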

Defined Under Namespace

Classes: AuthenticationError, ContentPolicyError, Error, GenerateResult, NoImageReturnedError

Constant Summary

MODELS =

Model registry

{
  # Google — Gemini native image generation (supports reference images via generateContent)
  # Model IDs verified against v1beta ListModels endpoint on 2026-03-01.
  'gemini-2.5-flash-image' => {
    label:               'Gemini 2.5 Flash Image',
    provider:            :google,
    supports_references: true,
    default:             true,
    description:         'Fast, versatile and reliable. Supports reference images for style ' \
                         'transfer and variation. Great all-round choice for product shots, ' \
                         'edits, and photorealistic renders. ~30–60 s generation time.'
  },
  'gemini-3.1-flash-image-preview' => {
    label:               'Gemini 3.1 Flash Image (preview)',
    provider:            :google,
    supports_references: true,
    description:         'Next-generation Flash model. Improved prompt adherence and colour ' \
                         'fidelity over 2.5. Still supports reference images. Preview — may ' \
                         'occasionally be unavailable. ~45–90 s.'
  },
  'gemini-3-pro-image-preview' => {
    label:               'Gemini 3 Pro Image (preview)',
    provider:            :google,
    supports_references: true,
    description:         'Highest-quality Gemini image model. Best for complex scenes, fine ' \
                         'detail, and demanding photorealism. Supports reference images. ' \
                         'Slower and higher cost — use when quality is critical. ~90–180 s.'
  },
  'nano-banana-pro-preview' => {
    label:               'Nano Banana Pro (preview)',
    provider:            :google,
    supports_references: true,
    description:         'Google\'s experimental next-gen image model (internal codename ' \
                         '"Nano Banana Pro"). Latest research capabilities. May produce ' \
                         'surprising results. Preview — expect occasional downtime. ~60–120 s.'
  },
  # OpenAI
  'dall-e-3' => {
    label:               'DALL-E 3',
    provider:            :openai,
    supports_references: false,
    description:         'OpenAI\'s creative image model. Excels at artistic styles, ' \
                         'illustration, and abstract concepts. Does not support reference ' \
                         'images — generates entirely from the text prompt. ~15–30 s.'
  },
  'chatgpt-image-latest' => {
    label:               'GPT Image (latest)',
    provider:            :openai,
    supports_references: false,
    description:         'OpenAI\'s most advanced image model. Combines deep language ' \
                         'understanding with high-fidelity rendering. Great for product and ' \
                         'marketing visuals. No reference image support. ~20–45 s.'
  }
}.freeze
DEFAULT_MODEL =
MODELS.find { |_, cfg| cfg[:default] }&.first ||
'gemini-2.5-flash-image'
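The DEFAULT_MODEL expression can be read as "take the ID of the first entry flagged default: true, otherwise use the literal". A minimal sketch of the same lookup on toy hashes follows; default_model_for and the 'model-a' / 'model-b' IDs are hypothetical names used only for illustration.

```ruby
# Resolve a default model ID the same way the DEFAULT_MODEL expression does.
# Hash#find yields [id, config] pairs and returns the first matching pair,
# so &.first extracts the ID (or nil when no entry is flagged).
def default_model_for(models, fallback)
  models.find { |_, cfg| cfg[:default] }&.first || fallback
end

with_flag = {
  'model-a' => { provider: :openai },
  'model-b' => { provider: :google, default: true }
}
without_flag = { 'model-a' => { provider: :openai } }

puts default_model_for(with_flag, 'model-a')     # prints "model-b"
puts default_model_for(without_flag, 'fallback') # prints "fallback"
```

The literal fallback only matters if the default: true flag is ever removed from the registry.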

Constructor Details

#initialize(model: DEFAULT_MODEL) ⇒ Service


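Creates a service bound to one model from MODELS. An ID that is not in the
registry silently falls back to DEFAULT_MODEL.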



# File 'app/services/image_generation/service.rb', line 124

def initialize(model: DEFAULT_MODEL)
  resolved = MODELS.key?(model.to_s) ? model.to_s : DEFAULT_MODEL
  @model   = resolved
  @config  = MODELS[@model]
end
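The constructor never raises on an unknown model. The sketch below reproduces only the resolution logic from the listing above so the fallback can be observed in isolation. ServiceSketch, its two-entry registry, and the attr_reader accessors are assumptions added for the demonstration and are not confirmed to exist on the real class.

```ruby
# Stand-in reproducing only the constructor logic from the listing.
class ServiceSketch
  MODELS = {
    'gemini-2.5-flash-image' => { provider: :google, default: true },
    'dall-e-3'               => { provider: :openai }
  }.freeze
  DEFAULT_MODEL = 'gemini-2.5-flash-image'

  # Readers exist only so the sketch can print the resolved values.
  attr_reader :model, :config

  def initialize(model: DEFAULT_MODEL)
    resolved = MODELS.key?(model.to_s) ? model.to_s : DEFAULT_MODEL
    @model   = resolved
    @config  = MODELS[@model]
  end
end

puts ServiceSketch.new(model: :'dall-e-3').model     # symbol IDs are accepted
puts ServiceSketch.new(model: 'no-such-model').model # falls back to the default
puts ServiceSketch.new(model: nil).model             # nil.to_s is "", also falls back
```

Because the fallback is silent, a typo in a model ID yields images from the default model rather than an error, so callers that care may want to compare the resolved ID with the one they requested.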

Class Method Details

.available? ⇒ Boolean

True if at least one model's provider is configured with credentials.

Returns:

  • (Boolean)


# File 'app/services/image_generation/service.rb', line 104

def self.available?
  available_models.any?
end

.available_models ⇒ Object

Returns the subset of MODELS whose provider credentials are present.
Keys are model IDs; values are the config hashes.



# File 'app/services/image_generation/service.rb', line 110

def self.available_models
  MODELS.select { |_id, cfg| provider_configured?(cfg[:provider]) }
end
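provider_configured? is not shown on this page, so the following is only a guess at how the filtering could behave. It assumes credentials live in environment variables. The variable names, PROVIDER_ENV_KEYS, and the injectable env argument are all hypothetical, and the real service may read its credentials from somewhere else entirely.

```ruby
# Hypothetical mapping from provider to the credential it needs.
PROVIDER_ENV_KEYS = {
  google: 'GEMINI_API_KEY',
  openai: 'OPENAI_API_KEY'
}.freeze

SKETCH_MODELS = {
  'gemini-2.5-flash-image' => { label: 'Gemini 2.5 Flash Image', provider: :google },
  'dall-e-3'               => { label: 'DALL-E 3',               provider: :openai }
}.freeze

# Assumed stand-in for provider_configured?: a provider counts as configured
# when its credential is present and not blank.
def provider_configured?(provider, env = ENV)
  key = PROVIDER_ENV_KEYS[provider]
  !key.nil? && !env[key].to_s.strip.empty?
end

# Same select as .available_models, with the environment passed in for testing.
def available_models(env = ENV)
  SKETCH_MODELS.select { |_id, cfg| provider_configured?(cfg[:provider], env) }
end

env = { 'OPENAI_API_KEY' => 'sk-placeholder' }
puts available_models(env).keys.inspect # only the OpenAI model remains
puts available_models({}).any?          # mirrors .available? with no credentials
```

Hash#select returns a Hash, so the result keeps the ID-to-config shape described above and can be passed straight to code that expects the registry format.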

.model_label(model_id) ⇒ Object

Human-readable label for any model ID in the registry, with a sensible
fallback for unknown IDs.



# File 'app/services/image_generation/service.rb', line 116

def self.model_label(model_id)
  MODELS.dig(model_id.to_s, :label) || model_id.to_s
end
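The "sensible fallback" is simply the ID itself. A small sketch of the same dig-then-fallback pattern follows, using a one-entry stand-in registry. LABEL_MODELS and 'retired-model' are hypothetical names.

```ruby
# One-entry stand-in for the MODELS registry.
LABEL_MODELS = {
  'dall-e-3' => { label: 'DALL-E 3', provider: :openai }
}.freeze

# Hash#dig returns nil when the ID is missing, which triggers the fallback.
def model_label(model_id)
  LABEL_MODELS.dig(model_id.to_s, :label) || model_id.to_s
end

puts model_label('dall-e-3')      # prints "DALL-E 3"
puts model_label(:'dall-e-3')     # symbols are normalised with to_s
puts model_label('retired-model') # unknown IDs are echoed back unchanged
```

Echoing the ID back means records that reference a model no longer in the registry still render something readable.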

Instance Method Details

#generate(prompt, reference_images: [], aspect_ratio: '1:1', image_size: '1K') ⇒ Object



# File 'app/services/image_generation/service.rb', line 148

def generate(prompt, reference_images: [], aspect_ratio: '1:1', image_size: '1K')
  refs = Array(reference_images).compact

  if @config[:provider] == :google
    generate_via_gemini_rest(prompt, refs, aspect_ratio, image_size)
  elsif refs.any?
    Rails.logger.warn "[ImageGeneration::Service] #{@model} does not support " \
                      "reference images — generating text-to-image instead"
    generate_via_ruby_llm(prompt)
  else
    generate_via_ruby_llm(prompt)
  end
end
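In the listing above, routing depends only on the provider. Every :google model goes to the Gemini REST path whether or not references are present, and aspect_ratio and image_size are forwarded only on that path. The sketch below mirrors that branching with stubbed backends so it can be run without credentials. DispatchSketch, the returned symbols, and the warnings array (standing in for Rails.logger) are hypothetical.

```ruby
# Stand-in that mirrors the branching in #generate. The two backends are
# stubs that return a tag instead of calling any API.
class DispatchSketch
  attr_reader :warnings

  def initialize(provider)
    @provider = provider
    @warnings = [] # stands in for Rails.logger.warn
  end

  def generate(prompt, reference_images: [])
    # Array(...).compact tolerates nil and lists containing nil entries.
    refs = Array(reference_images).compact

    if @provider == :google
      [:gemini_rest, refs.size]
    else
      @warnings << 'reference images ignored' if refs.any?
      [:ruby_llm, 0]
    end
  end
end

google = DispatchSketch.new(:google)
openai = DispatchSketch.new(:openai)

puts google.generate('Make it warmer', reference_images: ['img1', nil]).inspect
puts openai.generate('Make it warmer', reference_images: ['img1']).inspect
puts openai.warnings.inspect
```

Folding the last two branches of the original into a single else with a conditional warning, as done here, behaves the same way. The sketch does this for brevity and does not describe how the service itself is written.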