Class: Assistant::CostCalculator

Inherits: Object

Defined in: app/services/assistant/cost_calculator.rb

Overview

Calculates the cost of AI model usage based on token counts and provider pricing.

Pricing is per million tokens (USD), sourced from provider pricing pages.
Cache pricing uses multipliers on the base input price:
Anthropic: cache_read = 0.1x, cache_write = 1.25x
OpenAI: cache_read = 0.25x
Gemini: cache_read = 0.25x
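
The multipliers above can be checked against the cache columns of MODEL_PRICING with a small sketch. `CACHE_MULTIPLIERS` and `derive_cache_pricing` are hypothetical names used only for illustration and are not part of Assistant::CostCalculator. The 0.0 cache_write multiplier for OpenAI and Gemini is an assumption inferred from the 0.0 entries in the table.

```ruby
# Hypothetical helper, not part of Assistant::CostCalculator.
# Multipliers are taken from the overview; the 0.0 cache_write values for
# OpenAI and Gemini are inferred from the MODEL_PRICING table.
CACHE_MULTIPLIERS = {
  anthropic: { cache_read: 0.1,  cache_write: 1.25 },
  openai:    { cache_read: 0.25, cache_write: 0.0 },
  gemini:    { cache_read: 0.25, cache_write: 0.0 }
}.freeze

# Derive per-million-token cache prices from a base input price (USD).
def derive_cache_pricing(provider, input_price)
  multipliers = CACHE_MULTIPLIERS.fetch(provider)
  {
    # Round to 4 decimals to hide float noise such as 0.30000000000000004.
    cache_read:  (input_price * multipliers[:cache_read]).round(4),
    cache_write: (input_price * multipliers[:cache_write]).round(4)
  }
end

derive_cache_pricing(:anthropic, 3.00) # => { cache_read: 0.3, cache_write: 3.75 }
derive_cache_pricing(:openai, 1.25)    # => { cache_read: 0.3125, cache_write: 0.0 }
```

Both results match the 'claude-sonnet' and 'gpt-5' rows of MODEL_PRICING.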

Usage:
Assistant::CostCalculator.cost_for('claude-sonnet', input_tokens: 50_000, output_tokens: 2_000)
# => 0.18

A::CostCalculator.pricing_for('claude-sonnet')
# => { input: 3.00, output: 15.00, cache_read: 0.30, cache_write: 3.75 }

Constant Summary

MODEL_PRICING = Per-million-token pricing (USD). Updated Mar 2026 from provider pricing pages. Keys match Assistant::ChatService::MODELS keys.

{
  'claude-haiku'  => { input: 1.00,  output: 5.00,  cache_read: 0.10,   cache_write: 1.25 },
  'claude-sonnet' => { input: 3.00,  output: 15.00, cache_read: 0.30,   cache_write: 3.75 },
  'claude-opus'   => { input: 5.00,  output: 25.00, cache_read: 0.50,   cache_write: 6.25 },
  'gpt-5'         => { input: 1.25,  output: 10.00, cache_read: 0.3125, cache_write: 0.0 },
  'gpt-5.4'       => { input: 2.50,  output: 20.00, cache_read: 0.625,  cache_write: 0.0 },
  'gpt-5-mini'    => { input: 0.25,  output: 2.00,  cache_read: 0.0625, cache_write: 0.0 },
  'gemini-flash'  => { input: 0.30,  output: 2.50,  cache_read: 0.075,  cache_write: 0.0 },
  'gemini-pro'    => { input: 1.25,  output: 10.00, cache_read: 0.3125, cache_write: 0.0 }
}.freeze

Class Method Summary

.cost_for(model_key, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0) ⇒ Float
Calculate the cost (in USD) for a single response given token counts and model key.
.pricing_for(model_key) ⇒ Hash?
Look up pricing for a model key.

Class Method Details

.cost_for(model_key, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0) ⇒ `Float`

Calculate the cost (in USD) for a single response given token counts and model key.

Parameters:

model_key (String) —
Key from ChatService::MODELS (e.g. 'claude-sonnet')
input_tokens (Integer) —
Total input tokens, including any tokens served from cache
output_tokens (Integer) —
Total output tokens
cached_tokens (Integer) (defaults to: 0) —
Subset of input_tokens served from cache, billed at the cheaper cache_read rate
cache_creation_tokens (Integer) (defaults to: 0) —
Additional tokens written to cache, billed at the more expensive cache_write rate

Returns:

(Float) —
Cost in USD; 0.0 when model_key is not present in MODEL_PRICING

# File 'app/services/assistant/cost_calculator.rb', line 41

def self.cost_for(model_key, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0)
  pricing = MODEL_PRICING[model_key]
  return 0.0 unless pricing

  # Cached tokens are a subset of input tokens billed at the cache_read rate.
  # Cache creation tokens are additional tokens billed at the cache_write rate.
  # Regular input = total input - cached (the non-cached portion at full price).
  regular_input = [input_tokens - cached_tokens, 0].max

  cost = 0.0
  cost += (regular_input / 1_000_000.0) * pricing[:input]
  cost += (output_tokens / 1_000_000.0) * pricing[:output]
  cost += (cached_tokens / 1_000_000.0) * pricing[:cache_read]
  cost += (cache_creation_tokens / 1_000_000.0) * pricing[:cache_write]
  cost
end
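
To make the cache arithmetic concrete, the following sketch splits a 'claude-sonnet' call into its four billed components. `cost_breakdown` and `SONNET_PRICING` are hypothetical names introduced for illustration. The helper mirrors the arithmetic of cost_for but is not part of the class.

```ruby
# Hypothetical helper for illustration; it mirrors the arithmetic in cost_for
# but returns each billed component separately.
# Rates are copied from the 'claude-sonnet' row of MODEL_PRICING.
SONNET_PRICING = { input: 3.00, output: 15.00, cache_read: 0.30, cache_write: 3.75 }.freeze

def cost_breakdown(pricing, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0)
  # Same clamp as cost_for: the non-cached portion never goes negative.
  regular_input = [input_tokens - cached_tokens, 0].max

  {
    input:       regular_input / 1_000_000.0 * pricing[:input],
    output:      output_tokens / 1_000_000.0 * pricing[:output],
    cache_read:  cached_tokens / 1_000_000.0 * pricing[:cache_read],
    cache_write: cache_creation_tokens / 1_000_000.0 * pricing[:cache_write]
  }
end

parts = cost_breakdown(SONNET_PRICING,
                       input_tokens: 50_000, output_tokens: 2_000,
                       cached_tokens: 40_000, cache_creation_tokens: 5_000)
total = parts.values.sum
# input 0.03 + output 0.03 + cache_read 0.012 + cache_write 0.01875,
# about 0.09075 USD in total
```

The same call without caching costs 0.18 USD, as in the usage example in the overview. Because of the clamp, passing cached_tokens greater than input_tokens bills zero regular input but still bills every cached token at the cache_read rate.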

.pricing_for(model_key) ⇒ `Hash`?

Look up pricing for a model key.
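
The source listing for pricing_for is not shown on this page. A minimal sketch consistent with the nilable Hash return type, assuming the method is a plain lookup in MODEL_PRICING, could look like the following. `PricingLookupSketch` is a hypothetical module name, and its table holds only two rows copied from MODEL_PRICING.

```ruby
# Sketch only: the real method body is not shown on this page, so the plain
# hash lookup below is an assumption.
module PricingLookupSketch
  # Two rows copied from MODEL_PRICING to keep the example short.
  MODEL_PRICING = {
    'claude-sonnet' => { input: 3.00, output: 15.00, cache_read: 0.30,   cache_write: 3.75 },
    'gpt-5'         => { input: 1.25, output: 10.00, cache_read: 0.3125, cache_write: 0.0 }
  }.freeze

  # Returns the pricing hash for a known key, or nil for an unknown one.
  def self.pricing_for(model_key)
    MODEL_PRICING[model_key]
  end
end

PricingLookupSketch.pricing_for('claude-sonnet')[:output] # => 15.0
PricingLookupSketch.pricing_for('unknown-model')          # => nil
```

Under this assumption, callers need to handle nil for unknown keys, whereas cost_for returns 0.0 for them.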