Class: Assistant::CostCalculator
- Inherits:
- Object
- Defined in:
- app/services/assistant/cost_calculator.rb
Overview
Calculates the cost of AI model usage based on token counts and provider pricing.
Pricing is per million tokens (USD), sourced from provider pricing pages.
Cache pricing uses multipliers on the base input price:
Anthropic: cache_read = 0.1x, cache_write = 1.25x
OpenAI: cache_read = 0.25x
Gemini: cache_read = 0.25x
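The multipliers above can be checked against the pricing table with a small sketch. `CACHE_MULTIPLIERS` and `derive_cache_pricing` are hypothetical names, not part of Assistant::CostCalculator, and the 0.0 write multiplier for OpenAI and Gemini is an assumption that mirrors the `cache_write: 0.0` entries in MODEL_PRICING.

```ruby
# Hypothetical lookup of the multipliers listed in the overview.
# The 0.0 cache_write values mirror the MODEL_PRICING entries for
# OpenAI and Gemini models; the overview lists no write multiplier for them.
CACHE_MULTIPLIERS = {
  anthropic: { cache_read: 0.1,  cache_write: 1.25 },
  openai:    { cache_read: 0.25, cache_write: 0.0 },
  gemini:    { cache_read: 0.25, cache_write: 0.0 }
}.freeze

# Derive cache prices (USD per million tokens) from a base input price.
# Rounding to 4 places removes float noise such as 0.30000000000000004.
def derive_cache_pricing(provider, input_price)
  multipliers = CACHE_MULTIPLIERS.fetch(provider)
  {
    cache_read:  (input_price * multipliers[:cache_read]).round(4),
    cache_write: (input_price * multipliers[:cache_write]).round(4)
  }
end

p derive_cache_pricing(:anthropic, 3.00) # base input price of claude-sonnet
p derive_cache_pricing(:openai, 1.25)    # base input price of gpt-5
```

For the claude-sonnet base price this yields the same cache_read and cache_write values as the table (0.30 and 3.75).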
Usage:
Assistant::CostCalculator.cost_for('claude-sonnet', input_tokens: 50_000, output_tokens: 2_000)
=> 0.18
Assistant::CostCalculator.pricing_for('claude-sonnet')
=> { input: 3.00, output: 15.00, cache_read: 0.30, cache_write: 3.75 }
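The 0.18 figure in the first usage example follows directly from the per-million-token prices. A minimal worked version of that arithmetic, using the claude-sonnet prices as literals rather than calling the class:

```ruby
# Prices copied from the claude-sonnet entry (USD per million tokens).
input_price  = 3.00
output_price = 15.00

# 50_000 input tokens are 0.05 million tokens; 2_000 output tokens are 0.002 million.
input_cost  = (50_000 / 1_000_000.0) * input_price
output_cost = (2_000 / 1_000_000.0) * output_price
total = input_cost + output_cost

puts format('%.2f', total) # prints 0.18
```

Because the calculation uses floats, the raw result may carry a tiny rounding error, so compare with a tolerance or format the value before display.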
Constant Summary
- MODEL_PRICING =
Per-million-token pricing (USD). Updated Mar 2026 from provider pricing pages.
Keys match Assistant::ChatService::MODELS keys.
    {
      'claude-haiku'  => { input: 1.00, output: 5.00,  cache_read: 0.10,   cache_write: 1.25 },
      'claude-sonnet' => { input: 3.00, output: 15.00, cache_read: 0.30,   cache_write: 3.75 },
      'claude-opus'   => { input: 5.00, output: 25.00, cache_read: 0.50,   cache_write: 6.25 },
      'gpt-5'         => { input: 1.25, output: 10.00, cache_read: 0.3125, cache_write: 0.0 },
      'gpt-5.4'       => { input: 2.50, output: 20.00, cache_read: 0.625,  cache_write: 0.0 },
      'gpt-5-mini'    => { input: 0.25, output: 2.00,  cache_read: 0.0625, cache_write: 0.0 },
      'gemini-flash'  => { input: 0.30, output: 2.50,  cache_read: 0.075,  cache_write: 0.0 },
      'gemini-pro'    => { input: 1.25, output: 10.00, cache_read: 0.3125, cache_write: 0.0 }
    }.freeze
Class Method Summary
- .cost_for(model_key, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0) ⇒ Float
  Calculate the cost (in USD) for a single response given token counts and model key.
- .pricing_for(model_key) ⇒ Hash?
  Look up pricing for a model key.
Class Method Details
.cost_for(model_key, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0) ⇒ Float
Calculate the cost (in USD) for a single response given token counts and model key. Returns 0.0 when the model key has no entry in MODEL_PRICING.
    # File 'app/services/assistant/cost_calculator.rb', line 41
    def self.cost_for(model_key, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0)
      pricing = MODEL_PRICING[model_key]
      return 0.0 unless pricing

      # Cached tokens are a subset of input tokens billed at the cache_read rate.
      # Cache creation tokens are additional tokens billed at the cache_write rate.
      # Regular input = total input - cached (the non-cached portion at full price).
      regular_input = [input_tokens - cached_tokens, 0].max

      cost = 0.0
      cost += (regular_input / 1_000_000.0) * pricing[:input]
      cost += (output_tokens / 1_000_000.0) * pricing[:output]
      cost += (cached_tokens / 1_000_000.0) * pricing[:cache_read]
      cost += (cache_creation_tokens / 1_000_000.0) * pricing[:cache_write]
      cost
    end
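To see how the cache terms change the result, the sketch below repeats the same arithmetic outside the class. `SONNET_PRICING` and `sketch_cost` are hypothetical stand-ins so the example runs on its own, and the token counts are illustrative values, not measurements.

```ruby
# Stand-in for the claude-sonnet entry of MODEL_PRICING (USD per million tokens).
SONNET_PRICING = { input: 3.00, output: 15.00, cache_read: 0.30, cache_write: 3.75 }.freeze

# Hypothetical standalone copy of the cost_for arithmetic, for illustration only.
def sketch_cost(pricing, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0)
  # Cached tokens are carved out of the input total; the remainder is full price.
  regular_input = [input_tokens - cached_tokens, 0].max
  per_million = ->(tokens, price) { (tokens / 1_000_000.0) * price }

  per_million.call(regular_input, pricing[:input]) +
    per_million.call(output_tokens, pricing[:output]) +
    per_million.call(cached_tokens, pricing[:cache_read]) +
    per_million.call(cache_creation_tokens, pricing[:cache_write])
end

uncached = sketch_cost(SONNET_PRICING, input_tokens: 100_000, output_tokens: 1_000)
cached = sketch_cost(SONNET_PRICING, input_tokens: 100_000, output_tokens: 1_000,
                     cached_tokens: 80_000, cache_creation_tokens: 10_000)

puts format('uncached: $%.4f', uncached) # 0.3000 input + 0.0150 output
puts format('cached:   $%.4f', cached)   # 0.0600 + 0.0150 + 0.0240 read + 0.0375 write
```

With 80,000 of the 100,000 input tokens served from cache, the request costs about $0.1365 instead of $0.3150, even after paying the cache_write rate on 10,000 newly cached tokens.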
.pricing_for(model_key) ⇒ Hash?
Look up pricing for a model key. Returns nil when the key has no entry in MODEL_PRICING.
    # File 'app/services/assistant/cost_calculator.rb', line 62
    def self.pricing_for(model_key)
      MODEL_PRICING[model_key]
    end
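Since the lookup is a plain Hash#[] call, an unknown key yields nil, and callers may want to guard for that. A minimal sketch of such a guard follows; `SKETCH_PRICING`, `sketch_pricing_for`, and `describe_pricing` are hypothetical names used only so the example is self-contained.

```ruby
# Stand-in for MODEL_PRICING with a single entry copied from the table.
SKETCH_PRICING = {
  'claude-sonnet' => { input: 3.00, output: 15.00, cache_read: 0.30, cache_write: 3.75 }
}.freeze

# Same lookup shape as pricing_for: Hash#[] returns nil for a missing key.
def sketch_pricing_for(model_key)
  SKETCH_PRICING[model_key]
end

# Hypothetical caller-side helper that handles the nil case explicitly.
def describe_pricing(model_key)
  pricing = sketch_pricing_for(model_key)
  return "#{model_key}: no pricing data" if pricing.nil?

  format('%s: $%.2f in / $%.2f out per million tokens',
         model_key, pricing[:input], pricing[:output])
end

puts describe_pricing('claude-sonnet')
puts describe_pricing('unknown-model')
```

Checking for nil explicitly keeps an unpriced model distinguishable from a genuinely free one, which a 0.0 cost alone would not.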