Module: AssistantConversationTokenTrackable
- Extended by:
- ActiveSupport::Concern
- Included in:
- AssistantConversation
- Defined in:
- app/models/concerns/assistant_conversation_token_trackable.rb
Instance Method Summary
-
#computed_token_totals ⇒ Object
Compute token totals from assistant_messages (source of truth).
-
#computed_total_cost ⇒ Object
Compute total cost (USD) for this conversation from per-message token data.
-
#sync_token_totals! ⇒ Object
Sync aggregate metadata from assistant_messages (call after responses complete).
-
#total_tokens ⇒ Object
Total tokens used — returns cached metadata totals (synced after each response), falls back to computing from assistant_messages only when metadata is empty.
-
#track_error! ⇒ Object
Track an error.
-
#track_query!(model:, input_tokens: 0, output_tokens: 0, response_time: nil, tool_stats: {}) ⇒ Object
Track a completed query with its metrics.
Instance Method Details
#computed_token_totals ⇒ Object
Compute token totals from assistant_messages (source of truth).
Returns { input: N, output: N, thinking: N, cached: N, cache_creation: N, total: N }
# File 'app/models/concerns/assistant_conversation_token_trackable.rb', line 47

def computed_token_totals
  sums = assistant_messages
    .unscope(:order)
    .where(role: 'assistant')
    .pick(
      Arel.sql('COALESCE(SUM(input_tokens), 0)'),
      Arel.sql('COALESCE(SUM(output_tokens), 0)'),
      Arel.sql('COALESCE(SUM(thinking_tokens), 0)'),
      Arel.sql('COALESCE(SUM(cached_tokens), 0)'),
      Arel.sql('COALESCE(SUM(cache_creation_tokens), 0)')
    ) || [0, 0, 0, 0, 0]

  {
    input: sums[0].to_i,
    output: sums[1].to_i,
    thinking: sums[2].to_i,
    cached: sums[3].to_i,
    cache_creation: sums[4].to_i,
    total: sums[0].to_i + sums[1].to_i
  }
end
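Stripped of the SQL, the aggregation reduces to summing nullable per-message counters for assistant-role rows. A minimal plain-Ruby sketch, where the message hashes are hypothetical stand-ins for assistant_messages records:

```ruby
# Hypothetical in-memory messages; nil models a column with no data,
# which nil.to_i coerces to 0 just as COALESCE(SUM(...), 0) does in SQL.
messages = [
  { role: 'assistant', input_tokens: 100, output_tokens: 40, thinking_tokens: nil },
  { role: 'user',      input_tokens: nil, output_tokens: nil, thinking_tokens: nil },
  { role: 'assistant', input_tokens: 50,  output_tokens: 10, thinking_tokens: 5 }
]

# Only assistant messages contribute, mirroring where(role: 'assistant').
assistant = messages.select { |m| m[:role] == 'assistant' }

totals = {
  input:    assistant.sum { |m| m[:input_tokens].to_i },
  output:   assistant.sum { |m| m[:output_tokens].to_i },
  thinking: assistant.sum { |m| m[:thinking_tokens].to_i }
}
# :total counts only input + output, matching the method above.
totals[:total] = totals[:input] + totals[:output]
```

Note that thinking, cached, and cache-creation tokens are reported but deliberately excluded from :total.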
#computed_total_cost ⇒ Object
Compute total cost (USD) for this conversation from per-message token data.
Uses each message's associated LlmModel to look up the correct pricing.
Falls back to the conversation-level model when a message has no model association.
# File 'app/models/concerns/assistant_conversation_token_trackable.rb', line 70

def computed_total_cost
  # Build a model_id → model_key lookup from ChatService::MODELS
  model_id_to_key = Assistant::ChatService::MODELS.transform_values { |v| v[:id] }.invert

  assistant_messages
    .where(role: 'assistant')
    .includes(:llm_model)
    .sum do |message|
      model_key =
        if message.llm_model
          model_id_to_key[message.llm_model.model_id] || llm_model_name
        else
          llm_model_name
        end

      Assistant::CostCalculator.cost_for(
        model_key,
        input_tokens: message.input_tokens || 0,
        output_tokens: message.output_tokens || 0,
        cached_tokens: message.cached_tokens || 0,
        cache_creation_tokens: message.cache_creation_tokens || 0
      )
    end
end
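The costing loop amounts to pricing each message's token classes at that message's model rates and summing. A standalone sketch, with a made-up per-million-token pricing table standing in for Assistant::CostCalculator's real rates:

```ruby
# Hypothetical pricing in USD per million tokens (NOT real rates).
PRICING = {
  'claude-sonnet' => { input: 3.0, output: 15.0, cached: 0.3 }
}.freeze

# Rough analogue of CostCalculator.cost_for: price each token class, return USD.
def cost_for(model_key, input_tokens: 0, output_tokens: 0, cached_tokens: 0)
  p = PRICING.fetch(model_key)
  (input_tokens * p[:input] +
   output_tokens * p[:output] +
   cached_tokens * p[:cached]) / 1_000_000.0
end

# Hypothetical per-message token counts for one conversation.
messages = [
  { model: 'claude-sonnet', input_tokens: 1_000, output_tokens: 500, cached_tokens: 0 },
  { model: 'claude-sonnet', input_tokens: 2_000, output_tokens: 0,   cached_tokens: 10_000 }
]

total = messages.sum do |m|
  cost_for(m[:model],
           input_tokens: m[:input_tokens],
           output_tokens: m[:output_tokens],
           cached_tokens: m[:cached_tokens])
end
```

Summing per message (rather than summing tokens first) is what lets each message be priced against its own model.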
#sync_token_totals! ⇒ Object
Sync aggregate metadata from assistant_messages (call after responses complete).
Fixes the issue where track_query! only captures last-chunk tokens.
Also computes and caches total cost for fast sidebar display.
Uses a database-level JSONB merge (metadata || patch) instead of a
Ruby-side Hash#merge so that keys written by concurrent callers — most
importantly compaction_summary set by ContextCompactor — are never
clobbered by a stale in-memory copy of metadata.
# File 'app/models/concerns/assistant_conversation_token_trackable.rb', line 102

def sync_token_totals!
  totals = computed_token_totals
  cost = computed_total_cost

  patch = {
    'total_input_tokens'  => totals[:input],
    'total_output_tokens' => totals[:output],
    'total_cost_cents'    => cost
  }.to_json

  self.class.where(id: id).update_all(["metadata = metadata || ?::jsonb", patch])
  reload
end
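The JSONB-merge rationale can be illustrated with plain hashes: a writer that merges into a stale in-memory snapshot silently drops keys a concurrent writer added, while a small patch merged server-side cannot. A toy simulation (no Postgres involved; `db` stands in for the stored metadata column):

```ruby
# Shared row state, as if ContextCompactor already wrote its key.
db = { 'compaction_summary' => 'old chats summarized' }

# Hazard: a snapshot loaded BEFORE the compactor ran is empty, so a
# Ruby-side merge-and-write-back would overwrite the whole column.
stale_snapshot = {}
clobbered = stale_snapshot.merge('total_input_tokens' => 42)
# Persisting `clobbered` wholesale would lose 'compaction_summary'.

# Patch-style update: send only the changed keys and let the store
# merge them, which is what `metadata = metadata || ?::jsonb` does.
patch = { 'total_input_tokens' => 42 }
db = db.merge(patch)  # untouched keys survive
```

In Postgres the `||` operator performs a shallow merge of the two jsonb values, with the right-hand side winning on key conflicts, which is exactly the semantics simulated here.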
#total_tokens ⇒ Object
Total tokens used — returns cached metadata totals (synced after each response),
falls back to computing from assistant_messages only when metadata is empty.
This avoids N+1 queries when displaying token counts in conversation lists.
# File 'app/models/concerns/assistant_conversation_token_trackable.rb', line 38

def total_tokens
  cached = (total_input_tokens || 0) + (total_output_tokens || 0)
  return cached if cached.positive?

  computed_token_totals[:total]
end
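The read path is a cheap-first fallback: trust the cached counters when they are nonzero, recompute only otherwise. A sketch with hypothetical helper names (`expensive_recompute` stands in for `computed_token_totals[:total]`):

```ruby
# Prefer cached counters; fall back to the expensive aggregate only when
# both are nil/zero (e.g. rows created before sync_token_totals! existed).
def total_tokens(cached_input, cached_output)
  cached = cached_input.to_i + cached_output.to_i
  return cached if cached.positive?

  expensive_recompute
end

# Stand-in for the SQL aggregation over assistant_messages.
def expensive_recompute
  123
end
```

One consequence of this shape: a conversation whose cached totals are legitimately zero still triggers the recompute, which is harmless here because the recomputed answer is also zero in that case.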
#track_error! ⇒ Object
Track an error.

# File 'app/models/concerns/assistant_conversation_token_trackable.rb', line 30

def track_error!
  self.error_count = (error_count || 0) + 1
  save!
end
#track_query!(model:, input_tokens: 0, output_tokens: 0, response_time: nil, tool_stats: {}) ⇒ Object
Track a completed query with its metrics.

# File 'app/models/concerns/assistant_conversation_token_trackable.rb', line 7

def track_query!(model:, input_tokens: 0, output_tokens: 0, response_time: nil, tool_stats: {})
  self.llm_model_name = model
  self.total_input_tokens = (total_input_tokens || 0) + input_tokens
  self.total_output_tokens = (total_output_tokens || 0) + output_tokens
  self.total_queries = (total_queries || 0) + 1
  self.last_query_at = Time.current

  if response_time
    current_avg = average_response_time || 0
    current_count = (total_queries || 1) - 1
    self.average_response_time = ((current_avg * current_count) + response_time) / total_queries
  end

  if tool_stats.present?
    self.total_tool_calls = (total_tool_calls || 0) + (tool_stats[:total_tool_calls] || 0)
    self.total_tool_errors = (total_tool_errors || 0) +
      (tool_stats[:sql_errors] || 0) +
      (tool_stats[:patch_errors] || 0)
  end

  save!
end
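The response-time bookkeeping above is the standard incremental mean: new_avg = (old_avg × n_prev + x) / n. A quick standalone check that the update rule reproduces the plain arithmetic mean:

```ruby
# Incremental mean, as used for average_response_time: fold in one new
# sample given the average over count_before earlier samples.
def update_average(avg, count_before, new_value)
  ((avg * count_before) + new_value) / (count_before + 1).to_f
end

times = [1.0, 3.0, 2.0]
avg = 0.0
times.each_with_index do |t, i|
  avg = update_average(avg, i, t)  # i samples seen before this one
end
```

This avoids storing every response time: only the running average and the query count need to persist on the conversation row.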