Class: CallRecordProcessing::BulkTranscriptionService
- Inherits:
- Object
  - Object
  - CallRecordProcessing::BulkTranscriptionService
- Defined in:
- app/services/call_record_processing/bulk_transcription_service.rb
Overview
Lightweight transcription service for historical call record backfill.
Uses RubyLLM.transcribe with Gemini's native audio (generateContent) path
instead of AssemblyAI, providing a large cost reduction for bulk historical
data. Per audio-hour (Gemini bills audio input at 32 tokens/sec ⇒
1,920 tokens/min):
AssemblyAI (+ LeMUR): $0.06/call (premium recent-call pipeline)
gpt-4o-mini-transcribe: $0.18/hour (~$0.003/min; prior backfill model)
gemini-3.1-flash-lite: $0.076/hour (~$0.0013/min; ~2.3× cheaper, current)
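The per-minute figures and the ratio follow from the hourly prices quoted above. The following is only an arithmetic check in plain Ruby; the constant and variable names are illustrative and do not exist in the service.

```ruby
# Illustrative arithmetic only; prices are the figures quoted in the overview.
TOKENS_PER_SECOND = 32

tokens_per_minute = TOKENS_PER_SECOND * 60   # audio tokens billed per minute
tokens_per_hour   = tokens_per_minute * 60   # audio tokens billed per hour

HOURLY_PRICE_USD = {
  'gpt-4o-mini-transcribe' => 0.18,
  'gemini-3.1-flash-lite'  => 0.076
}.freeze

# Convert each hourly price to a per-minute price, rounded to 4 decimals.
per_minute = HOURLY_PRICE_USD.transform_values { |usd| (usd / 60.0).round(4) }

# How many times cheaper the Gemini model is per audio-hour.
ratio = HOURLY_PRICE_USD['gpt-4o-mini-transcribe'] / HOURLY_PRICE_USD['gemini-3.1-flash-lite']

puts "tokens/min: #{tokens_per_minute}, tokens/hour: #{tokens_per_hour}"
puts "per-minute prices: #{per_minute}"
puts "ratio: #{ratio.round(2)}"
```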
The model comes from AiModelConstants.id(:transcription); pass an explicit
model: (e.g. 'gpt-4o-mini-transcribe') to override. The provider is
inferred from the model id, so the same code path serves both Gemini and
OpenAI transcription models.
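The private helper that infers the provider is not shown on this page. As a rough sketch of what prefix-based inference could look like, assuming Gemini ids start with "gemini-" and OpenAI ids with "gpt-" or "whisper-" (the method name infer_provider and the prefix rules are hypothetical):

```ruby
# Hypothetical sketch: the service's real provider lookup is not documented here.
# Maps a model id to a provider symbol based on the id's prefix.
def infer_provider(model_id)
  case model_id.to_s
  when /\Agemini-/
    :gemini
  when /\Agpt-/, /\Awhisper-/
    :openai
  else
    raise ArgumentError, "Cannot infer provider for model #{model_id.inspect}"
  end
end

puts infer_provider('gemini-3.1-flash-lite')
puts infer_provider('gpt-4o-mini-transcribe')
```

Overriding the model on the service itself uses the documented keyword argument, for example CallRecordProcessing::BulkTranscriptionService.new(call_record, model: 'gpt-4o-mini-transcribe').transcribe.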
Trade-offs vs AssemblyAI TranscriptionService:
- No speaker diarization (plain text transcript, no Speaker A/B labels)
- No LeMUR analysis (no ai_summary, action_items, call_phases, etc.)
- No PII redaction, custom spelling, or sentiment analysis
- Good enough for semantic search and keyword discovery on historical calls
The transcript is still useful for:
- Embedding generation (semantic search over call content)
- Full-text search (tsvector)
- Manual review and keyword discovery
Constant Summary
- MIN_DURATION_SECONDS =
Minimum duration in seconds.
30
- MIN_DURATION_SECONDS_VOICEMAIL =
Minimum duration in seconds for voicemail.
5
- DEFAULT_MODEL =
Default model, sourced from the canonical registry.
AiModelConstants.id(:transcription)
- MAX_FILE_SIZE_GEMINI =
Maximum audio file size. Gemini transcription inlines the audio as base64
in the request body (RubyLLM's generateContent path), and base64 inflates
by ~33%, so the raw file is capped at 15MB to keep the encoded payload
within Gemini's ~20MB inline-request limit (15MB raw ≈ 20MB encoded).
OpenAI's transcribe endpoint accepts up to 25MB.
15.megabytes
- MAX_FILE_SIZE_OPENAI =
25.megabytes
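The 15MB figure can be checked with a little arithmetic: base64 emits 4 output bytes for every 3 input bytes. A minimal sketch in plain Ruby, where base64_encoded_size and MEGABYTE are illustrative names rather than part of the service:

```ruby
# Base64 emits 4 output bytes per 3 input bytes, rounding up to a whole group.
def base64_encoded_size(raw_bytes)
  ((raw_bytes + 2) / 3) * 4
end

MEGABYTE = 1024 * 1024 # same unit ActiveSupport uses for 15.megabytes

raw_limit     = 15 * MEGABYTE
encoded_bytes = base64_encoded_size(raw_limit)

puts "raw: #{raw_limit} bytes, encoded: #{encoded_bytes} bytes"
puts "encoded size in MB: #{encoded_bytes / MEGABYTE.to_f}"

# Sanity check the formula against Ruby's built-in base64 packer ('m0' = no newlines).
sample = 'x' * 1000
actual = [sample].pack('m0').bytesize
raise 'size formula mismatch' unless actual == base64_encoded_size(sample.bytesize)
```

The request body also carries JSON framing and the prompt, so the encoded audio is not the only contributor to the inline-request size.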
Instance Attribute Summary
-
#call_record ⇒ Object
readonly
Returns the value of attribute call_record.
-
#model ⇒ Object
readonly
Returns the value of attribute model.
Instance Method Summary
-
#initialize(call_record, model: DEFAULT_MODEL) ⇒ BulkTranscriptionService
constructor
A new instance of BulkTranscriptionService.
- #transcribe(force: false) ⇒ Object
Constructor Details
#initialize(call_record, model: DEFAULT_MODEL) ⇒ BulkTranscriptionService
Returns a new instance of BulkTranscriptionService.
# File 'app/services/call_record_processing/bulk_transcription_service.rb', line 56
def initialize(call_record, model: DEFAULT_MODEL)
  @call_record = call_record
  @model = model
end
Instance Attribute Details
#call_record ⇒ Object (readonly)
Returns the value of attribute call_record.
# File 'app/services/call_record_processing/bulk_transcription_service.rb', line 54
def call_record
  @call_record
end
#model ⇒ Object (readonly)
Returns the value of attribute model.
# File 'app/services/call_record_processing/bulk_transcription_service.rb', line 54
def model
  @model
end
Instance Method Details
#transcribe(force: false) ⇒ Object
# File 'app/services/call_record_processing/bulk_transcription_service.rb', line 61
def transcribe(force: false)
  return skip_result(:already_transcribed) if already_transcribed? && !force
  return skip_result(:too_short) if too_short?
  return skip_result(:no_audio) unless has_audio?

  begin
    call_record.update!(transcription_state: :processing)

    temp_file = download_audio
    return error_result(:no_audio_file, 'Could not download audio') unless temp_file
    return error_result(:file_too_large, "#{File.size(temp_file)} bytes exceeds #{max_file_size} limit") if File.size(temp_file) > max_file_size

    transcription = RubyLLM.transcribe(
      temp_file,
      model: model,
      provider: provider,
      assume_model_exists: true,
      language: 'en'
    )

    transcript_text = transcription.text.to_s.strip
    return error_result(:empty_transcript, 'Transcription returned empty text') if transcript_text.blank?

    call_record.update!(
      transcript: transcript_text,
      transcription_state: :completed,
      transcribed_at: Time.current
    )

    EmbeddingWorker.perform_async('CallRecord', call_record.id)

    {
      status: :success,
      word_count: transcript_text.split.size,
      model: model,
      duration_secs: call_record.duration_secs
    }
  rescue RubyLLM::Error => e
    log_error "RubyLLM error: #{e.message}"
    call_record.update!(transcription_state: :error)
    error_result(:transcription_failed, e.message)
  rescue StandardError => e
    log_error "Unexpected error: #{e.message}"
    call_record.update!(transcription_state: :error)
    error_result(:transcription_failed, e.message)
  ensure
    cleanup_temp_file(temp_file) if temp_file
  end
end
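A caller would typically branch on the returned hash. Only the :success shape (status, word_count, model, duration_secs) is visible in the source above; the hashes built by skip_result and error_result are not shown, so the sketch below treats every non-success result generically. The helper name summarize_result and the sample values are made up for illustration.

```ruby
# Hypothetical caller-side helper. Only the :success hash shape is confirmed by
# the method source; any other result is reported as-is.
def summarize_result(result)
  if result[:status] == :success
    "transcribed #{result[:word_count]} words from " \
      "#{result[:duration_secs]}s of audio with #{result[:model]}"
  else
    "not transcribed: #{result.inspect}"
  end
end

# Made-up sample values in the documented :success shape.
example = {
  status: :success,
  word_count: 412,
  model: 'gemini-3.1-flash-lite',
  duration_secs: 187
}

puts summarize_result(example)
```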