Call Recording System
Overview
The Call Recording System captures, transcribes, and analyzes phone calls from two sources running in parallel:
| Source | Recording Type | Speaker Separation | Schedule |
|---|---|---|---|
| Switchvox | Mono | AI diarization (~90-95% accuracy) | Hourly 6am-7pm |
| Twilio SIP Trunk | Stereo | Channel-based (100% accuracy) | Hourly 24/7 |
Both sources flow through the same AI pipeline for transcription, LeMUR analysis, and semantic search embeddings.
Architecture
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                              RECORDING IMPORT                               │
    ├─────────────────────────────────┬───────────────────────────────────────────┤
    │                                 │                                           │
    │  Switchvox PBX                  │  Twilio SIP Trunk                         │
    │       │                         │       │                                   │
    │       ▼                         │       ▼                                   │
    │  CallRecordImporterWorker       │  TwilioRecordingImportWorker              │
    │  (reads from R2)                │  (API polling)                            │
    │       │                         │       │                                   │
    │       ▼                         │       ▼                                   │
    │  SwitchvoxCloudStorage          │  TwilioRecordingImporter                  │
    │  ├─ Read from R2 store          │  ├─ Download WAV                          │
    │  ├─ Already compressed          │  ├─ Compress to AAC                       │
    │  └─ Switchvox account match     │  ├─ Direction-aware matching              │
    │       │                         │  └─ Store raw Twilio JSON                 │
    │       ▼                         │       │                                   │
    │  CallRecord                     │       ▼                                   │
    │  recording_source: nil          │  CallRecord                               │
    │  audio_channels: 1              │  recording_source: 'twilio'               │
    │                                 │  audio_channels: 2                        │
    └─────────────────────────────────┴───────────────────────────────────────────┘
                                           │
                                           ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                           TRANSCRIPTION PIPELINE                            │
    ├─────────────────────────────────────────────────────────────────────────────┤
    │                                                                             │
    │  CallRecordTranscriptionWorker                                              │
    │       │                                                                     │
    │       ▼                                                                     │
    │  TranscriptionService                                                       │
    │  ├─ Download audio from S3/Dragonfly                                        │
    │  ├─ Upload to AssemblyAI                                                    │
    │  └─ Submit transcription request                                            │
    │       │                                                                     │
    │       ├─── Mono Recording ───────────────────┐                              │
    │       │    speaker_labels: true              │                              │
    │       │    speech_model: 'slam-1'            │                              │
    │       │    keyterms_prompt: [...]            │                              │
    │       │    speech_understanding: {...}       │                              │
    │       │                                      │                              │
    │       └─── Stereo Recording ─────────────────┤                              │
    │            multichannel: true                │                              │
    │            speaker_labels: false             │                              │
    │            (default speech model)            │                              │
    │                                              │                              │
    │                                              ▼                              │
    │                                       AssemblyAI API                        │
    │                                       ├─ Transcription                      │
    │                                       ├─ PII Redaction                      │
    │                                       ├─ Sentiment Analysis                 │
    │                                       └─ Custom Spelling                    │
    │                                              │                              │
    │                                              ▼                              │
    │                                       Webhook Callback                      │
    │                                              │                              │
    └──────────────────────────────────────────────┼──────────────────────────────┘
                                                   │
                                                   ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                                 AI ANALYSIS                                 │
    ├─────────────────────────────────────────────────────────────────────────────┤
    │                                                                             │
    │  AssemblyAI LeMUR (via LLM Gateway → Claude)                                │
    │  ├─ Summary generation                                                      │
    │  ├─ Action item extraction                                                  │
    │  ├─ Call phase detection                                                    │
    │  ├─ Customer satisfaction inference                                         │
    │  ├─ Agent performance scoring                                               │
    │  └─ Key topic extraction                                                    │
    │       │                                                                     │
    │       ▼                                                                     │
    │  EmbeddingWorker (OpenAI)                                                   │
    │  └─ Generate semantic search embeddings                                     │
    │                                                                             │
    └─────────────────────────────────────────────────────────────────────────────┘

Database Schema
CallRecord Fields
| Column | Type | Description |
|---|---|---|
| **Recording Source** | | |
| `recording_source` | string | `'twilio'` or `nil` (Switchvox) |
| `audio_channels` | integer | 1 (mono) or 2 (stereo) |
| `agent_speaker_label` | string | Detected agent speaker (A, B, 1, 2, or name) |
| **Twilio-Specific** | | |
| `twilio_recording_sid` | string | Unique Twilio Recording SID (indexed) |
| `twilio_call_sid` | string | Twilio Call SID |
| `twilio_call_details` | jsonb | Raw Twilio API response |
| **Switchvox-Specific** | | |
| `switchvox_recorded_call_id` | integer | Switchvox recording ID |
| `switchvox_from_account_id` | integer | Caller's Switchvox account |
| `switchvox_to_account_id` | integer | Recipient's Switchvox account |
| **Transcription** | | |
| `transcript` | text | Full text transcript |
| `structured_transcript_json` | jsonb | Utterances with timestamps, confidence, sentiment |
| `assemblyai_transcript_id` | string | For LeMUR analysis reference |
| `transcription_state` | enum | `pending`, `processing`, `completed`, `error`, `no_audio`, `too_short` |
| `transcribed_at` | datetime | When transcription completed |
| **AI Analysis** | | |
| `ai_summary` | text | LeMUR-generated summary |
| `action_items` | jsonb | Tasks with responsible party and priority |
| `call_phases` | jsonb | Segments with timestamps |
| `customer_satisfaction` | enum | `very_satisfied`, `satisfied`, `neutral`, `frustrated`, `angry` |
| `agent_performance_score` | integer | 0-100 score |
| `key_topics` | string[] | Main topics discussed |
| `lemur_analyzed_at` | datetime | When LeMUR analysis completed |
| **Call Metadata** | | |
| `call_direction` | enum | `inbound`, `outbound` |
| `call_outcome` | enum | `unknown`, `sale`, `support`, `inquiry`, `voicemail` |
Twilio JSONB Accessors
The `twilio_call_details` column provides typed accessors:

    call_record.twilio_caller_name         # CNAM lookup result
    call_record.twilio_direction           # 'trunking-originating' or 'trunking-terminating'
    call_record.twilio_from                # Originating number/SIP
    call_record.twilio_to                  # Destination number/SIP
    call_record.twilio_trunk_sid           # SIP trunk identifier
    call_record.twilio_price               # Call cost
    call_record.twilio_price_unit          # Currency (USD)
    call_record.twilio_start_time          # DateTime
    call_record.twilio_end_time            # DateTime
    call_record.twilio_recording_channels  # 2 for stereo
    call_record.twilio_recording_duration  # Seconds

Recording Import
Switchvox (Mono)
Worker: `CallRecordImporterWorker`, triggered two ways:
- Real-time (primary): SFTPGo fires its `upload` action hook the moment the PBX finishes writing a recording, hitting `Webhooks::V1::SftpgoController`, which enqueues the single-file import (`perform_async(wav_key)`). Recordings land in the UI within seconds instead of waiting for the next poll. See SFTPGo § Real-time import hook.
- Hourly poll (backstop): the scheduled `import_new_records` scan (6am-7pm CT) still runs, catching anything the hook misses (a dropped notification, SFTPGo downtime, an xml-less `.wav`). The two paths converge on the same `CallRecordImporterWorker.perform_async(wav_key)` call, so the worker's `:until_executed` lock plus the importer's `first_or_initialize` make double-delivery a no-op: a recording is never imported twice.

- Reads `.wav` recordings from the Cloudflare R2 bucket the SFTPGo gateway writes (the Switchvox PBX uploads over SFTP → SFTPGo → R2). The worker no longer SFTPs anywhere itself. See SFTPGo.
- Pre-compressed audio files
- Party matching via Switchvox account IDs → Employee lookup
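The "double-delivery is a no-op" property described above can be illustrated with a minimal sketch. This is not the importer's actual code: `RecordingStore` is a hypothetical in-memory stand-in for the `call_records` table, and it only mimics the spirit of `first_or_initialize` (look the record up by its key, create it only if it is missing).

```ruby
# Hypothetical in-memory stand-in for the call_records table,
# keyed by the recording's R2 object key (wav_key).
class RecordingStore
  def initialize
    @records = {}
  end

  # Returns [record, created]. A second delivery of the same wav_key
  # finds the existing record and creates nothing.
  def import(wav_key)
    existing = @records[wav_key]
    return [existing, false] if existing

    record = { wav_key: wav_key, imported_at: Time.now }
    @records[wav_key] = record
    [record, true]
  end

  def count
    @records.size
  end
end

store = RecordingStore.new
store.import('2024/01/15/call-001.wav') # delivered by the real-time hook
store.import('2024/01/15/call-001.wav') # delivered again by the hourly poll
puts store.count
```

In the real system the `:until_executed` lock additionally keeps two jobs for the same key from running concurrently; the sketch covers only the lookup-before-create half.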
Twilio SIP Trunk (Stereo)
Worker: `TwilioRecordingImportWorker` (hourly 24/7)
- Polls Twilio API for new recordings
- Downloads WAV, compresses to AAC (93% size reduction)
- Direction-aware party matching
Party Matching Logic
Company Main Numbers (excluded from matching):

    COMPANY_MAIN_NUMBERS = [
      '+18008755285', # US 800#
      '+18664361444', # Canada toll-free
      '+18475502400'  # Main line
    ].freeze

Direction-Aware Matching:
| Call Direction | Origin Party | Destination Party |
|---|---|---|
| Inbound (trunking-originating) | Match by caller number | Skip (main line) |
| Outbound (trunking-terminating) | Match by agent DID | Match by destination |
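The rules in the table can be sketched roughly as follows. This is an illustration under stated assumptions, not the importer's real code: `normalize_number`, `matchable`, and `parties_to_match` are hypothetical helper names, and the normalizer assumes addresses already carry a country code (either a `+1...` number or a `sip:1...@host` address).

```ruby
COMPANY_MAIN_NUMBERS = ['+18008755285', '+18664361444', '+18475502400'].freeze

# Hypothetical helper: turn "+1847..." or "sip:1847...@host" into "+1847...".
# Returns nil when no leading digits are found.
def normalize_number(address)
  digits = address.to_s[/\A(?:sip:)?\+?(\d+)/, 1]
  digits && "+#{digits}"
end

# Company main numbers are never used for party matching.
def matchable(number)
  number unless COMPANY_MAIN_NUMBERS.include?(number)
end

# Returns the numbers that should be looked up for each side of the call.
def parties_to_match(direction:, from:, to:)
  origin = matchable(normalize_number(from))
  destination = matchable(normalize_number(to))

  case direction
  when 'trunking-originating' # inbound: the destination is the main line, skip it
    { origin: origin, destination: nil }
  when 'trunking-terminating' # outbound: origin is the agent DID
    { origin: origin, destination: destination }
  else
    { origin: nil, destination: nil }
  end
end

p parties_to_match(
  direction: 'trunking-terminating',
  from: 'sip:18475502430@warmlyyours.pstn.twilio.com',
  to: '+13125551234'
)
```

The returned numbers would then be handed to whatever employee or customer lookup the importer uses; that lookup is outside the scope of this sketch.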
For outbound calls, agent DID is extracted from SIP address:
    sip:18475502430@warmlyyours.pstn.twilio.com → +18475502430 → Employee match

Transcription
Mono vs Stereo Modes
| Feature | Mono (Switchvox) | Stereo (Twilio) |
|---|---|---|
| `speaker_labels` | `true` (AI diarization) | `false` |
| `multichannel` | `false` | `true` |
| `speech_model` | `'slam-1'` | default |
| `keyterms_prompt` | Company terms | N/A |
| `speech_understanding` | Agent identification | N/A |
| Speaker detection | Heuristic + LeMUR fallback | By channel + direction |
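As a rough illustration of the table above, the request options could be chosen from the channel count like this. `transcription_options` is a hypothetical name, and the sketch leaves out `speech_understanding` because its payload is not spelled out in this document.

```ruby
# Sketch: pick AssemblyAI request options based on the recording's channel count.
# Stereo recordings rely on channel separation, so diarization is turned off.
def transcription_options(audio_channels:, keyterms: [])
  if audio_channels >= 2
    { multichannel: true, speaker_labels: false }
  else
    { speaker_labels: true, speech_model: 'slam-1', keyterms_prompt: keyterms }
  end
end

p transcription_options(audio_channels: 1, keyterms: ['radiant heating'])
p transcription_options(audio_channels: 2)
```

Keeping `speaker_labels` and `multichannel` mutually exclusive in one place also avoids the "Invalid endpoint schema" error listed under Troubleshooting.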
PII Redaction
Automatically redacted:
- `banking_information`
- `credit_card_cvv`, `credit_card_expiration`, `credit_card_number`
- `us_social_security_number`
- `passport_number`
- `password`

Redacted text is replaced with a bracketed entity label such as `[CREDIT_CARD_NUMBER]`.
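A minimal sketch of how these policies might be passed to AssemblyAI. The policy names come from the list above; `redact_pii`, `redact_pii_policies`, and `redact_pii_sub` are AssemblyAI request parameters as I understand them, but whether this application sets them exactly this way is an assumption, and `pii_redaction_options` is a hypothetical helper name.

```ruby
PII_POLICIES = %w[
  banking_information
  credit_card_cvv
  credit_card_expiration
  credit_card_number
  us_social_security_number
  passport_number
  password
].freeze

# Assumed request fragment: 'entity_name' substitution is what yields
# placeholders like [CREDIT_CARD_NUMBER] in the transcript text.
def pii_redaction_options
  {
    redact_pii: true,
    redact_pii_policies: PII_POLICIES,
    redact_pii_sub: 'entity_name'
  }
end

p pii_redaction_options
```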
Custom Spelling
Dynamic corrections for:
- Company name variations (“Warmly Yours”, “Warm Lee Yours”, etc.)
- Employee name phonetic variations (auto-generated from active employees)
- Custom corrections from Settings
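The corrections above end up as a `custom_spelling` list of `{ from: [...], to: "..." }` entries. Because AssemblyAI rejects entries whose `to` value has more than one word (see the error table under Troubleshooting), a builder along these lines can filter them out first. `build_custom_spelling` and the sample corrections, including the `WarmlyYours` target spelling, are hypothetical.

```ruby
# Sketch: corrections is a Hash of target spelling => heard variations.
# Entries whose target contains more than one word are dropped, since the
# API only accepts single-word 'to' values.
def build_custom_spelling(corrections)
  corrections
    .select { |to, _from| to.split.size == 1 }
    .map { |to, from| { from: Array(from), to: to } }
end

corrections = {
  'WarmlyYours' => ['Warmly Yours', 'Warm Lee Yours'],
  'Radiant Heat' => ['radiant heet'] # multi-word target, filtered out
}
p build_custom_spelling(corrections)
```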
LeMUR Analysis
Using AssemblyAI's LLM Gateway with Claude:
- Summary - 2-4 sentence call summary
- Action Items - Tasks with responsible party (agent/customer) and priority
- Call Phases - Segments: greeting, problem identification, solution, closing
- Customer Satisfaction - Inferred satisfaction level
- Agent Performance Score - 0-100 based on professionalism and problem-solving
- Key Topics - Main topics discussed
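Since the analysis fields are produced by an LLM, it can help to normalize the response before writing it to the `CallRecord` columns. The sketch below assumes the model is asked to return a JSON object; the JSON key names and the `normalize_analysis` helper are hypothetical, while the satisfaction levels and the 0-100 score range come from the schema above.

```ruby
require 'json'

SATISFACTION_LEVELS = %w[very_satisfied satisfied neutral frustrated angry].freeze

# Sketch: coerce a raw JSON analysis into values that fit the CallRecord columns.
# Unknown satisfaction labels become nil; scores are clamped to 0..100.
def normalize_analysis(raw_json)
  data = JSON.parse(raw_json)
  satisfaction = data['customer_satisfaction'].to_s

  {
    ai_summary: data['summary'].to_s.strip,
    action_items: Array(data['action_items']),
    call_phases: Array(data['call_phases']),
    customer_satisfaction: (satisfaction if SATISFACTION_LEVELS.include?(satisfaction)),
    agent_performance_score: data['agent_performance_score'].to_i.clamp(0, 100),
    key_topics: Array(data['key_topics']).map(&:to_s)
  }
end

raw = '{"summary": " Customer asked about thermostat wiring. ", "customer_satisfaction": "satisfied", "agent_performance_score": 140, "key_topics": ["thermostat"]}'
p normalize_analysis(raw)
```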
UI Features
Call Records Index
- Recording Source Filter: Dropdown to filter by Twilio/Switchvox/All
- Stereo Badge: Visual indicator for dual-channel recordings
- Transcription State: Color-coded status badges
Call Record Show Page
Tabs:
- Overview: Audio player, call metadata, parties, AI summary
- Transcript: Structured transcript with speaker avatars, timestamps, sentiment badges
- AI Analysis: Action items, call phases, performance metrics
- Twilio (if applicable): Recording metadata, raw JSON data
Actions:
- Re-transcribe: Queue new transcription
- Re-analyze: Run LeMUR again
- Swap Speakers: Manual speaker correction
- Generate Embedding: Create semantic search vector
Automated Processing
Scheduler

    # Switchvox import (6am-7pm hourly)
    call_record_importer_worker:
      cron: '0 6-19 * * * America/Chicago'
      class: CallRecordImporterWorker

    # Twilio import (all hours)
    twilio_recording_import_worker:
      cron: '0 * * * * America/Chicago'
      class: TwilioRecordingImportWorker

    # Daily transcription (6 AM)
    daily_call_transcription:
      cron: '0 6 * * * America/Chicago'
      class: DailyCallRecordTranscriptionWorker

Daily Transcription Worker
Runs at 6 AM daily:
- Processes all new calls from previous 24 hours
- Backfills up to 500 older eligible calls
- Uses the `ai_embeddings` queue for controlled throughput
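The selection described above can be sketched in plain Ruby. This is an approximation rather than the worker's code: `Call` is a hypothetical struct standing in for `CallRecord`, eligibility borrows the "pending or error, at least 30 seconds" rule from the Filtering & Queries section, and ordering the backfill newest-first is an assumption taken from the backfill rake task.

```ruby
DAY = 24 * 60 * 60
MIN_DURATION_SECS = 30
RETRYABLE_STATES = %w[pending error].freeze

# Hypothetical stand-in for a CallRecord row.
Call = Struct.new(:id, :created_at, :transcription_state, :duration_secs, keyword_init: true)

def eligible?(call)
  RETRYABLE_STATES.include?(call.transcription_state) &&
    call.duration_secs >= MIN_DURATION_SECS
end

# All eligible calls from the last 24 hours, plus up to backfill_limit
# older eligible calls (newest first).
def daily_batch(calls, now:, backfill_limit: 500)
  eligible = calls.select { |c| eligible?(c) }
  recent, older = eligible.partition { |c| c.created_at >= now - DAY }
  recent + older.sort_by(&:created_at).reverse.first(backfill_limit)
end

now = Time.utc(2024, 1, 2, 6)
calls = [
  Call.new(id: 1, created_at: now - 3600, transcription_state: 'pending', duration_secs: 60),
  Call.new(id: 2, created_at: now - 3 * DAY, transcription_state: 'error', duration_secs: 45),
  Call.new(id: 3, created_at: now - 2 * DAY, transcription_state: 'pending', duration_secs: 10)
]
p daily_batch(calls, now: now).map(&:id)
```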
Manual Operations
Rake Tasks

    # View statistics
    bundle exec rake call_records:stats

    # Backfill transcriptions (most recent first)
    bundle exec rake call_records:backfill_transcriptions[LIMIT,DAYS_BACK]

    # Backfill LeMUR analysis
    bundle exec rake call_records:backfill_lemur[LIMIT]

    # Process a single call
    bundle exec rake call_records:process_one[CALL_RECORD_ID]

    # Twilio operations
    bundle exec rake call_records:twilio_check
    bundle exec rake call_records:twilio_import[LIMIT]

Ruby Console
    # Import Twilio recordings
    importer = CallRecordTwilioRecordingImporter.new
    importer.import_new_recordings(limit: 50, since: 24.hours.ago)

    # Dry run
    importer = CallRecordTwilioRecordingImporter.new(dry_run: true)
    importer.import_new_recordings

    # Transcribe a single record
    CallRecordTranscriptionWorker.perform_async(call_record_id: 123, force: true)

    # Re-run LeMUR analysis
    CallRecordSummaryWorker.perform_async(call_record_id: 123)

Filtering & Queries
    # By source
    CallRecord.where(recording_source: 'twilio')
    CallRecord.where(recording_source: [nil, 'switchvox'])

    # Stereo recordings
    CallRecord.where('audio_channels >= 2')

    # Transcription state
    CallRecord.where(transcription_state: :completed)
    CallRecord.where(transcription_state: [:pending, :error])

    # Eligible for transcription
    CallRecord.joins(:upload)
              .where(transcription_state: [:pending, :error])
              .where('duration_secs >= 30')

    # Ransack (for UI)
    CallRecord.ransack(recording_source_eq: 'twilio')

Troubleshooting
Recordings Not Importing
Switchvox:
- Check SFTP connectivity between the PBX and SFTPGo, and confirm new `.wav` files are reaching the R2 bucket
- Verify Switchvox recording paths
- Check `CallRecordImporterWorker` logs

Twilio:
- Check Sidekiq logs for `TwilioRecordingImportWorker`
- Verify Twilio credentials in `config/credentials.yml.enc`
- Check trunk ID matches configured value
Wrong Speaker Labels
Stereo recordings:
- Verify `call_direction` is set correctly
- Use the "Swap Speakers" button for manual correction

Mono recordings:
- Check whether heuristic detection found the agent greeting
- If it did not, check whether the LeMUR fallback identified the agent speaker
- Use "Swap Speakers" if the labels are still incorrect
AssemblyAI Errors
| Error | Solution |
|---|---|
| `"Invalid endpoint schema"` | Ensure `speaker_labels: false` for multichannel |
| `"custom_spelling 'to' fields must contain only one word"` | Filter multi-word targets |
| 400 on transcription submit | Check all parameters are valid |
Twilio API Errors
    grep '\[TwilioClient\]' log/production.log | tail -50

Known Limitations
Agent Identification for Inbound Twilio Calls
Twilio records at the trunk level before PBX routing, so we cannot identify which agent answered inbound calls. The `to` address is the company's main line.
Workarounds:
- Use Swap Speakers button for manual correction
- Future: Correlate with Switchvox `call_logs` by time/number
- Future: Create employee DID registry for direct-dial matching
Potential Duplicates
The same call can exist as both a Switchvox (mono) and a Twilio (stereo) recording. The two are currently stored as separate records; use the `recording_source` filter to analyze each source independently.
Cost Considerations
Per-Call Costs
| Service | Cost | Notes |
|---|---|---|
| AssemblyAI Transcription | ~$0.00025/sec | Average 2.9 min call = ~$0.044 |
| AssemblyAI LeMUR | ~$0.015/call | Claude Sonnet, ~2000 tokens |
| OpenAI Embedding | ~$0.0001/1K tokens | ~$0.0002/call |
| Total per call | ~$0.06 | Conservative estimate |
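The per-call total can be checked with a few lines of arithmetic. The rates and the 2.9-minute average come from the table; the constant and method names are only for illustration.

```ruby
TRANSCRIPTION_PER_SEC = 0.00025 # AssemblyAI transcription, dollars per second
LEMUR_PER_CALL = 0.015          # LeMUR analysis, dollars per call
EMBEDDING_PER_CALL = 0.0002     # OpenAI embedding, dollars per call

# Transcription cost for a call of the given length in minutes.
def transcription_cost(minutes)
  minutes * 60 * TRANSCRIPTION_PER_SEC
end

# Transcription plus the two flat per-call charges.
def per_call_cost(minutes)
  transcription_cost(minutes) + LEMUR_PER_CALL + EMBEDDING_PER_CALL
end

puts format('transcription: $%.4f', transcription_cost(2.9))
puts format('total per call: $%.4f', per_call_cost(2.9))
```

A 2.9-minute call is 174 seconds, so transcription comes to about $0.0435 and the sum of the three rows to about $0.0587, which the table rounds up to ~$0.06.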
Storage Costs
| Item | Cost | Notes |
|---|---|---|
| Twilio Recording Storage | ~$0.0025/min | Per minute stored |
| S3 Storage (compressed) | Minimal | AAC ~93% smaller than WAV |
Volume Estimates
| Metric | Value |
|---|---|
| Daily new calls | ~124/day |
| Daily processing cost | ~$7.50/day |
| Monthly processing cost | ~$225/month |
| Backfill (2 years) | ~$6,679 |
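For reference, the daily and monthly rows follow from the call volume and the ~$0.06 per-call estimate. The sketch below assumes a 30-day month; the last helper only shows how many calls the backfill figure would correspond to at that rate, which is an inference and not a number stated in this document.

```ruby
PER_CALL_COST = 0.06 # dollars, rounded per-call estimate
DAILY_CALLS = 124

def daily_cost(calls = DAILY_CALLS)
  calls * PER_CALL_COST
end

# Assumes a 30-day month.
def monthly_cost(daily)
  daily * 30
end

# Number of calls a given budget covers at the per-call estimate.
def calls_covered_by(budget)
  (budget / PER_CALL_COST).round
end

puts format('daily: $%.2f', daily_cost)
puts format('monthly at $7.50/day: $%.2f', monthly_cost(7.50))
puts "calls covered by $6,679: #{calls_covered_by(6679)}"
```

124 calls at $0.06 is $7.44, which the table rounds to ~$7.50 per day, and $7.50 over 30 days gives the ~$225 monthly figure.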
Related Files
| File | Purpose |
|---|---|
| `app/models/call_record.rb` | Model with jsonb_accessor, embeddable |
| `app/services/call_record_processing/transcription_service.rb` | Orchestrates transcription |
| `app/services/assemblyai_client.rb` | AssemblyAI API client |
| `app/services/call_record/twilio_recording_importer.rb` | Twilio import service |
| `app/services/call_record/switchvox_importer_sftp.rb` | Switchvox import service |
| `app/services/twilio_client.rb` | Twilio API client |
| `app/workers/call_record_transcription_worker.rb` | Transcription worker |
| `app/workers/twilio_recording_import_worker.rb` | Twilio import worker |
| `app/workers/daily_call_record_transcription_worker.rb` | Daily processing |
| `app/helpers/call_records_helper.rb` | Speaker detection helper |
| `app/controllers/call_records_controller.rb` | Controller with actions |
| `lib/tasks/call_records.rake` | Manual rake tasks |