AI Agent Architecture
Status: Production Ready
Last Updated: February 2026
Overview
The AI Agent (Sunny) is an internal assistant for WarmlyYours employees, accessible at /assistant in the CRM. It answers business questions using real-time database queries, content search, blog management, and multi-model LLM reasoning. The system is built on RubyLLM's acts_as_chat + Agent patterns, with role-based data access, a persistent self-learning brain, tool-loop guardrails, audit logging, and streaming Turbo Stream responses.
Key capabilities:
- Natural-language SQL: Users ask questions; the LLM writes, executes, and interprets read-only SQL
- Content search: Semantic search across products, FAQs, blog posts, videos, showcases, and reviews
- Blog management: Create, update, and enrich blog posts with linked assets (images, videos, FAQs, product cards), proper tagging, and SEO metadata
- Multi-model support: Auto-selects from Claude, GPT-4.1, or Gemini based on query complexity
- Extended thinking: Activates reasoning scratchpad for analytical queries (Anthropic/Gemini)
- Data domain access control: CanCanCan roles map to data domains; each database table declares which domains it belongs to
- Streaming UI: Real-time token-by-token streaming with live preview, tool status, and formatted output
- Conversation sharing: Owner can share conversations with colleagues at viewer or collaborator level
- Context compaction: Sliding-window summary for long conversations to stay within token limits
- Sunny Brain: Persistent, editable knowledge base of learned rules injected into every system prompt; self-updates via semantic embeddings
Architecture Diagram
Browser (CRM) Rails Server External
┌──────────────────┐ POST ┌──────────────────────────────┐
│ Stimulus │──/ask──────►│ Crm::AssistantChatController │
│ assistant_chat │ │ ├─ build_user_context() │
│ controller │ │ ├─ sanitize_tool_services() │
│ │ │ └─ AssistantChatWorker │
│ MutationObserver │ .perform_async() │
│ auto-scroll │ └───────────┬──────────────────┘
│ completion detect │ │ Sidekiq
└─────▲────────────┘ ▼
│ ┌──────────────────────────────┐
│ Turbo Streams │ AssistantChatWorker │
│ (ActionCable) │ ├─ ChatService.new(...) │
│ │ │ ├─ auto_select_model() │
│ │ │ ├─ configure_conversation()│
│ │ │ │ ├─ with_model() │
│ │ │ │ ├─ with_instructions() │
│ │ │ │ ├─ with_tools() │ ┌──────────┐
│ │ │ │ ├─ with_thinking() │──►│ Anthropic │
│ │ │ │ └─ on_tool_call() │ │ OpenAI │
│ │ │ └─ conversation.ask() │ │ Gemini │
│ │ │ ├─ LLM streaming │ └──────────┘
│ │ │ ├─ tool execution ─────┤
│ │ │ └─ auto-persist msgs │
│ │ ├─ broadcast_chunk() │
│◄──────────────────────────│ ├─ broadcast_complete() │
│ │ │ └─ ResponseFormatter │
│ │ └─ finalize_response() │
│ └──────────────────────────────┘
│
│ Tool Execution Layer
│ ┌────────────────────────────────────────────┐
│ │ Assistant::ChatToolBuilder │
│ │ ├─ Content tools (semantic_search, FAQs) │──► pgvector / OpenAI Embeddings
│ │ └─ PostgreSQL tools │
│ │ ├─ describe_available_data │──► CommentManifest (db/comments/*.yml)
│ │ ├─ execute_sql ──► SqlBroker │──► PostgreSQL (read-only)
│ │ ├─ list_schemas │
│ │ ├─ list_objects │
│ │ ├─ get_object_details │
│ │ └─ explain_query │
│ └────────────────────────────────────────────┘
│
│ Security Layer
│ ┌────────────────────────────────────────────┐
│ │ Assistant::DataDomainPolicy ◄── data_domains.yml│
│ │ ├─ role → domain mapping │
│ │ └─ CommentManifest.objects_for_domains() │
│ │ │
│ │ Assistant::DataPolicy │
│ │ ├─ ALWAYS_BLOCKED_OBJECTS │
│ │ ├─ object_allowed?() │
│ │ └─ sensitive_columns_for_query() │
│ │ │
│ │ Assistant::ToolLoopGuard │
│ │ ├─ MAX_IDENTICAL_CALLS = 2 │
│ │ └─ MAX_CONSECUTIVE_SQL_FAILURES = 3 │
│ │ │
│ │ AssistantSqlAuditLog (every SQL execution) │
│ └────────────────────────────────────────────┘
Components
1. Agent — Assistant::SunnyAgent
File: app/agents/assistant/sunny_agent.rb
RubyLLM::Agent subclass that serves as the canonical factory and configuration baseline for Sunny conversations. Follows the RubyLLM Agent pattern: chat_model wires create!/find to AssistantConversation, and instructions { nil } disables automatic discovery so ChatService can apply per-request configuration.
# Create a new conversation
conversation = Assistant::SunnyAgent.create!(user: current_user, messages: [])
# Load an existing conversation
conversation = Assistant::SunnyAgent.find(conversation_id)
All dynamic configuration (model, temperature, system prompt, tools, thinking, Anthropic prompt caching) is applied per-request by ChatService#configure_conversation. The agent is intentionally lean — it's a factory, not an orchestrator.
2. Controller — Crm::AssistantChatController
File: app/controllers/crm/assistant_chat_controller.rb
The HTTP entry point. Handles conversation CRUD and delegates LLM processing to a Sidekiq worker.
| Action | Method | Description |
|---|---|---|
| index | GET | Load most recent conversation; list sidebar conversations |
| show | GET | Load a specific conversation (owned or shared) |
| create | POST | Create a new conversation via SunnyAgent.create! |
| ask | POST | Accept user message, enqueue AssistantChatWorker |
| cancel | POST | Set Redis cancellation flag; worker checks between chunks |
| destroy | DELETE | Delete a conversation |
Key responsibilities:
- User context assembly: build_user_context() serializes identity (name, party_id, department, job_title), role flags (is_admin, is_manager), and resolved analytics_domains into a Hash passed to the worker.
- Tool service selection: available_chat_services() limits non-admins to content + postgres_production. Admins also get postgres_versions (audit trail DB).
- Processing lock: Checks conversation.processing? before enqueuing; blocks if a job is already running.
- Access control: Viewers (shared conversations) cannot submit new messages.
3. Worker — AssistantChatWorker
File: app/workers/assistant_chat_worker.rb
Sidekiq job that runs the full LLM interaction loop in the background.
Lifecycle:
- Acquire processing lock (acquire_processing_lock!)
- Load conversation, instantiate Assistant::ChatService
- Call service.call { |chunk| broadcast_chunk(chunk) }
- LLM streams response tokens → worker broadcasts live preview via Turbo::StreamsChannel
- On tool calls: LLM pauses, tool executes, result fed back, LLM continues
- On completion: ResponseFormatter renders final markdown → HTML; broadcast replaces preview
- finalize_response(): track metrics, sync token totals, dual-write legacy JSONB
- Release processing lock
Streaming: The worker throttles preview updates to ~8fps (0.12s interval) for smooth UX.
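The throttling described above can be sketched in plain Ruby. This is a minimal illustration, not the worker's actual code: the PreviewThrottle class, its method names, and the injectable clock are assumptions made for the example; only the 0.12s interval comes from the text.

```ruby
# Hypothetical helper: lets a block run at most once per `interval` seconds.
class PreviewThrottle
  def initialize(interval: 0.12,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @interval = interval
    @clock = clock
    @last_sent_at = nil
  end

  # Yields and returns true when at least `interval` seconds have passed since
  # the last accepted call; otherwise returns false without yielding.
  def call
    now = @clock.call
    return false if @last_sent_at && (now - @last_sent_at) < @interval

    @last_sent_at = now
    yield if block_given?
    true
  end
end

# Usage: wrap the (hypothetical) preview broadcast so fast token streams
# do not trigger a broadcast per chunk.
throttle = PreviewThrottle.new
buffer = +''
%w[Hel lo wor ld].each do |chunk|
  buffer << chunk
  throttle.call { puts "preview: #{buffer}" }
end
```

Injecting the clock keeps the throttle testable without sleeping. A real worker would also need to flush the final buffer after the last chunk, since the throttle may drop the last preview update.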
Cancellation: cancel action sets cancelled:{jid} in Redis (TTL 300s). Worker checks cancelled? between chunks and stops streaming.
Error handling: broadcast_error maps known error patterns (rate limits, timeouts, invalid model IDs, JSON parse failures) to user-friendly messages.
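One way to picture the pattern-to-message mapping is a small lookup table. The sketch below is an assumption-heavy illustration: the regular expressions, the message wording, and the friendly_error_message helper are all hypothetical and are not taken from broadcast_error itself.

```ruby
# Hypothetical patterns and wording; the real mappings live in the worker.
ERROR_MESSAGES = [
  [/rate limit|429/i, 'The AI provider is rate-limiting requests. Please try again in a moment.'],
  [/timeout|timed out/i, 'The request took too long. Try a narrower question.'],
  [/invalid model|model.*not found/i, 'The selected model is unavailable. Please choose another model.'],
  [/unexpected token|ParserError/i, 'The AI returned a malformed response. Please try again.']
].freeze

DEFAULT_ERROR_MESSAGE = 'Something went wrong. Please try again.'

# Returns the first user-friendly message whose pattern matches the error's
# class name or message, falling back to a generic message.
def friendly_error_message(error)
  text = "#{error.class}: #{error.message}"
  _pattern, message = ERROR_MESSAGES.find { |pattern, _msg| text.match?(pattern) }
  message || DEFAULT_ERROR_MESSAGE
end
```

Keeping the table ordered means more specific patterns can be listed first when two patterns could match the same error.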
4. Chat Service — Assistant::ChatService
File: app/services/assistant/chat_service.rb
Orchestrates a single chat turn: model selection, system prompt assembly, tool registration, thinking activation, and streaming execution.
Model Selection
MODELS = {
'claude-haiku' => { id: LlmDefaults::DEFAULT_HAIKU_MODEL, provider: :anthropic, cost: :low, supports_thinking: false },
'claude-sonnet' => { id: LlmDefaults::DEFAULT_SONNET_MODEL, provider: :anthropic, cost: :medium, supports_thinking: true, thinking_effort_default: :medium },
'claude-opus' => { id: LlmDefaults::DEFAULT_OPUS_MODEL, provider: :anthropic, cost: :high, supports_thinking: true, thinking_effort_default: :high },
'gpt-4.1' => { id: 'gpt-4.1', provider: :openai, cost: :medium, supports_thinking: false },
'gpt-4.1-mini' => { id: 'gpt-4.1-mini', provider: :openai, cost: :low, supports_thinking: false },
'gemini-flash' => { id: 'gemini-2.5-flash', provider: :gemini, cost: :low, supports_thinking: true, thinking_effort_default: :low },
'gemini-pro' => { id: 'gemini-2.5-pro', provider: :gemini, cost: :medium, supports_thinking: true, thinking_effort_default: :medium },
}
Auto-selection (model: 'auto') uses regex pattern matching + token estimation:
- Simple lookups (< 50 tokens, "show me", "list", "count") → claude-haiku
- Complex/comparison queries ("trend", "correlate", "why", > 150 tokens) → claude-sonnet
- Default → claude-sonnet
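Model IDs are referenced through the LlmDefaults constants in config/initializers/llm_defaults.rb, which delegate to AiModelConstants (see Configuration), so every model ID has a single source of truth.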
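The auto-selection rules above can be sketched as follows. This is a simplified illustration under stated assumptions: the two regular expressions, the 4-characters-per-token estimate, and the way the token and keyword conditions are combined are guesses, since the real patterns in ChatService are not shown here.

```ruby
# Hypothetical patterns; the real ChatService patterns are not shown in this document.
SIMPLE_QUERY_PATTERN  = /\b(show me|list|count)\b/i
COMPLEX_QUERY_PATTERN = /\b(trend|correlate|why)\b/i

# Rough token estimate: about 4 characters per token (an assumption).
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Returns a key of the MODELS hash for a query submitted with model: 'auto'.
def auto_select_model(query)
  tokens = estimate_tokens(query)

  # Complexity signals win over simplicity signals (assumed precedence).
  return 'claude-sonnet' if tokens > 150 || query.match?(COMPLEX_QUERY_PATTERN)
  return 'claude-haiku'  if tokens < 50 && query.match?(SIMPLE_QUERY_PATTERN)

  'claude-sonnet' # default
end
```

For example, auto_select_model('count orders placed today') falls into the simple branch, while a question containing "why" is routed to claude-sonnet regardless of its length.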
System Prompt
Assembled from layers, joined with "\n\n":
- Base template (app/prompts/assistant/sunny_agent/instructions.txt.erb): Identity, date macros, formatting guidelines, source citation rules
- user_context_prompt: User identity (## CURRENT USER), party_id mappings for "my" queries, data access domains
- schema_hint_for_role: Instructs the LLM to call describe_available_data; power users also get list_schemas / get_object_details
- tools_system_prompt: Tool names, tool-mode rules, and service-specific rules (content, blog, GA4, GSC, Ahrefs, etc.)
- crm_url_templates_prompt: CRM URL patterns for linking records by ID
- brain_prompt: Hybrid-retrieved AssistantBrainEntry rules injected under ## LEARNED RULES; contextually filtered to entries whose applies_to_services overlap with the active tool services
For Anthropic models, the system prompt is wrapped with cache: true via RubyLLM::Providers::Anthropic::Content. This enables up to 90% input token savings on subsequent turns.
Extended Thinking
When the model supports it and the query matches THINKING_QUERY_PATTERNS:
- Activated via conversation.with_thinking(effort:, budget:)
- Budget tokens: 4K (low) / 8K (medium) / 16K (high, Opus only)
- Anthropic requires temperature: 1 when thinking is active
- Thinking traces are persisted to assistant_messages.thinking_text
Conversation Configuration (configure_conversation)
conversation.with_model(model_id, provider:, assume_exists: true)
conversation.with_instructions(cacheable_system_prompt, replace: true)
conversation.with_thinking(effort:, budget:) # if applicable
conversation.with_params(max_tokens: budget + 16_000) # if thinking with budget
conversation.with_temperature(thinking? ? 1 : 0.3)
conversation.with_tools(*tools) # from ChatToolBuilder
conversation.on_new_message { emit_status(:thinking) }
conversation.on_tool_call { |tc| emit_status(...) }
conversation.on_tool_result { ... }
conversation.on_end_message { emit_status(:composing) }
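The thinking-related settings in the listing above can be derived from the effort level with a small helper. The sketch below is illustrative only: thinking_settings and THINKING_BUDGETS are hypothetical names, and the exact budget values are assumed from the "4K / 8K / 16K" figures. The temperature values and the budget + 16_000 rule come from the listing.

```ruby
# Budget per effort level; exact token counts are an assumption based on
# the "4K / 8K / 16K" figures quoted in the Extended Thinking section.
THINKING_BUDGETS = { low: 4_000, medium: 8_000, high: 16_000 }.freeze

# Returns the per-request parameters for one chat turn.
def thinking_settings(effort:, thinking_active:)
  # Without thinking, only the default temperature applies.
  return { temperature: 0.3 } unless thinking_active

  budget = THINKING_BUDGETS.fetch(effort)
  {
    effort: effort,
    budget: budget,
    temperature: 1,                # required by Anthropic when thinking is active
    max_tokens: budget + 16_000    # leave room for the answer on top of the scratchpad
  }
end

settings = thinking_settings(effort: :medium, thinking_active: true)
puts settings[:max_tokens] # 8_000 + 16_000
```

Returning a plain Hash keeps the derivation separate from the calls that apply it (with_thinking, with_params, with_temperature).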
5. Tool Builder — Assistant::ChatToolBuilder
File: app/services/assistant/chat_tool_builder.rb
Dynamically builds RubyLLM::Tool subclasses for the chat. No HTTP hop to the MCP server — tools execute in-process.
Tool registration is keyed on tool_services — an array passed from the controller that determines which tool groups are active for this conversation. Non-admins are limited to content + postgres_production; admins also get postgres_versions.
Content Tools (content)
| Tool | Description |
|---|---|
| semantic_search | Semantic vector search across all content types |
| find_faqs | FAQ search with product line filtering |
| find_call_recordings | Permission-gated call transcript search |
PostgreSQL Tools (postgres_production, postgres_versions)
Built per database service key. Access level depends on role:
| Tool | Admin/Manager | Employee |
|---|---|---|
| describe_available_data | Yes | Yes |
| execute_sql | Full | Domain-restricted |
| list_schemas | Yes | No |
| list_objects | Yes | No |
| get_object_details | Yes | No |
| explain_query | Yes | No |
Blog Management Tools (blog_management)
| Tool | Description |
|---|---|
| create_blog_post | Create a draft post (body, SEO, tags, preview image) |
| update_blog_post | Update fields + atomically replace tags; creates a revision |
| get_blog_post | Read all editable fields, current tags, revision count |
| insert_image | oEmbed rendered HTML for an image embed |
| insert_video | oEmbed rendered HTML for a video embed (Cloudflare or YouTube) |
| insert_faqs | oEmbed rendered HTML for an FAQ section |
| insert_product | oEmbed rendered HTML for a product card (Liquid tag) |
Brain Tool (all services)
| Tool | Description |
|---|---|
| propose_brain_entry | Propose a new learned rule; creates a pending brain entry |
6. SQL Broker — Assistant::SqlBroker
File: app/services/assistant/sql_broker.rb
Centralized execution layer for all AI-generated SQL.
Enforcement chain:
- Read-only check: Only SELECT, SHOW, WITH, EXPLAIN, SET are allowed
- SQL normalization: Strip comments, collapse whitespace
- Object extraction: PgQuery.parse(sql).tables via AST
- Object-level access: DataPolicy.object_allowed?() checks against ALWAYS_BLOCKED_OBJECTS + domain-resolved allowed_objects
- Query execution: Read-only transaction with statement timeout (8s)
- Column-level redaction: PII columns redacted from results
- Row cap: Truncates to 50 rows; adds truncation notice
- Audit logging: Every execution logged to assistant_sql_audit_logs
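Three of the steps above (normalization, the read-only keyword check, and the row cap) can be sketched without any database. This is a deliberately naive illustration, not SqlBroker's code: the helper names are hypothetical, the regex-based comment stripping ignores string literals, and the example normalizes before checking the keyword for simplicity. A keyword check alone is not sufficient protection, which is why the chain also relies on AST-based object extraction and a read-only transaction.

```ruby
READ_ONLY_KEYWORDS = %w[SELECT SHOW WITH EXPLAIN SET].freeze
ROW_CAP = 50

# Strip SQL comments and collapse whitespace (naive: ignores string literals).
def normalize_sql(sql)
  sql.gsub(%r{/\*.*?\*/}m, ' ') # block comments
     .gsub(/--[^\n]*/, ' ')     # line comments
     .gsub(/\s+/, ' ')
     .strip
end

# True when the statement starts with one of the allowed keywords.
def read_only?(sql)
  first_keyword = normalize_sql(sql)[/\A\w+/]
  READ_ONLY_KEYWORDS.include?(first_keyword.to_s.upcase)
end

# Truncate a result set to the row cap and attach a notice when rows were dropped.
def cap_rows(rows, limit: ROW_CAP)
  return { rows: rows, truncated: false } if rows.length <= limit

  {
    rows: rows.first(limit),
    truncated: true,
    notice: "Showing first #{limit} of #{rows.length} rows"
  }
end
```

Because read_only? looks at the normalized text, a write statement hidden behind a leading comment is still rejected.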
7. Data Domain Access Control
Same domain system as described in the Analytics section:
- File: config/analytics/data_domains.yml
- 5 business domains: sales, financial, workforce, operations, marketing
- Role mappings: sales_rep, sales_manager, accounting_manager, marketing_manager, etc.
- Database tables/views declare their domain(s) in db/comments/*.yml manifests
- Admins get nil (unrestricted); DataDomainPolicy resolves roles → domains → allowed objects
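The roles → domains → allowed objects resolution can be illustrated with plain hashes. Everything in this sketch is hypothetical sample data and naming: the real mappings live in config/analytics/data_domains.yml and the db/comments/*.yml manifests, and the real logic is in DataDomainPolicy and CommentManifest.

```ruby
# Hypothetical role-to-domain mapping (sample data only).
ROLE_DOMAINS = {
  'sales_rep' => %w[sales],
  'sales_manager' => %w[sales operations],
  'accounting_manager' => %w[financial],
  'marketing_manager' => %w[marketing]
}.freeze

# Hypothetical object-to-domain declarations (sample data only).
OBJECT_DOMAINS = {
  'orders' => %w[sales financial],
  'invoices' => %w[financial],
  'campaigns' => %w[marketing]
}.freeze

# Admins resolve to nil, meaning "unrestricted".
def domains_for(roles, admin: false)
  return nil if admin

  roles.flat_map { |role| ROLE_DOMAINS.fetch(role, []) }.uniq
end

# An object is allowed when it shares at least one domain with the user.
def allowed_objects(domains)
  return nil if domains.nil? # unrestricted

  OBJECT_DOMAINS.select { |_name, object_domains| (object_domains & domains).any? }.keys
end
```

Using nil for "unrestricted" means callers must distinguish it from an empty array, which would mean "no objects allowed".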
8. Context Compactor — Assistant::ContextCompactor
File: app/services/assistant/context_compactor.rb
Sliding-window context management for long conversations:
- When token count exceeds the threshold, ensure_context_summary! generates (or returns a cached) summary of older messages
- Summary injected as synthetic user + assistant messages in to_llm (after system messages)
- Only recent messages (after compaction_through_message_id) are sent verbatim
- Summary and cutoff ID are cached in AssistantConversation's JSONB metadata
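The message ordering that results from compaction can be sketched as follows. This is an illustration under assumptions: ChatMessage and compacted_context are hypothetical names, and the wording of the synthetic messages is invented for the example.

```ruby
ChatMessage = Struct.new(:id, :role, :content)

# Builds the message list sent to the LLM: system messages first, then a
# synthetic user/assistant pair carrying the summary, then only the messages
# newer than the compaction cutoff.
def compacted_context(messages, summary:, compaction_through_message_id:)
  system_messages, others = messages.partition { |m| m.role == :system }
  recent = others.select { |m| m.id > compaction_through_message_id }

  synthetic = [
    ChatMessage.new(nil, :user, "Summary of the earlier conversation:\n#{summary}"),
    ChatMessage.new(nil, :assistant, 'Understood. I will continue from that summary.')
  ]

  system_messages + synthetic + recent
end
```

Injecting the summary as a user/assistant pair keeps the alternating-role structure that chat APIs generally expect, while the older messages stay in the database untouched.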
9. Cost Calculator — Assistant::CostCalculator
File: app/services/assistant/cost_calculator.rb
Computes per-turn and per-conversation cost from token counts and model pricing:
- Looks up per-token price from ChatService::MODELS
- Accounts for cached tokens (90% discount) and cache-creation tokens
- Called by AssistantConversation#computed_total_cost and sync_token_totals!
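The arithmetic behind a per-turn cost can be sketched as below. The prices are made-up placeholders, the 1.25x cache-creation multiplier is an assumption, and the sketch assumes input_tokens counts only uncached input. Only the 90% discount on cached tokens comes from the text.

```ruby
# Hypothetical prices in USD per million tokens (not a real pricing table).
PRICING = { 'example-model' => { input: 3.0, output: 15.0 } }.freeze

CACHE_READ_MULTIPLIER  = 0.10 # cached tokens billed at 10% (90% discount)
CACHE_WRITE_MULTIPLIER = 1.25 # assumed premium for cache-creation tokens

# Returns the cost of one turn in USD.
def turn_cost(model:, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0)
  price = PRICING.fetch(model)

  # Convert every input-side token class into "full-price input token" units.
  input_units = input_tokens +
                (cached_tokens * CACHE_READ_MULTIPLIER) +
                (cache_creation_tokens * CACHE_WRITE_MULTIPLIER)

  ((input_units * price[:input]) + (output_tokens * price[:output])) / 1_000_000.0
end
```

As a worked example with these placeholder prices: 1,000 uncached input tokens plus 10,000 cached tokens count as 2,000 input units (6,000 price units), and 500 output tokens add 7,500, for a total of 13,500 / 1,000,000 = 0.0135 USD.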
10. Response Formatter — Assistant::ResponseFormatter
File: app/services/assistant/response_formatter.rb
| Transformation | Description |
|---|---|
| Markdown → HTML | Kramdown with GFM + Rouge syntax highlighting |
| HTML sanitization | Sanitize gem strips XSS vectors |
| SQL collapse | SQL blocks wrapped in Bootstrap collapsible accordions |
| Table enhancement | Bootstrap tables with cell formatting and export buttons (Copy, CSV, Excel) |
| Entity linking | Order numbers, customer IDs, SKUs linked to CRM pages |
| Source panel | Source citations styled distinctively |
11. Sunny Brain — AssistantBrainEntry
Files: app/models/assistant_brain_entry.rb, app/controllers/crm/assistant_brain_controller.rb
The Brain is a persistent, CRM-editable knowledge base of rules and facts that are injected into Sunny's system prompt on every conversation. It replaces hard-coded domain rules in prompt files with database-backed entries that any manager can inspect, approve, or correct without a code deploy.
Entry Model
assistant_brain_entries
id, category, title, rule (text),
scope ('global'|'user'), user_id,
applies_to_services (varchar[]),
status ('active'|'pending'|'inactive'),
source ('manual'|'auto_learned'),
suggested_by_id, approved_by_id,
created_at, updated_at
- scope: 'global' means shared across all users; managed by managers / sunny_administrator role
- scope: 'user' means personal preferences for one user; self-managed, never shown to others
- applies_to_services provides contextual filtering: an empty array means "always inject"; a populated array means "only inject when one of these tool services is active in the current conversation" (e.g. ['blog_management'] keeps blog rules out of analytics sessions)
Categories
url_rules, product_data, content_rules, analytics, schema_knowledge, general
System Prompt Injection
ChatService#brain_prompt is called during prompt assembly and appends a ## LEARNED RULES section. It uses hybrid retrieval:
- Universal entries (applies_to_services: []): always injected
- Service-specific entries: when the total exceeds BRAIN_SEMANTIC_THRESHOLD (40) and a user message is present, pgvector cosine similarity selects the top BRAIN_SEMANTIC_TOP_K (20) most relevant entries from candidates; otherwise all are injected
- Entries are sorted by category then title before injection
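The hybrid retrieval rule can be sketched in memory, with plain arrays standing in for pgvector. Several details are assumptions: "the total" is read as universal plus matching service-specific entries, a non-nil query_embedding stands in for "a user message is present", and BrainEntry, cosine_similarity, and select_brain_entries are hypothetical names.

```ruby
BRAIN_SEMANTIC_THRESHOLD = 40
BRAIN_SEMANTIC_TOP_K = 20

BrainEntry = Struct.new(:title, :category, :applies_to_services, :embedding, keyword_init: true)

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Universal entries are always kept; service-specific candidates are narrowed
# to the top K by similarity only when the total is above the threshold.
def select_brain_entries(entries, active_services:, query_embedding: nil)
  universal, specific = entries.partition { |e| e.applies_to_services.empty? }
  candidates = specific.select { |e| (e.applies_to_services & active_services).any? }

  if query_embedding && (universal.size + candidates.size) > BRAIN_SEMANTIC_THRESHOLD
    candidates = candidates.max_by(BRAIN_SEMANTIC_TOP_K) do |e|
      cosine_similarity(e.embedding, query_embedding)
    end
  end

  (universal + candidates).sort_by { |e| [e.category, e.title] }
end
```

In production the similarity ranking would be delegated to the database index rather than computed in Ruby; the in-memory version only shows the selection logic.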
Semantic Embeddings
Each AssistantBrainEntry includes Models::Embeddable. When title or rule changes, an Events::ContentEmbeddingRequired event is published to the Rails Event Store. Embedding::ContentChangedHandler (an async handler) then enqueues an EmbeddingWorker job on the ai_embeddings queue. The worker calls OpenAI text-embedding-3-small and stores a 1536-dimension vector in the content_embeddings_assistant_brain_entries partition (a PostgreSQL list partition of content_embeddings keyed on embeddable_type = 'AssistantBrainEntry').
The partition has:
- A UNIQUE index on (embeddable_id, content_type, locale)
- An HNSW index with vector_cosine_ops for fast approximate nearest-neighbor search
Self-Learning via ProposeBrainEntryTool
Sunny can propose new rules after user corrections by calling the propose_brain_entry tool (registered in ChatToolBuilder). Proposed entries are created with status: 'pending'. Managers see a Pending Approval panel in the Brain CRM page and can approve or reject with a single click.
CRM Interface
GET /assistant_brain — lists:
- Global active entries (grouped by category) — visible to all users with access
- Pending entries awaiting approval — sunny_admin only
- Inactive entries — sunny_admin only
- Current user's personal entries
Access control uses CanCan:
# ability.rb — inside is_manager? block
can :manage, AssistantBrainEntry
# Plus standalone for non-managers with explicit role
can :manage, AssistantBrainEntry if account.has_role?('sunny_administrator', admin_check: false)
The sunny_administrator role (created in 20260227140000_add_sunny_administrator_role) can be granted to non-managers who need brain management access (e.g. content specialists).
12. Blog Management Tool Service
File: app/services/assistant/blog_tool_builder.rb
The blog_management tool service provides Sunny with 8 tools for creating and maintaining blog posts.
Tools
| Tool | Description |
|---|---|
| create_blog_post | Create a draft post with HTML body, SEO fields, tags, preview image |
| update_blog_post | Update any post field; creates a revision; replaces tags atomically |
| get_blog_post | Read all editable fields + current tags, revisions, effective meta description |
| insert_image | Fetch rendered oEmbed HTML for an image embed |
| insert_video | Fetch rendered oEmbed HTML for a video embed |
| insert_faqs | Fetch rendered oEmbed HTML for an FAQ section |
| insert_product | Fetch rendered oEmbed HTML for a product card |
| propose_brain_entry | Propose a new brain rule (available in all services) |
Tagging System (Two-Tier)
Blog posts use two distinct tag types, both applied via the tags: parameter:
Tier 1 — Page-placement tags (public: false): internal tags that make a post appear in the content section of a specific landing page. Format: for-{page-path-parameterized}-page. Used by PagesHelper#page_posts, page_videos, page_showcases, etc. Never shown to visitors (filtered by BlogHelper#tags_with_links).
Examples: for-towel-warmer-page, for-towel-warmer-matte-black-page, for-floor-heating-bathroom-page
Tier 2 — Public navigation tags (public: true): drive the blog tag cloud at /posts/{tag}/tag. 22 tags in production: towel-warmers, installation, indoor-heating, snow-melting, design-trends, etc.
The tags: parameter in update_blog_post replaces the full tag set (clear + add). Always call get_blog_post first to read existing tags before updating. Only tags that already exist in the database are applied — unrecognised names are silently skipped (assign_tags never creates new tags).
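The tag conventions above can be illustrated with two small helpers. Both are hypothetical sketches: page_placement_tag approximates Rails' parameterize in plain Ruby and assumes the page path looks like "towel-warmer/matte-black", and replace_tags only models the "replace, but skip unknown names" behaviour rather than the real assign_tags.

```ruby
# Builds a Tier 1 tag of the form for-{page-path-parameterized}-page.
# Plain-Ruby approximation of parameterize: lowercase, non-alphanumeric
# runs become single hyphens, leading/trailing hyphens are removed.
def page_placement_tag(page_path)
  slug = page_path.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  "for-#{slug}-page"
end

# Models replace semantics: the resulting tag set contains only the requested
# names that already exist; unknown names are skipped, never created.
def replace_tags(requested, existing_tag_names)
  requested.uniq.select { |name| existing_tag_names.include?(name) }
end

existing = %w[installation towel-warmers for-towel-warmer-page]
current  = %w[towel-warmers]

# Read the current tags first, then send the full desired set.
desired = current + [page_placement_tag('towel-warmer')]
puts replace_tags(desired, existing).inspect
```

Because the update replaces the whole set, omitting a tag that is currently applied removes it, which is why reading the existing tags first matters.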
Link Validation
Before calling update_blog_post or create_blog_post with any HTML containing <a> elements, Sunny must validate every link using fetch_url. Links that return 404, redirect loops, or error pages must be corrected before the post is saved. This is enforced by a Brain rule (Validate All Links Before Saving a Blog Post) injected when blog_management + web_fetch are active.
Style Guide (BlogToolBuilder::STYLE_GUIDE)
A condensed style guide is injected into the create_blog_post and update_blog_post tool descriptions, covering headings (H2+), callouts, feature boxes, sidebars, tables, Liquid tags, Liquid variables, internal link patterns (always use {{ locale }}), and the oEmbed embed workflow.
13. Embedding Infrastructure
Files: app/concerns/models/embeddable.rb, app/subscribers/embedding/content_changed_handler.rb, app/workers/embedding_worker.rb
Models::Embeddable Concern
Included by any model that needs semantic search. Provides:
- content_for_embedding(content_type): override to define what text is embedded
- embedding_content_changed?: override to define which attribute changes trigger re-embedding
- queue_embedding_generation: publishes Events::ContentEmbeddingRequired to the event store
The concern uses the Rails Event Store pub/sub system rather than direct worker enqueue, ensuring every embedding regeneration trigger has a permanent, auditable record in event_store_events.
Event Flow
Model#save → after_commit → queue_embedding_generation
→ event_store.publish(Events::ContentEmbeddingRequired, data: {type, id})
→ Embedding::ContentChangedHandler (AsyncJob, ai_embeddings queue)
→ EmbeddingWorker.perform_async(type, id)
→ OpenAI text-embedding-3-small API
→ ContentEmbedding.upsert(embeddable, vector)
ContentEmbedding Table (Partitioned)
The content_embeddings parent table is list-partitioned by embeddable_type. Each model that uses embeddings has its own partition created via pg_party's create_list_partition_of. Per-partition indexes:
- UNIQUE on (embeddable_id, content_type, locale)
- HNSW (vector_cosine_ops) for cosine ANN search
pgvector HNSW indexes cannot be placed on partitioned parent tables; they must be on each partition individually.
Models using embeddings
| Model | Partition | Embedding content |
|---|---|---|
| Article / Post / Faq | content_embeddings_articles | Subject + solution body |
| Video | content_embeddings_videos | Title + description + transcript |
| Showcase | content_embeddings_showcases | Title + description |
| AssistantBrainEntry | content_embeddings_assistant_brain_entries | Title + rule text |
Data Models
AssistantConversation
Table: assistant_conversations
id, user_id, title, messages (jsonb, legacy), metadata (jsonb), llm_model_id,
processing_by_id, processing_since, timestamps
- acts_as_chat from RubyLLM: provides ask(), with_model(), with_tools(), with_instructions(), with_thinking()
- Messages auto-persist to assistant_messages table
- metadata stores via jsonb_accessor: token totals, cost, queries, compaction summary, tool services
- with_instructions override fixes RubyLLM v1.11.0 bug where Content::Raw objects were serialized as #<RubyLLM::Content::Raw:0x...> instead of their content
- to_llm override: eager-loads to prevent N+1 queries, integrates context compaction, filters empty/orphaned/duplicate messages
- Processing lock via processing_by_id + processing_since columns (5-minute staleness threshold)
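The staleness rule for the processing lock can be sketched with an in-memory stand-in for the two columns. ConversationLock and its methods are hypothetical names, and the sketch is not atomic; a real implementation would presumably need a conditional database update so that two workers cannot both acquire the lock.

```ruby
STALE_AFTER_SECONDS = 5 * 60 # 5-minute staleness threshold

# In-memory stand-in for the processing_by_id / processing_since columns.
ConversationLock = Struct.new(:processing_by_id, :processing_since) do
  # A lock only counts as held while it is younger than the threshold.
  def processing?(now: Time.now)
    return false if processing_by_id.nil? || processing_since.nil?

    (now - processing_since) < STALE_AFTER_SECONDS
  end

  # Takes the lock unless a fresh one is held; a stale lock is overwritten.
  def acquire!(worker_id, now: Time.now)
    return false if processing?(now: now)

    self.processing_by_id = worker_id
    self.processing_since = now
    true
  end

  def release!
    self.processing_by_id = nil
    self.processing_since = nil
  end
end
```

The staleness threshold lets a conversation recover on its own if a worker crashes before releasing the lock.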
AssistantMessage
Table: assistant_messages
id, assistant_conversation_id, role, content, content_raw (json),
input_tokens, output_tokens, cached_tokens, cache_creation_tokens,
thinking_text, thinking_tokens, thinking_signature,
llm_model_id, assistant_tool_call_id, timestamps
AssistantToolCall
Table: assistant_tool_calls
id, assistant_message_id, tool_call_id, name, arguments (jsonb), timestamps
AssistantConversationShare
Table: assistant_conversation_shares
id, assistant_conversation_id, shared_with_type, shared_with_id, access_level, timestamps
Access levels: viewer (read-only), collaborator (can send messages).
AssistantBrainEntry
Table: assistant_brain_entries
id, category, title, rule (text),
scope ('global'|'user'), user_id,
applies_to_services (varchar[]),
status ('active'|'pending'|'inactive'),
source ('manual'|'auto_learned'),
suggested_by_id, approved_by_id,
created_at, updated_at
- Includes Models::Auditable (PaperTrail); all changes tracked in record_versions table
- Includes Models::Embeddable; rule text vectorised to content_embeddings_assistant_brain_entries partition
- Scopes: .active, .pending, .global, .for_user(user_id), .for_services(service_keys)
ContentEmbedding
Table: content_embeddings (list-partitioned by embeddable_type)
id, embeddable_type, embeddable_id, content_type, locale, embedding (vector(1536)), timestamps
Used by: AssistantBrainEntry, Article/Post/Faq, Video, Showcase. Cosine ANN search via has_neighbors from the neighbor gem.
UI Layer
Stimulus Controller — assistant_chat_controller.js
| Target | Description |
|---|---|
| messages | Chat message container (auto-scroll target) |
| input | Text input field |
| submitButton | Send button (loading state management) |
| placeholder | Empty-state placeholder |
| modelSelect | Model dropdown selector |
Features:
- MutationObserver: Watches messages target; auto-scrolls on new content
- Completion detection: Scans for [data-chat-complete="true"] marker to re-enable input
- Double-submit prevention: isProcessing flag blocks re-submission while worker runs
Turbo Stream Flow
- User submits → Turbo handles POST /assistant/ask
- Controller responds with ask.turbo_stream.erb → appends user message bubble + processing indicator
- Worker broadcasts via Turbo::StreamsChannel:
  - broadcast_replace_to → live preview with gray monospace text + blinking caret
  - Throttled to ~8fps for smooth streaming
- On completion: Worker broadcasts final rendered HTML replacing the preview
- Stimulus detects [data-chat-complete] → re-enables input
Stream name: assistant_chat:{conversation_id}
Configuration
Model IDs — config/initializers/ai_model_constants.rb
AiModelConstants is the canonical registry for every model ID. LlmDefaults
is a thin backward-compatibility shim that delegates to it:
module LlmDefaults
DEFAULT_SONNET_MODEL = AiModelConstants.id(:anthropic_sonnet) # claude-sonnet-4-6
DEFAULT_OPUS_MODEL = AiModelConstants.id(:anthropic_opus) # claude-opus-4-8
DEFAULT_HAIKU_MODEL = AiModelConstants.id(:anthropic_haiku) # claude-haiku-4-5-20251001
end
All model references in ChatService::MODELS use these constants. Never hardcode
model IDs elsewhere. WeeklyLlmModelSyncWorker syncs the live provider registry
weekly and emails a report when a pinned default has fallen behind.
RubyLLM Provider Keys
| Provider | Env Variable |
|---|---|
| Anthropic | ANTHROPIC_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Gemini | GEMINI_API_KEY |
Database Services
| Service Key | Label | Connection Class |
|---|---|---|
| postgres_production | App DB | ActiveRecord::Base |
| postgres_versions | Versions DB | VersionsDb::Base |
File Index
| File | Purpose |
|---|---|
| app/agents/assistant/sunny_agent.rb | RubyLLM Agent: factory for AssistantConversation records |
| app/prompts/assistant/sunny_agent/instructions.txt.erb | Base system prompt template (identity, date macros, guidelines) |
| app/controllers/crm/assistant_chat_controller.rb | HTTP entry point, user context, tool service selection |
| app/controllers/crm/assistant_brain_controller.rb | CRM CRUD for brain entries; access-gated via CanCan |
| app/workers/assistant_chat_worker.rb | Sidekiq job: LLM loop, streaming, response formatting |
| app/services/assistant/chat_service.rb | Model selection, prompt assembly (incl. brain_prompt), thinking, tool registration |
| app/services/assistant/chat_tool_builder.rb | Builds RubyLLM::Tool instances (content, DB, blog, brain tools) |
| app/services/assistant/blog_tool_builder.rb | Blog create/update/get/embed tools + tagging helpers |
| app/services/assistant/sql_broker.rb | SQL execution: read-only, access control, redaction, audit |
| app/services/assistant/data_policy.rb | Object + column access rules, PII redaction |
| app/services/assistant/data_domain_policy.rb | Role → domain → allowed objects resolution |
| app/services/assistant/comment_manifest.rb | YAML manifest reader: domains, restricted columns, comments |
| app/services/assistant/tool_loop_guard.rb | Prevents repetitive tool call loops |
| app/services/assistant/context_compactor.rb | Sliding-window conversation summarisation |
| app/services/assistant/cost_calculator.rb | Per-turn and per-conversation cost from token counts |
| app/services/assistant/response_formatter.rb | Markdown → HTML with tables, SQL collapse, entity links |
| app/models/assistant_conversation.rb | Conversation record (acts_as_chat) with token tracking |
| app/models/assistant_message.rb | Per-message record (acts_as_message) with token tracking |
| app/models/assistant_tool_call.rb | Tool invocation record (acts_as_tool_call) |
| app/models/assistant_conversation_share.rb | Conversation sharing (viewer / collaborator) |
| app/models/assistant_brain_entry.rb | Learned rules model; embeddable + auditable |
| app/models/content_embedding.rb | Polymorphic vector store (has_neighbors); partitioned by type |
| app/concerns/models/embeddable.rb | Concern: queue_embedding_generation via Rails Event Store |
| app/events/events/content_embedding_required.rb | Pub/sub event: signals an embedding regeneration is needed |
| app/subscribers/embedding/content_changed_handler.rb | Async handler: enqueues EmbeddingWorker on ai_embeddings queue |
| app/workers/embedding_worker.rb | Calls OpenAI text-embedding-3-small; upserts to content_embeddings |
| config/initializers/llm_defaults.rb | Canonical Anthropic model ID constants |
| config/analytics/data_domains.yml | Domain descriptions + role mappings |
| db/comments/*.yml | Per-table manifests with domain, comments, restricted flags |
| db/data/brain_entry_embeddings_seed.json | Pre-computed embeddings for initial brain entries (avoids API costs on deploy) |
Access Control
Who can use Sunny
All CRM accounts. Tool services available vary by role:
| Tool service | Non-admin | Admin / Manager |
|---|---|---|
| content | Yes | Yes |
| postgres_production | Domain-restricted | Full |
| postgres_versions (audit DB) | No | Yes |
| blog_management | No | Yes |
Who can manage the Brain
The can :manage, AssistantBrainEntry CanCan ability is granted to:
- Any account where is_manager? returns true (inside the shared manager ability block in ability.rb)
- Any account with the sunny_administrator role (can be assigned to non-managers, e.g. content specialists)
All CRM views (index.html.erb, histories.html.erb) use can?(:manage, AssistantBrainEntry) to conditionally render the "Sunny Brain" button. The controller uses the same ability check via require_sunny_admin!.
Testing
| Test File | Coverage |
|---|---|
| test/agents/assistant/sunny_agent_create_test.rb | create! returns persisted conversation; no llm_model; no system messages at factory time; title defaults and custom title |
| test/agents/assistant/sunny_agent_find_test.rb | find returns existing conversation without mutating it; no llm_model; no system messages |
| test/integration/sunny_blog_editor_end_to_end_test.rb | Julia-style blog tool flows (conv 1565); assertion helpers in test/support/sunny_blog_editor_julia_flow_helpers.rb |
| test/models/assistant_conversation_test.rb | to_llm nil guard, track_query!, processing lock, scopes |
| test/models/assistant_brain_entry_test.rb | Scopes, status transitions, for_services filtering |
| test/services/assistant/chat_service_test.rb | MODELS registry, auto_select_model, estimate_tokens, base prompt, brain prompt injection |
| test/workers/assistant_chat_worker_test.rb | broadcast_error mappings, graceful error handling |
| test/initializers/llm_defaults_test.rb | Model IDs valid, Anthropic pattern, no future dates, RubyLLM resolution |
Run AI assistant tests:
mise exec -- bin/rails test test/agents/ \
test/models/assistant_conversation_test.rb \
test/models/assistant_brain_entry_test.rb \
test/services/assistant/chat_service_test.rb \
test/workers/assistant_chat_worker_test.rb \
test/initializers/llm_defaults_test.rb
Migration History (Sunny features)
| Migration | Description |
|---|---|
| 20260227100000_create_assistant_brain_entries | Creates assistant_brain_entries table |
| 20260227110000_add_scope_to_brain_entries | Adds scope, user_id, applies_to_services columns |
| 20260227140000_add_sunny_administrator_role | Creates sunny_administrator role in roles table |
| 20260228100001_seed_assistant_brain_entries | Seeds 14 initial brain entries (stub class pattern) |
| 20260228160000_add_brain_entry_embeddings | Creates content_embeddings partitioned table + AssistantBrainEntry partition; seeds pre-computed vectors |
| 20260228170000_seed_brain_entry_embeddings | Inserts pre-computed embedding vectors for the 14 initial entries |
| 20260228190000_add_blog_tagging_brain_entry | Seeds two-tier tagging brain entry; queues embedding via after_commit |
Migration pattern: Data migrations that create AssistantBrainEntry records during a deploy sequence must use a local stub class (class BrainEntry < ApplicationRecord; self.table_name = 'assistant_brain_entries'; end) if the migration runs before all schema migrations are applied. This avoids NoMethodError from model validations referencing not-yet-existing columns. The final tagging entry (20260228190000) intentionally uses the real model so the after_commit embedding callback fires.