AI Agent Architecture
Status: Production Ready
Last Updated: February 2026
Overview
The AI Agent (Sunny) is an internal assistant for WarmlyYours employees, accessible at `/assistant` in the CRM. It answers business questions using real-time database queries, content search, blog management, and multi-model LLM reasoning. The system is built on RubyLLM’s `acts_as_chat` + Agent patterns, with role-based data access, a persistent self-learning brain, tool-loop guardrails, audit logging, and streaming Turbo Stream responses.
Key capabilities:
- Natural-language SQL: Users ask questions; the LLM writes, executes, and interprets read-only SQL
- Content search: Semantic search across products, FAQs, blog posts, videos, showcases, and reviews
- Blog management: Create, update, and enrich blog posts with linked assets (images, videos, FAQs, product cards), proper tagging, and SEO metadata
- Multi-model support: Auto-selects from Claude, GPT-4.1, or Gemini based on query complexity
- Extended thinking: Activates reasoning scratchpad for analytical queries (Anthropic/Gemini)
- Data domain access control: CanCanCan roles map to data domains; each database table declares which domains it belongs to
- Streaming UI: Real-time token-by-token streaming with live preview, tool status, and formatted output
- Conversation sharing: Owner can share conversations with colleagues at `viewer` or `collaborator` level
- Context compaction: Sliding-window summary for long conversations to stay within token limits
- Sunny Brain: Persistent, editable knowledge base of learned rules injected into every system prompt; self-updates via semantic embeddings
Architecture Diagram
```text
Browser (CRM)
  Stimulus assistant_chat controller
   ├─ MutationObserver
   ├─ auto-scroll
   └─ completion detect
        │ POST /ask                     ▲ Turbo Streams (ActionCable)
        ▼                               │
Rails Server
  Crm::AssistantChatController
   ├─ build_user_context()
   ├─ sanitize_tool_services()
   └─ AssistantChatWorker.perform_async()
        │ Sidekiq
        ▼
  AssistantChatWorker
   ├─ ChatService.new(...)
   │   ├─ auto_select_model()
   │   ├─ configure_conversation()
   │   │   ├─ with_model()
   │   │   ├─ with_instructions()
   │   │   ├─ with_tools()
   │   │   ├─ with_thinking()
   │   │   └─ on_tool_call()
   │   └─ conversation.ask()        ──► External: Anthropic, OpenAI, Gemini
   │       ├─ LLM streaming
   │       ├─ tool execution        ──► Tool Execution Layer
   │       └─ auto-persist msgs
   ├─ broadcast_chunk()             ──► Browser (Turbo Streams)
   ├─ broadcast_complete()          ──► Browser (Turbo Streams)
   │   └─ ResponseFormatter
   └─ finalize_response()

Tool Execution Layer
  Assistant::ChatToolBuilder
   ├─ Content tools (semantic_search, FAQs)   ──► pgvector / OpenAI Embeddings
   └─ PostgreSQL tools
       ├─ describe_available_data             ──► CommentManifest (db/comments/*.yml)
       ├─ execute_sql ──► SqlBroker           ──► PostgreSQL (read-only)
       ├─ list_schemas
       ├─ list_objects
       ├─ get_object_details
       └─ explain_query

Security Layer
  Assistant::DataDomainPolicy  ◄── data_domains.yml
   ├─ role → domain mapping
   └─ CommentManifest.objects_for_domains()
  Assistant::DataPolicy
   ├─ ALWAYS_BLOCKED_OBJECTS
   ├─ object_allowed?()
   └─ sensitive_columns_for_query()
  Assistant::ToolLoopGuard
   ├─ MAX_IDENTICAL_CALLS = 2
   └─ MAX_CONSECUTIVE_SQL_FAILURES = 3
  AssistantSqlAuditLog (every SQL execution)
```
Components
1. Agent — Assistant::SunnyAgent
File: `app/agents/assistant/sunny_agent.rb`
RubyLLM::Agent subclass that serves as the canonical factory and configuration baseline for Sunny conversations. Follows the RubyLLM Agent pattern: chat_model wires create!/find to AssistantConversation, and instructions { nil } disables automatic discovery so ChatService can apply per-request configuration.
```ruby
# Create a new conversation
conversation = Assistant::SunnyAgent.create!(user: current_user, messages: [])

# Load an existing conversation
conversation = Assistant::SunnyAgent.find(conversation_id)
```
All dynamic configuration (model, temperature, system prompt, tools, thinking, Anthropic prompt caching) is applied per-request by `ChatService#configure_conversation`. The agent is intentionally lean: it is a factory, not an orchestrator.
2. Controller — Crm::AssistantChatController
File: `app/controllers/crm/assistant_chat_controller.rb`
The HTTP entry point. Handles conversation CRUD and delegates LLM processing to a Sidekiq worker.
| Action | Method | Description |
|---|---|---|
| `index` | GET | Load most recent conversation; list sidebar conversations |
| `show` | GET | Load a specific conversation (owned or shared) |
| `create` | POST | Create a new conversation via `SunnyAgent.create!` |
| `ask` | POST | Accept user message, enqueue `AssistantChatWorker` |
| `cancel` | POST | Set Redis cancellation flag; worker checks between chunks |
| `destroy` | DELETE | Delete a conversation |
Key responsibilities:
- User context assembly: `build_user_context()` serializes identity (name, party_id, department, job_title), role flags (is_admin, is_manager), and resolved `analytics_domains` into a Hash passed to the worker.
- Tool service selection: `available_chat_services()` limits non-admins to `content` + `postgres_production`. Admins also get `postgres_versions` (audit trail DB).
- Processing lock: Checks `conversation.processing?` before enqueuing; blocks if a job is already running.
- Access control: Viewers (shared conversations) cannot submit new messages.
3. Worker — AssistantChatWorker
File: `app/workers/assistant_chat_worker.rb`
Sidekiq job that runs the full LLM interaction loop in the background.
Lifecycle:
- Acquire processing lock (`acquire_processing_lock!`)
- Load conversation, instantiate `Assistant::ChatService`
- Call `service.call { |chunk| broadcast_chunk(chunk) }`
- LLM streams response tokens → worker broadcasts live preview via `Turbo::StreamsChannel`
- On tool calls: LLM pauses, tool executes, result fed back, LLM continues
- On completion: `ResponseFormatter` renders final markdown → HTML; broadcast replaces preview
- `finalize_response()`: track metrics, sync token totals, dual-write legacy JSONB
- Release processing lock
Streaming: The worker throttles preview updates to ~8fps (0.12s interval) for smooth UX.
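The throttling can be pictured as a small time gate in front of the broadcast call. The following is a minimal sketch, not the worker's actual code: `PreviewThrottle`, its injectable `clock`, and the block-based broadcast are hypothetical names; only the 0.12s interval comes from the text above.

```ruby
# Hypothetical sketch of time-based throttling for streamed preview updates.
class PreviewThrottle
  def initialize(interval: 0.12, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @interval = interval
    @clock = clock          # injectable so the gate can be tested without sleeping
    @last_sent_at = nil
    @buffer = +""
  end

  # Appends the chunk to the buffer. Yields the accumulated text only when
  # at least `interval` seconds have passed since the previous broadcast.
  def push(chunk)
    @buffer << chunk
    now = @clock.call
    return false if @last_sent_at && (now - @last_sent_at) < @interval

    @last_sent_at = now
    yield @buffer.dup
    true
  end
end
```

Chunks that arrive inside the interval are not lost; they stay in the buffer and go out with the next permitted update, so the final broadcast still needs to send the complete text.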
Cancellation: cancel action sets cancelled:{jid} in Redis (TTL 300s). Worker checks cancelled? between chunks and stops streaming.
Error handling: broadcast_error maps known error patterns (rate limits, timeouts, invalid model IDs, JSON parse failures) to user-friendly messages.
4. Chat Service — Assistant::ChatService
File: `app/services/assistant/chat_service.rb`
Orchestrates a single chat turn: model selection, system prompt assembly, tool registration, thinking activation, and streaming execution.
Model Selection
```ruby
MODELS = {
  'claude-haiku'  => { id: LlmDefaults::DEFAULT_HAIKU_MODEL,  provider: :anthropic, cost: :low,    supports_thinking: false },
  'claude-sonnet' => { id: LlmDefaults::DEFAULT_SONNET_MODEL, provider: :anthropic, cost: :medium, supports_thinking: true, thinking_effort_default: :medium },
  'claude-opus'   => { id: LlmDefaults::DEFAULT_OPUS_MODEL,   provider: :anthropic, cost: :high,   supports_thinking: true, thinking_effort_default: :high },
  'gpt-4.1'       => { id: 'gpt-4.1',          provider: :openai, cost: :medium, supports_thinking: false },
  'gpt-4.1-mini'  => { id: 'gpt-4.1-mini',     provider: :openai, cost: :low,    supports_thinking: false },
  'gemini-flash'  => { id: 'gemini-2.5-flash', provider: :gemini, cost: :low,    supports_thinking: true, thinking_effort_default: :low },
  'gemini-pro'    => { id: 'gemini-2.5-pro',   provider: :gemini, cost: :medium, supports_thinking: true, thinking_effort_default: :medium },
}
```
Auto-selection (`model: 'auto'`) uses regex pattern matching + token estimation:
- Simple lookups (< 50 tokens, “show me”, “list”, “count”) → `claude-haiku`
- Complex/comparison queries (“trend”, “correlate”, “why”, > 150 tokens) → `claude-sonnet`
- Default → `claude-sonnet`
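The routing rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production `auto_select_model`: the exact patterns, the 4-characters-per-token estimate, the precedence of the complex check, and the reading of "simple" as short *and* matching a lookup phrase are all guesses.

```ruby
# Hypothetical sketch of regex + token-estimate model routing.
SIMPLE_PATTERNS  = /\b(show me|list|count)\b/i   # assumed lookup phrases
COMPLEX_PATTERNS = /\b(trend|correlate|why)\b/i  # assumed analytical phrases

# Rough token estimate; the 4 chars/token ratio is an assumption.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

def auto_select_model(query)
  tokens = estimate_tokens(query)
  # Complex queries are checked first so that "why ... list ..." is not downgraded.
  return 'claude-sonnet' if tokens > 150 || query.match?(COMPLEX_PATTERNS)
  return 'claude-haiku'  if tokens < 50 && query.match?(SIMPLE_PATTERNS)

  'claude-sonnet' # default
end
```

For example, `auto_select_model("show me open orders")` takes the simple-lookup branch, while a query containing "why" or "trend" is routed to the Sonnet tier regardless of length.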
Anthropic model IDs are referenced through `LlmDefaults` (`config/initializers/llm_defaults.rb`), which delegates to `AiModelConstants` (see Configuration), so each ID is defined in a single place.
System Prompt
Assembled from layers, joined with `"\n\n"`:
- Base template (`app/prompts/assistant/sunny_agent/instructions.txt.erb`): Identity, date macros, formatting guidelines, source citation rules
- `user_context_prompt`: User identity (`## CURRENT USER`), party_id mappings for “my” queries, data access domains
- `schema_hint_for_role`: Instructs the LLM to call `describe_available_data`; power users also get `list_schemas` / `get_object_details`
- `tools_system_prompt`: Tool names, tool-mode rules, and service-specific rules (content, blog, GA4, GSC, Ahrefs, etc.)
- `crm_url_templates_prompt`: CRM URL patterns for linking records by ID
- `brain_prompt`: Hybrid-retrieved `AssistantBrainEntry` rules injected under `## LEARNED RULES`; contextually filtered to entries whose `applies_to_services` overlap with the active tool services
For Anthropic models, the system prompt is wrapped with cache: true via RubyLLM::Providers::Anthropic::Content. This enables up to 90% input token savings on subsequent turns.
Extended Thinking
When the model supports it and the query matches `THINKING_QUERY_PATTERNS`:
- Activated via `conversation.with_thinking(effort:, budget:)`
- Budget tokens: 4K (low) / 8K (medium) / 16K (high, Opus only)
- Anthropic requires `temperature: 1` when thinking is active
- Thinking traces are persisted to `assistant_messages.thinking_text`
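As a worked example of how the budget feeds the request parameters, here is a minimal sketch. `thinking_params` and `THINKING_BUDGETS` are hypothetical names, and reading "4K / 8K / 16K" as 4,000 / 8,000 / 16,000 tokens is an assumption; the `budget + 16_000` output allowance and the temperature of 1 follow the `configure_conversation` listing in this section.

```ruby
# Hypothetical mapping from thinking effort to request parameters.
THINKING_BUDGETS = { low: 4_000, medium: 8_000, high: 16_000 }.freeze

def thinking_params(effort)
  budget = THINKING_BUDGETS.fetch(effort) # raises KeyError on an unknown effort
  {
    effort: effort,
    budget: budget,
    max_tokens: budget + 16_000, # thinking budget plus room for the visible answer
    temperature: 1               # required while thinking is active
  }
end
```

With `:medium` effort this yields a budget of 8,000 thinking tokens and a `max_tokens` of 24,000.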
Conversation Configuration (configure_conversation)
```ruby
conversation.with_model(model_id, provider:, assume_exists: true)
conversation.with_instructions(cacheable_system_prompt, replace: true)
conversation.with_thinking(effort:, budget:)           # if applicable
conversation.with_params(max_tokens: budget + 16_000)  # if thinking with budget
conversation.with_temperature(thinking? ? 1 : 0.3)
conversation.with_tools(*tools)                        # from ChatToolBuilder
conversation.on_new_message { emit_status(:thinking) }
conversation.on_tool_call { |tc| emit_status(...) }
conversation.on_tool_result { ... }
conversation.on_end_message { emit_status(:composing) }
```
5. Tool Builder — Assistant::ChatToolBuilder
File: `app/services/assistant/chat_tool_builder.rb`
Dynamically builds RubyLLM::Tool subclasses for the chat. No HTTP hop to the MCP server — tools execute in-process.
Tool registration is keyed on tool_services — an array passed from the controller that determines which tool groups are active for this conversation. Non-admins are limited to content + postgres_production; admins also get postgres_versions.
Content Tools (content)
| Tool | Description |
|---|---|
| `semantic_search` | Semantic vector search across all content types |
| `find_faqs` | FAQ search with product line filtering |
| `find_call_recordings` | Permission-gated call transcript search |
PostgreSQL Tools (postgres_production, postgres_versions)
Built per database service key. Access level depends on role:
| Tool | Admin/Manager | Employee |
|---|---|---|
| `describe_available_data` | Yes | Yes |
| `execute_sql` | Full | Domain-restricted |
| `list_schemas` | Yes | No |
| `list_objects` | Yes | No |
| `get_object_details` | Yes | No |
| `explain_query` | Yes | No |
Blog Management Tools (blog_management)
| Tool | Description |
|---|---|
| `create_blog_post` | Create a draft post (body, SEO, tags, preview image) |
| `update_blog_post` | Update fields + atomically replace tags; creates a revision |
| `get_blog_post` | Read all editable fields, current tags, revision count |
| `insert_image` | oEmbed rendered HTML for an image embed |
| `insert_video` | oEmbed rendered HTML for a video embed (Cloudflare or YouTube) |
| `insert_faqs` | oEmbed rendered HTML for an FAQ section |
| `insert_product` | oEmbed rendered HTML for a product card (Liquid tag) |
Brain Tool (all services)
| Tool | Description |
|---|---|
| `propose_brain_entry` | Propose a new learned rule; creates a pending brain entry |
6. SQL Broker — Assistant::SqlBroker
File: `app/services/assistant/sql_broker.rb`
Centralized execution layer for all AI-generated SQL.
Enforcement chain:
- Read-only check: Only `SELECT`, `SHOW`, `WITH`, `EXPLAIN`, `SET` are allowed
- SQL normalization: Strip comments, collapse whitespace
- Object extraction: `PgQuery.parse(sql).tables` via AST
- Object-level access: `DataPolicy.object_allowed?()` checks against `ALWAYS_BLOCKED_OBJECTS` + domain-resolved `allowed_objects`
- Query execution: Read-only transaction with statement timeout (8s)
- Column-level redaction: PII columns redacted from results
- Row cap: Truncates to 50 rows; adds truncation notice
- Audit logging: Every execution logged to `assistant_sql_audit_logs`
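Two of the steps in this chain, the read-only keyword check and the row cap, are easy to illustrate in plain Ruby. The sketch below is not `SqlBroker` itself: `SqlGuardSketch` and its method names are hypothetical, the comment stripping is a naive regex that would mis-handle `--` inside string literals, and it normalizes before checking the keyword, which differs from the order listed above.

```ruby
# Hypothetical sketch of a leading-keyword check and a result row cap.
module SqlGuardSketch
  ALLOWED_LEADING_KEYWORDS = %w[SELECT SHOW WITH EXPLAIN SET].freeze
  MAX_ROWS = 50

  module_function

  # Strips block and line comments, then collapses whitespace (naive).
  def normalize(sql)
    sql.gsub(%r{/\*.*?\*/}m, ' ').gsub(/--[^\n]*/, ' ').gsub(/\s+/, ' ').strip
  end

  # True when the first word of the statement is an allowed keyword.
  def read_only?(sql)
    first_word = normalize(sql)[/\A[A-Za-z]+/]
    !first_word.nil? && ALLOWED_LEADING_KEYWORDS.include?(first_word.upcase)
  end

  # Returns [rows, notice]; notice is nil when nothing was truncated.
  def cap_rows(rows)
    return [rows, nil] if rows.size <= MAX_ROWS

    [rows.first(MAX_ROWS), "Results truncated to #{MAX_ROWS} of #{rows.size} rows"]
  end
end
```

A keyword check alone is not sufficient in PostgreSQL, since a `WITH` clause can wrap data-modifying statements. That is why the chain also runs every query inside a read-only transaction.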
7. Data Domain Access Control
Same domain system as described in the Analytics section:
- File: `config/analytics/data_domains.yml`
- 5 business domains: `sales`, `financial`, `workforce`, `operations`, `marketing`
- Role mappings: `sales_rep`, `sales_manager`, `accounting_manager`, `marketing_manager`, etc.
- Database tables/views declare their domain(s) in `db/comments/*.yml` manifests
- Admins get `nil` (unrestricted); `DataDomainPolicy` resolves roles → domains → allowed objects
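The roles → domains → allowed objects resolution can be sketched with two in-memory hashes standing in for `data_domains.yml` and the `db/comments/*.yml` manifests. Everything concrete here is invented for illustration: the role-to-domain assignments, the table names, and the `allowed_objects` helper are assumptions, not the contents of the real files.

```ruby
# Hypothetical stand-in for the role mappings in data_domains.yml.
ROLE_DOMAINS = {
  'sales_rep'          => %w[sales],
  'sales_manager'      => %w[sales operations],
  'accounting_manager' => %w[financial]
}.freeze

# Hypothetical stand-in for the per-table domain declarations.
OBJECT_DOMAINS = {
  'orders'   => %w[sales financial],
  'invoices' => %w[financial],
  'payroll'  => %w[workforce]
}.freeze

# Returns nil for admins (unrestricted), otherwise the sorted allowed objects.
def allowed_objects(roles, admin: false)
  return nil if admin

  domains = roles.flat_map { |role| ROLE_DOMAINS.fetch(role, []) }.uniq
  OBJECT_DOMAINS.select { |_name, object_domains| (object_domains & domains).any? }.keys.sort
end
```

Note the distinction between `nil` (no restriction at all) and an empty array (no access): a caller that treats both as falsy would accidentally grant or deny everything.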
8. Context Compactor — Assistant::ContextCompactor
File: `app/services/assistant/context_compactor.rb`
Sliding-window context management for long conversations:
- When token count exceeds the threshold, `ensure_context_summary!` generates (or returns a cached) summary of older messages
- Summary injected as synthetic user + assistant messages in `to_llm` (after system messages)
- Only recent messages (after `compaction_through_message_id`) are sent verbatim
- Summary and cutoff ID are cached in `AssistantConversation`’s JSONB metadata
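The message list that results from compaction can be sketched as below. This is an illustration only: `Message`, `compacted_messages`, and the wording of the synthetic user prompt are hypothetical, and the real `to_llm` override also handles eager loading and message filtering that are left out here.

```ruby
Message = Struct.new(:id, :role, :content)

# Hypothetical sketch: system messages first, then a synthetic user + assistant
# summary pair, then only the messages newer than the compaction cutoff.
def compacted_messages(messages, summary:, cutoff_id:)
  return messages if summary.nil? || cutoff_id.nil?

  system, rest = messages.partition { |m| m.role == :system }
  recent = rest.select { |m| m.id > cutoff_id }
  synthetic = [
    Message.new(nil, :user, 'Summarize our conversation so far.'),
    Message.new(nil, :assistant, summary)
  ]
  system + synthetic + recent
end
```

Presenting the summary as a user/assistant exchange keeps the alternating role structure that chat APIs expect, instead of adding a second system message mid-conversation.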
9. Cost Calculator — Assistant::CostCalculator
File: `app/services/assistant/cost_calculator.rb`
Computes per-turn and per-conversation cost from token counts and model pricing:
- Looks up per-token price from `ChatService::MODELS`
- Accounts for cached tokens (90% discount) and cache-creation tokens
- Called by `AssistantConversation#computed_total_cost` and `sync_token_totals!`
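A worked version of the arithmetic, as a sketch rather than the real `CostCalculator`: the prices, the `example-model` key, the 1.25x cache-creation multiplier, and the assumption that `input_tokens` excludes cached and cache-creation tokens are all illustrative. Only the 90% cached-token discount comes from the text.

```ruby
# Hypothetical prices in USD per million tokens; not a real rate card.
PRICES = { 'example-model' => { input: 3.0, output: 15.0 } }.freeze
CACHE_READ_DISCOUNT    = 0.90 # cached tokens billed at 10% of the input price
CACHE_WRITE_MULTIPLIER = 1.25 # assumption: cache-creation tokens carry a surcharge

def turn_cost(model, input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0)
  price = PRICES.fetch(model)
  input_rate  = price[:input] / 1_000_000.0
  output_rate = price[:output] / 1_000_000.0

  input_tokens * input_rate +
    cached_tokens * input_rate * (1 - CACHE_READ_DISCOUNT) +
    cache_creation_tokens * input_rate * CACHE_WRITE_MULTIPLIER +
    output_tokens * output_rate
end
```

Under these made-up prices, a turn with 1,000 fresh input tokens, 10,000 cached tokens, and 500 output tokens costs about $0.0135. The 10,000 cached tokens contribute the same $0.003 as the 1,000 uncached ones.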
10. Response Formatter — Assistant::ResponseFormatter
File: `app/services/assistant/response_formatter.rb`
| Transformation | Description |
|---|---|
| Markdown → HTML | Kramdown with GFM + Rouge syntax highlighting |
| HTML sanitization | Sanitize gem strips XSS vectors |
| SQL collapse | SQL blocks wrapped in Bootstrap collapsible accordions |
| Table enhancement | Bootstrap tables with cell formatting and export buttons (Copy, CSV, Excel) |
| Entity linking | Order numbers, customer IDs, SKUs linked to CRM pages |
| Source panel | Source citations styled distinctively |
11. Sunny Brain — AssistantBrainEntry
Files: `app/models/assistant_brain_entry.rb`, `app/controllers/crm/assistant_brain_controller.rb`
The Brain is a persistent, CRM-editable knowledge base of rules and facts that are injected into Sunny’s system prompt on every conversation. It replaces hard-coded domain rules in prompt files with database-backed entries that any manager can inspect, approve, or correct without a code deploy.
Entry Model
```text
assistant_brain_entries
  id, category, title, rule (text),
  scope ('global'|'user'), user_id,
  applies_to_services (varchar[]),
  status ('active'|'pending'|'inactive'),
  source ('manual'|'auto_learned'),
  suggested_by_id, approved_by_id,
  created_at, updated_at
```
- `scope: 'global'`: shared across all users; managed by managers / `sunny_administrator` role
- `scope: 'user'`: personal preferences for one user; self-managed, never shown to others
- `applies_to_services`: contextual filtering. An empty array means “always inject”; a populated array means “only inject when one of these tool services is active in the current conversation” (e.g. `['blog_management']` keeps blog rules out of analytics sessions)
Categories
`url_rules`, `product_data`, `content_rules`, `analytics`, `schema_knowledge`, `general`
System Prompt Injection
`ChatService#brain_prompt` is called during prompt assembly and appends a `## LEARNED RULES` section. It uses hybrid retrieval:
- Universal entries (`applies_to_services: []`): always injected
- Service-specific entries: when the total exceeds `BRAIN_SEMANTIC_THRESHOLD` (40) and a user message is present, pgvector cosine similarity selects the top `BRAIN_SEMANTIC_TOP_K` (20) most relevant entries from candidates; otherwise all are injected
- Entries are sorted by category then title before injection
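The hybrid selection can be illustrated in memory, with plain arrays in place of pgvector. This sketch makes several assumptions: `Entry` and `select_brain_entries` are hypothetical, "the total" is read as the total number of candidate entries, and the filtering of service-specific entries by active tool service is assumed to have happened already.

```ruby
# Hypothetical in-memory version of hybrid retrieval; production uses pgvector.
Entry = Struct.new(:title, :category, :services, :embedding)

def cosine_similarity(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

def select_brain_entries(entries, query_embedding:, threshold: 40, top_k: 20)
  universal, specific = entries.partition { |e| e.services.empty? }
  # Rank semantically only when the set is large and there is a query to compare against.
  if query_embedding && entries.size > threshold
    specific = specific.max_by(top_k) { |e| cosine_similarity(e.embedding, query_embedding) }
  end
  (universal + specific).sort_by { |e| [e.category, e.title] }
end
```

Universal entries bypass the ranking entirely, so rules that must always apply cannot be crowded out by more "relevant" service-specific ones.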
Semantic Embeddings
Each `AssistantBrainEntry` includes `Models::Embeddable`. When `title` or `rule` changes, an `Events::ContentEmbeddingRequired` event is published to the Rails Event Store. `Embedding::ContentChangedHandler` (async handler) enqueues an `EmbeddingWorker` job on the `ai_embeddings` queue, which calls OpenAI `text-embedding-3-small` and stores a 1536-dimension vector in the `content_embeddings_assistant_brain_entries` partition (PostgreSQL list partition of `content_embeddings` keyed on `embeddable_type = 'AssistantBrainEntry'`).
The partition has:
- A UNIQUE index on `(embeddable_id, content_type, locale)`
- An HNSW index with `vector_cosine_ops` for fast approximate nearest-neighbor search
Self-Learning via ProposeBrainEntryTool
Sunny can propose new rules after user corrections by calling the `propose_brain_entry` tool (registered in `ChatToolBuilder`). Proposed entries are created with `status: 'pending'`. Managers see a Pending Approval panel in the Brain CRM page and can approve or reject with a single click.
CRM Interface
`GET /assistant_brain` lists:
- Global active entries (grouped by category) — visible to all users with access
- Pending entries awaiting approval — sunny_admin only
- Inactive entries — sunny_admin only
- Current user’s personal entries
Access control uses CanCan:
```ruby
# ability.rb, inside the is_manager? block
can :manage, AssistantBrainEntry

# Plus standalone for non-managers with explicit role
can :manage, AssistantBrainEntry if account.has_role?('sunny_administrator', admin_check: false)
```
The `sunny_administrator` role (created in `20260227140000_add_sunny_administrator_role`) can be granted to non-managers who need brain management access (e.g. content specialists).
12. Blog Management Tool Service
File: `app/services/assistant/blog_tool_builder.rb`
The blog_management tool service provides Sunny with 8 tools for creating and maintaining blog posts.
| Tool | Description |
|---|---|
| `create_blog_post` | Create a draft post with HTML body, SEO fields, tags, preview image |
| `update_blog_post` | Update any post field; creates a revision; replaces tags atomically |
| `get_blog_post` | Read all editable fields + current tags, revisions, effective meta description |
| `insert_image` | Fetch rendered oEmbed HTML for an image embed |
| `insert_video` | Fetch rendered oEmbed HTML for a video embed |
| `insert_faqs` | Fetch rendered oEmbed HTML for an FAQ section |
| `insert_product` | Fetch rendered oEmbed HTML for a product card |
| `propose_brain_entry` | Propose a new brain rule (available in all services) |
Tagging System (Two-Tier)
Blog posts use two distinct tag types, both applied via the `tags:` parameter:
Tier 1 — Page-placement tags (public: false): internal tags that make a post appear in the content section of a specific landing page. Format: for-{page-path-parameterized}-page. Used by PagesHelper#page_posts, page_videos, page_showcases, etc. Never shown to visitors (filtered by BlogHelper#tags_with_links).
Examples: for-towel-warmer-page, for-towel-warmer-matte-black-page, for-floor-heating-bathroom-page
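The naming format can be sketched as a small helper. `page_placement_tag` is a hypothetical name, and the assumption that a page path such as `/towel-warmer/matte-black` maps to the tag by replacing separators with hyphens is inferred from the examples, not confirmed by the source.

```ruby
# Hypothetical helper mirroring the for-{page-path-parameterized}-page format.
def page_placement_tag(page_path)
  # Lowercase, turn every run of non-alphanumerics into one hyphen, trim the ends.
  slug = page_path.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  "for-#{slug}-page"
end
```

Under that assumption, `page_placement_tag('/towel-warmer/matte-black')` produces `for-towel-warmer-matte-black-page`, which matches the second example above.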
Tier 2 — Public navigation tags (public: true): drive the blog tag cloud at /posts/{tag}/tag. 22 tags in production: towel-warmers, installation, indoor-heating, snow-melting, design-trends, etc.
The tags: parameter in update_blog_post replaces the full tag set (clear + add). Always call get_blog_post first to read existing tags before updating. Only tags that already exist in the database are applied — unrecognised names are silently skipped (assign_tags never creates new tags).
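The replace-and-skip semantics described above can be sketched as follows. This is not the real `assign_tags`: the signature, the `existing_tags:` keyword, and the returned hash are hypothetical, chosen to make the silent skipping visible.

```ruby
require 'set'

# Hypothetical sketch: the requested list becomes the full tag set, and names
# that do not already exist are dropped rather than created.
def assign_tags(requested, existing_tags:)
  known = existing_tags.to_set
  applied, skipped = requested.uniq.partition { |name| known.include?(name) }
  { applied: applied, skipped: skipped }
end

# Because the update replaces the whole set, adding one tag means sending
# the current tags plus the new one.
current_tags = %w[towel-warmers for-towel-warmer-page]
new_tag_set  = (current_tags + %w[installation]).uniq
```

Sending only `%w[installation]` would drop the page-placement tag and remove the post from its landing page, which is why reading the current tags first matters.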
Link Validation
Before calling `update_blog_post` or `create_blog_post` with any HTML containing `<a>` elements, Sunny must validate every link using `fetch_url`. Links that return 404, redirect loops, or error pages must be corrected before the post is saved. This is enforced by a Brain rule (Validate All Links Before Saving a Blog Post) injected when `blog_management` + `web_fetch` are active.
Style Guide (BlogToolBuilder::STYLE_GUIDE)
A condensed style guide is injected into the `create_blog_post` and `update_blog_post` tool descriptions, covering headings (H2+), callouts, feature boxes, sidebars, tables, Liquid tags, Liquid variables, internal link patterns (always use `{{ locale }}`), and the oEmbed embed workflow.
13. Embedding Infrastructure
Files: `app/concerns/models/embeddable.rb`, `app/subscribers/embedding/content_changed_handler.rb`, `app/workers/embedding_worker.rb`
Models::Embeddable Concern
Included by any model that needs semantic search. Provides:
- `content_for_embedding(content_type)`: override to define what text is embedded
- `embedding_content_changed?`: override to define which attribute changes trigger re-embedding
- `queue_embedding_generation`: publishes `Events::ContentEmbeddingRequired` to the event store
The concern uses the Rails Event Store pub/sub system rather than direct worker enqueue, ensuring every embedding regeneration trigger has a permanent, auditable record in event_store_events.
Event Flow
```text
Model#save
  → after_commit
  → queue_embedding_generation
  → event_store.publish(Events::ContentEmbeddingRequired, data: {type, id})
  → Embedding::ContentChangedHandler (AsyncJob, ai_embeddings queue)
  → EmbeddingWorker.perform_async(type, id)
  → OpenAI text-embedding-3-small API
  → ContentEmbedding.upsert(embeddable, vector)
```
ContentEmbedding Table (Partitioned)
The `content_embeddings` parent table is list-partitioned by `embeddable_type`. Each model that uses embeddings has its own partition created via pg_party’s `create_list_partition_of`. Per-partition indexes:
- UNIQUE on `(embeddable_id, content_type, locale)`
- HNSW (`vector_cosine_ops`) for cosine ANN search
pgvector HNSW indexes cannot be placed on partitioned parent tables; they must be on each partition individually.
Models using embeddings
| Model | Partition | Embedding content |
|---|---|---|
| `Article` / `Post` / `Faq` | `content_embeddings_articles` | Subject + solution body |
| `Video` | `content_embeddings_videos` | Title + description + transcript |
| `Showcase` | `content_embeddings_showcases` | Title + description |
| `AssistantBrainEntry` | `content_embeddings_assistant_brain_entries` | Title + rule text |
Data Models
AssistantConversation
Table: `assistant_conversations`
```text
id, user_id, title, messages (jsonb, legacy), metadata (jsonb), llm_model_id,
processing_by_id, processing_since, timestamps
```
- `acts_as_chat` from RubyLLM: provides `ask()`, `with_model()`, `with_tools()`, `with_instructions()`, `with_thinking()`
- Messages auto-persist to `assistant_messages` table
- `metadata` stores via `jsonb_accessor`: token totals, cost, queries, compaction summary, tool services
- `with_instructions` override fixes RubyLLM v1.11.0 bug where `Content::Raw` objects were serialized as `#<RubyLLM::Content::Raw:0x...>` instead of their content
- `to_llm` override: eager-loads to prevent N+1 queries, integrates context compaction, filters empty/orphaned/duplicate messages
- Processing lock via `processing_by_id` + `processing_since` columns (5-minute staleness threshold)
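The staleness rule on the processing lock can be sketched as a pure predicate. This is a simplified, hypothetical stand-in for `conversation.processing?`: the keyword arguments and the injectable `now` are illustrative, and the real check reads the two columns from the record.

```ruby
STALE_AFTER_SECONDS = 5 * 60 # 5-minute staleness threshold from the text

# Hypothetical predicate: a lock counts as held only while it is fresh, so a
# crashed worker cannot block the conversation indefinitely.
def processing?(processing_by_id:, processing_since:, now: Time.now)
  return false if processing_by_id.nil? || processing_since.nil?

  (now - processing_since) < STALE_AFTER_SECONDS
end
```

A lock taken one minute ago blocks a new job; one taken more than five minutes ago is treated as abandoned and can be re-acquired.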
AssistantMessage
Table: `assistant_messages`
```text
id, assistant_conversation_id, role, content, content_raw (json),
input_tokens, output_tokens, cached_tokens, cache_creation_tokens,
thinking_text, thinking_tokens, thinking_signature,
llm_model_id, assistant_tool_call_id, timestamps
```
AssistantToolCall
Table: `assistant_tool_calls`
```text
id, assistant_message_id, tool_call_id, name, arguments (jsonb), timestamps
```
AssistantConversationShare
Table: `assistant_conversation_shares`
```text
id, assistant_conversation_id, shared_with_type, shared_with_id, access_level, timestamps
```
Access levels: `viewer` (read-only), `collaborator` (can send messages).
AssistantBrainEntry
Table: `assistant_brain_entries`
```text
id, category, title, rule (text),
scope ('global'|'user'), user_id,
applies_to_services (varchar[]),
status ('active'|'pending'|'inactive'),
source ('manual'|'auto_learned'),
suggested_by_id, approved_by_id,
created_at, updated_at
```
- Includes `Models::Auditable` (PaperTrail): all changes tracked in `record_versions` table
- Includes `Models::Embeddable`: rule text vectorised to `content_embeddings_assistant_brain_entries` partition
- Scopes: `.active`, `.pending`, `.global`, `.for_user(user_id)`, `.for_services(service_keys)`
ContentEmbedding
Table: `content_embeddings` (list-partitioned by `embeddable_type`)
```text
id, embeddable_type, embeddable_id, content_type, locale, embedding (vector(1536)), timestamps
```
Used by: `AssistantBrainEntry`, `Article`/`Post`/`Faq`, `Video`, `Showcase`. Cosine ANN search via `has_neighbors` from the neighbor gem.
UI Layer
Stimulus Controller — assistant_chat_controller.js
| Target | Description |
|---|---|
| `messages` | Chat message container (auto-scroll target) |
| `input` | Text input field |
| `submitButton` | Send button (loading state management) |
| `placeholder` | Empty-state placeholder |
| `modelSelect` | Model dropdown selector |
Features:
- MutationObserver: Watches `messages` target; auto-scrolls on new content
- Completion detection: Scans for `[data-chat-complete="true"]` marker to re-enable input
- Double-submit prevention: `isProcessing` flag blocks re-submission while worker runs
Turbo Stream Flow
- User submits → Turbo handles `POST /assistant/ask`
- Controller responds with `ask.turbo_stream.erb` → appends user message bubble + processing indicator
- Worker broadcasts via `Turbo::StreamsChannel`:
  - `broadcast_replace_to` → live preview with gray monospace text + blinking caret
  - Throttled to ~8fps for smooth streaming
- On completion: Worker broadcasts final rendered HTML replacing the preview
- Stimulus detects `[data-chat-complete]` → re-enables input
Stream name: assistant_chat:{conversation_id}
Configuration
Model IDs — config/initializers/ai_model_constants.rb
`AiModelConstants` is the canonical registry for every model ID. `LlmDefaults` is a thin backward-compatibility shim that delegates to it:
```ruby
module LlmDefaults
  DEFAULT_SONNET_MODEL = AiModelConstants.id(:anthropic_sonnet)  # claude-sonnet-4-6
  DEFAULT_OPUS_MODEL   = AiModelConstants.id(:anthropic_opus)    # claude-opus-4-8
  DEFAULT_HAIKU_MODEL  = AiModelConstants.id(:anthropic_haiku)   # claude-haiku-4-5-20251001
end
```
All Anthropic model references in `ChatService::MODELS` use these constants. Never hardcode model IDs elsewhere. `WeeklyLlmModelSyncWorker` syncs the live provider registry weekly and emails a report when a pinned default has fallen behind.
RubyLLM Provider Keys
| Provider | Env Variable |
|---|---|
| Anthropic | `ANTHROPIC_API_KEY` |
| OpenAI | `OPENAI_API_KEY` |
| Gemini | `GEMINI_API_KEY` |
Database Services
| Service Key | Label | Connection Class |
|---|---|---|
| `postgres_production` | App DB | `ActiveRecord::Base` |
| `postgres_versions` | Versions DB | `VersionsDb::Base` |
File Index
| File | Purpose |
|---|---|
| `app/agents/assistant/sunny_agent.rb` | RubyLLM Agent: factory for `AssistantConversation` records |
| `app/prompts/assistant/sunny_agent/instructions.txt.erb` | Base system prompt template (identity, date macros, guidelines) |
| `app/controllers/crm/assistant_chat_controller.rb` | HTTP entry point, user context, tool service selection |
| `app/controllers/crm/assistant_brain_controller.rb` | CRM CRUD for brain entries; access-gated via CanCan |
| `app/workers/assistant_chat_worker.rb` | Sidekiq job: LLM loop, streaming, response formatting |
| `app/services/assistant/chat_service.rb` | Model selection, prompt assembly (incl. `brain_prompt`), thinking, tool registration |
| `app/services/assistant/chat_tool_builder.rb` | Builds `RubyLLM::Tool` instances (content, DB, blog, brain tools) |
| `app/services/assistant/blog_tool_builder.rb` | Blog create/update/get/embed tools + tagging helpers |
| `app/services/assistant/sql_broker.rb` | SQL execution: read-only, access control, redaction, audit |
| `app/services/assistant/data_policy.rb` | Object + column access rules, PII redaction |
| `app/services/assistant/data_domain_policy.rb` | Role → domain → allowed objects resolution |
| `app/services/assistant/comment_manifest.rb` | YAML manifest reader: domains, restricted columns, comments |
| `app/services/assistant/tool_loop_guard.rb` | Prevents repetitive tool call loops |
| `app/services/assistant/context_compactor.rb` | Sliding-window conversation summarisation |
| `app/services/assistant/cost_calculator.rb` | Per-turn and per-conversation cost from token counts |
| `app/services/assistant/response_formatter.rb` | Markdown → HTML with tables, SQL collapse, entity links |
| `app/models/assistant_conversation.rb` | Conversation record (`acts_as_chat`) with token tracking |
| `app/models/assistant_message.rb` | Per-message record (`acts_as_message`) with token tracking |
| `app/models/assistant_tool_call.rb` | Tool invocation record (`acts_as_tool_call`) |
| `app/models/assistant_conversation_share.rb` | Conversation sharing (viewer / collaborator) |
| `app/models/assistant_brain_entry.rb` | Learned rules model; embeddable + auditable |
| `app/models/content_embedding.rb` | Polymorphic vector store (`has_neighbors`); partitioned by type |
| `app/concerns/models/embeddable.rb` | Concern: `queue_embedding_generation` via Rails Event Store |
| `app/events/events/content_embedding_required.rb` | Pub/sub event: signals an embedding regeneration is needed |
| `app/subscribers/embedding/content_changed_handler.rb` | Async handler: enqueues `EmbeddingWorker` on `ai_embeddings` queue |
| `app/workers/embedding_worker.rb` | Calls OpenAI `text-embedding-3-small`; upserts to `content_embeddings` |
| `config/initializers/llm_defaults.rb` | Anthropic model ID constants (backward-compatibility shim that delegates to `AiModelConstants`) |
| `config/analytics/data_domains.yml` | Domain descriptions + role mappings |
| `db/comments/*.yml` | Per-table manifests with domain, comments, restricted flags |
| `db/data/brain_entry_embeddings_seed.json` | Pre-computed embeddings for initial brain entries (avoids API costs on deploy) |
Access Control
Who can use Sunny
All CRM accounts. Tool services available vary by role:
| Tool service | Non-admin | Admin / Manager |
|---|---|---|
| `content` | Yes | Yes |
| `postgres_production` | Domain-restricted | Full |
| `postgres_versions` (audit DB) | No | Yes |
| `blog_management` | No | Yes |
Who can manage the Brain
The `can :manage, AssistantBrainEntry` CanCan ability is granted to:
- Any account where `is_manager?` returns `true` (inside the shared manager ability block in `ability.rb`)
- Any account with the `sunny_administrator` role (can be assigned to non-managers, e.g. content specialists)
All CRM views (index.html.erb, histories.html.erb) use can?(:manage, AssistantBrainEntry) to conditionally render the “Sunny Brain” button. The controller uses the same ability check via require_sunny_admin!.
Testing
| Test File | Coverage |
|---|---|
| `test/agents/assistant/sunny_agent_create_test.rb` | `create!` returns persisted conversation; no llm_model; no system messages at factory time; title defaults and custom title |
| `test/agents/assistant/sunny_agent_find_test.rb` | `find` returns existing conversation without mutating it; no llm_model; no system messages |
| `test/integration/sunny_blog_editor_end_to_end_test.rb` | Julia-style blog tool flows (conv 1565); assertion helpers in `test/support/sunny_blog_editor_julia_flow_helpers.rb` |
| `test/models/assistant_conversation_test.rb` | `to_llm` nil guard, `track_query!`, processing lock, scopes |
| `test/models/assistant_brain_entry_test.rb` | Scopes, status transitions, `for_services` filtering |
| `test/services/assistant/chat_service_test.rb` | `MODELS` registry, `auto_select_model`, `estimate_tokens`, base prompt, brain prompt injection |
| `test/workers/assistant_chat_worker_test.rb` | `broadcast_error` mappings, graceful error handling |
| `test/initializers/llm_defaults_test.rb` | Model IDs valid, Anthropic pattern, no future dates, RubyLLM resolution |
Run AI assistant tests:
```shell
mise exec -- bin/rails test test/agents/ \
  test/models/assistant_conversation_test.rb \
  test/models/assistant_brain_entry_test.rb \
  test/services/assistant/chat_service_test.rb \
  test/workers/assistant_chat_worker_test.rb \
  test/initializers/llm_defaults_test.rb
```
Migration History (Sunny features)
| Migration | Description |
|---|---|
| `20260227100000_create_assistant_brain_entries` | Creates `assistant_brain_entries` table |
| `20260227110000_add_scope_to_brain_entries` | Adds `scope`, `user_id`, `applies_to_services` columns |
| `20260227140000_add_sunny_administrator_role` | Creates `sunny_administrator` role in `roles` table |
| `20260228100001_seed_assistant_brain_entries` | Seeds 14 initial brain entries (stub class pattern) |
| `20260228160000_add_brain_entry_embeddings` | Creates `content_embeddings` partitioned table + `AssistantBrainEntry` partition; seeds pre-computed vectors |
| `20260228170000_seed_brain_entry_embeddings` | Inserts pre-computed embedding vectors for the 14 initial entries |
| `20260228190000_add_blog_tagging_brain_entry` | Seeds two-tier tagging brain entry; queues embedding via `after_commit` |
Migration pattern: Data migrations that create `AssistantBrainEntry` records during a deploy sequence must use a local stub class (`class BrainEntry < ApplicationRecord; self.table_name = 'assistant_brain_entries'; end`) if the migration runs before all schema migrations are applied. This avoids `NoMethodError` from model validations referencing not-yet-existing columns. The final tagging entry (`20260228190000`) intentionally uses the real model so the `after_commit` embedding callback fires.