AI Agent Architecture

Status: Production Ready
Last Updated: February 2026

Overview

The AI Agent (Sunny) is an internal assistant for WarmlyYours employees, accessible at /assistant in the CRM. It answers business questions using real-time database queries, content search, blog management, and multi-model LLM reasoning. The system is built on RubyLLM's acts_as_chat + Agent patterns, with role-based data access, a persistent self-learning brain, tool-loop guardrails, audit logging, and streaming Turbo Stream responses.

Key capabilities:

  • Natural-language SQL: Users ask questions; the LLM writes, executes, and interprets read-only SQL
  • Content search: Semantic search across products, FAQs, blog posts, videos, showcases, and reviews
  • Blog management: Create, update, and enrich blog posts with linked assets (images, videos, FAQs, product cards), proper tagging, and SEO metadata
  • Multi-model support: Claude, GPT-4.1, and Gemini models are available; auto mode picks a model based on query complexity
  • Extended thinking: Activates reasoning scratchpad for analytical queries (Anthropic/Gemini)
  • Data domain access control: CanCanCan roles map to data domains; each database table declares which domains it belongs to
  • Streaming UI: Real-time token-by-token streaming with live preview, tool status, and formatted output
  • Conversation sharing: Owner can share conversations with colleagues at viewer or collaborator level
  • Context compaction: Sliding-window summary for long conversations to stay within token limits
  • Sunny Brain: Persistent, editable knowledge base of learned rules injected into every system prompt; self-updates via semantic embeddings

Architecture Diagram

 Browser (CRM)                    Rails Server                      External
┌──────────────────┐   POST      ┌──────────────────────────────┐
│ Stimulus          │──/ask──────►│ Crm::AssistantChatController  │
│ assistant_chat    │             │ ├─ build_user_context()       │
│ controller        │             │ ├─ sanitize_tool_services()   │
│                   │             │ └─ AssistantChatWorker        │
│ MutationObserver  │             │   .perform_async()            │
│ auto-scroll       │             └───────────┬──────────────────┘
│ completion detect │                         │ Sidekiq
└─────▲────────────┘                         ▼
      │                           ┌──────────────────────────────┐
      │ Turbo Streams             │ AssistantChatWorker           │
      │ (ActionCable)             │ ├─ ChatService.new(...)       │
      │                           │ │  ├─ auto_select_model()    │
      │                           │ │  ├─ configure_conversation()│
      │                           │ │  │  ├─ with_model()        │
      │                           │ │  │  ├─ with_instructions() │
      │                           │ │  │  ├─ with_tools()        │   ┌──────────┐
      │                           │ │  │  ├─ with_thinking()     │──►│ Anthropic │
      │                           │ │  │  └─ on_tool_call()      │   │ OpenAI   │
      │                           │ │  └─ conversation.ask()     │   │ Gemini   │
      │                           │ │     ├─ LLM streaming       │   └──────────┘
      │                           │ │     ├─ tool execution ─────┤
      │                           │ │     └─ auto-persist msgs   │
      │                           │ ├─ broadcast_chunk()          │
      │◄──────────────────────────│ ├─ broadcast_complete()       │
      │                           │ │  └─ ResponseFormatter       │
      │                           │ └─ finalize_response()        │
      │                           └──────────────────────────────┘
      │
      │  Tool Execution Layer
      │  ┌────────────────────────────────────────────┐
      │  │ Assistant::ChatToolBuilder                  │
      │  │ ├─ Content tools (semantic_search, FAQs)   │──► pgvector / OpenAI Embeddings
      │  │ └─ PostgreSQL tools                         │
      │  │    ├─ describe_available_data               │──► CommentManifest (db/comments/*.yml)
      │  │    ├─ execute_sql ──► SqlBroker             │──► PostgreSQL (read-only)
      │  │    ├─ list_schemas                          │
      │  │    ├─ list_objects                          │
      │  │    ├─ get_object_details                    │
      │  │    └─ explain_query                         │
      │  └────────────────────────────────────────────┘
      │
      │  Security Layer
      │  ┌────────────────────────────────────────────┐
      │  │ Assistant::DataDomainPolicy ◄── data_domains.yml│
      │  │ ├─ role → domain mapping                    │
      │  │ └─ CommentManifest.objects_for_domains()    │
      │  │                                             │
      │  │ Assistant::DataPolicy                       │
      │  │ ├─ ALWAYS_BLOCKED_OBJECTS                   │
      │  │ ├─ object_allowed?()                        │
      │  │ └─ sensitive_columns_for_query()            │
      │  │                                             │
      │  │ Assistant::ToolLoopGuard                    │
      │  │ ├─ MAX_IDENTICAL_CALLS = 2                  │
      │  │ └─ MAX_CONSECUTIVE_SQL_FAILURES = 3         │
      │  │                                             │
      │  │ AssistantSqlAuditLog (every SQL execution)  │
      │  └────────────────────────────────────────────┘

Components

1. Agent — Assistant::SunnyAgent

File: app/agents/assistant/sunny_agent.rb

RubyLLM::Agent subclass that serves as the canonical factory and configuration baseline for Sunny conversations. Follows the RubyLLM Agent pattern: chat_model wires create!/find to AssistantConversation, and instructions { nil } disables automatic discovery so ChatService can apply per-request configuration.

# Create a new conversation
conversation = Assistant::SunnyAgent.create!(user: current_user, messages: [])

# Load an existing conversation
conversation = Assistant::SunnyAgent.find(conversation_id)

All dynamic configuration (model, temperature, system prompt, tools, thinking, Anthropic prompt caching) is applied per-request by ChatService#configure_conversation. The agent is intentionally lean — it's a factory, not an orchestrator.

2. Controller — Crm::AssistantChatController

File: app/controllers/crm/assistant_chat_controller.rb

The HTTP entry point. Handles conversation CRUD and delegates LLM processing to a Sidekiq worker.

Action   Method  Description
index    GET     Load most recent conversation; list sidebar conversations
show     GET     Load a specific conversation (owned or shared)
create   POST    Create a new conversation via SunnyAgent.create!
ask      POST    Accept user message, enqueue AssistantChatWorker
cancel   POST    Set Redis cancellation flag; worker checks between chunks
destroy  DELETE  Delete a conversation

Key responsibilities:

  • User context assembly: build_user_context() serializes identity (name, party_id, department, job_title), role flags (is_admin, is_manager), and resolved analytics_domains into a Hash passed to the worker.
  • Tool service selection: available_chat_services() limits non-admins to content + postgres_production. Admins also get postgres_versions (audit trail DB).
  • Processing lock: Checks conversation.processing? before enqueuing; blocks if a job is already running.
  • Access control: Viewers (shared conversations) cannot submit new messages.
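
The tool service selection rule above can be sketched in plain Ruby. This is a minimal illustration, not the controller's actual code: the constant names and the keyword argument are assumptions, and only the three service keys named in this section are modeled.

```ruby
# Hypothetical sketch of tool-service sanitization; constant names are illustrative.
BASE_SERVICES  = %w[content postgres_production].freeze
ADMIN_SERVICES = (BASE_SERVICES + %w[postgres_versions]).freeze

# Services a caller may use, based on the admin flag from the user context.
def available_chat_services(is_admin:)
  is_admin ? ADMIN_SERVICES : BASE_SERVICES
end

# Drops any requested service the caller is not entitled to (order of the request is kept).
def sanitize_tool_services(requested, is_admin:)
  Array(requested).map(&:to_s) & available_chat_services(is_admin: is_admin)
end
```

Intersecting the request with an allowlist means an unknown or forged service key is dropped rather than raising an error.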

3. Worker — AssistantChatWorker

File: app/workers/assistant_chat_worker.rb

Sidekiq job that runs the full LLM interaction loop in the background.

Lifecycle:

  1. Acquire processing lock (acquire_processing_lock!)
  2. Load conversation, instantiate Assistant::ChatService
  3. Call service.call { |chunk| broadcast_chunk(chunk) }
  4. LLM streams response tokens → worker broadcasts live preview via Turbo::StreamsChannel
  5. On tool calls: LLM pauses, tool executes, result fed back, LLM continues
  6. On completion: ResponseFormatter renders final markdown → HTML; broadcast replaces preview
  7. finalize_response(): track metrics, sync token totals, dual-write legacy JSONB
  8. Release processing lock

Streaming: The worker throttles preview updates to ~8fps (0.12s interval) for smooth UX.
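
A throttle of this kind might look like the sketch below. The class name and the injectable clock are assumptions made for illustration; only the 0.12s interval comes from this document.

```ruby
# Emits at most one preview per `interval` seconds; the clock is injectable for testing.
class PreviewThrottle
  def initialize(interval: 0.12, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @interval = interval
    @clock = clock
    @last_emit = nil
  end

  # Yields (and returns true) only when enough time has passed since the last emit.
  def call
    now = @clock.call
    return false if @last_emit && (now - @last_emit) < @interval

    @last_emit = now
    yield if block_given?
    true
  end
end
```

Chunks that arrive inside the interval are skipped for preview purposes only; the final rendered response replaces the preview on completion, so no text is lost.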

Cancellation: cancel action sets cancelled:{jid} in Redis (TTL 300s). Worker checks cancelled? between chunks and stops streaming.

Error handling: broadcast_error maps known error patterns (rate limits, timeouts, invalid model IDs, JSON parse failures) to user-friendly messages.

4. Chat Service — Assistant::ChatService

File: app/services/assistant/chat_service.rb

Orchestrates a single chat turn: model selection, system prompt assembly, tool registration, thinking activation, and streaming execution.

Model Selection

MODELS = {
  'claude-haiku'  => { id: LlmDefaults::DEFAULT_HAIKU_MODEL,  provider: :anthropic, cost: :low,    supports_thinking: false },
  'claude-sonnet' => { id: LlmDefaults::DEFAULT_SONNET_MODEL, provider: :anthropic, cost: :medium, supports_thinking: true,  thinking_effort_default: :medium },
  'claude-opus'   => { id: LlmDefaults::DEFAULT_OPUS_MODEL,   provider: :anthropic, cost: :high,   supports_thinking: true,  thinking_effort_default: :high },
  'gpt-4.1'       => { id: 'gpt-4.1',                        provider: :openai,    cost: :medium, supports_thinking: false },
  'gpt-4.1-mini'  => { id: 'gpt-4.1-mini',                   provider: :openai,    cost: :low,    supports_thinking: false },
  'gemini-flash'  => { id: 'gemini-2.5-flash',               provider: :gemini,    cost: :low,    supports_thinking: true,  thinking_effort_default: :low },
  'gemini-pro'    => { id: 'gemini-2.5-pro',                 provider: :gemini,    cost: :medium, supports_thinking: true,  thinking_effort_default: :medium },
}

Auto-selection (model: 'auto') uses regex pattern matching + token estimation:

  • Simple lookups (< 50 tokens, "show me", "list", "count") → claude-haiku
  • Complex/comparison queries ("trend", "correlate", "why", > 150 tokens) → claude-sonnet
  • Default → claude-sonnet
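
The routing rules above could be expressed roughly as follows. This is a sketch under stated assumptions: the regexes, the 4-characters-per-token estimate, and the choice to test the complex rule before the simple one are illustrative, not taken from ChatService.

```ruby
SIMPLE_PATTERN  = /\b(?:show me|list|count)\b/i
COMPLEX_PATTERN = /\b(?:trend|correlat|why)/i

# Rough token estimate: about 4 characters per token (an assumption, not a real tokenizer).
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Returns a key of the MODELS registry for the given query.
def auto_select_model(query)
  tokens = estimate_tokens(query)
  return 'claude-sonnet' if tokens > 150 || query.match?(COMPLEX_PATTERN)
  return 'claude-haiku'  if tokens < 50 && query.match?(SIMPLE_PATTERN)

  'claude-sonnet' # default when neither rule matches
end
```

Checking the complex rule first means a short query such as "show me the sales trend" is still routed to the stronger model.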

Anthropic model IDs are referenced through the LlmDefaults constants (config/initializers/llm_defaults.rb), which delegate to AiModelConstants, so each ID is defined in exactly one place.

System Prompt

Assembled from layers, joined with "\n\n":

  1. Base template (app/prompts/assistant/sunny_agent/instructions.txt.erb): Identity, date macros, formatting guidelines, source citation rules
  2. user_context_prompt: User identity (## CURRENT USER), party_id mappings for "my" queries, data access domains
  3. schema_hint_for_role: Instructs the LLM to call describe_available_data; power users also get list_schemas/get_object_details
  4. tools_system_prompt: Tool names, tool-mode rules, and service-specific rules (content, blog, GA4, GSC, Ahrefs, etc.)
  5. crm_url_templates_prompt: CRM URL patterns for linking records by ID
  6. brain_prompt: Hybrid-retrieved AssistantBrainEntry rules injected under ## LEARNED RULES; contextually filtered to entries whose applies_to_services overlap with the active tool services
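
As a minimal sketch of the layering, the helper below joins prompt layers in order. The method name is hypothetical, and skipping blank layers is an assumption about how optional layers (for example an empty brain prompt) are handled.

```ruby
# Joins non-blank prompt layers with a blank line, preserving the given order.
def assemble_system_prompt(layers)
  layers.map { |layer| layer.to_s.strip }.reject(&:empty?).join("\n\n")
end

# Example with placeholder layer text (not the real prompt content).
prompt = assemble_system_prompt([
  'You are Sunny.',
  "## CURRENT USER\nName: Ada",
  nil, # a layer that produced nothing for this request
  "## LEARNED RULES\n- Always cite sources."
])
```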

For Anthropic models, the system prompt is wrapped with cache: true via RubyLLM::Providers::Anthropic::Content. This enables up to 90% input token savings on subsequent turns.

Extended Thinking

When the model supports it and the query matches THINKING_QUERY_PATTERNS:

  • Activated via conversation.with_thinking(effort:, budget:)
  • Budget tokens: 4K (low) / 8K (medium) / 16K (high, Opus only)
  • Anthropic requires temperature: 1 when thinking is active
  • Thinking traces are persisted to assistant_messages.thinking_text
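
The relationship between effort, budget, max_tokens, and temperature can be sketched as a pure function. The method and constant names are hypothetical, and the exact budget values (4_000 rather than, say, 4_096) are an assumption about what "4K" means here; the 16_000-token headroom and the temperatures come from the configuration shown in this document.

```ruby
THINKING_BUDGETS    = { low: 4_000, medium: 8_000, high: 16_000 }.freeze
RESPONSE_HEADROOM   = 16_000 # tokens reserved for the visible answer on top of the budget
DEFAULT_TEMPERATURE = 0.3

# Derives generation settings for one turn; effort is nil when thinking is not requested.
def generation_settings(supports_thinking:, effort: nil)
  budget = supports_thinking && effort ? THINKING_BUDGETS.fetch(effort) : nil
  {
    thinking_budget: budget,
    max_tokens: budget && budget + RESPONSE_HEADROOM,
    temperature: budget ? 1 : DEFAULT_TEMPERATURE
  }
end
```

Raising max_tokens alongside the budget keeps a long reasoning trace from consuming the room needed for the final answer.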

Conversation Configuration (configure_conversation)

conversation.with_model(model_id, provider:, assume_exists: true)
conversation.with_instructions(cacheable_system_prompt, replace: true)
conversation.with_thinking(effort:, budget:)        # if applicable
conversation.with_params(max_tokens: budget + 16_000) # if thinking with budget
conversation.with_temperature(thinking? ? 1 : 0.3)
conversation.with_tools(*tools)                       # from ChatToolBuilder
conversation.on_new_message { emit_status(:thinking) }
conversation.on_tool_call   { |tc| emit_status(...) }
conversation.on_tool_result { ... }
conversation.on_end_message { emit_status(:composing) }

5. Tool Builder — Assistant::ChatToolBuilder

File: app/services/assistant/chat_tool_builder.rb

Dynamically builds RubyLLM::Tool subclasses for the chat. No HTTP hop to the MCP server — tools execute in-process.

Tool registration is keyed on tool_services — an array passed from the controller that determines which tool groups are active for this conversation. Non-admins are limited to content + postgres_production; admins also get postgres_versions.

Content Tools (content)

Tool Description
semantic_search Semantic vector search across all content types
find_faqs FAQ search with product line filtering
find_call_recordings Permission-gated call transcript search

PostgreSQL Tools (postgres_production, postgres_versions)

Built per database service key. Access level depends on role:

Tool                     Admin/Manager  Employee
describe_available_data  Yes            Yes
execute_sql              Full           Domain-restricted
list_schemas             Yes            No
list_objects             Yes            No
get_object_details       Yes            No
explain_query            Yes            No

Blog Management Tools (blog_management)

Tool Description
create_blog_post Create a draft post (body, SEO, tags, preview image)
update_blog_post Update fields + atomically replace tags; creates a revision
get_blog_post Read all editable fields, current tags, revision count
insert_image oEmbed rendered HTML for an image embed
insert_video oEmbed rendered HTML for a video embed (Cloudflare or YouTube)
insert_faqs oEmbed rendered HTML for an FAQ section
insert_product oEmbed rendered HTML for a product card (Liquid tag)

Brain Tool (all services)

Tool Description
propose_brain_entry Propose a new learned rule; creates a pending brain entry

6. SQL Broker — Assistant::SqlBroker

File: app/services/assistant/sql_broker.rb

Centralized execution layer for all AI-generated SQL.

Enforcement chain:

  1. Read-only check: Only SELECT, SHOW, WITH, EXPLAIN, SET are allowed
  2. SQL normalization: Strip comments, collapse whitespace
  3. Object extraction: PgQuery.parse(sql).tables via AST
  4. Object-level access: DataPolicy.object_allowed?() checks against ALWAYS_BLOCKED_OBJECTS + domain-resolved allowed_objects
  5. Query execution: Read-only transaction with statement timeout (8s)
  6. Column-level redaction: PII columns redacted from results
  7. Row cap: Truncates to 50 rows; adds truncation notice
  8. Audit logging: Every execution logged to assistant_sql_audit_logs
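
Two of the steps above, the read-only check and the row cap, can be illustrated without a database. This is a simplified sketch, not SqlBroker's code: it uses regexes instead of the PgQuery AST, and the helper names and notice wording are assumptions.

```ruby
ALLOWED_LEADING_KEYWORDS = %w[SELECT SHOW WITH EXPLAIN SET].freeze
ROW_CAP = 50

# Naive normalization: strips /* */ and -- comments, then collapses whitespace.
# It does not understand string literals, so it is for illustration only.
def normalize_sql(sql)
  sql.gsub(%r{/\*.*?\*/}m, ' ').gsub(/--[^\n]*/, ' ').gsub(/\s+/, ' ').strip
end

# True when the first keyword of the normalized statement is on the allowlist.
def read_only?(sql)
  keyword = normalize_sql(sql)[/\A[A-Za-z]+/]
  !keyword.nil? && ALLOWED_LEADING_KEYWORDS.include?(keyword.upcase)
end

# Truncates a result set and reports how many rows were dropped from view.
def cap_rows(rows, limit: ROW_CAP)
  return { rows: rows, notice: nil } if rows.size <= limit

  { rows: rows.first(limit), notice: "Showing first #{limit} of #{rows.size} rows." }
end
```

A keyword allowlist alone is not sufficient, since a WITH clause can wrap a data-modifying statement in PostgreSQL. That is why the chain also executes inside a read-only transaction.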

7. Data Domain Access Control

Same domain system as described in the Analytics section:

  • File: config/analytics/data_domains.yml
  • 5 business domains: sales, financial, workforce, operations, marketing
  • Role mappings: sales_rep, sales_manager, accounting_manager, marketing_manager, etc.
  • Database tables/views declare their domain(s) in db/comments/*.yml manifests
  • Admins get nil (unrestricted); DataDomainPolicy resolves roles → domains → allowed objects
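
The roles → domains → allowed objects resolution can be sketched with plain hashes. All mappings below are hypothetical sample data; the real ones live in data_domains.yml and the db/comments/*.yml manifests, and the method names are illustrative.

```ruby
# Hypothetical role and object mappings, for illustration only.
ROLE_DOMAINS = {
  'sales_rep'          => %w[sales],
  'sales_manager'      => %w[sales operations],
  'accounting_manager' => %w[financial]
}.freeze

OBJECT_DOMAINS = {
  'orders'   => %w[sales operations],
  'invoices' => %w[financial],
  'leads'    => %w[sales marketing]
}.freeze

# nil means unrestricted (admins); otherwise the union of domains across roles.
def domains_for(roles, is_admin: false)
  return nil if is_admin

  roles.flat_map { |role| ROLE_DOMAINS.fetch(role, []) }.uniq
end

# An object is allowed when it shares at least one domain with the user.
def allowed_objects(domains)
  return nil if domains.nil?

  OBJECT_DOMAINS.select { |_name, object_domains| (object_domains & domains).any? }.keys.sort
end
```

Using nil for "unrestricted" keeps it distinct from an empty array, which means "no domains, no objects".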

8. Context Compactor — Assistant::ContextCompactor

File: app/services/assistant/context_compactor.rb

Sliding-window context management for long conversations:

  • When token count exceeds the threshold, ensure_context_summary! generates (or returns a cached) summary of older messages
  • Summary injected as synthetic user + assistant messages in to_llm (after system messages)
  • Only recent messages (after compaction_through_message_id) are sent verbatim
  • Summary and cutoff ID are cached in AssistantConversation's JSONB metadata
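
The message list sent to the LLM after compaction might be built as in the sketch below. The Struct, the method name, and the wording of the synthetic messages are assumptions; only the ordering (system messages, then the summary pair, then messages after the cutoff) follows the description above.

```ruby
Message = Struct.new(:id, :role, :content)

# Builds the compacted message list: system messages, a synthetic summary exchange,
# then only the messages newer than the compaction cutoff.
def compacted_context(messages, summary:, cutoff_id:)
  return messages if summary.nil? || cutoff_id.nil?

  system, rest = messages.partition { |m| m.role == :system }
  recent = rest.select { |m| m.id > cutoff_id }
  synthetic = [
    Message.new(nil, :user, "Summary of the earlier conversation:\n#{summary}"),
    Message.new(nil, :assistant, 'Understood. Continuing from that summary.')
  ]
  system + synthetic + recent
end
```

Presenting the summary as a user/assistant pair keeps the alternating role order that chat APIs generally expect.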

9. Cost Calculator — Assistant::CostCalculator

File: app/services/assistant/cost_calculator.rb

Computes per-turn and per-conversation cost from token counts and model pricing:

  • Looks up per-token price from ChatService::MODELS
  • Accounts for cached tokens (90% discount) and cache-creation tokens
  • Called by AssistantConversation#computed_total_cost and sync_token_totals!
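
A per-turn cost calculation along these lines is sketched below. The prices are hypothetical placeholders, the 1.25x cache-creation multiplier is an assumption, and the sketch assumes input_tokens excludes cached and cache-creation tokens.

```ruby
# Hypothetical USD prices per million tokens; not real pricing data.
PRICES = { input: 3.0, output: 15.0 }.freeze
CACHE_READ_MULTIPLIER  = 0.10 # cached tokens billed at a 90% discount
CACHE_WRITE_MULTIPLIER = 1.25 # assumed surcharge for cache-creation tokens

# Returns the cost of one turn in USD.
def turn_cost(input_tokens:, output_tokens:, cached_tokens: 0, cache_creation_tokens: 0, prices: PRICES)
  input_units = input_tokens +
                cached_tokens * CACHE_READ_MULTIPLIER +
                cache_creation_tokens * CACHE_WRITE_MULTIPLIER
  (input_units * prices[:input] + output_tokens * prices[:output]) / 1_000_000.0
end
```

With these placeholder prices, a turn with 1,000 fresh input tokens, 10,000 cached tokens, and 500 output tokens costs (1,000 + 1,000) × 3 + 500 × 15 = 13,500 micro-dollars, or about $0.0135.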

10. Response Formatter — Assistant::ResponseFormatter

File: app/services/assistant/response_formatter.rb

Transformation     Description
Markdown → HTML    Kramdown with GFM + Rouge syntax highlighting
HTML sanitization  Sanitize gem strips XSS vectors
SQL collapse       SQL blocks wrapped in Bootstrap collapsible accordions
Table enhancement  Bootstrap tables with cell formatting and export buttons (Copy, CSV, Excel)
Entity linking     Order numbers, customer IDs, SKUs linked to CRM pages
Source panel       Source citations styled distinctively

11. Sunny Brain — AssistantBrainEntry

Files: app/models/assistant_brain_entry.rb, app/controllers/crm/assistant_brain_controller.rb

The Brain is a persistent, CRM-editable knowledge base of rules and facts that are injected into Sunny's system prompt on every conversation. It replaces hard-coded domain rules in prompt files with database-backed entries that any manager can inspect, approve, or correct without a code deploy.

Entry Model

assistant_brain_entries
  id, category, title, rule (text),
  scope ('global'|'user'), user_id,
  applies_to_services (varchar[]),
  status ('active'|'pending'|'inactive'),
  source ('manual'|'auto_learned'),
  suggested_by_id, approved_by_id,
  created_at, updated_at
  • scope: 'global' — shared across all users; managed by managers / sunny_administrator role
  • scope: 'user' — personal preferences for one user; self-managed, never shown to others
  • applies_to_services — contextual filtering: an empty array means "always inject"; a populated array means "only inject when one of these tool services is active in the current conversation" (e.g. ['blog_management'] keeps blog rules out of analytics sessions)

Categories

url_rules, product_data, content_rules, analytics, schema_knowledge, general

System Prompt Injection

ChatService#brain_prompt is called during prompt assembly and appends a ## LEARNED RULES section. It uses hybrid retrieval:

  1. Universal entries (applies_to_services: []) — always injected
  2. Service-specific entries — when the total exceeds BRAIN_SEMANTIC_THRESHOLD (40) and a user message is present, pgvector cosine similarity selects the top BRAIN_SEMANTIC_TOP_K (20) most relevant entries from candidates; otherwise all are injected
  3. Entries are sorted by category then title before injection
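
The hybrid retrieval steps can be approximated in memory as follows. This is a sketch only: the real system ranks with pgvector, whereas this version computes cosine similarity in Ruby. The Struct, the method names, and the reading of "total" as universal plus candidate entries are assumptions.

```ruby
Entry = Struct.new(:category, :title, :services, :embedding)

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Universal entries are always kept; service-specific ones are narrowed to the
# top_k most similar only when the pool is large and a query embedding exists.
def select_brain_entries(entries, active_services:, query_embedding: nil, threshold: 40, top_k: 20)
  universal, scoped = entries.partition { |e| e.services.empty? }
  candidates = scoped.select { |e| (e.services & active_services).any? }

  if query_embedding && (universal.size + candidates.size) > threshold
    candidates = candidates.max_by(top_k) { |e| cosine_similarity(e.embedding, query_embedding) }
  end

  (universal + candidates).sort_by { |e| [e.category, e.title] }
end
```

Below the threshold, every matching entry is injected, so small knowledge bases behave deterministically and need no embedding call.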

Semantic Embeddings

Each AssistantBrainEntry includes Models::Embeddable. When title or rule changes, an Events::ContentEmbeddingRequired event is published to the Rails Event Store. Embedding::ContentChangedHandler (an async handler) enqueues an EmbeddingWorker job on the ai_embeddings queue. The worker calls OpenAI text-embedding-3-small and stores a 1536-dimension vector in the content_embeddings_assistant_brain_entries partition (a PostgreSQL list partition of content_embeddings keyed on embeddable_type = 'AssistantBrainEntry').

The partition has:

  • A UNIQUE index on (embeddable_id, content_type, locale)
  • An HNSW index with vector_cosine_ops for fast approximate nearest-neighbor search

Self-Learning via ProposeBrainEntryTool

Sunny can propose new rules after user corrections by calling the propose_brain_entry tool (registered in ChatToolBuilder). Proposed entries are created with status: 'pending'. Managers see a Pending Approval panel in the Brain CRM page and can approve or reject with a single click.

CRM Interface

GET /assistant_brain — lists:

  • Global active entries (grouped by category) — visible to all users with access
  • Pending entries awaiting approval — sunny_admin only
  • Inactive entries — sunny_admin only
  • Current user's personal entries

Access control uses CanCan:

# ability.rb — inside is_manager? block
can :manage, AssistantBrainEntry

# Plus standalone for non-managers with explicit role
# (`user` is assumed here to be the account passed to Ability#initialize)
can :manage, AssistantBrainEntry if user.has_role?('sunny_administrator', admin_check: false)

The sunny_administrator role (created in 20260227140000_add_sunny_administrator_role) can be granted to non-managers who need brain management access (e.g. content specialists).


12. Blog Management Tool Service

File: app/services/assistant/blog_tool_builder.rb

The blog_management tool service provides Sunny with 8 tools for creating and maintaining blog posts.

Tools

Tool Description
create_blog_post Create a draft post with HTML body, SEO fields, tags, preview image
update_blog_post Update any post field; creates a revision; replaces tags atomically
get_blog_post Read all editable fields + current tags, revisions, effective meta description
insert_image Fetch rendered oEmbed HTML for an image embed
insert_video Fetch rendered oEmbed HTML for a video embed
insert_faqs Fetch rendered oEmbed HTML for an FAQ section
insert_product Fetch rendered oEmbed HTML for a product card
propose_brain_entry Propose a new brain rule (available in all services)

Tagging System (Two-Tier)

Blog posts use two distinct tag types, both applied via the tags: parameter:

Tier 1 — Page-placement tags (public: false): internal tags that make a post appear in the content section of a specific landing page. Format: for-{page-path-parameterized}-page. Used by PagesHelper#page_posts, page_videos, page_showcases, etc. Never shown to visitors (filtered by BlogHelper#tags_with_links).

Examples: for-towel-warmer-page, for-towel-warmer-matte-black-page, for-floor-heating-bathroom-page

Tier 2 — Public navigation tags (public: true): drive the blog tag cloud at /posts/{tag}/tag. 22 tags in production: towel-warmers, installation, indoor-heating, snow-melting, design-trends, etc.

The tags: parameter in update_blog_post replaces the full tag set (clear + add). Always call get_blog_post first to read existing tags before updating. Only tags that already exist in the database are applied — unrecognised names are silently skipped (assign_tags never creates new tags).
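
The tag naming format and the replace semantics can be illustrated with two small helpers. Both are hypothetical sketches: the slug logic is a plain-Ruby stand-in for Rails parameterization, the shape of the page path is an assumption, and replace_tags only mirrors the described behavior of skipping unknown names.

```ruby
# Builds a page-placement tag name from a page path.
def page_placement_tag(page_path)
  slug = page_path.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  "for-#{slug}-page"
end

# Replace semantics: the new tag set is exactly the requested names that already exist.
def replace_tags(requested, known_tags)
  requested.map(&:to_s).uniq.select { |name| known_tags.include?(name) }
end

# To add one tag without losing the others, merge it with the existing set first.
known    = %w[installation towel-warmers for-towel-warmer-page]
existing = %w[installation]
updated  = replace_tags(existing + [page_placement_tag('/towel-warmer'), 'made-up-tag'], known)
```

Because unknown names are dropped silently, comparing the returned set with the requested one is a cheap way to notice a misspelled tag.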

Link Validation

Before calling update_blog_post or create_blog_post with any HTML containing <a> elements, Sunny must validate every link using fetch_url. Links that return 404, redirect loops, or error pages must be corrected before the post is saved. This is enforced by a Brain rule (Validate All Links Before Saving a Blog Post) injected when blog_management + web_fetch are active.

Style Guide (BlogToolBuilder::STYLE_GUIDE)

A condensed style guide is injected into the create_blog_post and update_blog_post tool descriptions, covering headings (H2+), callouts, feature boxes, sidebars, tables, Liquid tags, Liquid variables, internal link patterns (always use {{ locale }}), and the oEmbed embed workflow.


13. Embedding Infrastructure

Files: app/concerns/models/embeddable.rb, app/subscribers/embedding/content_changed_handler.rb, app/workers/embedding_worker.rb

Models::Embeddable Concern

Included by any model that needs semantic search. Provides:

  • content_for_embedding(content_type) — override to define what text is embedded
  • embedding_content_changed? — override to define which attribute changes trigger re-embedding
  • queue_embedding_generation — publishes Events::ContentEmbeddingRequired to the event store

The concern uses the Rails Event Store pub/sub system rather than direct worker enqueue, ensuring every embedding regeneration trigger has a permanent, auditable record in event_store_events.

Event Flow

Model#save → after_commit → queue_embedding_generation
  → event_store.publish(Events::ContentEmbeddingRequired, data: {type, id})
    → Embedding::ContentChangedHandler (AsyncJob, ai_embeddings queue)
      → EmbeddingWorker.perform_async(type, id)
        → OpenAI text-embedding-3-small API
          → ContentEmbedding.upsert(embeddable, vector)

ContentEmbedding Table (Partitioned)

The content_embeddings parent table is list-partitioned by embeddable_type. Each model that uses embeddings has its own partition created via pg_party's create_list_partition_of. Per-partition indexes:

  • UNIQUE on (embeddable_id, content_type, locale)
  • HNSW (vector_cosine_ops) for cosine ANN search

pgvector HNSW indexes cannot be placed on partitioned parent tables; they must be on each partition individually.

Models using embeddings

Model                 Partition                                   Embedding content
Article / Post / Faq  content_embeddings_articles                 Subject + solution body
Video                 content_embeddings_videos                   Title + description + transcript
Showcase              content_embeddings_showcases                Title + description
AssistantBrainEntry   content_embeddings_assistant_brain_entries  Title + rule text

Data Models

AssistantConversation

Table: assistant_conversations

id, user_id, title, messages (jsonb, legacy), metadata (jsonb), llm_model_id,
processing_by_id, processing_since, timestamps
  • acts_as_chat from RubyLLM — provides ask(), with_model(), with_tools(), with_instructions(), with_thinking()
  • Messages auto-persist to assistant_messages table
  • metadata stores via jsonb_accessor: token totals, cost, queries, compaction summary, tool services
  • with_instructions override fixes RubyLLM v1.11.0 bug where Content::Raw objects were serialized as #<RubyLLM::Content::Raw:0x...> instead of their content
  • to_llm override: eager-loads to prevent N+1 queries, integrates context compaction, filters empty/orphaned/duplicate messages
  • Processing lock via processing_by_id + processing_since columns (5-minute staleness threshold)
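
The staleness rule for the processing lock can be sketched as below. The Struct and method names are hypothetical, and this in-memory version is not atomic; a database-backed lock would need a conditional UPDATE or row lock to avoid races between workers.

```ruby
STALE_AFTER = 5 * 60 # seconds; locks older than this are treated as abandoned

Lock = Struct.new(:processing_by_id, :processing_since)

# A lock counts as held only while it is younger than the staleness threshold.
def processing?(lock, now: Time.now)
  return false if lock.processing_by_id.nil? || lock.processing_since.nil?

  (now - lock.processing_since) < STALE_AFTER
end

# Takes the lock when it is free or stale; returns false when someone else holds it.
def acquire!(lock, user_id, now: Time.now)
  return false if processing?(lock, now: now)

  lock.processing_by_id = user_id
  lock.processing_since = now
  true
end
```

The staleness window lets a conversation recover on its own if a worker crashes without releasing the lock.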

AssistantMessage

Table: assistant_messages

id, assistant_conversation_id, role, content, content_raw (json),
input_tokens, output_tokens, cached_tokens, cache_creation_tokens,
thinking_text, thinking_tokens, thinking_signature,
llm_model_id, assistant_tool_call_id, timestamps

AssistantToolCall

Table: assistant_tool_calls

id, assistant_message_id, tool_call_id, name, arguments (jsonb), timestamps

AssistantConversationShare

Table: assistant_conversation_shares

id, assistant_conversation_id, shared_with_type, shared_with_id, access_level, timestamps

Access levels: viewer (read-only), collaborator (can send messages).

AssistantBrainEntry

Table: assistant_brain_entries

id, category, title, rule (text),
scope ('global'|'user'), user_id,
applies_to_services (varchar[]),
status ('active'|'pending'|'inactive'),
source ('manual'|'auto_learned'),
suggested_by_id, approved_by_id,
created_at, updated_at
  • Includes Models::Auditable (PaperTrail) — all changes tracked in record_versions table
  • Includes Models::Embeddable — rule text vectorised to content_embeddings_assistant_brain_entries partition
  • Scopes: .active, .pending, .global, .for_user(user_id), .for_services(service_keys)

ContentEmbedding

Table: content_embeddings (list-partitioned by embeddable_type)

id, embeddable_type, embeddable_id, content_type, locale, embedding (vector(1536)), timestamps

Used by: AssistantBrainEntry, Article/Post/Faq, Video, Showcase. Cosine ANN search via has_neighbors from the neighbor gem.

UI Layer

Stimulus Controller — assistant_chat_controller.js

Target Description
messages Chat message container (auto-scroll target)
input Text input field
submitButton Send button (loading state management)
placeholder Empty-state placeholder
modelSelect Model dropdown selector

Features:

  • MutationObserver: Watches messages target; auto-scrolls on new content
  • Completion detection: Scans for [data-chat-complete="true"] marker to re-enable input
  • Double-submit prevention: isProcessing flag blocks re-submission while worker runs

Turbo Stream Flow

  1. User submits → Turbo handles POST /assistant/ask
  2. Controller responds with ask.turbo_stream.erb → appends user message bubble + processing indicator
  3. Worker broadcasts via Turbo::StreamsChannel:
    • broadcast_replace_to → live preview with gray monospace text + blinking caret
    • Throttled to ~8fps for smooth streaming
  4. On completion: Worker broadcasts final rendered HTML replacing the preview
  5. Stimulus detects [data-chat-complete] → re-enables input

Stream name: assistant_chat:{conversation_id}

Configuration

Model IDs — config/initializers/ai_model_constants.rb

AiModelConstants is the canonical registry for every model ID. LlmDefaults
is a thin backward-compatibility shim that delegates to it:

module LlmDefaults
  DEFAULT_SONNET_MODEL = AiModelConstants.id(:anthropic_sonnet) # claude-sonnet-4-6
  DEFAULT_OPUS_MODEL   = AiModelConstants.id(:anthropic_opus)   # claude-opus-4-8
  DEFAULT_HAIKU_MODEL  = AiModelConstants.id(:anthropic_haiku)  # claude-haiku-4-5-20251001
end

All model references in ChatService::MODELS use these constants. Never hardcode
model IDs elsewhere. WeeklyLlmModelSyncWorker syncs the live provider registry
weekly and emails a report when a pinned default has fallen behind.

RubyLLM Provider Keys

Provider Env Variable
Anthropic ANTHROPIC_API_KEY
OpenAI OPENAI_API_KEY
Google GEMINI_API_KEY

Database Services

Service Key          Label        Connection Class
postgres_production  App DB       ActiveRecord::Base
postgres_versions    Versions DB  VersionsDb::Base

File Index

File Purpose
app/agents/assistant/sunny_agent.rb RubyLLM Agent: factory for AssistantConversation records
app/prompts/assistant/sunny_agent/instructions.txt.erb Base system prompt template (identity, date macros, guidelines)
app/controllers/crm/assistant_chat_controller.rb HTTP entry point, user context, tool service selection
app/controllers/crm/assistant_brain_controller.rb CRM CRUD for brain entries; access-gated via CanCan
app/workers/assistant_chat_worker.rb Sidekiq job: LLM loop, streaming, response formatting
app/services/assistant/chat_service.rb Model selection, prompt assembly (incl. brain_prompt), thinking, tool registration
app/services/assistant/chat_tool_builder.rb Builds RubyLLM::Tool instances (content, DB, blog, brain tools)
app/services/assistant/blog_tool_builder.rb Blog create/update/get/embed tools + tagging helpers
app/services/assistant/sql_broker.rb SQL execution: read-only, access control, redaction, audit
app/services/assistant/data_policy.rb Object + column access rules, PII redaction
app/services/assistant/data_domain_policy.rb Role → domain → allowed objects resolution
app/services/assistant/comment_manifest.rb YAML manifest reader: domains, restricted columns, comments
app/services/assistant/tool_loop_guard.rb Prevents repetitive tool call loops
app/services/assistant/context_compactor.rb Sliding-window conversation summarisation
app/services/assistant/cost_calculator.rb Per-turn and per-conversation cost from token counts
app/services/assistant/response_formatter.rb Markdown → HTML with tables, SQL collapse, entity links
app/models/assistant_conversation.rb Conversation record (acts_as_chat) with token tracking
app/models/assistant_message.rb Per-message record (acts_as_message) with token tracking
app/models/assistant_tool_call.rb Tool invocation record (acts_as_tool_call)
app/models/assistant_conversation_share.rb Conversation sharing (viewer / collaborator)
app/models/assistant_brain_entry.rb Learned rules model; embeddable + auditable
app/models/content_embedding.rb Polymorphic vector store (has_neighbors); partitioned by type
app/concerns/models/embeddable.rb Concern: queue_embedding_generation via Rails Event Store
app/events/events/content_embedding_required.rb Pub/sub event: signals an embedding regeneration is needed
app/subscribers/embedding/content_changed_handler.rb Async handler: enqueues EmbeddingWorker on ai_embeddings queue
app/workers/embedding_worker.rb Calls OpenAI text-embedding-3-small; upserts to content_embeddings
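config/initializers/ai_model_constants.rb Canonical registry for every model ID (AiModelConstants)
config/initializers/llm_defaults.rb Backward-compatibility shim exposing the Anthropic default model constants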
config/analytics/data_domains.yml Domain descriptions + role mappings
db/comments/*.yml Per-table manifests with domain, comments, restricted flags
db/data/brain_entry_embeddings_seed.json Pre-computed embeddings for initial brain entries (avoids API costs on deploy)

Access Control

Who can use Sunny

All CRM accounts. Tool services available vary by role:

Tool service                  Non-admin          Admin / Manager
content                       Yes                Yes
postgres_production           Domain-restricted  Full
postgres_versions (audit DB)  No                 Yes
blog_management               No                 Yes

Who can manage the Brain

The can :manage, AssistantBrainEntry CanCan ability is granted to:

  1. Any account where is_manager? returns true (inside the shared manager ability block in ability.rb)
  2. Any account with the sunny_administrator role (can be assigned to non-managers, e.g. content specialists)

All CRM views (index.html.erb, histories.html.erb) use can?(:manage, AssistantBrainEntry) to conditionally render the "Sunny Brain" button. The controller uses the same ability check via require_sunny_admin!.


Testing

Test File Coverage
test/agents/assistant/sunny_agent_create_test.rb create! returns persisted conversation; no llm_model; no system messages at factory time; title defaults and custom title
test/agents/assistant/sunny_agent_find_test.rb find returns existing conversation without mutating it; no llm_model; no system messages
test/integration/sunny_blog_editor_end_to_end_test.rb Julia-style blog tool flows (conv 1565); assertion helpers in test/support/sunny_blog_editor_julia_flow_helpers.rb
test/models/assistant_conversation_test.rb to_llm nil guard, track_query!, processing lock, scopes
test/models/assistant_brain_entry_test.rb Scopes, status transitions, for_services filtering
test/services/assistant/chat_service_test.rb MODELS registry, auto_select_model, estimate_tokens, base prompt, brain prompt injection
test/workers/assistant_chat_worker_test.rb broadcast_error mappings, graceful error handling
test/initializers/llm_defaults_test.rb Model IDs valid, Anthropic pattern, no future dates, RubyLLM resolution

Run AI assistant tests:

mise exec -- bin/rails test test/agents/ \
  test/models/assistant_conversation_test.rb \
  test/models/assistant_brain_entry_test.rb \
  test/services/assistant/chat_service_test.rb \
  test/workers/assistant_chat_worker_test.rb \
  test/initializers/llm_defaults_test.rb

Migration History (Sunny features)

Migration Description
20260227100000_create_assistant_brain_entries Creates assistant_brain_entries table
20260227110000_add_scope_to_brain_entries Adds scope, user_id, applies_to_services columns
20260227140000_add_sunny_administrator_role Creates sunny_administrator role in roles table
20260228100001_seed_assistant_brain_entries Seeds 14 initial brain entries (stub class pattern)
20260228160000_add_brain_entry_embeddings Creates content_embeddings partitioned table + AssistantBrainEntry partition; seeds pre-computed vectors
20260228170000_seed_brain_entry_embeddings Inserts pre-computed embedding vectors for the 14 initial entries
20260228190000_add_blog_tagging_brain_entry Seeds two-tier tagging brain entry; queues embedding via after_commit

Migration pattern: Data migrations that create AssistantBrainEntry records during a deploy sequence must use a local stub class (class BrainEntry < ApplicationRecord; self.table_name = 'assistant_brain_entries'; end) if the migration runs before all schema migrations are applied. This avoids NoMethodError from model validations referencing not-yet-existing columns. The final tagging entry (20260228190000) intentionally uses the real model so the after_commit embedding callback fires.