AI Agent Architecture

Status: Production Ready

Last Updated: February 2026

Overview

The AI Agent (Sunny) is an internal assistant for WarmlyYours employees, accessible at /assistant in the CRM. It answers business questions using real-time database queries, content search, blog management, and multi-model LLM reasoning. The system is built on RubyLLM’s acts_as_chat + Agent patterns, with role-based data access, a persistent self-learning brain, tool-loop guardrails, audit logging, and streaming Turbo Stream responses.

Key capabilities:

Natural-language SQL: Users ask questions; the LLM writes, executes, and interprets read-only SQL
Content search: Semantic search across products, FAQs, blog posts, videos, showcases, and reviews
Blog management: Create, update, and enrich blog posts with linked assets (images, videos, FAQs, product cards), proper tagging, and SEO metadata
Multi-model support: Supports Claude, GPT-4.1, and Gemini models; auto mode picks a model based on query complexity
Extended thinking: Activates reasoning scratchpad for analytical queries (Anthropic/Gemini)
Data domain access control: CanCanCan roles map to data domains; each database table declares which domains it belongs to
Streaming UI: Real-time token-by-token streaming with live preview, tool status, and formatted output
Conversation sharing: Owner can share conversations with colleagues at viewer or collaborator level
Context compaction: Sliding-window summary for long conversations to stay within token limits
Sunny Brain: Persistent, editable knowledge base of learned rules injected into every system prompt; self-updates via semantic embeddings

Architecture Diagram

 Browser (CRM)                    Rails Server                      External
┌──────────────────┐   POST      ┌──────────────────────────────┐
│ Stimulus          │──/ask──────►│ Crm::AssistantChatController  │
│ assistant_chat    │             │ ├─ build_user_context()       │
│ controller        │             │ ├─ sanitize_tool_services()   │
│                   │             │ └─ AssistantChatWorker        │
│ MutationObserver  │             │   .perform_async()            │
│ auto-scroll       │             └───────────┬──────────────────┘
│ completion detect │                         │ Sidekiq
└─────▲────────────┘                         ▼
      │                           ┌──────────────────────────────┐
      │ Turbo Streams             │ AssistantChatWorker           │
      │ (ActionCable)             │ ├─ ChatService.new(...)       │
      │                           │ │  ├─ auto_select_model()    │
      │                           │ │  ├─ configure_conversation()│
      │                           │ │  │  ├─ with_model()        │
      │                           │ │  │  ├─ with_instructions() │
      │                           │ │  │  ├─ with_tools()        │   ┌──────────┐
      │                           │ │  │  ├─ with_thinking()     │──►│ Anthropic │
      │                           │ │  │  └─ on_tool_call()      │   │ OpenAI   │
      │                           │ │  └─ conversation.ask()     │   │ Gemini   │
      │                           │ │     ├─ LLM streaming       │   └──────────┘
      │                           │ │     ├─ tool execution ─────┤
      │                           │ │     └─ auto-persist msgs   │
      │                           │ ├─ broadcast_chunk()          │
      │◄──────────────────────────│ ├─ broadcast_complete()       │
      │                           │ │  └─ ResponseFormatter       │
      │                           │ └─ finalize_response()        │
      │                           └──────────────────────────────┘
      │
      │  Tool Execution Layer
      │  ┌────────────────────────────────────────────┐
      │  │ Assistant::ChatToolBuilder                  │
      │  │ ├─ Content tools (semantic_search, FAQs)   │──► pgvector / OpenAI Embeddings
      │  │ └─ PostgreSQL tools                         │
      │  │    ├─ describe_available_data               │──► CommentManifest (db/comments/*.yml)
      │  │    ├─ execute_sql ──► SqlBroker             │──► PostgreSQL (read-only)
      │  │    ├─ list_schemas                          │
      │  │    ├─ list_objects                          │
      │  │    ├─ get_object_details                    │
      │  │    └─ explain_query                         │
      │  └────────────────────────────────────────────┘
      │
      │  Security Layer
      │  ┌────────────────────────────────────────────┐
      │  │ Assistant::DataDomainPolicy ◄── data_domains.yml│
      │  │ ├─ role → domain mapping                    │
      │  │ └─ CommentManifest.objects_for_domains()    │
      │  │                                             │
      │  │ Assistant::DataPolicy                       │
      │  │ ├─ ALWAYS_BLOCKED_OBJECTS                   │
      │  │ ├─ object_allowed?()                        │
      │  │ └─ sensitive_columns_for_query()            │
      │  │                                             │
      │  │ Assistant::ToolLoopGuard                    │
      │  │ ├─ MAX_IDENTICAL_CALLS = 2                  │
      │  │ └─ MAX_CONSECUTIVE_SQL_FAILURES = 3         │
      │  │                                             │
      │  │ AssistantSqlAuditLog (every SQL execution)  │
      │  └────────────────────────────────────────────┘
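The two ToolLoopGuard limits in the diagram can be illustrated with a small sketch. Only the constant names and values come from the diagram. The class name, method names, and the exact semantics (a call is refused once the same tool and arguments have already been seen MAX_IDENTICAL_CALLS times, and SQL is refused after MAX_CONSECUTIVE_SQL_FAILURES failures in a row) are assumptions.

```ruby
# Hypothetical illustration of loop guarding; not the real ToolLoopGuard.
class LoopGuardSketch
  MAX_IDENTICAL_CALLS = 2
  MAX_CONSECUTIVE_SQL_FAILURES = 3

  def initialize
    @call_counts = Hash.new(0) # [tool_name, arguments] => times seen
    @sql_failures = 0          # consecutive execute_sql failures
  end

  # Returns :ok, or a symbol naming the limit that was hit.
  def check(tool_name, arguments)
    key = [tool_name, arguments]
    @call_counts[key] += 1
    return :identical_call_limit if @call_counts[key] > MAX_IDENTICAL_CALLS

    if tool_name == "execute_sql" && @sql_failures >= MAX_CONSECUTIVE_SQL_FAILURES
      return :sql_failure_limit
    end

    :ok
  end

  # A success resets the failure streak; a failure extends it.
  def record_sql_result(success:)
    @sql_failures = success ? 0 : @sql_failures + 1
  end
end

guard = LoopGuardSketch.new
results = 3.times.map { guard.check("execute_sql", { "sql" => "SELECT 1" }) }
```

In this sketch the third identical call is the first one refused, so the model gets one retry before the guard steps in.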

Components

1. Agent — `Assistant::SunnyAgent`

File: app/agents/assistant/sunny_agent.rb

RubyLLM::Agent subclass that serves as the canonical factory and configuration baseline for Sunny conversations. Follows the RubyLLM Agent pattern: chat_model wires create!/find to AssistantConversation, and instructions { nil } disables automatic discovery so ChatService can apply per-request configuration.

# Create a new conversation
conversation = Assistant::SunnyAgent.create!(user: current_user, messages: [])

# Load an existing conversation
conversation = Assistant::SunnyAgent.find(conversation_id)

All dynamic configuration (model, temperature, system prompt, tools, thinking, Anthropic prompt caching) is applied per-request by ChatService#configure_conversation. The agent is intentionally lean — it’s a factory, not an orchestrator.

2. Controller — `Crm::AssistantChatController`

File: app/controllers/crm/assistant_chat_controller.rb

The HTTP entry point. Handles conversation CRUD and delegates LLM processing to a Sidekiq worker.

Action	Method	Description
`index`	GET	Load most recent conversation; list sidebar conversations
`show`	GET	Load a specific conversation (owned or shared)
`create`	POST	Create a new conversation via `SunnyAgent.create!`
`ask`	POST	Accept user message, enqueue `AssistantChatWorker`
`cancel`	POST	Set Redis cancellation flag; worker checks between chunks
`destroy`	DELETE	Delete a conversation

Key responsibilities:

User context assembly: build_user_context() serializes identity (name, party_id, department, job_title), role flags (is_admin, is_manager), and resolved analytics_domains into a Hash passed to the worker.
Tool service selection: available_chat_services() limits non-admins to content + postgres_production. Admins also get postgres_versions (audit trail DB).
Processing lock: Checks conversation.processing? before enqueuing; blocks if a job is already running.
Access control: Viewers (shared conversations) cannot submit new messages.

3. Worker — `AssistantChatWorker`

File: app/workers/assistant_chat_worker.rb

Sidekiq job that runs the full LLM interaction loop in the background.

Lifecycle:

Acquire processing lock (acquire_processing_lock!)
Load conversation, instantiate Assistant::ChatService
Call service.call { |chunk| broadcast_chunk(chunk) }
LLM streams response tokens → worker broadcasts live preview via Turbo::StreamsChannel
On tool calls: LLM pauses, tool executes, result fed back, LLM continues
On completion: ResponseFormatter renders final markdown → HTML; broadcast replaces preview
finalize_response(): track metrics, sync token totals, dual-write legacy JSONB
Release processing lock

Streaming: The worker throttles preview updates to ~8fps (0.12s interval) for smooth UX.
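As a rough illustration of interval-based throttling, the sketch below skips preview broadcasts that arrive less than 0.12s after the previous one. The PreviewThrottle class and its injectable clock are hypothetical and exist only to make the idea testable; the worker's actual implementation may differ.

```ruby
# Hypothetical helper: lets a block run at most once per interval.
class PreviewThrottle
  INTERVAL = 0.12 # seconds between previews (about 8 per second)

  def initialize(interval: INTERVAL,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @interval = interval
    @clock = clock
    @last_sent_at = nil
  end

  # Yields only if `interval` seconds have passed since the last yield.
  # Returns true when the block ran, false when the call was skipped.
  def call
    now = @clock.call
    return false if @last_sent_at && (now - @last_sent_at) < @interval

    @last_sent_at = now
    yield
    true
  end
end

# Usage with a fake clock: a chunk arrives every 50 ms.
fake_time = 0.0
throttle = PreviewThrottle.new(clock: -> { fake_time })
buffer = +""
sent = []

%w[Hel lo , wor ld].each_with_index do |chunk, i|
  fake_time = i * 0.05
  buffer << chunk
  throttle.call { sent << buffer.dup }
end
```

Skipped chunks are not lost: they accumulate in the buffer and appear in the next preview, and the final rendered response replaces the preview on completion.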

Cancellation: cancel action sets cancelled:{jid} in Redis (TTL 300s). Worker checks cancelled? between chunks and stops streaming.

Error handling: broadcast_error maps known error patterns (rate limits, timeouts, invalid model IDs, JSON parse failures) to user-friendly messages.

4. Chat Service — `Assistant::ChatService`

File: app/services/assistant/chat_service.rb

Orchestrates a single chat turn: model selection, system prompt assembly, tool registration, thinking activation, and streaming execution.

Model Selection

MODELS = {
  'claude-haiku'  => { id: LlmDefaults::DEFAULT_HAIKU_MODEL,  provider: :anthropic, cost: :low,    supports_thinking: false },
  'claude-sonnet' => { id: LlmDefaults::DEFAULT_SONNET_MODEL, provider: :anthropic, cost: :medium, supports_thinking: true,  thinking_effort_default: :medium },
  'claude-opus'   => { id: LlmDefaults::DEFAULT_OPUS_MODEL,   provider: :anthropic, cost: :high,   supports_thinking: true,  thinking_effort_default: :high },
  'gpt-4.1'       => { id: 'gpt-4.1',                        provider: :openai,    cost: :medium, supports_thinking: false },
  'gpt-4.1-mini'  => { id: 'gpt-4.1-mini',                   provider: :openai,    cost: :low,    supports_thinking: false },
  'gemini-flash'  => { id: 'gemini-2.5-flash',               provider: :gemini,    cost: :low,    supports_thinking: true,  thinking_effort_default: :low },
  'gemini-pro'    => { id: 'gemini-2.5-pro',                 provider: :gemini,    cost: :medium, supports_thinking: true,  thinking_effort_default: :medium },
}

Auto-selection (model: 'auto') uses regex pattern matching + token estimation:

Simple lookups (< 50 tokens, “show me”, “list”, “count”) → claude-haiku
Complex/comparison queries (“trend”, “correlate”, “why”, > 150 tokens) → claude-sonnet
Default → claude-sonnet
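The routing rules above could be sketched roughly as follows. The regexes, the four-characters-per-token estimate, and the decision to require both a short query and a lookup keyword for claude-haiku are assumptions made for illustration; the real auto_select_model may combine the signals differently.

```ruby
# Illustrative patterns only; the production pattern lists are not shown here.
SIMPLE_PATTERNS  = /\b(?:show me|list|count)\b/i
COMPLEX_PATTERNS = /\b(?:trend|correlate|why)/i

# Rough heuristic: about four characters per token.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

def auto_select_model(query)
  tokens = estimate_tokens(query)

  # Complexity signals win over simplicity signals.
  return "claude-sonnet" if tokens > 150 || query.match?(COMPLEX_PATTERNS)
  return "claude-haiku"  if tokens < 50 && query.match?(SIMPLE_PATTERNS)

  "claude-sonnet" # default
end

choices = [
  "count open orders",
  "why did sales drop? show me the trend",
  "hello there"
].map { |q| auto_select_model(q) }
```

Checking the complex patterns first means a query such as "show me why returns are up" is routed to the stronger model even though it also contains a lookup keyword.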

Anthropic model IDs are referenced through the LlmDefaults constants (config/initializers/llm_defaults.rb), which delegate to the AiModelConstants registry, so each ID is defined in one place.

System Prompt

Assembled from layers, joined with "\n\n":

Base template (app/prompts/assistant/sunny_agent/instructions.txt.erb): Identity, date macros, formatting guidelines, source citation rules
user_context_prompt: User identity (## CURRENT USER), party_id mappings for “my” queries, data access domains
schema_hint_for_role: Instructs the LLM to call describe_available_data; power users also get list_schemas/get_object_details
tools_system_prompt: Tool names, tool-mode rules, and service-specific rules (content, blog, GA4, GSC, Ahrefs, etc.)
crm_url_templates_prompt: CRM URL patterns for linking records by ID
brain_prompt: Hybrid-retrieved AssistantBrainEntry rules injected under ## LEARNED RULES; contextually filtered to entries whose applies_to_services overlap with the active tool services

For Anthropic models, the system prompt is wrapped with cache: true via RubyLLM::Providers::Anthropic::Content. On subsequent turns the cached prefix is billed at a discount of up to 90%, which reduces input cost rather than the number of tokens sent.

Extended Thinking

When the model supports it and the query matches THINKING_QUERY_PATTERNS:

Activated via conversation.with_thinking(effort:, budget:)
Budget tokens: 4K (low) / 8K (medium) / 16K (high, Opus only)
Anthropic requires temperature: 1 when thinking is active
Thinking traces are persisted to assistant_messages.thinking_text

Conversation Configuration (`configure_conversation`)

conversation.with_model(model_id, provider:, assume_exists: true)
conversation.with_instructions(cacheable_system_prompt, replace: true)
conversation.with_thinking(effort:, budget:)        # if applicable
conversation.with_params(max_tokens: budget + 16_000) # if thinking with budget
conversation.with_temperature(thinking? ? 1 : 0.3)
conversation.with_tools(*tools)                       # from ChatToolBuilder
conversation.on_new_message { emit_status(:thinking) }
conversation.on_tool_call   { |tc| emit_status(...) }
conversation.on_tool_result { ... }
conversation.on_end_message { emit_status(:composing) }
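The thinking-related settings in this configuration can be expressed as a pure function, which makes the arithmetic easy to check. This is a sketch under assumptions: the method name and return shape are hypothetical, 4K/8K/16K are read as 4,000/8,000/16,000, and downgrading :high to :medium on non-Opus models is one possible reading of "Opus only".

```ruby
# Budgets per effort level, taken from the Extended Thinking section.
THINKING_BUDGETS = { low: 4_000, medium: 8_000, high: 16_000 }.freeze
RESPONSE_HEADROOM = 16_000 # extra tokens reserved for the visible answer
DEFAULT_TEMPERATURE = 0.3

def generation_params(model_key:, effort: nil)
  # Without thinking, only the default temperature applies.
  return { temperature: DEFAULT_TEMPERATURE } if effort.nil?

  # Assumption: high effort silently falls back to medium off Opus.
  effort = :medium if effort == :high && model_key != "claude-opus"
  budget = THINKING_BUDGETS.fetch(effort)

  {
    effort: effort,
    budget: budget,
    max_tokens: budget + RESPONSE_HEADROOM,
    temperature: 1 # required by Anthropic when thinking is active
  }
end

opus   = generation_params(model_key: "claude-opus", effort: :high)
sonnet = generation_params(model_key: "claude-sonnet", effort: :high)
plain  = generation_params(model_key: "claude-haiku")
```

Adding the budget to max_tokens keeps the thinking allowance from eating into the space available for the final answer.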

5. Tool Builder — `Assistant::ChatToolBuilder`

File: app/services/assistant/chat_tool_builder.rb

Dynamically builds RubyLLM::Tool subclasses for the chat. No HTTP hop to the MCP server — tools execute in-process.

Tool registration is keyed on tool_services — an array passed from the controller that determines which tool groups are active for this conversation. Non-admins are limited to content + postgres_production; admins also get postgres_versions.

Content Tools (`content`)

Tool	Description
`semantic_search`	Semantic vector search across all content types
`find_faqs`	FAQ search with product line filtering
`find_call_recordings`	Permission-gated call transcript search

PostgreSQL Tools (`postgres_production`, `postgres_versions`)

Built per database service key. Access level depends on role:

Tool	Admin/Manager	Employee
`describe_available_data`	Yes	Yes
`execute_sql`	Full	Domain-restricted
`list_schemas`	Yes	No
`list_objects`	Yes	No
`get_object_details`	Yes	No
`explain_query`	Yes	No

Blog Management Tools (`blog_management`)

Tool	Description
`create_blog_post`	Create a draft post (body, SEO, tags, preview image)
`update_blog_post`	Update fields + atomically replace tags; creates a revision
`get_blog_post`	Read all editable fields, current tags, revision count
`insert_image`	oEmbed rendered HTML for an image embed
`insert_video`	oEmbed rendered HTML for a video embed (Cloudflare or YouTube)
`insert_faqs`	oEmbed rendered HTML for an FAQ section
`insert_product`	oEmbed rendered HTML for a product card (Liquid tag)

Brain Tool (all services)

Tool	Description
`propose_brain_entry`	Propose a new learned rule; creates a `pending` brain entry

6. SQL Broker — `Assistant::SqlBroker`

File: app/services/assistant/sql_broker.rb

Centralized execution layer for all AI-generated SQL.

Enforcement chain:

Read-only check: Only SELECT, SHOW, WITH, EXPLAIN, SET are allowed
SQL normalization: Strip comments, collapse whitespace
Object extraction: PgQuery.parse(sql).tables via AST
Object-level access: DataPolicy.object_allowed?() checks against ALWAYS_BLOCKED_OBJECTS + domain-resolved allowed_objects
Query execution: Read-only transaction with statement timeout (8s)
Column-level redaction: PII columns redacted from results
Row cap: Truncates to 50 rows; adds truncation notice
Audit logging: Every execution logged to assistant_sql_audit_logs
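Three of these steps (normalization, the read-only keyword check, and the row cap) are simple enough to sketch in plain Ruby. All names below are hypothetical, the wording of the truncation notice is invented, and the sketch normalizes before checking the keyword, which may not match the broker's actual order.

```ruby
ALLOWED_LEADING_KEYWORDS = %w[SELECT SHOW WITH EXPLAIN SET].freeze
ROW_CAP = 50

# Strips SQL comments and collapses whitespace. Naive on purpose: it would
# also alter string literals that happen to contain "--" or "/*".
def normalize_sql(sql)
  sql.gsub(%r{/\*.*?\*/}m, " ") # block comments
     .gsub(/--[^\n]*/, " ")     # line comments
     .gsub(/\s+/, " ")
     .strip
end

# True when the statement starts with an allowed keyword.
def read_only?(sql)
  first_word = normalize_sql(sql)[/\A\w+/].to_s.upcase
  ALLOWED_LEADING_KEYWORDS.include?(first_word)
end

# Truncates a result set and reports how much was dropped.
def cap_rows(rows, cap: ROW_CAP)
  return { rows: rows, notice: nil } if rows.size <= cap

  { rows: rows.first(cap), notice: "Showing first #{cap} of #{rows.size} rows" }
end

capped = cap_rows((1..60).to_a)
```

A leading-keyword check alone is not sufficient, because a WITH statement can contain a data-modifying CTE. Running the query inside a read-only transaction, as the chain above does, is what actually prevents writes.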

7. Data Domain Access Control

Same domain system as described in the Analytics section:

File: config/analytics/data_domains.yml
5 business domains: sales, financial, workforce, operations, marketing
Role mappings: sales_rep, sales_manager, accounting_manager, marketing_manager, etc.
Database tables/views declare their domain(s) in db/comments/*.yml manifests
Admins get nil (unrestricted); DataDomainPolicy resolves roles → domains → allowed objects
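The roles → domains → allowed objects resolution can be pictured with in-memory hashes. The role-to-domain assignments and table names below are invented for illustration (the real values live in data_domains.yml and the db/comments manifests), and the helper names are hypothetical.

```ruby
# Invented sample data standing in for the YAML configuration.
ROLE_DOMAINS = {
  "sales_rep"          => %w[sales],
  "sales_manager"      => %w[sales operations],
  "accounting_manager" => %w[financial]
}.freeze

OBJECT_DOMAINS = {
  "orders"    => %w[sales financial],
  "invoices"  => %w[financial],
  "shipments" => %w[operations]
}.freeze

# nil means unrestricted, mirroring the admin case described above.
def domains_for(roles, admin: false)
  return nil if admin

  roles.flat_map { |role| ROLE_DOMAINS.fetch(role, []) }.uniq
end

# An object is visible if it shares at least one domain with the user.
def allowed_objects(domains)
  return nil if domains.nil?

  OBJECT_DOMAINS.select { |_name, object_domains| (object_domains & domains).any? }.keys
end

def object_allowed?(name, allowed)
  allowed.nil? || allowed.include?(name)
end

rep_objects = allowed_objects(domains_for(%w[sales_rep]))
```

Unknown roles resolve to no domains in this sketch, so an account with only unrecognized roles sees nothing rather than everything.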

8. Context Compactor — `Assistant::ContextCompactor`

File: app/services/assistant/context_compactor.rb

Sliding-window context management for long conversations:

When token count exceeds the threshold, ensure_context_summary! generates (or returns a cached) summary of older messages
Summary injected as synthetic user + assistant messages in to_llm (after system messages)
Only recent messages (after compaction_through_message_id) are sent verbatim
Summary and cutoff ID are cached in AssistantConversation’s JSONB metadata
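A minimal sketch of how the compacted message list might be assembled is shown below. The Message struct, the method name, and the wording of the two synthetic messages are assumptions; the sketch also assumes message IDs increase over time.

```ruby
Message = Struct.new(:id, :role, :content)

# Returns system messages, then a synthetic summary exchange, then only the
# messages newer than the cutoff. Without a summary, nothing is changed.
def compacted_messages(messages, summary:, cutoff_id:)
  return messages if summary.nil? || cutoff_id.nil?

  system, rest = messages.partition { |m| m.role == :system }
  recent = rest.select { |m| m.id > cutoff_id }

  synthetic = [
    Message.new(nil, :user, "Summary of the earlier conversation:\n#{summary}"),
    Message.new(nil, :assistant, "Understood. I will use that summary as context.")
  ]

  system + synthetic + recent
end

history = [
  Message.new(1, :system, "You are Sunny."),
  Message.new(2, :user, "How were Q1 sales?"),
  Message.new(3, :assistant, "Q1 sales were up."),
  Message.new(4, :user, "And Q2?"),
  Message.new(5, :assistant, "Q2 was flat.")
]

compacted = compacted_messages(history, summary: "User asked about Q1 sales.", cutoff_id: 3)
```

Presenting the summary as a user and assistant pair keeps the roles alternating, which some providers require.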

9. Cost Calculator — `Assistant::CostCalculator`

File: app/services/assistant/cost_calculator.rb

Computes per-turn and per-conversation cost from token counts and model pricing:

Looks up per-token price from ChatService::MODELS
Accounts for cached tokens (90% discount) and cache-creation tokens
Called by AssistantConversation#computed_total_cost and sync_token_totals!
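The cost arithmetic can be sketched as below. The prices are placeholders rather than real model pricing, the 1.25x cache-creation multiplier is an assumption, and the sketch assumes input_tokens excludes cached tokens. The real calculator reads prices from ChatService::MODELS instead of a separate table.

```ruby
# Placeholder prices in USD per million tokens; not real pricing.
PRICING = { "example-model" => { input: 3.0, output: 15.0 } }.freeze

CACHE_READ_MULTIPLIER  = 0.10 # cached tokens billed at a 90% discount
CACHE_WRITE_MULTIPLIER = 1.25 # assumed surcharge for cache-creation tokens

def turn_cost(model:, input_tokens:, output_tokens:,
              cached_tokens: 0, cache_creation_tokens: 0)
  price = PRICING.fetch(model)

  # Convert every kind of input token into "full-price input" units.
  input_units = input_tokens +
                (cached_tokens * CACHE_READ_MULTIPLIER) +
                (cache_creation_tokens * CACHE_WRITE_MULTIPLIER)

  ((input_units * price[:input]) + (output_tokens * price[:output])) / 1_000_000.0
end

cost = turn_cost(model: "example-model",
                 input_tokens: 1_000, output_tokens: 500, cached_tokens: 10_000)
```

With these placeholder prices, the 10,000 cached tokens cost the same as 1,000 uncached ones, which is where the savings on long system prompts come from.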

10. Response Formatter — `Assistant::ResponseFormatter`

File: app/services/assistant/response_formatter.rb

Transformation	Description
Markdown → HTML	Kramdown with GFM + Rouge syntax highlighting
HTML sanitization	Sanitize gem strips XSS vectors
SQL collapse	SQL blocks wrapped in Bootstrap collapsible accordions
Table enhancement	Bootstrap tables with cell formatting and export buttons (Copy, CSV, Excel)
Entity linking	Order numbers, customer IDs, SKUs linked to CRM pages
Source panel	Source citations styled distinctively

11. Sunny Brain — `AssistantBrainEntry`

Files: app/models/assistant_brain_entry.rb, app/controllers/crm/assistant_brain_controller.rb

The Brain is a persistent, CRM-editable knowledge base of rules and facts that are injected into Sunny’s system prompt on every conversation. It replaces hard-coded domain rules in prompt files with database-backed entries that any manager can inspect, approve, or correct without a code deploy.

Entry Model

assistant_brain_entries
  id, category, title, rule (text),
  scope ('global'|'user'), user_id,
  applies_to_services (varchar[]),
  status ('active'|'pending'|'inactive'),
  source ('manual'|'auto_learned'),
  suggested_by_id, approved_by_id,
  created_at, updated_at

scope: 'global' — shared across all users; managed by managers / sunny_administrator role
scope: 'user' — personal preferences for one user; self-managed, never shown to others
applies_to_services — contextual filtering: an empty array means “always inject”; a populated array means “only inject when one of these tool services is active in the current conversation” (e.g. ['blog_management'] keeps blog rules out of analytics sessions)

System Prompt Injection

ChatService#brain_prompt is called during prompt assembly and appends a ## LEARNED RULES section. It uses hybrid retrieval:

Universal entries (applies_to_services: []) — always injected
Service-specific entries — when the total exceeds BRAIN_SEMANTIC_THRESHOLD (40) and a user message is present, pgvector cosine similarity selects the top BRAIN_SEMANTIC_TOP_K (20) most relevant entries from candidates; otherwise all are injected
Entries are sorted by category then title before injection
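The hybrid retrieval described above can be approximated in plain Ruby. In this sketch a hand-written cosine function stands in for the pgvector query, the Entry struct and method names are hypothetical, and "total" is read as universal entries plus service-matched candidates, which is an assumption.

```ruby
Entry = Struct.new(:category, :title, :services, :embedding)

SEMANTIC_THRESHOLD = 40
SEMANTIC_TOP_K = 20

def cosine(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

def select_entries(entries, active_services:, query_embedding: nil,
                   threshold: SEMANTIC_THRESHOLD, top_k: SEMANTIC_TOP_K)
  universal, specific = entries.partition { |e| e.services.empty? }
  candidates = specific.select { |e| (e.services & active_services).any? }

  # Rank only when there are too many entries and a query to rank against.
  if query_embedding && (universal.size + candidates.size) > threshold
    candidates = candidates.max_by(top_k) { |e| cosine(e.embedding, query_embedding) }
  end

  (universal + candidates).sort_by { |e| [e.category, e.title] }
end

entries = [
  Entry.new("general", "Be concise", [], [1.0, 0.0]),
  Entry.new("blog", "Tag posts", ["blog_management"], [0.0, 1.0]),
  Entry.new("blog", "Validate links", ["blog_management"], [1.0, 0.0]),
  Entry.new("sql", "Use views", ["postgres_production"], [1.0, 0.0])
]

# Small limits so the semantic branch is exercised with four entries.
ranked = select_entries(entries, active_services: ["blog_management"],
                        query_embedding: [0.0, 1.0], threshold: 2, top_k: 1)
```

Universal entries bypass ranking entirely, so rules that must always apply are never dropped by the top-K cut.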

Semantic Embeddings

Each AssistantBrainEntry includes Models::Embeddable. When title or rule changes, an Events::ContentEmbeddingRequired event is published to the Rails Event Store. Embedding::ContentChangedHandler (async handler) enqueues an EmbeddingWorker job on the ai_embeddings queue, which calls OpenAI text-embedding-3-small and stores a 1536-dimension vector in the content_embeddings_assistant_brain_entries partition (PostgreSQL list partition of content_embeddings keyed on embeddable_type = 'AssistantBrainEntry').

The partition has:

A UNIQUE index on (embeddable_id, content_type, locale)
An HNSW index with vector_cosine_ops for fast approximate nearest-neighbor search

Self-Learning via `ProposeBrainEntryTool`

Sunny can propose new rules after user corrections by calling the propose_brain_entry tool (registered in ChatToolBuilder). Proposed entries are created with status: 'pending'. Managers see a Pending Approval panel in the Brain CRM page and can approve or reject with a single click.

CRM Interface

GET /assistant_brain — lists:

Global active entries (grouped by category) — visible to all users with access
Pending entries awaiting approval — Brain managers only (managers or the sunny_administrator role)
Inactive entries — Brain managers only (managers or the sunny_administrator role)
Current user’s personal entries

Access control uses CanCan:

# ability.rb — inside is_manager? block
can :manage, AssistantBrainEntry

# Plus standalone for non-managers with explicit role
can :manage, AssistantBrainEntry if account.has_role?('sunny_administrator', admin_check: false)

The sunny_administrator role (created in 20260227140000_add_sunny_administrator_role) can be granted to non-managers who need brain management access (e.g. content specialists).

12. Blog Management Tool Service

File: app/services/assistant/blog_tool_builder.rb

The blog_management tool service provides Sunny with 8 tools for creating and maintaining blog posts.

Tools

Tool	Description
`create_blog_post`	Create a draft post with HTML body, SEO fields, tags, preview image
`update_blog_post`	Update any post field; creates a revision; replaces tags atomically
`get_blog_post`	Read all editable fields + current tags, revisions, effective meta description
`insert_image`	Fetch rendered oEmbed HTML for an image embed
`insert_video`	Fetch rendered oEmbed HTML for a video embed
`insert_faqs`	Fetch rendered oEmbed HTML for an FAQ section
`insert_product`	Fetch rendered oEmbed HTML for a product card
`propose_brain_entry`	Propose a new brain rule (available in all services)

Tagging System (Two-Tier)

Blog posts use two distinct tag types, both applied via the tags: parameter:

Tier 1 — Page-placement tags (public: false): internal tags that make a post appear in the content section of a specific landing page. Format: for-{page-path-parameterized}-page. Used by PagesHelper#page_posts, page_videos, page_showcases, etc. Never shown to visitors (filtered by BlogHelper#tags_with_links).

Examples: for-towel-warmer-page, for-towel-warmer-matte-black-page, for-floor-heating-bathroom-page

Tier 2 — Public navigation tags (public: true): drive the blog tag cloud at /posts/{tag}/tag. 22 tags in production: towel-warmers, installation, indoor-heating, snow-melting, design-trends, etc.

The tags: parameter in update_blog_post replaces the full tag set (clear + add). Always call get_blog_post first to read existing tags before updating. Only tags that already exist in the database are applied — unrecognised names are silently skipped (assign_tags never creates new tags).
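The tag naming and replace semantics can be illustrated with a few plain-Ruby helpers. These are hypothetical stand-ins: the slug logic only approximates parameterization, and replace_tags mimics the "skip unknown names" behaviour without touching a database.

```ruby
# Builds a page-placement tag name from a page path (approximate slugging).
def placement_tag(page_path)
  slug = page_path.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
  "for-#{slug}-page"
end

# Replace semantics: the requested set becomes the full tag set, and names
# that do not already exist are dropped rather than created.
def replace_tags(requested, known_tags)
  requested.uniq.select { |name| known_tags.include?(name) }
end

# Because an update replaces everything, merge existing tags in first.
def tags_for_update(existing, additions, known_tags)
  replace_tags(existing + additions, known_tags)
end

known = %w[installation towel-warmers for-towel-warmer-page]
existing = %w[installation]

merged = tags_for_update(existing, [placement_tag("/towel-warmer"), "brand-new-tag"], known)
lossy  = replace_tags([placement_tag("/towel-warmer")], known)
```

The second call shows the pitfall: passing only the new tag drops installation, which is why reading the current tags first matters.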

Link Validation

Before calling update_blog_post or create_blog_post with any HTML containing <a> elements, Sunny must validate every link using fetch_url. Links that return 404, redirect loops, or error pages must be corrected before the post is saved. This is enforced by a Brain rule (Validate All Links Before Saving a Blog Post) injected when blog_management + web_fetch are active.

Style Guide (`BlogToolBuilder::STYLE_GUIDE`)

A condensed style guide is injected into the create_blog_post and update_blog_post tool descriptions, covering headings (H2+), callouts, feature boxes, sidebars, tables, Liquid tags, Liquid variables, internal link patterns (always use {{ locale }}), and the oEmbed embed workflow.

13. Embedding Infrastructure

Files: app/concerns/models/embeddable.rb, app/subscribers/embedding/content_changed_handler.rb, app/workers/embedding_worker.rb

`Models::Embeddable` Concern

Included by any model that needs semantic search. Provides:

content_for_embedding(content_type) — override to define what text is embedded
embedding_content_changed? — override to define which attribute changes trigger re-embedding
queue_embedding_generation — publishes Events::ContentEmbeddingRequired to the event store

The concern uses the Rails Event Store pub/sub system rather than direct worker enqueue, ensuring every embedding regeneration trigger has a permanent, auditable record in event_store_events.

Event Flow

Model#save → after_commit → queue_embedding_generation
  → event_store.publish(Events::ContentEmbeddingRequired, data: {type, id})
    → Embedding::ContentChangedHandler (AsyncJob, ai_embeddings queue)
      → EmbeddingWorker.perform_async(type, id)
        → OpenAI text-embedding-3-small API
          → ContentEmbedding.upsert(embeddable, vector)

`ContentEmbedding` Table (Partitioned)

The content_embeddings parent table is list-partitioned by embeddable_type. Each model that uses embeddings has its own partition created via pg_party’s create_list_partition_of. Per-partition indexes:

UNIQUE on (embeddable_id, content_type, locale)
HNSW (vector_cosine_ops) for cosine ANN search

pgvector HNSW indexes cannot be placed on partitioned parent tables; they must be on each partition individually.

Models using embeddings

Model	Partition	Embedding content
`Article` / `Post` / `Faq`	`content_embeddings_articles`	Subject + solution body
`Video`	`content_embeddings_videos`	Title + description + transcript
`Showcase`	`content_embeddings_showcases`	Title + description
`AssistantBrainEntry`	`content_embeddings_assistant_brain_entries`	Title + rule text

Data Models

`AssistantConversation`

Table: assistant_conversations

id, user_id, title, messages (jsonb, legacy), metadata (jsonb), llm_model_id,
processing_by_id, processing_since, timestamps

acts_as_chat from RubyLLM — provides ask(), with_model(), with_tools(), with_instructions(), with_thinking()
Messages auto-persist to assistant_messages table
metadata stores via jsonb_accessor: token totals, cost, queries, compaction summary, tool services
with_instructions override fixes RubyLLM v1.11.0 bug where Content::Raw objects were serialized as #<RubyLLM::Content::Raw:0x...> instead of their content
to_llm override: eager-loads to prevent N+1 queries, integrates context compaction, filters empty/orphaned/duplicate messages
Processing lock via processing_by_id + processing_since columns (5-minute staleness threshold)
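The staleness rule for the processing lock can be sketched with an in-memory struct. The helper names are hypothetical and the sketch is not atomic; a real implementation would need a single conditional database update so two jobs cannot both acquire the lock.

```ruby
STALE_AFTER = 5 * 60 # seconds; a lock older than this is treated as abandoned

Lock = Struct.new(:processing_by_id, :processing_since)

# A lock counts as held only if it is set and not yet stale.
def processing?(lock, now: Time.now)
  return false if lock.processing_by_id.nil? || lock.processing_since.nil?

  (now - lock.processing_since) < STALE_AFTER
end

# Takes the lock unless another fresh lock is in place.
def acquire!(lock, job_id, now: Time.now)
  return false if processing?(lock, now: now)

  lock.processing_by_id = job_id
  lock.processing_since = now
  true
end

def release!(lock)
  lock.processing_by_id = nil
  lock.processing_since = nil
end

t0 = Time.at(1_700_000_000)
lock = Lock.new(nil, nil)

first  = acquire!(lock, "job-a", now: t0)       # lock is free
second = acquire!(lock, "job-b", now: t0 + 60)  # job-a still fresh
third  = acquire!(lock, "job-c", now: t0 + 301) # job-a is stale
```

The staleness threshold means a crashed worker cannot block a conversation indefinitely.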

`AssistantMessage`

Table: assistant_messages

id, assistant_conversation_id, role, content, content_raw (json),
input_tokens, output_tokens, cached_tokens, cache_creation_tokens,
thinking_text, thinking_tokens, thinking_signature,
llm_model_id, assistant_tool_call_id, timestamps

`AssistantToolCall`

Table: assistant_tool_calls

id, assistant_message_id, tool_call_id, name, arguments (jsonb), timestamps

`AssistantConversationShare`

Table: assistant_conversation_shares

id, assistant_conversation_id, shared_with_type, shared_with_id, access_level, timestamps

Access levels: viewer (read-only), collaborator (can send messages).

`AssistantBrainEntry`

Table: assistant_brain_entries

id, category, title, rule (text),
scope ('global'|'user'), user_id,
applies_to_services (varchar[]),
status ('active'|'pending'|'inactive'),
source ('manual'|'auto_learned'),
suggested_by_id, approved_by_id,
created_at, updated_at

Includes Models::Auditable (PaperTrail) — all changes tracked in record_versions table
Includes Models::Embeddable — rule text vectorised to content_embeddings_assistant_brain_entries partition
Scopes: .active, .pending, .global, .for_user(user_id), .for_services(service_keys)

`ContentEmbedding`

Table: content_embeddings (list-partitioned by embeddable_type)

id, embeddable_type, embeddable_id, content_type, locale, embedding (vector(1536)), timestamps

Used by: AssistantBrainEntry, Article/Post/Faq, Video, Showcase. Cosine ANN search via has_neighbors from the neighbor gem.

UI Layer

Stimulus Controller — `assistant_chat_controller.js`

Target	Description
`messages`	Chat message container (auto-scroll target)
`input`	Text input field
`submitButton`	Send button (loading state management)
`placeholder`	Empty-state placeholder
`modelSelect`	Model dropdown selector

Features:

MutationObserver: Watches messages target; auto-scrolls on new content
Completion detection: Scans for [data-chat-complete="true"] marker to re-enable input
Double-submit prevention: isProcessing flag blocks re-submission while worker runs

Turbo Stream Flow

User submits → Turbo handles POST /assistant/ask
Controller responds with ask.turbo_stream.erb → appends user message bubble + processing indicator
Worker broadcasts via Turbo::StreamsChannel:
- broadcast_replace_to → live preview with gray monospace text + blinking caret
- Throttled to ~8fps for smooth streaming
On completion: Worker broadcasts final rendered HTML replacing the preview
Stimulus detects [data-chat-complete] → re-enables input

Stream name: assistant_chat:{conversation_id}

Configuration

Model IDs — `config/initializers/ai_model_constants.rb`

AiModelConstants is the canonical registry for every model ID. LlmDefaults is a thin backward-compatibility shim that delegates to it:

module LlmDefaults
  DEFAULT_SONNET_MODEL = AiModelConstants.id(:anthropic_sonnet) # claude-sonnet-4-6
  DEFAULT_OPUS_MODEL   = AiModelConstants.id(:anthropic_opus)   # claude-opus-4-8
  DEFAULT_HAIKU_MODEL  = AiModelConstants.id(:anthropic_haiku)  # claude-haiku-4-5-20251001
end

All model references in ChatService::MODELS use these constants. Never hardcode model IDs elsewhere. WeeklyLlmModelSyncWorker syncs the live provider registry weekly and emails a report when a pinned default has fallen behind.

RubyLLM Provider Keys

Provider	Env Variable
Anthropic	`ANTHROPIC_API_KEY`
OpenAI	`OPENAI_API_KEY`
Google	`GEMINI_API_KEY`

Database Services

Service Key	Label	Connection Class
`postgres_production`	App DB	`ActiveRecord::Base`
`postgres_versions`	Versions DB	`VersionsDb::Base`

File Index

File	Purpose
`app/agents/assistant/sunny_agent.rb`	RubyLLM Agent: factory for `AssistantConversation` records
`app/prompts/assistant/sunny_agent/instructions.txt.erb`	Base system prompt template (identity, date macros, guidelines)
`app/controllers/crm/assistant_chat_controller.rb`	HTTP entry point, user context, tool service selection
`app/controllers/crm/assistant_brain_controller.rb`	CRM CRUD for brain entries; access-gated via CanCan
`app/workers/assistant_chat_worker.rb`	Sidekiq job: LLM loop, streaming, response formatting
`app/services/assistant/chat_service.rb`	Model selection, prompt assembly (incl. brain_prompt), thinking, tool registration
`app/services/assistant/chat_tool_builder.rb`	Builds `RubyLLM::Tool` instances (content, DB, blog, brain tools)
`app/services/assistant/blog_tool_builder.rb`	Blog create/update/get/embed tools + tagging helpers
`app/services/assistant/sql_broker.rb`	SQL execution: read-only, access control, redaction, audit
`app/services/assistant/data_policy.rb`	Object + column access rules, PII redaction
`app/services/assistant/data_domain_policy.rb`	Role → domain → allowed objects resolution
`app/services/assistant/comment_manifest.rb`	YAML manifest reader: domains, restricted columns, comments
`app/services/assistant/tool_loop_guard.rb`	Prevents repetitive tool call loops
`app/services/assistant/context_compactor.rb`	Sliding-window conversation summarisation
`app/services/assistant/cost_calculator.rb`	Per-turn and per-conversation cost from token counts
`app/services/assistant/response_formatter.rb`	Markdown → HTML with tables, SQL collapse, entity links
`app/models/assistant_conversation.rb`	Conversation record (`acts_as_chat`) with token tracking
`app/models/assistant_message.rb`	Per-message record (`acts_as_message`) with token tracking
`app/models/assistant_tool_call.rb`	Tool invocation record (`acts_as_tool_call`)
`app/models/assistant_conversation_share.rb`	Conversation sharing (viewer / collaborator)
`app/models/assistant_brain_entry.rb`	Learned rules model; embeddable + auditable
`app/models/content_embedding.rb`	Polymorphic vector store (`has_neighbors`); partitioned by type
`app/concerns/models/embeddable.rb`	Concern: `queue_embedding_generation` via Rails Event Store
`app/events/events/content_embedding_required.rb`	Pub/sub event: signals an embedding regeneration is needed
`app/subscribers/embedding/content_changed_handler.rb`	Async handler: enqueues `EmbeddingWorker` on `ai_embeddings` queue
`app/workers/embedding_worker.rb`	Calls OpenAI `text-embedding-3-small`; upserts to `content_embeddings`
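`config/initializers/ai_model_constants.rb`	Canonical registry of model IDs (`AiModelConstants`)
`config/initializers/llm_defaults.rb`	Anthropic model ID constants (`LlmDefaults`); backward-compatibility shim that delegates to `AiModelConstants`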
`config/analytics/data_domains.yml`	Domain descriptions + role mappings
`db/comments/*.yml`	Per-table manifests with domain, comments, restricted flags
`db/data/brain_entry_embeddings_seed.json`	Pre-computed embeddings for initial brain entries (avoids API costs on deploy)

Access Control

Who can use Sunny

All CRM accounts. Tool services available vary by role:

Tool service	Non-admin	Admin / Manager
`content`	Yes	Yes
`postgres_production`	Domain-restricted	Full
`postgres_versions` (audit DB)	No	Yes
`blog_management`	No	Yes

Who can manage the Brain

The can :manage, AssistantBrainEntry CanCan ability is granted to:

Any account where is_manager? returns true (inside the shared manager ability block in ability.rb)
Any account with the sunny_administrator role (can be assigned to non-managers, e.g. content specialists)

All CRM views (index.html.erb, histories.html.erb) use can?(:manage, AssistantBrainEntry) to conditionally render the “Sunny Brain” button. The controller uses the same ability check via require_sunny_admin!.

Testing

Test File	Coverage
`test/agents/assistant/sunny_agent_create_test.rb`	`create!` returns persisted conversation; no `llm_model`; no system messages at factory time; title defaults and custom title
`test/agents/assistant/sunny_agent_find_test.rb`	`find` returns existing conversation without mutating it; no `llm_model`; no system messages
`test/integration/sunny_blog_editor_end_to_end_test.rb`	Julia-style blog tool flows (conv 1565); assertion helpers in `test/support/sunny_blog_editor_julia_flow_helpers.rb`
`test/models/assistant_conversation_test.rb`	`to_llm` nil guard, `track_query!`, processing lock, scopes
`test/models/assistant_brain_entry_test.rb`	Scopes, status transitions, `for_services` filtering
`test/services/assistant/chat_service_test.rb`	`MODELS` registry, `auto_select_model`, `estimate_tokens`, base prompt, brain prompt injection
`test/workers/assistant_chat_worker_test.rb`	`broadcast_error` mappings, graceful error handling
`test/initializers/llm_defaults_test.rb`	Model IDs valid, Anthropic pattern, no future dates, RubyLLM resolution

Run AI assistant tests:

mise exec -- bin/rails test test/agents/ \
  test/models/assistant_conversation_test.rb \
  test/models/assistant_brain_entry_test.rb \
  test/services/assistant/chat_service_test.rb \
  test/workers/assistant_chat_worker_test.rb \
  test/initializers/llm_defaults_test.rb

Migration History (Sunny features)

Migration	Description
`20260227100000_create_assistant_brain_entries`	Creates `assistant_brain_entries` table
`20260227110000_add_scope_to_brain_entries`	Adds `scope`, `user_id`, `applies_to_services` columns
`20260227140000_add_sunny_administrator_role`	Creates `sunny_administrator` role in `roles` table
`20260228100001_seed_assistant_brain_entries`	Seeds 14 initial brain entries (stub class pattern)
`20260228160000_add_brain_entry_embeddings`	Creates `content_embeddings` partitioned table + `AssistantBrainEntry` partition; seeds pre-computed vectors
`20260228170000_seed_brain_entry_embeddings`	Inserts pre-computed embedding vectors for the 14 initial entries
`20260228190000_add_blog_tagging_brain_entry`	Seeds two-tier tagging brain entry; queues embedding via `after_commit`

Migration pattern: Data migrations that create AssistantBrainEntry records during a deploy sequence must use a local stub class (class BrainEntry < ApplicationRecord; self.table_name = 'assistant_brain_entries'; end) if the migration runs before all schema migrations are applied. This avoids NoMethodError from model validations referencing not-yet-existing columns. The final tagging entry (20260228190000) intentionally uses the real model so the after_commit embedding callback fires.
