Module: ContentEmbedding::TextSearchable
- Extended by: ActiveSupport::Concern
- Included in: ActivityEmbedding, ArticleEmbedding, CallRecordEmbedding, ItemEmbedding, PostEmbedding, ProductLineEmbedding, ReviewsIoEmbedding, ShowcaseEmbedding, SiteMapEmbedding, VideoEmbedding
- Defined in: app/models/concerns/content_embedding/text_searchable.rb
Overview
Concern providing semantic search for text-based content embeddings.
Uses OpenAI's text-embedding-3-small model to generate query embeddings.
Include this in partition models that use text embeddings:
- PostEmbedding, ArticleEmbedding, ShowcaseEmbedding
- VideoEmbedding, SiteMapEmbedding, ItemEmbedding
- CallRecordEmbedding, ProductLineEmbedding, ReviewsIoEmbedding
Constant Summary
- EMBEDDING_MODEL = 'text-embedding-3-small'
  OpenAI text embedding model for text content
- SIMILARITY_THRESHOLD = 0.0
  Minimum similarity threshold (0 = no filtering)
Class Method Summary
- .apply_locale_filter(scope, locale) ⇒ ActiveRecord::Relation
  Apply locale filter using the partition's table name.
- .apply_published_filter(scope) ⇒ ActiveRecord::Relation
  Override in subclass to apply model-specific published filter.
- .generate_text_query_embedding(query) ⇒ Array<Float>?
  Generate query embedding using OpenAI.
- .locale_filtered? ⇒ Boolean
  Override in partition models that have locale-specific content.
- .semantic_search(query, limit: 10, locale: 'en', published_only: true, min_similarity: SIMILARITY_THRESHOLD) ⇒ ActiveRecord::Relation
  Semantic search within this partition's content type.
Class Method Details
.apply_locale_filter(scope, locale) ⇒ ActiveRecord::Relation
Apply locale filter using the partition's table name.
Only called when locale_filtered? returns true.
    # File 'app/models/concerns/content_embedding/text_searchable.rb', line 82
    def apply_locale_filter(scope, locale)
      locale_str = locale.to_s
      if locale_str.include?('-')
        # Exact match for regional locales (en-US, en-CA, fr-CA)
        scope.where(locale: locale_str)
      else
        # Base locale matches itself and all regional variants
        scope.where("#{table_name}.locale = ? OR #{table_name}.locale LIKE ?", locale_str, "#{locale_str}-%")
      end
    end
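The matching rule can be illustrated without a database. The following is a minimal plain-Ruby sketch, assuming a hypothetical helper `locale_match?` that is not part of the module; it only mirrors the two SQL conditions (exact equality for regional locales, equality or a `<base>-` prefix for base locales).

```ruby
# Hypothetical helper (not part of TextSearchable): applies the same rule as
# the SQL built by apply_locale_filter, but to plain strings.
def locale_match?(stored, requested)
  requested = requested.to_s
  if requested.include?('-')
    # Regional locale: exact match only
    stored == requested
  else
    # Base locale: itself or any regional variant ("en" matches "en-US")
    stored == requested || stored.start_with?("#{requested}-")
  end
end

stored_locales = %w[en en-US en-CA fr fr-CA]
stored_locales.select { |l| locale_match?(l, :en) }      # base locale query
stored_locales.select { |l| locale_match?(l, 'fr-CA') }  # regional locale query
```

Note that a base locale such as `en` deliberately widens the search to every regional variant, while a regional locale such as `fr-CA` does not fall back to `fr`.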
.apply_published_filter(scope) ⇒ ActiveRecord::Relation
Override in subclass to apply model-specific published filter.
    # File 'app/models/concerns/content_embedding/text_searchable.rb', line 116
    def apply_published_filter(scope)
      scope # Default: no filtering (override in subclass)
    end
.generate_text_query_embedding(query) ⇒ Array<Float>?
Generate query embedding using OpenAI. Results are cached for 24 hours under a key derived from the model name and the normalized query. Returns nil if embedding generation fails.
    # File 'app/models/concerns/content_embedding/text_searchable.rb', line 97
    def generate_text_query_embedding(query)
      cache_key = "query_embedding:#{EMBEDDING_MODEL}:#{Digest::SHA256.hexdigest(query.downcase.strip)[0..15]}"
      cached = Rails.cache.read(cache_key)
      return cached if cached.present?

      result = RubyLLM.embed(query, model: EMBEDDING_MODEL, provider: :openai, assume_model_exists: true)
      vector = result.vectors
      Rails.cache.write(cache_key, vector, expires_in: 24.hours) if vector.present?
      vector
    rescue StandardError => e
      Rails.logger.error "[#{name}] Failed to generate query embedding: #{e.message}"
      nil
    end
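The cache key normalizes the query (downcase, strip) before hashing, so queries that differ only in case or surrounding whitespace share one cached embedding. A minimal sketch of that derivation, assuming a hypothetical helper `query_cache_key` and a local `MODEL` constant standing in for EMBEDDING_MODEL:

```ruby
require 'digest'

# Assumed stand-in for the EMBEDDING_MODEL constant documented above.
MODEL = 'text-embedding-3-small'

# Hypothetical helper: builds the key the same way the method body does.
# Only the first 16 hex characters (64 bits) of the SHA-256 digest are kept.
def query_cache_key(query, model = MODEL)
  normalized = query.downcase.strip
  "query_embedding:#{model}:#{Digest::SHA256.hexdigest(normalized)[0..15]}"
end

key_a = query_cache_key('  Heated Floors ')
key_b = query_cache_key('heated floors')
key_a == key_b # both queries normalize to the same string, so the keys match
```

Because the model name is part of the key, switching to a different embedding model would not reuse vectors cached for the old one.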
.locale_filtered? ⇒ Boolean
Override in partition models that have locale-specific content.
Default is false - most content is English-only.
    # File 'app/models/concerns/content_embedding/text_searchable.rb', line 72
    def locale_filtered?
      false
    end
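Both `locale_filtered?` and `apply_published_filter` are override hooks with permissive defaults. The sketch below shows the override pattern in plain Ruby so it can run without Rails. Everything in it is an assumption for illustration: the `*Sketch` names are hypothetical, a plain `included` hook stands in for ActiveSupport::Concern, an Array of hashes stands in for an ActiveRecord::Relation, and the `:published` key is invented.

```ruby
# Plain-Ruby stand-in for the concern: class methods with overridable defaults.
module TextSearchableSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Default: content is not locale-specific
    def locale_filtered?
      false
    end

    # Default: no published filtering
    def apply_published_filter(scope)
      scope
    end
  end
end

# A partition that keeps the defaults
class PostEmbeddingSketch
  include TextSearchableSketch
end

# A partition that opts in to locale filtering and filters unpublished rows
class SiteMapEmbeddingSketch
  include TextSearchableSketch

  def self.locale_filtered?
    true
  end

  def self.apply_published_filter(scope)
    scope.select { |row| row[:published] }
  end
end

rows = [{ id: 1, published: true }, { id: 2, published: false }]
PostEmbeddingSketch.apply_published_filter(rows)    # returns rows unchanged
SiteMapEmbeddingSketch.apply_published_filter(rows) # keeps only id 1
```

Methods defined directly on the class take precedence over those added by `extend`, which is why the overrides win over the defaults.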
.semantic_search(query, limit: 10, locale: 'en', published_only: true, min_similarity: SIMILARITY_THRESHOLD) ⇒ ActiveRecord::Relation
Semantic search within this partition's content type. Returns an empty relation when the query is blank or the query embedding cannot be generated.
    # File 'app/models/concerns/content_embedding/text_searchable.rb', line 39
    def semantic_search(query, limit: 10, locale: 'en', published_only: true, min_similarity: SIMILARITY_THRESHOLD)
      return none if query.blank?

      query_embedding = generate_text_query_embedding(query)
      return none unless query_embedding

      # Build query using nearest_neighbors
      scope = primary_content
              .… # chained call elided in the source
              .nearest_neighbors(:embedding, query_embedding, distance: :cosine)

      # Apply locale filter only for models with locale-specific content (e.g., SiteMap)
      scope = apply_locale_filter(scope, locale) if locale_filtered?

      # Apply published filter if model supports it
      scope = apply_published_filter(scope) if published_only

      # Apply similarity threshold if specified
      if min_similarity.positive?
        max_distance = 1.0 - min_similarity
        vector_literal = "[#{query_embedding.join(',')}]"
        scope = scope.where(
          sanitize_sql_array(['embedding <=> ?::vector <= ?', vector_literal, max_distance])
        )
      end

      scope.limit(limit).includes(:embeddable)
    end
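The threshold step converts a minimum cosine similarity into a maximum cosine distance (`1.0 - min_similarity`), since the `<=>` operator in pgvector returns cosine distance. A minimal plain-Ruby sketch of that conversion follows; `cosine_distance` and `within_threshold?` are hypothetical helpers written for illustration, not part of the module.

```ruby
# Cosine distance, defined as 1 - cosine similarity.
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / norm
end

# Hypothetical helper mirroring the threshold branch of semantic_search.
def within_threshold?(query_vec, candidate_vec, min_similarity)
  return true unless min_similarity.positive? # 0 = no filtering

  cosine_distance(query_vec, candidate_vec) <= 1.0 - min_similarity
end

query = [1.0, 0.0]
within_threshold?(query, [1.0, 0.0], 0.8) # same direction, distance 0.0, passes
within_threshold?(query, [0.0, 1.0], 0.8) # orthogonal, distance 1.0, filtered out
within_threshold?(query, [0.0, 1.0], 0.0) # threshold disabled, passes

# The vector literal passed to the ?::vector cast is built the same way:
vector_literal = "[#{query.join(',')}]"
```

With the default SIMILARITY_THRESHOLD of 0.0 the `where` clause is skipped entirely, so results are limited only by `limit` and the nearest-neighbor ordering.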