Module: Models::HybridSearchable

Extended by: ActiveSupport::Concern
Included in: Image, Publication, Video
Defined in: app/concerns/models/hybrid_searchable.rb

Overview

Shared Reciprocal Rank Fusion (RRF) helper for hybrid search scopes.

Combines two ranked ID lists (e.g. AI embedding + keyword search) into a
single ordered ActiveRecord::Relation using RRF scoring:
score(id) = Σ 1/(k + rank) for each list that contains the id,
where rank is the id's 0-based position in that list

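Items appearing in both lists receive a contribution from each and so
generally rank higher; items in only one list still appear but usually
rank lower. The ordering is not strict: a top hit in a single list can
outscore an item that sits far down both lists.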
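
As a worked illustration of the formula above, the following minimal sketch scores two small ID lists in plain Ruby, without ActiveRecord. The helper name `rrf_scores` and the sample IDs are hypothetical and exist only for this example; ranks are 0-based, matching the method source further down.

```ruby
# Hypothetical helper for illustration only; not part of Models::HybridSearchable.
def rrf_scores(ai_ids, keyword_ids, k: 60)
  ai_rank = ai_ids.each_with_index.to_h       # id => 0-based position
  kw_rank = keyword_ids.each_with_index.to_h

  (ai_ids | keyword_ids).to_h do |id|
    score = 0.0
    score += 1.0 / (k + ai_rank[id]) if ai_rank.key?(id)
    score += 1.0 / (k + kw_rank[id]) if kw_rank.key?(id)
    [id, score]
  end
end

scores = rrf_scores([10, 20, 30], [20, 40])
ranked = scores.sort_by { |_, s| -s }.map(&:first)
# ranked => [20, 10, 40, 30]
```

ID 20 is the only one present in both lists, so it receives 1/61 + 1/60 and ranks first; the remaining IDs are ordered by their single contribution.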

Examples:

class Image < ApplicationRecord
  include Models::HybridSearchable

  scope :hybrid_search, ->(query, limit: 500) {
    ai_ids = ai_search(query, limit: limit).pluck(:id)
    # The rescue modifier falls back to an empty list if keyword search raises.
    kw_ids = keyword_search(query).limit(limit).pluck(:id) rescue []
    rrf_ranked_relation(ai_ids, kw_ids, limit: limit)
  }
end

Class Method Details

.rrf_ranked_relation(ai_ids, keyword_ids, limit:, k: 60) ⇒ ActiveRecord::Relation

Build a relation filtered to the union of two ranked ID lists,
ordered by Reciprocal Rank Fusion score (highest first).

Parameters:

  • ai_ids (Array<Integer>)

    IDs from semantic/AI search, ordered by relevance

  • keyword_ids (Array<Integer>)

    IDs from keyword/text search, ordered by relevance

  • limit (Integer)

    Maximum results to return

  • k (Integer) (defaults to: 60)

    RRF smoothing constant; larger values shrink the score gap between adjacent ranks

Returns:

  • (ActiveRecord::Relation)
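
To make the role of `k` more concrete, here is a small sketch that compares how strongly the first position outweighs the second for two values of `k`. The helper `rrf_term` is hypothetical and simply evaluates one term of the formula from the overview, using 0-based ranks.

```ruby
# Hypothetical helper: the contribution of one list at a given 0-based rank.
def rrf_term(rank, k: 60)
  1.0 / (k + rank)
end

# How much more the top position is worth than the second position.
ratio_default = rrf_term(0) / rrf_term(1)                 # 61/60, about 1.017
ratio_small_k = rrf_term(0, k: 1) / rrf_term(1, k: 1)     # 2.0
```

With the default `k` of 60, neighbouring ranks differ by under 2%, so agreement between the two lists tends to matter more than the exact position in either one. A small `k` makes the top few positions dominate instead.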


# File 'app/concerns/models/hybrid_searchable.rb', line 35

def rrf_ranked_relation(ai_ids, keyword_ids, limit:, k: 60)
  # Union of both lists with duplicates removed.
  combined_ids = (ai_ids | keyword_ids)
  return none if combined_ids.empty?

  # Map each id to its 0-based position in its list.
  ai_rank = ai_ids.each_with_index.to_h
  kw_rank = keyword_ids.each_with_index.to_h

  # Sum 1/(k + rank) over every list containing the id; highest score first.
  scored = combined_ids.map { |id|
    rrf = 0.0
    rrf += 1.0 / (k + ai_rank[id]) if ai_rank.key?(id)
    rrf += 1.0 / (k + kw_rank[id]) if kw_rank.key?(id)
    [id, rrf]
  }.sort_by { |_, s| -s }

  # to_i keeps the ids safe to interpolate into the raw SQL below.
  ordered_ids = scored.map { |id, _| id.to_i }.first(limit)

  # WHERE id IN (...) does not preserve order, so a CASE expression pins it.
  where(id: ordered_ids).order(
    Arel.sql(
      "CASE #{table_name}.id " +
      ordered_ids.each_with_index.map { |id, i| "WHEN #{id} THEN #{i}" }.join(' ') +
      " ELSE #{ordered_ids.size} END"
    )
  )
end
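
The ordering step can be looked at in isolation. The sketch below rebuilds the same CASE expression as a plain string so the generated SQL is visible. The helper `case_order_sql` is hypothetical, and the table name `images` is an assumption based on Rails naming conventions for the `Image` model.

```ruby
# Hypothetical standalone helper; `table` stands in for the model's table_name.
def case_order_sql(table, ordered_ids)
  whens = ordered_ids.each_with_index.map { |id, i| "WHEN #{id.to_i} THEN #{i}" }
  "CASE #{table}.id #{whens.join(' ')} ELSE #{ordered_ids.size} END"
end

sql = case_order_sql("images", [20, 10, 40])
# sql => "CASE images.id WHEN 20 THEN 0 WHEN 10 THEN 1 WHEN 40 THEN 2 ELSE 3 END"
```

Each ID maps to its position in the fused ranking, so sorting ascending on this expression returns rows in RRF order. The `ELSE` branch is a fallback that sorts any unlisted row last, although the accompanying `where(id: ordered_ids)` filter should already exclude such rows.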