Class: ImageDuplicateFinderWorker

Inherits:
Object
  • Object
show all
Includes:
Sidekiq::Worker, Workers::StatusBroadcastable
Defined in:
app/workers/image_duplicate_finder_worker.rb

Overview

Background worker that finds duplicate images using pHash fingerprints.
Results are stored in the image_duplicate_pairs table for persistence.

Uses PostgreSQL bit_count() for efficient Hamming distance calculation,
which is O(n) instead of O(n²) for in-memory comparisons.

Examples:

Queue duplicate detection

ImageDuplicateFinderWorker.perform_async(threshold: 10)

Run incrementally (only new images)

ImageDuplicateFinderWorker.perform_async(threshold: 10, incremental: true)

Instance Attribute Summary

Attributes included from Workers::StatusBroadcastable

#broadcast_status_updates

Instance Method Summary collapse

Methods included from Workers::StatusBroadcastable::Overrides

#at, #store, #total

Instance Method Details

#perform(options = {}) ⇒ Object

Parameters:

  • options (Hash) (defaults to: {})

    Options

Options Hash (options):

  • :threshold (Integer)

    Maximum Hamming distance (default: 10)

  • :incremental (Boolean)

    Only check new images (default: false)



25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
# File 'app/workers/image_duplicate_finder_worker.rb', line 25

def perform(options = {})
  options = options.with_indifferent_access
  threshold = (options[:threshold] || 10).to_i
  incremental = options[:incremental].to_s == 'true'
  redirect_to = options[:redirect_to]

  store started_at: Time.current.iso8601
  store threshold: threshold
  store incremental: incremental
  store redirect_to: redirect_to if redirect_to.present?
  store message: 'Scanning for duplicate images...'

  pairs_found = find_and_store_duplicates(threshold: threshold, incremental: incremental)

  store completed_at: Time.current.iso8601
  store pairs_found: pairs_found
  store message: "Found #{pairs_found} duplicate pairs"
  store info_message: "Duplicate detection complete. Found #{pairs_found} pairs with threshold ≤ #{threshold}."

  Rails.logger.info "[ImageDuplicateFinderWorker] Found #{pairs_found} duplicate pairs with threshold #{threshold}"
end