Module: Assistant::BlockAddressedEditor

Defined in:
app/services/assistant/block_addressed_editor.rb

Overview

Stage 2 of the Sunny blog editor fix: replace fragile HTML find/replace
with stable block-ID addressing, Kadous's "edit trick" adapted for HTML.

The LLM sees a block_index like (IDs are shortened here for readability; real IDs carry 8 hex characters, per BLOCK_ID_PATTERN):
b_a3f2 Real estate in Los Angeles...
b_b91c The Project
b_c44d

And emits ordered operations by block_id:
replace_block(block_id: "b_a3f2", html: "...")
delete_block(block_id: "b_b91c")
insert_after(block_id: "b_c44d", html: "...")
insert_before(block_id: "b_a3f2", html: "...")
move_block(block_id: "b_c44d", after: "b_a3f2")
move_block(block_id: "b_c44d", before: "b_a3f2")
update_attr(block_id: "b_c44d", attr: "data-id", value: "10511")

All addressing is by ID, never by string match — eliminating the ~40%
patch-failure rate caused by whitespace/attribute drift in patch_blog_post.
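The operation list above can be sketched as a plain Ruby payload. The constants below are copied from the Constant Summary; preflight_error is a hypothetical helper, not part of this module, and the block IDs are invented for illustration. It only mirrors the two format checks that apply_ops performs before it looks a block up.

```ruby
# Constants copied from this module's Constant Summary.
BLOCK_ID_PATTERN = /\Ab_[a-f0-9]{8}\z/
VALID_OPS = %w[replace_block delete_block insert_after insert_before move_block update_attr].freeze

# Hypothetical pre-flight check (not part of the module): returns nil when
# the op is well-formed, otherwise a short reason string.
def preflight_error(op)
  name = (op[:op] || op['op']).to_s
  id   = (op[:block_id] || op['block_id']).to_s
  return "unknown op #{name.inspect}" unless VALID_OPS.include?(name)
  return "malformed block_id #{id.inspect}" unless BLOCK_ID_PATTERN.match?(id)

  nil
end

# Illustrative payload; symbol and string keys are both accepted.
ops = [
  { op: 'replace_block', block_id: 'b_a3f2c01d', html: '<p>Updated intro.</p>' },
  { 'op' => 'delete_block', 'block_id' => 'b_b91c77e0' },
  { op: 'update_attr', block_id: 'b_c44d', attr: 'data-id', value: '10511' }, # ID too short
  { op: 'rename_block', block_id: 'b_a3f2c01d' }                              # not in VALID_OPS
]

errors = ops.map { |op| preflight_error(op) }
```

A nil entry in errors means the op passed both format checks; it says nothing about whether the block actually exists in the document.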

Constant Summary

BLOCK_ID_ATTR =
'data-block-id'
BLOCK_ID_PATTERN =
/\Ab_[a-f0-9]{8}\z/
VALID_OPS =
%w[replace_block delete_block insert_after insert_before move_block update_attr].freeze
ADDRESSABLE_TAGS =

Tags eligible to receive a block_id. Inline tags and whitespace text
nodes are skipped. We intentionally include generic container tags
(such as div and section) because some legacy posts wrap content in those.

%w[
  p h1 h2 h3 h4 h5 h6 ul ol blockquote figure pre table
  div section aside article hr dl
].to_set.freeze
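A minimal sketch of how a tag allow-list like this might be consulted. The addressable? predicate is hypothetical, and the case-insensitive comparison is an assumption about how tag names arrive from the parser.

```ruby
require 'set'

# Copied from the constant above.
ADDRESSABLE_TAGS = %w[
  p h1 h2 h3 h4 h5 h6 ul ol blockquote figure pre table
  div section aside article hr dl
].to_set.freeze

# Hypothetical predicate (not part of the module). Downcasing is an
# assumption; a parser may already normalize tag names.
def addressable?(tag_name)
  ADDRESSABLE_TAGS.include?(tag_name.to_s.downcase)
end

# Inline tags such as span and br are filtered out.
top_level = %w[h2 p span DIV br hr]
eligible = top_level.select { |tag| addressable?(tag) }
```

Storing the list as a frozen Set keeps each membership check constant-time and prevents accidental mutation.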

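Class Method Summary

  • .apply_ops(html, ops, on_op: nil) ⇒ Hash

    Apply ordered block-ID operations to HTML.

  • .assign_ids!(html) ⇒ String

    Parse HTML, assign a stable b_<8hex> data-block-id to every top-level
    child that doesn't already have one, and return the serialized HTML.

  • .block_index(html) ⇒ Array<Hash>

    Build a compact index of top-level blocks with a short preview.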

Class Method Details

.apply_ops(html, ops, on_op: nil) ⇒ Hash

Apply ordered block-ID operations to HTML.

Parameters:

  • html (String)

    Source HTML (must already have block IDs assigned).

  • ops (Array<Hash>)

    Ordered operations.

  • on_op (Proc, nil) (defaults to: nil)

    Optional callback invoked per op with
    { op:, block_id:, status:, preview: } — used by Stage 9 streaming.

Returns:

  • (Hash)

    { html:, op_results: [...] }
    Each op_result: { index:, op:, block_id:, status:, detail: (optional) }
    status: "applied" | "not_found" | "invalid" | "duplicate_id"
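One way a caller might consume the return value, sketched under assumptions: the result hash below is hand-written in the documented shape rather than produced by apply_ops, and partition_results is a hypothetical helper.

```ruby
# Sample return value in the documented shape; all values are illustrative.
result = {
  html: '<p data-block-id="b_a3f2c01d">Updated intro.</p>',
  op_results: [
    { index: 0, op: 'replace_block', block_id: 'b_a3f2c01d', status: 'applied' },
    { index: 1, op: 'delete_block', block_id: 'b_deadbeef', status: 'not_found',
      detail: 'No block with data-block-id=b_deadbeef found in content.' },
    { index: 2, op: 'rename_block', block_id: 'b_a3f2c01d', status: 'invalid',
      detail: 'Unknown op: "rename_block".' }
  ]
}

# Hypothetical helper: split results into applied and failed entries.
def partition_results(op_results)
  applied, failed = op_results.partition { |r| r[:status] == 'applied' }
  { applied: applied, failed: failed }
end

summary = partition_results(result[:op_results])

# Short per-op messages that could be fed back to the model for a retry.
feedback = summary[:failed].map { |r| "op #{r[:index]} (#{r[:op]}): #{r[:status]}" }
```

Because failed ops are reported per entry rather than raised, a caller can keep the edits that applied and ask the model to retry only the rest.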



# File 'app/services/assistant/block_addressed_editor.rb', line 90

def apply_ops(html, ops, on_op: nil)
  ops_array = Array(ops)
  return { html: html.to_s, op_results: [] } if ops_array.empty?

  fragment = parse_fragment(html.to_s)
  valid_ids = collect_block_ids(fragment)
  results = []

  ops_array.each_with_index do |op, idx|
    op_name = (op[:op] || op['op']).to_s
    block_id = (op[:block_id] || op['block_id']).to_s
    entry = { index: idx, op: op_name, block_id: block_id }

    unless VALID_OPS.include?(op_name)
      entry[:status] = 'invalid'
      entry[:detail] = "Unknown op: #{op_name.inspect}. Valid ops: #{VALID_OPS.join(', ')}"
      results << entry
      notify(on_op, entry, fragment, nil)
      next
    end

    unless BLOCK_ID_PATTERN.match?(block_id)
      entry[:status] = 'invalid'
      entry[:detail] = "Invalid block_id format: #{block_id.inspect}. Must match /b_[a-f0-9]{8}/. " \
                       "Re-call get_blog_post and copy the block_id verbatim — never invent one."
      assign_did_you_mean!(entry, block_id, valid_ids)
      results << entry
      notify(on_op, entry, fragment, nil)
      next
    end

    target = find_block(fragment, block_id)
    unless target
      entry[:status] = 'not_found'
      entry[:detail] = "No block with data-block-id=#{block_id} found in content. " \
                       "Re-call get_blog_post to refresh the block_index — IDs change after every edit."
      assign_did_you_mean!(entry, block_id, valid_ids)
      results << entry
      notify(on_op, entry, fragment, nil)
      next
    end

    preview_node = nil
    begin
      preview_node = apply_single_op(fragment, target, op_name, op, entry)
    rescue ArgumentError => e
      entry[:status] = 'invalid'
      entry[:detail] = e.message
    end

    results << entry
    notify(on_op, entry, fragment, preview_node)
  end

  { html: serialize(fragment), op_results: results }
end
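The listing calls assign_did_you_mean!, whose body is not shown on this page. As a rough illustration of what a suggestion step could look like, the sketch below ranks known IDs by shared leading characters. Both suggest_block_id and shared_prefix_length are hypothetical names, and the prefix heuristic is an assumption, not the module's actual logic.

```ruby
# Hypothetical: number of leading characters two strings have in common.
def shared_prefix_length(a, b)
  a.chars.zip(b.chars).take_while { |x, y| x == y }.length
end

# Hypothetical suggestion helper, NOT the module's assign_did_you_mean!.
# Suggests the known ID that shares the longest prefix with the rejected
# one, provided at least `min` characters match after the "b_" prefix.
def suggest_block_id(bad_id, valid_ids, min: 4)
  body = bad_id.to_s.sub(/\Ab_/, '')
  best = valid_ids.max_by { |id| shared_prefix_length(body, id.sub(/\Ab_/, '')) }
  return nil if best.nil?

  shared_prefix_length(body, best.sub(/\Ab_/, '')) >= min ? best : nil
end

valid_ids = %w[b_a3f2c01d b_b91c77e0 b_c44d0a12]
truncated = suggest_block_id('b_c44d', valid_ids)     # model dropped the tail
unrelated = suggest_block_id('b_ffff0000', valid_ids) # nothing close enough
```

A prefix match targets the case where a model truncates an ID it was told to copy verbatim; an edit-distance metric would be a reasonable alternative for transposed characters.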

.assign_ids!(html) ⇒ String

Parse HTML, assign a stable b_<8hex> data-block-id to every top-level
child that doesn't already have one, and return the serialized HTML.
Idempotent — existing valid IDs are preserved.

Parameters:

  • html (String)

    The blog post body HTML.

Returns:

  • (String)

    HTML with data-block-id on every top-level block.



# File 'app/services/assistant/block_addressed_editor.rb', line 47

def assign_ids!(html)
  return '' if html.nil? || html.to_s.strip.empty?

  fragment = parse_fragment(html.to_s)
  top_level_blocks(fragment).each do |node|
    existing = node['data-block-id'].to_s
    next if BLOCK_ID_PATTERN.match?(existing)

    node[BLOCK_ID_ATTR] = new_block_id
  end
  serialize(fragment)
end
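The idempotence claim can be illustrated without an HTML parser. In the sketch below, nodes are plain attribute hashes standing in for parsed elements, assign_ids_to! is a hypothetical stand-in for assign_ids!, and the new_block_id body is an assumption (the real generator is not shown on this page).

```ruby
require 'securerandom'

BLOCK_ID_ATTR = 'data-block-id'
BLOCK_ID_PATTERN = /\Ab_[a-f0-9]{8}\z/

# Assumed generator: SecureRandom.hex(4) returns 8 lowercase hex characters.
# This sketch does not check for collisions with IDs already in use.
def new_block_id
  "b_#{SecureRandom.hex(4)}"
end

# Hypothetical stand-in for assign_ids!: each node is an attribute hash.
def assign_ids_to!(nodes)
  nodes.each do |attrs|
    next if BLOCK_ID_PATTERN.match?(attrs[BLOCK_ID_ATTR].to_s)

    attrs[BLOCK_ID_ATTR] = new_block_id
  end
end

nodes = [
  { 'data-block-id' => 'b_a3f2c01d' }, # valid, kept as-is
  { 'data-block-id' => 'b_XYZ' },      # malformed, replaced
  {}                                   # missing, assigned
]

first_pass  = assign_ids_to!(nodes).map { |n| n[BLOCK_ID_ATTR] }
second_pass = assign_ids_to!(nodes).map { |n| n[BLOCK_ID_ATTR] }
```

After the first pass every node carries a well-formed ID, so the second pass changes nothing.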

.block_index(html) ⇒ Array<Hash>

Build a compact index of top-level blocks with a short preview.
This is what the LLM consumes to choose block_ids to target —
much smaller than the 40k-char truncated full body.

Parameters:

  • html (String)

    Serialized HTML (already passed through assign_ids!).

Returns:

  • (Array<Hash>)

    [{ block_id:, tag:, preview:, kind: (optional), embed_id: (optional) }]



# File 'app/services/assistant/block_addressed_editor.rb', line 66

def block_index(html)
  return [] if html.nil? || html.to_s.strip.empty?

  fragment = parse_fragment(html.to_s)
  top_level_blocks(fragment).map do |node|
    entry = {
      block_id: node[BLOCK_ID_ATTR].to_s,
      tag: node.name
    }
    entry[:preview] = build_preview(node)
    decorate_embed_metadata!(entry, node)
    entry
  end
end
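A sketch of how the returned entries might be flattened into the one-line-per-block listing shown in the Overview. The format_index helper, the exact line layout, and the kind value 'embed' are assumptions; the entries are hand-written in the documented shape.

```ruby
# Illustrative entries in the documented shape; IDs and previews are made up.
index = [
  { block_id: 'b_a3f2c01d', tag: 'p',   preview: 'Real estate in Los Angeles...' },
  { block_id: 'b_b91c77e0', tag: 'h2',  preview: 'The Project' },
  { block_id: 'b_c44d0a12', tag: 'div', preview: '', kind: 'embed', embed_id: '10511' }
]

# Hypothetical formatter: one line per block, ID first so it is easy to copy.
def format_index(entries)
  entries.map do |e|
    parts = [e[:block_id], "<#{e[:tag]}>"]
    parts << "[#{e[:kind]} #{e[:embed_id]}]" if e[:kind]
    parts << e[:preview] unless e[:preview].to_s.empty?
    parts.join(' ')
  end.join("\n")
end

listing = format_index(index)
```

Putting the block_id first on every line keeps the ID and its preview visually paired, which may help the model copy IDs verbatim.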