Class: Assistant::PdfToolBuilder

Inherits:
Object
Defined in:
app/services/assistant/pdf_tool_builder.rb

Overview

Builds RubyLLM::Tool subclasses for the Sunny pdf_tools service — a curated
surface over Pdf::Toolkit (HexaPDF 1.9) for inspecting, editing, assembling,
and generating PDFs.

Provides 7 tools:
pdf_inspect — read structure, AcroForm fields, text preview
pdf_edit — overlay text / cover (redact) / image / watermark on pages
pdf_fill_form — set AcroForm field values
pdf_merge — concatenate several PDFs
pdf_pages — keep/reorder pages and/or rotate (extract, delete, reorder)
pdf_compress — reduce file size
pdf_generate — build a new branded PDF from a structured layout

I/O model
Input: source is either the id of an Upload attached to the current
conversation (the safe boundary: the user explicitly dropped it into this
chat) or a public http(s) URL. Arbitrary upload ids from other resources are
refused to avoid cross-record data exposure.
Output: each produced PDF is written as a new assistant_attachment Upload
linked to the conversation, and the tool returns a presigned download URL.
pdf_inspect returns metadata only; with stage_for_review set, pdf_edit and
pdf_generate return a studio URL instead.
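
The input boundary described above can be sketched as follows. This is only an illustration, not the builder's actual resolve_pdf_path: the helper names classify_source and resolve_source, the assumption that upload ids are integers, and the conversation_upload_ids argument are all hypothetical.

```ruby
require 'uri'

# Local stand-in for Assistant::PdfToolBuilder::InputError.
class InputError < StandardError; end

# Hypothetical helper: classify a `source` string.
# Assumes upload ids are plain integers (not confirmed by the source).
def classify_source(source)
  text = source.to_s.strip
  return [:upload, Integer(text, 10)] if text.match?(/\A\d+\z/)

  uri = URI.parse(text)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return [:url, uri] if uri.is_a?(URI::HTTP) && uri.host

  raise InputError, "unsupported source: #{text.inspect}"
rescue URI::InvalidURIError
  raise InputError, "unsupported source: #{text.inspect}"
end

# Hypothetical helper: refuse upload ids that are not attached to the
# current conversation, which is the cross-record boundary described above.
def resolve_source(source, conversation_upload_ids)
  kind, value = classify_source(source)
  if kind == :upload && !conversation_upload_ids.include?(value)
    raise InputError, "upload #{value} is not attached to this conversation"
  end
  [kind, value]
end
```

With this sketch, resolve_source('42', [42]) is accepted, while resolve_source('99', [42]) raises InputError.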

Usage (via ChatToolBuilder):
tools = Assistant::PdfToolBuilder.tools(audit_context: { conversation_id:, user_id: })

Defined Under Namespace

Classes: InputError

Constant Summary

MAX_BYTES =

Hard cap on a fetched or produced PDF, in bytes (60 MiB). Inputs above this are refused.

60 * 1024 * 1024
COMPRESS_LEVELS =

Friendly compression presets → Ghostscript pdf settings.

{
  'screen'   => '/screen',   # smallest, 72 dpi
  'ebook'    => '/ebook',    # balanced, 150 dpi
  'printer'  => '/printer',  # high quality, 300 dpi
  'prepress' => '/prepress'  # largest, color-preserving
}.freeze
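
A minimal sketch of how these two constants behave when used. The constants are copied from above; the helper names gs_setting and within_cap? are hypothetical, and treating a file of exactly MAX_BYTES as allowed is an assumption based on the wording "above this".

```ruby
MAX_BYTES = 60 * 1024 * 1024 # 62_914_560 bytes

COMPRESS_LEVELS = {
  'screen'   => '/screen',
  'ebook'    => '/ebook',
  'printer'  => '/printer',
  'prepress' => '/prepress'
}.freeze

# Hypothetical helper: map a friendly level to a Ghostscript setting.
# Unknown or missing levels fall back to '/ebook', as pdf_compress does
# with COMPRESS_LEVELS.fetch(level.to_s, '/ebook').
def gs_setting(level)
  COMPRESS_LEVELS.fetch(level.to_s, '/ebook')
end

# Hypothetical helper: true when a payload is within the hard cap.
def within_cap?(byte_size)
  byte_size <= MAX_BYTES
end
```

Because the lookup uses fetch with a default, an unrecognized level degrades to the balanced preset rather than raising.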

Class Method Details

.build_compress_tool(ctx) ⇒ Object

── pdf_compress ─────────────────────────────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 326

def build_compress_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Reduce a PDF's file size with Ghostscript. `level` trades size vs quality:
      "screen" (smallest), "ebook" (balanced, default), "printer" (high quality),
      "prepress" (largest, preserves color). Returns the smaller file; if the
      level wouldn't help, the original is returned unchanged.
    DESC
    params type: 'object',
           properties: {
             source:   { type: 'string', description: 'Upload id of a PDF attached to this conversation, or a public http(s) URL.' },
             level:    { type: 'string', enum: %w[screen ebook printer prepress], description: 'Compression preset (default ebook).' },
             filename: { type: 'string', description: 'Optional output filename.' }
           },
           required: %w[source]

    define_method(:name) { 'pdf_compress' }

    define_method(:execute) do |source:, level: 'ebook', filename: nil, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      path    = builder.send(:resolve_pdf_path, source, conv)
      gs      = Assistant::PdfToolBuilder::COMPRESS_LEVELS.fetch(level.to_s, '/ebook')
      result  = Pdf::Toolkit.compress(path, level: gs)
      builder.send(:persist!, conv, result, filename, source, 'compressed').to_json
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_compress failed: #{e.class} #{e.message}")
      { error: "Compress failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
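
Every builder in this class follows the same shape: capture the context in a local, create an anonymous subclass, and define its methods with define_method so the blocks close over that local. A stripped-down sketch of the pattern, using a stand-in base class (FakeTool) and a hypothetical build_echo_tool so it runs without the RubyLLM gem:

```ruby
require 'json'

# Stand-in for RubyLLM::Tool so the sketch is self-contained.
class FakeTool; end

# Hypothetical builder mirroring the structure of build_compress_tool.
def build_echo_tool(ctx)
  captured = ctx
  klass = Class.new(FakeTool) do
    # define_method takes a block, so `captured` stays visible here;
    # a plain `def` would start a new scope and lose it.
    define_method(:name) { 'echo' }

    define_method(:execute) do |text:, **_|
      { conversation_id: captured[:conversation_id], text: text }.to_json
    end
  end
  klass.new
end

tool = build_echo_tool({ conversation_id: 7 })
puts tool.execute(text: 'hi')
```

Each call returns an instance of a fresh anonymous class, so two tools built with different contexts do not share state.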

.build_edit_tool(ctx) ⇒ Object

── pdf_edit (overlay / stamp) ───────────────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 106

def build_edit_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Overlay content on top of an existing PDF's pages — the tool for editing
      a PDF we did NOT generate from data (e.g. a third-party spec sheet).

      COORDINATES: PDF points (1/72 inch) from the page's LOWER-LEFT corner,
      x →right, y →up. US Letter is 612×792 pt. Call pdf_inspect first to get
      page sizes.

      `operations` is an ordered array; each item has a `type`:
        • "cover"     — draw a filled rectangle to hide existing content
                        (x, y, width, height, color hex default FFFFFF=white)
        • "text"      — draw text (text, x, y, size, color hex, font, bold)
        • "image"     — place an image (source = upload id/url, x, y, width, height)
        • "watermark" — diagonal translucent text across the page
                        (text, size, color, opacity 0-1, angle)

      REDACT-AND-REPLACE pattern (e.g. change a phone number in a footer):
      one "cover" op over the old text, then a "text" op with the new value at
      the same spot. This is reliable for a known location; it does NOT find
      text for you — use pdf_inspect's text preview to locate it.

      NOTE: "cover" hides text VISUALLY but does not remove the original glyphs
      from the file — they can still be extracted. Use it to change a displayed
      value, not to redact sensitive/confidential data.

      fonts: sofia, sofia_bold (Sofia Pro — the WarmlyYours website font, use
      this to match brand text), orpheus, orpheus_bold (Orpheus Pro serif, for
      elegant headings), helvetica, helvetica_bold, nimbus, nimbus_bold.
      `pages`: "all" (default), a number, or a list like "1,3-4".
    DESC
    params type: 'object',
           properties: {
             source:     { type: 'string', description: 'Upload id of a PDF attached to this conversation, or a public http(s) URL.' },
             operations: {
               type: 'array',
               description: 'Ordered overlay operations applied to each target page.',
               items: {
                 type: 'object',
                 properties: {
                   type:    { type: 'string', enum: %w[text cover image watermark] },
                   text:    { type: 'string', description: 'Text content (type=text or watermark).' },
                   x:       { type: 'number', description: 'X in points from lower-left.' },
                   y:       { type: 'number', description: 'Y in points from lower-left.' },
                   width:   { type: 'number', description: 'Width in points (cover/image).' },
                   height:  { type: 'number', description: 'Height in points (cover/image).' },
                   size:    { type: 'number', description: 'Font size in points (text/watermark).' },
                   color:   { type: 'string', description: 'Hex color e.g. "FFFFFF" or "323232".' },
                   font:    { type: 'string', enum: %w[sofia sofia_bold orpheus orpheus_bold helvetica helvetica_bold nimbus nimbus_bold] },
                   bold:    { type: 'boolean' },
                   opacity: { type: 'number', description: 'Watermark opacity 0-1.' },
                   angle:   { type: 'number', description: 'Watermark angle in degrees.' },
                   source:  { type: 'string', description: 'For type=image: upload id (this conversation) or public image URL.' }
                 },
                 required: %w[type]
               }
             },
             pages:    { type: 'string', description: 'Target pages: "all" (default), a number, or a list like "1,3-4".' },
             filename: { type: 'string', description: 'Optional output filename.' },
             stage_for_review: { type: 'boolean', description: 'Send the result to the CRM PDF studio for review and import into the publication library, instead of returning a chat download. Returns a studio URL.' }
           },
           required: %w[source operations]

    define_method(:name) { 'pdf_edit' }

    define_method(:execute) do |source:, operations:, pages: 'all', filename: nil, stage_for_review: false, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      path    = builder.send(:resolve_pdf_path, source, conv)
      ops     = builder.send(:resolve_image_ops, operations, conv)
      result  = Pdf::Toolkit.stamp(path, operations: ops, pages: builder.send(:parse_pages, pages))
      if stage_for_review
        builder.send(:stage_for_review!, conv, result.bytes, layout: { 'operations' => operations, 'source' => source }, kind: 'edited', title: filename || "edited-#{source}").to_json
      else
        builder.send(:persist!, conv, result, filename, source, 'edited').to_json
      end
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_edit failed: #{e.class} #{e.message}")
      { error: "Edit failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
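
The redact-and-replace pattern and the pages format from the description can be sketched as below. All three helper names are hypothetical, the default size, color, and font values are illustrative, and parse_page_spec is only a guess at the documented "all" / number / "1,3-4" format, not the builder's private parse_pages.

```ruby
# Convert inches to PDF points (1 point = 1/72 inch).
def inches_to_points(inches)
  inches * 72.0
end

# Hypothetical helper: a "cover" op over the old text followed by a
# "text" op with the new value at the same spot.
def redact_and_replace(x:, y:, width:, height:, text:, size: 10, font: 'helvetica')
  [
    { type: 'cover', x: x, y: y, width: width, height: height, color: 'FFFFFF' },
    { type: 'text', text: text, x: x, y: y, size: size, color: '323232', font: font }
  ]
end

# Hypothetical page-spec parser for the documented formats.
def parse_page_spec(spec)
  text = spec.to_s.strip
  return :all if text.empty? || text.casecmp?('all')

  text.split(',').flat_map do |part|
    first, last = part.strip.split('-', 2).map { |n| Integer(n, 10) }
    (first..(last || first)).to_a
  end.uniq
end

ops = redact_and_replace(x: 72, y: 36, width: 150, height: 14, text: '(555) 010-0000')
p ops.map { |op| op[:type] }   # => ["cover", "text"]
p parse_page_spec('1,3-4')     # => [1, 3, 4]
p inches_to_points(8.5)        # => 612.0 (US Letter width)
```

Order matters in the operations array: the cover must come before the text, or the rectangle would hide the replacement value.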

.build_fill_form_tool(ctx) ⇒ Object

── pdf_fill_form ────────────────────────────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 198

def build_fill_form_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Fill an interactive PDF form (AcroForm) by setting field values by name.
      Use pdf_inspect first to get the exact field names and types. This is the
      clean way to fill a fillable template; for flat PDFs with no form fields,
      use pdf_edit instead.

      `values` maps field name → value. Text fields take a string; checkboxes
      take true/false. Set `flatten` true to bake the values in permanently
      (the form is no longer editable afterward).
    DESC
    params type: 'object',
           properties: {
             source:   { type: 'string', description: 'Upload id of a PDF attached to this conversation, or a public http(s) URL.' },
             values:   { type: 'object', description: 'Map of field_name => value. Text → string; checkbox → true/false.', additionalProperties: true },
             flatten:  { type: 'boolean', description: 'Bake values in permanently (default false).' },
             filename: { type: 'string', description: 'Optional output filename.' }
           },
           required: %w[source values]

    define_method(:name) { 'pdf_fill_form' }

    define_method(:execute) do |source:, values:, flatten: false, filename: nil, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      path    = builder.send(:resolve_pdf_path, source, conv)
      result  = Pdf::Toolkit.fill_form(path, values: values.to_h, flatten: flatten)
      builder.send(:persist!, conv, result, filename, source, 'filled').to_json
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_fill_form failed: #{e.class} #{e.message}")
      { error: "Fill form failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
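
A sketch of preparing a values map before calling pdf_fill_form, based on the rule that text fields take strings and checkboxes take true/false. The field list shape (name and type keys), the sample field names, and the helper coerce_form_values are assumptions for illustration; the real output of pdf_inspect may be shaped differently.

```ruby
# Hypothetical field list, loosely modelled on what pdf_inspect reports.
FIELDS = [
  { name: 'customer_name', type: 'text' },
  { name: 'agree_terms',   type: 'checkbox' }
].freeze

# Hypothetical helper: coerce raw values to the type each field expects
# and reject names that are not in the form.
def coerce_form_values(fields, raw)
  types = fields.to_h { |f| [f[:name], f[:type]] }
  raw.each_with_object({}) do |(name, value), out|
    type = types[name.to_s]
    raise ArgumentError, "unknown field: #{name}" unless type

    out[name.to_s] =
      if type == 'checkbox'
        [true, 'true', 'yes', '1'].include?(value)
      else
        value.to_s
      end
  end
end

p coerce_form_values(FIELDS, { customer_name: 'Ada', 'agree_terms' => 'yes' })
```

Rejecting unknown names early gives a clearer error than letting a misspelled field pass through to the form filler.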

.build_generate_tool(ctx) ⇒ Object

── pdf_generate ─────────────────────────────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 366

def build_generate_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Create a NEW branded PDF (WarmlyYours logo + brand font) from structured
      content — a one-pager, summary, or simple spec sheet. For editing an
      existing PDF, use pdf_edit/pdf_fill_form instead.

      `blocks` is an ordered array; each item has a `type`:
        • "heading"   — section heading (text)
        • "paragraph" — body text (text)
        • "bullets"   — a bulleted list (items: [string, ...])
        • "spacer"    — vertical gap (size in points, default 8)
        • "image"     — embed an image (source = upload id attached to this
                        conversation, or a public image URL; optional caption)
        • "video"     — a "scan to watch" QR code + a link (url, optional caption)
        • "link"      — a labelled link (text, url)
      Content flows top-to-bottom and paginates automatically. For a summary
      sheet from a video, read its transcript and write the heading/paragraph
      blocks yourself, then add a "video" block linking back to it.

      Set `template: "letterhead"` to render a formal WarmlyYours cover letter:
      a branded header band (logo + "Modern Radiant Heating Solutions" tagline)
      and a phone · address · website footer on every page, body in the brand
      font, and "heading" blocks as a burgundy serif. Compose the letter as
      blocks — the date (a paragraph with `bold: true`), salutation, body
      paragraphs, an "About WarmlyYours" heading, and the closing; title/
      subtitle/logo are ignored in letterhead mode.
    DESC
    params type: 'object',
           properties: {
             title:    { type: 'string', description: 'Document title (shown large under the logo).' },
             subtitle: { type: 'string', description: 'Optional subtitle.' },
             blocks: {
               type: 'array',
               description: 'Ordered content blocks.',
               items: {
                 type: 'object',
                 properties: {
                   type:    { type: 'string', enum: %w[heading paragraph bullets spacer image video link] },
                   text:    { type: 'string', description: 'Text for heading/paragraph, or label for link.' },
                   bold:    { type: 'boolean', description: 'Render a paragraph in bold (e.g. the date line on a letterhead).' },
                   items:   { type: 'array', items: { type: 'string' }, description: 'List items for type=bullets.' },
                   size:    { type: 'number', description: 'Gap in points for type=spacer.' },
                   source:  { type: 'string', description: 'For type=image: upload id (this conversation) or public image URL.' },
                   url:     { type: 'string', description: 'For type=video/link: the URL.' },
                   caption: { type: 'string', description: 'Optional caption for type=image/video.' }
                 },
                 required: %w[type]
               }
             },
             orientation: { type: 'string', enum: %w[portrait landscape], description: 'Default portrait.' },
             template:    { type: 'string', enum: %w[standard letterhead], description: 'Layout: "standard" (logo + title, default) or "letterhead" (formal cover-letter chrome).' },
             logo:        { type: 'boolean', description: 'Include the WarmlyYours logo header (default true; ignored for letterhead).' },
             filename:    { type: 'string', description: 'Optional output filename.' },
             stage_for_review: { type: 'boolean', description: 'Send the result to the CRM PDF studio for review and import into the publication library, instead of returning a chat download. Returns a studio URL.' }
           },
           required: %w[title blocks]

    define_method(:name) { 'pdf_generate' }

    define_method(:execute) do |title:, blocks:, subtitle: nil, orientation: 'portrait', template: 'standard', logo: true, filename: nil, stage_for_review: false, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      layout  = { title: title, subtitle: subtitle, blocks: blocks, orientation: orientation, template: template, logo: logo }
      # Resolve image/video references for rendering; keep the original
      # blocks in the stored layout so the studio can iterate on them.
      render_layout = layout.merge(blocks: builder.send(:resolve_generate_blocks, blocks, conv))
      result        = Pdf::Toolkit.generate(layout: render_layout)
      if stage_for_review
        builder.send(:stage_for_review!, conv, result.bytes, layout: layout, kind: 'generated', title: title).to_json
      else
        builder.send(:persist!, conv, result, filename, nil, title.to_s.parameterize.presence || 'document').to_json
      end
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_generate failed: #{e.class} #{e.message}")
      { error: "Generate failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
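
For illustration, a layout for pdf_generate might be assembled as below. The block types come from the schema above; the sample content, the URL, and the helpers sample_layout and validate_blocks are hypothetical, and the validation is a pre-flight check written for this sketch, not something the builder is known to perform.

```ruby
BLOCK_TYPES = %w[heading paragraph bullets spacer image video link].freeze

# Hypothetical sample layout using the documented block types.
def sample_layout
  {
    title: 'Install Summary',
    subtitle: 'Key steps from the video',
    orientation: 'portrait',
    template: 'standard',
    blocks: [
      { type: 'heading',   text: 'Before you start' },
      { type: 'paragraph', text: 'Check the subfloor and measure the room.' },
      { type: 'bullets',   items: ['Measure', 'Plan the layout', 'Test the mat'] },
      { type: 'video',     url: 'https://example.com/watch', caption: 'Scan to watch' }
    ]
  }
end

# Hypothetical pre-flight check: every block needs a known `type`.
# Returns the list of block types in order.
def validate_blocks(blocks)
  blocks.each_with_index.map do |block, index|
    type = block[:type] || block['type']
    unless BLOCK_TYPES.include?(type)
      raise ArgumentError, "block #{index}: unknown type #{type.inspect}"
    end
    type
  end
end

p validate_blocks(sample_layout[:blocks])
```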

.build_inspect_tool(ctx) ⇒ Object

── pdf_inspect ──────────────────────────────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 62

def build_inspect_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Inspect a PDF without changing it: page count, page sizes and rotation,
      whether it has fillable AcroForm fields (and their names/types/values),
      embedded image count, document metadata, and a per-page text preview.

      Use this FIRST when the user attaches a PDF and wants to edit it — the
      field list tells you whether to use pdf_fill_form (it has form fields)
      or pdf_edit (overlay text at coordinates). The text preview helps you
      locate what to change.

      `source` is the upload id of a PDF attached to THIS conversation, or a
      public http(s) URL.
    DESC
    params type: 'object',
           properties: {
             source: { type: 'string', description: 'Upload id of a PDF attached to this conversation, or a public http(s) URL.' }
           },
           required: %w[source]

    define_method(:name) { 'pdf_inspect' }

    define_method(:execute) do |source:, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      path    = builder.send(:resolve_pdf_path, source, conv)
      result  = Pdf::Toolkit.inspect_pdf(path)
      Assistant::ChatToolBuilder.truncate_result(result.meta.merge(source: source).to_json)
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_inspect failed: #{e.class} #{e.message}")
      { error: "Inspect failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
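
All seven tools share the same error envelope: expected errors are returned as { error: message }, unexpected ones are logged and prefixed with the operation name, and temp files are cleaned up either way. A sketch of that pattern factored into one helper, where run_tool, the log array, and the cleanup lambda are hypothetical stand-ins for Rails.logger and cleanup_temps!:

```ruby
require 'json'

# Local stand-in for the expected, user-facing error classes.
class InputError < StandardError; end

# Hypothetical wrapper reproducing the rescue/ensure structure above.
def run_tool(label, log: [], cleanup: -> {})
  yield.to_json
rescue InputError => e
  # Expected failure: pass the message through unchanged.
  { error: e.message }.to_json
rescue StandardError => e
  # Unexpected failure: log it and prefix the message.
  log << "[PdfToolBuilder] #{label} failed: #{e.class} #{e.message}"
  { error: "#{label} failed: #{e.message}" }.to_json
ensure
  # Runs on success and on failure.
  cleanup.call
end

puts run_tool('Inspect') { { pages: 2 } }
puts run_tool('Inspect') { raise InputError, 'bad source' }
puts run_tool('Inspect') { raise 'boom' }
```

The more specific rescue clause has to come first, because InputError is itself a StandardError.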

.build_merge_tool(ctx) ⇒ Object

── pdf_merge ────────────────────────────────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 242

def build_merge_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Merge several PDFs into one, in the order given. Each source is an upload
      id attached to this conversation or a public http(s) URL.
    DESC
    params type: 'object',
           properties: {
             sources:  { type: 'array', items: { type: 'string' }, description: 'Ordered list of upload ids / public http(s) URLs to concatenate.' },
             filename: { type: 'string', description: 'Optional output filename.' }
           },
           required: %w[sources]

    define_method(:name) { 'pdf_merge' }

    define_method(:execute) do |sources:, filename: nil, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      raise Assistant::PdfToolBuilder::InputError, 'provide at least two sources to merge' if Array(sources).size < 2

      paths  = Array(sources).map { |s| builder.send(:resolve_pdf_path, s, conv) }
      result = Pdf::Toolkit.merge(paths: paths)
      builder.send(:persist!, conv, result, filename, nil, 'merged').to_json
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_merge failed: #{e.class} #{e.message}")
      { error: "Merge failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
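
The guard in pdf_merge relies on Array(), which turns nil into an empty list and wraps a single string, so a malformed sources argument fails the size check instead of raising a NoMethodError. A small sketch, with validate_merge_sources as a hypothetical name and ArgumentError standing in for InputError:

```ruby
# Hypothetical helper mirroring the size guard in pdf_merge.
def validate_merge_sources(sources)
  list = Array(sources)
  raise ArgumentError, 'provide at least two sources to merge' if list.size < 2

  list
end

p validate_merge_sources(%w[12 https://example.com/b.pdf])
```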

.build_pages_tool(ctx) ⇒ Object

── pdf_pages (select/reorder + rotate) ──────────────────────────────



# File 'app/services/assistant/pdf_tool_builder.rb', line 280

def build_pages_tool(ctx)
  captured = ctx
  klass = Class.new(RubyLLM::Tool) do
    description <<~DESC
      Reorganize a PDF's pages. Combines two operations:
        • keep_pages — an ordered list of page numbers to KEEP. This covers
          extraction (a subset, e.g. [1,2]), deletion (omit a page), and
          reordering (e.g. [3,1,2]). Omit to keep all pages.
        • rotate_degrees — rotate pages by a multiple of 90 (clockwise),
          applied to rotate_pages ("all" by default).
      At least one of keep_pages or rotate_degrees must be provided.
    DESC
    params type: 'object',
           properties: {
             source:         { type: 'string', description: 'Upload id of a PDF attached to this conversation, or a public http(s) URL.' },
             keep_pages:     { type: 'array', items: { type: 'integer' }, description: 'Ordered 1-based page numbers to keep (extract/delete/reorder).' },
             rotate_degrees: { type: 'integer', description: 'Rotate by a multiple of 90 (clockwise).' },
             rotate_pages:   { type: 'string', description: 'Pages to rotate: "all" (default), a number, or a list like "1,3-4".' },
             filename:       { type: 'string', description: 'Optional output filename.' }
           },
           required: %w[source]

    define_method(:name) { 'pdf_pages' }

    define_method(:execute) do |source:, keep_pages: nil, rotate_degrees: nil, rotate_pages: 'all', filename: nil, **_|
      builder = Assistant::PdfToolBuilder
      conv    = builder.send(:conversation!, captured)
      raise Assistant::PdfToolBuilder::InputError, 'provide keep_pages and/or rotate_degrees' if keep_pages.blank? && rotate_degrees.nil?

      path   = builder.send(:resolve_pdf_path, source, conv)
      result = builder.send(:apply_page_ops, path, keep_pages, rotate_degrees, rotate_pages)
      builder.send(:persist!, conv, result, filename, source, 'pages').to_json
    rescue Pdf::Toolkit::Error, Assistant::PdfToolBuilder::InputError => e
      { error: e.message }.to_json
    rescue StandardError => e
      Rails.logger.error("[PdfToolBuilder] pdf_pages failed: #{e.class} #{e.message}")
      { error: "Page operation failed: #{e.message}" }.to_json
    ensure
      Assistant::PdfToolBuilder.send(:cleanup_temps!)
    end
  end
  klass.new
end
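
The keep_pages semantics (one ordered list covering extraction, deletion, and reordering) can be sketched on a plain array. Both helpers are hypothetical illustrations rather than the private apply_page_ops, and the range check and rotation normalization are assumptions about reasonable behavior.

```ruby
# Hypothetical helper: select pages by 1-based number, in the given order.
# A nil or empty list keeps every page.
def keep_pages(pages, keep)
  return pages.dup if keep.nil? || keep.empty?

  keep.map do |number|
    unless number.between?(1, pages.size)
      raise ArgumentError, "page #{number} out of range (1..#{pages.size})"
    end
    pages[number - 1]
  end
end

# Hypothetical helper: accept only multiples of 90 and fold the angle
# into 0...360 (Ruby's % returns a non-negative result here).
def normalize_rotation(degrees)
  raise ArgumentError, 'rotation must be a multiple of 90' unless (degrees % 90).zero?

  degrees % 360
end

pages = %w[cover specs warranty]
p keep_pages(pages, [1, 2])     # extraction => ["cover", "specs"]
p keep_pages(pages, [1, 3])     # deletion   => ["cover", "warranty"]
p keep_pages(pages, [3, 1, 2])  # reordering => ["warranty", "cover", "specs"]
p normalize_rotation(-90)       # => 270
```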

.tools(audit_context: {}) ⇒ Array<RubyLLM::Tool>

Parameters:

  • audit_context (Hash) (defaults to: {})

    must include :conversation_id and :user_id

Returns:

  • (Array<RubyLLM::Tool>)


# File 'app/services/assistant/pdf_tool_builder.rb', line 48

def tools(audit_context: {})
  [
    build_inspect_tool(audit_context),
    build_edit_tool(audit_context),
    build_fill_form_tool(audit_context),
    build_merge_tool(audit_context),
    build_pages_tool(audit_context),
    build_compress_tool(audit_context),
    build_generate_tool(audit_context)
  ]
end
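
A caller that receives this array typically needs to find a tool by name and invoke it with keyword arguments. The sketch below shows one way to do that with stub objects; StubTool and dispatch are hypothetical, and this is not a description of how RubyLLM or ChatToolBuilder actually route tool calls.

```ruby
require 'json'

# Stub exposing the same two methods the generated tools define.
StubTool = Struct.new(:name) do
  def execute(**args)
    { tool: name, args: args }.to_json
  end
end

# Tool names in the order returned by .tools above.
TOOL_NAMES = %w[
  pdf_inspect pdf_edit pdf_fill_form pdf_merge pdf_pages pdf_compress pdf_generate
].freeze

# Hypothetical dispatcher: look up a tool by name and call it with
# symbolized keyword arguments.
def dispatch(tools, tool_name, arguments)
  tool = tools.find { |t| t.name == tool_name }
  return { error: "unknown tool: #{tool_name}" }.to_json unless tool

  tool.execute(**arguments.transform_keys(&:to_sym))
end

tools = TOOL_NAMES.map { |n| StubTool.new(n) }
puts dispatch(tools, 'pdf_merge', { 'sources' => %w[1 2] })
```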