Class: Pdf::Toolkit
- Inherits: Object
- Defined in: app/services/pdf/toolkit.rb
Overview
Bounded, programmatic PDF operations backed by HexaPDF 1.9 — the engine
behind the Sunny pdf_tools service (Assistant::PdfToolBuilder).
Every method except compress loads HexaPDF lazily (Loader.load!); compress
delegates to Compressor instead. Methods operate on local file paths, except
generate, which takes a layout. Transform methods return a Result carrying the
output PDF +bytes+ plus a small descriptive +meta+ hash; Toolkit.inspect_pdf
returns a meta-only Result (+bytes+ nil); Toolkit.split returns an Array of Results.
This is deliberately a curated operation set — not raw HexaPDF — so it can
be driven safely by the LLM tool loop (no arbitrary code execution):
inspect_pdf — structure, AcroForm fields, text preview, image count
stamp — overlay text / cover (redact) / image / watermark on pages
fill_form — set AcroForm field values, optionally flatten
merge — concatenate several PDFs
split — explode into per-page (or per-range) PDFs
rotate — rotate selected pages by a multiple of 90°
select_pages — keep/reorder a subset (covers extract, delete, reorder)
compress — Ghostscript size reduction (delegates to Compressor)
generate — build a new branded PDF from a declarative layout
Coordinates use the PDF user-space convention: points (1/72") measured from
the page's lower-left corner, +x right, +y up.
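Positions are often measured from the top-left corner in inches, so a conversion step can help when preparing coordinates. The following is a minimal sketch, not part of Pdf::Toolkit: the helper name `from_top_left` and the 792 pt US Letter page height are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of Pdf::Toolkit): convert a position measured
# from the top-left corner, in inches, into PDF user-space points measured
# from the lower-left corner.
POINTS_PER_INCH = 72.0

def from_top_left(x_in:, y_in:, page_height_pt:)
  x_pt = x_in * POINTS_PER_INCH
  # The y axis flips: distance from the top becomes height above the bottom.
  y_pt = page_height_pt - (y_in * POINTS_PER_INCH)
  [x_pt, y_pt]
end

# Assumed page height for US Letter (11 in * 72 pt).
LETTER_HEIGHT_PT = 792.0

puts from_top_left(x_in: 1.0, y_in: 1.0, page_height_pt: LETTER_HEIGHT_PT).inspect
# prints [72.0, 720.0]
```

A point one inch in from the top-left corner therefore sits at x = 72, y = 720 in user space.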
Constant Summary
- NIMBUS = 'data/fonts/NimbusSans.ttf'
  Brand fonts (mirrors Base). Relative paths resolve from Rails root.
- NIMBUS_BOLD = 'data/fonts/NimbusSansBold.ttf'
- SOFIA = 'data/fonts/sofiapro/SofiaPro-Regular.ttf'
  WarmlyYours website design-system fonts, converted (lossless woff2→sfnt) from
  the self-hosted webfonts in public/fonts/ so generated PDFs match warmlyyours.com:
  Sofia Pro is the site's primary sans (body), Orpheus Pro its serif display face.
- SOFIA_LIGHT = 'data/fonts/sofiapro/SofiaPro-Light.ttf'
- SOFIA_BOLD = 'data/fonts/sofiapro/SofiaPro-Semibold.ttf'
- ORPHEUS = 'data/fonts/orpheuspro/OrpheusPro-Regular.ttf'
- ORPHEUS_BOLD = 'data/fonts/orpheuspro/OrpheusPro-Bold.ttf'
- FONT_SPECS =
  Logical font name → [HexaPDF font spec, kwargs].
    {
      'helvetica'      => ['Helvetica', {}],
      'helvetica_bold' => ['Helvetica', { variant: :bold }],
      'sofia'          => [SOFIA, {}],
      'sofia_bold'     => [SOFIA_BOLD, {}],
      'sofia_light'    => [SOFIA_LIGHT, {}],
      'orpheus'        => [ORPHEUS, {}],
      'orpheus_bold'   => [ORPHEUS_BOLD, {}],
      'nimbus'         => [NIMBUS, {}],
      'nimbus_bold'    => [NIMBUS_BOLD, {}]
    }.freeze
- LH_BURGUNDY = '922328'
  WarmlyYours letterhead chrome (matches the approved cover-letter sample).
  Contact strings default to the sample's presentation; CompanyConstants holds
  the underlying data (PHONE[:usa], ADDRESS[:usa]). Override via the layout.
  Used for the logo wordmark, tagline, H1, and footer separators.
- LH_INK = '262626'
  Body text.
- LH_RULE = 'd9d2d0'
  Header/footer hairline.
- LH_TAGLINE = 'Modern Radiant Heating Solutions'
- LH_PHONE = '1 (800) 875-5285'
- LH_ADDRESS = '590 Telser Rd Suite B, Lake Zurich, IL, 60047'
- LH_WEBSITE = 'www.WarmlyYours.com'
- LH_SIDE = 72
  Page geometry in PostScript points (US Letter). Margins clear the header/footer
  bands so flowing content never collides with the chrome drawn in the post-pass.
- LH_TOP_MARGIN = 128
- LH_BOTTOM_MARGIN = 100
- MAX_TEXT_PREVIEW_CHARS = 1_500
  Per-page text preview cap (chars) returned by inspect_pdf.
- MAX_TEXT_PREVIEW_PAGES = 10
  Max pages scanned for text preview.
Class Method Summary
- .compress(path, level: '/printer') ⇒ Result
  Reduce file size via Ghostscript.
- .fill_form(path, values:, flatten: false) ⇒ Result
  Set AcroForm field values by full field name.
- .generate(layout:) ⇒ Result
  Build a new branded PDF from a declarative layout.
- .inspect_pdf(path) ⇒ Result
  Read structure and content of a PDF without modifying it.
- .merge(paths:) ⇒ Result
  Concatenate several PDFs into one, in the order given.
- .rotate(path, degrees:, pages: :all) ⇒ Result
  Rotate selected pages by a multiple of 90° (clockwise).
- .select_pages(path, pages:) ⇒ Result
  Build a new PDF from an ordered list of page numbers to keep.
- .split(path, ranges: nil) ⇒ Array<Result>
  Explode a PDF into multiple documents.
- .stamp(path, operations:, pages: :all) ⇒ Result
  Draw overlay operations on top of existing page content.
Class Method Details
.compress(path, level: '/printer') ⇒ Result
Reduce file size via Ghostscript. No-op (returns input) when gs is absent.
    # File 'app/services/pdf/toolkit.rb', line 300
    def compress(path, level: '/printer')
      blob = File.binread(path)
      res = Pdf::Compressor.new(input_blob: blob, pdf_setting: level).compress
      # Fall back to the original bytes when the compressor produced no output.
      out = res.output_blob || blob
      Result.new(bytes: out, meta: {
        status: res.result.to_s,
        original_bytes: blob.bytesize,
        new_bytes: out.bytesize,
        saved_bytes: [blob.bytesize - out.bytesize, 0].max
      })
    end
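The +meta+ hash clamps saved_bytes at zero, so an output larger than the input never reports negative savings. A minimal sketch of that arithmetic follows. Both helper names are hypothetical and not part of Pdf::Toolkit, and only the meta keys shown in the source are assumed to exist.

```ruby
# Hypothetical helper: the same clamp compress applies to its meta hash.
def saved_bytes(original_bytes, new_bytes)
  [original_bytes - new_bytes, 0].max
end

# Hypothetical helper: summarise a compress meta hash for logging.
def savings_summary(meta)
  original = meta[:original_bytes]
  saved    = meta[:saved_bytes]
  percent  = original.zero? ? 0.0 : (saved * 100.0 / original).round(1)
  "saved #{saved} of #{original} bytes (#{percent}%)"
end

meta = {
  original_bytes: 2_000,
  new_bytes: 1_500,
  saved_bytes: saved_bytes(2_000, 1_500)
}
puts savings_summary(meta)
# prints: saved 500 of 2000 bytes (25.0%)
```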
.fill_form(path, values:, flatten: false) ⇒ Result
Set AcroForm field values by full field name.
    # File 'app/services/pdf/toolkit.rb', line 163
    def fill_form(path, values:, flatten: false)
      Pdf::Loader.load!
      raise Error, 'values must be a non-empty map of field_name => value' unless values.is_a?(Hash) && values.any?

      doc = HexaPDF::Document.open(path)
      form = doc.acro_form
      raise Error, 'This PDF has no fillable form fields (no AcroForm). Use stamp to overlay text instead.' unless form

      applied = []
      unknown = []
      failed = []
      values.each do |name, val|
        field = form.field_by_name(name.to_s)
        # Unknown names are reported in meta rather than raised.
        if field.nil?
          unknown << name.to_s
          next
        end
        begin
          field.field_value = coerce_field_value(field, val)
          applied << name.to_s
        rescue StandardError => e
          failed << { field: name.to_s, error: e.message }
        end
      end

      form.create_appearances
      form.flatten if flatten

      Result.new(bytes: write(doc), meta: {
        applied: applied,
        unknown_fields: unknown,
        failed: failed,
        flattened: !!flatten,
        pages: doc.pages.count
      })
    rescue HexaPDF::Error => e
      raise Error, "Fill form failed: #{e.message}"
    end
.generate(layout:) ⇒ Result
Build a new branded PDF from a declarative layout.
{ title:, subtitle:, logo: true, page_size: "Letter", orientation: "portrait",
blocks: [ { type: "heading"|"paragraph"|"bullets"|"spacer", ... } ] }
    # File 'app/services/pdf/toolkit.rb', line 317
    def generate(layout:)
      Pdf::Loader.load!
      layout = symbolize(layout)
      # The letterhead template has its own renderer.
      return generate_letterhead(layout) if layout[:template].to_s == 'letterhead'

      composer = HexaPDF::Composer.new(
        page_size: (layout[:page_size] || 'Letter').to_sym,
        page_orientation: (layout[:orientation] || 'portrait').to_sym,
        margin: [54, 54, 54, 54]
      )
      configure_brand(composer.document)
      apply_generate_styles(composer)

      # The logo is on by default and is skipped when the file is missing.
      if layout.fetch(:logo, true) && File.exist?(Pdf::Config::LOGO_PATH)
        composer.image(Pdf::Config::LOGO_PATH, width: 180)
        composer.text(' ', font_size: 8)
      end
      composer.text(layout[:title].to_s, style: :gen_title) if present?(layout[:title])
      composer.text(layout[:subtitle].to_s, style: :gen_subtitle) if present?(layout[:subtitle])
      composer.text(' ', font_size: 6)

      Array(layout[:blocks]).each { |b| render_block(composer, symbolize(b)) }

      Result.new(bytes: write_composer(composer), meta: { pages: composer.document.pages.count })
    rescue HexaPDF::Error => e
      raise Error, "Generate failed: #{e.message}"
    end
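Since the layout is plain data, it can be checked before it reaches generate. The sketch below builds a layout from the documented top-level keys and flags block types outside the four listed above. The helper `unknown_block_types` and the sample values are hypothetical. Per-block keys other than +type+ are elided in the source, so none are shown here.

```ruby
# Block types documented for the generate layout.
KNOWN_BLOCK_TYPES = %w[heading paragraph bullets spacer].freeze

# Hypothetical pre-flight check (not part of Pdf::Toolkit): list the block
# types in a layout that fall outside the documented set.
def unknown_block_types(layout)
  Array(layout[:blocks])
    .map { |block| block[:type].to_s }
    .reject { |type| KNOWN_BLOCK_TYPES.include?(type) }
    .uniq
end

# Sample values are placeholders; only the key names come from the docs.
layout = {
  title: 'Install Guide',
  subtitle: 'Draft',
  logo: true,
  page_size: 'Letter',
  orientation: 'portrait',
  blocks: [{ type: 'heading' }, { type: 'spacer' }, { type: 'table' }]
}

puts unknown_block_types(layout).inspect
# prints ["table"]
```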
.inspect_pdf(path) ⇒ Result
Read structure and content of a PDF without modifying it.
    # File 'app/services/pdf/toolkit.rb', line 91
    def inspect_pdf(path)
      Pdf::Loader.load!
      doc = HexaPDF::Document.open(path)
      info = doc.trailer.info

      # Media box size and rotation for every page.
      sizes = doc.pages.map do |pg|
        mb = pg.box(:media)
        { width: mb.width.round(1), height: mb.height.round(1), rotation: (pg[:Rotate] || 0).to_i }
      end

      form = doc.acro_form
      fields = []
      form&.each_field do |f|
        fields << {
          name: f.full_field_name,
          type: (f.concrete_field_type || f.field_type).to_s,
          value: stringify(f.field_value)
        }
      end

      Result.new(meta: {
        pages: doc.pages.count,
        encrypted: doc.encrypted?,
        page_sizes: sizes,
        has_acroform: !form.nil?,
        field_count: fields.size,
        fields: fields,
        image_count: image_count(doc),
        title: present_str(info[:Title]),
        author: present_str(info[:Author]),
        producer: present_str(info[:Producer]),
        text_preview: text_preview(path)
      })
    rescue HexaPDF::Error => e
      raise Error, "Could not read PDF: #{e.message}"
    end
.merge(paths:) ⇒ Result
Concatenate several PDFs into one, in the order given.
    # File 'app/services/pdf/toolkit.rb', line 203
    def merge(paths:)
      Pdf::Loader.load!
      raise Error, 'merge needs at least one source PDF' if Array(paths).empty?

      target = HexaPDF::Document.new
      total = 0
      Array(paths).each do |p|
        src = HexaPDF::Document.open(p)
        # Import each page into the target document before appending it.
        src.pages.each do |pg|
          target.pages << target.import(pg)
          total += 1
        end
      end
      raise Error, 'no pages found across sources' if total.zero?

      Result.new(bytes: write(target), meta: { pages: total, sources: Array(paths).size })
    rescue HexaPDF::Error => e
      raise Error, "Merge failed: #{e.message}"
    end
.rotate(path, degrees:, pages: :all) ⇒ Result
Rotate selected pages by a multiple of 90° (clockwise).
    # File 'app/services/pdf/toolkit.rb', line 256
    def rotate(path, degrees:, pages: :all)
      Pdf::Loader.load!
      # Normalize into 0...360 so negative and over-360 inputs are accepted.
      norm = Integer(degrees) % 360
      raise Error, 'degrees must be a multiple of 90' unless (norm % 90).zero?

      doc = HexaPDF::Document.open(path)
      targets = resolve_pages(doc, pages)
      # Add to any existing /Rotate value instead of overwriting it.
      targets.each { |pg| pg[:Rotate] = ((pg[:Rotate] || 0).to_i + norm) % 360 }

      Result.new(bytes: write(doc), meta: { pages: doc.pages.count, rotated: targets.size, degrees: norm })
    rescue HexaPDF::Error => e
      raise Error, "Rotate failed: #{e.message}"
    end
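The rotation arithmetic can be tried in isolation. The sketch below reproduces the normalization and accumulation steps from rotate in a standalone helper. The name `next_rotation` is hypothetical, and it raises ArgumentError where the toolkit raises its own Error.

```ruby
# Hypothetical helper mirroring rotate's arithmetic: normalize the requested
# angle into 0...360, require a multiple of 90, then add it to the page's
# current rotation (nil is treated as 0).
def next_rotation(current, degrees)
  norm = Integer(degrees) % 360
  raise ArgumentError, 'degrees must be a multiple of 90' unless (norm % 90).zero?

  (current.to_i + norm) % 360
end

puts next_rotation(0, -90)    # prints 270
puts next_rotation(270, 180)  # prints 90
puts next_rotation(nil, 450)  # prints 90
```

Ruby's `%` returns a non-negative result for a positive divisor, so -90 normalizes to 270, which is the same as a quarter turn counter-clockwise.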
.select_pages(path, pages:) ⇒ Result
Build a new PDF from an ordered list of page numbers to keep. Covers
extraction (a subset), deletion (omit pages), and reordering (permute).
    # File 'app/services/pdf/toolkit.rb', line 277
    def select_pages(path, pages:)
      Pdf::Loader.load!
      src = HexaPDF::Document.open(path)
      count = src.pages.count
      order = Array(pages).map { |i| Integer(i) }
      raise Error, 'pages must be a non-empty list of page numbers' if order.empty?

      order.each { |i| raise Error, "page #{i} out of range (1-#{count})" if i < 1 || i > count }

      target = HexaPDF::Document.new
      # Page numbers are 1-based; the pages collection is 0-based.
      order.each { |i| target.pages << target.import(src.pages[i - 1]) }

      Result.new(bytes: write(target), meta: { pages: order.size, kept: order })
    rescue HexaPDF::Error => e
      raise Error, "Select pages failed: #{e.message}"
    end
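Extraction, deletion, and reordering differ only in the page list passed in. The sketch below builds the three lists for an assumed five-page document and applies the same bounds check as select_pages. The helper `validate_pages!` is hypothetical, and it raises ArgumentError where the toolkit raises its own Error.

```ruby
# Hypothetical helper (not part of Pdf::Toolkit): check a 1-based page list
# against a page count, mirroring the checks in select_pages.
def validate_pages!(order, count)
  raise ArgumentError, 'pages must be a non-empty list of page numbers' if order.empty?

  order.each do |i|
    raise ArgumentError, "page #{i} out of range (1-#{count})" if i < 1 || i > count
  end
  order
end

page_count = 5 # assumed document length for the example
all_pages  = (1..page_count).to_a

extract = validate_pages!([2, 3], page_count)            # keep a subset
delete  = validate_pages!(all_pages - [4], page_count)   # omit page 4
reorder = validate_pages!(all_pages.reverse, page_count) # permute

puts extract.inspect # prints [2, 3]
puts delete.inspect  # prints [1, 2, 3, 5]
puts reorder.inspect # prints [5, 4, 3, 2, 1]
```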
.split(path, ranges: nil) ⇒ Array<Result>
Explode a PDF into multiple documents. With no +ranges+, yields one
document per page. +ranges+ is an array of [start, end] (1-based, inclusive).
    # File 'app/services/pdf/toolkit.rb', line 230
    def split(path, ranges: nil)
      Pdf::Loader.load!
      src = HexaPDF::Document.open(path)
      count = src.pages.count
      # Default to one [i, i] range per page when no ranges are given.
      specs = ranges.presence || (1..count).map { |i| [i, i] }

      specs.map do |(a, b)|
        a = Integer(a)
        b = Integer(b || a)
        raise Error, "range #{a}-#{b} out of bounds (document has #{count} pages)" if a < 1 || b > count || a > b

        target = HexaPDF::Document.new
        (a..b).each { |i| target.pages << target.import(src.pages[i - 1]) }
        Result.new(bytes: write(target), meta: { pages: b - a + 1, range: [a, b] })
      end
    rescue HexaPDF::Error => e
      raise Error, "Split failed: #{e.message}"
    end
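How +ranges+ expands into per-document specs can be shown without opening a PDF. The sketch below mirrors the defaulting and bounds checks in split. The helper `split_specs` is hypothetical, it uses plain Ruby in place of ActiveSupport's `presence`, and it raises ArgumentError where the toolkit raises its own Error.

```ruby
# Hypothetical helper mirroring split's range handling for a document with
# `count` pages. Returns normalized [start, end] pairs (1-based, inclusive).
def split_specs(count, ranges = nil)
  # No ranges (nil or empty) means one single-page range per page.
  specs = (ranges.nil? || ranges.empty?) ? (1..count).map { |i| [i, i] } : ranges

  specs.map do |(a, b)|
    a = Integer(a)
    b = Integer(b || a) # a missing end means a single-page range
    if a < 1 || b > count || a > b
      raise ArgumentError, "range #{a}-#{b} out of bounds (document has #{count} pages)"
    end

    [a, b]
  end
end

puts split_specs(3).inspect                 # prints [[1, 1], [2, 2], [3, 3]]
puts split_specs(5, [[1, 2], [3]]).inspect  # prints [[1, 2], [3, 3]]
```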
.stamp(path, operations:, pages: :all) ⇒ Result
Draw overlay operations on top of existing page content. Each operation
is a Hash with a +:type+ of "text", "cover", "image", or "watermark".
Operations are applied in order to every target page.
    # File 'app/services/pdf/toolkit.rb', line 138
    def stamp(path, operations:, pages: :all)
      Pdf::Loader.load!
      raise Error, 'operations must be a non-empty array' unless operations.is_a?(Array) && operations.any?

      doc = HexaPDF::Document.open(path)
      targets = resolve_pages(doc, pages)
      targets.each do |page|
        # An overlay canvas draws on top of the existing page content.
        canvas = page.canvas(type: :overlay)
        mb = page.box(:media)
        operations.each { |op| apply_op(doc, canvas, mb, symbolize(op)) }
      end

      Result.new(bytes: write(doc), meta: {
        pages: doc.pages.count,
        operations: operations.size,
        target_pages: targets.size
      })
    rescue HexaPDF::Error => e
      raise Error, "Stamp failed: #{e.message}"
    end