Building Custom PDFs with PDFKit - A Template-Driven Approach

tags: pdfkit, pdf generation, nodejs, templates, cms integration, automation

When building applications that need to generate PDFs programmatically, developers often reach for HTML-to-PDF solutions like Puppeteer or wkhtmltopdf. While these tools work well for converting web pages, they come with significant overhead: browser dependencies, resource-intensive processes, and limited control over the final output.

Enter PDFKit - a lightweight, pure JavaScript PDF generation library that gives you pixel-perfect control without the baggage. In this article, I’ll share my experience building a production-ready PDF generation system that handles thousands of custom documents, and show you how to create a template-driven architecture that scales.

Why PDFKit Over HTML-to-PDF Solutions?

After working with various PDF generation approaches, I’ve found PDFKit offers distinct advantages for certain use cases:

Lightweight and Fast
No browser dependencies means faster startup times and lower memory footprint. A typical PDFKit process uses ~50MB of memory compared to ~200MB+ for Puppeteer.

Precise Layout Control
PDFKit uses PDF point coordinates (1 point = 1/72 inch), giving you exact positioning control. This is crucial for documents with specific print requirements or complex layouts.

Predictable Output
Unlike HTML/CSS rendering which can vary between browsers, PDFKit generates consistent output every time. No more debugging CSS quirks or print margins.

True Server-Side
Runs in pure Node.js environments without needing headless browsers. Perfect for serverless functions, Docker containers, or environments with limited resources.

When NOT to Use PDFKit
If you need to convert existing HTML content or need complex CSS layouts, HTML-to-PDF tools are better suited. PDFKit requires you to programmatically define every element’s position.

Understanding PDFKit’s Core Concepts

Before diving into implementation, let’s cover the fundamentals that make PDFKit powerful.

Coordinate System

PDFKit uses a coordinate system where:

Origin (0, 0) is the top-left corner
X increases to the right
Y increases downward
Measurements are in points (72 points = 1 inch)

const PDFDocument = require('pdfkit');
const fs = require('fs');

const doc = new PDFDocument({
  size: 'A4', // 595 x 842 points
  margins: { top: 50, bottom: 50, left: 50, right: 50 }
});

doc.pipe(fs.createWriteStream('output.pdf'));

// Position text at exact coordinates
doc.fontSize(16)
   .text('Hello World', 100, 100); // x: 100, y: 100

// Draw a rectangle at specific position
doc.rect(100, 150, 200, 50) // x, y, width, height
   .stroke();

doc.end();
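
Because every coordinate is in points, designs specified in millimetres or inches need converting first. Here is a minimal sketch of the arithmetic; the helper names are my own and are not part of PDFKit:

```javascript
// Convert physical units to PDF points (1 pt = 1/72 inch, 1 inch = 25.4 mm).
// These helper names are illustrative; PDFKit itself only deals in points.
const POINTS_PER_INCH = 72;
const MM_PER_INCH = 25.4;

function inchesToPoints(inches) {
  return inches * POINTS_PER_INCH;
}

function mmToPoints(mm) {
  return (mm / MM_PER_INCH) * POINTS_PER_INCH;
}

// A4 is 210 x 297 mm, which is where the "595 x 842" figure comes from.
const a4Width = mmToPoints(210);
const a4Height = mmToPoints(297);
console.log(a4Width.toFixed(2), a4Height.toFixed(2)); // 595.28 841.89
```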

Drawing Primitives

PDFKit provides methods for all basic PDF elements - text, images, shapes, and paths. Each element can be precisely positioned using x,y coordinates and styled with fonts, colors, and sizing options. The library supports chainable methods for cleaner code, and all standard PDF drawing operations.
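
To make the chaining concrete, here is a small sketch that combines a filled shape, a path, and styled text. The function name, sizes, and colors are placeholders of mine; the calls themselves (save, roundedRect, fill, moveTo, lineTo, lineWidth, stroke, font, fontSize, fillColor, text, restore) are standard PDFKit methods:

```javascript
// Draws a small labelled badge using chained PDFKit calls.
// `doc` is a PDFDocument instance; the function name and styling are illustrative.
function drawBadge(doc, label, x, y) {
  const width = 180;
  const height = 40;

  doc.save(); // keep graphics-state changes local to this badge

  // Filled rounded rectangle as the background
  doc.roundedRect(x, y, width, height, 8).fill('#2D6CDF');

  // Thin underline drawn as a path
  doc.moveTo(x, y + height + 4)
     .lineTo(x + width, y + height + 4)
     .lineWidth(1)
     .stroke('#2D6CDF');

  // Centered label on top of the rectangle
  doc.font('Helvetica-Bold')
     .fontSize(14)
     .fillColor('#FFFFFF')
     .text(label, x, y + 13, { width, align: 'center' });

  doc.restore();
  return doc;
}
```

Passing the document in as a parameter keeps helpers like this reusable across templates and easy to exercise with a stub object.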

Building a Template-Driven System

The real power comes from separating content from layout. Instead of hardcoding positions, we define reusable templates.

The Template Concept

A template is a declarative definition of your PDF’s layout - describing what goes where, how it should look, and what data fields it expects. By storing this as structured data (JSON, database records, etc.), you can modify layouts without touching code, enable version control, and allow non-technical team members to create new designs.

Here’s what a template looks like:

{
  "id": "book-cover-template",
  "name": "Children's Book Cover",
  "pageSize": "A4",
  "pages": [
    {
      "pageNumber": 1,
      "elements": [
        {
          "type": "image",
          "field": "coverImage",
          "x": 0,
          "y": 0,
          "width": 595,
          "height": 842,
          "fit": "cover"
        },
        {
          "type": "text",
          "field": "bookTitle",
          "x": 100,
          "y": 600,
          "width": 395,
          "fontSize": 48,
          "font": "Helvetica-Bold",
          "color": "#FFFFFF",
          "align": "center"
        },
        {
          "type": "text",
          "field": "authorName",
          "x": 100,
          "y": 680,
          "width": 395,
          "fontSize": 24,
          "font": "Helvetica",
          "color": "#FFFFFF",
          "align": "center"
        }
      ]
    }
  ]
}

Separating Content from Presentation

The generator class reads the template and loops through pages and elements. For each element, it extracts the corresponding data field and renders it using PDFKit methods. A text element uses doc.text() with position and styling from the template. An image element uses doc.image() with the file path from the data. This architecture means you can create multiple templates (certificates, posters, books) without changing the generator code.
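
A minimal sketch of such a generator might look like the following. It is not the exact production class: it handles only text and image elements, it assumes that "fit": "cover" in the template maps to PDFKit's cover image option, and the defaults (Helvetica, 12pt, black) are my own choices.

```javascript
const fs = require('fs');

// Render one template element onto a PDFKit document.
// Field names (`field`, `font`, `fit`, ...) follow the template JSON shown earlier.
function renderElement(doc, element, data) {
  const value = data[element.field];
  if (value === undefined || value === null) return; // nothing to draw for this field

  if (element.type === 'text') {
    doc.font(element.font || 'Helvetica')
       .fontSize(element.fontSize || 12)
       .fillColor(element.color || '#000000')
       .text(String(value), element.x, element.y, {
         width: element.width,
         align: element.align || 'left'
       });
  } else if (element.type === 'image') {
    // Assumption: "cover" scales the image to fill the box while keeping its
    // aspect ratio; otherwise the image is sized to width/height directly.
    const options = element.fit === 'cover'
      ? { cover: [element.width, element.height] }
      : { width: element.width, height: element.height };
    doc.image(value, element.x, element.y, options);
  }
}

// Add one PDF page per template page and draw its elements in order.
function renderTemplate(doc, template, data) {
  for (const page of template.pages) {
    doc.addPage({ size: template.pageSize, margin: 0 });
    for (const element of page.elements) {
      renderElement(doc, element, data);
    }
  }
}

class PDFGenerator {
  constructor(template) {
    this.template = template;
  }

  // Resolves with the output path once the file has been fully written.
  generate(data, outputPath) {
    // Required here so the render functions above can be tested without PDFKit installed
    const PDFDocument = require('pdfkit');
    return new Promise((resolve, reject) => {
      const doc = new PDFDocument({ autoFirstPage: false });
      const stream = fs.createWriteStream(outputPath);
      stream.on('finish', () => resolve(outputPath));
      stream.on('error', reject);
      doc.pipe(stream);
      renderTemplate(doc, this.template, data);
      doc.end();
    });
  }
}

module.exports = PDFGenerator;
```

Supporting a new element type (a shape or a line, say) then means adding one more branch to renderElement rather than touching any template.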

Dynamic Content Injection

With this system, generating a PDF becomes simple:

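// The generator class described above; the module path is illustrative
const PDFGenerator = require('./pdf-generator');
const template = require('./templates/book-cover-template.json');

async function main() {
  const generator = new PDFGenerator(template);

  // Keys must match the "field" values in the template
  const bookData = {
    coverImage: '/path/to/cover-image.png',
    bookTitle: 'The Adventures of Luna',
    authorName: 'Written by Alex'
  };

  // generate() is asynchronous, so it is awaited inside an async function
  await generator.generate(bookData, 'output.pdf');
}

main().catch(console.error);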

Practical Implementation Patterns

Let’s explore how to build a production-ready system with real-world considerations.

Headless CMS Integration

Using a headless CMS (content management systems like Payload CMS, Directus, or Strapi) gives you a visual admin interface where your team can create and manage templates without writing code. These systems provide databases, APIs, and user interfaces out of the box.

Instead of storing templates as flat JSON, structure them with proper relationships and nested data:

// Example schema for a CMS collection
export const Templates = {
  slug: 'pdf-templates',
  fields: [
    {
      name: 'name',
      type: 'text',
      required: true
    },
    {
      name: 'category',
      type: 'select',
      options: ['book-cover', 'certificate', 'poster', 'custom']
    },
    {
      name: 'pageSize',
      type: 'select',
      options: ['A4', 'Letter', 'Custom'],
      defaultValue: 'A4'
    },
    {
      name: 'pages',
      type: 'array',
      fields: [
        {
          name: 'pageNumber',
          type: 'number',
          required: true
        },
        {
          name: 'elements',
          type: 'array',
          fields: [
            {
              name: 'type',
              type: 'select',
              options: ['text', 'image', 'shape', 'line']
            },
            {
              // Called `field` in the JSON template examples above; map between the two names when loading
              name: 'fieldName',
              type: 'text',
              admin: {
                description: 'Data field this element will display'
              }
            },
            {
              name: 'x',
              type: 'number',
              required: true
            },
            {
              name: 'y',
              type: 'number',
              required: true
            },
            {
              name: 'width',
              type: 'number'
            },
            {
              name: 'height',
              type: 'number'
            },
            {
              name: 'fontSize',
              type: 'number',
              admin: {
                condition: (data, siblingData) => siblingData.type === 'text'
              }
            },
            {
              name: 'fontFamily',
              type: 'text',
              admin: {
                condition: (data, siblingData) => siblingData.type === 'text'
              }
            },
            {
              name: 'color',
              type: 'text',
              defaultValue: '#000000'
            }
          ]
        }
      ]
    },
    {
      name: 'previewImage',
      type: 'upload',
      relationTo: 'media'
    }
  ]
}

Benefits of this approach:

Visual template management interface for non-developers
Structured data validation at the database level
Version control and audit trails for templates
Role-based access control (who can create/edit templates)
Template categorization and search
Conditional field visibility (fontSize only shows for text elements)
API endpoints auto-generated for fetching templates
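
One detail to watch: the schema above names its fields fieldName and fontFamily, while the JSON templates shown earlier use field and font. A small mapping step after fetching the record keeps the generator unaware of the CMS. The sketch below assumes the record has the shape defined by the schema; the function name is hypothetical, and how you fetch the record depends on your CMS.

```javascript
// Map a CMS record (field names from the schema above) to the JSON template
// shape used by the generator. The function name is illustrative.
function normalizeTemplate(cmsDoc) {
  return {
    id: String(cmsDoc.id),
    name: cmsDoc.name,
    pageSize: cmsDoc.pageSize || 'A4',
    pages: [...(cmsDoc.pages || [])]
      .sort((a, b) => a.pageNumber - b.pageNumber) // render pages in order
      .map((page) => ({
        pageNumber: page.pageNumber,
        elements: (page.elements || []).map(({ fieldName, fontFamily, ...rest }) => {
          // schema `fieldName` -> template `field`, schema `fontFamily` -> template `font`
          const element = { ...rest, field: fieldName };
          if (fontFamily) element.font = fontFamily;
          return element;
        })
      }))
  };
}
```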

Template Preview System

Building a preview system is crucial for validating templates before production use. Create an API endpoint that generates the PDF in memory, converts the first page to an image (using libraries like pdf-poppler or pdf2pic), and returns it to the frontend. Your team can then see exactly what the PDF will look like with sample data before deploying templates to production. This visual feedback loop dramatically speeds up template development and reduces errors.
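
The in-memory part of that flow can be sketched as below. A PDFKit document is a readable stream, so it can be collected into a Buffer instead of being piped to a file. The function names are mine, and the PDF-to-image conversion step is left out because it depends on which library you choose.

```javascript
// Collect a PDFKit document (a readable stream) into a single Buffer.
// `renderFn` draws onto the document; both names are illustrative.
function renderToBuffer(doc, renderFn) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    doc.on('data', (chunk) => chunks.push(chunk));
    doc.on('end', () => resolve(Buffer.concat(chunks)));
    doc.on('error', reject);
    renderFn(doc);
    doc.end(); // flushes the remaining data and triggers 'end'
  });
}

// Example wiring (assumes PDFKit and an Express-style handler):
//   const PDFDocument = require('pdfkit');
//   const pdf = await renderToBuffer(new PDFDocument({ size: 'A4' }), (doc) => {
//     doc.text('Preview with sample data', 50, 50);
//   });
//   res.type('application/pdf').send(pdf);
```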

File Management Strategy

For production systems, decide early whether to store PDFs locally or in cloud storage like AWS S3, Google Cloud Storage, or Azure Blob Storage. Create a storage abstraction layer that supports both options through environment configuration. This lets you develop locally but deploy with cloud storage. Consider organizing files by date or customer ID for easier management, and implement automatic cleanup policies for temporary or expired PDFs.
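
Here is one possible shape for that abstraction, as a sketch rather than a finished module. The environment variable names (PDF_STORAGE, PDF_BUCKET, PDF_LOCAL_DIR) and the key layout are assumptions of mine; the S3 variant uses the AWS SDK v3 client and is loaded only when selected.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Build keys like "2024/05/customer-42/order-1001.pdf" so files group by date and customer.
function buildKey(customerId, filename, date = new Date()) {
  const year = String(date.getUTCFullYear());
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  return [year, month, `customer-${customerId}`, filename].join('/');
}

class LocalStorage {
  constructor(rootDir) {
    this.rootDir = rootDir;
  }

  async save(key, buffer) {
    const target = path.join(this.rootDir, key);
    await fs.mkdir(path.dirname(target), { recursive: true });
    await fs.writeFile(target, buffer);
    return target;
  }
}

class S3Storage {
  constructor(bucket, region) {
    // Loaded lazily so local development does not need the AWS SDK installed
    const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
    this.bucket = bucket;
    this.client = new S3Client({ region });
    this.PutObjectCommand = PutObjectCommand;
  }

  async save(key, buffer) {
    await this.client.send(new this.PutObjectCommand({
      Bucket: this.bucket,
      Key: key,
      Body: buffer,
      ContentType: 'application/pdf'
    }));
    return `s3://${this.bucket}/${key}`;
  }
}

// PDF_STORAGE, PDF_BUCKET and PDF_LOCAL_DIR are hypothetical variable names.
function createStorage(env = process.env) {
  if (env.PDF_STORAGE === 's3') {
    return new S3Storage(env.PDF_BUCKET, env.AWS_REGION);
  }
  return new LocalStorage(env.PDF_LOCAL_DIR || './generated-pdfs');
}
```

Both classes expose the same save(key, buffer) method, so the rest of the system only ever calls createStorage().save(...) and never needs to know where the file ends up.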

Real-World Considerations

Here are lessons learned from running a PDF generation system in production.

Performance Optimization

The biggest performance bottleneck in PDF generation is typically image processing. Large, unoptimized images can dramatically increase both generation time and final file size.

Image Optimization Strategy:
The key is to process images before adding them to the PDF. Tools like Sharp provide fast, efficient image manipulation.

Implementation Approach:
Before rendering each image element in your template, pass it through Sharp to resize it and convert it to JPEG. Store the optimized version in a temporary location, use it in the PDF, then clean up afterward. Keep PDF compression enabled in PDFKit’s document options (compress: true, which is the default). Track all temporary files and ensure cleanup happens even if generation fails.

Results You Can Expect:

PDF file sizes reduced by 60-80% (from ~10MB to ~2MB for image-heavy documents)
Generation time improved by 30-40% due to smaller file processing
Significantly lower memory usage during generation

Key Principles:

Always convert to JPEG when transparency isn’t needed
Resize to reasonable dimensions (2000x2000 max for high quality)
Use progressive JPEG encoding for better compression
Enable PDF-level compression
Implement proper cleanup of temporary files
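
The principles above can be sketched with Sharp roughly as follows. The quality setting of 80 and the helper names are my own choices, not measured recommendations; the try/finally wrapper is what guarantees cleanup when generation fails partway through.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

// Resize and re-encode one image with Sharp, returning the path of the temporary JPEG.
async function optimizeImage(inputPath, tempDir) {
  const sharp = require('sharp'); // loaded lazily; requires `npm install sharp`
  const outputPath = path.join(tempDir, `${path.parse(inputPath).name}.jpg`);
  await sharp(inputPath)
    // Shrink to fit within 2000x2000, keep aspect ratio, never upscale
    .resize({ width: 2000, height: 2000, fit: 'inside', withoutEnlargement: true })
    .jpeg({ quality: 80, progressive: true })
    .toFile(outputPath);
  return outputPath;
}

// Run `work` with a fresh temp directory and always remove it afterwards,
// even when `work` throws.
async function withTempDir(work) {
  const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'pdf-images-'));
  try {
    return await work(tempDir);
  } finally {
    await fs.rm(tempDir, { recursive: true, force: true });
  }
}

// Usage sketch: do the whole generation inside the callback so the optimized
// files still exist while the PDF is being written.
//   await withTempDir(async (tempDir) => {
//     const cover = await optimizeImage('/path/to/cover-image.png', tempDir);
//     await generator.generate({ ...bookData, coverImage: cover }, 'output.pdf');
//   });
```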

Font Management

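Custom fonts are essential for branding and multilingual support. PDFKit can load a font straight from a file with doc.font(fontPath), or you can register it under a name with doc.registerFont(name, fontPath) and select it later with doc.font(name). Either way, you need the actual .ttf or .otf files on your server. Create a font registry that maps font names to file paths, supporting different weights (regular, bold, italic). For RTL languages like Arabic or Hebrew, the features text option accepts OpenType feature tags such as ['rtla'], and you can set align: 'right' in text options; check the output with real text, because shaping and bidirectional behavior depend on the font and the PDFKit version. Resolve font paths (or read the font files into buffers) once at startup to avoid repeated file reads.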
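
A font registry can be as small as the sketch below. The logical names match the "font" values used in the book template later in this article; the file names and the fonts directory are placeholders for your own assets.

```javascript
const path = require('path');

// Logical font names mapped to files. The file names and directory are
// placeholders; substitute the fonts you actually ship.
const FONT_DIR = path.resolve('fonts');
const FONT_REGISTRY = {
  CustomRegular: 'Brand-Regular.ttf',
  CustomBold: 'Brand-Bold.ttf',
  CustomItalic: 'Brand-Italic.ttf'
};

// Register every font on a freshly created PDFDocument so templates can
// refer to them by name, e.g. doc.font('CustomBold').
function registerFonts(doc, registry = FONT_REGISTRY, fontDir = FONT_DIR) {
  for (const [name, file] of Object.entries(registry)) {
    doc.registerFont(name, path.join(fontDir, file));
  }
  return doc;
}
```

Registration is per document, so call registerFonts right after creating each PDFDocument and before rendering any text element.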

Error Handling

Implement validation before generation starts. Check that templates have required structure (pages array, elements for each page). Validate that incoming data contains all required fields referenced in the template. Catch specific errors like missing image files (ENOENT), invalid fonts, or corrupted templates, and throw custom errors with clear messages and error codes. This helps with debugging and provides meaningful feedback to users.
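
As a rough sketch of those checks (the error class name and the error codes are illustrative, not a fixed convention):

```javascript
// Error type carrying a machine-readable code; class name and codes are illustrative.
class PdfGenerationError extends Error {
  constructor(code, message) {
    super(message);
    this.name = 'PdfGenerationError';
    this.code = code;
  }
}

// Structural check: a non-empty pages array, and an elements array on each page.
function validateTemplate(template) {
  if (!template || !Array.isArray(template.pages) || template.pages.length === 0) {
    throw new PdfGenerationError('INVALID_TEMPLATE', 'Template must have a non-empty pages array');
  }
  template.pages.forEach((page, index) => {
    if (!Array.isArray(page.elements)) {
      throw new PdfGenerationError('INVALID_TEMPLATE', `Page ${index + 1} has no elements array`);
    }
  });
}

// Returns the field names the template references but the data lacks.
function findMissingFields(template, data) {
  const required = new Set();
  for (const page of template.pages) {
    for (const element of page.elements) {
      if (element.field) required.add(element.field);
    }
  }
  return [...required].filter((field) => data[field] === undefined || data[field] === null);
}

function validateData(template, data) {
  validateTemplate(template);
  const missing = findMissingFields(template, data);
  if (missing.length > 0) {
    throw new PdfGenerationError('MISSING_FIELDS', `Missing data fields: ${missing.join(', ')}`);
  }
}

// Translate low-level failures (e.g. a missing image file) into coded errors.
function wrapError(err) {
  if (err instanceof PdfGenerationError) return err;
  if (err && err.code === 'ENOENT') {
    return new PdfGenerationError('FILE_NOT_FOUND', `File not found: ${err.path}`);
  }
  return new PdfGenerationError('GENERATION_FAILED', err && err.message ? err.message : String(err));
}
```

Calling validateData before creating the document means a bad template or incomplete order fails fast, before any file is opened or image processed.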

Scalability

For high-volume PDF generation (hundreds or thousands per day), implement a queue system using Redis and libraries like Bull or BullMQ. Instead of generating PDFs synchronously during API requests, add jobs to a queue and process them with worker processes. This prevents timeouts, allows retry logic for failures, and lets you scale horizontally by adding more workers. Track job progress and notify users when their PDF is ready.
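
With BullMQ, the producer and worker sides could be sketched like this. The queue name, job name, retry settings, and the generatePdfForOrder function are all assumptions for illustration; a running Redis server is required for the real thing.

```javascript
const QUEUE_NAME = 'pdf-generation'; // illustrative queue name

// Called from the API request handler: enqueue and return immediately.
// `queue` is a BullMQ Queue instance.
function enqueuePdfJob(queue, orderId) {
  return queue.add('generate-pdf', { orderId }, {
    attempts: 3,                                   // retry failed jobs
    backoff: { type: 'exponential', delay: 5000 }, // wait longer after each failure
    removeOnComplete: true
  });
}

// Runs in a separate worker process. `generatePdfForOrder` is a hypothetical
// function wrapping the generation flow for one order.
function startWorker(connection, generatePdfForOrder) {
  const { Worker } = require('bullmq'); // requires `npm install bullmq` and a Redis server
  const worker = new Worker(QUEUE_NAME, async (job) => {
    await job.updateProgress(10);
    const pdfPath = await generatePdfForOrder(job.data.orderId);
    await job.updateProgress(100);
    return { pdfPath };
  }, { connection, concurrency: 2 });

  worker.on('completed', (job, result) => console.log(`Job ${job.id} done: ${result.pdfPath}`));
  worker.on('failed', (job, err) => console.error(`Job ${job && job.id} failed: ${err.message}`));
  return worker;
}

// Producer wiring (sketch):
//   const { Queue } = require('bullmq');
//   const queue = new Queue(QUEUE_NAME, { connection: { host: '127.0.0.1', port: 6379 } });
//   await enqueuePdfJob(queue, 'order-1001');
```

The 'completed' handler is a natural place to trigger the customer notification.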

Example Use Case: Personalized Children’s Books

Let me walk through a complete example of generating personalized books:

The Requirements:

Multi-page book (20-30 pages)
Custom cover with child’s name and photo
Story pages with illustrations and text
Personalized throughout (child’s name in story)
Support multiple languages

Template Structure:

{
  "id": "personalized-book-adventure",
  "name": "Personalized Adventure Book",
  "pageSize": [595, 842],
  "pages": [
    {
      "pageNumber": 1,
      "type": "cover",
      "elements": [
        {
          "type": "image",
          "field": "backgroundImage",
          "x": 0,
          "y": 0,
          "width": 595,
          "height": 842
        },
        {
          "type": "image",
          "field": "childPhoto",
          "x": 197.5,
          "y": 200,
          "width": 200,
          "height": 200,
          "shape": "circle"
        },
        {
          "type": "text",
          "field": "bookTitle",
          "x": 50,
          "y": 450,
          "width": 495,
          "fontSize": 36,
          "font": "CustomBold",
          "color": "#FFFFFF",
          "align": "center"
        }
      ]
    },
    {
      "pageNumber": 2,
      "type": "story",
      "elements": [
        {
          "type": "image",
          "field": "page2Illustration",
          "x": 50,
          "y": 50,
          "width": 495,
          "height": 400
        },
        {
          "type": "text",
          "field": "page2Text",
          "x": 50,
          "y": 470,
          "width": 495,
          "fontSize": 18,
          "font": "CustomRegular",
          "lineHeight": 1.5,
          "align": "left"
        }
      ]
    }
  ]
}
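
The "shape": "circle" property on the child's photo is not something PDFKit understands by itself; the generator has to interpret it. One way to do that, assuming it means "clip the image to a circle inscribed in the element's box", is a clipping path:

```javascript
// Draw an image clipped to a circle inscribed in the element's box.
// Assumes "shape": "circle" in the template means exactly this.
function drawCircularImage(doc, imagePath, element) {
  const radius = Math.min(element.width, element.height) / 2;
  const centerX = element.x + element.width / 2;
  const centerY = element.y + element.height / 2;

  doc.save();                                  // clipping is part of the graphics state
  doc.circle(centerX, centerY, radius).clip(); // everything drawn next is clipped
  doc.image(imagePath, element.x, element.y, {
    cover: [element.width, element.height],
    align: 'center',
    valign: 'center'
  });
  doc.restore();                               // remove the clip for later elements
}
```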

The Process:
The template defines the cover page with a background image, centered child’s photo, and personalized title. Story pages follow a consistent pattern: illustration on top, text below. Data preparation fetches story content from the database, replaces placeholder tags like {childName} with the actual name, and maps each page’s illustration and text to the template’s field names. The generation flow is straightforward: fetch order → prepare data → load template → generate PDF → save to storage → notify customer.
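
The data preparation step might look like this sketch. The shapes of order and storyPages are hypothetical stand-ins for whatever your database returns; the output keys follow the page2Text / page2Illustration naming used in the template.

```javascript
// Replace {placeholder} tags using values from `vars`; unknown tags are left as-is.
function personalize(text, vars) {
  return text.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
  );
}

// Build the flat data object the template expects. `order` and `storyPages`
// are hypothetical shapes; field names follow the "pageNText" /
// "pageNIllustration" pattern used in the template.
function prepareBookData(order, storyPages) {
  const vars = { childName: order.childName };
  const data = {
    backgroundImage: order.backgroundImage,
    childPhoto: order.childPhoto,
    bookTitle: personalize(order.titleTemplate, vars)
  };
  for (const page of storyPages) {
    data[`page${page.pageNumber}Text`] = personalize(page.text, vars);
    data[`page${page.pageNumber}Illustration`] = page.illustration;
  }
  return data;
}

const data = prepareBookData(
  {
    childName: 'Luna',
    titleTemplate: "{childName}'s Big Adventure",
    backgroundImage: '/path/to/background.png',
    childPhoto: '/path/to/photo.jpg'
  },
  [{ pageNumber: 2, text: 'One morning, {childName} woke up early.', illustration: '/path/to/page2.png' }]
);
console.log(data.page2Text); // "One morning, Luna woke up early."
```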

Results:
This system successfully generates personalized books with:

Average generation time: 1.5 seconds per 24-page book
Consistent quality across thousands of orders
Support for multiple languages including RTL text
Easy template updates without code changes

Conclusion & Key Takeaways

Building a PDF generation system with PDFKit requires initial investment in architecture, but pays dividends in control, performance, and maintainability.

When PDFKit is the Right Choice:

You need precise layout control for print-ready documents
Performance and resource usage are critical
Output consistency is a requirement
You’re building templates that non-developers will manage

Key Lessons Learned:

Invest in templates early - Separating layout from code makes iterations much faster
Build preview systems - Visual feedback is essential for template development
Optimize images - Image processing is often the bottleneck
Plan for scale - Use queues for high-volume generation
Error handling matters - Detailed validation prevents production issues

What I’d Do Differently:

Start with a more structured template validation system
Implement template versioning from day one
Build comprehensive logging earlier in the process

Resources to Get Started:

PDFKit Documentation - Official docs and examples
PDFKit GitHub - Source code and issues
Practice with simple examples before building complex systems

The combination of PDFKit’s power with a well-architected template system creates a PDF generation solution that scales from dozens to thousands of documents while remaining maintainable and accessible to your entire team.