Redirected from pdforge.com? You’re in the right place. We’re now pdf noodle!

Quick Tutorial on Generating PDF from HTML with OpenPDF

Written by

Marcelo Abreu, founder of pdforge

Last Updated

Nov 4, 2024

Tags

PDF Libraries

Java

Introduction to OpenPDF for HTML to PDF Conversion

OpenPDF is an open-source Java library that enables developers to create and manipulate PDF documents programmatically. In SaaS applications, generating dynamic PDF reports from HTML content is a common requirement. While OpenPDF doesn’t support direct HTML to PDF conversion out of the box, it can be combined with other tools to achieve effective results.

You can check out the full documentation here.

Comparison Between OpenPDF and Other Java PDF Libraries

When choosing a PDF library or tool for your Java project, it’s crucial to consider features, licensing, community support, and how well it integrates with your technology stack.

  • OpenPDF: An LGPL/MPL-licensed library suitable for commercial use, focusing on PDF creation and manipulation within Java applications.

  • iText: A powerful library with extensive features, but newer versions are AGPL-licensed, which may not fit all projects due to licensing restrictions.

  • Apache PDFBox: Allows low-level PDF manipulation but lacks comprehensive HTML to PDF conversion support.

  • Flying Saucer: Specializes in rendering XHTML and CSS 2.1 to PDF, making it suitable for HTML to PDF tasks, though it’s less actively maintained.

  • Playwright: Primarily a browser automation tool that supports headless browser operations. It can render complex HTML and CSS to PDF by leveraging Chromium’s print to PDF capabilities, making it a good choice for generating PDFs from web content.

For a deeper comparison between OpenPDF and other Java PDF libraries, we also have a detailed article comparing the best PDF libraries for Java in 2025.

Guide to generate pdf from html using Java OpenPDF

Setting Up OpenPDF in Your Java Project

To start using OpenPDF, add it to your project’s dependencies.

For Maven projects:

<dependency>
    <groupId>com.github.librepdf</groupId>
    <artifactId>openpdf</artifactId>
    <version>1.3.30</version>
</dependency>

For Gradle projects:

implementation 'com.github.librepdf:openpdf:1.3.30'

Installing OpenPDF: A Quick Start Guide

  1. Add the Dependency: Include OpenPDF in your pom.xml or build.gradle.

  2. Refresh Dependencies: Update your project to fetch the new library.

  3. Verify Imports: Ensure you can import OpenPDF classes in your code.

Configuring Your Environment for HTML to PDF

Since OpenPDF doesn’t natively support HTML to PDF conversion, you’ll need to integrate it with an HTML parser like JSoup to manually map HTML elements to PDF elements.

Converting HTML to PDF with OpenPDF

Let’s walk through creating a PDF invoice by parsing an HTML template.

Creating a Complete Invoice HTML/CSS File as an Example

Create an invoice.html file:

<!DOCTYPE html>
<html>
<head>
    <style>
        body { font-family: Arial, sans-serif; }
        h1 { color: navy; }
        table { width: 100%; border-collapse: collapse; }
        th, td { border: 1px solid gray; padding: 8px; text-align: left; }
        .total { font-weight: bold; }
    </style>
</head>
<body>
    <h1>Invoice #12345</h1>
    <p>Date: 2024-11-04</p>
    <p>Customer: Jane Smith</p>
    <table>
        <tr>
            <th>Description</th><th>Quantity</th><th>Unit Price</th><th>Total</th>
        </tr>
        <tr>
            <td>Widget A</td><td>2</td><td>$25.00</td><td>$50.00</td>
        </tr>
        <tr>
            <td>Widget B</td><td>1</td><td>$75.00</td><td>$75.00</td>
        </tr>
        <tr>
            <td colspan="3" class="total">Grand Total</td><td>$125.00</td>
        </tr>
    </table>
</body>
</html>

Writing Java Code for HTML to PDF Conversion

Add JSoup to your dependencies:

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.15.3</version>
</dependency>

Implement the conversion:

import com.lowagie.text.Document;
import com.lowagie.text.Font;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfPCell;
import com.lowagie.text.pdf.PdfPTable;
import com.lowagie.text.pdf.PdfWriter;
import org.jsoup.Jsoup;
import java.awt.Color;
import java.io.File;
import java.io.FileOutputStream;

// OpenPDF 1.3.x keeps the com.lowagie.text package names inherited from iText.
// OpenPDF and jsoup both define Document and Element, and Java has no import
// aliases, so the jsoup types are referenced by their fully qualified names.
public class HtmlToPdfConverter {
    public static void main(String[] args) {
        try {
            org.jsoup.nodes.Document htmlDoc = Jsoup.parse(new File("invoice.html"), "UTF-8");
            Document pdfDoc = new Document();
            PdfWriter.getInstance(pdfDoc, new FileOutputStream("invoice.pdf"));
            pdfDoc.open();
            // Set up fonts (the built-in Helvetica font is referenced, not embedded)
            BaseFont baseFont = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.WINANSI, BaseFont.NOT_EMBEDDED);
            Font font = new Font(baseFont, 12);
            // Extract and add title
            String title = htmlDoc.select("h1").text();
            pdfDoc.add(new Paragraph(title, new Font(baseFont, 16)));
            // Extract and add date and customer info
            for (org.jsoup.nodes.Element p : htmlDoc.select("p")) {
                pdfDoc.add(new Paragraph(p.text(), font));
            }
            // Extract table data
            org.jsoup.nodes.Element table = htmlDoc.select("table").first();
            PdfPTable pdfTable = new PdfPTable(4); // Assuming 4 columns
            // Add table headers
            for (org.jsoup.nodes.Element header : table.select("th")) {
                PdfPCell cell = new PdfPCell(new Paragraph(header.text(), font));
                cell.setBackgroundColor(new Color(230, 230, 250)); // OpenPDF 1.3.x uses java.awt.Color
                pdfTable.addCell(cell);
            }
            // Add table rows (the header row is skipped)
            for (org.jsoup.nodes.Element row : table.select("tr").not(":first-child")) {
                for (org.jsoup.nodes.Element td : row.select("td")) {
                    PdfPCell cell = new PdfPCell(new Paragraph(td.text(), font));
                    // Honor colspan; otherwise the "Grand Total" row stays incomplete
                    // and PdfPTable leaves it out of the output
                    if (td.hasAttr("colspan")) {
                        cell.setColspan(Integer.parseInt(td.attr("colspan")));
                    }
                    pdfTable.addCell(cell);
                }
            }
            pdfDoc.add(pdfTable);
            pdfDoc.close();
            System.out.println("PDF generated successfully.");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Handling Dynamic Data in Your PDF

To deal with dynamic data, you can use placeholders in your HTML template and replace them at runtime.

Example:

<h1>Invoice #{{invoiceNumber}}</h1>
<p>Date: {{date}}</p>
<p>Customer: {{customerName}}</p>

In your Java code:

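import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

Map<String, String> data = new HashMap<>();
data.put("invoiceNumber", "12345");
data.put("date", "2024-11-04");
data.put("customerName", "Jane Smith");
// Read the template as text so the placeholders can be replaced before parsing
String htmlContent = new String(Files.readAllBytes(Paths.get("invoice.html")), StandardCharsets.UTF_8);
for (Map.Entry<String, String> entry : data.entrySet()) {
    htmlContent = htmlContent.replace("{{" + entry.getKey() + "}}", entry.getValue());
}
// Proceed with parsing htmlContent using JSoup (the string, not the file)
org.jsoup.nodes.Document htmlDoc = Jsoup.parse(htmlContent);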
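
The loop above inserts values verbatim, so a customer name such as "Smith & Sons <Ltd>" would change the markup that is parsed afterwards. Below is a minimal sketch of one way to guard against that, assuming a hypothetical helper class named TemplateRenderer (it is not part of OpenPDF or JSoup). It HTML-escapes each value and fills all placeholders in a single pass, leaving unknown placeholders untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of OpenPDF or JSoup.
class TemplateRenderer {

    // Matches placeholders of the form {{name}}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Escapes the characters that have special meaning in HTML text and attributes.
    static String escapeHtml(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Replaces each {{key}} with the escaped value for that key in one pass,
    // so text inside a substituted value is never treated as a placeholder.
    static String render(String template, Map<String, String> data) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(template, last, m.start());
            String value = data.get(m.group(1));
            // Unknown placeholders are left as-is so they are easy to spot
            out.append(value == null ? m.group() : escapeHtml(value));
            last = m.end();
        }
        out.append(template, last, template.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("invoiceNumber", "12345");
        data.put("customerName", "Smith & Sons <Ltd>");
        String html = render("<h1>Invoice #{{invoiceNumber}}</h1><p>Customer: {{customerName}}</p>", data);
        // prints: <h1>Invoice #12345</h1><p>Customer: Smith &amp; Sons &lt;Ltd&gt;</p>
        System.out.println(html);
    }
}
```

The rendered string can then be parsed in the same way as the htmlContent variable in the snippet above.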

Handling CSS and Images in Your PDF Output

While OpenPDF doesn’t support CSS, you can manually apply styles.

• Fonts and Colors: Use OpenPDF’s Font class to set font styles and colors.

• Images: Extract image sources from HTML and add them using OpenPDF’s Image class.

Example of adding an image:

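// Requires: import com.lowagie.text.Image;
String imageUrl = htmlDoc.select("img").attr("src");
if (!imageUrl.isEmpty()) { // attr() returns "" when the template has no <img>
    Image image = Image.getInstance(imageUrl); // accepts a file path or a URL
    pdfDoc.add(image);
}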

Example of styling text:

Font boldFont = new Font(baseFont, 12, Font.BOLD);
pdfDoc.add(new Paragraph("Bold Text", boldFont));
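
Applying styles manually usually starts with translating CSS values into Java types. The sketch below, built around a hypothetical helper named CssColors (not part of OpenPDF), converts the kinds of color values used in invoice.html into java.awt.Color, which is the color type OpenPDF 1.3.x accepts for fonts and cell backgrounds. It only handles hex notation and the two color names from the example stylesheet; anything else is rejected.

```java
import java.awt.Color;
import java.util.Locale;

// Hypothetical helper for illustration; not part of OpenPDF.
class CssColors {

    // Converts "#rrggbb", "#rgb", or one of the two names used in invoice.html.
    static Color parse(String css) {
        String value = css.trim().toLowerCase(Locale.ROOT);
        switch (value) {
            case "navy": return new Color(0, 0, 128);
            case "gray": return new Color(128, 128, 128);
            default: break;
        }
        if (value.matches("#[0-9a-f]{3}")) {
            // Expand shorthand such as #abc to #aabbcc
            StringBuilder full = new StringBuilder("#");
            for (int i = 1; i < value.length(); i++) {
                full.append(value.charAt(i)).append(value.charAt(i));
            }
            value = full.toString();
        }
        if (!value.matches("#[0-9a-f]{6}")) {
            throw new IllegalArgumentException("Unsupported CSS color: " + css);
        }
        return Color.decode(value);
    }

    public static void main(String[] args) {
        System.out.println(parse("navy"));
        System.out.println(parse("#e6e6fa"));
    }
}
```

Assuming the baseFont variable from the converter, the result could be passed to a font, for example new Font(baseFont, 16, Font.NORMAL, CssColors.parse("navy")) for the h1 rule, or to setBackgroundColor on a table cell.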

Best Practices for Using OpenPDF in Production

• Error Handling: Implement comprehensive exception management to catch and log errors.

• Resource Management: Use try-with-resources to ensure documents and streams are closed properly.

• Performance Optimization: Reuse fonts and images to optimize memory usage and performance.
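
The error-handling and resource-management points can be sketched without depending on any particular OpenPDF call. The example below assumes a hypothetical SafePdfWriter helper and a hypothetical PdfRenderer callback, inside which the OpenPDF code would run. It writes to a temporary file, closes the stream with try-with-resources, and moves the file into place only if rendering succeeds, so a failed run never leaves a half-written PDF behind.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical callback: the OpenPDF code that writes a document to the stream goes here.
interface PdfRenderer {
    void render(OutputStream out) throws Exception;
}

// Hypothetical helper for illustration; not part of OpenPDF.
class SafePdfWriter {

    static void write(Path target, PdfRenderer renderer) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "pdf-", ".tmp");
        try {
            // try-with-resources closes the stream even if rendering throws
            try (OutputStream out = Files.newOutputStream(temp)) {
                renderer.render(out);
            }
            // Publish the finished file only after rendering succeeded
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (Exception e) {
            // Remove the partial file and rethrow with context for the caller's logs
            Files.deleteIfExists(temp);
            throw new IOException("PDF generation failed for " + target, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in renderer; a real one would create the OpenPDF document on 'out'
        write(Paths.get("example-output.bin"), out -> out.write("placeholder".getBytes()));
    }
}
```

In the converter shown earlier, the body of the callback would be the code that creates the document, attaches PdfWriter to the supplied stream, and closes the document.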

Alternative: Convert HTML to PDF Using pdf noodle

Managing HTML-to-PDF conversion at scale can quickly become a nightmare, especially in serverless environments where cold starts, memory limits, and headless browser quirks love to break at the worst possible time (we even wrote a full article about it). Add constant template iterations, version control headaches, and the need to support non-technical contributors, and suddenly your “simple PDF library” turns into an ongoing engineering project.

pdf noodle eliminates all of that.

Instead of maintaining brittle infrastructure or wrestling with outdated PDF libraries, pdf noodle gives you a battle-tested PDF generation API that just works!

It is fast, scalable, and designed for both developers and non-developers. You send raw HTML or use our AI-powered template builder, and pdf noodle handles the rendering, scaling, optimization, and delivery so your team doesn’t have to.

Here's an example of a simple API request to generate your pixel-perfect PDF with just a few lines of code:

import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PdfForgeExample {
    public static void main(String[] args) {
        try {
            URL url = new URL("https://api.pdfnoodle.com/v1/html-to-pdf/sync");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Authorization", "Bearer your-api-key");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);

            String jsonInputString = "{ \"html\": \"your-html\" }";

            // Send the body as UTF-8 regardless of the platform default charset
            try (OutputStreamWriter writer = new OutputStreamWriter(conn.getOutputStream(), StandardCharsets.UTF_8)) {
                writer.write(jsonInputString);
                writer.flush();
            }

            int responseCode = conn.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the response and process the PDF
            } else {
                // Handle errors
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
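
The request body above uses the literal your-html. Real markup contains double quotes, backslashes, and line breaks, all of which must be escaped before they can sit inside a JSON string. Below is a minimal standard-library sketch, assuming a hypothetical helper named JsonBody; a JSON library such as Jackson or Gson would do the same job. Only the html field shown in the request above is used here.

```java
// Hypothetical helper for illustration; not part of the pdf noodle API.
class JsonBody {

    // Escapes a string so it can sit between double quotes in a JSON document.
    static String escapeJson(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \ uXXXX form
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds the {"html": "..."} body used in the request above.
    static String htmlRequestBody(String html) {
        return "{\"html\": \"" + escapeJson(html) + "\"}";
    }

    public static void main(String[] args) {
        String html = "<p class=\"total\">Grand Total</p>\n";
        // prints: {"html": "<p class=\"total\">Grand Total</p>\n"}
        System.out.println(htmlRequestBody(html));
    }
}
```

The returned string would take the place of jsonInputString in the request.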

pdf noodle also includes a powerful AI Agent that can generate PDF templates instantly, along with a modern, AI-assisted editor for refining the design to match your brand. You don't need development or design experience to quickly update layouts, adjust styling, and manage template versions.

You can create your account and design your first template without any upfront payment.

Conclusion

OpenPDF is a solid choice when you need fine-grained control over PDF generation and are comfortable with manual mapping from HTML to PDF elements. It’s suitable for projects where you have simple HTML content and need extensive customization of the PDF output.

If your project involves complex HTML and CSS that need to be converted to PDF, libraries like Flying Saucer or iText, or tools like Playwright may be more fitting. Playwright can render complex web pages and generate PDFs using headless browsers, which is particularly useful when your content relies heavily on modern web technologies.

If you don't want to spend time maintaining PDF layouts and their infrastructure, or keeping track of best practices for generating PDFs at scale, third-party PDF APIs like pdf noodle will save you hours of work and deliver a high-quality PDF layout.

Generating PDFs can be annoying!

Let us help you make it easier while you focus on what truly matters for your company.
