# How it works

> Learn about the algorithm and libraries powering PDF visual regression testing

## Overview

The PDF visual diff tool renders each page of both PDFs to an image, normalizes the two images to the same size, and scores them with the Structural Similarity Index (SSIM). Pages are compared one at a time, in order, and any page whose score falls below the threshold is reported as different and exported with its changes highlighted.

## Core libraries

The tool is built on three essential Python libraries:

<AccordionGroup>
  <Accordion title="PyMuPDF (fitz)" icon="file-pdf">
    Used for PDF parsing and rendering. PyMuPDF converts PDF pages into high-resolution pixmaps for comparison.

    **Key features:**

    * Fast PDF page rendering
    * Configurable DPI via zoom matrix
    * Efficient memory handling for large PDFs

    See `pdf_visual_diff.py:2` for the import and `pdf_visual_diff.py:19-20` for PDF loading.
  </Accordion>

  <Accordion title="scikit-image (SSIM)" icon="chart-line">
    Provides the Structural Similarity Index (SSIM) algorithm for quantitative image comparison.

    **Key features:**

    * Perceptual similarity score, where 1.0 means the two images are identical
    * Multi-channel support for color images
    * Compares local luminance, contrast, and structure rather than individual pixel values

    The SSIM calculation happens at `pdf_visual_diff.py:54` with a configurable threshold.
  </Accordion>

  <Accordion title="Pillow (PIL)" icon="image">
    Handles image manipulation, difference visualization, and output generation.

    **Key features:**

    * RGB pixmap conversion
    * ImageChops for pixel-level differences
    * Alpha compositing for highlighted diffs

    Image processing occurs from `pdf_visual_diff.py:42-69`.
  </Accordion>
</AccordionGroup>

## The comparison algorithm

<Steps>
  <Step title="PDF loading and validation">
    The tool opens both PDFs and validates page counts. If the PDFs have different page counts, it warns the user and compares up to the minimum page count.

    ```python theme={null}
    pdf1 = fitz.open(pdf1_path)
    pdf2 = fitz.open(pdf2_path)

    if len(pdf1) != len(pdf2):
        print(f"Warning: PDFs have different page counts...")

    page_count = min(len(pdf1), len(pdf2))
    ```

    See `pdf_visual_diff.py:19-30`
  </Step>

  <Step title="Page rendering at high resolution">
    Each page is rendered at 2x zoom. PyMuPDF renders at 72 DPI by default, so a 2x zoom matrix yields 144 DPI. Both pages are rendered with the same matrix so that the resulting images are directly comparable.

    ```python theme={null}
    zoom = 2  # DPI = 144
    mat = fitz.Matrix(zoom, zoom)

    img1 = page1.get_pixmap(matrix=mat)
    img2 = page2.get_pixmap(matrix=mat)
    ```

    The rendering happens at `pdf_visual_diff.py:31-39`
  </Step>

  <Step title="Image normalization">
    PyMuPDF pixmaps are converted to PIL RGB images. If dimensions differ, the second image is resized to match the first using LANCZOS interpolation.

    ```python theme={null}
    pil_img1 = Image.frombytes("RGB", [img1.width, img1.height], img1.samples)
    pil_img2 = Image.frombytes("RGB", [img2.width, img2.height], img2.samples)

    if pil_img1.size != pil_img2.size:
        pil_img2 = pil_img2.resize(pil_img1.size, Image.LANCZOS)
    ```

    See `pdf_visual_diff.py:42-47`
  </Step>

  <Step title="SSIM calculation">
    The Structural Similarity Index measures perceptual similarity between the two images and returns a single score, where 1.0 means identical. The default threshold is 0.999: a page whose score falls below it is recorded as different.

    ```python theme={null}
    # Convert to numpy arrays for ssim
    np_img1 = np.array(pil_img1)
    np_img2 = np.array(pil_img2)

    # Compute SSIM with color channel support
    similarity = ssim(np_img1, np_img2, channel_axis=-1, data_range=255)

    if similarity < threshold:
        diff_pages.append(i + 1)
    ```

    SSIM computation at `pdf_visual_diff.py:49-57`
  </Step>

  <Step title="Difference visualization">
    When differences are detected, ImageChops creates a pixel-level difference image. Channel differences greater than 20 (out of 255) are treated as changed, and the changed pixels are highlighted in red at roughly 50% opacity (alpha 128).

    ```python theme={null}
    # Calculate pixel differences
    diff = ImageChops.difference(pil_img1, pil_img2)

    # Threshold to make differences more visible
    thresholded_diff = diff.point(lambda p: 255 if p > 20 else 0)

    # Create red highlight overlay
    if thresholded_diff.getbbox():
        drawing_layer = Image.new("RGBA", pil_img1.size, (0,0,0,0))
        drawing_layer.paste((255,0,0,128), mask=thresholded_diff.convert('L'))
        highlighted_img = Image.alpha_composite(pil_img1.convert("RGBA"), drawing_layer)
        highlighted_img.convert("RGB").save(os.path.join(output_dir, f"diff_page_{i+1}.png"))
    ```

    Visualization logic at `pdf_visual_diff.py:59-69`
  </Step>

  <Step title="Results generation">
    The tool generates a JSON report with detailed comparison results including timestamps, page counts, diff locations, and status.

    ```python theme={null}
    results = {
        "timestamp": timestamp,
        "status": "success" if (not diff_pages and not extra_pages) else "error",
        "description": description,
        "pdf1_pages": pdf1_page_count,
        "pdf2_pages": pdf2_page_count,
        "threshold": threshold,
        "identical": not diff_pages and not extra_pages,
        "diff_pages": diff_pages,
        "extra_pages": extra_pages
    }
    ```

    Results are saved at `pdf_visual_diff.py:109-126`
  </Step>
</Steps>
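
The difference visualization step can be tried without any PDFs. The sketch below is a minimal illustration rather than the tool's own code: it builds two synthetic pages with Pillow, applies the same difference, threshold, and overlay calls shown in the step above, and returns the bounding box of the changed region. The helper name `highlight_differences`, its `pixel_threshold` parameter, and the synthetic pages are assumptions made for this example.

```python
from PIL import Image, ImageChops


def highlight_differences(img1, img2, pixel_threshold=20):
    """Return (highlighted RGB image, bounding box), or (None, None) if nothing changed.

    Hypothetical helper for illustration; both inputs must be RGB images of equal size.
    """
    # Per-channel absolute difference between the two images
    diff = ImageChops.difference(img1, img2)
    # Keep only channel differences above the threshold, then collapse to one band
    mask = diff.point(lambda p: 255 if p > pixel_threshold else 0).convert("L")
    bbox = mask.getbbox()
    if bbox is None:
        return None, None
    # Paint semi-transparent red wherever the mask is set
    overlay = Image.new("RGBA", img1.size, (0, 0, 0, 0))
    overlay.paste((255, 0, 0, 128), mask=mask)
    highlighted = Image.alpha_composite(img1.convert("RGBA"), overlay)
    return highlighted.convert("RGB"), bbox


# Two synthetic "pages": identical except for a black block on the second one
page_a = Image.new("RGB", (200, 100), (255, 255, 255))
page_b = page_a.copy()
page_b.paste((0, 0, 0), (50, 20, 80, 40))  # box is (left, upper, right, lower)

highlighted, bbox = highlight_differences(page_a, page_b)
print("Changed region:", bbox)
```

The overlay is composited onto the first image, so the output shows the first PDF's page with the changed area tinted red.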

## Handling edge cases

### Different page counts

When PDFs have different page counts, the tool compares pages up to the minimum count and exports extra pages from the longer PDF as separate images.

```python theme={null}
if len(pdf1) > len(pdf2):
    longer_pdf = "PDF1"
    for i in range(page_count, len(pdf1)):
        extra_pages.append(i + 1)
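        # Export extra page as image (pil_img is page i of pdf1 rendered to a PIL image;
        # the rendering lines are omitted from this excerpt)
        pil_img.save(os.path.join(output_dir, f"extra_page_{i+1}_only_in_pdf1.png"))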
```

See `pdf_visual_diff.py:72-89`
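
The page-count bookkeeping can be separated from rendering. The following is a minimal sketch under assumptions: `find_extra_pages` is a hypothetical helper, not a function in the tool, and the `"PDF2"` label is assumed by symmetry with the `"PDF1"` label shown above.

```python
def find_extra_pages(pdf1_page_count, pdf2_page_count):
    """Return (compared_page_count, longer_pdf, extra_pages) with 1-based page numbers.

    Hypothetical helper for illustration only.
    """
    compared = min(pdf1_page_count, pdf2_page_count)
    longest = max(pdf1_page_count, pdf2_page_count)

    if pdf1_page_count > pdf2_page_count:
        longer_pdf = "PDF1"
    elif pdf2_page_count > pdf1_page_count:
        longer_pdf = "PDF2"  # assumed label, mirroring "PDF1"
    else:
        longer_pdf = None

    # Pages beyond the shared range exist in only one of the PDFs
    extra_pages = list(range(compared + 1, longest + 1))
    return compared, longer_pdf, extra_pages


print(find_extra_pages(5, 3))  # (3, 'PDF1', [4, 5])
print(find_extra_pages(2, 2))  # (2, None, [])
```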

### Different page dimensions

When page dimensions differ, the second image is automatically resized to match the first using high-quality LANCZOS resampling (`pdf_visual_diff.py:45-47`).
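
As a small illustration of that resize, the sketch below uses blank Pillow images in place of rendered pages. The sizes are assumptions chosen to approximate A4 and US Letter pages at 2x zoom.

```python
from PIL import Image

# Blank stand-ins for two rendered pages (approximate sizes at 2x zoom)
img1 = Image.new("RGB", (1190, 1684), "white")  # roughly A4 portrait
img2 = Image.new("RGB", (1224, 1584), "white")  # roughly US Letter portrait

if img1.size != img2.size:
    # Force the second image onto the first image's pixel grid
    img2 = img2.resize(img1.size, Image.LANCZOS)

print(img1.size, img2.size)
```

Note that `resize` does not preserve the aspect ratio, so content on the second page is stretched to fit. Pages with genuinely different dimensions are therefore still likely to score below a strict threshold.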

<Note>
  The default SSIM threshold of 0.999 is very strict. You can adjust it with the `--threshold` flag to make comparisons more or less sensitive to minor variations.
</Note>

## Performance considerations

* **DPI setting**: The 2x zoom factor (144 DPI) balances quality and performance. Higher zoom captures finer detail, but the pixel count, and with it memory use, grows with the square of the zoom factor.
* **Memory usage**: Each page is processed individually and closed after comparison to manage memory efficiently.
* **Output optimization**: Only pages with detected differences, plus any extra pages from the longer PDF, generate output images, minimizing disk usage.
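
The trade-off behind the zoom factor comes down to simple arithmetic. The sketch below is an estimate under assumptions: it takes a page of 595 x 842 points (approximately A4), 3 bytes per RGB pixel, and PyMuPDF's default of 72 DPI at zoom 1. `render_estimate` is a hypothetical helper, and the pixel dimensions PyMuPDF actually produces may differ by a pixel due to rounding.

```python
POINTS_PER_INCH = 72  # PDF units per inch; also the render DPI at zoom 1


def render_estimate(width_pt, height_pt, zoom=2, channels=3):
    """Estimate DPI, pixel size, and raw RGB buffer size for one rendered page."""
    width_px = round(width_pt * zoom)
    height_px = round(height_pt * zoom)
    return {
        "dpi": POINTS_PER_INCH * zoom,
        "size_px": (width_px, height_px),
        "bytes": width_px * height_px * channels,
    }


for zoom in (1, 2, 4):
    est = render_estimate(595, 842, zoom)
    print(f"zoom {zoom}: {est['dpi']} DPI, {est['size_px']}, {est['bytes'] / 1e6:.1f} MB")
```

Doubling the zoom quadruples the buffer size, and two such buffers are held per compared page, so the zoom factor is the main lever on memory use.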

## Code references

The core comparison logic is in the `compare_pdfs()` function at `pdf_visual_diff.py:10-136`. Key sections:

* **PDF loading**: Lines 19-20
* **Page rendering**: Lines 31-39
* **Image conversion**: Lines 42-47
* **SSIM calculation**: Lines 49-57
* **Diff visualization**: Lines 59-69
* **Extra page handling**: Lines 72-89
* **Results generation**: Lines 109-126
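
For the results generation step, a report with the fields described on this page could be assembled and serialized as follows. This is a minimal sketch, not the tool's code: `build_report` is a hypothetical helper, and the ISO 8601 timestamp format is an assumption.

```python
import json
from datetime import datetime, timezone


def build_report(pdf1_pages, pdf2_pages, threshold, diff_pages, extra_pages, description=""):
    """Assemble a comparison report dictionary (hypothetical helper)."""
    identical = not diff_pages and not extra_pages
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # format is an assumption
        "status": "success" if identical else "error",
        "description": description,
        "pdf1_pages": pdf1_pages,
        "pdf2_pages": pdf2_pages,
        "threshold": threshold,
        "identical": identical,
        "diff_pages": diff_pages,
        "extra_pages": extra_pages,
    }


report = build_report(3, 4, 0.999, diff_pages=[2], extra_pages=[4])
print(json.dumps(report, indent=2))
```

Note that `status` is `"error"` whenever the PDFs are not identical, so a calling script can treat any value other than `"success"` as a failed comparison.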
