
How to extract images from a PDF — using the PDF Pro image extractor.


This guide is for the marketer reclaiming a hero photo from an old brand book, the engineer pulling diagrams out of a vendor datasheet, and anyone who needs the actual image asset that's already inside a PDF, not a fuzzy screenshot of it. The five steps below recover the original bytes the author embedded, at the original resolution.

What you'll need

A modern browser (Chrome, Edge, Firefox, or Safari from the last two years)
The PDF you want to mine for images, on your device
An understanding that vector art (logos drawn with paths) won't extract as raster — it isn't there as pixels
About two minutes — including time to filter out icon noise

The five steps

Open the in-browser extractor

Head to the PDF Pro image extractor. The page loads a WebAssembly PDF parser and runs entirely in your tab: no server round-trip, no signup, no queue. Because extraction reads the PDF's objects directly instead of rendering pages, the operation is fast: a 200-page document typically finishes in seconds, not minutes.

Drop the PDF onto the page

Drag the file in. The extractor walks the PDF's object tree, finds every XObject of subtype Image, and reads the underlying compressed stream: typically DCTDecode (JPEG), FlateDecode (zlib-compressed pixel data, the same compression PNG uses), JBIG2Decode (bilevel scans), or JPXDecode (JPEG 2000). Each image is listed with its page number, original dimensions, color space, and approximate file size.
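As a rough illustration of the inventory described above, here is a minimal Python sketch of one list entry. The filter names are the standard PDF ones; the `ImageEntry` record, the `describe` helper, and the choice of file extensions are assumptions made for this example, not the extractor's actual internals.

```python
from dataclasses import dataclass

# Standard PDF stream filter names mapped to a natural output extension.
# FlateDecode holds deflate-compressed samples, so it needs wrapping in a
# container (PNG here) rather than a straight byte copy.
FILTER_TO_EXTENSION = {
    "DCTDecode": "jpg",
    "JPXDecode": "jp2",
    "JBIG2Decode": "jb2",
    "FlateDecode": "png",
}


@dataclass
class ImageEntry:
    """Hypothetical inventory record for one embedded image."""
    page: int
    width: int
    height: int
    color_space: str
    pdf_filter: str
    size_bytes: int

    @property
    def extension(self) -> str:
        # Unknown filters fall back to a generic extension.
        return FILTER_TO_EXTENSION.get(self.pdf_filter, "bin")


def describe(entry: ImageEntry) -> str:
    """Format one line of the image list."""
    kib = entry.size_bytes / 1024
    return (f"p{entry.page}: {entry.width}x{entry.height} "
            f"{entry.color_space}, ~{kib:.0f} KiB, .{entry.extension}")


photo = ImageEntry(page=7, width=1200, height=800,
                   color_space="DeviceRGB", pdf_filter="DCTDecode",
                   size_bytes=204800)
print(describe(photo))
```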

If a "logo" you expected to see doesn't appear, it's almost certainly vector — drawn with PDF path operators rather than embedded as a raster. Vector logos can't be extracted as pixels at original quality; they have to be re-rendered (use the PNG converter at high DPI for that case).

Filter and select what you actually want

A typical brochure has dozens of tiny embedded images — bullet glyphs, header textures, repeating patterns. Set a minimum-dimension filter (300×300 is a sensible default) to hide the noise and surface only the assets you'd reasonably want. Then click to select individual images, or use "select all visible" after filtering.
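The size filter is simple to reason about. The sketch below assumes "minimum dimension" means both width and height must meet the threshold; the function name and the sample sizes are illustrative only.

```python
from typing import Iterable, List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def filter_by_min_dimension(sizes: Iterable[Size],
                            min_w: int = 300,
                            min_h: int = 300) -> List[Size]:
    """Keep only images at least min_w wide AND min_h tall (an assumption)."""
    return [(w, h) for (w, h) in sizes if w >= min_w and h >= min_h]


# A bullet glyph, a photo, a borderline asset, a thin header strip, a narrow tile.
images = [(16, 16), (1200, 800), (300, 300), (2400, 120), (299, 900)]
visible = filter_by_min_dimension(images)
print(visible)
```

Note that a wide but shallow header texture (2400×120) is hidden under this rule, which is usually what you want when hunting for photos.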

Choose preserve-original or normalize

Two output modes. Preserve original writes each image with its native bytes intact: a JPEG comes out as a .jpg with the original DCT coefficients untouched, and a FlateDecode stream, which holds losslessly compressed pixels rather than a ready-made PNG file, is wrapped in a .png container without any lossy step. This is the right choice when the asset is the goal: maximum fidelity, no lossy re-encoding. Normalize to PNG converts everything to lossless PNG, useful when you need consistent file types or the source uses an exotic encoding (JBIG2, CMYK JPEG) that some downstream tools don't handle.
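To make the "lossless PNG" idea concrete, here is a minimal sketch that wraps already-decoded 8-bit RGB samples in a PNG container using only the Python standard library. It assumes the simplest case (no predictor, no palette, no alpha); the function names are hypothetical and this is not a description of how the extractor itself writes files.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def rgb_to_png(raw: bytes, width: int, height: int) -> bytes:
    """Wrap raw 8-bit RGB samples (row-major, 3 bytes per pixel) as a PNG."""
    stride = width * 3
    if len(raw) != stride * height:
        raise ValueError("raw buffer does not match width x height x 3")
    # Every PNG scanline starts with a filter-type byte; 0 means "None".
    rows = b"".join(b"\x00" + raw[y * stride:(y + 1) * stride]
                    for y in range(height))
    # IHDR: width, height, bit depth 8, color type 2 (truecolor),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (PNG_SIGNATURE
            + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(rows))
            + _chunk(b"IEND", b""))


# A 2x2 test image: red, green / blue, white.
pixels = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255])
png = rgb_to_png(pixels, 2, 2)
```

Because the pixel values pass through unchanged and PNG compression is lossless, the output decodes to exactly the samples that went in.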

Download the images

Click any thumbnail for a single download, or hit "Download all" for a zip. Filenames follow originalname-p007-img02.jpg so you can trace each asset back to its page and ordinal position. Open one in your image viewer; if you used preserve-original, any metadata the PDF producer left inside the image stream (camera EXIF, ICC profile, creation timestamp) is intact too, though some producers strip it when the PDF is built. The whole operation happened in your browser, so there's no server-side copy of your PDF or its assets.
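The naming scheme is easy to reproduce if you post-process downloads in a script. This sketch assumes a three-digit page number and a two-digit ordinal, as the example filename suggests; `asset_name` and `bundle` are hypothetical helper names.

```python
import io
import zipfile
from pathlib import PurePath
from typing import Dict


def asset_name(pdf_name: str, page: int, ordinal: int, ext: str) -> str:
    """Build a name like originalname-p007-img02.jpg from the source PDF name."""
    stem = PurePath(pdf_name).stem
    return f"{stem}-p{page:03d}-img{ordinal:02d}.{ext}"


def bundle(files: Dict[str, bytes]) -> bytes:
    """Pack files into an in-memory zip archive.

    ZIP_STORED skips recompression, since JPEG and PNG data is already compressed.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


name = asset_name("originalname.pdf", page=7, ordinal=2, ext="jpg")
archive = bundle({name: b"placeholder bytes"})
print(name)
```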


Common mistakes & gotchas

Confusing extract with rasterize. If the goal is "the original photo," use extract. If the goal is "a flat snapshot of how the page looks," use the PDF-to-JPG converter. Two different jobs, two different tools.
Looking for a vector logo as an image. A logo drawn with PDF path operators is not stored as pixels. It will not appear in the image list. The honest options: re-render the logo's page region as PNG at high DPI, or open the PDF in Illustrator and export the paths.
Skipping the size filter. A 200-page corporate report might contain 600+ image objects, most of them bullet glyphs and repeating background tiles. Without filtering, the inventory is unusable.
Normalizing when you didn't need to. Normalizing JPEG-to-PNG inflates file size 5-10x with no visible quality gain. Only normalize when downstream tools require it.
Forgetting CMYK exists. Print-bound PDFs often embed CMYK JPEGs. Preserve-original keeps them as CMYK JPEGs, which many browsers and image viewers show with wrong colors or refuse to open. If you need a quick preview, use normalize-to-PNG (which converts CMYK to sRGB) instead.
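If you want to check whether an extracted .jpg is CMYK before sending it downstream, the component count in the JPEG frame header is a quick tell: 1 is grayscale, 3 is the usual YCbCr or RGB, and 4 means CMYK or YCCK. The sketch below walks the JPEG marker segments with the standard library only; the function name is hypothetical and the parser is deliberately minimal.

```python
import struct
from typing import Optional

# Start-of-frame markers. 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) sit in the
# same numeric range but are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
# Markers that carry no length field.
STANDALONE = {0x01} | set(range(0xD0, 0xD9))


def jpeg_component_count(data: bytes) -> Optional[int]:
    """Return the number of color components, or None if no frame header is found."""
    if data[:2] != b"\xff\xd8":          # every JPEG starts with SOI
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None                   # lost marker sync
        marker = data[pos + 1]
        if marker == 0xFF:                # fill byte before a marker
            pos += 1
            continue
        if marker in STANDALONE:
            pos += 2
            continue
        if marker in (0xD9, 0xDA):        # EOI or start of scan: no frame seen
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Layout: marker(2) length(2) precision(1) height(2) width(2) count(1)
            if pos + 10 > len(data):
                return None
            return data[pos + 9]
        pos += 2 + length                 # skip this segment
    return None


def fake_jpeg_header(components: int) -> bytes:
    """Build a minimal header (SOI + APP0 + SOF0) for demonstration only."""
    app0 = b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9)
    sof0 = (b"\xff\xc0"
            + struct.pack(">HBHHB", 8 + 3 * components, 8, 100, 200, components)
            + bytes(3 * components))
    return b"\xff\xd8" + app0 + sof0
```

A result of 4 is a good cue to use normalize-to-PNG for anything that will be viewed on screen.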

Troubleshooting

The extractor says "0 images found" but the PDF clearly has graphics.

The graphics are vector, not raster. PDF can render shapes, illustrations, and many "logos" as path data — there is no embedded pixel asset to extract. Re-render the page (or a crop of it) using the PNG converter at 600 DPI to capture vector art as a high-quality bitmap.
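It helps to know how large a 600 DPI render will be before you ask for one. PDF page sizes are expressed in points, with 72 points to the inch, so the pixel dimensions follow directly. The helper names below are illustrative, and the memory figure assumes uncompressed 8-bit RGB.

```python
from typing import Tuple


def raster_size(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI (72 pt = 1 inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


def uncompressed_rgb_bytes(width_px: int, height_px: int) -> int:
    """Memory needed for the bitmap at 3 bytes per pixel."""
    return width_px * height_px * 3


# US Letter is 612 x 792 points (8.5 x 11 inches).
w, h = raster_size(612, 792, 600)
print(w, h, uncompressed_rgb_bytes(w, h))
```

A full Letter page at 600 DPI comes to 5100×6600 pixels, roughly 100 MB uncompressed, so cropping to the logo's region before rendering is usually worth the extra step.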

An extracted image is split into many tiles instead of one whole picture.

Some PDF producers (older InDesign exports, scanner software) split large images into strips or small tiles, 256×256 for example. The extractor will list each piece as a separate image. The fix: use rasterize-the-page mode instead, which gives you the assembled visual at the cost of one re-encoding pass.

Extracted JPEGs look right in the PDF but have wrong colors when opened.

Almost always a CMYK-vs-sRGB mismatch. The PDF embedded a CMYK JPEG and your viewer is interpreting it as sRGB. Re-extract with normalize-to-PNG enabled — the converter will apply the correct color transform on the way out.
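To see why the colors shift, it helps to look at what a CMYK-to-RGB transform does at its simplest. The formula below is the naive, profile-free approximation, not a color-managed conversion and not necessarily what the converter uses; a faithful transform goes through ICC profiles. It also assumes ink values run from 0 (no ink) to 255 (full ink), whereas some encoders store them inverted.

```python
from typing import Tuple


def cmyk_to_rgb_naive(c: int, m: int, y: int, k: int) -> Tuple[int, int, int]:
    """Approximate RGB from CMYK ink values in 0..255 (255 = full ink).

    Each RGB channel is what remains after its opposing ink and black
    have absorbed light: R = 255 * (1 - C) * (1 - K), and so on.
    """
    def channel(ink: int) -> int:
        return round(255 * (1 - ink / 255) * (1 - k / 255))

    return channel(c), channel(m), channel(y)


print(cmyk_to_rgb_naive(0, 0, 0, 0))      # no ink at all
print(cmyk_to_rgb_naive(255, 0, 0, 0))    # full cyan only
print(cmyk_to_rgb_naive(0, 0, 0, 255))    # full black only
```

A viewer that skips this step and reads the four channels as if they were RGB data will produce the inverted or washed-out look described above.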

I see the same image listed five times across the PDF.

Either the same image is placed on five pages (very common — headers, watermarks), or the PDF has duplicated the image stream rather than referencing it once. Enable "deduplicate identical streams" before downloading and the inventory collapses to one entry per unique asset.
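Deduplicating identical streams comes down to hashing the raw bytes and grouping by digest. The sketch below uses SHA-256 from the Python standard library; the function name and the sample labels are hypothetical, and the extractor may use a different hash or comparison.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def group_identical(streams: Iterable[Tuple[str, bytes]]) -> Dict[str, List[str]]:
    """Group stream labels by the SHA-256 digest of their bytes.

    Streams with identical bytes land in the same group, in input order.
    """
    groups: Dict[str, List[str]] = {}
    for label, data in streams:
        digest = hashlib.sha256(data).hexdigest()
        groups.setdefault(digest, []).append(label)
    return groups


streams = [
    ("p001-img01", b"header-logo-bytes"),
    ("p002-img01", b"header-logo-bytes"),   # same bytes, different page
    ("p002-img02", b"product-photo-bytes"),
]
unique = group_identical(streams)
print(len(unique), list(unique.values()))
```

Byte-level hashing only catches exact copies. The same picture saved twice at different quality settings hashes differently and stays as two entries.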

The PDF is password-protected. Can I still extract?

Yes, if you know the password. The extractor prompts for it on load and decrypts the document's objects in your browser; the password is held in memory only and discarded when you close the tab. If you don't know the password, the extractor, like every honest tool, won't help you bypass it.

Ready to extract?

Open the in-browser image extractor and run your PDF through the five steps above.
