Why Is My PDF So Big? (And How to Shrink It)

You finished a two-page document, exported it as a PDF, and somehow ended up with a 40 MB file that refuses to attach to an email. It is a surprisingly common experience. A PDF is not a single thing — it is a container that bundles text, fonts, images, vector graphics, and metadata into one file. When that file is huge, the size almost always comes from a few specific culprits. Once you know which one is to blame, shrinking the PDF is straightforward. This guide walks through each cause and the fix that actually works for it.

What is actually inside a PDF

A PDF stores content as a collection of objects: streams of text, image data, font programs, and the instructions that lay them out on the page. The format is defined by an open ISO standard, and the PDF Association’s overview of ISO 32000 is a good reference if you want the formal details. The practical takeaway is simple: text and vector graphics are tiny, while images are enormous by comparison. A full page of dense text might be a few kilobytes. A single full-page photo can be several megabytes. So when a PDF is unexpectedly large, the first question to ask is almost always: where are the images coming from?

Cause #1: High-resolution embedded images

This is the single most common reason a PDF balloons in size. When you drop a photo from a modern phone or camera into a document, it can be 4000 pixels wide or more. Even though it is displayed at a fraction of that size on the page, the PDF often stores the full resolution. For a document meant to be read on screen or printed at normal quality, you rarely need more than 150 pixels per inch (PPI); for high-quality print, 300 PPI is plenty. Anything beyond that is wasted weight.
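To make the wasted weight concrete, here is a small arithmetic sketch. The numbers are illustrative assumptions (a 4000-pixel-wide photo placed 5 inches wide on the page), and the function names are made up for this example.

```python
def effective_ppi(pixel_width: int, placed_width_in: float) -> float:
    """Pixels per inch of an image once it is placed at a given width on the page."""
    return pixel_width / placed_width_in


def target_pixel_width(placed_width_in: float, target_ppi: int = 150) -> int:
    """Pixel width needed to reach target_ppi at the placed size."""
    return round(placed_width_in * target_ppi)


# Assumed example: a 4000 px wide photo shown 5 inches wide.
ppi = effective_ppi(4000, 5.0)          # 800 PPI, far above the 150 PPI target
needed = target_pixel_width(5.0, 150)   # 750 px is enough for screen reading
pixel_ratio = (4000 / needed) ** 2      # about 28x more pixels stored than needed

print(f"effective PPI: {ppi:.0f}, needed width: {needed} px, excess: {pixel_ratio:.1f}x")
```

The excess is squared because both width and height shrink when you resize, which is why downsampling pays off so quickly.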

The fix is to downsample and recompress those images before or after they go into the PDF. If you control the source images, shrink them first: run them through the JPEG compressor at quality 70–80 and resize them to the dimensions you actually need. If the images are already baked into the PDF, run the whole file through the PDF compressor, which re-encodes the embedded images at a lower, sensible resolution. For photographic content, this step alone often cuts the file size by 80 percent or more.

Cause #2: Scanned pages

A scanned document is a special, and especially heavy, case. When you scan paper, every page becomes a full-page image: there is no real text in the file at all, just pictures of text. Scanners frequently default to very high resolution (600 DPI is common) and to full color, which produces massive files for what is essentially black ink on white paper.

There are two things you can do. First, scan smarter: for a text document, scanning at 200–300 DPI in grayscale or black-and-white rather than 600 DPI in color can reduce the size dramatically while staying perfectly legible. Second, for scans you already have, recompress them — passing the file through the PDF compressor re-encodes those page-images at a more reasonable resolution and quality. Be realistic about the limit, though: a scan is fundamentally a stack of images, so it will always be larger than a born-digital text PDF of the same length.
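A rough calculation shows why the scan settings matter so much. The sketch below computes the uncompressed pixel data for one US Letter page (8.5 × 11 inches, an assumption for the example) at different settings. Real files are smaller because the images are compressed, but the ratios give a feel for the difference.

```python
def raw_scan_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# One US Letter page at three common settings.
color_600 = raw_scan_bytes(8.5, 11, 600, 24)  # 24-bit color
gray_300 = raw_scan_bytes(8.5, 11, 300, 8)    # 8-bit grayscale
bw_300 = raw_scan_bytes(8.5, 11, 300, 1)      # 1-bit black and white

print(f"600 DPI color:     {color_600 / 1e6:.1f} MB")
print(f"300 DPI grayscale: {gray_300 / 1e6:.1f} MB")
print(f"300 DPI b/w:       {bw_300 / 1e6:.1f} MB")
print(f"color vs grayscale: {color_600 / gray_300:.0f}x")
```

Halving the DPI cuts the pixel count by four, and dropping from color to grayscale cuts the bytes per pixel by three, so the two changes together are worth a factor of twelve before compression is even applied.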

Cause #3: Embedded fonts and duplicated objects

PDFs embed the fonts they use so the document looks identical on every device, which is great for fidelity but adds weight. A well-behaved exporter subsets fonts, embedding only the characters actually used. Poorly configured tools embed every font in full, and a document that uses several decorative typefaces can carry several megabytes of font data it barely touches. Sticking to a small set of common fonts, and letting your software subset them, keeps this in check.

Then there is plain inefficiency. Some software writes the same image or resource into the file multiple times instead of referencing it once — a logo that appears in a header on every page can get embedded dozens of times. Repeated “Save As” cycles and incremental edits can also leave behind orphaned, superseded objects that never get cleaned up. Re-saving the PDF through a tool that rewrites and de-duplicates its object structure clears this out automatically.
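The idea behind de-duplication can be sketched in a few lines. This is a toy model, not a PDF parser: it treats each resource as a named blob of bytes, stores each distinct blob once under its hash, and lets every page point to that single copy. The resource names and the 50 KB logo are invented for the example.

```python
import hashlib


def deduplicate(resources):
    """Store each distinct blob once and map every resource name to its digest.

    resources: list of (name, data) pairs, where data is bytes.
    Returns (store, refs): store maps digest -> data, refs maps name -> digest.
    """
    store = {}
    refs = {}
    for name, data in resources:
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first copy of identical bytes
        refs[name] = digest             # every use becomes a reference to that copy
    return store, refs


# Assumed example: the same logo embedded separately on 30 pages.
logo = b"\x89PNG" + bytes(50_000)
pages = [(f"page{i}/logo", logo) for i in range(1, 31)]

store, refs = deduplicate(pages)
before = sum(len(data) for _, data in pages)
after = sum(len(data) for data in store.values())

print(f"before: {before} bytes, after: {after} bytes, unique blobs: {len(store)}")
```

A tool that rewrites a PDF's object structure does something similar in spirit, replacing repeated copies with references to one shared object.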

Cause #4: Metadata, attachments, and leftovers

Smaller contributors add up. PDFs can carry document metadata, XMP packets, embedded color profiles, thumbnails, form-field data, JavaScript, and even whole file attachments. Documents exported from design software sometimes include hidden layers, comments, or revision history. None of this is visible on the page, but all of it counts toward the byte total. Optimizing or “flattening” the PDF strips the parts you do not need while leaving the visible content intact.

When a PDF won’t shrink much

Compression is not magic, and it helps to know when you have hit the floor. If your PDF is mostly real text and vector graphics — a born-digital report, a contract, a code listing — it is probably already small, and there is little to gain. The bytes are doing useful work, and squeezing harder will not help. The big wins come almost entirely from images: high-resolution photos and scanned pages. If recompressing those does not move the needle, your file likely was not image-heavy to begin with.

A quick way to diagnose this: ask yourself whether you can select and copy the text in the document. If you can, the file is most likely born-digital, and the size is coming from embedded images, fonts, or leftovers. (A scan that has been run through OCR also has selectable text, so check whether each page is still one large picture underneath.) If you cannot, and the text is just part of a picture, you have a scan, and downsampling those page-images is your biggest lever.
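If you would rather check from a script, one rough heuristic is to count how many image objects the file declares. The sketch below scans the raw bytes for image dictionaries, which PDFs mark with /Subtype /Image. It is an approximation under stated assumptions: it is not a real parser, it ignores inline images, and a stray match inside compressed data is possible. The function name and sample bytes are made up for the example.

```python
import re
from pathlib import Path

# Image XObjects carry "/Subtype /Image" in their dictionary (whitespace is optional).
# The trailing \b avoids matching longer names such as /ImageMask.
IMAGE_XOBJECT = re.compile(rb"/Subtype\s*/Image\b")


def count_image_xobjects(pdf_bytes: bytes) -> int:
    """Rough count of image objects declared in a PDF's raw bytes."""
    return len(IMAGE_XOBJECT.findall(pdf_bytes))


def count_images_in_file(path: str) -> int:
    """Convenience wrapper that reads the file from disk first."""
    return count_image_xobjects(Path(path).read_bytes())


# Tiny synthetic sample standing in for real PDF content.
sample = (
    b"<< /Type /XObject /Subtype /Image /Width 10 >> "
    b"<< /Subtype/Image/ImageMask true >> "
    b"<< /Type /XObject /Subtype /Form >>"
)
print(count_image_xobjects(sample))
```

A count close to, or above, the number of pages suggests a scan or an image-heavy layout. A count near zero on a large file points toward fonts, attachments, or leftover objects instead.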

Shrinking yours, privately

The good news is that fixing all of this does not require uploading your document anywhere. The FileShrinking PDF compressor runs entirely in your browser, so a confidential contract or a scanned ID never leaves your device — it is processed locally and nothing is sent to a server. Start there for any large PDF; if you are assembling a document from your own photos, pre-shrink them with the JPEG compressor first so the bloat never gets in. Between the two, most oversized PDFs come down to a manageable, email-friendly size in a single pass.