Browser-Side Parsing · File Binary Stays Local

Summarize a PDF whose binary never leaves your browser.

Only the extracted text passages travel — never the file itself.

pdf.js parses locally. The AI synthesis runs server-side on text passages only.

✔ File bytes never upload ✔ Embedded fonts & images stay local ✔ Verifiable in DevTools

This page treats privacy as a technical claim you can verify. The PDF is parsed by pdf.js in your browser; the file binary, embedded fonts, and images never traverse the network. When you trigger summarization, the request sent to the AI carries only the extracted text passages required for synthesis — not the document itself.

If you handle NDA-bound material or regulated records, removing the file upload from the threat model is the meaningful reduction: no server-side copy of the PDF, no holding window, no third-party storage of the document. Pair it with end-to-end encrypted transfer when you need to share the original alongside the summary.

File stays in browser · Text passages only · DevTools-verifiable · GDPR-friendlier

Open the Summarizer · Verify it yourself

Why "no upload" actually matters

Privacy isn't an aesthetic; it's a constraint. These are the situations where uploading the PDF is not a trade-off but a non-starter.

NDA-bound documents

M&A drafts, term sheets, source-code reviews, supplier contracts. If the NDA bars third-party processors, uploading the file to a SaaS summarizer would breach it. Parsing in the browser means the file itself is never uploaded.

Regulated industries

Healthcare, finance, legal, and public-sector workflows have hard rules about where personally identifiable or privileged data can be sent. Because the file never uploads, there is no server-side copy of the document to account for; what remains in scope is the extracted text sent for synthesis.

Sensitive client work

Litigation strategy memos, compensation grids, board decks. The risk of an unaudited server holding even a transient copy is professional, not theoretical. Zero-upload removes the holding period.

Files you can't put on someone else's server

Internal-only research, pre-publication manuscripts, security audits, classified attachments. If policy says the file "must not leave the device," uploading it to a server-side summarizer is off the table. Here the file stays on the device and only the extracted text passages are sent, so confirm that your policy permits that.

How to verify the file binary doesn't upload

Treat this like a security audit. Three steps, thirty seconds — you check the request payload yourself.

Open DevTools → Network

Press F12 (or Cmd+Option+I on macOS) and click the Network tab. Use the Fetch/XHR filter so static-asset noise doesn't distract you. Click the clear (⊘) button to start with an empty log.

Drop your PDF and run the summarizer

Open the summarizer, drop a file in, and click summarize. Dropping the file triggers no upload — pdf.js parses it locally. Clicking summarize fires one request to the AI endpoint.

Inspect the request payload

Click the summarize request in the Network panel and open the Payload tab. You will see the extracted text passages — never a binary blob the size of your PDF. The payload size will be a few KB regardless of whether you summarized a 2 MB or a 200 MB document.
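The manual check above can also be scripted against a captured request body. The sketch below is a hypothetical helper, not part of the tool: `inspectPayload` and `asciiBytes` are illustrative names. It reports the body size and looks for the `%PDF-` marker that opens a PDF file, either raw or base64-encoded (base64 of `%PDF-1` or `%PDF-2` begins with `JVBERi0`). Treat it as a heuristic: a text passage that happens to quote those characters would also match.

```typescript
// Hypothetical audit helper: checks a captured request body for signs
// that a PDF binary was embedded in it.
const PDF_MAGIC = "%PDF-";          // ASCII marker at the start of a PDF file
const PDF_MAGIC_BASE64 = "JVBERi0"; // the same marker after base64 encoding

interface PayloadReport {
  byteLength: number;
  hasRawPdfHeader: boolean;
  hasBase64PdfHeader: boolean;
}

function inspectPayload(body: Uint8Array): PayloadReport {
  // Map each byte to one character so binary content survives the search.
  const text = Array.from(body, (b) => String.fromCharCode(b)).join("");
  return {
    byteLength: body.length,
    hasRawPdfHeader: text.includes(PDF_MAGIC),
    hasBase64PdfHeader: text.includes(PDF_MAGIC_BASE64),
  };
}

// Test helper: turn an ASCII string into bytes.
function asciiBytes(s: string): Uint8Array {
  const out = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) {
    out[i] = s.charCodeAt(i) & 0xff;
  }
  return out;
}

// A text-only body of the kind the Payload tab should show.
const report = inspectPayload(
  asciiBytes('{"passages":["Clause 4.2 limits liability."]}')
);
console.log(report.hasRawPdfHeader, report.hasBase64PdfHeader); // false false
```

A small `byteLength` together with two `false` flags matches what the Payload tab should show for a text-only request.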

[Illustration: the DevTools Network panel showing one request, /api/summarize, with a 4.2 KB payload of extracted text passages only. File bytes sent: 0.]

What runs in your browser vs server-side

Four stages run client-side; one runs on a hosted LLM. The split is intentional: only what crosses that boundary, the selected text passages, travels over the network.

PDF parsing

pdf.js reads pages, fonts, and content streams locally in your tab.
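As a rough illustration of this stage, the sketch below walks a document page by page and collects its text. It is not the tool's actual code. The `DocLike`, `PageLike`, and `TextItemLike` interfaces are assumptions that mirror only the small part of pdf.js used here (`numPages`, `getPage`, `getTextContent`), so the function can run against a stand-in object.

```typescript
// Structural types mirroring the subset of pdf.js this sketch relies on.
// Real pdf.js items may also include marked-content entries with no `str`.
interface TextItemLike { str?: string }
interface PageLike { getTextContent(): Promise<{ items: TextItemLike[] }> }
interface DocLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

interface PageText { page: number; text: string }

// Walk every page and collect its text. Nothing here touches the network.
async function extractPageTexts(doc: DocLike): Promise<PageText[]> {
  const pages: PageText[] = [];
  for (let n = 1; n <= doc.numPages; n++) { // pdf.js pages are 1-indexed
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    const text = content.items
      .map((item) => item.str)
      .filter((s): s is string => typeof s === "string")
      .join(" ");
    pages.push({ page: n, text });
  }
  return pages;
}
```

In a browser, `doc` would typically come from something like `pdfjsLib.getDocument({ data: await file.arrayBuffer() }).promise`, where `file` is the dropped `File` object. The bytes go from the file picker into the tab's memory and nowhere else.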

Text extraction

Glyph runs are reflowed into clean paragraphs with page-position metadata.
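One plausible way to do this reflow is sketched below; it is not necessarily how the tool does it. The sketch assumes each text item carries its string, a baseline `y` coordinate (in pdf.js this is the last element of the item's `transform` matrix), and an end-of-line flag such as pdf.js's `hasEOL`. The `gapThreshold` value is an arbitrary illustration.

```typescript
interface PositionedItem { str: string; y: number; hasEOL: boolean }
interface Paragraph { page: number; y: number; text: string }

// Join glyph runs into lines, then merge lines into paragraphs whenever the
// vertical gap between consecutive lines is small enough.
function reflow(page: number, items: PositionedItem[], gapThreshold = 18): Paragraph[] {
  const lines: { y: number; text: string }[] = [];
  let current = "";
  let currentY: number | null = null;
  for (const item of items) {
    if (currentY === null) currentY = item.y;
    current += item.str;
    if (item.hasEOL) {
      lines.push({ y: currentY, text: current.trim() });
      current = "";
      currentY = null;
    }
  }
  // Flush a trailing line that had no end-of-line marker.
  if (currentY !== null && current.trim() !== "") {
    lines.push({ y: currentY, text: current.trim() });
  }

  const paragraphs: Paragraph[] = [];
  let prevY: number | null = null;
  for (const line of lines.filter((l) => l.text !== "")) {
    const last = paragraphs[paragraphs.length - 1];
    const gap = prevY === null ? 0 : Math.abs(prevY - line.y);
    if (last !== undefined && gap <= gapThreshold) {
      last.text += " " + line.text; // same paragraph: append the line
    } else {
      // New paragraph: remember where it starts on the page.
      paragraphs.push({ page, y: line.y, text: line.text });
    }
    prevY = line.y;
  }
  return paragraphs;
}
```

Keeping `page` and `y` on each paragraph is what later lets a citation point back to a position in the local document.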

Chunk & select

Passages required for the summary are picked client-side; the rest never travels.
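The page does not say how passages are picked, so the following is only a stand-in to make the idea concrete: score each passage by simple keyword overlap with the user's request, then keep the best ones that fit a character budget. `selectPassages`, the scoring rule, and the budget are all hypothetical.

```typescript
interface Passage { page: number; text: string }

// Pick the passages most relevant to `query` without exceeding `maxChars`.
// Keyword overlap is a placeholder for whatever selection the tool uses.
function selectPassages(passages: Passage[], query: string, maxChars: number): Passage[] {
  const terms = query.toLowerCase().split(/\W+/).filter((t) => t.length > 2);
  const scored = passages.map((passage, index) => {
    const lower = passage.text.toLowerCase();
    const score = terms.reduce((n, t) => n + (lower.includes(t) ? 1 : 0), 0);
    return { passage, index, score };
  });
  // Highest score first; ties keep document order.
  scored.sort((a, b) => b.score - a.score || a.index - b.index);

  const picked: typeof scored = [];
  let used = 0;
  for (const candidate of scored) {
    if (used + candidate.passage.text.length > maxChars) continue; // over budget
    picked.push(candidate);
    used += candidate.passage.text.length;
  }
  // Return in document order so citations read naturally.
  return picked.sort((a, b) => a.index - b.index).map((c) => c.passage);
}
```

Anything not returned by the selection step simply stays in the tab, which is the property the stage description relies on.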

AI synthesis (server)

Selected text passages are sent to a hosted LLM (Anthropic Claude). The PDF binary is not.
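To make the boundary concrete, here is a sketch of how such a request could be assembled. The `/api/summarize` path is the one shown in the Network panel illustration on this page; the JSON body shape (`passages` with `page` and `text`) is an assumption for illustration.

```typescript
interface Passage { page: number; text: string }

interface SummarizeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the one request that leaves the browser. Only page numbers and plain
// text are serialized; no File, Blob, or ArrayBuffer is ever referenced.
function buildSummarizeRequest(passages: Passage[]): SummarizeRequest {
  return {
    url: "/api/summarize",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      // Copy only the two known fields so stray properties cannot leak out.
      passages: passages.map(({ page, text }) => ({ page, text })),
    }),
  };
}
```

The result could be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Because the body is a JSON string built from text fields, its size tracks the selected text rather than the source file.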

Output rendering

The summary is composed in the tab with page citations linked back to local source positions.
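A minimal sketch of that linking step follows, under the assumption that each summary bullet comes back with the ids of the passages it drew on. The response shape (`passageIds`) and the function name are hypothetical; the point is that the lookup table of positions lives only in the tab.

```typescript
interface SourcePosition { page: number; y: number }
interface Bullet { text: string; passageIds: number[] } // hypothetical response shape
interface CitedBullet { text: string; citations: SourcePosition[] }

// Attach local source positions to each bullet. Ids that do not resolve to a
// known passage are dropped rather than guessed.
function linkCitations(
  bullets: Bullet[],
  positions: Map<number, SourcePosition>
): CitedBullet[] {
  return bullets.map((bullet) => ({
    text: bullet.text,
    citations: bullet.passageIds
      .map((id) => positions.get(id))
      .filter((pos): pos is SourcePosition => pos !== undefined),
  }));
}
```

The positions never have to be sent anywhere: the server only needs ids, and the browser resolves them back to pages.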

File bytes uploaded: none. Only the AI synthesis stage sends anything off your device, and it carries text passages, never the file binary, fonts, or images.

Cloud upload vs in-browser

Same end result — a summary of your PDF — produced by two architectures with very different threat models.

Cloud upload summarizer

The full PDF binary traverses the public internet to a server you don't control.
A server-side process holds the file (even briefly) in storage you can't audit.
Embedded fonts, images, and metadata travel along with the document text.
File-retention windows, access logs, and breach exposure all apply to the binary.
The provider sees the document's filename, size, and structure, not just its content.

PDF Pro · text-passages-only

The PDF binary stays in the browser tab — pdf.js parses it locally.
No server-side copy of the file ever exists. There is nothing to retain or leak.
Embedded fonts, images, and metadata never travel over the network.
Only the extracted text passages required for the requested summary are sent to the AI.
Page citations are derived in your browser from local source positions, then linked back to the AI's bullets.
Closing the tab releases the parsed PDF from memory — there is no server-side trace of the file.

When keeping the file binary local matters

Some workflows treat the full document — fonts, images, embedded metadata — as more sensitive than its plain text. These are the contexts where the file-vs-passages distinction is the requirement.

Documents whose binary is sensitive

PDFs whose embedded fonts, images, or metadata reveal source systems, watermarks, or internal markings — even when the prose itself is shareable. Keeping the binary in the browser prevents that fingerprint from reaching any third-party server.

Bandwidth-constrained networks

A 200 MB binder over a coffee-shop or in-flight connection takes minutes to upload before anything happens. Parsing locally and sending only the text passages collapses that to a few KB of payload regardless of source-file size.
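The arithmetic behind that claim is easy to check. The sketch below assumes a 5 Mbit/s uplink, which is an illustrative figure rather than a measurement, and ignores latency and protocol overhead. The 4.2 KB payload is the size shown in the Network panel illustration on this page.

```typescript
// Idealized transfer time: payload bits divided by uplink bits per second.
function transferSeconds(bytes: number, uplinkMbps: number): number {
  const bits = bytes * 8;
  return bits / (uplinkMbps * 1_000_000);
}

const MB = 1_000_000;
const UPLINK_MBPS = 5; // assumed uplink speed for illustration

const fullUpload = transferSeconds(200 * MB, UPLINK_MBPS); // whole 200 MB file
const textOnly = transferSeconds(4_200, UPLINK_MBPS);      // 4.2 KB of passages

console.log(fullUpload); // 320 seconds, a little over five minutes
console.log(textOnly.toFixed(4)); // "0.0067" seconds
```

On those assumptions the full upload takes several minutes while the text payload is effectively instantaneous, and the gap grows linearly with file size.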

NDA-bound material

When an NDA forbids transmitting the document itself but is silent on summaries, the file-stays-local architecture lets you stay inside the letter of the agreement: no copy of the PDF reaches a third party, only the text required for synthesis.

Frequently asked questions

Can I really verify the file doesn't upload?

Yes. Open Chrome DevTools (F12), switch to the Network tab, filter by Fetch/XHR, and clear the log. Drop a PDF into the summarizer. Dropping the file triggers no upload — pdf.js parses it inside the tab. When you click summarize, click the resulting request and open the Payload tab: you will see the extracted text passages, not a binary blob the size of your PDF. The payload size is a few KB regardless of source-file size, which is the proof that the binary stayed local.

Does the summarizer need an internet connection?

You can load and parse a PDF offline once the page is cached, but the summary itself requires a connection. The AI synthesis runs server-side on a hosted LLM (Anthropic Claude), so the extracted text passages have to make a network round-trip to the API. The file binary does not — only the text the AI needs to write the summary.

What about the AI model — isn't it server-hosted?

Yes — the LLM that writes the summary is hosted (Anthropic Claude via API). What is not hosted is the PDF parsing, text extraction, chunking, and citation linking — those run in your browser via pdf.js. The privacy claim is precise and bounded: your PDF binary, embedded fonts, and images never travel to our servers or to the AI provider. Only the extracted text passages required for the requested summary cross the wire. If your concern is "does the file itself reach a third party," the answer is no.

Why does the page take a moment to load before I can drop a file?

That delay is the browser fetching pdf.js and the page assets into its local cache. After the first load, parsing a new PDF needs no network fetch; only the AI synthesis call (which carries the extracted text, not the file) uses the network.

Is there a file-size limit?

There is no server-side upload cap because the file binary never uploads. The practical ceiling is your device's available memory, since pdf.js loads the PDF into the tab to extract text. A typical laptop handles 200–400 page PDFs comfortably; longer documents are best summarized per chapter. Mobile browsers have tighter memory limits, so very long PDFs are best processed on desktop. The summarizer will not throttle or reject based on source-file size — what it meters is the number of AI summary calls per month.
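When a document has no convenient chapter boundaries, fixed page ranges are a simple substitute. The helper below is a hypothetical sketch that splits a page count into inclusive ranges; the batch size of 400 is only an example taken from the upper end of the range mentioned above.

```typescript
// Split pages 1..numPages into inclusive [start, end] ranges of at most
// `batchSize` pages each.
function pageBatches(numPages: number, batchSize: number): Array<[number, number]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: Array<[number, number]> = [];
  for (let start = 1; start <= numPages; start += batchSize) {
    batches.push([start, Math.min(start + batchSize - 1, numPages)]);
  }
  return batches;
}

console.log(pageBatches(950, 400)); // [[1, 400], [401, 800], [801, 950]]
```

Each range would be summarized separately, so a larger batch size means fewer metered summary calls but more memory in use at once.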

Summarize your PDF without uploading the file.

Open the summarizer, drop a file, read the summary. Then open DevTools, inspect the request payload, and confirm: text passages, not the binary.

Open the Summarizer