Does the summarizer work on scanned PDFs?

Only after OCR. A scanned PDF is a stack of images — there's no text layer to summarize until characters are recognized. Run the file through an in-browser OCR pass first, then the summarizer can extract key points. If the OCR confidence is low, the TL;DR will flag uncertain passages instead of inventing content.
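
The flagging behavior described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `OcrPassage` type, the 0.80 cutoff, and the `flag_uncertain` helper are all hypothetical, and it is written in Python for brevity even though the tool itself runs in the browser.

```python
from dataclasses import dataclass


@dataclass
class OcrPassage:
    page: int
    text: str
    confidence: float  # 0.0-1.0, as reported by a hypothetical OCR engine


# Hypothetical cutoff; the tool's real threshold is not documented.
LOW_CONFIDENCE = 0.80


def flag_uncertain(passages, threshold=LOW_CONFIDENCE):
    """Return (text, is_uncertain) pairs so weak passages are marked, not dropped."""
    return [(p.text, p.confidence < threshold) for p in passages]


passages = [
    OcrPassage(1, "Revenue rose 14% year over year.", 0.97),
    OcrPassage(2, "Operat1ng marg1n c0mpressed", 0.41),
]
flags = flag_uncertain(passages)
```

Keeping the low-confidence passage and marking it, rather than discarding it, lets the summary show the reader where the source itself is unreliable.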

How long can the input PDF be?

The practical ceiling is around 800 pages of dense text, or roughly 400,000 tokens, handled by chunking and progressive distillation. Long documents are split into sections, each summarized separately, then merged into a final TL;DR. Browser memory is the real limit: a modern laptop handles a 60-page report in seconds and a 600-page legal binder in under a minute.
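
A rough sketch of the split, summarize, and merge flow described above, in Python for brevity. Everything here is an assumption for illustration: the four-characters-per-token estimate, the `chunk_pages` and `distill` helpers, and the `summarize` callback that stands in for the model call.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def chunk_pages(pages, budget):
    """Group consecutive (page_number, text) pairs into chunks within a token budget."""
    chunks, current, used = [], [], 0
    for number, text in pages:
        cost = estimate_tokens(text)
        # Close the current chunk before it would exceed the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append((number, text))
        used += cost
    if current:
        chunks.append(current)
    return chunks


def distill(pages, budget, summarize):
    """Summarize each chunk separately, then merge the partial summaries in one pass."""
    partials = [
        summarize(" ".join(text for _, text in chunk))
        for chunk in chunk_pages(pages, budget)
    ]
    return summarize(" ".join(partials))
```

A single page larger than the budget still becomes its own chunk here; a real implementation would likely split within the page as well.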

Does it cite the source pages?

Yes. Every bullet in the TL;DR includes a page reference like p.12 or pp.34–37 pointing back to the passage that generated it. Click a citation to jump straight to the original page. This is what separates auto-summary from a hallucinated paraphrase — you can verify each claim in two seconds.
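
The citation format shown above (p.12, pp.34–37) is easy to produce from a list of page numbers. The helper below is a hypothetical sketch in Python, not the tool's own code; it assumes a bullet's pages are collapsed to a single first-to-last range.

```python
def format_citation(pages):
    """Format page numbers as 'p.12' or 'pp.34–37' (en dash, matching the examples)."""
    if not pages:
        raise ValueError("a bullet needs at least one source page")
    first, last = min(pages), max(pages)
    return f"p.{first}" if first == last else f"pp.{first}–{last}"
```

Under this assumption, pages 34 and 37 alone would also render as pp.34–37, so a stricter version might list non-adjacent pages separately.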

Does my file leave my browser?

The file itself does not. PDF parsing, text extraction, and chunking happen entirely client-side via WebAssembly. The model call carries only the extracted text segments needed for summarization; your file binary never leaves the device. Open DevTools → Network during a run and you will see no PDF upload.
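
To make the distinction concrete, here is a sketch of what a text-only request body could look like. The field names (`segments`, `pages`, `text`) and the `build_request_body` helper are assumptions for illustration; the actual wire format is not documented here.

```python
import json


def build_request_body(segments):
    """Serialize only text and page ranges; any other fields are left out."""
    for seg in segments:
        if not isinstance(seg["text"], str):
            raise TypeError("segments must carry extracted text, not binary data")
    # Copy just the two allowed fields so nothing else can leak into the request.
    payload = [{"pages": seg["pages"], "text": seg["text"]} for seg in segments]
    return json.dumps({"segments": payload})
```

A body built this way is what you would expect to find in the Network panel: short strings and page numbers, with no PDF bytes.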

Why is the output sometimes shorter than I expected?

A good summary is bounded by signal density, not page count. A 200-page agreement with heavy boilerplate may compress to eight bullets because the unique substance is small. The summarizer favors brevity over padding — if you need more depth, switch to the long-form mode or open a chat session against the document.

Local PDF summarization · Page-cited output

You opened a 60-page PDF. Here's the 60-second version.

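A summarizer built to cut the reading cost of long documents: research papers, contracts, transcripts, and financial filings are distilled into structured key points with source-page citations. Less reading. Same understanding.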

Open the summarizer · See how it works

Parsed in the browser. Every key point is cited. The file never leaves your device.

Quarterly Report — 62 pp.

TL;DR 62 pp → 3 pts

1. Revenue rose 14% YoY, driven by enterprise renewals. (p.4)

2. Operating margin compressed 220 bps on infrastructure spend. (pp.18–21)

3. Guidance reaffirmed; FX headwind flagged for Q4. (p.49)

Source-cited · verifiable 2.1 s

Why most summarizers fall short

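A summary you can't trust is worse than reading the original. Three failure modes keep recurring in practice, and this tool is tuned against each of them.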

Failure 1

Generic LLM dump

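The whole document is stuffed into a single prompt and the model returns one essay-style paragraph. No structure, no priorities, no skim path. You still have to read the summary linearly.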

Failure 2

Hallucinated citations

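A bullet cites "page 47" when the relevant content is on page 12, or worse, fabricates a quotation that exists nowhere in the source document. Without a verifiable reference, every claim has to be re-checked against the original.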

Failure 3

Slow upload roundtrips

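The PDF is sent to a server, queued, parsed remotely, and then the summary is streamed back. With a 200 MB file on coffee-shop Wi-Fi, you wait a minute before the first word appears.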

How the summarizer works

Four stages. The PDF stays on your device throughout; only the extracted text segments are sent to the model for summarization.

Parse locally

WebAssembly extracts the text layer page by page in the browser. Layout, headings, and page breaks are preserved so citations stay accurate.

Chunk by section

Headings, page breaks, and semantic boundaries split the document into sections. Each chunk carries its page range as metadata.

Distill key points

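Each section is distilled into its core claims. Long-winded boilerplate is compressed and the substance is kept. Page citations travel with every key point.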

Assemble TL;DR

The section summaries are merged into a single ranked list of key points, ready to copy and paste, with clickable source-page citations.
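
The chunking and assembly stages can be sketched as follows, in Python for brevity. This is an illustration under stated assumptions, not the tool's implementation: the `Section` type, the `is_heading` test, and the `distill` and `score` callbacks (stand-ins for the model call and the ranking logic) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    first_page: int
    last_page: int
    text: str


def split_by_heading(pages, is_heading):
    """Start a new section at each heading line and track the pages it spans."""
    sections = []
    for number, lines in pages:
        for line in lines:
            if is_heading(line):
                sections.append(Section(line, number, number, ""))
                continue
            if not sections:
                # Body text before any heading goes into an untitled section.
                sections.append(Section("", number, number, ""))
            current = sections[-1]
            current.text = f"{current.text} {line}".strip()
            current.last_page = number
    return sections


def assemble(sections, distill, score):
    """Distill each section to one bullet, rank by score, and keep the page range."""
    bullets = [(score(s), distill(s.text), s.first_page, s.last_page) for s in sections]
    bullets.sort(key=lambda b: b[0], reverse=True)
    return [
        {"rank": i + 1, "text": text, "pages": (first, last)}
        for i, (_, text, first, last) in enumerate(bullets)
    ]
```

Because the page range is recorded when the section is created and updated as lines are appended, the citation never has to be reconstructed after the text has been summarized.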

When to use this tool

Long, dense, or technical PDFs, where skimming is costly and misreading is costlier.

Research papers

Methods, results, limitations, extracted in the order a reviewer reads them. Citations point to the relevant section of the paper.

Academic

Contracts & agreements

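Term, fees, termination, indemnification, governing law. Each is extracted as a separate key point so important obligations are easy to spot.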

Legal

Meeting transcripts

Decisions, action items, owner, deadline. Filler conversation drops away; the durable outcomes stay.

Operations

Financial reports

Annual reports and earnings releases: the numbers that moved, the guidance that changed, the newly disclosed risks.

Finance

The old way vs. our way

Same PDF, two different paths to the "aha" moment.

Old way

Upload, wait, hope the summary holds up.

Upload your PDF to a stranger's server before anything happens
One paragraph of prose: no skim path, no priorities
Citations are missing or fabricated; there is no way to verify them in seconds
"Daily limit reached" after three documents
Sign-up wall before the first summary renders

Our summarizer

Drop, distill, verify.

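The PDF is parsed in your browser; the binary never leaves the page
Structured key-point extraction: ranked bullets, scannable
Every key point carries a page citation linking back to the source document
Long-form mode for deep analysis; PDF chat for follow-up questions
No upload, no signup gate, no daily summary cap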

Need to go deeper after reading the summary? Open a chat session against the same PDF. Questions are answered with the same page-citation discipline.

Three things this tool actually does

Verifiable claims only: features you can confirm in DevTools or in the source PDF.

Local processing

PDF parsing and chunking run in WebAssembly inside the browser. The file binary never crosses the network.

No file upload

Open DevTools → Network during a summary run. You will not see any request body containing your PDF, only short text segments.

Source-cited output

Every key point links to the exact page or page range it came from, so each claim in the summary can be verified in two clicks.

The same browser-first model powers our other tools: translate a PDF without uploading, compress a PDF locally, convert a PDF to Word in the browser, or send confidential files via end-to-end encrypted transfer.

Questions about the summarizer

Edge cases, limits, and the things that usually go unsaid.

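Does the summarizer work on scanned PDFs?

Only after OCR. A scanned PDF is a stack of images, so there is no text layer to summarize until characters are recognized. Run the file through an in-browser OCR pass first, then the summarizer can extract key points. If the OCR confidence is low, the TL;DR will flag uncertain passages instead of inventing content.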

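How long can the input PDF be?

The practical ceiling is around 800 pages of dense text, or roughly 400,000 tokens, handled by chunking and progressive distillation. Long documents are split into sections, each summarized separately, then merged into a final TL;DR. Browser memory is the real limit: a modern laptop handles a 60-page report in seconds and a 600-page legal binder in under a minute.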

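Does it cite the source pages?

Yes. Every bullet in the TL;DR includes a page reference like p.12 or pp.34–37 pointing back to the passage that generated it. Click a citation to jump straight to the original page. This is what separates an auto-summary from a hallucinated paraphrase: you can verify each claim in two seconds. For free-form Q&A on the same PDF, switch to chat-with-PDF.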

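Does my file leave my browser?

The file itself does not. PDF parsing, text extraction, and chunking happen entirely client-side via WebAssembly. The model call carries only the extracted text segments needed for summarization; your file binary never leaves the device. Open DevTools → Network during a run and you will see no PDF upload. The same applies to our no-upload compressor and no-upload converter.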

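Why is the output sometimes shorter than I expected?

A good summary is bounded by signal density, not page count. A 200-page agreement with heavy boilerplate may compress to eight bullets because the unique substance is small. The summarizer favors brevity over padding. If you need more depth, switch to the long-form mode or open a chat session against the document. Padding the word count is exactly what makes a summary hard to read.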

Stop reading the whole thing. Read the TL;DR.

Drop in a PDF and get a structured set of key points with page citations, in the browser, in seconds, without sending the file anywhere.

Open the summarizer (free)