AI | April 9, 2026

How to Summarize a PDF with AI: Step-by-Step Guide

Jordan Martinez

Technical Writer

5 min read

You have a long PDF — a research paper, a contract, a quarterly report — and you need a summary. The fastest approach: hand it to an AI model and ask for a summary. But how you hand it over matters. A clean input produces a clean summary. A messy input produces a summary that misses key data, especially tables and figures.

Here is the step-by-step process, from simple to advanced.

Method 1: Direct Upload (Quick and Simple)

The easiest approach — upload the PDF directly to your AI tool.

ChatGPT: Click the attachment icon, select your PDF, type "Summarize this document", send.

Claude: Click the attachment icon, upload your PDF, ask for a summary.

Gemini: Attach the PDF, then enter your summary prompt.

This works well for simple, text-heavy PDFs — articles, letters, memos, single-column reports. The AI's built-in PDF parser extracts the text and processes it.

Where it breaks down:

  • Tables with many columns — the parser often merges cells or loses column alignment
  • Multi-column layouts — academic papers, newsletters, brochures
  • Scanned PDFs — some tools handle OCR, others do not
  • Documents where tables contain the key data (financial reports, specs, comparisons)

If your PDF has important tables or complex formatting, Method 2 gives consistently better results.

Method 2: Convert to Markdown First (Better Results)

The extra step takes seconds and the difference is noticeable, particularly when tables matter.

Step 1: Convert

Go to mdstill and drop your PDF. Conversion finishes in under two seconds for most documents. If you have a scanned PDF (no selectable text), run it through ocrmypdf first so mdstill can read the text layer.
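For scanned files, the OCR pre-step can be scripted. The following is a minimal sketch, assuming the ocrmypdf command-line tool is installed and on your PATH; the function names and file paths are illustrative, not part of mdstill or ocrmypdf.

```python
import shutil
import subprocess
from pathlib import Path


def ocr_command(src: Path, dst: Path) -> list[str]:
    """Build the ocrmypdf command line that writes a searchable copy of src to dst."""
    # --skip-text leaves pages that already contain a text layer untouched.
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def add_text_layer(src: Path, dst: Path) -> Path:
    """Run ocrmypdf on src and return the path of the searchable copy."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    # check=True raises CalledProcessError if OCR fails.
    subprocess.run(ocr_command(src, dst), check=True)
    return dst
```

Calling add_text_layer(Path("scan.pdf"), Path("scan_ocr.pdf")) would produce a copy with a text layer that you can then upload for conversion.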

The converter produces clean GitHub-flavored Markdown with:

  • Proper heading hierarchy (H1, H2, H3)
  • Aligned tables with column headers
  • Bullet lists and numbered lists preserved
  • Code blocks kept intact

Step 2: Copy and Paste

Copy the Markdown output (one click with the copy button) and paste it into your AI chat. Add your prompt below the content.

Step 3: Prompt for the Summary

A good summary prompt is specific. Instead of "Summarize this", try:

For a financial report:

Summarize this quarterly report. Include: revenue, profit margin, year-over-year changes, and any forward guidance. Present key metrics in a table.

For a research paper:

Summarize this paper in 3 sections: (1) what problem it solves, (2) the approach, (3) key results with numbers. Keep it under 300 words.

For a legal contract:

Summarize the key terms of this contract: parties, obligations, payment terms, termination conditions, and any unusual clauses.

For meeting notes or transcripts:

Extract the key decisions, action items with owners, and open questions from this document. Format as bullet lists.

The specificity of your prompt matters more than the model you use. A precise prompt with clean Markdown input produces excellent summaries from any modern AI model.
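If you reuse these prompts often, it may help to keep them in one place. Here is a minimal sketch that stores the four prompts above by document type and appends the chosen one below the Markdown, as described in Step 2. The dictionary keys and the function name are hypothetical.

```python
# Prompts copied from the examples above, keyed by an illustrative document type.
SUMMARY_PROMPTS = {
    "financial_report": (
        "Summarize this quarterly report. Include: revenue, profit margin, "
        "year-over-year changes, and any forward guidance. "
        "Present key metrics in a table."
    ),
    "research_paper": (
        "Summarize this paper in 3 sections: (1) what problem it solves, "
        "(2) the approach, (3) key results with numbers. "
        "Keep it under 300 words."
    ),
    "legal_contract": (
        "Summarize the key terms of this contract: parties, obligations, "
        "payment terms, termination conditions, and any unusual clauses."
    ),
    "meeting_notes": (
        "Extract the key decisions, action items with owners, and open "
        "questions from this document. Format as bullet lists."
    ),
}


def build_chat_message(markdown: str, doc_type: str) -> str:
    """Return the Markdown content followed by the prompt for doc_type."""
    if doc_type not in SUMMARY_PROMPTS:
        raise ValueError(f"unknown document type: {doc_type!r}")
    # Content first, prompt below it, separated by a blank line.
    return f"{markdown.strip()}\n\n{SUMMARY_PROMPTS[doc_type]}"
```

The returned string is what you would paste into the chat window or send through an API client.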

Method 3: Batch Summarization (Multiple Documents)

When you need to summarize a stack of PDFs — say, 20 research papers or a folder of monthly reports — doing them one by one is painful.

Batch Convert

Upload multiple files to mdstill at once (the limit is 10 to 20 files, depending on your plan), then download the Markdown results.

Systematic Prompting

For consistent summaries across many documents, use a template prompt:

I will give you a series of documents in Markdown format, separated by "---".
For each document, provide:
1. Title
2. One-paragraph summary (50-100 words)
3. Key data points (as bullet list)
4. Relevance score (1-5) for [your topic]

Documents:
---
[paste first document]
---
[paste second document]
---

This produces uniform, comparable summaries that you can scan quickly to find the most relevant documents.
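Assembling that prompt by hand gets tedious past a few documents. The sketch below builds it from a list of Markdown strings, following the template above; the function name and the topic parameter (standing in for [your topic]) are illustrative.

```python
BATCH_TEMPLATE = """I will give you a series of documents in Markdown format, separated by "---".
For each document, provide:
1. Title
2. One-paragraph summary (50-100 words)
3. Key data points (as bullet list)
4. Relevance score (1-5) for {topic}

Documents:
"""


def build_batch_prompt(documents: list[str], topic: str) -> str:
    """Fill in the template and append each document between '---' separators."""
    parts = [BATCH_TEMPLATE.format(topic=topic)]
    for doc in documents:
        parts.append("---\n" + doc.strip() + "\n")
    parts.append("---")  # closing separator after the last document
    return "".join(parts)
```

One caveat: Markdown also uses a line of "---" as a horizontal rule, so if your converted documents contain such lines, a less common separator string may be a safer choice.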

API Automation

For regular batch processing, automate the pipeline:

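import os
from typing import Callable

import httpx

def summarize_pdf(pdf_path: str, prompt: str, call_llm: Callable[[str], str]) -> str:
    # Step 1: Convert the PDF to Markdown
    with open(pdf_path, "rb") as f:
        md_response = httpx.post(
            "https://mdstill.com/api/convert",
            # Send only the file name, not the full local path
            files={"file": (os.path.basename(pdf_path), f, "application/pdf")},
            timeout=60.0,
        )
    md_response.raise_for_status()  # stop here if the conversion failed
    markdown = md_response.text

    # Step 2: Send to your LLM of choice.
    # call_llm is a placeholder for your preferred API client: it takes
    # the full prompt as a string and returns the model's reply.
    return call_llm(f"{prompt}\n\n{markdown}")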

Convert once, summarize with any model, re-summarize later with different prompts — the Markdown output is reusable.
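One way to get that reuse is to save each conversion next to its PDF and read the saved file on later runs. This is a minimal sketch; cached_markdown is a hypothetical helper, and convert stands for whatever function you use to turn a PDF into Markdown.

```python
from pathlib import Path
from typing import Callable


def cached_markdown(pdf_path: Path, convert: Callable[[Path], str]) -> str:
    """Return Markdown for pdf_path, converting only if no saved copy exists."""
    md_path = pdf_path.with_suffix(".md")  # report.pdf -> report.md
    if md_path.exists():
        return md_path.read_text(encoding="utf-8")
    markdown = convert(pdf_path)
    md_path.write_text(markdown, encoding="utf-8")
    return markdown
```

With this in place, trying a new prompt on the same document costs one LLM call and no second conversion.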

Tips for Better Summaries

Be specific about length. "Summarize in 3 bullet points" gives a very different result than "Write a detailed 500-word summary." Choose based on your need.

Ask for structure. "Use headings and bullet points" produces a scannable summary. "Write as a paragraph" produces prose. Both are valid — pick what fits your workflow.

Request tables when data matters. If the original document has financial data, tell the AI: "Present key metrics in a table." The model will organize the numbers into a clean comparison — but only if it received those numbers in a structured format to begin with.

Specify what to prioritize. A 50-page document has many possible summaries. "Focus on financial performance" gives a different result than "Focus on strategic risks." Guide the model.

Use follow-up questions. A summary is a starting point. Follow up with "What were the three biggest risks mentioned?" or "Compare Q2 and Q3 revenue in a table." Clean Markdown input makes follow-up answers more accurate because the model has structured data to reference.
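The length, structure, table, and priority tips can be folded into a small prompt builder. The sketch below is one possible shape for it; the function name, parameters, and exact wording of each clause are assumptions, so adjust them to your own style.

```python
def build_summary_prompt(
    length: str = "",
    structure: str = "",
    focus: str = "",
    metrics_table: bool = False,
) -> str:
    """Compose a summary prompt from optional length, structure, and focus hints."""
    parts = ["Summarize this document."]
    if length:
        parts.append(f"Length: {length}.")
    if structure:
        parts.append(f"Format: {structure}.")
    if focus:
        parts.append(f"Focus on {focus}.")
    if metrics_table:
        parts.append("Present key metrics in a table.")
    return " ".join(parts)
```

For example, build_summary_prompt(length="3 bullet points", focus="strategic risks") yields a short, risk-oriented request, while leaving every option empty falls back to the generic prompt.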

Which AI Tool Is Best for Summarization?

All major AI models handle summarization well. The practical differences:

| Tool | Strength | Best For |
|---|---|---|
| ChatGPT (GPT-5) | Concise, well-formatted output | Quick summaries, business documents |
| Claude (Sonnet 4.6 / Opus 4.6) | Thorough, handles long documents well | Research papers, detailed analysis |
| Gemini 3.1 Pro | Large context window | Very long documents, batch analysis |

The format of your input matters more than the choice of model. A well-structured Markdown document produces good summaries from any of these tools.
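Context window size matters mainly for very long documents, so a rough size check before pasting can save a failed attempt. The sketch below assumes about four characters per token, a common rule of thumb for English text rather than an exact count. The context limit is a value you look up in your provider's documentation, and both function names are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of text from its character length."""
    return math.ceil(len(text) / chars_per_token)


def fits_context(text: str, context_limit: int, reserve_for_reply: int = 2000) -> bool:
    """Check whether text plus room for the reply fits within context_limit tokens."""
    return estimate_tokens(text) + reserve_for_reply <= context_limit
```

If a document does not fit, splitting it by top-level heading and summarizing each part is one option; a tokenizer from your model provider gives a more reliable count than this estimate.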

Try It

Take a PDF you need summarized. Convert it to Markdown on mdstill, paste into your AI tool of choice, and compare the result against uploading the raw PDF. The difference is most obvious with documents that have tables — the summary will actually include the numbers instead of skipping or misreading them.
