AI · April 8, 2026

How to Convert PDF to Markdown for ChatGPT, Claude and Gemini

Sarah Chen

ML Engineer

4 min read

You have a PDF -- a financial report, a research paper, a legal contract -- and you want an AI to analyze it. The instinct is to select all, copy, and paste into ChatGPT, Claude, or Gemini. That works, sort of. But you throw away structure, waste tokens, and get worse answers than you otherwise would.

Converting your PDF to Markdown before feeding it to an AI chatbot is a simple step that preserves structure and cuts token usage, which in turn improves the answers you get. Here is why, and how to do it.

The Copy-Paste Problem

When you copy text from a PDF and paste it into a chat window, several things go wrong:

Structure disappears. Headings become plain text. Tables become jumbled rows of numbers. Multi-column layouts merge into a single stream. The AI has to guess where one section ends and another begins.

Tokens are wasted. PDF text extraction produces inconsistent whitespace, repeated headers and footers from each page, and broken hyphenation from line wraps. All of this consumes your context window without adding useful information.
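To make those sources of waste concrete, here is a minimal Python sketch of the kind of cleanup involved. It is an illustration only, not how mdstill works internally: the function name `clean_pages` is hypothetical, and it assumes you already have the extracted text of each page as a separate string.

```python
import re
from collections import Counter


def clean_pages(pages: list[str]) -> str:
    """Join extracted page texts, dropping common extraction noise."""
    # Count how many pages each distinct non-empty line appears on.
    line_pages = Counter()
    for page in pages:
        line_pages.update({ln.strip() for ln in page.splitlines() if ln.strip()})

    # Heuristic: a line present on every page is a running header or footer.
    repeated = {
        line for line, n in line_pages.items()
        if len(pages) > 1 and n == len(pages)
    }

    kept = []
    for page in pages:
        for line in page.splitlines():
            stripped = line.strip()
            if not stripped or stripped in repeated:
                continue
            if re.fullmatch(r"(Page\s+)?\d+", stripped):  # bare page numbers
                continue
            kept.append(stripped)

    text = "\n".join(kept)
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words split at line wraps
    text = re.sub(r"[ \t]+", " ", text)           # collapse runs of spaces and tabs
    return text
```

These heuristics are deliberately crude. The hyphen rule also merges genuine compounds that happen to wrap at a line end, and the page-number rule drops any line that consists only of digits, so treat this as a picture of the problem rather than a replacement for a proper converter.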

Tables break. This is the biggest problem. A clean 4-column table in your PDF becomes something like this when pasted as plain text: "Product Revenue Q1 Revenue Q2 Growth Widget A 4.2M 4.8M +14%". No alignment, no cell boundaries, no header distinction. The AI may misread which numbers belong to which columns.

Why Markdown Works Better

Markdown gives AI chatbots exactly what they need:

  • Headings (#, ##, ###) that tell the model about document hierarchy
  • Tables with clear column boundaries using pipe characters
  • Lists that preserve enumeration and nesting
  • Emphasis (**bold**, *italic*) that signals important content
  • Code blocks that are kept separate from prose

All three major AI chatbots -- ChatGPT, Claude, and Gemini -- natively understand Markdown. When you feed them structured input, they can reason about it more accurately.

Token Savings

This is where the numbers get interesting. We converted a 12-page financial report (quarterly earnings) both ways:

| Method | Token count | Structure preserved |
| --- | --- | --- |
| Raw PDF copy-paste | 8,400 tokens | None |
| Markdown via mdstill | 3,600 tokens | Full |

That is a 57% reduction. The savings come from eliminating repeated page headers, footers, page numbers, broken line wraps, and redundant whitespace. The Markdown version contains the same information in well under half the tokens (about 43% of the original count).
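The percentage follows directly from the two token counts in the table above. A short Python check, where `token_reduction` is just an illustrative helper name:

```python
def token_reduction(before: int, after: int) -> float:
    """Percentage of tokens saved when going from `before` to `after`."""
    return (before - after) / before * 100


raw_tokens = 8_400       # raw PDF copy-paste, from the table above
markdown_tokens = 3_600  # Markdown via mdstill, from the table above

print(f"reduction: {token_reduction(raw_tokens, markdown_tokens):.1f}%")  # reduction: 57.1%
print(f"documents per window: {raw_tokens / markdown_tokens:.2f}x")       # documents per window: 2.33x
```

The second figure is the flip side of the first: if each document costs 3,600 tokens instead of 8,400, about 2.3 times as many documents of this kind fit in the same context window.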

For ChatGPT with a 256K context window (GPT-5), this means fitting more than twice as many documents of this kind. For Claude with 1M tokens (Opus 4.6), it means richer analysis with more source material. For Gemini 3.1 Pro with its 1M context, the savings still matter for response quality -- less noise means better signal.

Table Preservation

Consider a simple quarterly results table. When copied from PDF, the AI sees a stream of text with no structure. When converted to Markdown, the same data becomes a clean GFM table:

| Product | Q1 Revenue | Q2 Revenue | Growth |
| --- | --- | --- | --- |
| Widget A | 4.2M | 4.8M | +14% |
| Widget B | 2.1M | 2.5M | +19% |

Now the AI can answer "Which product grew faster?" by comparing two clearly labeled cells (Widget B, at +19% versus +14%). With raw PDF text, it has to guess which number belongs to which column.
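For readers who want to see what a GFM table looks like as raw text, here is a small Python sketch that renders the rows above with pipe characters. The helper name `to_markdown_table` is hypothetical and it does no escaping, so it assumes cell values contain no `|` characters.

```python
def to_markdown_table(header: list[str], rows: list[list[str]]) -> str:
    """Render a header and rows as a GFM pipe table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",  # delimiter row
    ]
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


print(to_markdown_table(
    ["Product", "Q1 Revenue", "Q2 Revenue", "Growth"],
    [["Widget A", "4.2M", "4.8M", "+14%"],
     ["Widget B", "2.1M", "2.5M", "+19%"]],
))
```

The delimiter row of dashes is what separates the header from the body, and it is exactly the header distinction that pasted PDF text loses.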

Step-by-Step Guide

Converting with mdstill takes seconds:

  1. Go to mdstill.com or any format-specific page
  2. Drop your PDF file into the upload area
  3. Wait for conversion (typically under 2 seconds)
  4. Copy the Markdown output
  5. Paste into ChatGPT, Claude, or Gemini

mdstill works best on digitally created PDFs with a standard text layer. For scanned PDFs, run them through an OCR tool such as ocrmypdf before converting. For math-heavy academic papers with complex equations, consider a dedicated math-aware parser.

For batch processing or automation, the mdstill API lets you convert documents programmatically:

curl -X POST https://mdstill.com/api/convert \
  -F "file=@annual-report.pdf" \
  -o annual-report.md
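For a whole folder of PDFs, the same request can be repeated in a loop. The Python sketch below simply shells out to curl with the endpoint and `file` form field shown in the command above. Everything else is an assumption: it presumes curl is on your PATH, the helper names are hypothetical, and nothing is known here about authentication, rate limits, or error responses. The added `--fail` flag is a standard curl option that makes curl exit with an error on HTTP failure responses.

```python
import subprocess
from pathlib import Path

API_URL = "https://mdstill.com/api/convert"  # endpoint from the curl example above


def build_curl_command(pdf: Path) -> list[str]:
    """Mirror the curl example: upload `pdf`, write the output next to it as .md."""
    return [
        "curl", "-X", "POST", API_URL,
        "-F", f"file=@{pdf}",
        "-o", str(pdf.with_suffix(".md")),
        "--fail",  # non-zero exit on HTTP errors so check=True can catch them
    ]


def convert_folder(folder: str) -> list[Path]:
    """Convert every PDF in `folder`, one request at a time."""
    outputs = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        subprocess.run(build_curl_command(pdf), check=True)
        outputs.append(pdf.with_suffix(".md"))
    return outputs
```

Calling `convert_folder("reports")` would write `reports/<name>.md` beside each `reports/<name>.pdf`. Requests run sequentially on purpose, since the service's limits on concurrent requests are not documented here.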

Tips for Each AI Tool

ChatGPT: Paste the full Markdown and ask your question in the same message. GPT-5 handles long Markdown well. For documents exceeding the context window, split by section headers.
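Splitting by section headers is easy to automate once the document is Markdown. A minimal Python sketch follows; the function name and the default of splitting at `#` and `##` headings are illustrative choices, not anything prescribed by ChatGPT.

```python
import re


def split_by_headings(markdown: str, level: int = 2) -> list[str]:
    """Split Markdown into chunks, starting a new chunk at each heading
    of depth `level` or shallower (for level=2: '#' and '##')."""
    heading = re.compile(rf"^#{{1,{level}}} ")
    chunks, current, in_fence = [], [], False
    for line in markdown.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside fenced code
        if not in_fence and heading.match(line) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Each chunk keeps its own heading, so a section pasted on its own still tells the model where it sits in the document. Deeper headings such as `###` stay attached to their parent section.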

Claude: Claude excels at analyzing structured documents. You can paste multiple converted documents in a single conversation. Use the --- separator between documents and tell Claude which document each section came from.
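One way to assemble such a multi-document message is sketched below in Python. The `Document:` and `Question:` labels and the function name `build_prompt` are illustrative choices only. Claude does not require any particular wording, just a clear indication of which text came from which file.

```python
def build_prompt(documents: dict[str, str], question: str) -> str:
    """Join converted documents with '---' separators and label each one."""
    parts = [
        f"Document: {name}\n\n{markdown.strip()}"
        for name, markdown in documents.items()
    ]
    # Blank lines around '---' keep it a separator rather than turning the
    # preceding line into a heading (setext style).
    return "\n\n---\n\n".join(parts) + f"\n\n---\n\nQuestion: {question}"


prompt = build_prompt(
    {"q1-report.md": "# Q1\nRevenue: 4.2M", "q2-report.md": "# Q2\nRevenue: 4.8M"},
    "Which quarter had higher revenue?",
)
print(prompt)
```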

Gemini: With its large context window, Gemini can handle very long converted documents. Markdown tables are particularly well-parsed by Gemini -- it reliably extracts data points from properly formatted tables.

Stop fighting with copy-paste. Convert your PDFs to Markdown first, and let your AI tools do what they do best -- analyze well-structured content.

#pdf #markdown #chatgpt #claude #gemini #ai #llm
