API Documentation

One endpoint, both directions: convert documents to clean Markdown, or turn Markdown back into PDF, Word, HTML, EPUB and 9 more targets. Send one POST request with one file and get the conversion back. Works from any language that can speak HTTP.

Quickstart

Your first conversion in 30 seconds

No signup required for the free anonymous tier. Just send a file:

bash

curl -X POST https://mdstill.com/api/convert \
  -F "file=@document.pdf"

You get JSON back with the Markdown and some metadata:

json

{
  "markdown": "# Document Title\n\nConverted content here...",
  "metadata": {
    "filename": "document.pdf",
    "format": ".pdf",
    "converter": "fast",
    "size_bytes": 245760,
    "conversion_time_sec": 0.42,
    "markdown_length": 8320,
    "token_count": 2080
  }
}

That is the whole API. Everything below is just details: authentication for higher limits, other languages, error handling, rate limits.

Authentication

Authentication is optional. Anonymous requests work but share a small per-IP daily quota. Sign up for a free account and generate an API key in your Dashboard to get a higher per-account limit.

Pass your key in the Authorization header:

http

Authorization: Bearer mdr_your_api_key_here

API usage counts toward the same daily quota as the web interface. You can generate up to 5 API keys per account and revoke unused keys from the Dashboard.
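
Because the key is optional, a client can build its headers conditionally. A minimal sketch, assuming the key is stored in an environment variable called MDSTILL_API_KEY (a hypothetical name chosen for this example, not something the API defines):

```python
import os


def auth_headers(api_key=None):
    """Return request headers for the convert endpoint.

    An empty dict means the request goes out on the anonymous tier.
    MDSTILL_API_KEY is a hypothetical variable name used for illustration.
    """
    key = api_key if api_key is not None else os.environ.get("MDSTILL_API_KEY", "")
    key = key.strip()
    return {"Authorization": f"Bearer {key}"} if key else {}
```

The result can be passed straight to an HTTP client, for example `requests.post(url, headers=auth_headers(), files=...)`.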

Base URL

text

https://mdstill.com

Endpoint

POST /api/convert

Convert a document to Markdown — or Markdown back to PDF, Word, HTML, EPUB, and more

Request

Send a multipart/form-data request with the file attached. The direction is inferred from the source extension and the optional target_format field — see the reverse-conversion section below for that path.

file

file, required

The file to convert. Forward direction (→ Markdown) accepts: PDF, DOCX, DOC, PPTX, XLSX, XLS, HTML, HTM, EPUB, CSV, JSON, XML, ZIP, RTF, ODT, Pages, Numbers, Keynote. Reverse direction (Markdown → X) requires .md or .markdown.

target_format

string

Switches to reverse mode (Markdown → target format). One of: pdf, docx, html, epub, pptx, rtf, odt, tex, txt, adoc, rst, fb2, doc. Source must be .md or .markdown. Response is the target file as a binary download, not JSON.

output

string

Forward mode only. markdown (default) returns just the Markdown and metadata. structured adds a structure object with semantic sections, document outline, and token-counted chunks ready for RAG ingestion.

chunk_tokens

integer

Max tokens per chunk (soft limit). Range: 100 – 4000, default 500. Atomic content blocks (tables, code blocks) are never split mid-element, so individual chunks may exceed this value. Only used when output=structured.
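
The forward-mode fields can be validated on the client before anything is uploaded. The sketch below is one way to do that; the function name forward_fields is made up for this example, and the checks simply mirror the ranges documented above rather than the server's actual validation:

```python
def forward_fields(output="markdown", chunk_tokens=None):
    """Build the optional form fields for a forward (→ Markdown) request."""
    if output not in ("markdown", "structured"):
        raise ValueError("output must be 'markdown' or 'structured'")
    fields = {"output": output}
    # chunk_tokens is only used by the API when output=structured,
    # so it is only sent in that case.
    if output == "structured" and chunk_tokens is not None:
        if not 100 <= chunk_tokens <= 4000:
            raise ValueError("chunk_tokens must be between 100 and 4000")
        fields["chunk_tokens"] = str(chunk_tokens)
    return fields
```

With requests, the returned dict goes into the `data=` argument alongside `files={"file": f}`.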

Response format

By default the API returns JSON. You can also request the raw Markdown file directly using the Accept header.

Accept: application/json (default). Returns JSON with markdown and metadata fields.

Accept: text/markdown. Returns the .md file directly as a download. No metadata.

File response (Accept: text/markdown):

http

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Content-Disposition: attachment; filename="report.md"

# Document Title

Converted content here...
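
When requesting text/markdown, the suggested filename arrives in the Content-Disposition header shown above. A small sketch of pulling it out with the Python standard library; the helper name and the fallback name output.md are assumptions made for this example:

```python
import os
from email.message import Message


def filename_from_disposition(header_value, default="output.md"):
    """Extract the filename parameter from a Content-Disposition header value."""
    if not header_value:
        return default
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    # Keep only the final path component so a header cannot point outside
    # the directory the caller intends to write to.
    return os.path.basename(name) if name else default
```

With requests this would be called as `filename_from_disposition(response.headers.get("Content-Disposition"))`, and `response.text` written to that path.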

Markdown → other formats (reverse)

Same endpoint, one extra field

Send a .md file with a target_format field and the API returns the converted file as a binary download instead of JSON. The Accept header is ignored in this mode — output is always the target binary.

bash

curl -X POST https://mdstill.com/api/convert \
  -H "Authorization: Bearer mdr_your_api_key" \
  -F "file=@notes.md" \
  -F "target_format=pdf" \
  -o notes.pdf

Supported `target_format` values

pdf → .pdf
docx → .docx
html → .html
epub → .epub
pptx → .pptx
rtf → .rtf
odt → .odt
tex → .tex
txt → .txt
adoc → .adoc
rst → .rst
fb2 → .fb2
doc → .doc

Notes:

--Input must have a .md or .markdown extension. Sending target_format with any other source returns 400.
--GitHub-Flavored Markdown (GFM) — tables, task lists, fenced code, strikethrough, and footnotes carry across to the target format.
--pdf and doc go through a two-stage chain and take longer than the other targets — budget up to ~10s for cold-start cases.
--LaTeX output ships a compile-ready .tex source (preamble + \documentclass + body). We don't ship a TeX distribution, so compile downstream (Overleaf, texlive, VS Code).
--Reverse conversions count against the same daily quota as forward ones.

Prefer a UI? The same endpoint powers the picker on /convert-markdown.
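
The reverse-mode rules above (Markdown source only, a fixed set of targets, output extension equal to the target name) can be checked locally before uploading. A minimal sketch; reverse_output_path is a hypothetical helper, not part of the API:

```python
from pathlib import Path

# Values accepted by target_format, as listed above.
REVERSE_TARGETS = {
    "pdf", "docx", "html", "epub", "pptx", "rtf", "odt",
    "tex", "txt", "adoc", "rst", "fb2", "doc",
}


def reverse_output_path(source, target_format):
    """Return the path the converted file should be saved to."""
    src = Path(source)
    if src.suffix.lower() not in {".md", ".markdown"}:
        # The API answers 400 for any other source when target_format is set.
        raise ValueError("reverse conversion needs a .md or .markdown source")
    if target_format not in REVERSE_TARGETS:
        raise ValueError(f"unsupported target_format: {target_format}")
    return src.with_suffix("." + target_format)
```

Since the response in this mode is the binary file itself, the body would be written with `Path(...).write_bytes(response.content)` rather than parsed as JSON.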

Code samples

The samples below all follow the same pattern: upload a document, parse the JSON response, and use the Markdown. Replace mdr_your_api_key with your real key, or drop the Authorization header for anonymous usage.

Simplest form — returns JSON:

bash

curl -X POST https://mdstill.com/api/convert \
  -H "Authorization: Bearer mdr_your_api_key" \
  -F "file=@document.pdf"

Download as a .md file directly:

bash

curl -X POST https://mdstill.com/api/convert \
  -H "Authorization: Bearer mdr_your_api_key" \
  -H "Accept: text/markdown" \
  -F "file=@report.pdf" \
  -o report.md

Or extract markdown from JSON with jq:

bash

curl -s -X POST https://mdstill.com/api/convert \
  -H "Authorization: Bearer mdr_your_api_key" \
  -F "file=@report.pdf" \
  | jq -r '.markdown' > report.md

Using the third-party requests library (pip install requests), present in most Python setups:

python

import requests

with open("document.pdf", "rb") as f:
    response = requests.post(
        "https://mdstill.com/api/convert",
        headers={"Authorization": "Bearer mdr_your_api_key"},
        files={"file": f},
        timeout=60,
    )

response.raise_for_status()
data = response.json()

markdown = data["markdown"]
meta = data["metadata"]
print(f"Converted in {meta['conversion_time_sec']}s, {meta['token_count']} tokens")

# Write as UTF-8 explicitly; converted text often contains non-ASCII characters.
with open("document.md", "w", encoding="utf-8") as f:
    f.write(markdown)

In the browser, upload a file directly from an <input type="file">:

javascript

async function convertFile(file) {
  const form = new FormData();
  form.append("file", file);

  const response = await fetch("https://mdstill.com/api/convert", {
    method: "POST",
    headers: {
      Authorization: "Bearer mdr_your_api_key",
    },
    body: form,
  });

  if (!response.ok) {
    const err = await response.json().catch(() => ({}));
    throw new Error(err.detail || `HTTP ${response.status}`);
  }

  const { markdown, metadata } = await response.json();
  console.log(`Converted in ${metadata.conversion_time_sec}s`);
  return markdown;
}

Node 18+ has fetch, Blob, and FormData built in — no dependencies needed:

javascript

import { readFile } from "node:fs/promises";

const buffer = await readFile("document.pdf");
const form = new FormData();
form.append("file", new Blob([buffer]), "document.pdf");

const response = await fetch("https://mdstill.com/api/convert", {
  method: "POST",
  headers: { Authorization: "Bearer mdr_your_api_key" },
  body: form,
});

if (!response.ok) {
  const err = await response.json().catch(() => ({}));
  throw new Error(err.detail || `HTTP ${response.status}`);
}

const { markdown, metadata } = await response.json();
console.log(`${metadata.markdown_length} chars, ~${metadata.token_count} tokens`);

go

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

type Response struct {
	Markdown string `json:"markdown"`
	Metadata struct {
		ConversionTimeSec float64 `json:"conversion_time_sec"`
		TokenCount        int     `json:"token_count"`
	} `json:"metadata"`
}

func main() {
	file, err := os.Open("document.pdf")
	if err != nil {
		panic(err)
	}
	defer file.Close()

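	// Build the multipart body with the file under the "file" field.
	body := &bytes.Buffer{}
	writer := multipart.NewWriter(body)
	part, err := writer.CreateFormFile("file", "document.pdf")
	if err != nil {
		panic(err)
	}
	if _, err := io.Copy(part, file); err != nil {
		panic(err)
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := writer.Close(); err != nil {
		panic(err)
	}

	req, err := http.NewRequest("POST", "https://mdstill.com/api/convert", body)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer mdr_your_api_key")
	req.Header.Set("Content-Type", writer.FormDataContentType())

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Errors come back as JSON with a "detail" field; surface the raw body.
	if resp.StatusCode != http.StatusOK {
		msg, _ := io.ReadAll(resp.Body)
		panic(fmt.Sprintf("API error: HTTP %d: %s", resp.StatusCode, msg))
	}

	var result Response
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		panic(err)
	}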
	fmt.Printf("Converted in %.2fs, ~%d tokens\n",
		result.Metadata.ConversionTimeSec, result.Metadata.TokenCount)
}

php

<?php
$ch = curl_init("https://mdstill.com/api/convert");

curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER     => [
        "Authorization: Bearer mdr_your_api_key",
    ],
    CURLOPT_POSTFIELDS     => [
        "file" => new CURLFile("document.pdf"),
    ],
]);

$response = curl_exec($ch);
$status   = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($status !== 200) {
    throw new RuntimeException("API error: HTTP $status");
}

$data = json_decode($response, true);
echo "Converted in {$data['metadata']['conversion_time_sec']}s\n";
file_put_contents("document.md", $data["markdown"]);

Convert every PDF in a directory. Drop-in pattern for shell scripts:

bash

mkdir -p output
for file in documents/*.pdf; do
  echo "Converting $file..."
  curl -s -X POST https://mdstill.com/api/convert \
    -H "Authorization: Bearer mdr_your_api_key" \
    -F "file=@$file" \
    | jq -r '.markdown' > "output/$(basename "$file" .pdf).md"
done

Structured / RAG-ready output

Chunks with metadata for vector databases

Add output=structured to get the Markdown plus a structure object with semantic sections, a document outline, and token-counted chunks. Each chunk includes a heading path and content type labels, ready to drop into LangChain, LlamaIndex, or any RAG pipeline.

bash

curl -X POST https://mdstill.com/api/convert \
  -H "Authorization: Bearer mdr_your_api_key" \
  -F "file=@report.pdf" \
  -F "output=structured" \
  -F "chunk_tokens=500"

The response includes everything from the standard response, plus a structure field:

json

{
  "markdown": "# Introduction\n\nThis report covers...",
  "metadata": {
    "filename": "report.pdf",
    "format": ".pdf",
    "converter": "fast",
    "size_bytes": 245760,
    "conversion_time_sec": 0.42,
    "markdown_length": 8320,
    "token_count": 2080
  },
  "structure": {
    "sections": [
      {"heading": "Introduction", "level": 1, "content": "This report covers...", "tokens": 342},
      {"heading": "Methods", "level": 1, "content": "We used...", "tokens": 518}
    ],
    "headings": ["Introduction", "Methods", "Results", "Discussion"],
    "total_tokens": 2080,
    "max_chunk_tokens": 500,
    "chunks": [
      {
        "id": 0,
        "text": "# Introduction\n\nThis report covers...",
        "tokens": 342,
        "heading_path": ["Introduction"],
        "content_types": ["paragraph"]
      },
      {
        "id": 1,
        "text": "# Methods\n\nWe used a mixed-methods approach...",
        "tokens": 487,
        "heading_path": ["Methods"],
        "content_types": ["paragraph", "table"]
      }
    ]
  }
}

Chunk fields

id

integer

Sequential chunk index, starting from 0.

text

string

The chunk content as Markdown text.

tokens

integer

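Token count under tiktoken's cl100k_base encoding (the GPT-4 and GPT-3.5 Turbo tokenizer). Other model families, including Claude, tokenize differently, so treat the number as an estimate for them.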

heading_path

string[]

Breadcrumb of parent headings, e.g. ["Chapter 1", "Methods", "Data Collection"]. Empty for documents without headings (most PDFs).

content_types

string[]

Types of content in this chunk: paragraph, table, list, code.
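
The fields above map naturally onto a small typed record. The sketch below assumes the response shape shown in the JSON example; the Chunk class and the helper names are illustrative, not part of any official client:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    id: int
    text: str
    tokens: int
    heading_path: tuple
    content_types: tuple


def parse_chunks(response_json):
    """Turn the structure.chunks array of a structured response into Chunk records."""
    return [
        Chunk(
            id=c["id"],
            text=c["text"],
            tokens=c["tokens"],
            heading_path=tuple(c["heading_path"]),
            content_types=tuple(c["content_types"]),
        )
        for c in response_json["structure"]["chunks"]
    ]


def chunks_with(chunks, content_type):
    """Select chunks that contain a given content type, e.g. "table"."""
    return [c for c in chunks if content_type in c.content_types]
```

Filtering by content type is useful when, for example, tables should be embedded or indexed differently from plain paragraphs.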

How chunking works

--Markdown is parsed into an AST of atomic blocks: paragraphs, tables, lists, code blocks, headings.
--Atomic blocks are never split mid-element. A 2000-token table stays as one chunk even if chunk_tokens=500.
--Each heading starts a new chunk. Adjacent small blocks are merged until they hit the max.
--Overlap: the last paragraph of the previous chunk is repeated at the start of the next for retrieval context.
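
The four rules above can be illustrated with a short greedy merger. This is only a sketch of the described behaviour, not the service's implementation: it assumes blocks are already parsed into (kind, text) pairs, uses a whitespace word count as a stand-in for the tiktoken count, and applies overlap only when a chunk is closed for size (whether overlap also crosses heading boundaries is not stated above).

```python
def count_tokens(text):
    # Stand-in tokenizer for the example: one token per whitespace-separated word.
    return len(text.split())


def chunk_blocks(blocks, max_tokens=500):
    """Greedily merge atomic (kind, text) blocks into chunk strings."""
    chunks = []
    current = []   # texts in the chunk being built, including any overlap
    fresh = []     # (kind, text) blocks added to this chunk, excluding overlap
    tokens = 0

    def flush():
        nonlocal current, fresh, tokens
        if fresh:
            chunks.append("\n\n".join(current))
        current, fresh, tokens = [], [], 0

    for kind, text in blocks:
        size = count_tokens(text)
        if kind == "heading":
            # Every heading opens a new chunk.
            flush()
        elif fresh and tokens + size > max_tokens:
            # Close the chunk and carry its last paragraph into the next one.
            overlap = next((t for k, t in reversed(fresh) if k == "paragraph"), None)
            flush()
            if overlap is not None:
                current = [overlap]
                tokens = count_tokens(overlap)
        # Blocks are appended whole, so an oversized table or code block
        # produces a chunk larger than max_tokens instead of being split.
        current.append(text)
        fresh.append((kind, text))
        tokens += size
    flush()
    return chunks
```

Calling `chunk_blocks(blocks, max_tokens=6)` on a heading, two three-word paragraphs, and a nine-token table yields chunks where the second starts with the first chunk's last paragraph and the table stays in one piece.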

Convert and get chunks ready for a vector database:

python

import requests

with open("report.pdf", "rb") as f:
    resp = requests.post(
        "https://mdstill.com/api/convert",
        headers={"Authorization": "Bearer mdr_your_api_key"},
        files={"file": f},
        data={"output": "structured", "chunk_tokens": "500"},
        timeout=60,
    )

# Fail loudly on 4xx/5xx instead of indexing into an error body.
resp.raise_for_status()
data = resp.json()
chunks = data["structure"]["chunks"]

# Each chunk is ready for embedding
for chunk in chunks:
    print(f"Chunk {chunk['id']}: {chunk['tokens']} tokens")
    print(f"  Path: {' > '.join(chunk['heading_path']) or '(root)'}")
    print(f"  Types: {chunk['content_types']}")
    # embed(chunk["text"])  # your embedding call here

Extract just the chunks array:

bash

curl -s -X POST https://mdstill.com/api/convert \
  -H "Authorization: Bearer mdr_your_api_key" \
  -F "file=@report.pdf" \
  -F "output=structured" \
  -F "chunk_tokens=500" \
  | jq '.structure.chunks' > chunks.json

javascript

import { readFile } from "node:fs/promises";

const buffer = await readFile("report.pdf");
const form = new FormData();
form.append("file", new Blob([buffer]), "report.pdf");
form.append("output", "structured");
form.append("chunk_tokens", "500");

const resp = await fetch("https://mdstill.com/api/convert", {
  method: "POST",
  headers: { Authorization: "Bearer mdr_your_api_key" },
  body: form,
});

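if (!resp.ok) {
  const err = await resp.json().catch(() => ({}));
  throw new Error(err.detail || `HTTP ${resp.status}`);
}

const { structure } = await resp.json();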
console.log(`${structure.chunks.length} chunks, ${structure.total_tokens} tokens`);

// Feed chunks into your vector store (vectorStore is a placeholder for your own client)
for (const chunk of structure.chunks) {
  await vectorStore.upsert({
    id: `report-${chunk.id}`,
    text: chunk.text,
    metadata: {
      heading_path: chunk.heading_path,
      content_types: chunk.content_types,
      tokens: chunk.tokens,
    },
  });
}

Error codes

All errors return a JSON body with a detail field explaining what went wrong:

json

{
  "detail": "File too large (45MB). Max: 20MB"
}

Code	Meaning	When it happens & how to fix
400	Bad Request	Unsupported format, malformed filename, or file exceeds your plan's size limit. The `detail` field says which. Check the supported formats list and your plan limits.
408	Request Timeout	Conversion took longer than the server timeout. Usually means the file is very large or structurally complex (deeply nested tables, thousands of pages). Try splitting the document.
413	Payload Too Large	Upload exceeded the absolute body-size ceiling before even reaching the converter. Hard cap independent of plan. Split the file and convert pieces separately.
429	Too Many Requests	Daily quota exhausted. The `detail` field shows current usage (`used/limit`). Quotas reset at 00:00 UTC. Sign up or upgrade for a higher limit.
500	Internal Server Error	The converter hit an unexpected error on this specific file. Usually a corrupted or unusual input. Retry once; if it still fails, the file format is likely outside what we handle.
503	Service Unavailable	Server temporarily overloaded or a dependency is degraded. Retry with exponential backoff (1s, 2s, 4s).

There is currently no 401 Unauthorized on the convert endpoint — an invalid or missing API key simply falls through to the anonymous tier and its lower quota.
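
The retry guidance in the table can be expressed as a small wrapper. A sketch under stated assumptions: send is a hypothetical zero-argument callable that performs the request and returns a (status, body) pair, and the sleep function is injectable so the logic can be exercised without waiting:

```python
import time


def backoff_delays(attempts=3, base=1.0):
    """Exponential delays in seconds: 1, 2, 4 for the defaults."""
    return [base * 2 ** i for i in range(attempts)]


def call_with_retry(send, sleep=time.sleep):
    """Call send() and retry according to the error table.

    503 is retried with exponential backoff, 500 is retried once,
    and every other status is returned to the caller unchanged.
    """
    delays = backoff_delays()
    used_delays = 0
    retried_500 = False
    while True:
        status, body = send()
        if status == 503 and used_delays < len(delays):
            sleep(delays[used_delays])
            used_delays += 1
            continue
        if status == 500 and not retried_500:
            retried_500 = True
            continue
        return status, body
```

429 is deliberately not retried here: the quota only resets at 00:00 UTC, so looping on it would not help.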

Rate limits

Limits are daily, per plan. Anonymous requests count against a per-IP quota; authenticated requests count against your account's quota regardless of which API key was used.

Plan	Fast conversions / day	Max file size
Anonymous	10	10 MB
Free	50	20 MB
Pro	500	50 MB

--Quotas reset at 00:00 UTC every day. The counter is based on successful conversions — failed requests (4xx/5xx) do not consume your quota.
--Web interface and API share the same daily quota per account.
--When you exceed the limit the API returns 429 Too Many Requests with a body like {"detail": "Daily fast conversion limit reached (50/50). Upgrade your plan for higher limits."}.
--There is no Retry-After header yet — assume the next window opens at the next UTC midnight.
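
Since there is no Retry-After header, a client has to work out the wait itself, and it can also check file size before uploading. A minimal sketch; the helper names are invented for this example, and treating 1 MB as 1024 × 1024 bytes is an assumption the table above does not confirm:

```python
from datetime import datetime, timedelta, timezone

# Per-plan size limits in MB, copied from the table above.
PLAN_MAX_MB = {"anonymous": 10, "free": 20, "pro": 50}


def seconds_until_reset(now=None):
    """Seconds from now (an aware datetime) until the next 00:00 UTC."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


def fits_plan(size_bytes, plan):
    """True if a file of size_bytes is within the plan's size limit.

    Assumes 1 MB = 1024 * 1024 bytes.
    """
    return size_bytes <= PLAN_MAX_MB[plan] * 1024 * 1024
```

After a 429, a batch job could sleep for `seconds_until_reset()` before resuming instead of polling the endpoint.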

Supported formats

→ Markdown (forward, 18)

Extension	Format
.pdf	PDF
.docx	DOCX
.doc	DOC
.pptx	PPTX
.xlsx	XLSX
.xls	XLS
.html	HTML
.htm	HTM
.epub	EPUB
.csv	CSV
.json	JSON
.xml	XML
.rtf	RTF
.odt	ODT
.pages	Pages
.numbers	Numbers
.key	Keynote
.zip	ZIP

Markdown → X (reverse, 13)

Extension	Format
.pdf	PDF
.docx	Word
.html	HTML
.epub	EPUB
.pptx	PowerPoint
.rtf	RTF
.odt	ODT
.tex	LaTeX
.txt	Plain text
.adoc	AsciiDoc
.rst	reStructuredText
.fb2	FictionBook
.doc	Word 97-2003

Notes

--Files are processed in memory and deleted immediately after conversion. Nothing is stored on our servers.
--The API supports both plain Markdown and structured RAG-ready output with semantic chunking. Use output=structured for the latter.
--The same endpoint goes the other way too. Send a .md file with a target_format field and the API returns the converted file as a binary download — see Markdown → other formats above.
--Track your API key usage in the Dashboard. Each key shows total conversions and last activity.
--You can generate up to 5 API keys per account. Revoke unused keys from the Dashboard.
