Workflow | April 4, 2026

PDF to Markdown for Obsidian: The Complete Guide

Jordan Martinez

Technical Writer

4 min read

Obsidian is a powerful knowledge management tool, but its search, graph view, and backlinks are built around Markdown files. If your knowledge lives in PDFs -- research papers, textbooks, reports, manuals -- their contents are invisible to those features. Converting those PDFs to Markdown brings them into your vault as first-class citizens.

Why Convert PDFs for Obsidian

PDFs in Obsidian are second-class citizens. You can embed them, but you cannot:

  • Search inside them from the global search
  • Link to specific sections with [[wikilinks]]
  • See them in the graph view
  • Transclude sections into other notes
  • Tag individual paragraphs or sections

Once converted to Markdown, every sentence in that 200-page textbook becomes searchable, linkable, and part of your knowledge graph.

Backlinks become powerful. Mention a concept from a converted paper, wrap it in [[brackets]], and it connects to your existing notes. Your graph view grows richer.
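
Wrapping mentions by hand gets slow in a long converted paper. The following is a minimal sketch of one way to automate it, assuming you already have a list of note titles from your vault. The function name `link_concepts` is hypothetical, and matching is case-sensitive to keep the example short.

```python
import re

def link_concepts(text, note_titles):
    """Wrap the first plain-text mention of each note title in [[wikilinks]]."""
    if not note_titles:
        return text
    # Longer titles come first so "Gradient descent" wins over "Gradient"
    titles = sorted(note_titles, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in titles)
    # Group 1 matches an existing [[link]] (left untouched); group 2 a title
    pattern = re.compile(r"(\[\[.*?\]\])|\b(" + alternation + r")\b")
    linked = set()

    def replace(match):
        if match.group(1) or match.group(2) in linked:
            return match.group(0)
        linked.add(match.group(2))
        return f"[[{match.group(2)}]]"

    return pattern.sub(replace, text)
```

Only the first mention of each title is linked, which keeps the converted text readable while still creating the backlink.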

Search becomes complete. Instead of remembering which PDF mentioned a specific formula or statistic, just search your vault. Obsidian finds it instantly across all converted documents.

Converting with mdstill

  1. Go to mdstill.com and drop your PDF
  2. Review the output -- check that headings, tables, and lists look correct
  3. Download the .md file
  4. Move it into your Obsidian vault folder

For scanned PDFs (no selectable text), run ocrmypdf on the file first to add a text layer. mdstill reads the PDF's embedded text layer and cannot OCR scans on its own.
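
If you want to script the OCR step, a small wrapper around the ocrmypdf command line is usually enough. This is a sketch that assumes ocrmypdf is installed and on your PATH; the helper names `ocr_command` and `add_text_layer` are made up for illustration.

```python
import shutil
import subprocess

def ocr_command(src_pdf, dst_pdf):
    """Build the ocrmypdf invocation that adds a text layer to a scan."""
    # --skip-text leaves pages that already contain text untouched
    return ["ocrmypdf", "--skip-text", src_pdf, dst_pdf]

def add_text_layer(src_pdf, dst_pdf):
    """Run ocrmypdf; raises CalledProcessError if OCR fails."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    subprocess.run(ocr_command(src_pdf, dst_pdf), check=True)
```

Calling `add_text_layer("scan.pdf", "scan-ocr.pdf")` would write a searchable copy that you can then upload for conversion.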

The Markdown output uses a standard heading hierarchy (#, ##, ###) that Obsidian's outline panel understands. Tables are converted to GitHub Flavored Markdown (GFM) tables, which Obsidian renders natively in Reading view and Live Preview.

Handling Tables and Images

Tables convert to standard GFM Markdown tables. Obsidian renders them correctly in both Live Preview and Reading view. Tables with clean row and column structure work reliably; tables with merged cells or multi-level headers may lose some hierarchical grouping and benefit from a quick manual fix after conversion.
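
To find the tables that need a manual fix, one option is to flag rows whose cell count differs from their table's header row. The sketch below assumes every table row starts with a leading pipe, which GFM does not strictly require; `find_ragged_table_rows` is a hypothetical helper name.

```python
def find_ragged_table_rows(markdown):
    """Return (line_number, cell_count, expected) for rows that differ from their header."""
    problems = []
    expected = None
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("|"):
            expected = None  # any non-table line ends the current table
            continue
        # Drop escaped pipes, then one leading and one trailing pipe
        row = stripped.replace("\\|", "")[1:]
        if row.endswith("|"):
            row = row[:-1]
        cells = row.split("|")
        if expected is None:
            expected = len(cells)  # the first row of a table is its header
        elif len(cells) != expected:
            problems.append((number, len(cells), expected))
    return problems
```

Rows reported here are the likely leftovers of merged cells or multi-level headers.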

Images are not included in the Markdown output by default; mdstill produces pure text Markdown. If your PDF relies on images or diagrams, extract them separately (many PDF tools can export images), drop them into your vault's attachments folder, and add ![](attachment-name.png) references where needed.

Headings are preserved as Markdown headings, which means Obsidian's outline panel gives you a navigable table of contents for every converted document.
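
Before moving a file into the vault, you can preview roughly what the outline panel will show by listing the headings yourself. This is a simplified sketch: it only recognizes #-style headings and skips anything inside fenced code blocks; the function name `outline` is illustrative.

```python
import re

FENCE = "`" * 3  # a Markdown code fence marker

def outline(markdown):
    """Return (level, title) pairs for #-style headings, skipping fenced code."""
    headings = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        # One to six # characters followed by whitespace and a title
        match = re.match(r"^(#{1,6})\s+(.+?)\s*$", line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings
```

A document that jumps from level 1 straight to level 4, or has no headings at all, is a sign that the conversion needs a manual pass.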

Formulas in PDFs are extracted as plain text where possible, but complex mathematical notation may not render as clean LaTeX. For equation-heavy academic papers, consider a math-aware parser like Nougat or Mathpix instead.

Batch Conversion

If you have a large PDF library, converting files one by one is tedious. Use the mdstill API to automate:

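import json
import os
from datetime import date

import requests

pdf_dir = "/path/to/pdfs"
vault_path = "/path/to/obsidian/vault/References"
os.makedirs(vault_path, exist_ok=True)

for filename in sorted(os.listdir(pdf_dir)):
    stem, ext = os.path.splitext(filename)
    if ext.lower() != ".pdf":
        continue
    filepath = os.path.join(pdf_dir, filename)
    # Upload the PDF; the response body is used as the Markdown text
    with open(filepath, "rb") as f:
        resp = requests.post(
            "https://mdstill.com/api/convert",
            files={"file": f},
            timeout=120,
        )
    if not resp.ok:
        # Skip failed conversions instead of writing an error page to the vault
        print(f"Failed ({resp.status_code}): {filename}")
        continue
    # Add YAML frontmatter for Obsidian. The date is the PDF's last-modified
    # time as YYYY-MM-DD; the source is quoted so special characters stay valid YAML.
    modified = date.fromtimestamp(os.path.getmtime(filepath)).isoformat()
    source = json.dumps(filename, ensure_ascii=False)
    frontmatter = f"""---
source: {source}
type: reference
date: {modified}
---

"""
    with open(os.path.join(vault_path, stem + ".md"), "w", encoding="utf-8") as f:
        f.write(frontmatter + resp.text)
    print(f"Converted: {filename}")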

This script converts every PDF in the folder, adds YAML frontmatter (which Obsidian shows as properties and the Dataview plugin can query), and writes the result into your vault.

Organizing Converted Docs

Three approaches that work well:

Folder-based. Create a References/ folder in your vault and add subfolders by topic or source, such as References/Papers/, References/Reports/, and References/Books/. Simple, and it works at any scale.

Tag-based. Add tags to the YAML frontmatter during conversion. Then use Obsidian's tag pane or Dataview to browse by topic regardless of folder location.
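
As a rough illustration of tagging during conversion, the frontmatter can be built with a tags list. This sketch assumes the same source and type fields used elsewhere in this guide; `frontmatter_with_tags` is a hypothetical helper. Tags in frontmatter are written without the leading # and cannot contain spaces.

```python
import json

def frontmatter_with_tags(source, tags):
    """Build a YAML frontmatter block with a tags list for Obsidian."""
    # json.dumps quotes the file name so special characters stay valid YAML
    lines = ["---", f"source: {json.dumps(source)}", "type: reference", "tags:"]
    lines += [f"  - {tag}" for tag in tags]
    lines += ["---", "", ""]
    return "\n".join(lines)
```

For example, `frontmatter_with_tags("paper.pdf", ["ml", "optimization"])` returns a block you can prepend to the converted Markdown before writing it to the vault.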

MOC (Map of Content). Create a note called "Research MOC" that links to all converted documents with brief annotations. This gives you a curated entry point into your reference library.
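
A first draft of the MOC can be generated and then annotated by hand. The sketch below assumes your converted notes live under a References folder inside the vault and that file names are unique; `build_moc` and its default arguments are illustrative, not part of any tool.

```python
from pathlib import Path

def build_moc(vault_dir, folder="References", title="Research MOC"):
    """Write a MOC note that links to every Markdown file under `folder`."""
    vault = Path(vault_dir)
    # Sort by note name so the list is stable across operating systems
    notes = sorted((vault / folder).rglob("*.md"), key=lambda p: p.stem.lower())
    lines = [f"# {title}", ""]
    for note in notes:
        # Wikilinks use the file name without the .md suffix
        lines.append(f"- [[{note.stem}]]")
    moc_path = vault / f"{title}.md"
    moc_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return moc_path
```

Running it again overwrites the note, so add your annotations to a copy or extend the sketch to preserve existing lines.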

For large libraries, combine all three: folders for broad organization, tags for cross-cutting topics, and MOC notes for curated collections.

Converting PDFs to Markdown is the first step. The real value comes from integrating that content into your knowledge graph -- linking, annotating, and building on ideas from your source documents. mdstill handles the conversion; Obsidian handles the rest.

#obsidian #pdf #markdown #knowledge-management #workflow
