Parsing

March 24, 2026

Excel to Markdown Tables: The Complete Guide

Alex Riveria

Core Maintainer

4 min read

Spreadsheets are the most common source of tabular data in business. Converting them to Markdown unlocks version control, documentation embedding, and LLM-friendly formatting. This guide covers GFM table syntax, how mdstill converts workbooks, and the edge cases to watch for when going from XLSX to clean GFM tables.

GFM Table Syntax

GitHub-Flavored Markdown tables are simple but strict. Here is the anatomy:

| Header 1 | Header 2 | Header 3 |
| :------- | :------: | -------: |
| left     |  center  |    right |
| aligned  | aligned  |  aligned |

Rules to remember:

  • The header row is required -- GFM has no headerless tables
  • The separator row uses dashes (---) with optional colons for alignment
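  • A colon on the left of the dashes left-aligns the column, colons on both sides center it, and a colon on the right right-aligns it
  • The header and separator rows must have the same number of cells, or the table is not recognized; body rows with fewer cells are padded with empty cells and extra cells are dropped, so keep every row the same width
  • Literal pipe characters in cell content must be escaped as \|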
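The alignment rules can be made concrete with a small helper that builds the separator row. This is a minimal sketch for illustration only; `separator_cell` and `separator_row` are hypothetical names, not part of mdstill or any library.

```python
def separator_cell(align: str, width: int = 3) -> str:
    """Build one separator-row cell for 'left', 'center', 'right', or 'none'."""
    width = max(width, 3)  # three characters is a readable minimum
    if align == "left":
        return ":" + "-" * (width - 1)
    if align == "right":
        return "-" * (width - 1) + ":"
    if align == "center":
        return ":" + "-" * (width - 2) + ":"
    if align == "none":
        return "-" * width
    raise ValueError(f"unknown alignment: {align!r}")


def separator_row(aligns: list[str], widths: list[int]) -> str:
    """Join the separator cells into a pipe-delimited row."""
    cells = [separator_cell(a, w) for a, w in zip(aligns, widths)]
    return "| " + " | ".join(cells) + " |"


# Rebuilds the separator row from the anatomy example above.
print(separator_row(["left", "center", "right"], [8, 8, 8]))
```

The column widths only affect how the raw Markdown looks; renderers read alignment from the colons alone.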

How mdstill Converts Excel

When you upload an XLSX file to mdstill, the conversion pipeline does the following:

  1. Parse the workbook and extract sheet data
  2. Detect the data range in each sheet -- ignoring empty rows and columns at the edges
  3. Use the first row of the detected range as the header row, since GFM requires one
  4. Render each data row as a pipe-delimited Markdown table row
  5. Apply formatting -- numbers are right-aligned, text is left-aligned

The output for a simple sales spreadsheet:

| Product   | Units Sold |      Revenue | Region    |
| :-------- | ---------: | -----------: | :-------- |
| Widget A  |      1,200 |      $48,000 | Northeast |
| Widget B  |        890 |      $35,600 | Southeast |
| Widget C  |      2,100 |      $84,000 | West      |
| **Total** |  **4,190** | **$167,600** |           |
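The range detection, header extraction, and alignment steps listed earlier can be sketched in a few functions. This is an illustration under stated assumptions, not mdstill's actual code: the names `trim_empty_edges`, `looks_numeric`, and `render_table` are hypothetical, the input is assumed to be a rectangular list of rows holding already-formatted display values, and the numeric check is a simple heuristic.

```python
import re


def trim_empty_edges(grid):
    """Drop fully empty rows and columns from the outer edges of the grid."""
    def empty(cells):
        return all(c is None or str(c).strip() == "" for c in cells)

    rows = [list(r) for r in grid]
    while rows and empty(rows[0]):
        rows.pop(0)
    while rows and empty(rows[-1]):
        rows.pop()
    if not rows:
        return []
    cols = list(zip(*rows))  # assumes every row has the same length
    left, right = 0, len(cols)
    while left < right and empty(cols[left]):
        left += 1
    while right > left and empty(cols[right - 1]):
        right -= 1
    return [r[left:right] for r in rows]


def looks_numeric(text):
    """Heuristic: values such as 1,200, $48,000, 12.5% or **4,190** count as numeric."""
    stripped = text.strip().strip("*")  # ignore bold markers
    return bool(re.fullmatch(r"[-+]?\$?\d[\d,]*(?:\.\d+)?[KMB%]?", stripped))


def render_table(grid):
    """Render a grid as a GFM table, using the first non-empty row as the header."""
    rows = [["" if c is None else str(c) for c in r] for r in trim_empty_edges(grid)]
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    ncols = len(header)
    # A column is right-aligned when every non-empty body cell looks numeric.
    right = []
    for i in range(ncols):
        values = [r[i] for r in body if r[i].strip()]
        right.append(bool(values) and all(looks_numeric(v) for v in values))
    widths = [max(3, *(len(r[i]) for r in rows)) for i in range(ncols)]

    def line(cells):
        padded = [c.rjust(w) if r else c.ljust(w) for c, w, r in zip(cells, widths, right)]
        return "| " + " | ".join(padded) + " |"

    sep = ["-" * (w - 1) + ":" if r else ":" + "-" * (w - 1) for w, r in zip(widths, right)]
    return "\n".join([line(header), "| " + " | ".join(sep) + " |"] + [line(r) for r in body])


grid = [
    [None, None, None],
    [None, "Product", "Units Sold"],
    [None, "Widget A", "1,200"],
    [None, "Widget B", "890"],
]
print(render_table(grid))
```

Padding cells to a common width is optional for GFM, but it keeps the raw Markdown readable in diffs and code review.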

Multi-Sheet Workbooks

Real-world Excel files often contain multiple sheets. mdstill handles this by converting each sheet into a separate section with a heading:

## Sheet: Revenue

| Quarter | Domestic | International |
| :------ | -------: | ------------: |
| Q1      |    $4.2M |         $2.1M |
| Q2      |    $4.8M |         $2.5M |

## Sheet: Expenses

| Category  | Budget | Actual | Variance |
| :-------- | -----: | -----: | -------: |
| Payroll   |  $2.1M |  $2.0M |   -$100K |
| Marketing |  $800K |  $920K |   +$120K |

Each sheet gets its own heading, making it easy to navigate and search. If you only need one sheet, you can simply delete the sections you do not need from the output.
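Deleting unwanted sections can also be scripted. The sketch below assumes the output uses `## Sheet: <name>` headings exactly as in the example above; `split_sheets` and `keep_sheets` are hypothetical helper names, not an mdstill API.

```python
import re


def split_sheets(markdown: str) -> dict[str, str]:
    """Map each sheet name to its section body, keyed by '## Sheet: <name>' headings."""
    parts = re.split(r"^## Sheet: (.+)$", markdown, flags=re.MULTILINE)
    # With one capture group, re.split yields [preamble, name1, body1, name2, body2, ...]
    names, bodies = parts[1::2], parts[2::2]
    return {name.strip(): body.strip() for name, body in zip(names, bodies)}


def keep_sheets(markdown: str, wanted: set[str]) -> str:
    """Return only the sections whose sheet name is in `wanted`, in original order."""
    sections = split_sheets(markdown)
    kept = [f"## Sheet: {name}\n\n{body}" for name, body in sections.items() if name in wanted]
    return "\n\n".join(kept) + "\n"


output = """## Sheet: Revenue

| Quarter | Domestic | International |
| :------ | -------: | ------------: |
| Q1      |    $4.2M |         $2.1M |

## Sheet: Expenses

| Category | Budget |
| :------- | -----: |
| Payroll  |  $2.1M |
"""

print(keep_sheets(output, {"Revenue"}))
```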

Formulas vs Values

A critical question: what happens to Excel formulas?

mdstill extracts computed values, not formula expressions. If cell C2 contains =A2*B2 and displays "4,200", the Markdown output will show "4,200". This is almost always what you want -- the Markdown is a snapshot of the data, not a live spreadsheet.

If you need the formulas themselves (for documentation or auditing), that is a different workflow. Most use cases -- documentation, LLM input, data archival -- want the rendered values.

Formatting is preserved where meaningful:

  • Currency symbols and thousands separators appear as displayed
  • Dates render in their formatted form (e.g., "Mar 14, 2026" not "46095")
  • Percentages display with the percent sign
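The raw number behind a date cell is Excel's serial day count. A short sketch of the conversion, assuming the default 1900 date system (workbooks that use the 1904 system need a different epoch); the function names are illustrative:

```python
from datetime import date, timedelta

# Excel's 1900 date system counts a nonexistent 1900-02-29, so using
# 1899-12-30 as the epoch gives correct results for every date from
# 1900-03-01 onward.
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial: int) -> date:
    """Convert an Excel serial day number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


def date_to_serial(d: date) -> int:
    """Convert a calendar date to its Excel serial day number."""
    return (d - EXCEL_EPOCH).days


print(date_to_serial(date(2026, 3, 14)))
print(serial_to_date(46095).strftime("%b %d, %Y"))
```

This is why the displayed form is more useful in Markdown: the serial number carries no meaning without the cell's number format.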

Handling Large Spreadsheets

GFM tables become hard to read when they exceed ~10-15 columns or ~100 rows. mdstill does not impose limits on conversion, but here are practical strategies:

Wide tables (many columns). The Markdown will be correct but may not render well in narrow viewports. Consider whether all columns are necessary, or if you can split the table.

Long tables (many rows). A 1,000-row table converts fine, but embedding it in a document is unusual. Common approaches:

  • Show the first N rows as a summary with a note about the full dataset
  • Split into multiple tables by category
  • Use the Markdown as LLM input, where rendering width and page length do not matter (models handle long tables well, within their context limits)
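The first of these approaches, a truncated preview with a note, takes only a few lines. This is a sketch that post-processes an already converted table; `preview_table` is a hypothetical helper, and the wording of the note is arbitrary.

```python
def preview_table(markdown_table: str, max_rows: int = 10) -> str:
    """Keep the header, separator, and first `max_rows` data rows of a GFM table."""
    lines = markdown_table.strip().splitlines()
    head, body = lines[:2], lines[2:]  # header row + separator row, then data rows
    if len(body) <= max_rows:
        return "\n".join(lines)
    note = f"\n\n_Showing {max_rows} of {len(body)} rows._"
    return "\n".join(head + body[:max_rows]) + note


rows = "\n".join(f"| {i} | item {i} |" for i in range(1, 26))
table = "| --: | :-- |\n" + rows
table = "| ID | Name |\n" + table

print(preview_table(table, max_rows=5))
```

The note sits after a blank line so that it renders as a paragraph below the table rather than as another row.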

Memory considerations. Very large workbooks (50MB+) with hundreds of thousands of cells may take longer to process. mdstill handles these within standard timeout limits, but if you are hitting issues, consider splitting the workbook into smaller files.

Common Edge Cases

Empty cells. Rendered as empty table cells -- the pipes are present but no content between them.

Merged cells. Excel allows merging cells across rows and columns. mdstill unmerges them, placing the content in the top-left cell of the merged range and leaving the rest empty. This produces valid GFM output.
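The unmerge behavior described here can be illustrated on a plain grid. The sketch assumes merged ranges are given as zero-based, inclusive `(first_row, first_col, last_row, last_col)` tuples; `unmerge` is a hypothetical name and not mdstill's implementation.

```python
def unmerge(grid, merged_ranges):
    """Return a copy of `grid` where each merged range keeps only its top-left value."""
    out = [list(row) for row in grid]
    for r0, c0, r1, c1 in merged_ranges:
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                if (r, c) != (r0, c0):
                    out[r][c] = ""  # every other cell in the range becomes empty
    return out


grid = [
    ["2026 Totals", "2026 Totals", "Notes"],
    ["Q1", "Q2", "Notes"],
]
# The first two cells of row 0 form one merged range; so does column 2 across both rows.
for row in unmerge(grid, [(0, 0, 0, 1), (0, 2, 1, 2)]):
    print(row)
```

Because every row keeps the same number of cells, the result still satisfies the GFM column-count rules.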

Special characters. Pipes (|) in cell content are escaped. Newlines within cells are converted to spaces, since GFM table cells cannot contain line breaks.
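A cell sanitizer covering both rules might look like the following. It is a minimal sketch; `sanitize_cell` is a hypothetical name, and it assumes the input has not already been escaped.

```python
import re


def sanitize_cell(value) -> str:
    """Make a cell value safe to place inside a single GFM table cell."""
    text = "" if value is None else str(value)
    text = re.sub(r"\s*[\r\n]+\s*", " ", text)  # line breaks become single spaces
    text = text.replace("|", "\\|")             # escape literal pipes
    return text.strip()


print(sanitize_cell("yes|no"))
print(sanitize_cell("line one\nline two"))
```

Empty cells (`None`) come back as an empty string, which matches the empty-cell behavior described above.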

Named ranges and pivot tables. These are Excel features without Markdown equivalents. mdstill converts what is visible on each sheet -- if a pivot table is displayed, its rendered values are captured.

#excel #xlsx #tables #markdown
