Benchmark overview

How mdcraft evaluates document quality across text-first files, scans, tables, and review-heavy inputs.

mdcraft measures conversion quality with benchmarks rather than relying on vague quality claims.

What is being measured#

The benchmark and review loop focuses on:

  • clean text-first PDFs
  • scans and OCR-heavy inputs
  • tables
  • multi-column layouts
  • lists and heading structure
  • code blocks and technical formatting
  • export fidelity for polished Markdown outputs
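The categories above can be organized as a fixture corpus. A minimal sketch, assuming a simple directory-per-category layout (all directory and function names here are hypothetical, not mdcraft's actual structure):

```python
from pathlib import Path

# Hypothetical layout: one fixture directory per benchmark category.
# These names are illustrative only.
BENCHMARK_CATEGORIES = {
    "text_first": "fixtures/text_first",
    "ocr_scans": "fixtures/ocr_scans",
    "tables": "fixtures/tables",
    "multi_column": "fixtures/multi_column",
    "lists_headings": "fixtures/lists_headings",
    "code_blocks": "fixtures/code_blocks",
    "export_fidelity": "fixtures/export_fidelity",
}

def fixture_pdfs(category: str, root: Path = Path(".")) -> list[Path]:
    """Collect the input PDFs for one benchmark category."""
    return sorted((root / BENCHMARK_CATEGORIES[category]).glob("*.pdf"))
```

Keeping each category in its own directory makes it easy to run one slice of the benchmark (for example, only tables) when iterating on a specific weakness.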

Why that matters#

Document conversion is not one problem.

The same engine can be good at text-first PDFs and weak on dense tables or noisy scans. Benchmarks make those tradeoffs visible, which is why mdcraft stores warnings, quality signals, and failure history in the conversion ledger.
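A ledger entry of that kind might be modeled like this. This is a minimal sketch with hypothetical field names, not mdcraft's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class LedgerEntry:
    """One conversion's quality record: warnings, signals, failure history."""
    document: str
    warnings: list[str] = field(default_factory=list)
    quality_signals: dict[str, float] = field(default_factory=dict)
    failure_history: list[str] = field(default_factory=list)

    def flag(self, warning: str) -> None:
        """Record a warning surfaced during conversion."""
        self.warnings.append(warning)

# Example: record a lossy table conversion on one document.
entry = LedgerEntry("report.pdf")
entry.flag("table spans detected; cell merge may be lossy")
entry.quality_signals["table_fidelity"] = 0.82
```

Storing warnings and scores per document is what lets recurring failures be pulled back out later as benchmark fixtures.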

What the benchmark loop does#

  • catches regressions before new conversion logic ships
  • turns recurring production failures into repeatable fixtures
  • compares providers and fallback strategies
  • supports product-quality decisions with evidence
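The regression-catching step can be sketched as a similarity check against a golden Markdown fixture. This is an illustrative sketch, not mdcraft's actual harness; in practice the converted output would come from the real conversion entry point:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how closely the output matches the golden file."""
    return SequenceMatcher(None, a, b).ratio()

def check_regression(output: str, golden: str, threshold: float = 0.95) -> bool:
    """Pass only if the output stays within the allowed drift of the golden fixture."""
    return similarity(output, golden) >= threshold

# Identical outputs pass; heavy drift fails.
assert check_regression("# Title\n\nBody text.", "# Title\n\nBody text.")
```

A threshold comparison, rather than exact string equality, tolerates harmless whitespace churn while still failing the run when new conversion logic meaningfully degrades output.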

What benchmark results should mean#

A good benchmark result does not mean every document will convert perfectly.

It means:

  • the output is readable
  • the structure is recoverable
  • the remaining cleanup burden is honest and manageable

That is more useful than pretending conversion is exact when it is not.