Benchmark overview

How mdcraft evaluates document quality across text-first files, scans, tables, and review-heavy inputs.

mdcraft measures conversion quality with benchmarks rather than relying on vague quality claims.

What is being measured#

The benchmark and review loop focuses on:

  • clean text-first PDFs
  • scans and OCR-heavy inputs
  • tables
  • multi-column layouts
  • lists and heading structure
  • code blocks and technical formatting
  • export fidelity for polished Markdown outputs
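The categories above can be organized as a fixture corpus. A minimal sketch, assuming a simple directory-per-category layout (all directory and function names here are hypothetical, not mdcraft's actual structure):

```python
from pathlib import Path

# Hypothetical layout: one fixture directory per benchmark category.
# These names are illustrative only.
BENCHMARK_CATEGORIES = {
    "text_first": "fixtures/text_first",
    "ocr_scans": "fixtures/ocr_scans",
    "tables": "fixtures/tables",
    "multi_column": "fixtures/multi_column",
    "lists_headings": "fixtures/lists_headings",
    "code_blocks": "fixtures/code_blocks",
    "export_fidelity": "fixtures/export_fidelity",
}

def fixture_pdfs(category: str, root: Path = Path(".")) -> list[Path]:
    """Collect the input PDFs for one benchmark category."""
    return sorted((root / BENCHMARK_CATEGORIES[category]).glob("*.pdf"))
```

Keeping each category in its own directory makes it easy to run one slice of the benchmark (for example, only tables) when iterating on a specific weakness.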

Why that matters#

Document conversion is not one problem.

The same engine can be good at text-first PDFs and weak on dense tables or noisy scans. Benchmarks make those tradeoffs visible, which is why mdcraft stores warnings, quality signals, and failure history in the conversion ledger.
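A ledger entry of that kind might be modeled like this. This is a minimal sketch with hypothetical field names, not mdcraft's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class LedgerEntry:
    """One conversion's quality record: warnings, signals, failure history."""
    document: str
    warnings: list[str] = field(default_factory=list)
    quality_signals: dict[str, float] = field(default_factory=dict)
    failure_history: list[str] = field(default_factory=list)

    def flag(self, warning: str) -> None:
        """Record a warning surfaced during conversion."""
        self.warnings.append(warning)

# Example: record a lossy table conversion on one document.
entry = LedgerEntry("report.pdf")
entry.flag("table spans detected; cell merge may be lossy")
entry.quality_signals["table_fidelity"] = 0.82
```

Storing warnings and scores per document is what lets recurring failures be pulled back out later as benchmark fixtures.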

What the benchmark loop does#

  • catches regressions before new conversion logic ships
  • turns recurring production failures into repeatable fixtures
  • compares providers and fallback strategies
  • supports product-quality decisions with evidence
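The regression-catching step can be sketched as a similarity check against a golden Markdown fixture. This is an illustrative sketch, not mdcraft's actual harness; in practice the converted output would come from the real conversion entry point:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how closely the output matches the golden file."""
    return SequenceMatcher(None, a, b).ratio()

def check_regression(output: str, golden: str, threshold: float = 0.95) -> bool:
    """Pass only if the output stays within the allowed drift of the golden fixture."""
    return similarity(output, golden) >= threshold

# Identical outputs pass; heavy drift fails.
assert check_regression("# Title\n\nBody text.", "# Title\n\nBody text.")
```

A threshold comparison, rather than exact string equality, tolerates harmless whitespace churn while still failing the run when new conversion logic meaningfully degrades output.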

What benchmark results should mean#

A good benchmark result does not mean every document will convert perfectly.

It means:

  • the output is readable
  • the structure is recoverable
  • the remaining cleanup burden is honest and manageable

That is more useful than pretending conversion is exact when it is not.