Convert PDF to markdown quickly with high accuracy
GPL-3.0 License
This release has a 15% speedup on GPU, 3x on CPU, and 7x on MPS. The speedup comes from new surya models for layout and text detection that are much more efficient.
These are "best case" numbers: if you need to OCR or do equation recognition, the speedup will be lower, but it will still be a lot faster.
Published by VikParuchuri 4 months ago
Published by VikParuchuri 5 months ago
Fix model device check.
Published by VikParuchuri 5 months ago
convert.py
Each process now takes 3GB VRAM (was between 4.5GB and 5GB before). This enables much higher throughput.
Published by VikParuchuri 5 months ago
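To see why the lower per-process VRAM figure matters for throughput, here is a rough back-of-the-envelope sketch. The `max_workers` helper and the 24GB card are hypothetical and chosen only for illustration; the calculation ignores any fixed overhead from the driver or other programs, so treat the result as an upper bound rather than a recommended setting.

```python
def max_workers(total_vram_gb: float, per_process_gb: float = 3.0) -> int:
    """Return how many processes fit in a VRAM budget (hypothetical helper).

    per_process_gb defaults to the 3GB figure from the release note.
    """
    if per_process_gb <= 0:
        raise ValueError("per_process_gb must be positive")
    # Floor division: a partially fitting process does not count.
    return int(total_vram_gb // per_process_gb)


# Assumed 24GB GPU, purely for illustration.
TOTAL_VRAM_GB = 24.0

for label, per_process in (("before, at 5GB each", 5.0), ("now, at 3GB each", 3.0)):
    print(f"{label}: {max_workers(TOTAL_VRAM_GB, per_process)} processes")
```

Under these assumptions the same card goes from 4 parallel processes to 8, which is where the throughput gain would come from.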
Should be significantly faster now, but I haven't fully benchmarked it, since I'm running low on time this week!
Published by VikParuchuri 5 months ago
Published by VikParuchuri 6 months ago
Basically a full rewrite!
Main features:
It takes ~2x as long to run now, but seems like a decent tradeoff.
See the README for details.