Hi HN! I built a Python CLI that batch-converts PDF page images into clean Markdown and then into EPUB/AZW3/MOBI. It leans on Alibaba DashScope’s multimodal models for OCR, auto-rotates/downsizes images with Pillow, retries requests with backoff, and resumes where it left off. The tool also merges pages into a single book.md, strips headers/footers, and calls pandoc (plus Calibre if present) for final ebooks.
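For anyone curious what "retries with backoff" and "resumes where it left off" can look like, here is a minimal sketch, not the tool's actual code. It assumes one Markdown file is written per page image and that an existing non-empty file means the page is done. The names with_backoff, convert_pages, and ocr_page are hypothetical, and ocr_page stands in for whatever function sends an image to the OCR model.

```python
import random
import time
from pathlib import Path
from typing import Callable, Iterable, List


def with_backoff(call: Callable[[], str], max_attempts: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Run call(), retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; a little jitter avoids lockstep retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
    raise RuntimeError("unreachable")


def convert_pages(images: Iterable[Path], out_dir: Path,
                  ocr_page: Callable[[Path], str],
                  **retry_kwargs) -> List[Path]:
    """OCR each page image to out_dir/<stem>.md, skipping finished pages."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for image in sorted(images):
        target = out_dir / (image.stem + ".md")
        # Resume (assumed convention): a non-empty output file means done.
        if target.exists() and target.stat().st_size > 0:
            continue
        text = with_backoff(lambda: ocr_page(image), **retry_kwargs)
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

Running convert_pages a second time over the same folder only processes the pages that are still missing, so an interrupted batch picks up from the first unfinished page.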
You can feed it PNG/JPG pages directly, or rasterize a PDF first with: pdftoppm -png -r 300 input.pdf output-prefix. Usage, parameters, and setup (Python deps, pandoc, Calibre) are documented in the README. Source: [add your repo URL or archive link]. Feedback on robustness, model compatibility, and additional cleanup heuristics would be awesome!
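A rough sketch of how the external tools could be chained from Python, assuming the merged file is named book.md as described above. The helper names (rasterize_cmd, ebook_cmds, run_all) are hypothetical, and the exact commands the tool runs may differ. The sketch relies only on the pdftoppm flags quoted above, pandoc's -o output option, and Calibre's ebook-convert input/output form.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Sequence


def rasterize_cmd(pdf: Path, prefix: Path, dpi: int = 300) -> List[str]:
    """Build the pdftoppm call that renders each PDF page to a PNG."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(prefix)]


def ebook_cmds(markdown: Path,
               formats: Sequence[str] = ("epub", "azw3", "mobi")) -> List[List[str]]:
    """Build the pandoc call, plus Calibre conversions if Calibre is installed."""
    epub = markdown.with_suffix(".epub")
    cmds = [["pandoc", str(markdown), "-o", str(epub)]]
    # Calibre is optional: only add conversions when ebook-convert is on PATH.
    if shutil.which("ebook-convert"):
        for fmt in formats:
            if fmt != "epub":
                target = markdown.with_suffix("." + fmt)
                cmds.append(["ebook-convert", str(epub), str(target)])
    return cmds


def run_all(cmds: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

For example, run_all([rasterize_cmd(Path("input.pdf"), Path("output-prefix"))]) renders the pages, and run_all(ebook_cmds(Path("book.md"))) produces the ebooks once the Markdown has been merged.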
jollychang•1h ago