Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Maybe you could try extracting the text also using some pdf text extraction and use that also to compare. Might help fix numbers which tesseract gets wrong sometimes.


Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: