What is the source of these nasty docs? I am also working on a layer above pdfminer.six to parse tables. It seems like this task is never done. LLMs have given me mixed results too. I am focused on documents containing invoices, income statements, etc. from the real estate industry.
My email is in my profile if you want to reach out and compare notes!