We are working on PDF data extraction and running a POC with a few LLMs and tools such as Claude, Copilot, Power Automate, and Textract. While extraction from machine-printed PDFs is fairly trivial and accurate with LLMs, we are running into a challenge with handwritten PDFs, as the image clarity is often very poor. Running OCR first and then parsing the output has not worked well either. Does anyone have a better suggestion or solution for this?
Enterprise Information Data Architect, an hour ago
This can be done using ChatGPT; we also use Palantir for enterprise solutions.
Commenting because I am also very interested in this for data ingestion capabilities.