We are working on PDF data extraction and running a POC with a few LLMs and tools such as Claude, Copilot, Power Automate, and Textract. While extraction from machine-printed PDFs is fairly trivial and accurate with LLMs, we are running into a challenge with handwritten PDFs, as the image clarity is often very poor. Running OCR first and then parsing the output has not worked well either. Does anyone have a better suggestion or solution for this?

Director Data Strategy, an hour ago

Commenting as I am also very interested in this for data ingestion capabilities.

Enterprise Information Data Architect, an hour ago

This can be done using ChatGPT, and we also use Palantir for enterprise solutions.
