On this page, I am trying to summarize the current PRD for the AI pipeline.
The PRD underlying the current approach can be found here:
Source / Inspiration of AI Pipeline PRD
The PRD does not cite it, but it seems to draw heavily on an AWS Machine Learning blog post from Aug 15, 2022: https://aws.amazon.com/blogs/machine-learning/part-1-intelligent-document-processing-with-aws-ai-services/
However, the Amazon flow was developed to analyze documents such as invoices and receipts. In other words, PDFs or scans where data fields sit in predetermined spots (e.g. the total cost at the end of a receipt).
Patent Application doc
If I read it correctly, our patent is built on the premise that we beat the F1 scores of the (then) leading LLMs.
We urgently need to validate this premise. LLMs have advanced quickly, and this benchmark no longer seems valid.
https://www.researchgate.net/publication/369912067_Interpretable_Unified_Language_Checking