...
Current sources in our dictionary are often nonexistent or of low quality (wikipedia.com, techsmith.com,…).
Technology
Philosophy:
AI is advancing rapidly, and the progress in tools is unprecedented. We want to get the most out of existing tools before investing in research to build new ones.
Our approach leverages existing advancements in AI, particularly Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG), to process extensive text volumes and generate contextually accurate information.
RAG (Retrieval-Augmented Generation)
...
Loading: Bringing data into our pipeline from wherever it lives, whether that is PDFs, text files, a website, a database, or an API.
Indexing & storing: Creating a data structure that allows the data to be queried. In our proof-of-concept, we created vector embeddings so that contextually relevant strings can be found accurately. The index is then stored together with its metadata.
Querying:
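As a rough illustration of how the loading, indexing, and querying stages fit together, here is a minimal sketch in Python. It is not the POC's implementation: the names `Chunk`, `embed`, `load`, `index`, and `query` are hypothetical, and `embed` is a toy word-count vector standing in for a real embedding model. Only the standard library is used.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One retrievable piece of text plus its metadata (hypothetical type)."""
    text: str
    metadata: dict = field(default_factory=dict)
    vector: Counter = field(default_factory=Counter)


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def load(documents: dict) -> list:
    """Loading: split each named document into one chunk per non-empty line."""
    chunks = []
    for name, body in documents.items():
        for line in body.splitlines():
            if line.strip():
                chunks.append(Chunk(line.strip(), {"source": name}))
    return chunks


def index(chunks: list) -> list:
    """Indexing & storing: attach a vector to every chunk, keeping metadata."""
    for chunk in chunks:
        chunk.vector = embed(chunk.text)
    return chunks


def query(idx: list, question: str, top_k: int = 2) -> list:
    """Querying: return the top_k chunks most similar to the question."""
    q = embed(question)
    return sorted(idx, key=lambda c: cosine(q, c.vector), reverse=True)[:top_k]


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    docs = {
        "glossary.txt": "A screencast is a recording of a computer screen.\n"
                        "A codec compresses and decompresses video.",
        "faq.txt": "Vector embeddings map text to points in space.",
    }
    store = index(load(docs))
    for hit in query(store, "What is a screencast recording?", top_k=1):
        print(hit.metadata["source"], "|", hit.text)
```

In a full RAG setup, the retrieved chunks would then be passed to the LLM as context for generation. That step is left out here because it depends on the model and provider chosen.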
Proof-of-concept
The concept has been validated through a proof-of-concept, receiving positive feedback from the entire mapping team, despite limited sources and training.
Technical architecture for the POC
...
Live link:
The POC can be accessed here.
Development Roadmap
Next steps would entail:
...