Every company has the same problem: twenty years of documents, contracts and manuals in which nobody can find anything. RAG (retrieval-augmented generation) turns them into an assistant that answers in plain language, provided it is built right.
How it works and what it costs
Documents are split into chunks, converted to vectors and stored in a vector database. A user's question is converted to a vector the same way and used to retrieve the most relevant passages; an LLM then composes an answer with links to the sources. Realistic budget: €15,000–25,000 for a pilot over one document domain; at hundreds of queries a day, running costs come to €200–600 a month in inference.
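The retrieval half of that pipeline can be sketched in a few lines. This is a toy illustration, not a production design: the bag-of-words `embed` function stands in for a real embedding model, a plain Python list stands in for the vector database, and the document names and texts are made up for the example. The LLM call itself is left out; `build_prompt` only shows what would be handed to the model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # document the chunk came from, used for the source link
    text: str


def split_into_chunks(source: str, text: str, max_words: int = 40) -> list[Chunk]:
    """Split one document into chunks of at most max_words words."""
    words = text.split()
    return [
        Chunk(source, " ".join(words[i:i + max_words]))
        for i in range(0, len(words), max_words)
    ]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, index: list[tuple[Chunk, Counter]], k: int = 2) -> list[Chunk]:
    """Return the k chunks whose vectors are closest to the question vector."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, passages: list[Chunk]) -> str:
    """Assemble the text an LLM would receive: passages with sources, then the question."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return f"Answer using only these passages and cite the source.\n{context}\n\nQuestion: {question}"


# Hypothetical documents, invented for this example.
DOCS = {
    "travel-policy.txt": "Employees book business travel through the internal portal. "
                         "Flights over six hours may be booked in premium economy.",
    "printer-manual.txt": "To replace the toner cartridge open the front cover "
                          "and pull the cartridge out by its handle.",
}

# The "database": every chunk stored next to its vector.
INDEX = [
    (chunk, embed(chunk.text))
    for source, text in DOCS.items()
    for chunk in split_into_chunks(source, text)
]

if __name__ == "__main__":
    question = "How do I replace the toner cartridge?"
    print(build_prompt(question, retrieve(question, INDEX, k=1)))
```

Swapping `embed` for a real embedding model and `INDEX` for a vector database changes the quality of the matches, not the shape of the pipeline: the source name travels with every chunk, which is what makes the link back to the original document possible.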
Three mistakes that bury the project
First: dumping everything in, including obsolete policy versions. The assistant then confidently quotes a document from 2019. Second: no evaluation. Without a test set of questions with known correct answers, you do not know whether it answers correctly 60% or 95% of the time. Third: missing permissions. The assistant must not reveal the CEO's contract to the payroll clerk.
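All three mistakes can be guarded against with metadata on each passage and a small test set. The sketch below is a minimal illustration under stated assumptions: the `superseded` flag, the role names, the word-overlap retriever and the sample passages are all hypothetical, chosen only to show the idea. Obsolete and restricted passages are filtered out before retrieval, and a handful of test questions measure how often the right source comes back.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    doc_id: str
    text: str
    allowed_roles: frozenset   # roles permitted to see this passage
    superseded: bool = False   # True for obsolete versions of a document


@dataclass(frozen=True)
class EvalCase:
    question: str
    user_roles: frozenset
    expected_doc_id: Optional[str]  # None means nothing should be returned


def visible_passages(passages, user_roles):
    """Drop obsolete versions and anything the user's roles do not permit."""
    return [p for p in passages if not p.superseded and p.allowed_roles & user_roles]


def top_source(question, passages, user_roles):
    """Toy retriever: the visible passage sharing the most words with the question."""
    q_words = set(question.lower().split())
    best_id, best_overlap = None, 0
    for p in visible_passages(passages, user_roles):
        overlap = len(q_words & set(p.text.lower().split()))
        if overlap > best_overlap:
            best_id, best_overlap = p.doc_id, overlap
    return best_id


def accuracy(cases, passages):
    """Share of test questions for which the expected source is returned."""
    hits = sum(
        top_source(c.question, passages, c.user_roles) == c.expected_doc_id
        for c in cases
    )
    return hits / len(cases)


# Hypothetical passages and roles, invented for this example.
PASSAGES = [
    Passage("remote-work-2019", "remote work is limited to one day per week",
            frozenset({"employee"}), superseded=True),
    Passage("remote-work-2024", "remote work is allowed up to three days per week",
            frozenset({"employee"})),
    Passage("ceo-contract", "the ceo contract sets the salary and notice period",
            frozenset({"board"})),
]

CASES = [
    EvalCase("how many days of remote work per week",
             frozenset({"employee"}), "remote-work-2024"),
    # A payroll clerk asking about the CEO's contract must get nothing back.
    EvalCase("what salary does the ceo contract set",
             frozenset({"employee"}), None),
    EvalCase("what salary does the ceo contract set",
             frozenset({"employee", "board"}), "ceo-contract"),
]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(CASES, PASSAGES):.0%}")
```

The filter runs before retrieval rather than after generation, so a restricted passage never reaches the model at all. The test set includes a case where the correct result is no answer, because a permissions leak is also a wrong answer.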
A well-built RAG assistant saves dozens of hours of searching a month and backs every answer with sources. A badly built one is an expensive generator of confident nonsense. The difference is not the model: it is the data and the measurement.