RAG in practice: how we turned company documents into an AI assistant

Every company has the same problem: twenty years of documents, contracts and manuals in which nobody can find anything. RAG (retrieval-augmented generation) turns them into an assistant that answers in plain language. If it is built right.

How it works and what it costs

Documents are split into chunks, converted to vectors and stored in a vector database. A user's question is converted to a vector the same way and retrieves the most relevant passages; an LLM then composes an answer with links to the sources. Realistic budget: €15,000–25,000 for a pilot over one document domain, plus €200–600 a month in inference costs at hundreds of queries a day.
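The chunk, vectorise and retrieve steps can be sketched in a few lines. This is a minimal illustration only: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and all function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def chunk(text, max_words=40):
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """documents maps source name -> text; returns (source, chunk, vector) entries."""
    return [(src, c, embed(c)) for src, text in documents.items() for c in chunk(text)]

def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question, with their sources."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(src, c) for src, c, _ in ranked[:k]]

def build_prompt(question, passages):
    """Assemble the text sent to the LLM: retrieved passages tagged with sources."""
    context = "\n".join(f"[{src}] {text}" for src, text in passages)
    return (
        "Answer using only the passages below and cite the source in brackets.\n"
        f"{context}\n\nQuestion: {question}"
    )

# Hypothetical sample documents.
documents = {
    "supplier_contract.txt": "Payment terms are 30 days from the invoice date.",
    "hr_manual.txt": "Employees are entitled to 25 days of annual leave.",
}
index = build_index(documents)
question = "What are the payment terms?"
passages = retrieve(index, question, k=1)
prompt = build_prompt(question, passages)
```

Only the retrieved passage and its source tag end up in the prompt, which is what lets the answer link back to the document it came from.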

Three mistakes that bury the project

First: dumping everything in, including obsolete policy versions. The assistant then confidently quotes a document from 2019. Second: no evaluation. Without a test set of questions you do not know whether it answers correctly 60% or 95% of the time. Third: missing permissions. The assistant must not reveal the CEO's contract to the payroll clerk.
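All three mistakes can be guarded against with chunk metadata and a small test set. The sketch below assumes each chunk carries an `is_current` flag and a set of `allowed_roles`; these field names, the sample chunks and the `toy_answer` function (a stand-in for the full assistant) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    source: str
    text: str
    is_current: bool          # False for superseded document versions
    allowed_roles: frozenset  # roles permitted to see this chunk

def visible_chunks(chunks, role):
    """Filter before retrieval, so excluded text never reaches the model."""
    return [c for c in chunks if c.is_current and role in c.allowed_roles]

def accuracy(answer_fn, test_set):
    """Share of test questions whose answer contains the expected string."""
    hits = sum(
        1 for question, expected in test_set
        if expected.lower() in answer_fn(question).lower()
    )
    return hits / len(test_set)

STAFF = frozenset({"payroll", "board"})
CHUNKS = [
    Chunk("travel_policy_2019.pdf", "The daily meal allowance is EUR 30.", False, STAFF),
    Chunk("travel_policy_current.pdf", "The daily meal allowance is EUR 40.", True, STAFF),
    Chunk("leave_policy.pdf", "Employees get 25 days of annual leave.", True, STAFF),
    Chunk("ceo_contract.pdf", "The notice period is 12 months.", True, frozenset({"board"})),
]

def toy_answer(question, role="payroll"):
    """Stand-in for the assistant: returns the visible chunk with most shared words."""
    words = set(question.lower().split())
    candidates = visible_chunks(CHUNKS, role)
    best = max(
        candidates,
        key=lambda c: len(words & set(c.text.lower().split())),
        default=None,
    )
    return best.text if best else ""

# A tiny test set of (question, expected substring) pairs.
TEST_SET = [
    ("What is the daily meal allowance?", "40"),
    ("How many days of annual leave do employees get?", "25"),
]
score = accuracy(toy_answer, TEST_SET)
payroll_sources = [c.source for c in visible_chunks(CHUNKS, "payroll")]
```

Substring matching is a deliberately crude scoring rule; a real evaluation would likely need human review or a more careful comparison, but even this level of measurement separates a 60% assistant from a 95% one.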

Where RAG ends and an agent begins

Pure RAG answers questions, but the real value comes when the assistant can also act. “Find me the payment terms” becomes “draft an amendment based on them”; “what is the order status” becomes “send the customer an update”. This is where RAG turns into an agent with access to systems, and where control gates and permissions matter even more. We always start with answering; we add acting only once the company trusts the assistant.
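One way to picture a control gate: read-only actions run immediately, actions that change something wait for a human, and anything unknown is refused. The sketch below is an assumption about how such a gate could look, not a description of a specific product; the action names and the `Gate` class are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical action names, split by whether they only read or also act.
READ_ONLY = {"find_payment_terms", "get_order_status"}
NEEDS_APPROVAL = {"draft_amendment", "send_customer_update"}

@dataclass
class Gate:
    pending: list = field(default_factory=list)

    def request(self, action, payload):
        """Decide what happens to an action the agent wants to take."""
        if action in READ_ONLY:
            return "executed"
        if action in NEEDS_APPROVAL:
            # Queue the action; nothing happens until a person approves it.
            self.pending.append((action, payload))
            return "awaiting_approval"
        # Anything not explicitly listed is refused.
        return "rejected"

    def approve(self, position=0):
        """A human reviewer releases one queued action for execution."""
        action, payload = self.pending.pop(position)
        return ("approved", action, payload)

gate = Gate()
status_read = gate.request("get_order_status", {"order": "A-1"})
status_write = gate.request("send_customer_update", {"order": "A-1"})
status_unknown = gate.request("delete_all_orders", {})
```

Starting with an empty `NEEDS_APPROVAL` set corresponds to the answering-only phase; actions are added to it one by one as trust grows.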

Your data never leaves the company

The most common worry with RAG is not accuracy but privacy: “are we sending twenty years of contracts to someone else's model?” The answer is no. The documents and the vector database stay in your infrastructure or an EU cloud; only the passages relevant to a given question go to the model, and sensitive data can be anonymised before it is sent. A GDPR data-processing agreement is part of every project. Without it the conversation does not even start.
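Anonymising a passage before it is sent can be as simple as replacing sensitive values with placeholders and restoring them in the answer. The sketch below is illustrative only: the two regular expressions cover e-mail addresses and IBAN-like strings and are far from complete detection of personal data, and the function names and sample text are hypothetical.

```python
import re

# Illustrative patterns only; real projects need broader detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IBAN = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b")

def anonymise(passage):
    """Replace sensitive values with placeholders; return text and the mapping."""
    mapping = {}

    def replacer(kind):
        def _sub(match):
            token = f"<{kind}_{len(mapping) + 1}>"
            mapping[token] = match.group(0)
            return token
        return _sub

    passage = EMAIL.sub(replacer("EMAIL"), passage)
    passage = IBAN.sub(replacer("IBAN"), passage)
    return passage, mapping

def restore(text, mapping):
    """Put the original values back into text returned by the model."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

original = "Pay to DE89 3704 0044 0532 0130 00 or write to jan.novak@example.com."
masked, mapping = anonymise(original)
```

The mapping stays inside the company's infrastructure; only `masked` would be sent to the model.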

A well-built RAG saves dozens of search hours a month and always backs answers with sources. A badly built one is an expensive generator of confident nonsense. The difference is not the model — it is the data and the measurement.