What a good chatbot actually does
A chatbot that adds value to your business does three things well: it gives a fast, specific answer to a specific question; it knows when the answer falls outside its knowledge; and it connects the user to a human when needed. That sounds simple, but most chatbots I encounter fail on all three.
- Targeted answers: the chatbot retrieves information from your own sources and cites them. An employee asking about their leave balance gets the correct figure, not a general explanation of how leave typically works.
- Scoping: the chatbot knows what falls outside its knowledge domain. Questions that cannot be answered from the available documents are not answered with a guess.
- Escalation: when the chatbot does not have the answer, or when a question requires a human decision, it says so explicitly and offers a handoff to a team member or contact form.
These three properties do not come automatically from a language model. They emerge from the right architectural choices: which knowledge source, how documents are indexed, how prompts are constructed, and which boundaries are hard-coded.
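As a rough illustration of what a hard-coded boundary can look like, the sketch below routes every question to one of three outcomes before any answer is generated. The `Route` names, the `ESCALATION_TERMS` list and the `has_relevant_context` flag are assumptions made for this example; a real project would derive them from your own escalation rules.

```python
import re
from enum import Enum


class Route(Enum):
    ANSWER = "answer"              # relevant context found: let the model answer
    OUT_OF_SCOPE = "out_of_scope"  # nothing relevant found: say so, do not guess
    ESCALATE = "escalate"          # needs a human decision: offer a handoff


# Hypothetical list of topics that always go to a human.
ESCALATION_TERMS = {"complaint", "contract", "legal", "refund"}


def route(question: str, has_relevant_context: bool) -> Route:
    """Decide what happens with a question before any answer is generated."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    if words & ESCALATION_TERMS:
        return Route.ESCALATE          # hard rule, checked before anything else
    if not has_relevant_context:
        return Route.OUT_OF_SCOPE
    return Route.ANSWER


print(route("I want to file a complaint about my order", True).value)
print(route("How many leave days do I have left?", True).value)
```

The escalation check runs first on purpose: a question that needs a human decision should be handed off even when the documents happen to contain related text.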
What a bad chatbot does and why it causes damage
Many businesses that want a chatbot get a wrapper around a generic language model. The chatbot draws on everything the model picked up during training: web pages, news, general knowledge. The result is a bot that answers questions with information that has nothing to do with your business.
- Generic answers: a customer asks about your returns policy and gets an explanation of how returns generally work, not what your company actually offers.
- Hallucinations: the model invents specific details when they are not in its context. Prices, delivery times and product numbers can all be generated as if they were accurate.
- Lead-capture loops: some chatbots are built to collect email addresses. They ask questions, deflect, and never give the answer the user is looking for. This frustrates visitors and erodes trust in your brand.
- No memory of the previous turn: in a multi-turn conversation the chatbot treats every question as new, without context from what was said before.
- Answers that contradict your policy: a bot that promises something your business does not offer, or that quotes a wrong price.
Damage from a bad chatbot is not always immediately visible. It shows up in visitors who leave, in customer service tickets about things the chatbot communicated incorrectly, and in lost trust from customers who were misled once.
My approach: RAG over your own documents
RAG stands for Retrieval-Augmented Generation. Instead of asking a language model to answer from memory, I have the system first search a vector database filled with your documents. The model then answers based on what was found in that database, not from general internet knowledge.
The workflow looks like this: your documents are loaded, split into chunks, and converted into vector representations. When a user asks a question, the system first searches for the most relevant text chunks and passes those as context to the language model. The model then formulates an answer based on exactly those chunks.
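The workflow can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and all function names and sample documents are assumptions made for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Load documents, split them into chunks and store one vector per chunk."""
    return [
        (source, chunk, embed(chunk))
        for source, text in documents.items()
        for chunk in chunk_text(text)
    ]


def retrieve(index, question: str, top_k: int = 2) -> list[tuple[float, str, str]]:
    """Return the top_k (score, source, chunk) tuples for a question."""
    query = embed(question)
    scored = [(cosine(query, vec), source, chunk) for source, chunk, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


def build_prompt(question: str, hits: list[tuple[float, str, str]]) -> str:
    """Pass only the retrieved chunks to the model as context."""
    context = "\n".join(f"[{source}] {chunk}" for _, source, chunk in hits)
    return f"Answer only from the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Hypothetical sample documents.
documents = {
    "returns.md": "Items can be returned within 30 days of delivery.",
    "shipping.md": "Orders ship within two working days.",
}
index = build_index(documents)
hits = retrieve(index, "How many days do I have to return items?")
print(build_prompt("How many days do I have to return items?", hits))
```

The prompt that reaches the model contains the retrieved chunks and their source names, which is what makes the later source attribution possible.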
- Knowledge sources I index: PDF documents, Word files, Markdown knowledge bases, CSV product catalogues, FAQ pages, Confluence exports, Notion exports.
- Automatic refresh: when you update a document, the index is updated automatically. The chatbot always works with the most recent version.
- Source attribution: the chatbot can show which document or section an answer is based on. That makes answers verifiable.
- Uncertainty threshold: if the relevance score of the best retrieved chunk is too low, the chatbot does not guess. It says plainly that the available documents do not contain the answer.
- Session memory: the conversation retains context so follow-up questions are answered correctly.
The result is a chatbot that only answers based on what you have taught it. No surprises, no fabrications, no generic explanations that do not fit your business.
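The uncertainty threshold and source attribution from the list above could look roughly like this. It is a sketch under stated assumptions: the `Hit` and `Reply` types, the `min_score` value of 0.35 and the `generate` callable (standing in for the language model call) are all hypothetical, and the right threshold depends on the embedding model and has to be tuned on real questions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hit:
    score: float   # relevance score from the retriever, assumed to lie in [0, 1]
    source: str    # document or section the chunk came from
    text: str


@dataclass
class Reply:
    text: str
    sources: list[str]
    answered: bool


FALLBACK = "I cannot answer that from the available documents. Shall I connect you to a colleague?"


def respond(
    question: str,
    hits: list[Hit],
    generate: Callable[[str, str], str],
    min_score: float = 0.35,  # assumed value, to be tuned per project
) -> Reply:
    """Answer only when at least one chunk clears the relevance threshold."""
    usable = [h for h in hits if h.score >= min_score]
    if not usable:
        # Nothing relevant enough: decline instead of guessing.
        return Reply(FALLBACK, [], False)
    context = "\n".join(h.text for h in usable)
    sources = sorted({h.source for h in usable})
    return Reply(generate(question, context), sources, True)


# Stand-in for a model call so the sketch runs without an API key.
def fake_generate(question: str, context: str) -> str:
    return context


reply = respond(
    "How many leave days do full-time staff get?",
    [Hit(0.8, "leave-policy.pdf", "Full-time staff get 25 days."), Hit(0.1, "parking.md", "Parking is free.")],
    fake_generate,
)
print(reply.text, reply.sources)
```

Chunks below the threshold are dropped before the model sees them, so a weak match cannot leak into the answer or into the list of cited sources.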
Which language models I use and why
The model is a choice I make based on your specific situation. There is no universal answer to which model is best. Claude by Anthropic is my first choice for situations where accuracy and reasoning matter more than cost. Claude has strong instruction-following behaviour, low hallucination tendencies, and performs well on tasks that require the model to be careful and honest about what it does not know.
For volume-driven scenarios where cost efficiency is the priority, I use the OpenAI API. OpenAI's lighter models offer good quality at a low cost per token and are suitable for customer service bots that handle many short questions. For companies that want to keep all their data on their own infrastructure, a local model like Llama is the right choice. It runs on your own server, so no data leaves your network.
The model choice is separate from the RAG architecture. The foundation (the indexing of your documents, the retrieval logic, the escalation rules) is identical regardless of which model answers. That means the model can be swapped later without rebuilding the entire chatbot.
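One way to keep the model swappable is to hide it behind a small interface. The sketch below is illustrative only: `LanguageModel`, `HostedModelStub` and `LocalModelStub` are hypothetical names, and the stubs return canned text where a real backend would call a hosted API or a local model.

```python
from typing import Protocol


class LanguageModel(Protocol):
    """Anything with a complete() method can serve as the answering model."""

    def complete(self, prompt: str) -> str: ...


class HostedModelStub:
    """Placeholder for a hosted API backend (no real API call is made)."""

    def complete(self, prompt: str) -> str:
        return f"[hosted] answer based on {len(prompt)} prompt characters"


class LocalModelStub:
    """Placeholder for a model running on your own server."""

    def complete(self, prompt: str) -> str:
        return f"[local] answer based on {len(prompt)} prompt characters"


def answer(question: str, context: str, model: LanguageModel) -> str:
    """The prompt construction stays the same whichever model is plugged in."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return model.complete(prompt)


context = "Items can be returned within 30 days of delivery."
print(answer("What is the returns window?", context, HostedModelStub()))
print(answer("What is the returns window?", context, LocalModelStub()))
```

Swapping the model then means passing a different object to `answer`; indexing, retrieval and escalation code are untouched.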
Where the chatbot lives: website, Slack, Teams or WhatsApp
A chatbot consists of two parts: the backend (the language model, the RAG pipeline, the business logic) and the interface through which users interact with it. Building the backend is the core work. The interface is a choice that depends on who the users are and where they already spend their time.
- Website widget: a chat window that appears on your own website. Suitable for customer service, product advice or FAQ handling for visitors. I build this as a lightweight embedded component without heavy third-party scripts.
- Slack integration: a bot that answers employee questions inside your Slack workspace. Suitable for internal knowledge bases, HR questions or IT support. Employees ask in a channel or DM and the bot responds directly.
- Microsoft Teams: for companies using Microsoft 365 that want to integrate their internal bot into the existing workplace. Same RAG backend, different interface.
- WhatsApp Business: for companies with customers who use WhatsApp as their primary communication channel. Requires a WhatsApp Business API account, but the connection to the RAG backend is straightforward.
- Admin tool: sometimes what you need is not an external channel but an internal search interface on top of your own documents: a simple web application that employees use to search through internal documentation quickly.
The interface choice does not determine the quality of the answers. That lives in the RAG pipeline. The interface determines the user experience and adoption. A chatbot that employees have to open in a separate system gets used less than a bot that answers in the channel where they already work.
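The split between backend and interface can be sketched as thin adapters around one shared backend function. The payload shapes below are simplified, hypothetical examples and not the actual formats of Slack, Teams or WhatsApp; the names `IncomingMessage`, `from_web_widget`, `from_chat_workspace` and `handle` are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class IncomingMessage:
    channel: str   # where the question came from
    user_id: str   # who to reply to
    text: str      # the question itself


def from_web_widget(payload: dict) -> IncomingMessage:
    """Adapter for a hypothetical website widget payload."""
    return IncomingMessage("web", payload["session"], payload["message"])


def from_chat_workspace(payload: dict) -> IncomingMessage:
    """Adapter for a hypothetical workplace chat payload."""
    return IncomingMessage("workspace", payload["user"], payload["text"])


def handle(message: IncomingMessage, backend: Callable[[str], str]) -> dict:
    """Every channel calls the same backend; only the envelope differs."""
    return {"channel": message.channel, "to": message.user_id, "text": backend(message.text)}


# Stand-in for the RAG backend so the sketch runs on its own.
def fake_backend(question: str) -> str:
    return f"Answer to: {question}"


web_reply = handle(from_web_widget({"session": "s-1", "message": "What is the returns window?"}), fake_backend)
chat_reply = handle(from_chat_workspace({"user": "u-7", "text": "What is the returns window?"}), fake_backend)
print(web_reply)
print(chat_reply)
```

Because the adapters only translate envelopes, adding a channel later does not change how answers are produced.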
-- Anonymous case
Internal knowledge organisation: employees ask, documents answer
A knowledge-intensive organisation had an internal wiki of more than 800 pages. Employees could not find information quickly. Search queries returned too many results, and identifying the right page was still difficult. Customer service spent a significant part of the day answering internal questions from colleagues.
I built an internal RAG chatbot that indexes the entire wiki. Employees ask questions in plain language. The chatbot retrieves the relevant section and gives a concrete answer with a link to the source documentation. Questions that fall outside the wiki are not answered with a guess. In that case the chatbot explicitly states that no answer is available in the internal knowledge base.
The system runs on the organisation's own infrastructure, so no external API processes internal documents. The index is updated every night, so new or changed pages are available the next morning. Employees do not need to understand how RAG works. They ask a question in Slack and get an answer.
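A nightly refresh does not have to rebuild the whole index. A minimal sketch, assuming each page's content hash is stored next to the index: only pages whose hash changed are re-indexed, and pages that disappeared are removed. The function names and sample pages are hypothetical, and the scheduling itself (for example a cron job) is outside the sketch.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Content hash used to detect whether a page changed since the last run."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_refresh(pages: dict[str, str], known: dict[str, str]) -> tuple[list[str], list[str]]:
    """Compare current pages with stored fingerprints.

    Returns (to_reindex, to_delete): new or changed pages, and pages that
    no longer exist in the source.
    """
    to_reindex = sorted(name for name, text in pages.items() if known.get(name) != fingerprint(text))
    to_delete = sorted(name for name in known if name not in pages)
    return to_reindex, to_delete


# Fingerprints stored after the previous nightly run (hypothetical pages).
known = {
    "onboarding": fingerprint("Old onboarding text"),
    "expenses": fingerprint("Expenses are reimbursed monthly."),
    "fax-manual": fingerprint("How to use the fax machine."),
}
# Current state of the wiki.
pages = {
    "onboarding": "New onboarding text",
    "expenses": "Expenses are reimbursed monthly.",
    "vpn": "How to connect to the VPN.",
}
to_reindex, to_delete = plan_refresh(pages, known)
print(to_reindex, to_delete)
```

Unchanged pages are skipped entirely, which keeps the nightly run short even for a wiki of several hundred pages.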
What I do not do
Transparency about the limits of a chatbot is just as important as transparency about its capabilities. There are requests I decline, not because they are technically impossible, but because they result in a chatbot that causes harm.
- Chatbot without a source of truth: a bot that answers based on general internet knowledge without your specific documents as a foundation. This inevitably leads to answers that do not fit your business.
- Chatbot that 'knows everything': a bot without a clear scope. If the chatbot can theoretically answer any question but answers none consistently well, it is worse than no chatbot.
- Marketing funnel bot without UX testing: a bot designed primarily to collect email addresses or qualify leads without testing the user experience. These bots push visitors away.
- Chatbot as a replacement for human judgment: for decisions with legal, medical or financial consequences, a chatbot is never the final decision-maker. I build in escalation, not a replacement for human advice.
- Chatbot with outdated documents and no update mechanism: a bot that works well initially but gives outdated information after six months because the index is never refreshed.
Custom chatbot: pricing
The investment for a chatbot depends on the number of knowledge sources, the complexity of the escalation logic, and the chosen interface. There is no standard figure that is meaningful here.
On request
Get in touch for an estimate based on your situation. I always give an honest picture of what is realistic before starting.