Want to know how well an intelligent AI Chatbot can represent your brand? We can create a free demo chatbot for you, no strings attached

How do you stop an AI chatbot from making things up?

TL;DR
- AI chatbots make things up not because they are broken but because they are designed to always produce a fluent, helpful-sounding answer
- The causes are often subtle: conflicting source material, missing fallback instructions, the wrong AI model for the job or content that isn't synced with your live website
- The fix has two parts: disciplined setup before launch and active monitoring after it
- Most platforms give you a dashboard. Catching problems before your customers do requires a human in the loop
- If you want a website chatbot that stays accurate without you having to manage it yourself, that is exactly what we build

INTRODUCTION
At Lightsounds, the professional audio and lighting business my husband Rick and I built over nearly two decades and grew to 18 locations before selling in 2018, every product went through a structured verification process before it entered the catalogue. No staff member could quote specs or pricing until that process was complete. A speaker quoted at 400W when it is rated at 200W does not just create a return. It creates a customer who never comes back. The discipline was not about distrust. It was about understanding the real cost of a wrong answer.
A website chatbot that makes things up creates exactly that problem – at scale, around the clock, without anyone noticing until the damage is done.

1. What does "making things up" actually mean?

The technical term is hallucination. In plain language it means the model generates a fluent, confident-sounding answer even when the correct answer is not in its source material.

The problem is not that AI lies. It is that AI does not know when it does not know – and it is designed to sound helpful anyway. So instead of saying “I am not sure about that,” it reaches for the most plausible answer it can construct and delivers it with the same tone it uses when it actually knows something.

In an ecommerce context this shows up as:

– Quoting a return window that does not match your policy
– Describing a product feature that does not exist
– Giving a shipping timeframe the store has never offered
– Referencing a promotion that ended months ago

None of these sound like obvious errors when the customer reads them. That is what makes them dangerous.

69% of retail and ecommerce teams report that at least half of their AI-powered experiences needed substantial revision after launch. [1] This is not exceptional. It is the norm – and it happens even when the setup seemed to go well.

2. Why it happens - and it is not what most people think

Most businesses assume hallucination means the chatbot was badly configured or cheaply built. The reality is more nuanced. These problems can appear even in a well-intentioned setup, because the causes are often subtle.

Conflicting source material

Your product page, a blog post from eighteen months ago and your FAQ page may all say something slightly different about the same product or policy. The chatbot has ingested all three. Without an explicit instruction on which source takes precedence, it blends them – and the result is an answer that sounds plausible but does not precisely match any of your actual content.

Prompt and instruction conflicts

System instructions that say “always find a solution for the customer” and “only offer what is in our policy” pull in opposite directions the moment a customer pushes back. Without clear guidance on how to handle that tension, the chatbot improvises. It tries to be helpful. It goes off-script.

Scope creep in training data

Trained on the whole website means trained on everything – including a 2022 news post about a store expansion that never happened, a supplier-copied product spec that has since been updated or old promotional pricing still sitting on a forgotten page. The chatbot has no way to distinguish current from outdated unless you tell it explicitly.

No live sync with your website content

The chatbot was accurate on day one. Your return policy changed in March. No one updated the training. It is still quoting January’s version with complete confidence. This is one of the most common causes of drift – and it is entirely preventable if your chatbot platform syncs automatically with your live website content. If yours does not, that is worth addressing before anything else.

Confidence without grounding

In long or complex conversations the model starts drawing on general knowledge rather than your specific content. It does not flag uncertainty. It reaches for the most plausible answer from everything it has ever been trained on – which is a very large set of information that has nothing to do with your business.

The AI model underneath matters more than most people realise

Most off-the-shelf website chatbots run on fast, low-cost models (like OpenAI's GPT-4o mini) optimised for short, simple exchanges. When conversations get longer or product questions get more complex, these models struggle to follow detailed instructions accurately or select the right answer from a large set of options. The model choice matters – and it is rarely something plug-and-play platforms let you control. (We go into much more depth on this in our article on product recommendation chatbots for complex catalogs.)

No fallback instruction for knowledge gaps

This is one of the most common causes and the least discussed. When the chatbot hits a question its source material does not cover, it needs an explicit instruction on what to do. Without one, it defaults to its core behaviour: generate a fluent, helpful-sounding answer. It fills the gap with inference. The customer gets an answer that sounds real but was never authorised by anyone in your business.

Without proper configuration and guardrails, AI models hallucinate in more than 30% of realistic multi-turn conversations [2], even when everything else is in place: the right model, store content search enabled and synchronised, and clean source material.

3. What it actually costs

A customer is quoted a 60-day return window. Your policy is 30 days. They screenshot the conversation. Now you are either honouring a commitment you never made or losing a customer over a mistake your chatbot made without your knowledge.
A product question gets answered using specs from a discontinued model still sitting somewhere on your site. The customer buys. The product does not match what they were told. You get a return and a review that mentions incorrect information on your website.
The customer does not distinguish between the AI and your business. A wrong answer from your chatbot is a wrong answer from you.

4. How you fix it - setup

The solution starts before the chatbot goes live.

Source material audit

Not everything on your website should be training data. Old blog posts, news items, supplier-copied specs and outdated FAQs need to be excluded or explicitly overridden before training begins. What goes in determines what comes out.
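
As a rough illustration, the audit can be thought of as a filter over your site's pages. This is a minimal Python sketch, assuming each page can be exported with a type and a last-updated date. The field names, the excluded page types and the 365-day cut-off are hypothetical choices for the example, not features of any particular platform.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical page types that should never become training data.
EXCLUDED_TYPES = {"blog", "news"}

@dataclass
class Page:
    url: str
    page_type: str      # e.g. "product", "policy", "faq", "blog", "news"
    last_updated: date

def select_training_pages(pages, today, max_age_days=365):
    """Keep pages that are an allowed type and were updated recently enough."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        p for p in pages
        if p.page_type not in EXCLUDED_TYPES and p.last_updated >= cutoff
    ]

# Illustrative data only.
pages = [
    Page("/returns", "policy", date(2025, 3, 1)),
    Page("/news/store-expansion", "news", date(2022, 6, 10)),
    Page("/faq", "faq", date(2021, 1, 5)),
]
approved = select_training_pages(pages, today=date(2025, 6, 1))
print([p.url for p in approved])  # only "/returns" passes both checks
```

A date cut-off is a blunt tool. In practice a stale page that is still correct may need to be kept, and a recent page that is wrong may need to be excluded by hand.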

Conflict resolution rules

When two sources say different things, the chatbot needs an explicit instruction on which one takes precedence. Product page beats blog post. Current policy page beats FAQ. This has to be built in deliberately – it cannot be assumed or left to the model to figure out on its own.
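
One way to picture a precedence rule is as a simple ranking of source types. The Python sketch below assumes each candidate answer is tagged with the kind of page it came from. The ranking follows the two rules above (product page beats blog post, policy page beats FAQ); placing policy ahead of product is an extra assumption made only so the example has a single order.

```python
# Hypothetical precedence: the lower number wins when sources disagree.
SOURCE_PRIORITY = {"policy": 0, "product": 1, "faq": 2, "blog": 3}

def resolve_conflict(candidates):
    """Return the statement from the highest-precedence source.

    candidates: list of (source_type, statement) tuples about the same topic.
    Unknown source types rank last.
    """
    if not candidates:
        return None
    lowest_rank = len(SOURCE_PRIORITY)
    best = min(candidates, key=lambda c: SOURCE_PRIORITY.get(c[0], lowest_rank))
    return best[1]

# Illustrative data only.
answers = [
    ("blog", "Returns accepted within 60 days."),
    ("faq", "Returns accepted within 45 days."),
    ("policy", "Returns accepted within 30 days."),
]
print(resolve_conflict(answers))  # the policy page's statement wins
```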

Hard guardrails on scope

Explicit limits on what the chatbot is authorised to say and do. It can quote your return policy. It cannot offer an exception to it. It can describe a product. It cannot guarantee compatibility. When it reaches the boundary of what it is allowed to do, it hands off to a human rather than improvising.
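
A scope guardrail can be sketched as an allow-list check that sits between the model's draft reply and the customer. The action names and handoff wording below are hypothetical, and the sketch assumes some earlier step has already classified what the draft reply is trying to do.

```python
# Hypothetical action names; a real list would come from your own policies.
ALLOWED_ACTIONS = {"quote_return_policy", "describe_product"}
HANDOFF_MESSAGE = "I can't help with that directly, so I'll connect you with a team member."

def apply_guardrail(requested_action, draft_reply):
    """Pass the reply through only if the action is explicitly authorised."""
    if requested_action in ALLOWED_ACTIONS:
        return {"reply": draft_reply, "handoff": False}
    return {"reply": HANDOFF_MESSAGE, "handoff": True}

ok = apply_guardrail("quote_return_policy", "Returns are accepted within 30 days.")
blocked = apply_guardrail("offer_return_exception", "Sure, we can extend that to 60 days.")
print(ok["handoff"], blocked["handoff"])
```

The design choice here is that anything not on the list is refused by default, so a new kind of request results in a handoff rather than an improvised answer.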

Fallback behaviour for knowledge gaps

A single explicit instruction covers a large proportion of hallucinated answers: if you do not have a verified answer from approved source material, say so and offer to connect the customer with a human. This needs to be built into the training, not added as an afterthought.
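
In many setups this rule lives in the chatbot's instructions, but the same logic can also be enforced in code. The sketch below assumes a search step has already returned passages from approved sources, each with a relevance score between 0 and 1. The function name, the score scale and the 0.75 threshold are illustrative assumptions that would need tuning.

```python
FALLBACK_REPLY = (
    "I don't have a verified answer to that. "
    "Would you like me to connect you with a member of our team?"
)

def answer_or_fallback(matches, min_score=0.75):
    """Answer only from approved passages that match well enough.

    matches: list of (passage_text, relevance_score) from approved sources.
    min_score is an assumed threshold, not a recommended value.
    """
    verified = [(text, score) for text, score in matches if score >= min_score]
    if not verified:
        return FALLBACK_REPLY
    best_text, _ = max(verified, key=lambda m: m[1])
    return best_text

print(answer_or_fallback([("Returns are accepted within 30 days.", 0.9)]))
print(answer_or_fallback([("Our store opened in 2010.", 0.3)]))
```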

Edge case testing before launch

Before customers use it, throw the hard questions at it deliberately. Contradictory scenarios, policy edge cases, questions about products not yet in the training data. The goal is to find the gaps before your customers do.
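
Edge case testing can be partly automated with a list of hard questions and the phrases a correct reply must never contain. This is a small Python sketch, not a full test suite. The `fake_chatbot` function and the product name "XJ-9000" are made up so the example runs on its own; in practice `ask` would call your real chatbot.

```python
def run_edge_cases(ask, cases):
    """Return the cases whose reply contains a forbidden phrase.

    ask: any callable that takes a question and returns the chatbot's reply.
    cases: list of (question, forbidden_phrases) tuples.
    """
    failures = []
    for question, forbidden in cases:
        reply = ask(question).lower()
        hits = [phrase for phrase in forbidden if phrase.lower() in reply]
        if hits:
            failures.append((question, hits))
    return failures

# Stand-in for a real chatbot, used only to make the sketch runnable.
def fake_chatbot(question):
    if "return" in question.lower():
        return "You can return it within 60 days."
    return "I don't have a verified answer to that."

cases = [
    ("Can I return this after two months?", ["60 days", "exception"]),
    ("Is the XJ-9000 in stock?", ["in stock", "ships tomorrow"]),
]
failures = run_edge_cases(fake_chatbot, cases)
print(failures)
```

Phrase matching only catches the mistakes you thought to list, so it supports a human review of the transcripts rather than replacing it.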

This is the kind of setup work most businesses do not know to ask for - and most plug-and-play platforms do not do.

Want this set up properly from day one? Ask about our free demo chatbot offer.

5. How you keep it accurate after launch

Setup gets you to the starting line. What happens after launch determines whether it stays accurate.

Platform sync

Your chatbot should connect directly to your live website content and update automatically when your site does. If it does not, you will eventually have a gap between what your site says and what your chatbot tells customers. This is a non-negotiable when evaluating platforms, especially for an online store where products and prices change every day. Manual updates will not happen consistently enough to keep pace with a real business.
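
If your platform does not sync automatically, drift can at least be detected. The sketch below compares a fingerprint of each live page with the fingerprint stored when the chatbot last ingested it. It uses Python's standard `hashlib` module. The function names and the idea of keeping a dictionary of stored hashes are assumptions for the example, not a description of how any platform works internally.

```python
import hashlib

def content_hash(text):
    """Fingerprint page text, ignoring whitespace differences."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def find_stale_pages(live_pages, indexed_hashes):
    """Return URLs whose live content no longer matches what the chatbot indexed.

    live_pages: dict of url -> current page text
    indexed_hashes: dict of url -> hash stored at the last ingestion
    Pages that were never indexed are reported as stale too.
    """
    return sorted(
        url for url, text in live_pages.items()
        if indexed_hashes.get(url) != content_hash(text)
    )

# Illustrative data only.
indexed = {"/returns": content_hash("Returns accepted within 60 days.")}
live = {
    "/returns": "Returns accepted within 30 days.",
    "/shipping": "Orders ship in 2 business days.",
}
stale = find_stale_pages(live, indexed)
print(stale)  # both pages need re-ingesting
```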

Gap analysis

Regularly review the questions the chatbot could not answer confidently. These become the input for the next round of training updates – closing the gaps before they turn into wrong answers.
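
A gap report can be as simple as counting the topics where the chatbot fell back instead of answering. The Python sketch below assumes each logged conversation turn records a topic label and whether the fallback reply was used. Both field names are hypothetical.

```python
from collections import Counter

def top_gaps(conversation_log, limit=3):
    """Count the topics where the chatbot fell back instead of answering."""
    gaps = Counter(
        entry["topic"] for entry in conversation_log if entry["fallback_used"]
    )
    return gaps.most_common(limit)

# Illustrative data only.
log = [
    {"topic": "warranty", "fallback_used": True},
    {"topic": "returns", "fallback_used": False},
    {"topic": "warranty", "fallback_used": True},
    {"topic": "shipping", "fallback_used": True},
]
print(top_gaps(log))  # warranty appears most often among the gaps
```

The topics at the top of this list are the first candidates for new or corrected source material.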

Active CSAT and DSAT monitoring

Most platforms give you a resolution rate and a satisfaction score. That tells you what already happened. At Pivot Point AI we run active CSAT and DSAT analysis across all client chatbots – looking not just at what customers explicitly rated but at conversations where frustration signals appeared mid-chat, where the customer did not get what they asked for or where the conversation drifted off course. We typically find issues before our clients do. That is the point. Keeping a human in the loop proactively, not reactively, is not standard in this industry. We think it should be.

6. The honest take

No website chatbot will ever be zero-risk. The goal is not a chatbot that never gets anything wrong. The goal is one where errors are caught before they reach customers, where the scope of what it can commit to is tightly controlled and where someone is watching for drift before it becomes a problem.
The difference between a chatbot that damages your brand and one that protects it is not which platform you chose. It is whether anyone is actively managing it.
If you would rather have a team that sets this up correctly from day one and keeps watching it after launch, that is exactly what we do – for ecommerce brands who want the benefits of AI customer service without taking on the risk of an unmonitored tool.
Conclusion
Hallucination is not a flaw you can eliminate. It is a behaviour you manage – through disciplined setup, a platform that syncs automatically with your content and ongoing human oversight that catches problems before your customers do.
The brands that get this right are not the ones with the most expensive platform. They are the ones with the most disciplined setup and the most consistent oversight after launch.

Free Demo Chatbot Offer

No obligation. 20 minutes. Real answers about what’s possible for your store.

ABOUT THE AUTHOR

Tala Chisholm is the founder of Pivot Point AI, a Sydney-based AI solutions business helping Australian e-commerce brands implement AI chatbots that actually work. She holds a Bachelor of Engineering (Magna Cum Laude) from the American University in Cairo, and spent nearly two decades building Lightsounds – a professional audio and lighting company that grew to 67 staff and 18 locations before being sold in 2018. She brings that real-world business experience to every client engagement. Visit pivotpointai.tech to learn more.

References:
