One of the biggest reasons many AI projects fail is that businesses try to automate everything. They remove humans from the loop completely.

It sounds efficient in theory. But in practice, full automation usually creates more problems than it solves.

Here’s why.

#1 AI can (and will) make mistakes.

LLMs can hallucinate, lie, misinterpret, ignore instructions, or simply fail without warning.

Image source: https://www.linkedin.com/feed/update/urn:li:activity:7352965550655881216/


You might get an output that looks perfect but is completely wrong underneath. And the scariest part? It usually won’t tell you when it’s wrong.

The following are real-life examples where LLMs produced false or fabricated information, with potentially serious consequences:

Example-1: Air Canada chatbot invented a refund policy.

In early 2024, Air Canada’s AI-powered customer service chatbot fabricated a refund policy that did not exist.

A passenger was told by the bot that he could claim a partial refund for a bereavement fare after flying. When he later requested it, Air Canada denied the claim.


The tribunal ruled the company was responsible for the chatbot’s false statement. The AI had lied about a policy and ignored the airline’s own documentation that contradicted it.

(Source: British Columbia Civil Resolution Tribunal, Feb 2024.)

Example-2: DoNotPay’s “AI lawyer” misrepresented its capabilities.

DoNotPay marketed itself as the world’s first “robot lawyer,” one that could draft legal arguments and even represent clients in court.

Users and lawyers found that the AI routinely generated inaccurate legal filings and ignored explicit user instructions, such as requests to include specific citations or follow a particular format.

The company faced a class-action lawsuit for false advertising and unreliable legal automation.

It demonstrated that AI systems can misinterpret user intent and overstate their own abilities, effectively lying with confidence.

(Source: Ars Technica, The Verge, March 2023.)

Example-3: CNET’s AI-written finance articles contained factual errors.

CNET quietly used an internal AI tool to write finance articles, disclosing the practice only in fine print.

After publication, readers noticed that many articles contained incorrect financial math, misinterpreted basic economic principles, and ignored internal editorial rules for sourcing.


The AI outputs looked authoritative but were factually wrong and context-blind. CNET paused its AI publishing experiment after widespread backlash.

(Source: Futurism, The Washington Post, January 2023.)

Example-4: Microsoft’s Copilot gave dangerous business advice.

In early enterprise trials of Microsoft Copilot, users reported that the model misinterpreted business requests, fabricating performance summaries and producing misleading financial charts when asked to summarize quarterly trends.

Executives testing the tool said it made up metrics that did not exist in their datasets.


Microsoft acknowledged that Copilot can generate plausible but incorrect information, highlighting the need for human verification.

(Source: Microsoft Copilot Early Access User Feedback Reports, 2023.)

Example-5: Klarna’s AI customer service misinterpreted refund requests.

When Klarna integrated OpenAI’s model for handling customer queries, early users reported that the bot issued incorrect refund decisions or ignored parts of customer messages.

While most queries were handled well, a portion required human correction because the AI misread tone or intent.


Klarna’s own data later showed that human review was still essential to ensure accuracy and fairness in financial communication.

(Source: Klarna Blog, Oct 2024.)

Example-6: Real estate chatbots gave misleading pricing advice.

Several real estate platforms in 2023 and 2024 deployed AI chat assistants to guide buyers and sellers.

Users discovered that the chatbots invented market data, misinterpreted queries about mortgage terms, or ignored geographic filters, producing unreliable advice that could influence major financial decisions.

These chatbots were not trained on up-to-date regional data, yet they confidently advised as if they were. The businesses had to roll back or retrain their models.

(Source: Business Insider, ZDNet, 2024.)

#2 AI workflow integrations can break silently.

APIs fail. Webhooks stop firing. Data fields change.

Even one broken link in your automation chain can cause your workflow to stop working as intended. If no one is watching, you’ll only notice after the damage is done.
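
To make this concrete, here is a minimal Python sketch of the idea. The refund endpoint, the field names, and the notify_on_failure hook are all hypothetical; the point is simply that an integration should validate what it receives and alert loudly, rather than letting a broken step pass bad data downstream.

```python
import requests

# Fields the downstream step expects -- hypothetical, adjust to your own contract.
REQUIRED_FIELDS = {"order_id", "refund_amount", "status"}


def fetch_refund_payload(url: str) -> dict:
    """Call a (hypothetical) refund API and fail loudly instead of silently."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()                  # surface HTTP errors immediately
    data = resp.json()

    missing = REQUIRED_FIELDS - data.keys()  # catch upstream schema changes
    if missing:
        raise ValueError(f"API contract changed, missing fields: {missing}")
    return data


def notify_on_failure(step: str, error: Exception) -> None:
    """Placeholder alert hook -- wire this to Slack, email, or a pager."""
    print(f"[ALERT] step '{step}' failed: {error}")


if __name__ == "__main__":
    try:
        payload = fetch_refund_payload("https://api.example.com/refunds/123")
    except Exception as exc:
        notify_on_failure("fetch_refund_payload", exc)
        raise  # stop the workflow rather than passing bad data downstream
```

The specific checks matter less than the principle: every automated step needs an explicit path that screams when something looks wrong.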

#3 AI workflows still need a human supervisor.

AI systems need oversight: daily, or at least regular, check-ins to make sure they’re performing as expected.

This is where the role of an Agent Supervisor comes in.


An Agent Supervisor is a subject matter expert responsible for monitoring the AI’s performance, validating outputs, minimizing hallucinations, and providing strategic direction.

For example, a Chief Marketing Officer could act as the Agent Supervisor for one or more marketing agents, making sure the AI’s campaigns stay on brand, align with business goals, and comply with policies.
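
As a rough illustration of where that supervisor sits in the flow, here is a minimal human-in-the-loop sketch in Python. The Draft class, the stand-in generate_campaign_draft function, and the console prompt are all hypothetical placeholders; the structural point is that nothing gets published until a human explicitly approves it.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    content: str
    approved: bool = False
    reviewer_notes: str = ""


def generate_campaign_draft(brief: str) -> Draft:
    """Stand-in for an LLM call -- in practice this would hit your model of choice."""
    return Draft(content=f"Draft campaign copy for: {brief}")


def supervisor_review(draft: Draft) -> Draft:
    """The Agent Supervisor approves, corrects, or rejects before anything ships."""
    print("--- PENDING REVIEW ---")
    print(draft.content)
    decision = input("Approve? (y/n): ").strip().lower()
    draft.approved = decision == "y"
    if not draft.approved:
        draft.reviewer_notes = input("Reason / correction: ")
    return draft


def publish(draft: Draft) -> None:
    """Hard gate: unapproved output never reaches the audience."""
    if not draft.approved:
        raise RuntimeError("Blocked: output was not approved by the Agent Supervisor.")
    print("Published:", draft.content)


if __name__ == "__main__":
    draft = generate_campaign_draft("Q3 product launch")
    publish(supervisor_review(draft))
```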

#4 100% AI workflow automation removes accountability.

Some processes require human judgment, creativity, empathy, and ethical decision-making: things AI can’t replicate reliably.

When you automate everything, you lose these safeguards and increase the risk of errors that no one is equipped to catch.

So what works better?

Part human, part AI.

>> Find and automate the repetitive, rule-based, and predictable parts.

>> Use AI to assist in the decision-heavy or judgment-based parts.

>> Keep humans involved where quality, trust, and context truly matter (see the routing sketch right after this list).
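
Here is the routing sketch mentioned above: a toy Python example, with made-up task fields (kind and risk), of how work can be split between full automation, AI-assist with human sign-off, and humans only.

```python
def route_task(task: dict) -> str:
    """Decide who owns a task: automation, AI-assist, or a human.
    The 'kind' and 'risk' labels are illustrative -- map them to your own process inventory."""
    if task["kind"] == "rule_based" and task["risk"] == "low":
        return "automate"      # repetitive and predictable -> full automation
    if task["kind"] == "judgment" and task["risk"] != "high":
        return "ai_assist"     # AI drafts, a human signs off
    return "human"             # quality, trust, or context on the line -> human owns it


if __name__ == "__main__":
    tasks = [
        {"name": "tag support tickets", "kind": "rule_based", "risk": "low"},
        {"name": "draft refund reply", "kind": "judgment", "risk": "medium"},
        {"name": "approve bereavement refund", "kind": "judgment", "risk": "high"},
    ]
    for t in tasks:
        print(f"{t['name']:30} -> {route_task(t)}")
```

In practice the classification would come from your own process audit, but the shape stays the same: automate where the rules are clear, keep humans wherever trust is on the line.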


This balance gives you leverage instead of fragility.

It keeps your systems efficient, reliable, and aligned with your business goals.

In short, don’t chase full automation. Chase smart automation.