If your voice agent ends calls too early, skips confirmations, or sounds like it is reading a script instead of asking a real question, the problem may be tag questions. It may also be a model-fit issue.
What is a tag question?

A tag question is a short phrase added to the end of a statement, such as:
“right?”
“correct?”
“okay?”
“all good?”
For example:
“Your number is 5-5-5, 1-2-3-4 — right?”
This looks like a question, but in speech, it often sounds like the speaker is wrapping up. Many callers hear it as a closing signal rather than a genuine request to verify the information.
Voice agents and language models can treat it the same way.
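If you want to find tag questions in an existing script before a caller does, a small check like the one below can help. It is a minimal sketch, not a complete linter: the `TAGS` list and the `ends_with_tag_question` helper are names made up for this illustration, and the pattern only catches a bare tag that follows a comma or dash at the very end of a line.

```python
import re

# Hypothetical tag list for illustration; extend it to match your own scripts.
TAGS = ("right", "correct", "okay", "ok", "all good")

# A comma or dash, then a bare tag, then a question mark at the end of the line.
TAG_QUESTION = re.compile(
    r"[,\u2014\u2013-]\s*(?:" + "|".join(re.escape(t) for t in TAGS) + r")\?\s*$",
    re.IGNORECASE,
)


def ends_with_tag_question(line: str) -> bool:
    """Return True if the line ends with a bare tag such as ', right?'."""
    # Drop surrounding whitespace and straight or curly double quotes first.
    text = line.strip().strip('"\u201c\u201d')
    return TAG_QUESTION.search(text) is not None
```

Running every scripted line of a prompt through this check gives a quick list of confirmations worth rewriting. A full question such as "Did I get that right?" is not flagged, because the tag word is part of a complete sentence rather than attached after a comma or dash.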
Tag questions often break voice agents.
When a voice agent says:
“Your callback time is tomorrow at 2 — all good?”
The sentence often flows straight into the tag question. The voice may trail off at the end, making it sound like the agent is done speaking.
Two things can happen:
- The caller may not realise they are meant to check the information. They may just say “yep, thanks, bye.”
- The model may treat the confirmation as complete and move to the next step too soon. In some flows, that next step is end_call.
This is a common reason users complain that the agent ended the call before they finished speaking.
Model-specific behaviour matters more than expected.
What I observed during testing is that the failure isn't uniform across models, and neither is the fix.
Some models are better at understanding the wider conversation. Others follow the prompt more literally.
Anthropic models (such as Sonnet 4.5 and Sonnet 4.6) often understood that the agent should wait for the caller, even when the prompt used a weak confirmation like: “5-5-5, 1-2-3-4 — right?”
These models can often determine the correct pause from the conversational context.
OpenAI models (GPT-4.1, GPT-5.1) behaved more literally.
When a prompt contains a step like:
Say the confirmation, then move to the close, then end_call
GPT-5.1 may treat that as one combined instruction. The tag question can make the problem worse because it does not clearly interrupt the flow.
So the model may say the confirmation and continue straight to the closing line.
What does this mean for you?
The same prompt may work well on Sonnet 4.6 but fail sometimes on GPT-5.1.
That does not always mean the prompt is “bad.” It may mean the prompt is not a good fit for the model you are using. So the fix should depend on the model.
A tag question can reduce reliability, especially near the end of a call.
Using a full question in a separate sentence improves how the confirmation sounds, and it helps models wait for the caller more reliably.
For example:
Instead of this:
“5-5-5, 1-2-3-4 — right?”
Use this:
“5-5-5, 1-2-3-4. Did I get that right?”
This works better because the final sentence sounds like a real question both to your caller and to the LLM.
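The rewrite itself is mechanical enough to sketch in code. The helper below is a hypothetical illustration (the name `to_full_question` and the default question text are assumptions, not part of any voice platform): it strips a trailing bare tag and appends a full question as its own sentence.

```python
import re

# Trailing bare tag after a comma or dash, e.g. " - right?" or ", correct?".
TAG_TAIL = re.compile(
    r"\s*[,\u2014\u2013-]\s*(?:right|correct|okay|ok|all good)\?\s*$",
    re.IGNORECASE,
)


def to_full_question(line: str, question: str = "Did I get that right?") -> str:
    """Replace a trailing tag question with a full question in a separate sentence."""
    statement, count = TAG_TAIL.subn("", line.strip())
    if count == 0:
        # No bare tag found, so leave the line unchanged.
        return line
    # End the statement with a full stop so the question stands on its own.
    return f"{statement.rstrip('.')}. {question}"
```

For example, `to_full_question("5-5-5, 1-2-3-4 \u2014 right?")` returns `5-5-5, 1-2-3-4. Did I get that right?`. Treat the output as a draft: read it aloud before shipping it, because the right question wording depends on what is being confirmed.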
For GPT models, better phrasing may not be enough.
You may also need a clear structure that forces the model to stop and wait.
For example:
Before: risky confirmation.
## Role
You're Sarah, the receptionist for Bright Smile Dental. Your job is to book appointments by collecting names and phone numbers and passing them to the team.
## Greeting
"Hi, this is Sarah with Bright Smile Dental. How can I help?"
## Standard Flow
Every step: ask → wait for answer → continue.
1. **Full name** — "What's your full name?"
2. **Phone** — "What's the best number to reach you on?"
3. **Confirm phone** — "5-5-5, 1-2-3-4 — right?"
4. **Close** — "Thanks. Someone'll call you back shortly to book your appointment. Have a great day." → `end_call`
## Key Rules
1. Never claim to be human
2. Never invent appointment times or pricing
3. Every question step requires a wait for the caller's answer before continuing
After: stronger version for GPT models.
## Role
You're Sarah, the receptionist for Bright Smile Dental. Your job is to book appointments by collecting names and phone numbers and passing them to the team.
## Greeting
"Hi, this is Sarah with Bright Smile Dental. How can I help?"
## Standard Flow
Every step: ask → wait for answer → continue.
1. **Full name** — "What's your full name?"
2. **Phone** — "What's the best number to reach you on?"
3. **Confirm phone** — Say exactly: "Let me read that back. 5-5-5, 1-2-3-4. Did I get that right?"
STOP. Wait for the caller's spoken response. Do not continue to step 4 and do not call any function until the caller has spoken.
- If the caller confirms, go to step 4.
- If the caller corrects the number, accept the correction, repeat the corrected number, and wait for the caller to confirm it before going to step 4.
4. **Close (terminal step)** — Say: "Thanks. Someone'll call you back shortly to book your appointment. Have a great day."
Then call `end_call`.
## Key Rules
1. Never claim to be human
2. Never invent appointment times or pricing
3. Every question step requires a wait for the caller's answer before continuing
4. `end_call` may only be triggered from step 4, and only after the caller has confirmed the phone number in step 3
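If your stack lets you intercept tool calls before they run, rule 4 can also be backed by a guard in code, so a literal-minded model cannot hang up even when it ignores the prompt. The sketch below assumes that kind of middleware exists; `CallState`, `on_agent_utterance`, `on_caller_utterance`, and `allow_tool_call` are hypothetical names, not functions from any voice platform. It only checks that the caller has spoken after the confirmation question, which is weaker than checking that they actually confirmed.

```python
from dataclasses import dataclass

# Assumed marker: the exact confirmation wording used in step 3 of the prompt.
CONFIRMATION_MARKER = "Did I get that right?"


@dataclass
class CallState:
    """Hypothetical per-call state kept by your own middleware."""
    confirmation_asked: bool = False
    caller_replied: bool = False


def on_agent_utterance(state: CallState, text: str) -> None:
    """Record that the agent asked for confirmation and is now waiting."""
    if CONFIRMATION_MARKER in text:
        state.confirmation_asked = True
        state.caller_replied = False


def on_caller_utterance(state: CallState, text: str) -> None:
    """Record that the caller said something after the confirmation question."""
    if state.confirmation_asked and text.strip():
        state.caller_replied = True


def allow_tool_call(state: CallState, tool_name: str) -> bool:
    """Block end_call until the confirmation was asked and the caller replied."""
    if tool_name != "end_call":
        return True
    return state.confirmation_asked and state.caller_replied
```

When `allow_tool_call` returns `False`, the middleware can drop the tool call and let the conversation continue. The prompt stays the first line of defence; the guard only catches the cases where the model skips the wait anyway.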
As you can see, optimising your prompt for GPT models often increases its token count.
That is one reason I prefer Claude models for voice agents, even when they cost almost twice as much as GPT models.
A cheaper agent is not really cheaper if it ends calls too early or skips important confirmations.
Takeaway: For production voice agents, test the prompt on the exact model you plan to use. A prompt that works on one model may fail on another.
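One way to make that testing repeatable is to replay a batch of test calls on the target model and count how often `end_call` fires before the caller has answered the confirmation. The sketch below assumes you can export each call as an ordered list of `(kind, text)` events, where `kind` is `"agent"`, `"caller"`, or `"tool"`. That event format and both function names are assumptions made for illustration, not any platform's export format.

```python
from typing import List, Tuple

# Assumed event shape: (kind, text), with kind in {"agent", "caller", "tool"}.
Event = Tuple[str, str]


def ended_too_early(events: List[Event],
                    confirmation_marker: str = "Did I get that right?") -> bool:
    """Return True if end_call fires after the confirmation with no caller turn in between."""
    waiting_for_caller = False
    for kind, text in events:
        if kind == "agent" and confirmation_marker in text:
            waiting_for_caller = True
        elif kind == "caller":
            waiting_for_caller = False
        elif kind == "tool" and text == "end_call" and waiting_for_caller:
            return True
    return False


def early_end_rate(transcripts: List[List[Event]]) -> float:
    """Fraction of transcripts in which the call ended before the caller answered."""
    if not transcripts:
        return 0.0
    return sum(ended_too_early(t) for t in transcripts) / len(transcripts)
```

Running the same set of test calls against each candidate model and comparing `early_end_rate` turns "works on one model, fails on another" into a number you can track between prompt revisions.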