Ever tested a voice AI agent and heard it read a comma out loud after a website or email address, but never in a normal sentence?

This isn't random. It happens because the voice engine's normaliser handles punctuation differently depending on what precedes it.


Here's a typical system prompt for a voice AI agent confirming a website address:

- Website Address:

  "What's your business's website address, please?"

  ("Alright," "I see.")

  "And just to make sure I've got it, that's [repeat website], right?"

Here, the placeholder [repeat website] is a variable. At runtime, it is replaced with whatever the caller actually said.

So if the caller says, "My website is www.james.com," the agent responds:

"Alright. And just to make sure I've got it, that's www.james.com, right?"
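To see how the comma ends up pressed directly against the URL, here is a minimal Python sketch of the substitution. It assumes plain string replacement, which is a simplification: a real agent usually lets the language model fill in the value. The names TEMPLATE and fill_placeholder are hypothetical.

```python
# Hypothetical prompt line; the placeholder mirrors the system prompt above.
TEMPLATE = "And just to make sure I've got it, that's [repeat website], right?"


def fill_placeholder(template: str, placeholder: str, value: str) -> str:
    """Swap a bracketed placeholder for the value captured from the caller."""
    return template.replace(f"[{placeholder}]", value)


response = fill_placeholder(TEMPLATE, "repeat website", "www.james.com")
print(response)
# And just to make sure I've got it, that's www.james.com, right?
```

Notice that the comma from the template now sits immediately after "www.james.com", with no space in between.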


Here's the problem: the agent speaks that comma after "www.james.com" out loud.

But in the question "What's your business's website address, please?", the comma after "address" is silent. It's just a pause.

Same punctuation mark. Different behaviour. Why?


The voice engine has two components that matter here:

  • The normaliser prepares the text before it's spoken.
  • The TTS engine converts the prepared text into speech.

The normaliser treats punctuation differently depending on its context.

In normal sentences, commas are pause markers. The normaliser tells the TTS engine to insert a brief pause. The comma itself is never spoken.


When it reaches structured data (URLs, email addresses, phone numbers), the normaliser switches to verbatim mode. It reads structured values literally: "dot" for ., "at" for @, "slash" for /.

When punctuation appears immediately after structured data, the normaliser glues it to the value instead of treating it as grammar. The comma becomes part of the data and may be spoken aloud.


In short:

  • Normal sentence + comma = pause (silent).
  • Structured data + comma = punctuation read as part of the value (spoken).
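The two rules can be imitated with a toy normaliser in Python. This is only a sketch of the behaviour described here, not the internals of any real voice engine. The STRUCTURED pattern, the SPOKEN table, the "<pause>" marker, and the normalise function are all assumptions made for illustration.

```python
import re

# Toy detector for structured data: a bare domain or an email address.
STRUCTURED = re.compile(r"^(?:[\w.+-]+@)?(?:[\w-]+\.)+[a-z]{2,}", re.IGNORECASE)

# How symbols are read aloud in verbatim mode (assumed mapping).
SPOKEN = {".": "dot", "@": "at", "/": "slash", ",": "comma"}


def normalise(text: str) -> str:
    """Return the text roughly as a TTS engine would receive it."""
    out = []
    for token in text.split():
        if STRUCTURED.match(token):
            # Verbatim mode: spell out every symbol in the token,
            # including punctuation glued to the end of the value.
            parts = re.findall(r"[\w-]+|[^\w\s-]", token)
            out.extend(SPOKEN.get(part, part) for part in parts)
        elif token.endswith(","):
            # Normal sentence: the comma becomes a silent pause marker.
            out.extend([token[:-1], "<pause>"])
        else:
            out.append(token)
    return " ".join(out)


print(normalise("What's your business's website address, please?"))
# What's your business's website address <pause> please?
print(normalise("that's www.james.com, right?"))
# that's www dot james dot com comma right?
```

Because the sketch splits on whitespace, "www.james.com," arrives as a single token, so the trailing comma is handled in verbatim mode along with the dots.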

The fix.