
What is speech analytics?


Image credit: Iwan Gabovitch

Who does speech analytics? We do speech analytics! So what is it?


Well, without getting too technical, we’re going to talk about the two main types of technology currently available: speech to text and speech to phonemes.

Phonetic Indexing

The first type we will broach is called Phonetic Indexing. The process is simple: the audio is broken down into strings of phonemes, the basic units of speech. Because phonemes are simply uttered sounds rather than dictionary words, the indexing is largely unaffected by background noise, languages, dialects or speaking styles. Phonetic indexing creates the truest representation of spoken audio and enables the fastest, most accurate access to the data contained in audio recordings. The problem with phonetic indexing is that you need to know what you’re looking for, and that’s where we come in. Our experienced analysts do the hard work for you, which leads to root cause discovery, trend spotting and even the ability to detect emotion or sentiment.
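To make the "you need to know what you’re looking for" point concrete, here is a minimal sketch of a phonetic search in Python. It is a toy under stated assumptions, not how any particular product works: the pronunciation table, the ARPAbet-style phoneme symbols and the function names (to_phonemes, find_phrase, search_calls) are all made up for illustration, and the "calls" are built from text rather than from real audio, which would need an acoustic model.

```python
from typing import Dict, List

# Hypothetical toy pronunciation table using ARPAbet-style symbols.
# A real engine derives phonemes from audio; this table is an assumption.
PRONUNCIATIONS: Dict[str, List[str]] = {
    "cancel": ["K", "AE", "N", "S", "AH", "L"],
    "my": ["M", "AY"],
    "account": ["AH", "K", "AW", "N", "T"],
    "refund": ["R", "IY", "F", "AH", "N", "D"],
    "please": ["P", "L", "IY", "Z"],
}


def to_phonemes(phrase: str) -> List[str]:
    """Turn a search phrase into a flat phoneme string via the toy table."""
    phonemes: List[str] = []
    for word in phrase.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])
    return phonemes


def find_phrase(index: List[str], query: List[str]) -> List[int]:
    """Return every offset in the phoneme index where the query occurs."""
    n = len(query)
    if n == 0:
        return []
    return [i for i in range(len(index) - n + 1) if index[i:i + n] == query]


def search_calls(calls: Dict[str, List[str]], phrase: str) -> Dict[str, List[int]]:
    """Search every indexed call for a phrase; keep only calls with hits."""
    query = to_phonemes(phrase)
    hits: Dict[str, List[int]] = {}
    for call_id, index in calls.items():
        positions = find_phrase(index, query)
        if positions:
            hits[call_id] = positions
    return hits


if __name__ == "__main__":
    # Stand-in for phoneme strings produced from recorded audio.
    calls = {
        "call-001": to_phonemes("please cancel my account"),
        "call-002": to_phonemes("refund please"),
    }
    print(search_calls(calls, "cancel my account"))
```

Notice that nothing comes back until somebody supplies a phrase to search for. The index itself never volunteers a topic, which is exactly why phonetic search relies on analysts who know what to ask.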

Speech to text

Speech to text is the other big player, and while it’s not as fast or as accurate, it does have some advantages. In the industry, speech to text is called large-vocabulary continuous speech recognition, or LVCSR. Like phonetic indexing, LVCSR recognises uttered sounds, but it then matches combinations of phonemes against linguistic models containing a large human-language vocabulary to build a complete database. Once that is done, it can surface new business issues and trends, because the parsed data is much more pliable than a phonetic index.
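The matching step described above can be sketched in a few lines of Python. This is a deliberately simplified illustration: real LVCSR engines score many competing hypotheses with statistical acoustic and language models, whereas this toy swaps in a greedy longest-match lookup against a tiny, made-up lexicon. The LEXICON table, the "<unk>" marker and the function names are assumptions for the example only.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical toy lexicon mapping phoneme sequences to words.
# Real systems use large vocabularies plus probabilistic language models.
LEXICON: Dict[Tuple[str, ...], str] = {
    ("K", "AE", "N", "S", "AH", "L"): "cancel",
    ("M", "AY"): "my",
    ("AH", "K", "AW", "N", "T"): "account",
    ("P", "L", "IY", "Z"): "please",
}
UNKNOWN = "<unk>"  # placeholder for sounds the lexicon cannot explain


def decode(phonemes: List[str]) -> List[str]:
    """Greedy longest-match decoding of a phoneme stream into words."""
    max_len = max(len(key) for key in LEXICON)
    words: List[str] = []
    i = 0
    while i < len(phonemes):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(phonemes) - i), 0, -1):
            word = LEXICON.get(tuple(phonemes[i:i + n]))
            if word is not None:
                words.append(word)
                i += n
                break
        else:
            # Nothing matched: skip one phoneme and collapse runs of unknowns.
            if not words or words[-1] != UNKNOWN:
                words.append(UNKNOWN)
            i += 1
    return words


def word_counts(transcripts: Dict[str, List[str]]) -> Counter:
    """Count known words across all transcripts, ignoring unknown markers."""
    counts: Counter = Counter()
    for words in transcripts.values():
        counts.update(w for w in words if w != UNKNOWN)
    return counts


if __name__ == "__main__":
    stream = ["P", "L", "IY", "Z", "K", "AE", "N", "S", "AH", "L",
              "M", "AY", "AH", "K", "AW", "N", "T"]
    transcripts = {"call-001": decode(stream)}
    print(transcripts)
    print(word_counts(transcripts).most_common(3))
```

Once the calls are plain words, you can count, group and browse them without deciding in advance what to search for, which is the pliability mentioned above. The flip side shows up in the unknown marker: any sound sequence missing from the vocabulary is simply lost.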

So who wins at Speech Analytics?

While you don’t need a great deal of analytic skill to browse speech-to-text data, it does require many more servers, and when dealing with high call volumes you’ll often have to settle for analysing a sample of calls. In addition, your analysis will be dictionary dependent and usually won’t recognise words that are missing from the dictionary, such as product or brand names. Phonetics, on the other hand, requires a bit more groundwork, and you’ll need to know what you are looking for. Most of the time this won’t be a problem, but once in a while a new issue will come up and a phonetic indexing solution will not identify it for you. After weeks, and sometimes months, of work you may eventually have an engine that will identify root causes and trends, but that is a lot of effort to get there!

Right now, nobody wins. Pick the right technology for the right problem, and know what to expect.


The real tragedy is that most companies haven’t made the decision to use the big data available to them. It seems that almost every company begins its call center calls with “Your conversation may be recorded…”, yet only a few are actually using speech analytics as a way to develop better training methods and deliver an improved customer experience.
