The Business Solutions Series is a compilation of solutions to various business challenges that I have encountered throughout my professional journey.
Context & Problem
Customer service representatives answer calls from thousands of clients a day. The customer success team wanted to understand how often different issues came up on those calls and what exactly was being said about those issues, both by the clients and by the customer service representatives who answered the calls.
The team needed an efficient and flexible tool to analyze many thousands of recorded phone calls for different issues.
Objective
The objective of the project was to process recorded telephone calls so that different topics and issues could be identified quickly across many conversations. This was needed to understand the prevalence of different issues, the nature of those issues, and how those issues were being handled.
Solution
To solve this problem we used TensorFlow Hub, a free, public repository of pre-trained machine learning models. We were particularly interested in its natural language (text) models. There are several, including some very interesting multilingual ones that embed phrases from 16 or more languages into the same embedding space (i.e. the same phrase in different languages ends up with very similar embeddings). These models are extremely easy to use: you load one into Python and call it like a function that takes a phrase (string) as input and returns an embedding for that phrase. You can also serve these models with TensorFlow Serving if you need to speed up and scale inference.
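To make the "load it and call it like a function" idea concrete, here is a minimal sketch. The model handle is an assumption on my part (this post does not name the exact model we used), and the helper names `load_encoder` and `embed_phrases` are made up for illustration. It also assumes the `tensorflow_hub` and `tensorflow_text` packages are installed.

```python
from typing import Callable, List, Sequence

# Assumed model handle, chosen for illustration only; any TensorFlow Hub
# text embedding model that accepts a list of strings should work similarly.
MODEL_URL = "https://tfhub.dev/google/universal-sentence-encoder-multilingual/3"


def load_encoder(model_url: str = MODEL_URL) -> Callable:
    """Download (or read from the local cache) a text embedding model."""
    # Imported lazily because these are heavy, optional dependencies.
    import tensorflow_hub as hub
    import tensorflow_text  # noqa: F401  (registers ops the multilingual models need)

    return hub.load(model_url)


def embed_phrases(encoder: Callable, phrases: Sequence[str]) -> List[List[float]]:
    """Run phrases through the encoder and return one plain list of floats per phrase."""
    embeddings = encoder(list(phrases))
    # TensorFlow tensors expose .numpy(); other callables may return lists already.
    if hasattr(embeddings, "numpy"):
        embeddings = embeddings.numpy()
    return [[float(x) for x in row] for row in embeddings]


# Hypothetical usage (downloads the model on first call):
#   encoder = load_encoder()
#   vectors = embed_phrases(encoder, ["My invoice is wrong", "Mi factura está mal"])
```

Keeping the encoder as a plain callable argument makes it easy to swap in a different model, or a stub when testing without downloading anything.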
For our specific problem, we set up a managed Apache Beam job that ran a few times a day to process all new calls. The job called a managed cloud transcription service, which returned a separate string every time it detected a new speaker or a long pause in the audio file. The job then ran each of those strings through the pre-trained language model of choice, which output an embedding for that phrase. Finally, it saved the transcript string, the call identifier, the start time of the phrase, the duration of the phrase, and the embedding of the phrase to a data warehouse.
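The per-phrase step of that job can be sketched in plain Python. This is not our production Beam code: the `Segment` fields, the function names, the batch size, and the choice to skip blank strings are all assumptions made for illustration. In a real Beam pipeline, logic like this would typically live inside a transform applied to each call's transcript, with the resulting rows written to the warehouse.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Sequence


@dataclass
class Segment:
    """One string returned by the transcription service (hypothetical field names)."""
    call_id: str
    start_seconds: float
    duration_seconds: float
    text: str


def _embed_batch(batch: List[Segment], embed: Callable) -> Iterator[Dict]:
    """Embed one batch of segments and yield a warehouse-ready row per segment."""
    vectors = embed([segment.text for segment in batch])
    for segment, vector in zip(batch, vectors):
        row = asdict(segment)
        row["embedding"] = list(vector)
        yield row


def segments_to_rows(
    segments: Iterable[Segment],
    embed: Callable[[Sequence[str]], Sequence[Sequence[float]]],
    batch_size: int = 32,
) -> Iterator[Dict]:
    """Turn transcript segments into rows holding text, timing, call id, and embedding."""
    batch: List[Segment] = []
    for segment in segments:
        if not segment.text.strip():
            continue  # assumption: blank segments are not worth embedding
        batch.append(segment)
        if len(batch) == batch_size:
            yield from _embed_batch(batch, embed)
            batch = []
    if batch:  # flush the final partial batch
        yield from _embed_batch(batch, embed)
```

Batching the calls to the model is a design choice rather than a requirement; embedding models are usually much faster on a list of phrases than on one phrase at a time.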
Lastly, we let users define topics and issues of interest (by writing a description of the topic in plain English), and we used the same pre-trained model to convert those descriptions into embeddings. With both the topic embeddings and the transcript embeddings in hand, we used a zero-shot model (essentially cosine similarity between embeddings), coded directly inside the data warehouse, to find all the phrases in the transcripts that were semantically similar to a topic of interest and show them to the end user for analysis.
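The matching logic itself is small. In our case it ran inside the data warehouse, so the Python below is only a stand-in that shows the same idea: score every transcript phrase against a topic embedding with cosine similarity and keep the ones above a cutoff. The function names, the row layout, and the 0.5 threshold are assumptions for illustration; a sensible threshold depends on the model and the topic.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_matches(
    topic_embedding: Sequence[float],
    rows: List[Dict],
    threshold: float = 0.5,
) -> List[Dict]:
    """Return rows whose embedding is similar enough to the topic, best match first."""
    matches = []
    for row in rows:
        score = cosine_similarity(topic_embedding, row["embedding"])
        if score >= threshold:
            # Copy the row so the stored data is not mutated.
            matches.append({**row, "similarity": score})
    return sorted(matches, key=lambda r: r["similarity"], reverse=True)
```

Because the topic description is embedded with the same model as the transcripts, no labeled examples or retraining are needed to add a new topic; a user only has to write a new description.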
Impact
The company was able to understand the true scale of different issues, and to find out what clients were saying about specific issues and how representatives were responding to them. This led to better issue prioritization and robust solutions to these common issues, which in turn substantially reduced the number of calls the team received and increased client satisfaction over time.