
Call Center Performance Evaluation Using Big Data Analytics

Betül KARAKUS Computer Engineering Department Firat University Elazig, Turkey [email protected]

Galip AYDIN Computer Engineering Department Firat University Elazig, Turkey [email protected]

effects on increasing productivity, revenue and customer satisfaction, and on decreasing costs and time, hence meeting quality assurance for businesses. Although there are more data than ever before with new technologies, many organizations are still looking for better options to get value from their data sets [5]. The MapReduce framework [6] has emerged as one of the primary options to analyze such large amounts of data efficiently. MapReduce, the most popular paradigm for big data processing, has been implemented in open source projects like Hadoop [7]. Hadoop provides a flexible and scalable framework for analyzing structured and unstructured data and distributes data on independent disks using the Hadoop Distributed File System (HDFS) in a fault-tolerant manner.

Abstract—Quality monitoring in call centers can be described as the process of listening to recorded calls in order to measure the performance of a customer service representative or agent. The main challenge of quality monitoring is that managers have no time to listen to all the records, and therefore only a few of the stored calls are randomly selected. This results in inaccurate performance measurements, since most call records are never listened to. This paper presents a distributed call monitoring system for assessing all recorded calls using several quality criteria. In the proposed system, we analyze a large number of call records using the popular Hadoop MapReduce framework and utilize text similarity algorithms such as Cosine and n-gram. We also integrated slang word lists into our monitoring system. Empirical call records are used to demonstrate the performance of the proposed call monitoring system.

This paper addresses the challenges of monitoring and analyzing call center agent call records. The essential goal of this study is to analyze all recorded calls and measure agents’ performance using big data analytics and text mining techniques. In particular, we try to provide the following key advantages:

Keywords—call center, Hadoop, similarity, big data

I. INTRODUCTION
Today a large amount of information is being gathered and stored in unstructured or semi-structured formats, from sources such as Twitter, Facebook pages, customer e-mails, call center records, blogs, forums and other web pages [1]. Text analytics seeks to extract useful and new information from these data for use in various applications such as business intelligence, media analytics, customer relationship management (CRM) and predictive analytics. Mining textual data with traditional machine learning techniques, natural language processing, information retrieval, and knowledge management presents many challenges due to different language features and big data challenges [2].

• To turn customer data with call records into value.
• To gain new insights to make better decisions.
• To gain competitive advantage and more time.
• To achieve significant cost savings.
• To increase revenue and customer satisfaction.
• To present a reliable monitoring system resulting in accurate performance measurements by analyzing all the incoming and outgoing calls.

Big data challenges are mostly associated with the features of the data, namely variety, volume, velocity, veracity, and value. Variety refers to unstructured data in different forms such as customer emails, social media data, audio and video data; volume refers to large amounts of data; velocity refers to how fast the data is generated and how fast it needs to be analyzed; veracity refers to the credibility of data, or how reliable it is for making crucial decisions; and value, the most important V of big data, refers to the worth of the data stored by different organizations [3].

The rest of the paper is structured as follows. In Section II we discuss the related work, and in Section III we discuss call center agent performance and its issues and challenges. Section IV gives a review of big data analytics and explains which technologies it provides. Section V presents the proposed system and the performance evaluation. Finally, Section VI gives conclusions and explains future work.

II. RELATED WORK
Call centers provide services for many types of sectors such as telecommunication, finance, transportation, health, automotive etc. Several studies have proposed various approaches and solutions for the problem of evaluating agent performance. Takeuchi [8] has analyzed the recorded calls

As the amount of data grows exponentially, it has become more important to choose advanced analytic techniques that operate on big data sets, an approach also called big data analytics [4]. Big data analytics, the combination of big data and analytics shown in Fig. 1, has had tremendous


from a rental car reservation office with Trigger Segment Detection to find whether a customer has the intention of booking a car or not. Mishne [9] has proposed a call center monitoring system that uses text analytics and information retrieval methods. The system is used to analyze the content of call center conversations and detect the main issue addressed in the call. The project in [10] has presented a speech analytics system that adapts automatic speech recognition and text mining technologies to French call center records. In contrast to the current approaches in speech analytics and text mining, the study in [11] has presented an interaction mining tool built on pragmatic analysis for call center analytics. The tool is applied to a corpus of 213 manually transcribed conversations of a help desk call center in the banking domain. Kopparapu [12] has developed a method to enable automatic identification of problematic calls for call centers.

A. Performance Metrics
The following performance metrics are used for evaluating an agent’s performance:

III. CALL CENTER AGENT PERFORMANCE
A call center is an important part of customer relationship management, which consists of people, processes, technology and strategies. The service quality of a call center is the result of a comparison between actual service performance and customer expectations. Evaluating the service quality offered by a customer service agent to a customer is more difficult than evaluating product quality. Performance evaluation in call centers is generally performed by listening to randomly selected recorded calls and evaluating the words one by one in the related conversation. However, such an evaluation has drawbacks in terms of time and cost, since people are required to listen to the recorded calls and only a few of the calls can be evaluated.



• Estimating and grading how quickly agents answer incoming calls and their daily number of conversations (speed of answering calls).
• Evaluating how many calls are needed to solve a problem and whether there are any recurring calls (speed of meeting demands).
• Evaluating the conversations with a grading system (reliability).
• Evaluating whether the required greeting and closing words are said.
• Checking whether slang words are used (guarantee).

B. Methods and Technologies
The proposed system uses the following technologies:
• A special-purpose cloud infrastructure (OpenStack) is created so that the recorded calls can be processed and stored in parallel.
• Recorded calls are converted to text files using the Google Speech API, and the converted data is stored on the distributed file system.
• HBase (scalable, distributed NoSQL database for big data, working on Hadoop), Hive (business intelligence system working on big data), Mahout (scalable machine learning and data mining library), Pig (high-level data flow language and execution library for parallel computation) and ZooKeeper (high-level coordination service for distributed applications), together with the Cosine and n-gram similarity algorithms, are used to analyze big data on the cloud infrastructure.

Therefore, there is an obvious demand for automatic performance evaluation systems to reduce the employee costs and to increase the time efficiency. Traditional approaches have limitations in terms of storage and content analysis when it becomes necessary to process hundreds of conversations in parallel. The proposed system uses big data technologies to store all of the recorded conversations and analyze them using distributed text analysis methods. The virtual machines required to run the distributed storage and analysis system are run on the cloud which allows us to scale the system according to the number of inputs.

• JSF (JavaServer Faces), PrimeFaces (CSS-JS library) and MongoDB (document-based NoSQL database) are used for user interface development.

IV. BIG DATA ANALYTICS

Measuring call center agent performance using the proposed system offers the following major contributions for call centers:

In recent years, big data analytics has been used to analyze large (different sizes from terabytes to exabytes) and complex datasets that require huge computing power, storage capacity and network bandwidth. Big data analytics including text analytics, data mining and natural language processing techniques allows researchers and business analysts to make better decisions.

1) Technological Innovation: Creating a reliable and scalable storage area for the call records, performing automatic and metric-based analysis of all conversations, providing an early warning system for managers, automatic customer service agent performance scoring, and daily, weekly and monthly performance analysis reports.
2) Reducing Employee Cost: Employee cost will be reduced because call records are analyzed automatically.

A. Big Data
Apache Hadoop, an open source implementation of the MapReduce model, has emerged as a notable solution for big data processing and storage. Hadoop has two main parts: the Hadoop Distributed File System (HDFS), which stores input and


output files by default, and the MapReduce framework, which processes large volumes of data. HDFS comes with two components: a single master node known as the NameNode and slave nodes known as DataNodes. The NameNode is responsible for storing the metadata of HDFS, while the DataNodes store the actual data. The MapReduce framework offers many features, including fault tolerance, automatic parallelization, scalability and data locality-based optimizations [14]. MapReduce jobs, controlled by a master node, are split into two functions, Map and Reduce. The Map function divides the input data into a group of key-value pairs, and the output of each map task is sorted by key. The Reduce function merges the values into the final result. MapReduce can be implemented in various programming languages.
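The Map, sort and Reduce steps described above can be illustrated with a minimal single-process sketch in Python. It does not use Hadoop's API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`), the record layout (agent id, transcript) and the sample transcripts are assumptions made only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: turn one (agent_id, transcript) record into (key, value) pairs."""
    agent_id, transcript = record
    return [((agent_id, word), 1) for word in transcript.lower().split()]


def shuffle_phase(mapped_pairs):
    """Shuffle/sort: group all values by key and order the groups by key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: merge the values collected for one key into a final result."""
    return key, sum(values)


def run_job(records):
    """Run the three phases in sequence on an in-memory list of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped))


# Hypothetical transcripts keyed by hypothetical agent ids.
records = [
    ("A53", "hello thank you for calling"),
    ("A53", "thank you goodbye"),
    ("B07", "hello how can I help you"),
]
counts = run_job(records)
```

In a real Hadoop job the map and reduce tasks would run on different nodes and the framework would perform the grouping and sorting between them; the sketch only shows how the data flows from key-value pairs to merged results.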

B. Speech Analytics
Speech analytics refers to the process of automatically analyzing all recorded calls to extract useful and new information from conversations between agents and customers. It is generally performed in three steps: automatic speech recognition, natural language processing and text mining. The first step of speech analytics is the transformation of audio recordings into text data. Natural language processing, which includes sentence boundary detection, tokenization and entity extraction, builds a skeleton on which text mining techniques operate. In this work, we introduce speech analytics software that analyzes call center conversations against quality criteria such as accuracy, completeness and reliability for monitoring and performance management.

Big data is considered to be a data collection whose size has been increasing at incredible growth rates. Such large datasets can’t be effectively managed by traditional relational database management systems. To handle this problem, NoSQL database management systems present a distributed solution with strong consistency, high availability and partition-tolerance characteristics [15]. NoSQL databases have been classified into three popular categories by [16]: key-value stores like SimpleDB, column-oriented databases like HBase and Cassandra, and document-based stores like MongoDB. NoSQL databases have the following advantages over relational databases:

C. Similarity Measurements
Similarity measurements on big data attract a wide range of research interest, including information retrieval systems, artificial intelligence and natural language processing. New studies in these research fields aim to perform reliable, accurate and efficient similarity calculations on large text data. There are multiple methods to detect similarity between words and sentences in large document collections. The most popular algorithms for similarity measurement are presented briefly below:
1) Cosine Similarity: When texts are represented as term vectors in a vector space, the cosine of the angle between two vectors indicates the similarity of the texts as a value in [0,1]. Cosine Similarity, which is also used in this study, is a simple and efficient technique for evaluating text similarity.
2) Jaccard Similarity: Jaccard Similarity measures the proximity of two texts as the number of shared terms divided by the total number of distinct terms in both texts. It also gives a value between 0 and 1, where 1 means the texts are completely similar.
3) Levenshtein Distance: The Levenshtein algorithm measures the similarity between two words or strings by computing the minimum number of operations (inserts, updates, deletes) needed to transform one string into another, using a cost table of the minimum steps.
4) N-gram: N-gram based measurements compute the number of common n-grams between two strings. The similarity result is obtained by dividing the number of common n-grams by the total number of n-grams.
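The four measures can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the study: whitespace tokenization, term-frequency vectors, character bigrams, and taking "the total number of n-grams" to mean the distinct n-grams of both strings are all assumptions, and the two sample sentences are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values()) *
                     sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def jaccard_similarity(a, b):
    """Shared terms divided by all distinct terms of both texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def levenshtein_distance(s, t):
    """Minimum number of inserts, deletes and substitutions turning s into t."""
    prev = list(range(len(t) + 1))  # cost row for the empty prefix of s
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # delete cs
                            curr[j - 1] + 1,            # insert ct
                            prev[j - 1] + (cs != ct)))  # substitute or match
        prev = curr
    return prev[-1]


def ngram_similarity(a, b, n=2):
    """Common character n-grams divided by all distinct n-grams (assumption)."""
    def grams(text):
        return {text[i:i + n] for i in range(len(text) - n + 1)}
    ga, gb = grams(a.lower()), grams(b.lower())
    total = ga | gb
    return len(ga & gb) / len(total) if total else 0.0


# Hypothetical standard greeting compared with a hypothetical spoken sentence.
standard_greeting = "hello welcome how can i help you"
spoken_sentence = "hello how can i help you today"
greeting_score = cosine_similarity(standard_greeting, spoken_sentence)
```

Cosine and Jaccard operate on word sets or vectors and are therefore insensitive to word order, while the Levenshtein and n-gram measures work on character sequences.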

• NoSQL databases generally process big data faster and more efficiently than relational databases.
• They provide highly scalable and large volume data storage for call data records, document and email archives, web logs, social media interaction data and sensor data.
• They offer automated failover and recovery, easier data distribution, flexible and simple data models, high availability and integration with Hadoop/MapReduce.
• NoSQL databases are open source and inexpensive. They use cheaper servers and storage systems to manage big data and transactions, while hardware and storage requirements of relational databases are generally higher.

In this paper, we compared the performance of the similarity algorithms. Tests run on the Hadoop cluster showed that Cosine and Jaccard Similarity offered better performance than the other similarity measurements. To compare the similarity measurements, the textual records of the conversations between customers and agents were tested for similarity to predefined greeting and closing sentences. We adopt Cosine Similarity to analyze the recorded calls and measure agents’ performance according to predefined metrics.

Fig. 1. An overview of big data analytics (data sources: call center, blogs, customer email, social media, Twitter, audio, video, websites; big data: Hadoop cluster, NoSQL; analytics: dashboards, reports, queries)


Fig. 2. The system architecture (components: call center agent and customer, recorded calls, speech to text, Hadoop Distributed File System (HDFS), distributed programming framework (MapReduce), HBase master server and region servers, Pig, ZooKeeper, Hive, Hortonworks Data Platform (HDP), cloud (OpenStack), operating system (Linux); outputs: key metrics-based agent performance analysis, agent key performance indicators (KPI), reports and scores)

the agent should end the communication. In order to detect slang words, we first created a list of Turkish slang words and stored it in the HBase database. Then, the words in the conversation texts are compared with the list of slang words to check whether the agent or the customer has used any slang word. The number of matching slang words is recorded in the database as the number of slang words and the slang score.
2) Greeting and closing sentences: There are opening and closing modules which customer representatives should take into account in call centers. These modules contain standard opening and closing sentences, and the sentences used by agents should conform to these predetermined standard sentences. To measure the suitability of greeting and closing sentences, Cosine Similarity is calculated between the predetermined sentences and the sentences in the conversation texts. Furthermore, the proposed approach detects incoming (inbound) and outgoing (outbound) calls according to greeting similarities and saves the result as the call type.
3) Repetition of customer name: In a conversation, the agent should ask for the customer’s name before starting to talk about the issue. The agent should not repeat the name after asking for it. We first prepared a list of Turkish names and stored it in the HBase database. The words in the conversation texts are compared with the list of names to detect the customer name. Then the number of times the name is repeated is saved as the number of names.
4) Sentence repetition: The agent may need to repeat a sentence when the customer could not understand clearly. This may be regarded as a loss of time and a problem in agent performance. We first used the Zemberek Turkish NLP library for sentence segmentation, that is, the segmentation of the conversation content into sentences. However, since Zemberek detects sentences using punctuation marks and the conversation texts don’t contain special characters, we prefer to use N-gram Similarity to count the sentence repetitions.
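The list-based checks (slang words and customer names) amount to looking up each word of a transcript in a stored word list. The following Python sketch illustrates the idea under stated assumptions: the word lists are small in-memory sets standing in for the lists kept in HBase, the English sample words and the transcript are hypothetical, and Turkish-specific casing (dotted and dotless i) is not handled.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def count_listed_words(text, word_list):
    """Count how often each word from word_list occurs in the transcript."""
    listed = {w.lower() for w in word_list}
    return Counter(tok for tok in tokenize(text) if tok in listed)


# Hypothetical stand-ins for the slang and name lists stored in HBase.
SLANG_WORDS = {"dude", "crap"}
KNOWN_NAMES = {"ayse", "mehmet"}

# Hypothetical transcript without punctuation, as produced by speech-to-text.
transcript = "hello may i have your name mehmet ok mehmet dude let me check"

slang_hits = count_listed_words(transcript, SLANG_WORDS)
name_hits = count_listed_words(transcript, KNOWN_NAMES)

number_of_slang = sum(slang_hits.values())  # total slang occurrences
number_of_name = sum(name_hits.values())    # total name occurrences
```

Returning a `Counter` rather than a single number keeps the per-word counts, which would let a report show which slang word or name was matched as well as how often.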

V. PROPOSED SYSTEM
In this paper, we propose a new call monitoring system to analyze all recorded phone conversations between agents and customers in call centers (an approach known as speech analytics). An overview of the proposed system is shown in Fig. 2. The system architecture can be decomposed into three parts, namely data conversion and storage, data analysis, and scoring and reporting of customer service agents’ performance.

A. Call Data Conversion and Storage
The proposed system needs a speech recognition application to convert audio recordings to text recordings. There are several speech recognition systems such as Google Speech API, IBM Speech to Text, AT&T Speech API, Wit.ai and other technologies [17]. We use Google Speech API to transform the audio data into textual data. It should also be noted that the proposed system doesn’t depend on Google Speech API or any other specific speech recognition technology. This work only focuses on the analysis of call center conversations; future work will include our own automated speech recognition software for Turkish call center conversations. Moreover, we use Apache HBase, a NoSQL database built on top of Hadoop, to store the call center conversations. The analysis results of conversations between agents and customers are stored in HBase tables and recorded in XML format (detailed in Fig. 3) to make our user interface as easy as possible to use.

B. Call Data Analysis
Call centers generally use key metrics to analyze customer service agents’ performance. The metrics analyzed by the proposed system are listed as follows:
1) Detection of slang words: The agent may use a slang word or words during the conversation with the customer. If the customer uses slang words, the agent should warn the customer at least two times and if needed


5) Detection of banned words: In call centers, there are some words whose use is discouraged, and in some call centers there are words which are banned while talking to customers. Factors such as the usage and pronunciation of the words and their placement in a sentence may make them appropriate, but the agent should still avoid using them. Some examples of these banned words are: “Unfortunately, We refuse, I can’t do, Impossible, I am sorry”. In order to detect banned words, the predefined banned words, which were decided in consultation with Vodafone Call Center, are stored in the “banned_words” table. Then, the conversation is checked to see whether the agent has used any banned word.
6) Angry customers: During a conversation, customers may sometimes be too aggressive because of some mistakes or failures. In these instances, the agent should calm the customer down. If the agent is not able to calm the customer down, the agent should warn the customer at least two times. After the warnings, the best possible practice could be ending the conversation. To detect whether the agent has warned the angry customer, we measure Cosine Similarity by comparing the conversation with warning sentences. Then the warning score of the agent is computed and stored in the database.

Fig. 3. XML representation of agent performance result (values shown: 87.49, 100.0, 0, Successful!, inbound, 4, 50.0, 2, 77.15, 0.0, 100.0, 0, A53, 08/10/2015; file: output.xml)

C. Performance Scores
This study presents a distributed call monitoring system to assess all recorded calls against quality criteria such as accuracy, completeness and reliability for call quality management. To achieve high scalability, we used the Hadoop MapReduce implementation to analyze the large number of call records. We ran experiments on a 10-node Hadoop cluster built on OpenStack, an open source cloud computing platform. Each node runs the Ubuntu 14.04 LTS operating system, which is a recommended operating system for OpenStack. Fig. 4 shows a screenshot of the graphical results section, which presents the performance results of an agent. In this section, if a user (usually management) wants to see the performance scores of an agent, she can enter the agent id and the start and end dates in the relevant textboxes and then click the “Query” button; the system will process the agent’s result file in XML format and show it in a graphical interface.

Fig. 4. Agent performance results (panel title: Graphical Results)


Fig. 5. Management interface for monitoring the agent calls

In a similar manner, Fig. 5 shows a screenshot of the scoring form section, in which management can review and manage the calls and performance scores of agents for each key metric such as greeting, closing, slang words, banned words, warning scores and others. Thus, compared to current technologies, the proposed system offers significant advantages through an early warning system for managers, automatic customer service agent performance scoring, and daily, weekly and monthly performance analysis.

VI. CONCLUSION AND FUTURE WORK
In this paper, we presented a distributed system for performance evaluation of call center agents. The proposed system is implemented on top of the Hadoop platform using big data analytics. Our system provides effective automatic analysis of all call center conversations and a monitoring interface for agents. The main challenge in our study is the lack of a call record corpus. We dynamically constructed the system on a Hadoop cluster running on the cloud infrastructure, which allows us to scale the system according to the number of call records to be processed. We plan to further develop the system using a large call record corpus.

ACKNOWLEDGMENT
The authors would like to thank The Scientific and Technological Research Council of Turkey (TUBITAK) for its support under 1512 Support Program Grants No. 2140243.

REFERENCES
[1] G. Chakraborty, M. Pagolu, and S. Garla, Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS. SAS Institute, 2014.
[2] A. Kaklauskas, Biometric and Intelligent Decision Making Support, vol. 81. Springer, 2015.
[3] A. Katal, M. Wazid, and R. Goudar, "Big data: Issues, challenges, tools and Good practices," in Contemporary Computing (IC3), 2013 Sixth International Conference on, 2013, pp. 404-409.
[4] P. Russom, "Big data analytics," TDWI Best Practices Report, Fourth Quarter, 2011.
[5] S. LaValle, E. Lesser, R. Shockley, M. S. Hopkins, and N. Kruschwitz, "Big data, analytics and the path from insights to value," MIT Sloan Management Review, vol. 21, 2013.
[6] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, pp. 107-113, 2008.
[7] T. White, Hadoop: The Definitive Guide. O'Reilly Media, Inc., 2012.
[8] H. Takeuchi, L. V. Subramaniam, T. Nasukawa, and S. Roy, "Automatic Identification of Important Segments and Expressions for Mining of Business-Oriented Conversations at Contact Centers," in EMNLP-CoNLL, 2007, pp. 458-467.
[9] G. Mishne, D. Carmel, R. Hoory, A. Roytman, and A. Soffer, "Automatic analysis of call-center conversations," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, 2005, pp. 453-459.
[10] M. Garnier-Rizet, G. Adda, F. Cailliau, J.-L. Gauvain, S. Guillemin-Lanne, L. Lamel, et al., "CallSurf: Automatic Transcription, Indexing and Structuration of Call Center Conversational Speech for Knowledge Extraction and Query by Content," in LREC, 2008.
[11] V. Pallotta, R. Delmonte, L. Vrieling, and D. Walker, "Interaction Mining: the new Frontier of Call Center Analytics," in DART@AI*IA, 2011.
[12] S. K. Kopparapu, Non-Linguistic Analysis of Call Center Conversations. Springer, 2015.
[13] ETSI, "EG 202 009-2 V1.2.1 (2007-01) User Group; Quality of telecom services."
[14] T. Gunarathne, B. Zhang, T.-L. Wu, and J. Qiu, "Scalable parallel computing on clouds using Twister4Azure iterative MapReduce," Future Generation Computer Systems, vol. 29, pp. 1035-1048, 2013.
[15] A. Moniruzzaman and S. A. Hossain, "Nosql database: New era of databases for big data analytics-classification, characteristics and comparison," arXiv preprint arXiv:1307.0191, 2013.
[16] N. Leavitt, "Will NoSQL databases live up to their promise?," Computer, vol. 43, pp. 12-14, 2010.
[17] A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, et al., "DeepSpeech: Scaling up end-to-end speech recognition," arXiv preprint arXiv:1412.5567, 2014.
