Audience Discovery and Targeted Marketing Using SAP HANA

Shashank Keshava
Dept of CS&E, RNSIT, Bangalore, Karnataka, India
[email protected]

Dr Kiran P
Dept of CS&E, RNSIT, Bangalore, Karnataka, India
[email protected]

Nithin S J
VASPP Technologies Pvt Ltd, Bangalore, Karnataka, India
[email protected]

Abstract: Budget management is receiving increasing attention, including in sales and marketing, where sales agents must spend their budget wisely while finding prospective clients. This paper presents an architecture based on SAP HANA, a powerful database platform that takes advantage of large main memory and extensive parallel processors. The architecture identifies prospective clients using a fuzzy item set approach and improves the accuracy of client identification with minimal budget expenditure.

Keywords: Audience Discovery; Targeted Marketing; Fuzzy Item Set Approach; LinkedIn; SAP HANA

978-1-4799-8047-5/15/$31.00 © 2015 IEEE

I. INTRODUCTION

With continuously changing technology and client expectations, businesses are strongly affected by competition. Their sustainability is at stake as they lose their most valuable clients and struggle to win new ones, so companies are under considerable pressure to improve their marketing strategies. Organizing marketing campaigns is a common strategy for attracting new clients and retaining existing ones. However, marketing representatives face constraints on time and budget. A campaign aimed at an audience that is not interested in the product wastes valuable time and crucial resources. This suggests that the general public should not all be given equal attention and effort.

Product and service companies provide their offerings to various clients and target groups, based on market studies, demographics, or contacts and networks. The reach of sales personnel is limited by various factors, including the tools they use for these activities. Sales agents usually have a large network of possible clients, and finding the right buyers for a product or service in that network is an overwhelming task. Once the right buyer is found, the challenge is to decide on the right marketing message and the right approach for reaching that persona, and no tool exists to date that does this.

With the revolution in social media and the digitization of transactional data, the volume of internet data continues to grow exponentially. Some 90% of all the data in the world has been generated over the last two years. Data is now woven into every sector and business function of the global economy and, like other essential factors of production such as hard assets and human capital, much of the modern economy could not function without it [10]. These huge volumes of client data can be used to analyze customer behavior and improve audience discovery for marketing purposes.

Audience discovery and targeted marketing is a low-cost, low-risk, profit-driven approach. Targeted marketing identifies clients and products with hidden market value by studying clients' characteristics and needs, and selects a group of clients to promote to. Recent advances in computing, particularly the Internet, present both unique challenges and favorable opportunities for targeted marketing [1, 5].

II. LITERATURE SURVEY

Due to budget restrictions, marketers contact only a limited percentage of the clients in a company's database. The fundamental goal of modeling client response in marketing is therefore to identify the clients who are most likely to respond. Analysts need models that predict the uppermost deciles of the data file accurately: a better model is one with a higher response rate in the top deciles or quantiles of a client file. More sophisticated models can also incorporate client demographics, psychographic variables, credit histories and purchase patterns [3, 5]. In recent years analysts have adopted various stochastic machine learning methods, including genetic algorithms, Bayesian networks and neural networks, to solve prediction problems [4, 5, 6]. Analysts in economics have used these methods for prediction problems such as mortgage default, bankruptcy forecasting and modeling client choice [5, 6]. Despite these advances, actual response rates for marketing campaigns are often low; for catalogue mailings, for example, the rate is about 5 percent [7].
The poor performance of marketing forecasting models is due to a number of problems in model building and in the evaluation process. In particular, because marketing budgets are limited, usually only the clients in the top two deciles, that is, above the 80th percentile (the clients most likely to respond), are selected to receive promotional material from organizations. This is the rationale and the essence of "targeted marketing". Ignoring this constraint during model development leads to poor performance of the forecasting models [3, 5]. Given these constraints and a generally limited marketing budget, a development process that targets the preferences of clients who yield only small profits is usually not practical. Hence the ultimate objective of target client selection is to identify the clients who are most likely to respond and to spend a large amount of money. Even a small improvement in predicting clients' purchase probability and profit can yield substantial cost savings and increase profits for marketers [5].

III. PROPOSED ARCHITECTURE

Fig. 1. Audience Discovery and Targeted Marketing Architecture

Figure 1 depicts the proposed architecture, which consists of four separate blocks.

A. Data Extraction using REST API

A sales agent relies on many marketing resources to find suitable clients. In this application we focus on finding clients based on the sales agent's LinkedIn connections. The REST API is the heart of all programmatic interaction with LinkedIn. Before our application can access LinkedIn member data or act on a member's behalf, it must be authenticated. LinkedIn relies on the industry-standard OAuth 2.0 protocol for granting access, due to its simplicity and ease of implementation [9].

B. SAP HANA In-Memory Computing Engine

SAP HANA is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. HANA's architecture is designed to handle both high transaction rates and complex query processing on the same platform [16]. SAP HANA makes full use of the capabilities of modern hardware to increase application performance, reduce cost of ownership, and enable new scenarios and applications that were previously not possible. With SAP HANA, we can build applications that integrate the business control logic and the database layer with very high performance [2]. Because all data is held in main memory, HANA minimizes data movement and avoids the performance penalty of disk I/O. All the data extracted through the LinkedIn REST API is loaded into the SAP HANA database, together with the list of products the sales agent wants to market [9].

Conventional database applications use well-defined interfaces (for example, ODBC and JDBC) to communicate with the database management system, which acts as a data source, usually over a network connection. The core of the SAP HANA database management system is the index server, which contains the actual data stores and the engines for processing the data [8]. The SAP HANA database has its own scripting language, SQLScript [2]. SQLScript is an extension of SQL that adds control-flow capabilities and lets developers define complex application logic inside database procedures.

We use Fuzzy Search to match products with prospective clients. Fuzzy Search is a fast, fault-tolerant search feature supported by SAP HANA. It works on both structured and unstructured data, which is the crucial functionality of this application and opens the door to a wide range of service offerings. Notable work based on fuzzy search includes proximity ranking, which improves the query efficiency of proximity-aware search by using early-termination techniques [11, 12]; the use of fuzzy search to improve the search coverage of unmanned aerial vehicles [13]; and, most notably, the analysis of complex genome sequences of DNA through fuzzy set theory and frequent item sets [14].

C. SAP HANA Extended Services (XS) Engine

SAP HANA greatly improves on conventional database server performance. It serves as a comprehensive platform for developing and running native data-intensive applications that execute efficiently in SAP HANA, taking advantage of its in-memory architecture and parallel processing capabilities [8]. SAP HANA Extended Application Services provides a comprehensive set of pre-integrated services that give end-to-end support for web-based applications. It consists of a lightweight web server, server-side JavaScript processing, custom OData-based services, and full access to SQL and SQLScript [8, 9]. SAP HANA XS applications are served by the SAP HANA XS server, which provides lightweight application services that are fully integrated into SAP HANA. It gives clients access to the SAP HANA system via HTTP. Such applications can run natively on SAP HANA, independently of any additional external application server.

2015 IEEE International Advance Computing Conference (IACC)

The application services can be used to expose the database data model, with its tables, views and database procedures, to clients. This can be done either declaratively, using OData-based services, or by writing native application-specific code that runs in the SAP HANA context. SAP HANA Extended Services can also be used to build dynamic HTML5 UI applications. In addition to exposing the data model, SAP HANA Extended Services hosts system services that are a core component of SAP HANA. The SAP HANA Extended Services server does not store any data itself. To read tables and views, to execute SQLScript database procedures and calculations, or to update data, it connects to the index server (or servers, in a distributed system) [8, 15].
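As an illustration of how a client might address such an OData service over HTTP, the following minimal Python sketch composes a request URL. The host, port, service path, entity set and field names are hypothetical and are not taken from the paper; only the standard OData query options ($filter, $top, $orderby, $format) are assumed.

```python
from urllib.parse import urlencode, quote

# Hypothetical host and service path; a real XS application defines its own OData service.
BASE_URL = "http://hana.example.com:8000/audience/services/clients.xsodata"


def build_odata_url(entity_set, filter_expr=None, top=None, orderby=None):
    """Compose an OData request URL for an entity set exposed over HTTP."""
    options = {"$format": "json"}
    if filter_expr:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = str(top)
    if orderby:
        options["$orderby"] = orderby
    # Keep '$' and common OData punctuation readable; other characters are percent-encoded.
    query = urlencode(options, quote_via=quote, safe="$'(),")
    return f"{BASE_URL}/{entity_set}?{query}"


# Hypothetical entity set and field: ten client profiles from the Research industry.
url = build_odata_url("ClientProfiles", filter_expr="Industry eq 'Research'", top=10)
print(url)
```

The sketch only builds the URL; sending the request and authenticating against the XS server are left out because they depend on the deployment.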

D. Client

On the client side, the application uses the SAPUI5 development toolkit. SAPUI5 is a client-side HTML rendering library whose runtime provides a rich set of standard and extension controls. It offers a lightweight programming model for both desktop and mobile applications. Built on JavaScript, it supports Rich Internet Applications with client-side features. SAPUI5 complies with OpenAjax and can be used together with standard JavaScript libraries [15]. SAPUI5 comes with a large set of libraries that developers can use and build on:

• It uses the open-source jQuery library as its foundation.
• It supports CSS3, which allows themes to be adapted to a company's branding.
• It is based on a flexible extensibility concept for custom controls [15].

Based on the results obtained by the Fuzzy Search engine, the application can categorize interest areas and find the right product match for each potential customer. The sales agent can then approach the prospective customer directly by sending an email inviting them to discuss the product. This application introduces a new class of solutions that can power the next generation of business applications.

IV. FUZZY ITEMSET PATTERN DETECTION

A substantial portion of the data available today is semi-structured or unstructured, and much of it is text. In the corporate world, data quality is a major issue. In this application we use SAP HANA to make sense of the large volume of unstructured data available. HANA's full-text search capabilities significantly speed up searches within large amounts of text. Fuzzy Search finds strings that match a pattern approximately: it can find substrings within a given string that approximately match the pattern, and it can find dictionary strings that approximately match the pattern. SAP HANA's fuzzy search is fast and fault tolerant. If the search phrase contains invalid or missing characters, or a number of spelling errors, the HANA database still returns matching records [17]. Fuzzy search can be used in the following scenarios:

• Fault-tolerant search in database columns: a search for a product named 'sun dried towmatoe' finds 'sun dried tomatoes'.
• Fault-tolerant search in text columns (such as PDF and HTML sources): a search for documents on 'DriPhenChlor' returns the documents that contain the term 'TriPhenChlor'.
• Fault-tolerant check for duplicate records: before a new client record is inserted in a customer relationship management system, a search for similar existing client records verifies that the client is unique. When a new record 'SAB Aktiengesellschaft & Co KG Deutschl.' in 'Wahldorf' is inserted, the system identifies 'SAP Deutschland AG & Co. KG' in 'Walldorf' as a potential duplicate [8, 17].
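The first two scenarios can be illustrated outside the database with a minimal Python sketch. It uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure, which is an assumption made for illustration and is not the algorithm SAP HANA uses; the helper names, candidate lists and the 0.6 threshold are likewise hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the strings are identical (ignoring case)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(query, candidates, threshold=0.6):
    """Return (candidate, score) for the closest candidate, or None if it is below threshold."""
    scored = [(c, similarity(query, c)) for c in candidates]
    candidate, score = max(scored, key=lambda pair: pair[1])
    return (candidate, score) if score >= threshold else None


# Scenario 1: a misspelled product name still finds the intended product.
products = ["sun dried tomatoes", "sunflower oil", "dried apricots"]
print(best_match("sun dried towmatoe", products))

# Scenario 2: a misspelled term still finds the intended document term.
terms = ["TriPhenChlor", "Triethanolamine"]
print(best_match("DriPhenChlor", terms))
```

A query that resembles none of the candidates falls below the threshold and returns None, which mirrors the way a fuzziness threshold suppresses weak matches.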

Fig. 2. Full Text Search Architecture

Figure 2 depicts the Full Text Search architecture, which supports linguistic processing and other text searches such as fuzzy search. To implement Fuzzy Search we use the Python-based scripts provided by SAP HANA, which can extract entities such as persons, products and places and so enrich the set of structured information; this in turn enables improved analytics and search. A fuzzy search is an alternative to a non-fault-tolerant SQL statement such as Example 1, which returns no results if the search term contains spelling mistakes.

Ex 1: SELECT * FROM DOCUMENT WHERE DOC_CONTENT LIKE '%DriPhenChlor%'

For a fault-tolerant SQL statement, we can perform a full-text search by using the CONTAINS() predicate in the WHERE clause of a SELECT query together with the FUZZY() option. The fuzziness threshold can be set manually in the FUZZY() call.


Ex 2: SELECT SCORE() AS score, * FROM DOCUMENT WHERE CONTAINS(doc_content, 'DriPhenChlor', FUZZY(0.6)) ORDER BY score DESC;
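The shape of the query in Example 2 (score every row, keep rows above a threshold, return the best matches first) can be tried without a HANA instance. The sketch below is an analogy only: it uses SQLite with a user-defined Python function, and the fuzzy_score function, the table contents and the word-level difflib comparison are all assumptions made for illustration, not SAP HANA's SCORE() or FUZZY() implementation.

```python
import sqlite3
from difflib import SequenceMatcher


def fuzzy_score(text, term):
    """Best similarity between the search term and any word of the text (0.0 to 1.0)."""
    term = term.lower()
    return max(
        (SequenceMatcher(None, word, term).ratio() for word in text.lower().split()),
        default=0.0,
    )


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it (a stand-in for SCORE() with FUZZY()).
conn.create_function("fuzzy_score", 2, fuzzy_score)
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, doc_content TEXT)")
conn.executemany(
    "INSERT INTO document (doc_content) VALUES (?)",
    [
        ("Safety sheet for TriPhenChlor",),
        ("TriPhenChlor storage guidelines",),
        ("Quarterly sales report",),
    ],
)

rows = conn.execute(
    """
    SELECT fuzzy_score(doc_content, :term) AS score, id, doc_content
    FROM document
    WHERE fuzzy_score(doc_content, :term) >= :threshold
    ORDER BY score DESC
    """,
    {"term": "DriPhenChlor", "threshold": 0.6},
).fetchall()

for score, doc_id, content in rows:
    print(f"{score:.2f}  {doc_id}  {content}")
```

Unlike the LIKE pattern in Example 1, the misspelled term still retrieves the two documents that mention 'TriPhenChlor', while the unrelated document is filtered out by the threshold.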

The fuzzy search in Example 2 computes a fuzzy score for every string match. The higher the score, the more similar the strings: a score of 1.0 means the strings are identical, and a score of 0.0 means they are different [8]. The query result can be sorted in ascending or descending order of the SCORE() value. In descending order the best matches come first, so the leading record is the one most similar to the user input. In Example 2 the score threshold is set to 0.6 [8, 17].

V. EXPERIMENTAL RESULTS

In this section we evaluate our Audience Discovery and Targeted Marketing application with fuzzy item set pattern detection. We describe the system settings, the data sets used in the analysis, and the processing and response times for different workloads.

A. System Settings

All experiments were conducted on an m2.4xlarge instance of Amazon Web Services running SUSE Linux Enterprise Server, equipped with 68.4 GB of RAM, an Intel Xeon E5-2670 (Sandy Bridge) processor running at 2.6 GHz and 2 x 840 GB of HDD, configured as a high-I/O instance with over 100,000 IOPS. A 2.4 GB/s network was used for the interconnect.

B. Data Sets Used

Our data sets were extracted from the LinkedIn accounts of sales agents using the LinkedIn Developer APIs [9]. The overall data set contains over 2,000 clients and 50 unique business products.

C. Processing and Response Time

The workload consists of typical analytic queries issued by the application back end when the client calls a function in the front end. We record the average server processing time and the response time to fetch data while the workloads execute.

The HANA database is loaded with client profile details by calling the LinkedIn developer APIs [9]. The database contains over 2,000 clients related to the sales agent. Table I shows the average server processing time and response time to fetch client profile details from the HANA database and display them in the browser.

TABLE I. Processing and Response Time for Fetching Client Profiles

Client Profiles    Server Processing Time (ms)    Response Time (ms)
50                 2.5496                         2.7049
100                3.09                           3.2436
500                3.386                          3.8873
1000               4.761                          5.743
1500               5.349                          6.487
2000               6.036                          7.427

TABLE II. Processing and Response Time for Suggesting Clients for Products Based on the Fuzzy Search Function

No. of Clients Suggested Based on Products    Server Processing Time (ms)    Response Time (ms)
5                                             9.141                          9.179
27                                            9.693                          9.820
48                                            9.703                          10.669
66                                            9.727                          10.706
173                                           10.962                         11.231
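The figures reported for Table II can be cross-checked with a few lines of arithmetic. The Python sketch below transcribes the per-domain client counts and the Table II timings from the paper. Two points are assumptions: the profile count is taken as exactly 2,000 although the paper says "over 2,000", so the share is approximate, and the difference between response time and server processing time is read as time spent outside server-side processing.

```python
# Per-domain counts of suggested clients, as reported for Table II.
suggested = {
    "Pharmaceuticals": 5,
    "Semiconductors": 27,
    "Research": 48,
    "Education management": 66,
    "Electrical and Electronics manufacturing": 173,
}

# (server processing time in ms, response time in ms) for each Table II row,
# keyed by the number of clients suggested.
timings = {
    5: (9.141, 9.179),
    27: (9.693, 9.820),
    48: (9.703, 10.669),
    66: (9.727, 10.706),
    173: (10.962, 11.231),
}

PROFILES = 2000  # assumption: the paper states "over 2,000" profiles

total = sum(suggested.values())
share = 100 * total / PROFILES
print(f"total shortlisted: {total}")
print(f"approximate share of profiles: {share:.2f}%")

for domain, count in suggested.items():
    processing, response = timings[count]
    # Assumed interpretation: the gap is time spent outside server-side processing.
    gap = response - processing
    print(f"{domain:42s} {count:4d} clients  gap {gap:.3f} ms")
```

The total reproduces the 319 shortlisted clients stated in the paper, which corresponds to roughly 16% of the analyzed profiles under the stated assumption.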

Table II shows the average server processing time and response time for suggesting prospective clients for products using the fuzzy set approach. Products in the Pharmaceuticals domain yield 5 prospective clients, products in the Semiconductors domain yield 27, products in the Research domain yield 48, products in the Education Management domain yield 66, and products in the Electrical and Electronics Manufacturing domain yield 173. In total, 319 prospective clients were shortlisted after analyzing over 2,000 unique client profiles.

VI. CONCLUSION AND FUTURE WORK

Our goal was to address the sales agent's marketing budget constraints and to identify prospective clients with a low-cost, low-risk and profit-driven approach. We used LinkedIn data to identify prospective clients and market products, using the computing power offered by the SAP HANA framework. SAP HANA makes possible a new class of state-of-the-art solutions that can power a new breed of business applications. It overcomes traditional database designs that have severely restricted the ability of business applications to evolve and support real-time business, and it provides real-time OLAP insight on OLTP data structures. This meets the requirements of present-day real-time business analytics by enabling business applications that were previously neither practical nor profitable.

In our approach, SAP HANA and the tooling around it let us analyze the persona in question. We collect data from various sources and use the text analysis capability in HANA to categorize the persona's areas of interest. Based on these interest categories we can collect and analyze data from internet sources (market trends, blogs, news, etc.) and internal sources (marketing catalogues, product blogs, etc.) and put together a personalized message, as an email or as talking points, for approaching potential clients.


Audience Discovery & Targeting powered by HANA draws on highly detailed data from internal enterprise resource planning, customer relationship management and business warehouse systems, as well as independent sources, to give a true 360° view of the targeted audience, with high performance from design through development. An intuitive user interface lets business users segment any type of data by themselves. The gap between operational campaign management and analytical insight is bridged by leveraging HANA's text analytics and predictive suggestions.

Audience Discovery & Targeting powered by SAP HANA has the following value-driving capabilities:

• Increasing profitability by investing marketing funds intelligently.
• Increasing gains by improving influence, response, conversion rates and client experience.
• Transforming organizational performance by adapting swiftly to changes in market dynamics.

Several directions are worth exploring: identifying clients and increasing prediction accuracy by making use of the clients' Facebook and Twitter feeds, and providing a more customized approach for the sales agent to interact with clients and market the product.

VII. ACKNOWLEDGEMENT

The author wishes to thank the Department of Computer Science and Engineering, RNS Institute of Technology, Bangalore. Special thanks go to my internal guide, Dr. Kiran P; I am fully indebted to him for his understanding, wisdom, patience, encouragement and enthusiasm, and for pushing me farther than I thought I could go. I am very grateful to all the employees and associates of VASPP Technologies Private Limited for giving me the honor of carrying out my project. Special thanks also go to my external guide, Nithin S J, for providing valuable resources and suggestions throughout the project and for supporting the success of this paper.

REFERENCES

[1] Jiajin Huang, Ning Zhong, Chunnian Liu, Y. Y. Yao, Dejun Qiu, Chuangxin Ou, "TMS: Targeted Marketing System Based on Market Value Functions," 2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI'04), 2004, pp. 775-776.
[2] Srikantarao S and Radhika K R, "Overview of SAP HANA: In-Memory Computing Technology and Its Applications," International Journal of Innovative Technology and Exploring Engineering (IJITEE), Vol. 3, No. 5, June 2014, pp. 136-139.
[3] S. Bhattacharyya, "Direct Marketing Performance Modeling Using Genetic Algorithms," INFORMS J. on Computing, Vol. 11, No. 3, 1999, pp. 248-257.
[4] G. Cui, M. L. Wong and H.-K. Lui, "Machine Learning for Direct Marketing Response Models," Management Science, Vol. 52, No. 4, 2006, pp. 597-612.
[5] Geng Cui, Man Leung Wong, Xiang Wan, "Constrained Optimization with Genetic Algorithm: Improving Profitability of Targeted Marketing," 2010 Fourth International Conference on Management of E-Commerce and E-Government (ICMeCG 2010), 2010, pp. 26-30, doi:10.1109/ICMeCG.2010.14.
[6] J. Zahavi and N. Levin, "Applying neural computing to target marketing," Journal of Direct Marketing, Vol. 11, No. 4, 1997, pp. 76-93.
[7] J. Bult and T. Wansbeek, "Optimal selection for direct mail," Marketing Science, Vol. 14, No. 4, 1995, pp. 378-394.
[8] "Introduction to SAP HANA Development," SAP HANA Developer Guide, SPS08, Version 1.1, SAP SE, 2014, pp. 11-20.
[9] "Getting Started: REST API," LinkedIn Developer Network, https://developer.linkedin.com/docs/rest-api.
[10] T. McGuire, J. Manyika and M. Chui, "Why Big Data Is the New Competitive Advantage," https://www.questia.com/read/1P3-2760525251/whybig-data-is-the-new-competitive-advantage.
[11] R. Schenkel, A. Broschart, S. won Hwang, M. Theobald and G. Weikum, "Efficient text proximity search," in SPIRE, 2007, pp. 287-299.
[12] H. Yan, S. Shi, F. Zhang, T. Suel and J.-R. Wen, "Efficient term proximity search with term-pair indexes," in CIKM, 2010, pp. 1229-1238.
[13] Partha Sarathi Pal, Abhik Mukherjee, Subhamay Das, S. K. Chaudhuri, "A Fuzzy Based Strategy for Improved Search Coverage of an Airborne Seeker," International Conference on Emerging Applications of Information Technology (EAIT), 2011, pp. 305-308, doi:10.1109/EAIT.2011.59.
[14] R. Garza-Dominguez, E. Bautista-Thompson, A. Quiroz-Gutierrez, "Analysis of DNA-Dimer Frequency in Retroviral Genomes with a Fuzzy Item sets Pattern Induction Strategy," International Conference on Computing (CIC), 2006, pp. 197-202, doi:10.1109/CIC.2006.24.
[15] "SAPUI5 Developer Guide," SAPUI5 Developer Guide for SAP HANA, SPS08, Version 1.0, SAP SE, 2014.
[16] "SAP HANA," Wikipedia, the Free Encyclopedia. Accessed April 6, 2015. http://en.wikipedia.org/wiki/SAP_HANA.
[17] "What is Fuzzy Search?" SAP HANA Fuzzy Search Reference, SPS08, Version 1.1, SAP SE, 2014.
