June 1, 2010

9:48:59am

WSPC/188-JIKM

FA2

00254

Journal of Information & Knowledge Management, Vol. 9, No. 2 (2010) 171–181 © World Scientific Publishing Co. DOI: 10.1142/S0219649210002541

Business Intelligence Applications in Retail Business: OLAP, Data Mining & Reporting Services*

Ipek Deveci Kocakoç†,§ and Sabri Erdem‡,¶

†Dokuz Eylül University, Faculty of Economics and Administrative Sciences, Dokuzçeşmeler, Buca, 35160, Izmir, Turkey
‡Dokuz Eylül University, Business Faculty, Tınaztepe Yerleşkesi, Buca, 35160, Izmir, Turkey
§[email protected]
¶[email protected]

Abstract. In today's competitive business environment, companies are trying to use the funds budgeted for information technology investments more effectively. Business intelligence methods let these companies retrieve more information with the same set of resources. According to Rubin (Chabrow, 2004), IT budgets are not simply declining or levelling off; rather, companies are shifting from a pure cost-cutting mode to a model that emphasises agility and efficiency. The tremendous daily growth of company data requires more funds and investment to establish the technologies and infrastructure necessary for quickly gathering the crucial information that supports the decision-making process. This necessity gave birth to various business intelligence methods, which mainly aim to process the mass of data collected from existing applications and to represent it in a way that companies can apply to their daily competitive decisions. This application primarily concerns the implementation of business intelligence for a retail business company. The aim is to implement the built-in business intelligence solutions of the Microsoft SQL Server that holds the commercial information of the company for the past three years. The customer company has already been using Microsoft products. The key items used for analysing data are sales, momentary inventory and logistics information. The application can be grouped into five main areas: building the data warehouse, constructing OLAP cubes, applying data mining algorithms on OLAP cubes, representing the results in reports with Reporting Services, and implementation.

Keywords: Retail; business intelligence; OLAP; data mining; reporting services.

* This study is a part of a project (No: 200878) which is supported by Dokuz Eylül University Scientific Research Fund.

1. Introduction

In today's highly competitive business environment in the retail industry, where an immediate response to market needs and changes is crucial, gathering and storing data regularly has been important for the last decade, and serious research has been carried out in this area. The main requirement in these sectors is extracting usable information out of the gathered data. Reaching that goal requires lots of resources and takes a very long time unless business intelligence is used. Moreover, the information that comes from the personnel responsible for gathering data may be incorrect, which may lead to unwanted results. By 2010, enlightened IT and business intelligence leaders will continue to take a more business-focused approach to their BI investments by deploying new, targeted solutions that deliver the right capabilities to meet the needs of specific user groups (Chitkara, 2010). BI lets business functions spend 80% of their time analysing data instead of collecting it (Eckerson, 2010). Managers can directly access the information automatically generated by the system and manipulate it in order to make the necessary decisions.

In this project, the aim is to give the company, which operates in the highly transactional retail business, the ability to make predictions and reach the necessary information instantly from gathered data by means of business intelligence. Since the number of transactions is enormous, the gathered data increases in size correspondingly. In such a case, summarising this data, extracting information from it and predicting the future play a vital role in making sound decisions.

The customer company is a medium-sized enterprise and has been one of the market leaders in denim products in the textile sector for many years. It has 100 stores and 300 sales points. Demand in the textile market is highly flexible and seasonal, product life cycles are very short due to frequent model changes, products are delivered to hundreds of locations all around the country, delivery


times are quite short, hundreds of different models are on sale at any moment and transactions are executed in real time. The company has historical data for two years and approximately 10 million sales records from this period. In such a market, all data must be summarised, reported and interpreted in a short amount of time for the company to survive among its competitors. In sum, what is needed to manage information in this market is Business Intelligence tools.

Business Intelligence tools are widely used in the retail industry all over the world. Before we began to apply our solution, we investigated examples that could guide us. Granebring and Revay (2007) give ideas and suggestions on determining the common needs of the retail industry, such as supporting promotional activities with personalised customer offers and forming a strategy to introduce daily decision support in the retail trade. Goddard, an IT strategic development manager at Tesco, explained their challenges and results in a case study (http://www.businessobjects.com/company/customers/spotlight/tesco.asp). Tesco is one of the top three international retailers in the world, with around 2,000 stores and 326,000 people in 13 countries across Europe and Asia. They decided to use a web-based business intelligence product which delivers the speed, functionality and flexibility required to deliver both standard and ad hoc reporting in a timely fashion to their clients across the world. Many applications have shown that accurate forecasts of consumer retail sales can help improve retail supply chain operations, especially for larger retailers which have a significant market share.

OLAP is another widely preferred solution all over the world. A case study by WinMetrics (http://www.winmetrics.com/olap casestudies.html) reports that a financial services company developed performance reports for clients and saved $200K in nine months.
The company had preferred a Microsoft solution because it already had licenses and was familiar with the Microsoft environment. They said that "OLAP looked like the answer because it pre-computes numeric aggregations for the cross-product of all relevant dimensions so that summary information for any combination of dimensions can be displayed on demand."

In our application, Microsoft SQL Server Business Intelligence¹ tools are used for the customer company, which already uses MSSQL Server infrastructure and has three years of data, to solve the issues listed above. The company aims to get year-to-year comparative reports to follow up sales and inventories by region, and wants to decide on the next year's production and determine the need for new stores or the existence of excess stores. The first step is to create a data warehouse for each year from OLE DB to collect the data needed specifically for business intelligence applications. Then OLAP (Online Analytical Processing) cubes, which are queried and periodically loaded from the data warehouse, are constructed. Data mining is the further step, which analyses the data and finds hidden patterns by applying classification, association and forecasting algorithms. Finally, the specified data is reported with the Reporting Services tool of MSSQL Server to provide users with meaningful data in an organised way.

¹ Microsoft SQL Server Business Intelligence is a registered trademark of Microsoft Corporation.

2. Phases of Application

2.1. Determining business requirements

Determining the business requirements is the most important phase of a BI (Business Intelligence) project. Any lack of information can seriously damage the success of the project. To find out the requirements, we interviewed all key users of the system and inspected the documents and tables that they had already been using. The key business objectives expected as outputs of this BI application were found to be as follows:

• Sales performance and required inventory tracking: Sales and inventory information is required by region and product for each year and month. The business needs can be detailed as follows:
(i) To examine the sales and inventory status of each product for each year and a specific season;
(ii) To analyse the growth rates and sales performance of each region and each store monthly;
(iii) To determine the total sales and the amount of required inventory for each gender.

• Profitability: Profitability analysis is also required to track actual sales against target amounts on a monthly basis for each sales region and store. Determining the shelf life of the products in order to decide on the discounts offered to customers is another concern for the company.

Therefore, the company's business requirements are determined as follows:
(i) Time-based sales quantity and inventory reports (detailed by region, city, branch and product)
(ii) Weekly, monthly and annual comparative tables
(iii) Highest and lowest sales, salesmen reports
(iv) Profit and revenue reports
(v) Prediction of next month's sales


(vi) Determining the products most often sold together, for use in promotions.

A data warehouse is then built based on these requirements.

2.2 System architecture

The model we developed can be examined in three layers (see Figs. 1 and 2). In the Data Layer, the data necessary for BI applications are filtered and transferred to the data warehouse using ETL (Extract, Transform and Load) packages. In the BI Layer, OLAP cubes and data mining models are created to feed the reports. In the Interface Layer, end-user reports are presented on the web with Reporting Services.
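The Data Layer step can be illustrated with a minimal extract-transform-load sketch. It is a toy under stated assumptions: Python's built-in sqlite3 module stands in for the SQL Server source and warehouse, and the table and column names (sales_tx, dw_sales, cancelled) are hypothetical and not taken from the project.

```python
import sqlite3


def run_etl(source, warehouse):
    """Copy only the rows needed for BI from the operational DB to the warehouse."""
    # Extract: read raw transactions, filtering out cancelled sales.
    rows = source.execute(
        "SELECT sale_date, store, product, qty, unit_price "
        "FROM sales_tx WHERE cancelled = 0"
    ).fetchall()

    # Transform: derive a month key, normalise store codes, compute the amount.
    transformed = [
        (sale_date[:7], store.strip().upper(), product, qty, qty * unit_price)
        for sale_date, store, product, qty, unit_price in rows
    ]

    # Load: append the cleaned rows to the warehouse table.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS dw_sales "
        "(month TEXT, store TEXT, product TEXT, qty INTEGER, amount REAL)"
    )
    warehouse.executemany("INSERT INTO dw_sales VALUES (?, ?, ?, ?, ?)", transformed)
    warehouse.commit()
    return len(transformed)


def demo():
    """Build a tiny in-memory source, run the ETL, return (rows loaded, total amount)."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE sales_tx (sale_date TEXT, store TEXT, product TEXT, "
        "qty INTEGER, unit_price REAL, cancelled INTEGER)"
    )
    source.executemany(
        "INSERT INTO sales_tx VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("2008-06-01", " izmir-01 ", "Pants", 2, 50.0, 0),
            ("2008-06-02", "izmir-01", "T-shirt", 1, 20.0, 0),
            ("2008-06-02", "izmir-01", "T-shirt", 1, 20.0, 1),  # cancelled sale
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    loaded = run_etl(source, warehouse)
    total = warehouse.execute("SELECT SUM(amount) FROM dw_sales").fetchone()[0]
    return loaded, total
```

Keeping the filter in the extract step and the derived columns in the transform step mirrors the layer description: only analysis-ready rows reach the warehouse, so reports never touch the operational tables.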

2.3 Building the data warehouse

It is difficult for end-users to access the operational information in the database. There are hundreds of tables spread across the database, which stores terabytes of data, and it requires expert knowledge to answer a business question. Building only new reports is not adequate because it takes too much time to get results and it is not cost efficient. SQL Server 2005 Analysis Services provides wizards and designers for all major objects. Wizards are used to initially create any object, and in Analysis Services 2005 they are flexible enough that in many cases you do not need to further refine your objects afterwards (MacLennan, 2004).

At this point, as the first step before applying the BI application, we need to construct our data warehouse based on the requirements above. In this context, the data warehouse application takes place as depicted in Fig. 3. A typical data warehouse is a collection of databases, including both active and archive ones, on which the analysis and reporting services run (Jacobson, Misner and Hitachi Consulting, 2006). The most important aspect of business intelligence is that it uses a database separate from the actual one on which the OLE DB transactions are performed.

Before designing the data warehouse, the necessary analyses were made with the managers and users. Based on these analyses, it was found that grouping and reporting by inventory properties, customer properties, time, region, branch and sales person are necessary. All customer and inventory information to be analysed was placed in the data warehouse. Current account information from the CRM database (profession, age, sex, marital status), report codes and the inventory properties (color, size, season) that create the functional groupings are also available. The retail stores are grouped by geographical location, manager, size and type. There are also performance outputs for the sales persons of each store. The data warehouse structure can be seen in Appendix A.

Fig. 1. Project stages.

Fig. 2. A general view of the project structure.

Fig. 3. Data warehouse in business intelligence application. Source: Hancock and Toren, 2006.

2.4 Designing OLAP

Online Analytical Processing (OLAP) is a multidimensional solution architecture that allows the most valuable data to be gathered from relational databases for querying and reporting purposes according to business needs. Langit (2007) points out that the aim in designing OLAP cubes is to reorganise data according to business requirements so that information can be accessed faster. In retail business, there are huge numbers of products, with many kinds of variants, and of sales transactions. Summarising meaningful data on time is very useful for sales forecasting and production planning. Thus, OLAP is the best choice for reporting under these circumstances.
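The idea of reorganising data so that summaries are available quickly can be sketched by pre-computing totals for every combination of dimensions. The following is an illustrative toy in Python and not a description of how Analysis Services stores cubes; the dimension names and sales figures are made up.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("year", "district", "product_group")


def build_cube(facts):
    """Pre-aggregate the quantity for every subset of the dimensions."""
    cube = defaultdict(int)
    for fact in facts:
        for size in range(len(DIMENSIONS) + 1):
            # combinations() keeps the order of DIMENSIONS, so keys are canonical.
            for dims in combinations(DIMENSIONS, size):
                key = tuple((d, fact[d]) for d in dims)
                cube[key] += fact["qty"]
    return dict(cube)


def lookup(cube, **selection):
    """Return the pre-computed total for the selected dimension values."""
    key = tuple((d, selection[d]) for d in DIMENSIONS if d in selection)
    return cube.get(key, 0)


# Made-up fact rows: one row per (year, district, product group) sale total.
FACTS = [
    {"year": 2007, "district": "Aegean", "product_group": "Pants", "qty": 10},
    {"year": 2007, "district": "Marmara", "product_group": "Pants", "qty": 7},
    {"year": 2008, "district": "Aegean", "product_group": "T-shirt", "qty": 5},
    {"year": 2008, "district": "Aegean", "product_group": "Pants", "qty": 4},
]
CUBE = build_cube(FACTS)
```

With the cube built once, `lookup(CUBE, year=2008)` or `lookup(CUBE, district="Aegean", product_group="Pants")` is a dictionary access rather than a scan of the fact rows, which is the trade of storage for query speed that the section describes.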

The OLAP database extends relational database capability with the following benefits:
(i) Users can reach summarised data quickly;
(ii) Data can be stored in a hierarchical way via cubes;
(iii) It has specific areas for numerical measures.

OLAP components can be illustrated as shown in Figs. 4 and 5. Figure 5 shows the Analysis Services view of the cube in Fig. 4. Any dimension can be expanded to see the aggregated values for a specific selection by clicking the plus sign at each main group. The columns are the year, quarter and month dimensions, respectively. The rows include two other dimensions: the district dimension and the product group dimension. For example, clicking the "+" sign of the second quarter drills down to the sales quantities of product groups for each district. To see the sales quantities for a specific district, click the "+" sign for that one. The graphic also shows the totals for each dimension of the cube.

Fig. 4. Three-dimensional cube: product, month and variables.

Fig. 5. The analysis services view of the cube in Fig. 4.

Cube: A cube stores business dimensions and aggregated facts in a form that makes it easy for users to analyse the data by viewing the aggregated values where dimensions and facts intersect. Unlike a two-dimensional table, cubes are multidimensional. The cube is then accessed by users for reporting purposes (Kimball and Ross, 2002). Our cube contains the following objects. Details of the objects can be seen in Appendix A.

Measures: A measure is a quantitative value that you generally aggregate and analyse. Measures can be values in the underlying fact tables of the cube, or you can define calculated measures (Thomsen, 2002). In our project, the amount of sales and the total sales income are measures.

Fact Tables: These are tables in the data warehouse which include detailed data for measures. A fact table contains measures and foreign key fields which are related to dimension tables. The fact tables may contain the number of sales, the amount of discount, returns, existing inventories, shelf life, cover, etc., which are usually used in analysis reports in retail business (Kimball and Ross, 2002). In our application, there are two fact tables, sales and inventory, which can be seen in detail in Fig. 3.

Dimension: Dimensions form the contexts for the facts and provide the axes of the cubes in the OLAP solution. Dimension tables in the data warehouse consist of the dimension members, such as time. Our analysis is based on


dimension tables that are connected to measures in fact tables. For example, a product group, a sales region, etc., are dimensions, whereas the number of products sold, the total sales amount for a region, etc., are measures, and they are summed with respect to the dimensions defined in the table, as in Wrembel and Koncilia's study (2006). The dimensions in this study are branch, customer, inventory, returns, promotion and variants. Each record in a dimension table has a unique field to which a foreign key in the fact table refers; this establishes the relation between the dimension and fact tables.
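The relation between fact and dimension tables described above can be sketched as a small star schema. This is a minimal illustration using Python's built-in sqlite3 module rather than SQL Server; the table names (dim_branch, dim_product, fact_sales) and the data are hypothetical and do not reproduce the schema in Appendix A.

```python
import sqlite3


def build_star_schema():
    """Create two dimension tables and one fact table that references them."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE dim_branch  (branch_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_group TEXT);
        CREATE TABLE fact_sales (
            branch_id  INTEGER REFERENCES dim_branch(branch_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            qty        INTEGER,   -- measure: number of products sold
            amount     REAL       -- measure: sales amount
        );
        INSERT INTO dim_branch  VALUES (1, 'Aegean'), (2, 'Marmara');
        INSERT INTO dim_product VALUES (10, 'Pants'), (11, 'T-shirt');
        INSERT INTO fact_sales  VALUES
            (1, 10, 2, 100.0), (1, 11, 3, 60.0), (2, 10, 1, 50.0);
        """
    )
    return db


def sales_by_region(db):
    """Sum the measures of the fact table along the region dimension."""
    return db.execute(
        """
        SELECT b.region, SUM(f.qty), SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_branch AS b ON b.branch_id = f.branch_id
        GROUP BY b.region
        ORDER BY b.region
        """
    ).fetchall()
```

The fact table holds only keys and measures, so changing the grouping (for instance to product group) means joining a different dimension table, not restructuring the facts.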

2.5 Data mining

Data mining is a method of looking for new, valuable and noticeable information in large volumes of data. It is a combination of human and computer effort (Tang and MacLennan, 2005). The best results are achieved when human expertise in describing problems and the search capabilities of a computer come together. When data mining algorithms are applied, it is mostly the data that drives the analyst. It is not possible to catch the same strategic correlations and clusters with human perception or SQL queries. Finding hidden patterns in the data and understanding the cause-and-effect structure within the business processes can bring substantial benefits.

In this application, the Analysis Services Data Mining component is utilised. Data mining solutions are created by means of SQL Server Business Intelligence Development Studio. Data mining was applied on OLAP cubes. SQL Server Reporting Services was also used with the data mining models to process data.

In our application, the following questions were examined with specific data mining methods:
(i) What is the expected income for the next month? What are the expected profits for each product for the next month?
(ii) Which products are mostly sold together?
(iii) What are the customer purchase characteristics for a specific region and time period?
(iv) How can we classify our customers?

For each question, the most appropriate data and the most accurate algorithm should be chosen. Analysis Services provides nine algorithms. In our application, we used the following algorithms:

Microsoft Time Series Algorithm: This was used for predicting future values based on historical data. Production and investment plans can be based on the data obtained from the algorithm.
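Predicting a future value from historical data can be illustrated with the simplest autoregressive model, in which next month's value is a linear function of the current month's. The sketch below is a deliberately swapped-in technique, an AR(1) model fitted by ordinary least squares in plain Python; it is not the Microsoft Time Series algorithm, and the monthly figures are made up.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]; returns (a, b)."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    x, y = series[:-1], series[1:]          # lagged values and their successors
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("lagged values are constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def forecast_next(series):
    """One-step-ahead forecast from the fitted AR(1) model."""
    a, b = fit_ar1(series)
    return a + b * series[-1]


MONTHLY_SALES = [100, 110, 120, 130]  # hypothetical monthly sales income
```

A production forecast for retail data would also need to handle the seasonality mentioned in the introduction, which a single-lag model like this one ignores.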

Microsoft Association Algorithm: This algorithm identifies rules to predict a customer's likely future purchases based on items that already exist in the customer's basket. We applied the algorithm to determine the product groups most commonly sold together, as shown in Fig. 6. Finding the products most commonly sold together in the dataset is called market basket analysis. When the mining model is processed, it first looks for the most commonly sold item sets in the dataset and their sizes, where size means the number of items in a set. Then the model calculates the probability of each item set and orders the item sets according to their importance. Figure 6 shows the results of a two-item market basket analysis. If a customer buys "Bermuda" and "Pants" products, he is likely to buy a "T-shirt" product with them. The customer company required the results of this analysis for use during promotion periods.
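The two steps described above, finding frequent item sets and then scoring a rule, can be sketched in a few lines of Python. This is a generic support and confidence calculation, assumed here for illustration; the importance score that Analysis Services reports is computed differently, and the baskets are invented.

```python
from itertools import combinations


def frequent_itemsets(baskets, size, min_support):
    """Return {itemset: support} for item sets of the given size."""
    counts = {}
    for basket in baskets:
        # Sorting makes each item set a canonical, order-independent tuple.
        for itemset in combinations(sorted(set(basket)), size):
            counts[itemset] = counts.get(itemset, 0) + 1
    n = len(baskets)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    antecedent = set(antecedent)
    with_antecedent = [b for b in baskets if antecedent <= set(b)]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for b in with_antecedent if consequent in b)
    return hits / len(with_antecedent)


# Hypothetical baskets, one list of product groups per sales receipt.
BASKETS = [
    ["Bermuda", "Pants", "T-shirt"],
    ["Bermuda", "Pants", "T-shirt"],
    ["Bermuda", "Pants"],
    ["Pants", "Belt"],
]
```

In this toy data the pair ("Bermuda", "Pants") appears in three of four baskets, and two of those three also contain a T-shirt, which is the kind of rule the promotion planning relies on.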

2.6 Representing reports by reporting services

Presenting reports from MSSQL Analysis Services on user-friendly screens is as important as the other parts of the system. No matter how good the infrastructure of a project is, many IT projects have failed because they were not easy to use. In this project, we planned to present the results of data mining and OLAP cubes over a web portal. The information in Analysis Services, generated by SQL Reporting Services, can be presented with the level of detail and visual design that the customer requested. The reports, which are accessible based on user permissions and content filtering, are also designed to allow filtering and constraints to be applied via GUIs (Turley et al., 2006). With the help of these constraint GUIs, the user can filter the information in the resulting report. Generated reports can also be accessed from any web or desktop application, as well as from mobile devices once mobile device views have been generated. This enables the customer to reach information from anywhere, at any time, quickly and accurately.

The instantaneous Online Transaction Processing (OLTP) reports requested from the database that stores the actual transactions are also added to the reporting services index. These reports are categorised under a special label and secured by user rights, enabling access by authorised personnel only. Since the reliability of the application system can be severely affected, it was necessary to restrict access to some groups of users. Figures 7 and 8 demonstrate the same report generated via reporting services in two different visual formats.
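The combination of permission-based access and user-chosen filters can be sketched as follows. This is a hypothetical Python illustration of the idea only: the role names, the role-to-region mapping and the report rows are invented, and the code does not use or imitate the Reporting Services security model.

```python
# Hypothetical mapping of roles to the regions they may see; None means all.
ROLE_REGIONS = {
    "top_management": None,
    "aegean_manager": {"Aegean"},
}

# Made-up report rows standing in for a generated sales report.
REPORT_ROWS = [
    {"region": "Aegean", "store": "Izmir-01", "color": "Blue", "qty": 120},
    {"region": "Aegean", "store": "Izmir-02", "color": "Black", "qty": 80},
    {"region": "Marmara", "store": "Istanbul-01", "color": "Blue", "qty": 200},
]


def run_report(rows, role, **filters):
    """Return the rows a role may see, narrowed by the user's filter choices."""
    if role not in ROLE_REGIONS:
        raise PermissionError(f"role {role!r} may not view this report")
    allowed = ROLE_REGIONS[role]
    # Permission check first, so filters can never widen what a role sees.
    visible = [r for r in rows if allowed is None or r["region"] in allowed]
    return [r for r in visible if all(r.get(k) == v for k, v in filters.items())]
```

Applying the permission restriction before the user filters keeps the two concerns separate: a filter can only narrow a view that the role is already entitled to.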

Fig. 6. Market basket analysis output.

Fig. 7. Graphical chart sample.

Fig. 8. Tabular view of graphical report.

In Fig. 7, the pie chart shows the distribution of sales for the top 10 colors of male products. The value list in the scale indicates the colors of the products. The bar chart at the top shows profits for each color. The chart at the bottom shows the number of male products sold for each color during the summer period. These charts are generated with MSSQL Analysis Services and Reporting Services.

Figure 8 is the tabular view of the graphical report shown in Fig. 7. With this web report, the user can filter the information down to whatever he wants to see. The report shows the exact quantities of total sales in detail.

In addition, users can generate their own reports according to their needs by using Report Builder, which is a component of Reporting Services. It gathers the columns from the dimension and fact tables defined in the OLAP cubes and presents the results as real-time reports on the web. Generated reports can either be saved in Excel, PDF or HTML format or be stored on the web portal for future reference.

2.7 Implementation and documentation

After completing our BI design, the implementation phase started in the company environment. The company bought a new server machine for storing the data warehouse. We created DTS (Data Transformation Services) packages for the data flow that moves data from OLE DB to the data warehouse on a schedule that is suitable for the company. After the infrastructure was installed and configured, we created OLAP cubes on the data warehouse designed. Then we deployed data mining applications and completed the Analysis Services operations. Finally, to present the results to the users, we prepared the reports on a web portal served by Reporting Services. The company required that only authorised personnel access the reports, hence we authenticated the users and restricted access.

The last step of implementation was training end-users in system usage and technical personnel in running, and even creating, reports. Because business cases and requirements change day by day, the company needs additional reports. Therefore, the training subject for technical personnel was creating reports with Report Builder. Finally, to provide continuity and maintenance, we trained the system administration personnel in case any problem occurs during the data flow. User manuals were also prepared and provided.

2.8 Savings of the project

With this project, the time for generating reports decreased markedly. The reports are taken from the data warehouse, so the transactional operations which are recorded in OLE DB are not affected. Generating reports rapidly helps managers to make decisions on time. Report Builder is also an easy-to-use tool which allows users to prepare their own reports without getting help from an IT person. This project also produced dramatic performance gains. Table 1 shows disk sizes for the last two years' data files and the data warehouse (DW), and performance gains for the most common reports. Storage space decreased by 98.60%, and average report times decreased by over 90%.

Table 1. Data sizes and performance gains.

                      DW          2007        2008        2007–2008 Total   Gain (%)
Data files (MB)       900         44,400      20,100      64,500            98.60
Sales transaction     2,906,675   7,881,128   2,543,127   10,424,255        72.12

Average report time (sec.)   BI solution   2007    Gain (%)   2008   Gain (%)
Top sales report             15            900     98.33      360    95.83
Monthly comparative table    20            1,200   98.33      420    95.24
Weekly comparative table     30            1,500   98.00      480    93.75
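The gain percentages in Table 1 are consistent with the relative reduction (1 - after / before) x 100. The short Python check below recomputes them from the table's own figures; the function name gain_percent is introduced here for illustration only.

```python
def gain_percent(before, after):
    """Relative reduction from `before` to `after`, in percent (two decimals)."""
    return round((1 - after / before) * 100, 2)


# Storage: 2007-2008 data files (64,500 MB) versus the data warehouse (900 MB).
storage_gain = gain_percent(64_500, 900)

# Row count: 10,424,255 sales transactions versus 2,906,675 warehouse rows.
rows_gain = gain_percent(10_424_255, 2_906_675)

# Report time: monthly comparative table, 2008 data, 420 s versus 20 s.
monthly_2008_gain = gain_percent(420, 20)
```

For example, the storage gain is (1 - 900 / 64,500) x 100, which rounds to the 98.60% quoted in the text.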

3. Conclusion

This BI project focused on delivering the business requirements of a textile company. While designing our solution, we worked with professionals from the company to determine the needs and business cases and to meet the expected business benefits. Whether or not these benefits are measurable, the points we aimed at together with the company were shortening business processes, revealing hidden data to guide business decisions and presenting valuable data in an easier way. At the end of the project, the management and the sales department gained the advantage of analysing the sales status of approximately 150 branches and making critical decisions on time via sales reports. In addition, the inventory reports make it easier to decide on orders from suppliers and on inventory exchange between the branches. Top management has found the forecasting reports derived from data mining applications useful, since the reports help them make strategic decisions.

As a result, using BI applications in retail business, which has terabytes of data, provides serious competitive advantages. Generating comparative reports across years, reaching strategic information faster and making accurate business forecasts are vital results of BI applications.


Appendix A. Data Warehouse Diagram

References

Chabrow, E (2004). Tech spending is in agility and efficiency, http://www.informationweek.com/news/globalcio/roi/showArticle.jhtml?articleID=53701039, Accessed 15/02/2010.

Chitkara, AR (2010). BI 2010: Evolving to meet demands of the changing business landscape, http://tdwi.org/Articles/2010/01/06/BI-2010-Evolving-to-MeetDemands-of-the-Changing-Business-Landscape.aspx?Page=3, Accessed 15/02/2010.

Eckerson, WW (2010). TDWI announces new best practices report: Transforming finance: How CFOs use business intelligence to turn finance from record keepers to strategic advisors, http://tdwi.org/Articles/2010/01/20/TDWI-Announces-New-Best-Practices-ReportTransforming-Finance-How-CFOs-Use-Business-Intelligence-to.aspx?Page=1, Accessed 15/02/2010.

Granebring, A and P Revay (2007). Service-oriented architecture is a driver for daily decision support, http://www.eki.mdh.se/personal/agg01/SOA%20is%20an%20OLAP%20driver.pdf, Accessed 30/06/2008.

Hancock, JC and R Toren (2006). Practical Business Intelligence with SQL Server 2005 (Microsoft Windows Server System Series). Addison Wesley.

http://www.businessobjects.com/company/customers/spotlight/tesco.asp, Accessed 30/06/2008.

WinMetrics Corporation (2008). Business Intelligence Case Studies. http://www.winmetrics.com/olap casestudies.html, Accessed 30/06/2008.

Jacobson, R, S Misner and Hitachi Consulting (2006). Microsoft® SQL Server™ 2005 Analysis Services Step by Step. Redmond, WA: Microsoft Press.

Kimball, R and M Ross (2002). The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition). Indianapolis: Wiley Publishing, Inc.

Langit, L (2007). Foundations of SQL Server 2005 Business Intelligence. New York: APress.

MacLennan, J (2004). Unearth the new data mining features of analysis services 2005, http://msdn.microsoft.com/tr-tr/magazine/cc300503(en-us).aspx, Accessed 30/06/2008.

Tang, ZH and J MacLennan (2005). Data Mining with SQL Server 2005. Indianapolis: Wiley Publishing, Inc.

Thomsen, E (2002). OLAP Solutions: Building Multidimensional Information Systems. Indianapolis: John Wiley & Sons, Inc.

Turley, P, T Bryant, J Counihan and D DuVarney (2006). Professional SQL Server 2005 Reporting Services. Indianapolis: Wiley Publishing, Inc.

Wrembel, R and C Koncilia (2006). Data Warehouses and OLAP: Concepts, Architectures and Solutions. Hershey, PA: IRM Press.
