International Journal of Applied Engineering Research ISSN 0973-4562 Volume 10, Number 5 (2015) pp. 12843-12853 © Research India Publications http://www.ripublication.com

Product Demand Forecasting: Will This Product Sell

Mrs. Pallavi M S¹, M K Shanmuga Shanker², Nishanth Bhat K³
Amrita Vishwa Vidyapeetham, Mysore Campus, #114 Bogadi 2nd Stage, Mysore, Karnataka, India
[email protected], Mob No: 8884559752

Abstract

With recent advances in technology, the number of products available to consumers has grown rapidly, and competition has intensified in every business segment. To remain successful in the market, a product must cater explicitly to the needs of its users. Every company now has to balance product innovation against the feasibility of new products, or else it will suffer financial loss. In this environment, it is important for companies to have a system that can predict the success or failure of a proposed product, before mainstream production takes place, by identifying its similarities with successful products that already exist. Our project aims to predict the profit level of a proposed product by comparing it with the sales data of existing products in the same category, to gain meaningful insight into its profitability. With this project we propose to automate profitability prediction, which today is a largely manual process. We propose to use data mining techniques to predict the profit of a proposed product based on the financial performance of existing products. Specifically, we have identified classification using the naive Bayes algorithm as the most suitable for our project. Our system will not only work with any generic product but will also be deployable on the internet for maximum reach. The input dataset will be the sales data of existing products in the same category, along with the features and cost of the proposed product. The output will specify the likely profitability level of the proposed product, thereby increasing both the efficiency and the speed with which companies can make better product decisions.

Introduction

In the 21st century the world has become a global marketplace. This has led to an explosion of products available in the market from various companies. From a company's point of view, market competition has never been higher. Each company has to ensure that its products are designed to generate maximum demand in their category. The problem statement for our paper is: how can the profit level of a product in the market be predicted from existing sales data of similar products in that category?

To obtain a baseline, we use historical records, i.e. the sales performance of previous products in the same category, along with their feature sets. We then use a naïve Bayesian algorithm, which is a supervised classification algorithm, to predict the performance of the proposed product based on its feature set and the training data. We are targeting a model that can be used as a generic application to predict the demand for a proposed product irrespective of its category or type.

The paper is organized as follows. In the Introduction, we highlight the importance of product demand forecasting. In the Literature Survey, we summarize the work done in the related fields of classification and predictor design based on the naive Bayes algorithm. In the Overview of Predictor, we describe our predictor as we have envisioned it. In the Naïve Bayes Classification section, we outline our implementation of the naïve Bayesian classifier used in our predictor. In the Experimental Results section, we show with an example how such an implementation of the predictor can help a company's sales performance.

Literature Survey

In our literature survey, we studied a predictor that infers the phone a user will switch to by considering (a) the type of the previous phone and (b) social influence [1]. We examined the use of Bayesian networks for understanding ICT adoption [2]. We then studied knowledge discovery in databases (KDD) and KDD processes [3]. We looked at how to design an intelligent predictor that would beat a simple moving average baseline across a number of products [4]. We considered the use of a multi-criteria decision making approach to evaluate mobile phone alternatives [5]. We learnt how to mine comparable entities from comparative questions [6]. We noted the importance of choice criteria, concentrating on the bases on which consumers select mobile phones [7], and the advantages of classifiers with probabilistic output, such as the reject option, changing utility functions and compensating for class imbalance [8]. We then reviewed the major kinds of classification methods, including decision tree induction, Bayesian networks, the k-nearest neighbor classifier, case-based reasoning, genetic algorithms and fuzzy logic techniques [9]. We also looked at methods for learning Bayesian belief network (BN) based predictive models for classification [10].

Overview of Predictor

A. Category Creation

The predictor we have modeled is a generic product demand predictor; hence, each company is first required to create a category specification. The company defines the commonly occurring features applicable to that category of product, along with the applicable feature values. If we consider the example of a television, some of the applicable features are screen type (LED/LCD) and screen size (25”/27”/54”, etc.). An example is illustrated in Figure 1.1.
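A category specification of this kind can be sketched as a mapping from feature names to their allowed values. The sketch below is only an illustration: the dictionary layout, the name `TELEVISION_CATEGORY` and the helper `validate_product` are assumptions and are not part of the described system; only the feature names and values come from the television example.

```python
# Hypothetical category specification: feature name -> set of allowed values.
# Feature names and values follow the television example in the text;
# the data layout itself is an assumption made for illustration.
TELEVISION_CATEGORY = {
    "Screen Type": {"LED", "LCD"},
    "Screen Size": {"25", "27", "54"},
}


def validate_product(category_spec, product_features):
    """Return a list of problems; an empty list means the product fits the category."""
    problems = []
    # Every feature of the category must be present with an allowed value.
    for feature, allowed in category_spec.items():
        if feature not in product_features:
            problems.append(f"missing feature: {feature}")
        elif product_features[feature] not in allowed:
            problems.append(
                f"invalid value for {feature}: {product_features[feature]}"
            )
    # Features that the category does not define are reported as well.
    for feature in product_features:
        if feature not in category_spec:
            problems.append(f"unknown feature: {feature}")
    return problems


if __name__ == "__main__":
    print(validate_product(TELEVISION_CATEGORY,
                           {"Screen Type": "LED", "Screen Size": "27"}))
```

Checking each product against the specification at entry time keeps the later feature matching simple, because every stored product is then known to use the same feature names and value vocabulary.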

Figure 1.1:

B. Existing Product Dataset

The product dataset must be entered in the product category just created. The data should specify the values of the features present in each model of the product, along with its sales performance. The company enters the cost of production per unit and the number of units produced, from which we calculate the total investment in the product. The company is also required to enter the total number of units sold and the sales price per unit, which together constitute the total revenue. We take the difference between total revenue and total investment to classify the profit level of that product.

Total investment = cost of production per unit × total number of units produced
Total revenue = sales price per unit × total number of units sold
Profit = total revenue − total investment
Profit level ∝ profit
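The three formulas above can be checked with a few lines of code. The following is a minimal sketch, not the authors' implementation; the function name `profit_summary` is an assumption, and the input numbers are those of the 29” television example of Figure 1.2.

```python
def profit_summary(unit_cost, units_produced, unit_price, units_sold):
    """Apply the investment, revenue and profit formulas from the text."""
    total_investment = unit_cost * units_produced
    total_revenue = unit_price * units_sold
    profit = total_revenue - total_investment
    # Profit expressed as a percentage of the total investment.
    profit_percent = 100.0 * profit / total_investment
    return total_investment, total_revenue, profit, profit_percent


if __name__ == "__main__":
    # Values from the television example: cost 800, 10000 units produced,
    # price 1000, 9000 units sold.
    investment, revenue, profit, percent = profit_summary(800, 10000, 1000, 9000)
    print(investment, revenue, profit, percent)
```

The sketch stops at the profit percentage: mapping it to a named profit level requires cut-off values, which the text does not specify.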

Figure 1.2:

In the example illustrated in Figure 1.2, we can see the sales performance of a 29” television. The cost of production per unit is 800 and the number of units manufactured is 10,000, which sets the total investment to 80,00,000. The sales price per unit is 1000 and the number of units sold is 9000, which sets the total revenue to 90,00,000. The product therefore has a profit of 10,00,000. Compared with the investment of 80,00,000, this is a profit of 12.5%.

C. Proposed Product Features

Finally, we define the feature set of the proposed product, and the predictor predicts the profit level of the proposed product based on the training set, as shown in Figure 1.3 and Figure 1.4.

Figure 1.3:

Figure 1.4:


D. Creation of Training Dataset for Classification

We use a naïve Bayes classifier to predict the profit level of the proposed product based on previous sales data. The classifier needs a training dataset in order to classify the new product. The training set consists of columns for the product name, each feature that the proposed product contains, and the calculated profit value.

Table 1.1:

Table 1.1 above shows the columns of a training dataset for a product of category “Television” with features “Screen Type” and “Screen Size”.

Algorithm for creating the training dataset:

getTrainingDataset(category, new_Product)
{
    trainingDataset = empty list;
    productsTable = get_All_Products(category);
    for (product in productsTable)
    {
        if (product.features match new_Product.features) then
            trainingDataset.add(product)
    }
    return trainingDataset;
}

The algorithm starts by collecting the data about all the products of that category that currently exist in the database. It then selects the products whose features are similar to those of the new product and adds their information to the training dataset.
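The selection step can be sketched in Python as follows. The text does not define exactly when two feature sets "match", so this sketch assumes that a product matches when at least `min_shared` of the new product's feature values agree, defaulting to all of them. The function name, the `min_shared` parameter, the record layout and the sample products are all hypothetical.

```python
def get_training_dataset(products, new_features, min_shared=None):
    """Select existing products whose feature values agree with the new product.

    min_shared is an assumed knob: the number of feature values that must
    agree. By default every feature of the new product has to agree.
    """
    if min_shared is None:
        min_shared = len(new_features)
    training = []
    for product in products:
        shared = sum(
            1
            for name, value in new_features.items()
            if product["features"].get(name) == value
        )
        if shared >= min_shared:
            training.append(product)
    return training


# Hypothetical product records, for illustration only.
PRODUCTS = [
    {"name": "tv-a", "features": {"Screen Type": "LCD", "Screen Size": "22"},
     "profit": "HighProfit"},
    {"name": "tv-b", "features": {"Screen Type": "LCD", "Screen Size": "22"},
     "profit": "MediumProfit"},
    {"name": "tv-c", "features": {"Screen Type": "LED", "Screen Size": "22"},
     "profit": "Loss"},
]
NEW_FEATURES = {"Screen Type": "LCD", "Screen Size": "22"}

if __name__ == "__main__":
    for row in get_training_dataset(PRODUCTS, NEW_FEATURES):
        print(row["name"], row["profit"])
```

With the default setting only exact matches are kept; lowering `min_shared` admits partially similar products, which gives the classifier more varied training examples at the cost of a looser notion of similarity.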

Naïve Bayes Classification

We have chosen the naïve Bayes classifier for prediction, as we have found it more suitable in terms of accuracy and efficiency in prediction scenarios [9] [10]. Naïve Bayesian classification is a supervised learning algorithm in which the training set is predefined [8]. The naive Bayes classifier selects the most likely classification $V_{nb}$ given the attribute values $a_1, a_2, \ldots, a_n$. This results in:

$$V_{nb} = \arg\max_{v_j \in V} P(v_j) \prod_{i} P(a_i \mid v_j)$$

We generally estimate $P(a_i \mid v_j)$ using m-estimates:

$$P(a_i \mid v_j) = \frac{n_c + m p}{n + m}$$

where:
$n$ = the number of training examples for which $v = v_j$
$n_c$ = the number of examples for which $v = v_j$ and $a = a_i$
$p$ = the a priori estimate for $P(a_i \mid v_j)$
$m$ = the equivalent sample size

We obtain the values of $n$, $n_c$, $p$ and $m$ from the training dataset. Our algorithm determines the profit level by identifying the profit level with the highest probability after classification. In the illustration below, medium profit has the highest probability value, 0.455. Hence the proposed product is classified as having a medium profit level, as shown in Graph 2.1.
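As a worked illustration of the m-estimate, assume a hypothetical class with n = 3 training examples, all of which share the attribute value (n_c = 3), and take m = 2 and p = 1/4. These numbers are chosen for illustration only and are not taken from the paper's data. The second line shows a class with no training examples at all, for which the estimate falls back to the prior p.

```latex
\begin{align*}
P(a_i \mid v_j) &= \frac{n_c + m p}{n + m}
                 = \frac{3 + 2 \cdot 0.25}{3 + 2}
                 = \frac{3.5}{5} = 0.7 \\
P(a_i \mid v_j) &= \frac{0 + 2 \cdot 0.25}{0 + 2}
                 = \frac{0.5}{2} = 0.25 = p
                 \qquad (n = n_c = 0)
\end{align*}
```

The smoothing term m·p therefore keeps every estimate strictly positive, so a class that is absent from the training data still receives a small, non-zero score instead of zeroing out the whole product.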

Graph 2.1:

Algorithm for predicting demand based on naïve Bayes:

predictDemand(category, new_Product)
{
    sub = { "Loss", "LessProfit", "MediumProfit", "HighProfit" }
    features = new_Product.getFeature()
    trainingDataset = getTrainingDataset(category, new_Product)
    p = 1 / sub.length
    m = features.length
    for (i = 0 to sub.length) do
    {
        Pi = 1;
        for (j = 0 to features.length) do
        {
            n = 0;
            nc = 0;
            for (k = 0 to trainingDataset.length) do
            {
                n++;
                if (trainingDataset[k].profit == sub[i]) then
                {
                    nc++;
                }
            }
            X = (nc + (m * p)) / (n + m)
            Pi = Pi * X;
        }
        Pi = Pi * p;
        Res[i] = Pi;
    }
    high = max(Res);
    return sub[Res.indexof(high)];
}
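A runnable Python sketch of the prediction step is given below. It is an illustration rather than the authors' code: it follows the m-estimate definitions stated earlier (n counts the training examples of a class, n_c those that also share the feature value), and it takes p = 1/(number of profit levels) and m = number of features as in the pseudocode. The function name, the record layout and the sample training rows are hypothetical.

```python
PROFIT_LEVELS = ["Loss", "LessProfit", "MediumProfit", "HighProfit"]


def predict_demand(training, new_features, levels=PROFIT_LEVELS):
    """Return (best_level, scores) using naive Bayes with m-estimates."""
    if not new_features:
        raise ValueError("the new product needs at least one feature")
    p = 1.0 / len(levels)  # uniform prior, as in the pseudocode
    m = len(new_features)  # equivalent sample size, as in the pseudocode
    scores = {}
    for level in levels:
        in_class = [row for row in training if row["profit"] == level]
        n = len(in_class)  # examples with v = v_j
        score = p          # class prior P(v_j)
        for name, value in new_features.items():
            # Examples with v = v_j that also have a = a_i.
            n_c = sum(1 for row in in_class
                      if row["features"].get(name) == value)
            score *= (n_c + m * p) / (n + m)
        scores[level] = score
    best = max(levels, key=lambda level: scores[level])
    return best, scores


# Hypothetical training rows, for illustration only.
TRAINING = [
    {"features": {"Screen Type": "LCD", "Screen Size": "22"}, "profit": "HighProfit"},
    {"features": {"Screen Type": "LCD", "Screen Size": "22"}, "profit": "HighProfit"},
    {"features": {"Screen Type": "LCD", "Screen Size": "27"}, "profit": "MediumProfit"},
    {"features": {"Screen Type": "LED", "Screen Size": "22"}, "profit": "Loss"},
]

if __name__ == "__main__":
    best, scores = predict_demand(
        TRAINING, {"Screen Type": "LCD", "Screen Size": "22"})
    print(best)
    for level, score in scores.items():
        print(level, round(score, 4))
```

The scores are unnormalized products of probabilities, so only their ranking matters; dividing each score by their sum would turn them into values that add up to one.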

Experimental Results

To show our experimental result, we consider a data table that contains the sales and profit details of products of different categories. We assume that the “G-tv” series is a product of category “Television” with features “Screen Type” and “Screen Size”.

Table 2.1:

Table 2.2 below shows the feature values of the respective products, identified by the “ProductId” from the table above.


Table 2.2:

For example, ProductId 1 is the product “G-tv x001”, whose features “Screen Type” and “Screen Size” have the values “LCD” and “22” respectively. We have the products data table ready, as shown in Table 2.2. Suppose we want to predict the profit level of a new product named “G-tv x005” with the features given in Figure 2.2 below.

Figure 2.2:

Now we have a new product to classify. To do this we use the naïve Bayes classifier and, as mentioned before, the algorithm requires a training dataset to classify the new product. We therefore use the ‘getTrainingDataset()’ algorithm to generate a training dataset containing the products with similar features, as shown in Table 2.3 below.

Table 2.3:

Product Name    Screen Type    Screen Size    Profit
G-tv X001       LCD            22             High Profit
G-tv X002       LCD            22             High Profit
G-tv X003       LCD            22             High Profit

We now have our training dataset ready for use by the naïve Bayes classifier. We use the function ‘predictDemand()’ to classify the product into one of the defined profit levels.

Graph 2.1:

The resulting probability (Pi) values are shown in Graph 2.1, which classifies the new product as ‘High Profit’, the class with the highest probability value, 0.085. Thus we have predicted, based on previous sales of similar products, that the product ‘G-tv x005’ will gain high profit if it is produced.

Conclusion

In this paper, we have outlined a module for generic product profit level prediction using past sales data and a naïve Bayes classifier, which can be used by any manufacturer to estimate the demand for a product before actual manufacturing.


References

[1] Yi Wang, Hui Zang, Pravallika Devineni, Michalis Faloutsos, Krishna Janakiraman and Sara, “Which phone will you get next: observing trends and predicting the choice”, Network Operations and Management Symposium (NOMS), 2014 IEEE.
[2] Sergiu Nedevschi, Jaspal S. Sandhu, Joyojeet Pal, Rodrigo Fonseca, Kentaro Toyama, “Bayesian Networks: an Exploratory Tool for Understanding ICT Adoption”, Information and Communication Technologies and Development, 2006.
[3] Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth, “From Data Mining to Knowledge Discovery in Databases”, AI Magazine, Volume 17, Number 3 (1996).
[4] Indrė Žliobaitė, Jorn Bakker, Mykola Pechenizkiy, “Beating the baseline prediction in food sales: How intelligent an intelligent predictor is?”, Expert Systems with Applications, Volume 39, Issue 1, January 2012.
[5] Gülfem Işıklar, Gülçin Büyüközkan, “Using a multi-criteria decision making approach to evaluate mobile phone alternatives”, Computer Standards & Interfaces, Volume 29, Issue 2, February 2007, pp. 265–274.
[6] Shasha Li, Chin-Yew Lin, Young-In Song, Zhoujun Li, “Comparable Entity Mining from Comparative Questions”, IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 7, 2013.
[7] Safiek Mokhlis, Azizul Yadi Yaakop, “Consumer Choice Criteria in Mobile Phone Selection: An Investigation of Malaysian University Students”, International Review of Social Sciences and Humanities, Vol. 2, No. 2 (2012), pp. 203-212.
[8] Kevin P. Murphy, “Naive Bayes classifiers”, University of British Columbia, 2006.
[9] Thair Nu Phyu, “Survey of Classification Techniques in Data Mining”, International MultiConference of Engineers and Computer Scientists 2009, Vol I, IMECS 2009.
[10] Jie Cheng, Russell Greiner, “Learning Bayesian Belief Network Classifiers: Algorithms and System”, 14th Biennial Conference of the Canadian Society for Computational Studies of Intelligence, AI 2001, Ottawa.
