2009 2nd Conference on Data Mining and Optimization 27-28 October 2009, Selangor, Malaysia
Data Mining In Production Planning and Scheduling: A Review Ruhaizan Ismail1, Zalinda Othman2 and Azuraliza Abu Bakar3 Faculty of Information Science and Technology Universiti Kebangsaan Malaysia 43600 Bangi Selangor, Malaysia 1
[email protected], 2,3{zalinda, aab}@ftsm.ukm.my
Abstract-This paper reviews data mining tasks and methods and their application in production planning and scheduling. Data mining is reviewed under four classifications of data mining systems: the kinds of databases mined, the knowledge to be discovered, the techniques utilized and the applications adapted. The paper also reviews production planning and scheduling by time frame, either short- to mid-range or long-range planning. Production planning comprises many activities, such as process planning, strategic capacity planning, aggregate planning, master scheduling, material requirements planning and order scheduling. Different problems arise from these activities because of differences in the time, product and environment of production. Keywords: scheduling, production planning, data mining.
I. INTRODUCTION
Recently, many companies have recognized data mining as an important technique that will affect the performance of industries and companies [1]. However, industrial data poses many challenges, such as the ability to handle different types of data, graceful degeneration of data mining algorithms, valuable data mining results, representation of data mining requests and results, mining at different abstraction levels of data and from different sources, and protection of data security [1]. Research on data mining techniques in manufacturing began in the 1990s and has gradually progressed as it has received more attention [2]. Data mining is now used in many different areas of manufacturing to obtain knowledge for predictive maintenance, production, scheduling and decision support systems. Past research shows that data mining can be used to extract and identify hidden patterns in large data sets and to discover knowledge that can be used to solve problems.
978-1-4244-4944-6/09/$25.00 ©2009 IEEE
This paper presents a brief overview of data mining tasks and methods and of their application in production planning and scheduling, focusing on the time frame, which is either short- to mid-range or long-range.
II. DATA MINING TASK AND METHODS
Data mining is commonly defined as the task of discovering interesting patterns from large amounts of data and different kinds of databases [6]. According to [21], however, data mining is data-driven knowledge extraction whose tasks differ according to the intended results of the process. Data mining systems can be classified in four ways, depending on the kinds of data to be mined, the kinds of knowledge to be discovered, the kinds of techniques utilized and the kinds of applications. Depending on the kinds of data to be mined, a data mining system may also integrate techniques from other fields such as spatial data analysis, information retrieval and pattern recognition. Data mining approaches also draw on techniques from other disciplines such as neural networks, fuzzy logic, rough set theory, knowledge representation, inductive logic programming and high performance computing [6]. In principle, data mining is not specific to one type of data, although the approaches and algorithms may differ when applied to different types of data. The types of data to be mined can be categorized as data in flat files, relational databases, data warehouses, transaction databases, multimedia databases, spatial databases, time-series databases and data on the World Wide Web, which can be text, audio, raw data and even applications [4]. The kinds of knowledge to be discovered are also known as data mining functionalities, such as characterization, discrimination, association, classification, clustering and outlier analysis [5]. In general, data mining tasks can be classified into two categories, descriptive and predictive. Descriptive tasks focus on general properties or
patterns of the data in a database. Descriptive mining describes a given set of task-relevant data in a concise and summarative manner. Predictive mining, meanwhile, uses some variables of the current data to predict future values [5], [6]. Prediction and description can be achieved with data mining tasks such as classification, prediction, association, regression, clustering, summarization, dependency modeling, and change and deviation detection. In descriptive modeling the aim is to describe the data, not to predict. Consequently, descriptive methods are used in the setting of unsupervised learning. Typical descriptive methods are density estimation, smoothing, data segmentation and clustering. Predictive modeling falls into the category of supervised learning, with methods such as classification, regression and decision trees [5], [6].
For the types of techniques, there are two themes: first, classical techniques, i.e. statistics, neighborhoods and clustering; and second, next generation techniques, i.e. trees, networks and rules [7]. Statistical techniques are driven by the data and are used to discover patterns and build predictive models. The difference between statistical methods and data mining is that data mining methods such as CART (Classification and Regression Trees), neural networks and nearest neighbor techniques tend to be more robust to real-world data, more robust to use by less expert users, and to need large quantities of data. Nearest neighbor is a prediction technique that is quite similar to clustering because, in order to predict, it looks at records with similar predictor values. This resembles the clustering task, which groups data that are highly similar to one another [7], [6]. Among the second-theme techniques, a decision tree is a predictive model that resembles a tree structure. From a business perspective, it creates segmentations, for example of customers and products.
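The nearest neighbor idea described above, predicting from the records whose predictor values are most similar, can be illustrated with a minimal sketch. The function name `knn_predict`, the two features (processing time and queue length) and the job history are hypothetical and chosen only for illustration; they do not come from the reviewed literature.

```python
import math
from collections import Counter


def knn_predict(records, query, k=3):
    """Predict a label by majority vote among the k records closest to `query`.

    records: list of (features, label) pairs, where features is a tuple of numbers.
    """
    # Sort historical records by Euclidean distance to the query point.
    by_distance = sorted(records, key=lambda r: math.dist(r[0], query))
    # Let the k most similar records vote on the label.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical history: (processing_time_hours, queue_length) -> delivery outcome.
history = [
    ((2.0, 1.0), "on_time"),
    ((2.5, 2.0), "on_time"),
    ((3.0, 1.0), "on_time"),
    ((8.0, 6.0), "late"),
    ((9.0, 7.0), "late"),
    ((7.5, 8.0), "late"),
]

print(knn_predict(history, (8.5, 6.5)))  # nearest records are all "late"
print(knn_predict(history, (2.2, 1.5)))  # nearest records are all "on_time"
```

In practice the features would normally be scaled before distances are computed, since a predictor with a large numeric range would otherwise dominate the similarity measure.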
A tree can generate rules and is one of the popular techniques for building understandable models. Decision trees are widely used in business problems for exploration and prediction because they score highly on the critical features of data mining [7]. Another predictive technique is neural networks. This technique is good at searching large databases for hidden patterns or trends because its processing resembles that of the human brain. In the human brain, stimuli flow from neuron to neuron along multiple paths instantaneously. The strength of the connections between neurons increases with the frequency of stimulation, and neural networks operate in the same manner [8]. The third technique is rules. Rule induction is one of the major forms of data mining and is perhaps the most common form of knowledge discovery in unsupervised learning systems because it is relatively easy to understand. When the
rules are mined out of the database, they can be used either to understand the business problem or to make actual predictions against some predefined prediction target. Besides the techniques mentioned above, genetic algorithms and fuzzy logic are further examples of artificial intelligence techniques that have a place in data mining applications [8]. Genetic algorithms apply the principles of natural genetic systems to search and optimization problems. They are widely used in production scheduling, dynamic process control and complex design [9], [10]. Fuzzy logic can make decisions based on incomplete information by simulating the ability of the human mind to deal with ambiguity.
Data mining can also be categorized according to the application. Data mining applications have been developed in areas such as finance, biomedicine and DNA analysis, the retail industry and telecommunications. According to [6], when choosing a suitable data mining product for one's task, it is important to consider the various features of data mining systems from different views. These include the types of data, system issues, data sources, data mining functions, scalability, tight coupling of the data mining system with a database, and the graphical user interface (GUI).
III. MANUFACTURING SCHEDULING
Manufacturing environments are complex, dynamic and stochastic systems. A number of activities are involved in managing and operating a production system. These activities include organizing work, selecting processes, arranging layouts, designing jobs, scheduling work and planning production [8]. The different stages of planning, meanwhile, involve operations such as process planning, capacity planning, aggregate planning, master scheduling, material requirements planning and order scheduling. These planning stages can be divided into three time frames: short, medium and long term.
In business forecasting, short-term usually refers to under three months; medium-term or intermediate, three months to two years; and long-term, greater than two years. Long-range planning is generally done annually and focuses on a range greater than one year. Process planning and strategic capacity planning are the major operations in the long-range term. Process planning converts the design into workable instructions for manufacture. It deals with determining the specific technologies and procedures required to produce a product or service. Strategic capacity planning deals with the long-term capabilities of the production system [11]. It extends over a time horizon long
enough to obtain those resources that affect product lead times, customer responsiveness, operating costs and a firm's ability to compete. Capacity planning is therefore suitable for decisions such as building new facilities or acquiring new businesses. The intermediate range comprises aggregate planning, master scheduling and material requirements planning. Intermediate-range planning usually covers a period of 3 to 18 months, with time increments that are weekly, monthly or sometimes quarterly. The output of aggregate planning is a decision on whether to hire or lay off workers, increase or reduce the work week, add an extra shift, subcontract work, use overtime, or build up and deplete inventory levels [8]. The usual aim of aggregate planning is to minimize total cost over the planning horizon, including inventory investment, workforce levels and production rates. The master production schedule (MPS) is a part of the material requirements plan that shows in detail how many end items will be produced within specified periods of time. The MPS is one of the important activities in manufacturing planning and control; it works within the constraints of the production plan but produces a more specific schedule. It breaks the aggregate production plan into specific product schedules, maintaining customer service levels and stabilizing production planning within a Manufacturing Resource Planning (MRP II) environment and just-in-time (JIT) based production systems [8], [12], [13]. Short-range planning covers a period from one day or less to six months, with daily or weekly time increments; the operation planning activity in the short range is therefore order scheduling. At the level of individual machines, the shop floor schedule maintains and communicates status information on shop orders and work centers [11]. Shop floor scheduling is the allocation of resources and the sequencing of tasks to produce goods and services; it also specifies the times at which each job starts and completes on each machine [3].
Fig. 1 Hierarchical planning. [Figure: four columns. Items: product lines or families, individual products, components, manufacturing operations. Production planning: aggregate production plan, master production schedule, material requirements plan, shop floor schedule. Capacity planning: resource requirements plan, rough-cut capacity plan, capacity requirements plan, input/output control. Resource level: plants, critical work centers, all work centers, individual machines.]
The hierarchical planning shown in Fig. 1 describes each level of production and capacity planning. In production planning, after the aggregate production planning, the next level is the master production schedule [8]. At the next level, material requirements planning (MRP) plans the production of the components. MRP takes the end product requirements from the MPS and breaks them into component parts to create a materials plan [11]. After that, shop floor scheduling schedules the manufacturing operations required
to make each component, specific to machines, production lines or work centers. In capacity planning, the resource requirements plan is developed to verify that the aggregate production plan is doable, and a rough-cut capacity plan checks whether the master production schedule is feasible. Next, the capacity requirements plan matches the factory's machine and labor resources to the material requirements plan. Finally, input/output control is used to monitor the production that takes place at individual machines. The process of moving from the aggregate production plan to the shop floor schedule is called disaggregation. Disaggregation is the process of breaking an aggregate plan into more detailed plans [8].
IV. PROBLEMS ARISING IN PRODUCTION PLANNING AND SCHEDULING
Production control is the most difficult problem in a dynamic manufacturing environment. Proper scheduling is an important tool for improving this situation. However, scheduling is a difficult task because manufacturing environments have high levels of uncertainty, detailed processes and specific requirements, and because management objectives are varied, dynamic and often conflicting [3]. Many manufacturing organizations therefore generate and update production schedules to increase productivity and minimize operating costs. A production schedule can identify resource conflicts, control the release of jobs to the shop, ensure that required raw materials are ordered in time, determine whether delivery promises can be met and identify time periods available for preventive maintenance. Aggregate planning becomes a challenge when demand fluctuates over the planning horizon. If demand for a company's products or services is stable over time, aggregate planning is trivial. When demand fluctuates, strategies are needed to meet it, for example chasing demand, maintaining resources for high-level demand, using overtime and undertime, subcontracting and backordering [8].
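The contrast between stable and fluctuating demand can be made concrete with a small sketch that compares a chase strategy (production follows demand) with a level strategy (constant production at the average demand). The cost model, the cost coefficients and the demand figures below are illustrative assumptions, not values taken from [8].

```python
def chase_plan(demand):
    """Chase strategy: produce exactly the demand of each period."""
    return list(demand)


def level_plan(demand):
    """Level strategy: produce the average demand in every period."""
    rate = sum(demand) / len(demand)
    return [rate] * len(demand)


def plan_cost(demand, production, holding_cost, backorder_cost, change_cost):
    """Total cost of a plan under a simple, assumed linear cost model.

    holding_cost and backorder_cost are charged per unit per period on the
    end-of-period inventory position; change_cost is charged per unit of
    change in the production rate between consecutive periods.
    """
    inventory = 0.0
    total = 0.0
    previous = production[0]
    for d, p in zip(demand, production):
        total += change_cost * abs(p - previous)
        inventory += p - d
        if inventory >= 0:
            total += holding_cost * inventory
        else:
            total += backorder_cost * (-inventory)
        previous = p
    return total


fluctuating = [100, 150, 200, 150]  # hypothetical demand per period
stable = [120, 120, 120, 120]

for name, demand in (("fluctuating", fluctuating), ("stable", stable)):
    chase = plan_cost(demand, chase_plan(demand), 2, 5, 3)
    level = plan_cost(demand, level_plan(demand), 2, 5, 3)
    print(name, "chase:", chase, "level:", level)
```

With stable demand both strategies coincide and cost nothing extra under this model, which is why the planning problem is trivial in that case; with fluctuating demand the preferred strategy depends on the relative size of the assumed cost coefficients.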
According to [14], production control in a dynamic manufacturing system is a very challenging problem, and problems arise when the system design changes constantly and cost and job preemption must be dealt with. In MPS, certain conditions cause the schedule to be replanned in order to improve stability, productivity, production and inventory cost, and customer service. The first condition is a rolling effect due to the extension of the planning period, and the second is uncertain demand [13], [15]. The main problem in MPS arises when many companies do not know
their future demands and have to rely on demand forecasts to make production planning decisions, so they often use the same capacity to manufacture several products. Developing and maintaining an MPS under capacity constraints and demand uncertainty is therefore far more challenging, because these conditions may significantly influence the selection of the MPS freezing parameters [13]. For shop floor scheduling, a few parameters are always an issue, such as throughput time, work-in-progress (WIP) inventory and tardiness. A large WIP leads to a longer throughput time, whereas reducing WIP minimizes the amount of inventory. Tardiness is the time difference between actual production and demand, so low inventory tends to lengthen tardiness. Shop floor scheduling focuses on single machines, production lines or work centers. In general, manufacturing data is difficult to preprocess and mine with data mining approaches because of the uncertainty of payback values, the complexity and diversity of different manufacturing processes, imbalanced distributions, the curse of dimensionality, mixed-type data, long time scales and out-of-range data. In shop floor control, an effective way to solve the problem is to use dispatching rules. However, problems remain possible, such as whether a dispatching rule is appropriate at a given moment, because the dispatching decisions of the rule base are made by observing the pattern of the real-time system [14].
V. PAST RESEARCH IN PRODUCTION PLANNING AND MANUFACTURING SCHEDULING
As discussed in Section IV, the major problems in scheduling are uncertain demand and data that fluctuate with the situation, environment and parameters of scheduling. There is also another factor explaining why data mining research has not grown much in the manufacturing field.
According to [21], most researchers in manufacturing are not familiar with data mining algorithms and tools, and vice versa, and it is difficult to evaluate the effectiveness and advantages of data mining when it is implemented in manufacturing. Reference [16] classified the approaches to scheduling problems into four categories: heuristic rules, mathematical programming techniques, neighborhood search methods and artificial intelligence techniques, which generally belong to the categories of techniques used in data mining. However, this research focuses only on the job shop scheduling (JSS) problem, to which artificial intelligence (AI) techniques are applied. Past research identifies four main advantages of AI techniques. Firstly, these techniques can use both qualitative
and quantitative knowledge in the decision making process. Secondly, they can generate heuristics that are more complex than simple dispatching rules. Thirdly, the selection of the best heuristic can be based on information about the entire job shop. Lastly, they exploit the relationship between data structures and techniques for powerful manipulation of the information in these structures [16]. AI techniques also have disadvantages, such as large time consumption and frameworks or systems that are difficult to maintain and change because they deal with uncertain data. AI techniques may also generate feasible solutions without verifying optimality and may get trapped in local optima [16].
The relevance of data mining to the manufacturing industry was reviewed in [2]. Many areas, such as production processes, control, maintenance, customer relationship management (CRM), decision support systems (DSS), quality improvement, fault detection and engineering design, have been discussed in detail. Challenging areas such as manufacturing planning and shop floor control have received less consideration. These areas are related to scheduling, product quality, work-in-progress, cost reduction and fault diagnosis. A tool called DBMine was developed that uses Bacon's algorithm, decision trees, DB learn and a genetic algorithm to find patterns in job shop scheduling sequences [17], [18]. Reference [19] explored the use of data mining for lead time estimation in make-to-order manufacturing with a regression tree approach. Reference [2] also notes that the data mining community has paid little attention to shop floor control, which is the lowest level of control in production planning.
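To make the notion of a simple dispatching rule at the shop floor level concrete, the following sketch sequences jobs on a single machine under two classic rules, shortest processing time (SPT) and earliest due date (EDD), and reports total tardiness. The two rules, the job data and the definition of tardiness as max(0, completion time - due date) are assumptions made for illustration; the reviewed papers use richer rule bases and shop models.

```python
def schedule_tardiness(jobs, rule):
    """Sequence jobs on one machine by a dispatching rule.

    jobs: list of (name, processing_time, due_date) tuples.
    rule: key function that gives each job its dispatching priority.
    Returns the job sequence and the total tardiness of that sequence.
    """
    sequence = sorted(jobs, key=rule)
    clock = 0
    total_tardiness = 0
    for _name, processing_time, due_date in sequence:
        clock += processing_time  # completion time of this job
        total_tardiness += max(0, clock - due_date)
    return [job[0] for job in sequence], total_tardiness


def spt(job):
    """Shortest processing time first."""
    return job[1]


def edd(job):
    """Earliest due date first."""
    return job[2]


# Hypothetical jobs: (name, processing time, due date), in the same time unit.
jobs = [("A", 4, 6), ("B", 2, 10), ("C", 6, 8), ("D", 3, 5)]

print("SPT:", schedule_tardiness(jobs, spt))
print("EDD:", schedule_tardiness(jobs, edd))
```

Neither rule dominates in general: on this data SPT happens to give the lower total tardiness, while EDD is the rule usually associated with limiting the maximum lateness. This dependence on the current job mix is one reason why selecting the appropriate rule at run time is treated as a learning problem in [14].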
A few studies have addressed this area. Association rules have been used to identify the machines that occur together with a given machine in a cell [20], and a data-mining-based production control approach using decision trees has been proposed for a testing and rework cell in a dynamic computer-integrated manufacturing system [14]. Aggregate production planning involves many uncertain factors, such as the randomness of arriving orders, imprecise information and a fuzzy environment. For this reason, [22] proposed building fuzzy models and an inexact approach using genetics-based algorithms, which imitates human decision making in production planning. Table I shows the problems in production planning, grouped by category, that have been addressed with AI techniques [2], [16]-[19], [22]. These include knowledge based models, artificial neural networks, fuzzy logic, genetic algorithms and decision trees. Artificial neural networks can also be applied to a variety of production planning and control problems,
and can be used for lead time estimation as an alternative to regression trees [19].
TABLE I
A LIST OF AI TECHNIQUES IN MANUFACTURING IN PAST RESEARCH
Techniques | Problem | Description
Knowledge based model | Production planning | Dealing with problems in both parallel workstation clusters and batch processors, involving several job types that exhibit different arrival patterns, process plans, precedence sequences and batch sizes.
Artificial neural networks | Job shop scheduling | Solving problems regarding due dates, operation times and technology constraints.
Fuzzy logic | Aggregated planning | Finding a family of preferred solutions that provides more candidates than the exact approach.
Genetic algorithm | Job shop scheduling | Minimizing the product design time and the maximum lateness in the presence of dynamic job arrivals.
Regression tree | Job shop control | Improving lead time (LT) estimation.
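The association rule approach mentioned in this section, finding which machines tend to occur together in a cell, rests on two standard measures, support and confidence. The sketch below computes both over a handful of part routings; the machine names and routings are hypothetical, and the code illustrates the measures only, not the rule induction procedure of [20].

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical part routings: the set of machines each part visits.
routings = [
    {"M1", "M2", "M3"},
    {"M1", "M2"},
    {"M2", "M4"},
    {"M1", "M2", "M4"},
    {"M3", "M4"},
]

print("support(M1, M2)  =", support(routings, {"M1", "M2"}))
print("conf(M1 -> M2)   =", confidence(routings, {"M1"}, {"M2"}))
print("conf(M2 -> M1)   =", confidence(routings, {"M2"}, {"M1"}))
```

A rule such as M1 -> M2 with high support and confidence suggests that the two machines are candidates for the same cell; the thresholds used to accept a rule are a modeling choice.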
VI. CONCLUSION
This paper presented a brief review of data mining tasks and methods from the perspective of the manufacturing field, and of production planning and scheduling across the three time ranges of operations management, which give rise to many types of planning and scheduling. The problems that arise in scheduling were discussed for each time range of manufacturing scheduling, together with related past research linking data mining and manufacturing scheduling. Past research shows that many problems still cannot be solved because of uncertain and changeable demand and operation systems. Much research has been done using different data mining algorithms and hybrid approaches, and systems or tools have been developed to solve problems in aggregate planning and shop floor control. Data mining can also be used to improve
quality control, but this paper focuses not on quality data but on production machine and process planning in scheduling.
REFERENCES
[1] Lee, S. J. & Siau, K. 2001. A review of data mining techniques. Industrial Management & Data Systems. 101 (1): 41-46.
[2] Harding, J. A., Shahbaz, M., Srinivas & Kusiak, A. 2006. Data mining in manufacturing: a review. Transactions of the ASME, Journal of Manufacturing Science and Engineering. 128 (4): 969-976.
[3] Aytug, H., Bhattacharya, S., Koehler, G. J. & Snowdon, J. L. 1994. A review of machine learning in scheduling. IEEE Transactions on Engineering Management. 41 (2): 165-171.
[4] Zaiane, O. R. 1999. Principles of knowledge discovery in databases. [5 August 2009].
[5] Fayyad, U. M. 1996. The primary tasks of data mining. http://www2.cs.uregina.ca/~hamilton/courses/831/notes/kdd/2_tasks.html [3 August 2009].
[6] Han, J. & Kamber, M. 2001. Data Mining: Concepts and Techniques. Data mining on what kind of data. Academic Press. Morgan Kaufmann.
[7] Berson, A., Smith, S. & Thearling, K. 1999. Building Data Mining Applications for CRM. McGraw-Hill Professional.
[8] Russell, R. S. & Taylor III, B. W. 2003. Operations Management, 4th Edition. Prentice Hall.
[9] Knosala, R. & Wal, T. 2001. A production scheduling problem using genetic algorithm. Journal of Materials Processing Technology. 109 (1-2): 90-95.
[10] Contreras, A. R., Valero, C. V. & Pinninghoff, J. M. A. 2005. Applying genetic algorithms for production scheduling and resource allocation: special case: a small case manufacturing company. Proceedings of the 18th International Conference on Innovations in Applied Artificial Intelligence, pp. 547-550.
[11] Chase, R., Jacobs, R. & Aquilano, N. 2006. Operations Management for Competitive Advantage. 11th ed., Irwin/McGraw-Hill, New York.
[12] Krajewski, L. J. & Ritzman, L. P. 2005. Operations Management: Process and Value Chains. Prentice Hall.
[13] Romli, A., Ameendeen, M. A. & Ahmad, N. 2006. Methodology of rolling horizon scheduling under demand uncertainty. International Conference on Technology Management 2006.
[14] Kwak, C. & Yih, Y. 2004. Data mining approach to production control in the computer integrated testing cell. IEEE Transactions on Robotics and Automation. 20 (1): 107-116.
[15] Tang, O. & Grubbstrom, R. W. 2002. Planning and replanning the master production schedule under demand uncertainty. International Journal of Production Economics. 78: 323-334.
[16] Gupta, A. K. & Sivakumar, A. I. 2006. Job shop scheduling techniques in semiconductor manufacturing. The International Journal of Advanced Manufacturing Technology. 27 (11-12): 1163-1169.
[17] Koonce, D. A., Fang, C.-H. & Tsai, S.-C. 1997. A data mining tool for learning from manufacturing systems. Computers & Industrial Engineering. 33 (3-4): 27-30.
[18] Koonce, D. A. & Tsai, S.-C. 2000. Using data mining to find patterns in genetic algorithm solutions to a job shop schedule. Computers & Industrial Engineering. 38 (3): 361-374.
[19] Ozturk, A., Kayahgil, A. & Ozdemirel, N. E. 2006. Manufacturing lead time estimation using data mining. European Journal of Operational Research. 173: 968-976.
[20] Chen, M. C. 2003. Configuration of cellular manufacturing systems using association rule induction. International Journal of Production Research. 41 (2): 381-395.
[21] Wang, K. 2007. Applying data mining to manufacturing: the nature and implications. Journal of Intelligent Manufacturing. 18: 487-495.
[22] Wang, D. & Fang, S.-C. 1997. A genetics-based approach for aggregated production planning in a fuzzy environment. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans. 27 (5): 636-645.