Online Simulation Based Decision Support System for Resource Failure Management in Multi-Site Production Environments

Sebastian Bohlmann, Matthias Becker, Sinan Balci, Helena Szczerbicka
FG Simulation, Gottfried Wilhelm Leibniz Universität Hannover, 30167 Hannover, Germany
{bohlmann, xmb, hsz}@sim.uni-hannover.de

Eric Hund
IPH - Institut für Integrierte Produktion, Hollerithallee 6, 30419 Hannover, Germany
[email protected]
Abstract

Planning in a multi-site, non-mass production environment is a particular challenge because of several sources of uncertainty. Unlike in mass production facilities, the current state in our setting is not easily and exactly known when re-planning becomes necessary. The planning procedure has to account for this fact, as well as for further uncertainty about the effects of a plan when the plan is evaluated. In this work, we therefore apply online simulation as a means of re-planning multi-site production in the case of resource failure. This work is a first step in which two alternatives are considered when a resource fails: either wait for the resource to be repaired, or transport another instance of this resource from another site, if more than one is available there. Our study shows that planning with online simulation is superior to a static strategy such as 'always wait for repair' or 'always import resource' in case of resource failure.
1. Introduction

Online simulation is a known tool for achieving efficiency gains in different problem settings. In this context, simulation is used to forecast the future behavior of a system, and a decision that influences the future is made on the basis of this prediction. For deterministic systems and their corresponding models, the prediction is often accurate enough to give good results. For non-deterministic systems with uncertain or stochastic elements, simulation offers the chance to make a statistically well-founded prediction. However, computing this prediction is much more expensive since, depending on the model, the simulation must be repeated many times. If the model's complexity increases and side effects occur, decision support becomes problematic. Combining an automated decision process with a predictive evaluation of simulation results could lead to multi decision support systems.

In this paper we focus on non-deterministic models of assembly processes in multi-site assembly [11, 3]. One source of non-determinism is stochastic process durations, since our setting is not nearly deterministic mass production. In addition, production resources can fail randomly, which is a main cause of congestion in production. In simple single-site production environments there is often no real choice: the only option is to repair the failed machine. In multi-site environments there may be several options at this point. If a second site has the same resource, this resource can simply be transferred to the first site. However, the transported resource is then no longer available at its original site, which slows down production there. Because one resource has to travel from site to site while the failed resource still needs to be repaired, the overall downtime increases. Despite that, in some cases it can be advantageous to transport an operational resource when another one fails. Online simulation based on the current state of a production line can assess whether a transport is advantageous or not.

In this paper we present a simplified model of two identical production sites with the ability to transport one resource from site to site on failure. To decide whether it is favorable to transport a resource or to wait for repair, stochastic online simulation is used, and the predictions are used to influence the ongoing simulation. This can also be seen as an online scheduling technique [10]. To keep the approach applicable to more complex models in the future, we present some methods for simplified deployment of online simulation using general-purpose models. All of these methods are at
least partially implemented and tested.

978-1-4799-0864-6/13/$31.00 ©2013 IEEE

1.1. Related Work

Resource allocation [9] is a task related to the problem presented in this paper, but resource failure management takes some additional aspects into account. In general, the predictions obtained from online simulation can improve production planning decisions. Even nearly perfect decision support can be derived for specific problem subsets [5]. More complex and realistic models of real-life applications usually require high-level modelling methods [6, 8], although all of these can be reduced to the discrete event simulation (DES) used in this paper. Other known models share main characteristics with the problems in non-mass multi-site production environments [12, 9]. Usually these are not specifically designed for online simulation; if they are, the systems become more complex [4]. On the other hand, smarter algorithms can then be used to arrive at a decision proposal. Specific multi-site optimisation strategies and an overview of methods can be found in [1].

1.2. Simulation System

Online simulation poses a couple of challenges that have less impact in conventional simulation. Deadlines for the result are much harder to meet, since stochastic simulation requires repeated runs. To get good results, an exact definition of the current state of the simulation is necessary. Although this is usually a simple task for a model, modern simulation environments often hide significant parts of the model state while in operation. To solve this, we implemented a simulator environment capable of saving and, in particular, copying the model state very efficiently. This feature is then used to replicate the simulation instances that produce the prediction. The same problem applies to the deterministic parts of the simulation: to obtain correct online simulation results, the simulation system itself has to be state free.
This applies, for example, to embedded pseudo-random generators or to the deterministic execution of events occurring in parallel. The implemented simulation system currently uses a conventional simulator as its base system; here we used DesmoJ as the subordinate simulator. In future this could be replaced by a different simulator.

To strictly encapsulate the state of the model instance used for a simulation run, we use two methods. First, Java annotations are used to mark non-relevant stateful variables. These annotations are then processed using online bytecode optimization techniques based on the ASM library. This ensures that no such state can be copied between different simulation instances. Second, specialized simulation classloaders separate all simulation runs inside the JVM to prevent access between the independent simulations. This approach ensures that even static references inside the model classes are decoupled completely. For example, if a singleton pattern is used inside the model classes, one independent singleton is generated for each simulation. This is made possible by the fact that an object type within the JVM is always defined by both the class name and the classloader used. As a result, multiple simulations are obtained that use the same simulation model classes, but no class in one simulation is equal to its counterpart in a different simulation, although the class names are equal.

The use of online bytecode modification also makes it possible to resynchronize different simulations very efficiently, in other words, to reuse finished online simulations to generate new ones. This pooling of simulation models is much more efficient in terms of memory management and, as a consequence, execution time.

Taken together, two goals are achieved this way. First, it is much simpler to comply with the timing constraints. Second, the correctness and independence of the results generated in online simulation mode are guaranteed. One useful byproduct is that a variety of existing, verified models can be used directly for online simulation, for example the one described in the next section. The combination of techniques makes model development independent of the online simulation application, which helps to prevent various errors made early in the modeling phase.
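The idea of an explicit, copyable model state with independently seeded replicas can be illustrated outside of Java. The following is a minimal Python sketch, not the DesmoJ-based implementation described above: the names `ModelState`, `replicate` and `step` are hypothetical, and `copy.deepcopy` stands in for the efficient state copy. The point it shows is that the random generator and the pending events are part of the state, so that replicas share nothing with the paused original.

```python
import copy
import heapq
import random
from dataclasses import dataclass, field


@dataclass
class ModelState:
    """Complete, explicit state of one simulation instance (hypothetical)."""
    clock: float = 0.0
    finished_jobs: int = 0
    # Pending events as (time, sequence number, event name) tuples in a heap.
    pending: list = field(default_factory=list)
    # The random generator is part of the state, not a hidden global.
    rng: random.Random = field(default_factory=lambda: random.Random(0))


def replicate(state: ModelState, n: int, master: random.Random) -> list:
    """Create n independent copies of state, each reseeded from master."""
    clones = []
    for _ in range(n):
        clone = copy.deepcopy(state)  # no mutable object is shared
        clone.rng = random.Random(master.getrandbits(64))
        clones.append(clone)
    return clones


def step(state: ModelState) -> None:
    """Process the next pending event and schedule a follow-up event."""
    time, seq, name = heapq.heappop(state.pending)
    state.clock = time
    state.finished_jobs += 1
    delay = state.rng.expovariate(1.0)
    heapq.heappush(state.pending, (time + delay, seq + 1, name))


# A paused "main" simulation with one pending event.
main = ModelState()
heapq.heappush(main.pending, (1.0, 0, "job_done"))

# Three replicas run ahead; the main simulation stays untouched.
clones = replicate(main, 3, random.Random(42))
for clone in clones:
    for _ in range(5):
        step(clone)

print([round(c.clock, 3) for c in clones], main.clock)
```

Because every replica draws its seed from one master generator, a whole experiment stays reproducible from a single seed while the replicas still evolve differently.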
Figure 1. Simplified demonstration model
2. Production Environment

The model used in this paper consists of a setup with two production sites (plant 1 and plant 2). Both sites produce the same two products, referred to as X and Y. There are four production steps (I, A, B and O) at both sites. Product X passes through three production steps in sequential order, I → A → O. Analogously, product Y is produced using the steps I → B → O. The time consumed in each step is normally distributed, and each step uses one of the two production resources placed at that step. The consecutive production starts are calculated using a Poisson distribution (referred to as the injection rate). Figure 1 gives a simplified overview of the described setup.

Resources at production step B of plant 1 can fail. To simplify the evaluation in the next section, failures occur at a fixed rate. The distance between the individual failure events is chosen such that the system reaches its steady state in the meantime. Interaction between failure events is thereby excluded in this experiment (see the further work section). Resources of type B can be transported between B1 and B2. The time consumed by this operation is fixed, assuming an almost constant travel speed. The repair time of a failed resource is drawn from an Erlang distribution. No repair is executed while a defective resource is being transported. All measurements refer to the entire system, that is, to both production lines.

The model is implemented using separate Java classes for all parts of the model. The basic methodology is discrete event simulation [2]. The corresponding process-based model was also implemented. As both behave statistically the same way, the process-based model was not used because of its poor simulation performance, which was caused by the DesmoJ simulator framework using one thread per simulated process. To verify the basic model (without transportation), simulation results were compared with analytic results after replacing all distributions with exponential distributions.
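The verification step mentioned at the end of this section, comparing simulation output with analytic results under exponential distributions, can be illustrated on a much smaller system than the full model. The Python sketch below is an assumption-laden stand-in: it looks at a single production step with two resources, Poisson arrivals and exponential processing times (an M/M/2 queue), and compares the simulated mean waiting time with the Erlang C formula. The rates and the function names are illustrative and do not come from the paper.

```python
import heapq
import math
import random


def erlang_c_wait(lam: float, mu: float, c: int) -> float:
    """Analytic mean waiting time in queue for an M/M/c station."""
    a = lam / mu                       # offered load
    rho = a / c                        # utilisation per resource
    tail = a**c / math.factorial(c) / (1.0 - rho)
    p_wait = tail / (sum(a**k / math.factorial(k) for k in range(c)) + tail)
    return p_wait / (c * mu - lam)


def simulate_wait(lam: float, mu: float, c: int, jobs: int, seed: int) -> float:
    """Simulated mean waiting time for the same station (FIFO order)."""
    rng = random.Random(seed)
    free_at = [0.0] * c                # heap: time at which each resource is free
    arrival = 0.0
    total_wait = 0.0
    for _ in range(jobs):
        arrival += rng.expovariate(lam)          # exponential inter-arrival time
        earliest = heapq.heappop(free_at)        # resource that frees up first
        start = max(arrival, earliest)
        total_wait += start - arrival
        heapq.heappush(free_at, start + rng.expovariate(mu))
    return total_wait / jobs


# Illustrative rates (assumed): one arrival and one completion per time unit.
analytic = erlang_c_wait(lam=1.0, mu=1.0, c=2)
simulated = simulate_wait(lam=1.0, mu=1.0, c=2, jobs=200_000, seed=1)
print(f"analytic {analytic:.4f}, simulated {simulated:.4f}")
```

If the two values disagree by more than the statistical noise of the simulation, the model logic (queueing discipline, resource handling) is suspect; only after such a check are the exponential distributions replaced by the ones actually used.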
3. Experiments

The exemplary system is implemented using the DesmoJ simulator [7]. The model has been implemented as described in the previous section. Operations are not interrupted: if one plant has issued a transfer request, the operation currently running on the requested resource is finished before the resource is moved. The same applies in the reverse direction after the repair of the failed resource is finished. If neither of the two resources at a production step is available, the simulation run is currently aborted. This event did not occur in the experiments shown here.

The basic experimental execution flow is shown in Figure 3. The main simulation is the entry point of all experiments. If one resource breaks down, the main simulation is paused and all system state (state variables, call stack, pending events) is saved to memory. Next, 1000 new simulations for each possible action are replicated from this state, and all random seeds are reset from a global persistent random generator. These simulations are executed until a stable system is reached. The aggregated results are analyzed and the best action is chosen based on a metric. This action is then injected into the paused main simulation, which is restarted. For the presented experiment, the metric for choosing the best action is the average throughput over each corresponding set of 1000 simulations.

Figure 3. Sequence of simulations

Each simulation run consists of 10 independent resource failures. To get more stable results, each of these experiments is repeated 1000 times to obtain one measurement; Figure 2 shows the resulting averages. The ordinate shows time in units of simulation time. The abscissa of Figure 2 denotes the mean injection rate of the two order types into steps I1 and I2. A value of 100, for example, means that production of each of the two products is started on average once every 100 simulation time units. The full load of the described system is only approximately known from long-running simulation and is around one injection per 120 time units for the system without failures. This is referenced in Figure 2 as the optimum. There is almost no difference between the transport and the repair action (the two upper lines). The line in between belongs to the online simulation based decision system. Obviously, the maximum benefit is obtained if the system is operated slightly below the full load point. Take, for example, a static decision and a system with an injection time of 160. If a variable system is used instead, it needs a mean value of only about 135 time units, which is an absolute improvement of 15%. Taking the failure-free simulation as a reference, this results in an effective increase of about 45%. To give an idea of the effort, Figure 2 is based on about 3,000,000 simulation runs. For light loads the static decisions converge. The corresponding value is the optimum value plus the average downtime of the resources.
At this point no efficiency gain can be achieved with the method demonstrated in this paper.

3.1. Conclusion

Our study shows that planning with online simulation is superior to a static strategy such as 'always wait for repair' or 'always import resource' in case of resource failure. The reason is that a decision based on online simulation takes the current state of the system into account. When a resource fails in one system, it depends on the current state of the second system whether exporting a resource is feasible or not. If the second system has a high load, it might be better to wait for the resource in the first system to be repaired. We demonstrated that online simulation can simplify the decision process. In this paper a simple value comparison leads to an improvement of 45%. Using more sophisticated decision methods, as mentioned in Section 1.1, would probably increase the improvement further. Online simulation could provide very useful input parameters to an automated decision process. In the case of resource failure management, prediction systems are able
to provide accurate results by concentrating all processing power on a single decision point. For easy application of prediction based decision support, most simulation tools are at the moment not optimal.

Figure 2. Comparison of simulation results (curves: Optimum, Repair, Transport, Online; abscissa 90 to 190; ordinate 0 to 12000)
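The decision procedure of Section 3 (pause at the failure, replicate the saved state for each action, reseed every replica from a global generator, compare the average throughput) can be sketched in a few lines of Python. This is a deliberately crude toy, not the authors' DesmoJ model: time is discretised, only step B of the two sites is represented, the horizon is fixed instead of running "until a stable system is reached", 200 rather than 1000 replications are used, and all numeric parameters are assumptions chosen for illustration.

```python
import copy
import random
from dataclasses import dataclass
from statistics import mean

HORIZON = 100             # simulated time steps after the failure (assumed)
TRANSPORT_TIME = 20       # fixed travel time between the sites (assumed)
REPAIR_STAGE_MEAN = 30.0  # mean of each of the two Erlang stages (assumed)


@dataclass
class State:
    """Snapshot of the toy model at the moment a resource fails at site 1."""
    backlog: list         # jobs waiting at step B of [site 1, site 2]
    arrival_prob: float   # chance of a new job per site and time step
    service_prob: float   # chance that a busy resource finishes its job


def resources(action, t, repair_time):
    """Operational step-B resources at [site 1, site 2] at time t."""
    if action == "repair":
        return [2 if t >= repair_time else 1, 2]
    # "transport": site 2 gives up one resource at once and gets one back
    # a travel time after both the repair and the first transfer are done.
    second_at_site1 = min(TRANSPORT_TIME, repair_time)
    returned = max(TRANSPORT_TIME, repair_time) + TRANSPORT_TIME
    return [2 if t >= second_at_site1 else 1, 2 if t >= returned else 1]


def run(state, action, rng):
    """Simulate one possible future; return the number of finished jobs."""
    s = copy.deepcopy(state)  # never touch the paused main state
    repair_time = sum(rng.expovariate(1.0 / REPAIR_STAGE_MEAN) for _ in range(2))
    done = 0
    for t in range(HORIZON):
        available = resources(action, t, repair_time)
        for site in (0, 1):
            if rng.random() < s.arrival_prob:
                s.backlog[site] += 1
            busy = min(available[site], s.backlog[site])
            finished = sum(rng.random() < s.service_prob for _ in range(busy))
            s.backlog[site] -= finished
            done += finished
    return done


def decide(state, actions, replications, master):
    """Pick the action with the highest average throughput."""
    scores = {}
    for action in actions:
        results = []
        for _ in range(replications):
            rng = random.Random(master.getrandbits(64))  # reseed each replica
            results.append(run(state, action, rng))
        scores[action] = mean(results)
    return max(scores, key=scores.get), scores


ACTIONS = ("repair", "transport")
site1_loaded = State(backlog=[100, 0], arrival_prob=0.3, service_prob=0.5)
site2_loaded = State(backlog=[0, 100], arrival_prob=0.3, service_prob=0.5)
choice_1, scores_1 = decide(site1_loaded, ACTIONS, 200, random.Random(7))
choice_2, scores_2 = decide(site2_loaded, ACTIONS, 200, random.Random(7))
print(choice_1, scores_1)
print(choice_2, scores_2)
```

The two example states differ only in where the backlog sits, which is exactly the situation described in the conclusion: the same failure calls for a different action depending on the load of the second site, something a static rule cannot express.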
4. Summary and Further Work

As mentioned before, these exemplary experiments cover an independent single-point decision. If multi-point decisions are considered, the problem turns into a decision tree. It is therefore absolutely necessary to speed up simulation as much as possible to meet the real-time requirements of a modern production planning system. To this end, a specialized online simulator is currently under development. The first simulator prototypes also implement the model described above for verification purposes. Speedups of up to a factor of 800 could be observed using state de-duplication and polymorphic code generation techniques. In addition, branch cutting methods are being evaluated using Petri net analysis methods. More complex models are being built with industrial partners using template-based building blocks that map to Petri nets. In the end, this research could lead to more efficient construction site production.
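Why a decision tree makes simulation speed critical can be seen from a rough count of simulation runs. The small Python sketch below assumes the simplest possible scheme, which is not necessarily what the authors plan: every sequence of actions over the successive decision points is evaluated with the full number of replications and no branch is cut. The function name `required_runs` is hypothetical.

```python
def required_runs(actions: int, decision_points: int, replications: int) -> int:
    """Runs needed if every action sequence is replicated in full (assumed scheme)."""
    return replications * actions ** decision_points


# Two actions (repair, transport) and 1000 replications as in Section 3.
for depth in (1, 2, 5, 10):
    print(depth, required_runs(actions=2, decision_points=depth, replications=1000))
```

With one decision point this gives the 2 × 1000 = 2000 runs used per failure in the experiments; with ten decision points it already gives 1,024,000 runs for a single decision, which is why branch cutting and a faster simulator are both of interest.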
References

[1] N. Aissani, A. Bekrar, D. Trentesaux, and B. Beldjilali. Dynamic scheduling for multi-site companies: A decisional approach based on reinforcement multi-agent learning. Journal of Intelligent Manufacturing, 23(6):2513–2529, 2012.
[2] J. Banks, J. S. Carson, B. L. Nelson, and D. M. Nicol. Discrete-Event System Simulation. Prentice-Hall, Inc., Upper Saddle River, New Jersey, 3rd edition, 2000.
[3] S. Chung, H. Lau, K. Choy, G. Ho, and Y. Tse. Application of genetic approach for advanced planning in multi-factory environment. International Journal of Production Economics, 127(2):300–308, 2010. Supply Chain Planning and Configuration in the Global Arena.
[4] M. Gnoni, R. Iavagnilio, G. Mossa, G. Mummolo, and A. Di Leva. Production planning of a multi-site manufacturing system by hybrid modelling: A case study from the automotive industry. International Journal of Production Economics, 85(2):251–262, 2003.
[5] P. Goodwin, D. Önkal, and M. Thomson. Do forecasts expressed as prediction intervals improve production planning decisions? European Journal of Operational Research, 205(1):195–201, 2010.
[6] C. M. Macal and M. J. North. Tutorial on agent-based modelling and simulation. J. Simulation, 4(3):151–162, 2010.
[7] P. Martin. The development of an object-oriented, discrete-event simulation language using Java. In Proceedings of the Fourth Asia-Pacific Software Engineering and International Computer Science Conference, APSEC '97, pages 123–, Washington, DC, USA, 1997. IEEE Computer Society.
[8] P. Renna. Multi-agent based scheduling in manufacturing cells in a dynamic environment. International Journal of Production Research, 49(5):1285–1301, 2011.
[9] N. Taghezout, I. Bessedik, and A. Adla. Application to resource allocation problems in a flow-shop manufacturing system. Journal of Decision Systems, 20(4):443–466, 2011.
[10] M. Thomas and H. Szczerbicka. Evaluating online scheduling techniques in uncertain environments. In 3rd Multidisciplinary International Scheduling Conference: Theory and Applications, Paris, France, 2007.
[11] C. H. Timpe and J. Kallrath. Optimal planning in large multi-site production networks. European Journal of Operational Research, 126(2):422–435, 2000.
[12] M. Webster, A. P. Muhlemann, and C. Alder. Decision support for the scheduling of subcontract manufacture. International Journal of Operations & Production Management, 20(10):1218–1237, 2000.