JQME 4,1
66
World-class maintenance using a computerised maintenance management system
Ashraf W. Labib
University of Manchester Institute of Science and Technology (UMIST), Manchester, UK

Introduction
With the increasing demand on productivity, quality, and availability, machines have become more complex and capital intensive. Developing and implementing a maintenance programme is a difficult process that often lacks a systematic and consistent methodology. In addition, since the process of developing the programme involves different parties with interests in maintenance, it becomes difficult to satisfy all of these parties and at the same time achieve the objectives of the company. Developing a maintenance programme is an iterative process that involves different decision makers, who may have conflicting objectives. In deriving these objectives, maintenance managers usually try to achieve multiple, and sometimes conflicting, objectives such as maximising throughput, availability, and quality, subject to constraints on production plan, available spares, manpower, and skills.

World class
World class can be defined as a tool used to search for and allow a company to perform at a best-in-class level. It is useful to use the plant as the level of analysis because, although world-class manufacturing (WCM) is a strategic approach, many of its measurable improvement initiatives have occurred at the plant level (Flynn et al., 1989; Mackenzie, 1997). Strategic considerations and operational decisions are influenced by other corporate functions such as production, finance, quality, and human resources. It is true that the information gathered by these systems at the operational level and the actions taken are in fact strategic: improved asset availability, productivity, and quality, as well as resource management, inventory control, planning, and so on.
Journal of Quality in Maintenance Engineering, Vol. 4 No. 1, 1998, pp. 66-75 © MCB University Press, 1355-2511
Maintenance and CMMS
It is believed that an available CMMS (computerised maintenance management system) is not an aim in itself, but rather a platform for decision analysis that can lead to the development of the world-class model. This coincides with the view presented by Olafsson (1990) regarding TPM implementation, where it is emphasised that

… an early prerequisite for TPM implementation in a company, is the development of a CMMS for collecting and recording the data for TPM implementation.

However, experience gained by the author in developing computerised maintenance management systems (CMMS) in several automotive industries (Labib, 1996; Labib et al., 1996a, b; 1997a) has shown that managers rely on such systems for data collection and data analysis, but seldom for decision analysis. A data collection system that can gather relevant data is a prerequisite for world-class status. It is advantageous to have a data collection system that is real-time, and that can handle data related to frequency and duration of maintenance breakdowns as well as spare parts costs, while enhancing and utilising the knowledge of operators and maintenance engineers. A practical example of such a system is the breakdown recording system developed at Meritor (UK) (Labib et al., 1997b), where data are collected systematically on a real-time basis and analysed accordingly. The two objectives of this paper are to demonstrate a practical methodology for adding value to collected data by offering decision analysis, and to facilitate the link between preventive maintenance and emergency maintenance in an adaptable and dynamic approach.

Model implementation
The first step towards minimising emergency maintenance was to analyse breakdowns by dividing the time spent on a breakdown into phases. The second step was to develop solutions that could minimise each of the identified phases. Although each of those solutions was a relatively simple improvement idea, the total outcome was a substantial improvement. Different phases and their corresponding solutions are illustrated in Figure 1. In general, downtime was categorised into three main phases: response phase, diagnostic phase, and repair phase.

The response phase is the time between the occurrence of a breakdown and the attendance of a maintenance engineer. It passes through the sub-phases of realisation by the production operator, reporting to the CMMS, and finally logging on by the maintenance technician.
It is considered purely wasted time that ought to be not just minimised but eliminated totally. The solutions implemented were to network the CMMS between the production and maintenance departments, and to link the CMMS to alarms and bleepers via a PLC (programmable logic controller).

The diagnostic phase is the time spent in identifying the cause of a breakdown and the means of rectifying it. In advanced machinery this phase often consumes the majority of the time taken to deal with breakdown problems. In order to minimise this phase, solutions were implemented to address two main requirements: improving skills, and providing knowledge regarding past maintenance actions done for the machine in question. Solutions included the provision of past maintenance actions during logging on by maintenance, as well as the provision of multi-levelled analytic reports examining failure mode trees using a mathematical prioritisation tool called the analytic hierarchy process (AHP) (Labib et al., 1998) that is related to TPM schedules.

The repair phase is the time taken to implement rectification of the breakdown problem. This phase often needs the provision of adequate tools and spare parts, in addition to the skill needed to carry out the action. This phase was minimised by the development of a maintenance spare parts ordering system that automated the ordering of spares as well as the monitoring of suppliers. In addition, a decision making grid (DMG) was developed to help in monitoring the performance of machines and suggesting appropriate actions. This solution is the subject of the remaining part of the paper. The DMG helps in the repair phase and, more importantly, it helps in minimising and preventing breakdowns from occurring in the first place.

Figure 1. Downtime phases and solutions implemented
(Figure title: “How Did We Get Here? Breakdowns have been reduced by 80% and Corrective Actions by 60%.” Downtime is shown as waste time, diagnostic, and repair phases, bounded by production log on (breakdown starts), maintenance log on, and maintenance log off (breakdown finishes). Solutions listed under “Minimised by”: networking, alarm, bleep, TPM, operator help menus of previous faults, failure mode analysis, MCDM, automated ordering, and DMG.)

The main idea of the model is to transfer data collected from an existing computerised maintenance system, such as the data shown in Figure 2, into decision analysis for management using a “decision making grid” as shown in Figure 3. The first step is to extract the worst performing machines based on different criteria. This is shown in Figure 2, where the top ten worst machines according to both downtime and number of calls are presented. As shown, according to the downtime criterion, 89 per cent of the downtime originated from ten machines out of the total of 130 machines available. As for the frequency of calls criterion, 81 per cent of the calls originated from ten machines. The next step is to subdivide both criteria, downtime and frequency, into high, medium and low values. For example, downtime is considered low if less
than 10 hours, high if more than 20, and medium if between 11 and 19. The same applies to the criterion of frequency but using a different scale. The objective is to try to cater for the ten worst machines under both criteria.

Figure 2. Criteria evaluation

Criterion: Downtime
Name             Downtime (hrs)   Group
Machine [A]      30               HIGH
Machine [B]      20               HIGH
Machine [C]      20               HIGH
Machine [D]      17               MEDIUM
Machine [E]      16               MEDIUM
Machine [F]      12               MEDIUM
Machine [G]      7                LOW
Machine [H]      6                LOW
Machine [I]      6                LOW
Machine [J]      4                LOW
Sum of Top 10    138
Sum of All       155
Percentage       89%

Criterion: Frequency
Name             Frequency (No. off)   Group
Machine [G]      27                    HIGH
Machine [C]      16                    HIGH
Machine [D]      12                    HIGH
Machine [A]      9                     MEDIUM
Machine [I]      8                     MEDIUM
Machine [E]      8                     MEDIUM
Machine [k]      8                     MEDIUM
Machine [F]      4                     LOW
Machine [B]      3                     LOW
Machine [H]      2                     LOW
Sum of Top 10    97
Sum of All       120
Percentage       81%

Decision rules
The final step is to place the machines in the “decision making grid” shown in Figure 3 and, accordingly, to recommend maintenance decisions to management. This grid acts as a map where the performances of the worst machines are placed based on multiple criteria. The objective is to implement appropriate actions that will lead to the movement of machines towards the north-west section of low downtime and low frequency. In the top-left region, the action to implement, or the rule that applies, is OTF (operate to failure). The rule that applies for the bottom-left region is SLU (skill level upgrade) because data collected from breakdowns attended by maintenance engineers indicate that machine [G] has been visited many times (high frequency) for
limited periods (low downtime). In other words, maintaining this machine is a relatively easy task that can be passed to operators after upgrading their skill levels. A machine that is located in the top-right region, such as machine [B], is a problematic machine, in maintenance words “a killer”. It does not break down frequently (low frequency), but when it stops it is usually a big problem that lasts for a long time (high downtime). In this case the appropriate action to take is to analyse the breakdown events and closely monitor its condition, i.e. condition based monitoring (CBM). A tool that is useful for this kind of analysis is the multiple criteria decision making (MCDM) system based on AHP, where failure criteria and their sub-criteria are prioritised. A machine that enters the bottom-right region is considered to be one of the worst performing machines based on both criteria. It is a machine that maintenance engineers are used to seeing not working rather than performing normal operating duty. A machine of this category, such as machine [C], will need to be structurally modified and major design-out projects need to be considered, and hence the appropriate rule to implement will be design out maintenance (DOM).

Figure 3. Decision making grid (DMG)

                              DOWNTIME
FREQUENCY    Low                      Med.                  High
Low          O.T.F. [H]               T.P.M. (When?) [F]    C.B.M. [B]
Med.         T.P.M. (Who?) [I] [J]    T.P.M. [E]            T.P.M. (What?) [A]
High         S.L.U. [G]               T.P.M. (How?) [D]     D.O.M. [C]

Scale marks: downtime 10 and 20; frequency 5 and 10.
CBM: Condition Based Monitoring; OTF: Operate To Failure; SLU: Skill Level Upgrade; DOM: Design Out M/C.

If one of the antecedents is a medium downtime or a medium frequency, then the rule to apply is to carry on with the preventive maintenance schedules. However, not all of the mediums are the same. There are some regions near the top-left corner where it is “easy” TPM, because the region is near to the OTF region and requires re-addressing only the issues of who will perform the instruction or when the instruction will be implemented. For example, machines [I] and [J] are situated in a region between OTF and SLU, and the question is about who will do the instruction: operator, maintenance engineer, or sub-contractor.
Also, a machine such as machine [F] has been shifted from the OTF region owing to its relatively higher downtime, and hence the timing of instructions needs to be addressed. Other preventive maintenance schedules need to be addressed in a different manner. The “difficult” TPM issues are the ones related to the contents of the instruction itself. It might be the case that the wrong problem is being solved or the right one is not being solved adequately. In this case machines such as [A] and [D] need to be investigated in terms of the contents of their preventive instructions, and expert advice is needed.

Industrial case study
This case study demonstrates the application of the proposed model and its effect on maintenance performance. The application of the model is shown through the experience of a company seeking to achieve world-class status in maintenance. The company has implemented the proposed model, which has had the effect of reducing total downtime from 800 hours per month to less than 100 hours per month as shown in Figure 4.
(Chart for Figure 4: “Breakdown Trends (hrs.)”, vertical axis from 0 to 1200, horizontal axis by month from Nov through the following Nov.)

Company background and methodology
The company is in the automotive sector, producing roof systems, and is part of one of the world leaders in the design and manufacture of automotive systems. In this particular company there are 130 machines, varying from robots and machine centres to manually operated assembly tables. Notice that in this case study only two criteria are used (frequency and downtime). However, if more criteria are included, such as spare parts cost and scrap rate, the model becomes multi-dimensional, with low, medium, and high ranges for each identified criterion. The methodology implemented in this case follows three steps: criteria analysis, decision mapping, and decision support. These steps are presented in Figure 5.

Step 1: criteria analysis
As indicated earlier, the aim of this phase is to establish a Pareto analysis of two important criteria: downtime, the main concern of production; and frequency of calls, the main concern of maintenance. The objective of this phase is to assess how bad the worst performing machines are over a certain period of time, say one month. The worst performers in both criteria are sorted and grouped into high, medium, and low sub-groups. This is presented in Figure 6.

Step 2: decision mapping
The aim of this step is twofold. It scales the high, medium, and low groups, so that the genuinely worst machines under both criteria can be monitored on the grid. It also monitors the performance of different machines and suggests appropriate actions, as explained earlier. This step is shown in Figure 7.

Figure 4. Total breakdown trends per month
Figure 5. Steps for decision analysis
Figure 6. Step 1: criteria analysis
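
The criteria analysis of Step 1 can be illustrated with a short Python sketch. It is only an illustration, not the system used in the case study: the machine values are transcribed from Figure 2, the frequency total of 97 is the sum of the ten listed call counts, and the function name `pareto_share` is a hypothetical helper introduced here.

```python
# Top-ten values transcribed from Figure 2; plant-wide totals as reported there.
downtime_hours = {"A": 30, "B": 20, "C": 20, "D": 17, "E": 16,
                  "F": 12, "G": 7, "H": 6, "I": 6, "J": 4}
call_counts = {"G": 27, "C": 16, "D": 12, "A": 9, "I": 8,
               "E": 8, "k": 8, "F": 4, "B": 3, "H": 2}
TOTAL_DOWNTIME_ALL = 155  # hours, all machines
TOTAL_CALLS_ALL = 120     # calls, all machines


def pareto_share(values, total, top_n=10):
    """Rank machines by value and return (ranked list, top-n sum, share of total)."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)[:top_n]
    top_sum = sum(value for _, value in ranked)
    return ranked, top_sum, top_sum / total


ranked_dt, sum_dt, share_dt = pareto_share(downtime_hours, TOTAL_DOWNTIME_ALL)
ranked_fq, sum_fq, share_fq = pareto_share(call_counts, TOTAL_CALLS_ALL)

print(f"Downtime: top ten = {sum_dt} of {TOTAL_DOWNTIME_ALL} hrs ({share_dt:.0%})")
print(f"Calls:    top ten = {sum_fq} of {TOTAL_CALLS_ALL} ({share_fq:.0%})")
```

With the Figure 2 data this reproduces the reported shares of 89 per cent for downtime and 81 per cent for frequency of calls.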
Figure 7. Step 2: decision mapping
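
The decision mapping of Step 2 amounts to a rule lookup, which the following Python sketch illustrates. Several points are assumptions rather than statements from the paper: the scale marks (10 and 20 for downtime, 5 and 10 for frequency) are read from Figure 3, a value equal to the upper mark is treated as high because machines [B] and [C] (20 hours) are discussed as high-downtime machines, the placement of the “What?” and “How?” labels follows one reading of Figure 3, and the names `band`, `DMG_RULES` and `recommend` are hypothetical.

```python
# Scale marks read from Figure 3 (assumed): downtime in hours, frequency in calls.
DOWNTIME_SCALE = (10, 20)
FREQUENCY_SCALE = (5, 10)


def band(value, scale):
    """Classify a value as 'low', 'medium' or 'high' against (low_mark, high_mark).

    Assumption: a value equal to high_mark counts as high.
    """
    low_mark, high_mark = scale
    if value >= high_mark:
        return "high"
    if value < low_mark:
        return "low"
    return "medium"


# (downtime band, frequency band) -> recommended action, following Figure 3.
DMG_RULES = {
    ("low", "low"): "OTF",
    ("low", "high"): "SLU",
    ("high", "low"): "CBM",
    ("high", "high"): "DOM",
    ("medium", "low"): "TPM (When?)",
    ("low", "medium"): "TPM (Who?)",
    ("high", "medium"): "TPM (What?)",
    ("medium", "high"): "TPM (How?)",
    ("medium", "medium"): "TPM",
}


def recommend(downtime, frequency,
              downtime_scale=DOWNTIME_SCALE, frequency_scale=FREQUENCY_SCALE):
    """Return the grid action for one machine."""
    key = (band(downtime, downtime_scale), band(frequency, frequency_scale))
    return DMG_RULES[key]


# Machines listed under both criteria in Figure 2: (downtime hrs, number of calls).
machines = {"A": (30, 9), "B": (20, 3), "C": (20, 16), "D": (17, 12), "E": (16, 8),
            "F": (12, 4), "G": (7, 27), "H": (6, 2), "I": (6, 8)}
actions = {name: recommend(dt, fq) for name, (dt, fq) in machines.items()}
for name, action in actions.items():
    print(f"Machine [{name}]: {action}")
```

Keeping the scales as parameters means the same rule table can be reused when the ranges of the grid are changed from one month to the next.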
Step 3: decision support
Once the worst performing machines are identified and the appropriate action is suggested, it is a case of identifying the cost of each action, or the amount of money expected to be saved if the appropriate action is implemented. This is done by multiplying the hourly rate of production by the number of production operators allocated to those machines and by the hours wasted on each machine, and then averaging the value for each category. This step is shown in Figure 8.

Results
The result of implementing the decision making grid has been a continuous reduction in total downtime, as indicated by the breakdown trend in Figure 4. Notice that although the same grid is used every month, the range of the scales differs. Since breakdown duration has been reduced, the scale had to be altered accordingly. For example, what used to be considered a “low” downtime (five hours or less) is now considered a “medium” range, and what used to be considered “medium” is now a “high” value. When big problems are dealt with, attention is focused on smaller ones and they are treated as major problems. This shows that a process of continuous improvement is being implemented. In addition, even when the scale is tightened, there is seldom any machine in the DOM region, which shows that a total shift towards the favourable north-west zone of the grid is being achieved.
Figure 8. Step 3: decision support
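
The cost estimate of Step 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: “category” is taken to mean the recommended action, the hourly rate and the operator counts are hypothetical values (the paper does not report them), the hours lost are the downtime figures from Figure 2, and `average_cost_by_action` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean


def average_cost_by_action(records, hourly_rate):
    """Average (hourly rate x operators x hours lost) over the machines in each action.

    records: iterable of (machine, action, operators, hours_lost).
    """
    costs = defaultdict(list)
    for _machine, action, operators, hours_lost in records:
        costs[action].append(hourly_rate * operators * hours_lost)
    return {action: mean(values) for action, values in costs.items()}


HOURLY_RATE = 10.0  # hypothetical cost per operator-hour

# (machine, action, operators allocated, hours lost); operator counts are hypothetical.
records = [
    ("A", "TPM", 2, 30),
    ("D", "TPM", 1, 17),
    ("B", "CBM", 3, 20),
    ("C", "DOM", 2, 20),
]

savings = average_cost_by_action(records, HOURLY_RATE)
for action, value in savings.items():
    print(f"{action}: average expected saving {value:.2f}")
```

The resulting figure per action gives management a rough indication of what each recommended action is worth if the corresponding downtime is eliminated.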
Discussion and conclusion
The main objective of this paper is to emphasise the fact that the right policy to counter any mode of failure is that which most improves the life cycle profit or most reduces the life cycle cost. The logic used throughout our derivation of the model is aimed at reducing the cause of breakdowns by identifying and analysing different criteria such as downtime, frequency, spare parts used, and bottleneck status. Finding and improving the worst machines is not a new concept, as it is the core concept of TPM. However, using a formalised decision analysis approach based on multiple criteria and a rule-based system is the contribution of the presented model. The decision making grid shown in the case study is considered a map where the performances of the worst machines are monitored based on multiple criteria, and actions are taken based on the relative position of the machines in this grid. It is sometimes claimed that all items in a series system with no inter-stage store are of equal importance. This claim is based on the fact that a bottleneck is the one controlling throughput, and hence bottleneck status is the sole criterion for assessment. However, as identified by Goldratt and Cox (1986), bottlenecks often change, and when dealing with machines in terms of breakdown criticality, one reduces their possibility of becoming candidates as bottlenecks by tackling the cause of failure based on multiple criteria rather than one criterion.
References
Flynn, B.B., Bates, K.A. and Schroeder, R.G. (1989), “World class manufacturing in the United States”, Proceedings of the Decision Sciences Institute, New Orleans, Decision Science Institute, USA.
Goldratt, E. and Cox, J. (1986), The Goal: A Process of Ongoing Improvement, North River Press, Croton-on-Hudson, NY.
Labib, A.W. (1996), “Integrated and interactive appropriate productive maintenance”, PhD thesis, University of Birmingham, UK.
Labib, A.W., Cutting, M.C. and Williams, G.B. (1997b), “Towards a world class maintenance programme”, Proceedings of CIRP Int. Symposium: Advanced Design & Manufacture In The Global Manufacturing Era, Hong Kong, 21-22 August.
Labib, A.W., O’Connor, R.F. and Williams, G.B. (1998), “An effective maintenance system using the analytic hierarchy process”, Journal of Integrated Manufacturing Systems, Vol. 9 No. 2, April.
Labib, A.W., Williams, G.B. and O’Connor, R.F. (1996a), “Formulation of an appropriate productive maintenance strategy using multiple criteria decision making”, Maintenance Journal, Vol. 11 No. 11, April.
Labib, A.W., Williams, G.B. and O’Connor, R.F. (1996b), “An intelligent decision analysis maintenance system: application of AHP and fuzzy logic”, Proceedings of The Fourth International Symposium on AHP, Vancouver, Canada, 12-16 July.
Labib, A.W., Williams, G.B. and O’Connor, R.F. (1997a), “Deriving a maintenance strategy through the application of an MCDM methodology”, (Lecture Notes in Economics and Mathematical Systems, No. 448, Multiple Criteria Decision Making), Proceedings of the 12th Int. Conference, Germany, Springer-Verlag, Berlin.
Mackenzie, J. (1997), “Turn your company’s strategy into reality”, Manufacturing Mgmt, January, pp. 6-8.
Olafsson, S.V. (1990), “An analysis for total productive maintenance implementation”, MSc thesis, Virginia Polytechnic and State University, USA.