
Procedia CIRP 37 (2015) 1–6. doi:10.1016/j.procir.2015.08.006

CIRPe 2015 - Understanding the life cycle implications of manufacturing

Progress monitoring and gesture control in manual assembly systems using 3D-image sensors

Simon Kaczmarek*, Sebastian Hogreve and Kirsten Tracht

Bremen Institute for Mechanical Engineering, University of Bremen, Badgasteiner Straße 1, 28359 Bremen, Germany

* Corresponding author. Tel.: +49-421-218-64832. E-mail address: [email protected]

Abstract

The high amount of manual work in industrial assembly requires special measures in terms of progress monitoring and human-machine interaction to integrate these workstations into a digital production environment. A new concept for an assembly assistance system enables the incorporation of progress monitoring, gesture control and worker support by using a budget image sensor with per-pixel depth information. The functionality and potential of the markerless, real-time capable system is evaluated with an experimental setup.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the organizing committee of CIRPe 2015 - Understanding the life cycle implications of manufacturing.

Keywords: assembly; object recognition; monitoring

1. Introduction

In industrial assembly a general trend towards automation and digitalisation can be observed. Nonetheless, a large share of assembly operations is still performed manually due to requirements regarding flexibility and low investments [1,2]. Hence, three major reasons can be highlighted why a solution for progress monitoring and gesture control is required to enable the integration of manual assembly operations into a digital environment. Firstly, industrial assembly is based on digital tools and methods like assembly simulation [1], process planning [3] or assembly time determination [4]. With increasing complexity of assembly processes there is a need for support in controlling [5]. Therefore, information and data about assembly processes and assembly progress have to be gathered. Especially the required data about the current status leads to a point where it becomes necessary to automatically transfer information from manual working systems into a digital production environment. Secondly, an increasing number of product variants enhances the complexity of

assembly as well as the challenges a worker faces during his daily work [6]. This requires a solution that supports the worker and provides assistance in dealing with the different assembly processes. Thirdly, the interaction of the worker with the machine, which is mainly based on hardware interfaces like buttons, keyboards or touch devices, or the use of documents like instructions, interrupts the focus on the main tasks and lowers the quality and quantity of the output [1]. A system which is able to address the stated issues offers the option of integrating information into a digital environment and supports the worker with helpful or required information in a way that minimises the influence on natural acting and movements during work. A contactless and real-time capable system appears to be a suitable solution. Available camera systems that could be an answer to those demands show certain drawbacks in terms of ease of use and user friendliness, as they use, for example, markers that are attached to the clothing or hands of the worker. A possible answer to those needs is the use of a 3D-image sensor.


Therefore, a new concept for an assembly assistance system that enables the incorporation of progress monitoring, gesture control for human-machine interaction and worker support is developed and presented in this paper. The suitability of this approach for practical application is experimentally demonstrated by using a budget sensor that combines two-dimensional RGB video with per-pixel depth information (RGB-D).

2. Gesture control in assembly systems

To realise gesture control, the motions of a worker have to be recognised and tracked. In general, different approaches and technologies for motion tracking of persons are available. In [7] the distinction between mechanical systems like exoskeletons, magnetic systems, inertial systems (accelerometers and gyroscopes) and optical systems with or without markers is presented. Magnetic, inertial, mechanical and marker-based optical systems share the disadvantage of objects that have to be attached to the monitored person. This may disturb the natural behaviour and comfort of the person [1]. It can be differentiated between full body recognition and a more detail-based approach that focuses on single body parts like hands. In [8] a method is presented that shows the application of full body motion tracking to optimise ergonomics for the worker. A different application is demonstrated in [9], where the detection of full body movements of a person is used for a follow function for a robot. Other fields of application require a more detailed view. Depending on the size of the product, in industrial assembly the movements of the hands are most important. A trajectory-based approach for the recognition of dynamic hand gestures in a two-dimensional space is presented in [10]. Other approaches focus on the detection of static hand gestures including flexion of the hand using a single camera setup [11].

Monitoring of assembly processes is not a new topic in research and industrial application. It can be distinguished between systems that focus on the assembly object and parts detection and systems that use an indirect approach by monitoring the movements of the worker to draw conclusions on the current status of assembly. Markerless, camera-based systems for assembly progress monitoring using object recognition are available for industrial application. Using a 2D RGB camera, a system by Optimum GmbH covers a single assembly station. This system detects objects within a certain area and compares the image with information from a database. Conducted assembly steps are recognised by detecting the assembled parts. Hand recognition and gesture control are not supported by the system; therefore the worker still has to use a touchscreen to interact with the system. Another available camera-based system using parts detection monitors the assembly workstation and compares the video data with information from a CAD model. A display provides assistance for the worker by colouring correctly assembled or missing parts. This system does not use gesture control either [12]. An example of motion-based assembly monitoring is the system AssyControl, which is a marker-based ultrasonic

solution. It uses an array of microphones that tracks the movements of a marker which is attached to the hand of a worker. The actual progress and status of the assembly can only be tracked indirectly by interpreting the movements of the worker and comparing them to predefined motions. Parts cannot be detected and the worker is forced to wear a marker.

It can be stated that the different approaches and systems all have their own advantages and disadvantages, but none of the systems provides all features that would be needed to set up a system that meets the demands described in the introduction. The approach for a system for assembly progress monitoring and gesture control presented in this paper is therefore based on the following demands:

• no attachments like markers for the worker
• real-time capability
• tracking and recognition of hand gestures
• detection of orientation and position of assembled parts
• support for workers by providing information upfront and feedback about conducted assembly steps afterwards

3. Concept for a status detection, gesture control and worker support system

An incorporating concept that targets the three stated issues requires a clear structure of the different connections between the three main elements. The different elements and an interconnection structure are presented in the following. The worker, as depicted in Fig. 1, who is conducting the assembly process, is interacting with the assembly environment and the assembly object.

Fig. 1: Interconnection structure of worker, controller and assembly object

The assembly environment provides the technical basis for the assembly steps. As the assembly workstation is interconnected with the surrounding production environment, it handles and forwards information and commands in both directions and is named controller. The controller receives commands from the worker and provides support. Additionally, it collects status information from the assembly object and influences the assembly object by controlling actuators of the assembly workstation, such as forwarding the conveyor belt. The assembly object is manipulated by the worker and influenced by the controller. The status of the assembly object is not actively sent, but recognised by the controller. Therefore, the assembly object is a passive element.
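The role of the controller in this interconnection structure can be illustrated with a minimal sketch. The class and method names (Controller, on_gesture, forward_conveyor, report_progress) are assumptions made for illustration only and are not part of the paper.

```python
class Controller:
    """Sketch of the controller role described above: it mediates between the
    worker, the passive assembly object and the surrounding production environment."""

    def __init__(self, production_environment):
        self.env = production_environment  # digital production environment (e.g. MES)

    def on_gesture(self, gesture):
        """Command received from the worker, e.g. a 'confirm_step' gesture."""
        if gesture == "confirm_step":
            status = self.check_assembly_object()
            self.env.report_progress(status)   # forward status to the digital environment
            if status == "step_ok":
                self.forward_conveyor()         # actuate the assembly workstation

    def check_assembly_object(self):
        """The object's status is not sent actively; the controller recognises it
        (placeholder for the camera-based object recognition)."""
        raise NotImplementedError

    def forward_conveyor(self):
        """Influence the assembly object, e.g. by forwarding the conveyor belt."""
        raise NotImplementedError
```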


In the following, the interconnection structure is used to develop a concept for an integrative status detection, gesture control and worker support system. The aim of this concept is to realise the communication and connection between the three elements with a single tool. Based on the demands towards a system for gesture control, assembly status detection and worker support presented in the previous chapter and after reviewing different approaches, the concept presented in this paper uses a budget RGB-D sensor. Compared to a two-dimensional camera this type of sensor provides an additional channel from which information can be gathered. Object and hand recognition with two-dimensional markerless cameras is based on colours and contrast. Using depth data delivers information that is independent of certain external influences, such as deviations in colour when hands with different skin complexions conduct the assembly. In addition, the distinction between target objects and undesired objects within an unsettled background becomes easier, as the recognition can be limited to a defined depth range. The monitoring of a workstation, as depicted in Fig. 2, comprises the hands of the worker, the assembly object area of the workstation, the material supply area and functional areas.

Fig. 2: Top view of the assembly workstation and the different areas. The monitored area is marked with a bold black line.

The sensor is positioned in a way that it views the area from a perpendicular position. With this setup, the additional advantage of securing the worker's anonymity is achieved, as his face is not located within the monitored area. The different marked areas are explained in the following:

• Assembly Object Area: This area is the central element of the assembly workstation. The workpiece carrier is placed in this area while the assembly work is conducted. Consequently, only this part of the monitored assembly station is scanned for an assembly object and detectable parts; surrounding areas are excluded. This simplifies the task of the software to correctly and rapidly detect assembled parts.

• Material and Tools Supply Area: The material supplies and tools required for the assembly are held available within this area. The assembly worker can pick the needed objects from the corresponding boxes or tool mounts.

• Functional Areas: No assembly objects are searched for in these areas, as they are used to detect static gestures for the gesture control function.

Using a sensor that delivers depth data enhances the possibilities of monitoring the workstation compared to a 2D imaging sensor like an RGB camera. With the additional dimension it is now possible to distinguish the progress of the assembly good and the hands and gestures of the worker on different layers. Fig. 3 shows the lateral view of the assembly workstation.

Fig. 3: Lateral view of the assembly workstation and the different layers.

Similar to the defined areas, different layers help to increase the robustness and speed of the hand and object recognition software, as the detection area is limited. For this concept the assembly object layer is defined directly above the conveyor belt. The functional layers are located directly above the assembly object layer, so only a small movement of the worker's hand is needed to use gestures to interact with the controller. Materials and tools are located within the highest defined layer. With the use of a depth filter to mask out unsettled backgrounds, no unwanted detection below the conveyor belt level is possible.

To realise the gesture control function it is important to create a reliable way to recognise gestures and distinguish them from natural movements of the hands while performing assembly tasks. This function uses a combination of so-called functional spaces, which are defined as limited three-dimensional spaces created from a functional area and a functional height layer, and programmed movements and poses of the worker's hand. As soon as the hand of the worker enters a functional space and remains there on the functional height layer for a certain amount of time, the software checks if the hand's position and pose match a programmed gesture. The recognised static gesture can then be used to confirm a notification or to trigger events like forwarding the conveyor belt or checking the assembled product against a comparison image from a database.
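The combination of a functional space and a dwell-time check can be expressed compactly, as in the following Python sketch. This is not the authors' implementation; the class name FunctionalSpace, the coordinate ranges and the dwell time of 0.8 s are illustrative assumptions.

```python
import time
from dataclasses import dataclass, field

@dataclass
class FunctionalSpace:
    """Axis-aligned 3D box built from a functional area (x/y in image pixels)
    and a functional height layer (depth in millimetres)."""
    x_range: tuple        # (x_min, x_max) in image coordinates
    y_range: tuple        # (y_min, y_max) in image coordinates
    depth_range: tuple    # (z_min, z_max) distance to the sensor in mm
    dwell_s: float = 0.8  # time the hand has to remain inside (assumed value)
    _entered_at: float = field(default=None, repr=False)

    def contains(self, x, y, z):
        return (self.x_range[0] <= x <= self.x_range[1]
                and self.y_range[0] <= y <= self.y_range[1]
                and self.depth_range[0] <= z <= self.depth_range[1])

    def update(self, hand_pos):
        """Call once per frame with the hand centre (x, y, z) or None.
        Returns True when the hand has dwelled inside long enough,
        i.e. when the gesture classifier should be invoked."""
        if hand_pos is not None and self.contains(*hand_pos):
            if self._entered_at is None:
                self._entered_at = time.monotonic()
            return time.monotonic() - self._entered_at >= self.dwell_s
        self._entered_at = None
        return False

# Example: a functional space on the functional layer above the assembly object layer
confirm_space = FunctionalSpace(x_range=(40, 160), y_range=(300, 420),
                                depth_range=(900, 1050))
```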


The variety of triggerable events can be extended by adding new functional spaces or by defining new gestures.

The status monitoring in this concept is realised by splitting the assembly process into different steps that can be recognised by the used sensor. This function is closely interconnected with the support function for the worker. The support function uses a display attached to the assembly workstation. On this display, information about the current assembly step is presented. By using the two-dimensional video stream of the camera, the current situation is displayed in real time. This video is processed by software to show differently coloured overlays. The position in which a part is expected in the current assembly step is coloured blue, so the worker can easily see where he needs to assemble a part. When he assembles the correct part in the correct position and orientation, the overlay changes its colour to green; if failures are made, the colour changes to red. This direct feedback helps the worker to deal with an increased number of variants, enhanced responsibilities and stricter requirements of quality assurance. In addition, he is able to recognise assembly failures at an early stage and make corrections without negatively influencing downstream workstations and processes. After the worker has finished all assembly tasks in the current step, he can trigger the system with a gesture to run a check using the object recognition and report the successfully conducted assembly step as a status update to the controller.
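The blue/green/red overlay logic can be sketched with a few NumPy mask operations. The function and mask names are assumptions for this sketch, as is the approximation that a detection outside the expected region is treated as a misplaced part; the paper only describes the colour scheme.

```python
import numpy as np

def build_overlay(rgb_frame, expected_mask, detected_mask):
    """Colour-code the worker support display (a sketch, not the authors' code).

    rgb_frame:     (H, W, 3) uint8 live video frame
    expected_mask: (H, W) bool, pixels where the reference image expects a part
    detected_mask: (H, W) bool, pixels where the live object recognition found a part
    """
    overlay = rgb_frame.copy()
    awaiting  = expected_mask & ~detected_mask   # part still missing -> blue
    correct   = expected_mask & detected_mask    # position and alignment match -> green
    misplaced = detected_mask & ~expected_mask   # part in wrong position/orientation -> red

    overlay[awaiting]  = (0, 0, 255)   # blue (RGB)
    overlay[correct]   = (0, 255, 0)   # green
    overlay[misplaced] = (255, 0, 0)   # red

    # Blend so the live video stays visible underneath the coloured areas
    return (0.6 * rgb_frame + 0.4 * overlay).astype(np.uint8)
```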

Fig. 4: Experimental setup

4. Implementation

4.1. Experimental setup

The described concept was transferred into an experimental setup in the laboratory of the Bremen Institute for Mechanical Engineering. The setup comprises a single manual assembly workstation which is used to assemble an exemplary fictional product that shows characteristics similar to an industrial assembly good and is shown in Fig. 4. The 3D-image sensor used for this setup is a Microsoft Kinect sensor, which is mounted in a way that it monitors the assembly workstation from a perpendicular perspective. The sensor unit combines an infrared (IR) projector, an IR camera and an RGB camera in a single housing and uses a triangulation measurement for collecting the depth information [13]. The advantages of the Kinect are a reasonable price point, high availability and access to many licence-free or open source software packages that can be used for applications. The depth sensor works with the structured light method. A structured pattern of light dots is emitted by the IR projector and recorded by the IR camera. The comparison of the projected and the recorded pattern provides data about disturbances, which can be processed in such a way that the surface the pattern is projected on is recognised as closer or further away compared to surrounding areas [14].

In a preliminary test it became clear that the used sensor has difficulties collecting depth information from objects with reflective surfaces. Therefore, the assembly object for the experimental setup was created from a wooden construction set.

4.2. Software implementation

The software for this scenario comprises different classes, which are described in the following. Their interconnection is shown in Fig. 5.

Fig. 5: Interconnection of different classes and databases

The object recognition segments the depth information of the sensor into different regions by comparing each depth pixel to its neighbours. As soon as the difference in distance to the sensor exceeds a threshold value, the pixels are coloured black; pixels that have similar depth information are coloured white. Afterwards, each pixel row in the image is analysed and pixels of the same colour are connected within an array. The next step is to compare consecutive rows and connect pixel arrays with the same depth information. The resulting image shows the detected edges of objects. The enclosed pixels are connected to regions and depict segmented objects that can be used for progress monitoring.
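A minimal sketch of this segmentation step is given below. The depth-step threshold of 15 mm is an assumed value (the paper does not state one), and SciPy's connected-component labelling is used here in place of the row-by-row merging described in the text.

```python
import numpy as np
from scipy import ndimage

DEPTH_STEP_MM = 15  # assumed threshold for "same surface"; not specified in the paper

def segment_depth(depth_mm, max_depth_mm=1100):
    """Segment a depth image (millimetres per pixel) into object regions.

    Pixels deeper than max_depth_mm (e.g. below the conveyor belt) are masked out,
    then neighbouring pixels whose depth differs by less than DEPTH_STEP_MM are
    merged into regions, analogous to the row-wise merging described in the text.
    Returns a label image and the number of detected regions.
    """
    valid = (depth_mm > 0) & (depth_mm < max_depth_mm)

    # Mark edges: pixels whose depth jumps compared to the right/lower neighbour
    dx = np.abs(np.diff(depth_mm, axis=1, append=depth_mm[:, -1:]))
    dy = np.abs(np.diff(depth_mm, axis=0, append=depth_mm[-1:, :]))
    smooth = (dx < DEPTH_STEP_MM) & (dy < DEPTH_STEP_MM) & valid

    # Connected-component labelling replaces the explicit row-array merging
    labels, num_regions = ndimage.label(smooth)
    return labels, num_regions
```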


The progress monitoring functionality uses the comparison between a stored reference image and the video stream. Thus, the reference images are recorded by a special class that uses the described object recognition while the assembly object is put together step by step. After each assembly step a save button in the software is pressed and the object recognition scans the current status. A reference image with the recognised assembled parts is stored in a database.

The comparison is realised with the compare class. First, a reference image with detected parts is loaded from the database. The second input for this class is the object recognition class, which analyses the depth data of the live video stream to find detectable objects. The detected objects are then compared to the reference image. When a predefined degree of coverage of the detected objects is reached, the compare class recognises the assembly step as fulfilled.

The interaction of the worker and the system is based on gestures, which can trigger actions of the software or events like forwarding the conveyor belt. Gesture recognition is conducted in different steps. Firstly, a hand of the worker needs to be detected by examining the depth data for objects in a predefined range of height. The found objects are scanned for a convex outline hull, defects in the convex hull and maximum values of the defects as they appear when fingers are detected. Afterwards the palm and the centre of the hand are marked.

In this setup, static gestures are used. Predefined gestures are stored in a database similar to the assembly steps database. These are combined with the described functional spaces, which are constantly checked for the appearance of a hand. The left side of Fig. 6 shows the live image of the display. The red frame depicts the outline of a functional space. Once a hand enters that functional space, the movements of the hand are analysed and compared to the defined gestures from the database. When a match is found, the outline frame is coloured green, as depicted on the right side of Fig. 6. During the conducted experiments different gestures have been used. In this experimental setup the detection of a hand with outspread fingers within a functional area is used as a trigger to compare the current assembly status to the stored image and mark the conducted assembly step as correctly performed. If failures are detected, a hint to check the assembly step again and make necessary corrections is given.
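The convex-hull-based hand analysis can be sketched with OpenCV (the OpenCV 4 API is assumed). The defect-depth threshold and the centroid-based palm estimate are assumptions for this sketch, not the authors' implementation.

```python
import cv2
import numpy as np

FINGER_DEPTH_PX = 20.0  # assumed minimum convexity-defect depth (pixels) to count as a finger valley

def detect_hand(hand_layer_mask):
    """Rough hand detection on a binary mask of the hand layer (illustrative sketch).

    Follows the steps described in the text: find the largest object, compute its
    convex hull, evaluate convexity defects as finger valleys and mark the palm centre.
    Returns (finger_valley_count, palm_centre) or (0, None) if nothing is found.
    """
    contours, _ = cv2.findContours(hand_layer_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0, None
    contour = max(contours, key=cv2.contourArea)

    hull_idx = cv2.convexHull(contour, returnPoints=False)
    defects = cv2.convexityDefects(contour, hull_idx)

    valleys = 0
    if defects is not None:
        # Each defect row holds (start, end, farthest point, distance * 256)
        valleys = int(np.sum(defects[:, 0, 3] / 256.0 > FINGER_DEPTH_PX))

    # Palm centre approximated by the contour centroid
    m = cv2.moments(contour)
    palm = (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])) if m["m00"] else None
    return valleys, palm
```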

Fig. 6: Gesture recognition. (a) Red frame outlines a functional space while waiting for a gesture. (b) Green frame symbolises a recognised gesture in the functional space.

Fig. 7: Worker support. (a) Awaiting assembly of part, (b) correct assembly.

Fig. 8: Worker support. Failure in assembly recognised.

4.3. Worker support

The worker support is realised by using a display on which the live RGB video stream is shown. The different states are depicted in Fig. 7 and Fig. 8. The information from the object recognition class and the reference image is used in such a way that parts which have been detected in the reference image, and therefore need to be assembled in the current assembly step, are overlaid on the RGB video stream as blue coloured areas (Fig. 7 (a)). As soon as parts are detected in the live video stream, the colour of the overlay changes to green where the position and alignment of the reference object and the assembled object match (Fig. 7 (b)). If a part is detected but is located in a wrong position or orientation, the detected surface is coloured red (Fig. 8). Using this method the worker receives supporting information upfront on where a part has to be assembled and direct feedback about the quality of his assembly work.

5. Results of the experiments

Experimental assembly runs have been conducted using the described setup. During the different runs the settings of the software have been thoroughly adjusted. The noisy background is blinded out by tuning the maximum recognition depth. This avoids the faulty detection of objects that are positioned below the assembly object. The minimum size of objects is adjusted as well. This helps to suppress the interpretation of image noise as small objects. A value of 2.5 mm minimum size delivers the best results. In addition, the level of overlap between the detected objects in the reference image and the live object detection is adjusted. This setting mitigates smaller errors in edge recognition and object segmentation as well as small deviations in position and orientation that are not considered critical for the assembly process. A value of 85% overlap of the recognised object and the reference image turns out to be a good balance between exact positioning and robustness of the system.
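The 85% overlap criterion can be expressed as a simple coverage check. The dictionary-of-pixel-sets representation is an assumption made for this sketch; the paper only reports the threshold values.

```python
MIN_OBJECT_SIZE_MM = 2.5   # smallest accepted object size, as reported in the experiments
OVERLAP_THRESHOLD = 0.85   # required overlap between reference and live detection

def step_fulfilled(reference_regions, live_regions):
    """Check whether every expected part of the current step is sufficiently covered.

    reference_regions / live_regions: dicts mapping a part id to the set of
    (row, col) pixels belonging to that part in the reference image and in the
    live detection, respectively.
    """
    for part_id, ref_pixels in reference_regions.items():
        live_pixels = live_regions.get(part_id, set())
        coverage = len(ref_pixels & live_pixels) / len(ref_pixels)
        if coverage < OVERLAP_THRESHOLD:
            return False  # part missing, misplaced or only partially detected
    return True
```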


The preparation of the experiments includes the creation of the reference images. The assembly object is put together with a strong focus on correct execution and right positioning of the different parts. This preparing assembly run is segmented into different steps. After each step, a reference image is stored. This process turns out to be a convenient and practicable solution. Depending on the requirements towards the level of detail of the progress monitoring, the recognition of more than one part in an assembly step is possible and increases the usability of the system, as the assembly is more fluent. In addition to the required level of detail, the maximum number of parts that can be detected at once is limited by the characteristics of the assembly good, as parts on lower layers of a multilayer product can be covered by elements from upper layers, as shown in Fig. 9.

Fig. 9: Covering of lower level parts.
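The step-by-step teaching run can be imagined roughly as follows; the class name and the list-based storage are assumptions made for this sketch, while the paper stores the reference images in a database.

```python
class ReferenceRecorder:
    """Records one reference image per assembly step during the teaching run."""

    def __init__(self, object_recognition):
        self.recognise = object_recognition   # callable returning detected part regions
        self.reference_steps = []             # stands in for the reference database

    def save_step(self, depth_frame):
        """Called when the save button is pressed after an assembly step."""
        detected_parts = self.recognise(depth_frame)
        self.reference_steps.append(detected_parts)
        return len(self.reference_steps)      # index of the stored step

# During monitoring, the reference for step i is compared with the live detection,
# e.g. with the step_fulfilled() check sketched in the results section.
```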

The results of the assembly runs after adjusting the settings show that the segmentation of objects from the depth information works in real time without noticeable and disturbing delays. Nonetheless, it is noticeable that the detection performance is limited by the technical specification of the sensor. The resolution of the used budget RGB-D sensor hinders the detection of small objects. In the shown setup, the smallest reliably detectable object has a size of 5 x 5 mm. As the parts of the assembly good are bigger, this constraint does not limit the result of this experiment. The used sensor shows limitations in accuracy as well when the monitored area is exposed to strong sunlight, which hinders the recording of the IR light pattern by the IR camera. The reliable recognition of objects depends on the relative calibration of the sensor and the monitored area. Especially angular deviations of objects between the live and comparison image, which can be caused for example by movements of the workpiece carrier, cause difficulties in the recognition of objects.

6. Conclusion

The experiments under laboratory conditions show that the concept of combined gesture control, worker support and assembly progress monitoring offers potential for practical application. It is possible to demonstrate that the use of the depth information of a budget RGB-D sensor delivers promising results and can be a suitable solution.

It becomes clear that the performance of the system strongly depends on how the information of the sensor is processed. The presented approach of object recognition appears to be suitable as it is real-time capable, and the possibility of adjusting different settings allows the system to be adapted to different use cases and environments. Especially the ease of use of this approach in terms of setting up the system, which comprises only few parts, and the fast teaching of the reference assembly process illustrate the potential.

The usage of different defined spaces by the implemented gesture recognition and gesture control functionality helps to increase the robustness of the system, as movements during assembly cannot be misinterpreted as gestures. This approach offers expandability by defining new gestures and new functional spaces and is therefore highly versatile. The fact that the presented gesture recognition solution works markerless is a big advantage in terms of convenience for the worker compared to systems using different technologies to track motions or gestures.

Acknowledgements

The authors would like to thank Eike Breyer, Mathis Engelbart, Andrej Kolesnikov and Karsten Reupke.

References

[1] Leu MC, ElMaraghy HA, Nee AYC, Ong SK, Lanzetta M, Putz M, et al. CAD model based virtual assembly simulation, planning and training. CIRP Annals - Manufacturing Technology 2013;62(2):799-822.
[2] Feldmann K, Slama S. Highly flexible Assembly – Scope and Justification. CIRP Annals - Manufacturing Technology 2001;50(2):489-98.
[3] Wallis R, Erohin O, Klinkenberg R, Deuse J, Stromberger F. Data Mining-supported Generation of Assembly Process Plans. Procedia CIRP 2014;23:178-83.
[4] Erohin O, Kuhlang P, Schallow J, Deuse J. Intelligent Utilisation of Digital Databases for Assembly Time Determination in Early Phases of Product Emergence. Procedia CIRP 2012;3:424-9.
[5] Tracht K, Funke L, Schottmayer M. Online-control of assembly processes in paced production lines. CIRP Annals - Manufacturing Technology 2015, Available online 2 May 2015.
[6] Lotter B. Einführung - Entwicklung der Montagetechnik. In: Lotter B, Wiendahl HP, editors. Montage in der industriellen Produktion - Ein Handbuch für die Praxis. Berlin Heidelberg: Springer Vieweg; 2012. p. 18. (in German)
[7] Regazzoni D, de Vecchi G, Rizzi C. RGB cams vs RGB-D sensors: Low cost motion capture technologies performances and limitations. J Manuf Sys 2014;33(4):719-28.
[8] Nguyen TD, McFarland R, Kleinsorge M, Krüger J, Seliger G. Adaptive Qualification and Assistance Modules for Manual Assembly Workplaces. Procedia CIRP 2015;26:115-20.
[9] Trenkle A, Seibold Z, Stoll T, Furmans K. FiFi – Steuerung eines FTF durch Gesten- und Personenerkennung. Logistics Journal 2013;10. (in German)
[10] Kılıboz NC, Güdükbay U. A hand gesture recognition technique for human–computer interaction. J Vis Commun Image Represent 2015;28:97-104.
[11] Bhuyan MK, MacDorman KF, Kar MK, Neog DR, Lovell BC, Prathik Gadde. Hand pose recognition from monocular images by geometrical and texture analysis. J Vis Lang Comput 2015;28:39-55.
[12] Maresch R. Digitale Montageprüfung - Automatische Qualitätskontrolle auch bei kleinen Chargen. http://www.iff.fraunhofer.de/de/presse/presseinformation/2013/digitale-montagepruefung-automatische-qualitaetskontrolle-auch-bei-kleinen-chargen.html (accessed 23 April 2015) (in German)
[13] Khoshelham K. Accuracy analysis of kinect depth data. In: Lichti DD, Habib AF, editors. ISPRS workshop laser scanning 2011, Calgary, Canada, 29-31 August 2011:133-8.
[14] Cruz L, Lucio D, Velho L. Kinect and rgbd images: Challenges and applications. In: Proc. SIBGRAPI Conf. Graph. Patterns Images Tuts. (SIBGRAPI-T), Ouro Preto, Brazil, 2012:36–49.
