
Computational Challenges for the CBM Experiment

Volker Friese

GSI Helmholtzzentrum für Schwerionenforschung, Planckstraße 1, 64291 Darmstadt, Germany
[email protected]

Abstract. CBM (“Compressed Baryonic Matter”) is an experiment being prepared to operate at the future Facility for Anti-Proton and Ion Research (FAIR) in Darmstadt, Germany, from 2018 on. CBM will explore the high-density region of the QCD phase diagram by investigating nuclear collisions from 2 to 45 GeV beam energy per nucleon. Its main focus is the measurement of very rare probes (e.g. charmed hadrons), which requires interaction rates of up to 10 MHz, unprecedented in heavy-ion experiments so far. Together with the high multiplicity of charged tracks produced in heavy-ion collisions, this leads to huge data rates (up to 1 TB/s), which must be reduced on-line to a recordable rate of about 1 GB/s. Moreover, most trigger signatures are complex (e.g. displaced vertices of open charm decays) and require information from several detector subsystems. The data acquisition is thus being designed in a free-running fashion, without a hardware trigger. Event reconstruction and selection will be performed on-line in a dedicated processor farm. This necessitates the development of fast and precise reconstruction algorithms suitable for on-line data processing. In order to exploit the benefits of modern computer architectures (many-core CPU/GPU), such algorithms have to be intrinsically local and parallel and thus require a fundamental redesign of traditional approaches to event data processing. Massive hardware parallelisation has to be reflected in mathematical and computational optimisation of the algorithms. This is a challenge not only for CBM, but also for current and future experiments, in particular for heavy-ion experiments like e.g. ALICE at the LHC. For the development of the proper algorithms, a careful simulation of the input data is required. Such a simulation must reflect the free-running DAQ concept, where data are delivered asynchronously by the detector front-ends on activation, and no association to a physical interaction is given a priori by a hardware trigger. It hence goes beyond traditional event-based software frameworks. In this article, we present the challenges of and the current approaches to simulation, data processing and reconstruction in the CBM experiment.

1 The CBM Experiment

The Facility for Anti-proton and Ion Research (FAIR) is an accelerator complex under preparation near the GSI research centre in Darmstadt, Germany [6–8].


Its backbones are two superconducting synchrotrons (SIS-100 and SIS-300) with 100 and 300 Tm bending power, respectively, delivering highly intense beams of protons (up to 90 GeV), light ions (up to 45 GeV per nucleon) and heavy ions (up to 35 GeV per nucleon). The FAIR facility will serve various fields of physics research such as hadron physics using secondary beams of anti-protons, nuclear structure physics with rare isotope beams, plasma physics with highly pulsed heavy-ion beams, and atomic physics with highly charged ions and low-energy antiprotons. The Compressed Baryonic Matter experiment (CBM) is being prepared to be operated at this facility from 2018 on [3, 4]. It will investigate nuclear matter under extreme conditions (temperature, density) as produced in relativistic nuclear collisions. The collision energies available with the SIS-100 and SIS-300 accelerators provide access to the high-density regime of the phase diagram of strongly interacting matter as depicted in Fig. 1. CBM will search for the landmarks of this phase diagram as predicted by theory, namely the onset of the transition from confined to deconfined matter, the critical point, the onset of chiral symmetry restoration, and hypothetical new phases of matter like quarkyonic matter or color superconductivity. The CBM physics programme is complementary to the heavy-ion research conducted at RHIC and LHC, which explore QCD matter at high temperatures but vanishing net-baryon density. For a detailed review of the CBM physics, see [11].

Fig. 1. The phase diagram of strongly interacting matter, indicating the regimes of confined and deconfined matter. The critical point separates the region of a cross-over (explored by RHIC and LHC) from that of a first-order phase transition to be studied by the CBM experiment.

The task of a heavy-ion experiment is to characterise the collision, and to try to infer the underlying physics, from the multitude of final-state hadrons emitted from the reaction zone. The abundances of these hadrons vary over many orders of magnitude, from the frequent pions to the rare charmonium states (Fig. 2). The emphasis of CBM will be the measurement of rare probes giving access to the early stage of the collisions, like charmed hadrons, multi-strange hyperons and leptonic decays of short-lived vector mesons. As the expected multiplicities of these observables are extremely low, their measurement requires high interaction rates, which drive the experimental requirements. Consequently, CBM is being designed to cope with collision rates of up to 10 MHz, unprecedented in heavy-ion experiments so far. Such rates call for fast and radiation-hard detectors and read-out electronics, but also constitute challenges for the data acquisition, online data reduction and data processing.

Fig. 2. Predicted hadron multiplicities (including the relevant branching ratio) for central Au+Au collisions at 25A GeV

The experimental setup of CBM is shown in Fig. 3. The experiment will operate in fixed-target mode and measure charged hadrons, electrons and muons as well as photons. The core of the setup is a silicon tracking system (STS) located inside the yokes of a superconducting dipole magnet. Displaced vertices of open charm decays will be detected by a precision vertex tracker (MVD) consisting of monolithic active pixel sensors (MAPS) close to the interaction target. Charged hadrons will be identified by a time-of-flight detector (TOF) about 10 m downstream of the target. A RICH detector and several layers of transition radiation detectors (TRD) serve for electron identification (Fig. 3, left). The TRD is also used for tracking purposes. Photons will be detected in an electro-magnetic calorimeter (ECAL) behind the TOF wall. For the measurement of muons, the RICH will be replaced by an active absorber system consisting of several detector layers inside the iron absorber (Fig. 3, right).


Fig. 3. Left: Setup of the CBM experiment for electron and hadron measurements. The beam enters from the left. From left to right: superconducting magnet hosting the target, the micro-vertex detector (MVD) and the silicon tracking system (STS), ring-imaging Cherenkov detector (RICH), three stations of transition radiation detectors (TRD), time-of-flight wall (TOF) and electro-magnetic calorimeter (ECAL). The forward calorimeter (PSD) at the end of the setup serves for event characterisation (centrality, event plane). Right: CBM setup for muon measurements. The RICH detector is replaced by an active absorber system (MUCH). Only one TRD station is used for tracking between MUCH and TOF.

2 Event Reconstruction in CBM

Event reconstruction in the CBM experiment starts, after cluster and hit finding in the various detector systems, with track finding in the silicon tracking system. The challenge of this task is depicted in Fig. 4, showing a simulated central Au+Au collision at 25A GeV in the STS. About 700 charged tracks are emitted into the detector acceptance with the strong kinematical focussing typical for fixed-target experiments. These tracks have to be reconstructed with high precision and efficiency in order to sort out the interesting, rare observables from the bulk of the produced hadrons. An additional challenge is posed by the large number of fake hits in the STS (about eight times the number of real hits) caused by the projective strip geometry of the sensors. To cope with these conditions, a reconstruction algorithm based on the Cellular Automaton has been developed [13], which has proven to be both fast and efficient. The algorithm starts from short segments built of three space points (hits). Short segments are connected to larger ones if their junction satisfies the specific track model. Finally, track candidates are formed, the best of which are selected and the others rejected. The advantage of this method is that it is conceptually simple, local with respect to the data and thus intrinsically parallel, which makes it well suited for modern, many-core computer architectures. Currently, about 150 central events can be reconstructed per second on an Intel X5550 with 2 × 4 cores (2.67 GHz). The reconstruction efficiency is above 95% for momenta larger than 1 GeV (Fig. 5). Similar efficiencies were obtained by an alternative approach using the Hough transformation method, which was developed and tested on a Cell BE processor as a prototype for an FPGA array.

Fig. 4. Simulation of a central Au+Au collision at 25A GeV beam momentum in the main tracking system of CBM. About 700 charged tracks are in the detector acceptance.

Fig. 5. Track finding efficiency in the STS, obtained with the CA track finder in central Au+Au collisions at 25A GeV, as a function of momentum
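To make the triplet-based logic of the CA track finder more concrete, the following sketch outlines the idea in compact form. It is an illustration only, not the CBM implementation: the hit structure, the simple straight-line consistency test standing in for the track model, and the tolerance value are assumptions.

```cpp
// Illustrative sketch of Cellular-Automaton-like track finding:
// build hit triplets on consecutive stations, link compatible triplets,
// and keep the longest chains as track candidates.
#include <cmath>
#include <cstddef>
#include <vector>

struct Hit { int station; double x, y, z; };

struct Triplet {
  int h0, h1, h2;          // indices into the hit array
  std::vector<int> next;   // links to compatible triplets (sharing two hits)
  int level = 0;           // length of the best chain starting here
};

// A straight-line extrapolation test in the (z, x) plane as a stand-in
// for the real track model (assumed tolerance of 0.3 cm).
bool fitsTrackModel(const Hit& a, const Hit& b, const Hit& c) {
  const double slope = (b.x - a.x) / (b.z - a.z);
  const double xPred = b.x + slope * (c.z - b.z);
  return std::abs(c.x - xPred) < 0.3;
}

// Build all triplets on three consecutive stations (brute force for clarity).
std::vector<Triplet> buildTriplets(const std::vector<Hit>& hits) {
  std::vector<Triplet> triplets;
  for (std::size_t i = 0; i < hits.size(); ++i)
    for (std::size_t j = 0; j < hits.size(); ++j)
      for (std::size_t k = 0; k < hits.size(); ++k) {
        if (hits[j].station != hits[i].station + 1) continue;
        if (hits[k].station != hits[j].station + 1) continue;
        if (!fitsTrackModel(hits[i], hits[j], hits[k])) continue;
        triplets.push_back({(int)i, (int)j, (int)k, {}, 0});
      }
  // Link triplets whose last two hits are the first two hits of another.
  for (auto& t : triplets)
    for (std::size_t n = 0; n < triplets.size(); ++n)
      if (t.h1 == triplets[n].h0 && t.h2 == triplets[n].h1)
        t.next.push_back((int)n);
  return triplets;
}

// CA evolution: each triplet's level measures the longest chain it starts.
// Track candidates are read off from the highest-level triplets; the best
// candidates are kept and competing ones rejected.
void evolve(std::vector<Triplet>& triplets) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (auto& t : triplets)
      for (int n : t.next)
        if (triplets[n].level + 1 > t.level) {
          t.level = triplets[n].level + 1;
          changed = true;
        }
  }
}
```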

After track finding, the track parameters are reconstructed using the Kalman filter algorithm, taking into account the propagation in a non-homogeneous magnetic field as well as multiple scattering and energy loss in the detector materials. The Kalman filter is also used for track propagation throughout the detector system and is thus inherently used by the track finding algorithms.
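As an illustration of the predict-update cycle underlying such a Kalman filter fit, a minimal sketch for a simplified two-parameter track state is given below. The straight-line propagation and the scalar measurement model are simplifying assumptions; the actual fit transports a full track state through the inhomogeneous field and includes material effects.

```cpp
// Minimal Kalman filter sketch for a simplified track state (x, tx = dx/dz),
// updated with one-dimensional position measurements at successive planes.
#include <array>

struct TrackState {
  std::array<double, 2> x{};                 // state vector: position, slope
  std::array<std::array<double, 2>, 2> C{};  // covariance matrix
};

// Prediction: extrapolate over dz with F = [[1, dz], [0, 1]].
// Q is a simple process-noise term standing in for multiple scattering.
void predict(TrackState& s, double dz, double Q) {
  s.x[0] += s.x[1] * dz;
  const double c00 = s.C[0][0] + dz * (s.C[0][1] + s.C[1][0]) + dz * dz * s.C[1][1];
  const double c01 = s.C[0][1] + dz * s.C[1][1];
  s.C[0][0] = c00;
  s.C[0][1] = c01;
  s.C[1][0] = c01;
  s.C[1][1] += Q;
}

// Update with a measured position m of variance V (the hit error),
// using the Kalman gain; the measurement model is H = [1, 0].
void update(TrackState& s, double m, double V) {
  const double r = m - s.x[0];        // residual
  const double S = s.C[0][0] + V;     // residual covariance
  const double K0 = s.C[0][0] / S;    // gain for the position
  const double K1 = s.C[1][0] / S;    // gain for the slope
  s.x[0] += K0 * r;
  s.x[1] += K1 * r;
  const auto C = s.C;                 // C_new = (I - K H) C
  s.C[0][0] = (1 - K0) * C[0][0];
  s.C[0][1] = (1 - K0) * C[0][1];
  s.C[1][0] = C[1][0] - K1 * C[0][0];
  s.C[1][1] = C[1][1] - K1 * C[0][1];
}
```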


Track finding in the downstream detectors (TRD and/or MUCH) is performed with a track following method, using the tracks already found in the STS as seeds. A particular challenge is constituted by the muon detection system since the tracks have to be traced through the considerable material budget of the absorber slices. In the stray field of the magnet, a fourth-order Runge-Kutta is applied for the track propagation. Figure 6 illustrates the method. Tracks are subsequently propagated to the respective detector layers, where the nearest hit within a validation region defined by the errors of the track parameters is associated to the track, the parameters of which are then updated. Alternatively to the nearest-neighbour association, a branching method can be applied. The resulting average efficiency for muons with momenta above 5 GeV is 96%.

Fig. 6. Illustration of tracking in the muon detection system. Track seeds from the STS are propagated through an absorber towards a detector layer. After associating a hit to the track, its parameters are updated and the track is propagated further.
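A sketch of the hit association step described above is given below: the track is assumed to have been propagated to a detector layer, and the nearest hit inside a validation gate defined by the combined track and hit errors is selected. The data structures and the 3σ gate width are assumptions for illustration.

```cpp
// Sketch of the track-following association step: attach the nearest hit
// inside a validation gate defined by the track and hit errors.
#include <cmath>
#include <optional>
#include <vector>

struct LayerHit { double x, y, dx, dy; };        // hit position and errors
struct TrackAtLayer { double x, y, dx, dy; };    // extrapolated track and errors

std::optional<std::size_t> nearestHitInGate(const TrackAtLayer& t,
                                            const std::vector<LayerHit>& hits,
                                            double nSigma = 3.0) {
  std::optional<std::size_t> best;
  double bestDist2 = 0.0;
  for (std::size_t i = 0; i < hits.size(); ++i) {
    const double sx = std::hypot(t.dx, hits[i].dx);  // combined x error
    const double sy = std::hypot(t.dy, hits[i].dy);  // combined y error
    const double rx = (hits[i].x - t.x) / sx;        // normalised residuals
    const double ry = (hits[i].y - t.y) / sy;
    if (rx * rx > nSigma * nSigma || ry * ry > nSigma * nSigma) continue;
    const double dist2 = rx * rx + ry * ry;
    if (!best || dist2 < bestDist2) { best = i; bestDist2 = dist2; }
  }
  return best;  // the track parameters are then updated with this hit
}
```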

The pattern recognition of rings in the RICH detector uses a localised Hough transform, where a triplet of hits is assigned a ring centre and radius. In order to reduce the combinatorics, hit triplets are preselected using the expected ring radius for electrons to define the maximal hit distance. After the application of several quality criteria (e.g. number of hits per ring, χ2), found rings are attached to tracks based on proximity. Despite the large ring density (about 80 rings per event) and ring distortions due to the imperfect mirror geometry, about 92% of electron rings are correctly reconstructed by this method. For more details on reconstruction in the RICH, TRD and MUCH detectors, see [15].
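The geometric core of this ring finder, assigning a centre and radius to a hit triplet, amounts to computing the circumscribed circle of three points. Below is a minimal sketch with hypothetical types; the triplet preselection and quality cuts mentioned above are omitted.

```cpp
// Sketch of the geometric step of the localised Hough transform for RICH
// rings: each hit triplet defines a circle (ring centre and radius) via the
// circumscribed-circle formula.
#include <cmath>
#include <optional>

struct PmtHit { double x, y; };
struct RingHypothesis { double cx, cy, r; };

std::optional<RingHypothesis> circleFromTriplet(const PmtHit& a,
                                                const PmtHit& b,
                                                const PmtHit& c) {
  const double d = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
  if (std::abs(d) < 1e-9) return std::nullopt;   // collinear hits: no circle
  const double a2 = a.x * a.x + a.y * a.y;
  const double b2 = b.x * b.x + b.y * b.y;
  const double c2 = c.x * c.x + c.y * c.y;
  const double cx = (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d;
  const double cy = (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d;
  const double r  = std::hypot(a.x - cx, a.y - cy);
  return RingHypothesis{cx, cy, r};
}
// In a Hough-type finder, the (cx, cy, r) values of many triplets are
// accumulated, and peaks correspond to ring candidates.
```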

Simulations for CBM and the development and testing of the reconstruction algorithms are done within the software framework FAIRROOT [2, 1]. This C++ based framework uses the Virtual Monte Carlo concept, which enables switching between different transport engines like GEANT3 or GEANT4, as well as ROOT I/O, the ROOT geometry package for navigation, and the ROOT task concept for both reconstruction and analysis. The simulations include detailed detector geometries with passive materials, supports and front-end electronics, advanced detector response models comprising charge propagation and discretisation on the read-out planes, and full reconstruction of space points, tracks and vertices. Using this framework, the feasibility of measuring the major observables was demonstrated [9, 16, 10]. Figure 7 shows two examples of such feasibility studies for the physics channels D± → Kππ and J/ψ → µ+µ−.

Fig. 7. Examples of physics feasibility studies with the FAIRROOT framework for central Au+Au collisions at 25A GeV. Left: reconstruction of D± → Kππ with MVD and STS; right: reconstruction of J/ψ → µ+µ− with STS and MUCH.

3 Reconstruction of Freely Streaming Data

As discussed in Sect. 1, the CBM experiment aims at very high interaction rates of up to 10 MHz. At this event rate, we expect a raw data flow of about 1 TB/s from the front-end electronics to the data acquisition system (DAQ). This rate has to be reduced on-line to the targeted archival rate of 1 GB/s. Moreover, the trigger signatures for the rare observables are mostly complex and require the reconstruction of a major part of the event. As an example, selecting open charm candidates by their displaced decay vertex requires the reconstruction of the majority of all tracks in the STS and involves two-, three- or even four-particle combinatorics for the vertex reconstruction. Both the huge raw data rates and the absence of simple trigger primitives lead to a novel DAQ concept for CBM. The front-end electronics will run in an autonomous, self-triggered mode and asynchronously deliver time-stamped data messages upon activation by a particle. The data messages will be shipped through fast optical links to a readout buffer and a high-throughput event building network. The archival decision is obtained after (partial) event reconstruction in a large computer farm (first-level event selector, FLES). This DAQ concept is schematically depicted in Fig. 8. Because of the absence of hardware triggers, the system is not limited by latency, but only by throughput.


Fig. 8. The CBM data acquisition concept in comparison with conventional systems
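To make the free-running readout concept concrete, the sketch below shows a hypothetical time-stamped front-end message and its grouping into time slices by time stamp alone. The field names and the slice length are illustrative assumptions, not the actual CBM data format.

```cpp
// Sketch of self-triggered, time-stamped front-end messages being grouped
// into fixed-length time slices for online processing. No event association
// is made at this stage.
#include <cstdint>
#include <map>
#include <vector>

struct DigiMessage {
  uint64_t time_ns;   // absolute time stamp assigned by the front-end
  uint32_t channel;   // electronics channel that fired
  uint16_t charge;    // measured signal amplitude
};

using TimeSlice = std::vector<DigiMessage>;

// Assign each message to a time slice purely by its time stamp
// (assumed slice length of 10 ms).
std::map<uint64_t, TimeSlice> buildTimeSlices(const std::vector<DigiMessage>& messages,
                                              uint64_t sliceLength_ns = 10'000'000) {
  std::map<uint64_t, TimeSlice> slices;
  for (const auto& m : messages)
    slices[m.time_ns / sliceLength_ns].push_back(m);
  return slices;
}
```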

As a consequence of this readout concept, the association of detector signals to a physical event is not given a priori by a hardware trigger. It must instead be performed in software, employing the timing information of each hit. At the highest interaction rates, the time difference of hits within the same event will be larger than the average event spacing, meaning that events will overlap in time. The problem is illustrated in Fig. 9 for two different interaction rates. The hit-to-event association then becomes non-trivial and will probably require an extension of the track finding routines developed so far to four dimensions, where the hit-to-track association employs proximity not only in coordinate space but also in time.

Fig. 9. Hit time distribution in the STS for an interaction rate of 1 MHz (upper panel) and 10 MHz (lower panel). While in the first case the hit-to-event association is obvious, events overlap at a 10 MHz event rate.

In any case, the CBM reconstruction chain will have to operate not on event-sorted, but on freely streaming data. The development of proper algorithms thus requires the simulation of such streaming data. However, the current framework, like most other HEP frameworks, is fully oriented towards event-by-event processing, where the input event is defined by a suitable event generator. The results presented in the previous section were obtained in this “conventional” way and, in this sense, represent an ideal situation. Simulation of a continuous data stream requires considerable modifications to the software framework and possibly also to the data model in general, which up to now is based on ROOT TClonesArrays as branches of a ROOT TTree.

First steps towards such modifications have already been undertaken. On the Monte Carlo side, the hits delivered by the transport engine are buffered and sorted with respect to their absolute time, where the event time is sampled from a given beam model. A continuous, time-ordered stream of Monte Carlo hits is delivered to the digitisers, which model the detector response. The current digitisers will be modified in order to process this streaming MC data, taking into account also the timing response of the respective detector (e.g. charge propagation, channel dead times). A dedicated task representing the DAQ system collects and
buffers the detector hits and delivers them to the reconstruction chain in suitable packets (time slices). In running modes with moderate event rates, the digital hits can be pre-sorted using their time information alone (Fig. 9, upper panel), and event-based reconstruction can be run as before. At the highest interaction rates, the reconstruction will have to operate directly on the freely streaming raw data. This scheme, once realised, will be close to simulating the real raw data stream and its processing, hence blurring the traditional difference between online and offline software.
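The streaming simulation described above can be sketched as follows: event start times are drawn from a simple beam model (exponential spacing at the chosen interaction rate), the event time is added to each Monte Carlo hit, and the hits are sorted into one time-ordered stream. The structures and the beam model are simplifying assumptions, not the FAIRROOT implementation.

```cpp
// Sketch of turning event-based MC output into a time-ordered hit stream:
// sample event start times from a simple beam model, shift each hit by its
// event time, and sort globally. Hits from different events interleave.
#include <algorithm>
#include <random>
#include <vector>

struct McHit { double time_ns; int detector; int eventId; };

std::vector<McHit> buildHitStream(const std::vector<std::vector<McHit>>& events,
                                  double interactionRate_Hz, unsigned seed = 42) {
  std::mt19937 rng(seed);
  // Exponential event spacing corresponding to the mean interaction rate (per ns).
  std::exponential_distribution<double> spacing(interactionRate_Hz * 1e-9);
  std::vector<McHit> stream;
  double eventTime = 0.0;
  for (const auto& event : events) {
    eventTime += spacing(rng);            // start time of the next interaction
    for (McHit hit : event) {
      hit.time_ns += eventTime;           // absolute hit time in the stream
      stream.push_back(hit);
    }
  }
  std::sort(stream.begin(), stream.end(),
            [](const McHit& a, const McHit& b) { return a.time_ns < b.time_ns; });
  return stream;
}
```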

4 Online Data Processing

Another challenge arising from the huge data rates envisaged by CBM is the necessity to reduce the raw data rate online by about three orders of magnitude. This task will be performed by the FLES, a complex high-performance computing farm using FPGAs at the input stage and heterogeneous many-core architectures for feature extraction and event selection [5]. The reconstruction algorithms employed for online event selection in the FLES must thus be extremely fast, which requires properly exploiting the benefits of modern and future computer architectures. While vectorisation (SIMD) introduces parallelism at the data level, multi-threading and many-core processing enable parallelism at the task level. Off-the-shelf computers already come with several cores; in the future, hundreds of cores per computing unit will be standard. Not exploiting these features would mean using only a small fraction of the hardware capabilities. All reconstruction routines within the CBM framework are therefore being optimised with respect to parallelism.

Several computer architectures and their corresponding programming environments have been investigated: multi-core Intel and AMD CPUs, the IBM Cell processor, and graphics cards like those from NVIDIA using the CUDA language. As an example, the Kalman filter track fit could be sped up by five orders of magnitude from the initial scalar version to an optimised, vectorised version running on an array of Cell processors [12]. Lately, the KF track fit was implemented by Intel using their Array Building Blocks; it showed similar performance in terms of speed and track fit quality. In a similar manner, the Cellular Automaton track finding algorithm in the STS was also re-written using massive parallelisation.

Of high importance is the scalability of the software on many-core systems, which demonstrates the feasibility of effectively running the reconstruction in a complex computer farm. Figure 10 shows the performance of the KF track fit and the CA track finder on an AMD E6164HE with 48 cores [14]. Scalability is obtained if the data load per thread is optimised such that the relative overhead of the thread management is negligible.

Fig. 10. Scalability of the performance of the Kalman Filter track fit (left) and the CA track finder (right) with respect to the number of used cores on an AMD E6164HE computer with 48 cores. The results were obtained in cooperation with CERN Openlab.
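The task-level parallelism underlying such scalability studies can be sketched with standard C++ threads: events (or time slices) are distributed in packets over a configurable number of worker threads, and good scaling requires each packet to be large enough that the thread-management overhead is negligible. This is an illustrative sketch, not the CBM code.

```cpp
// Illustrative sketch of task-level parallelism: distribute packets of
// events over a configurable number of worker threads.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct EventData { /* hits, digis, ... */ };

// Placeholder for the per-event reconstruction (track finding and fitting).
void reconstructEvent(EventData& /*event*/) { /* ... */ }

void reconstructInParallel(std::vector<EventData>& events, unsigned nThreads) {
  std::vector<std::thread> workers;
  const std::size_t packet = (events.size() + nThreads - 1) / nThreads;
  for (unsigned t = 0; t < nThreads; ++t) {
    const std::size_t begin = std::min(events.size(), t * packet);
    const std::size_t end = std::min(events.size(), begin + packet);
    // Each thread processes one contiguous packet of events.
    workers.emplace_back([&events, begin, end] {
      for (std::size_t i = begin; i < end; ++i) reconstructEvent(events[i]);
    });
  }
  for (auto& w : workers) w.join();
}
```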

Based on the current performance of the CBM reconstruction algorithms, we estimate the FLES farm to consist of about 60,000 cores. The complexity of this system will thus be comparable to that of today's LHC grid. Given the fast development of hardware architectures, the concrete layout of the FLES farm is not fixed yet. CBM will thus continue to investigate its reconstruction routines on upcoming architectures. These activities will be pursued in cooperation with other current and future experiments (ALICE, STAR, PANDA), which partly share the problems of fast event reconstruction in a high track density environment.

References

1. Al-Turany, M., Uhlig, F.: The FAIRROOT framework. PoS(ACAT08)048 (2008)
2. Bertini, D. et al.: The FAIR simulation and analysis framework. J. Phys. Conf. Ser. 119, 032011 (2008)
3. CBM Experiment, http://www.gsi.de/forschung/fair_experiments/CBM/index_e.html
4. CBM Experiment, Technical Status Report, Darmstadt (2005)
5. De Cuveland, J.: A first-level event selector for the CBM experiment at FAIR. In: Proceedings of CHEP 2010, in press
6. Facility for Antiproton and Ion Research, http://www.fair-center.de
7. FAIR Conceptual Design Report, Darmstadt (2001)
8. FAIR Baseline Technical Report, ISBN 3-9811298-0-6, Darmstadt (2006)
9. Friese, V.: The CBM experiment at FAIR. PoS(CPOD07)056 (2007)
10. Friese, V.: Prospects for strangeness and charm measurements with the CBM experiment. J. Phys. G 37, 094025 (2010)
11. Friman, B. et al. (Eds.): The CBM Physics Book. Lect. Notes Phys. 814, Springer, Berlin, Heidelberg (2011)
12. Gorbunov, S. et al.: Fast SIMDized Kalman Filter track fit. Comp. Phys. Comm. 178, 374 (2008)
13. Kisel, I.: Event reconstruction in the CBM experiment. Nucl. Instrum. Meth. Phys. Res. A 566, 85 (2006)
14. Kisel, I.: Many-core scalability of the online event reconstruction in the CBM experiment. In: Proceedings of CHEP 2010, in press
15. Lebedev, S.: Algorithms and Software for Event Reconstruction in the RICH, TRD and MUCH detectors of the CBM experiment. This volume
16. Senger, P.: FAIR/CBM capabilities for charm and dilepton studies. PoS(CPOD 2009)042 (2009)
