2005 IEEE Nuclear Science Symposium Conference Record

N14-197

Muon calibration data extraction and distribution for the ATLAS experiment

E. Pasqualucci, S. Falciano, A. De Salvo, A. Di Mattia, F. Marzano, L. Luminari, D. Orestano, B. Martin, C. Meirosu

Abstract– In the ATLAS experiment, fast calibration of the detector is vital to feed prompt data reconstruction with fresh calibration constants. At the same time, online data extraction presents several advantages, since the data rate needed to obtain data sets taken in homogeneous conditions can be achieved without performing special runs, which would require a special tuning of the parameters and architecture of the ATLAS TDAQ system. The best place to get muon tracks suitable for muon detector calibration is the second level trigger, where the pre-selection of data lying in a limited region, performed by the first level trigger, makes it possible to select all (and only) the hits from a single track and to add some useful information to speed up the calibration process. The MDT (Monitored Drift Tube) calibration is shown as a use case, as it can be generalized to the entire muon system. In this case, the extracted fragment size has been evaluated to be about 800 bytes. Since at low luminosity the achievable muon rate is about 12 kHz, the total data throughput is 9.6 MB/s. The data collection model is based on two levels of data concentration: pre-selected hits from muon tracks, plus some auxiliary information, are sent by every second level trigger machine in a rack to a rack server; a calibration server collects data from the rack servers and sends them to one or more calibration farms. Different options are being explored to ensure the quality of service needed for data distribution.

I. MDT CALIBRATION AND REQUIREMENTS

In the ATLAS experiment, fast calibration of the detector is essential to feed prompt data reconstruction with fresh calibration constants. Starting the reconstruction within 24 hours of data taking places a stringent requirement on the calibration procedures for tracking detectors. This requirement is reinforced by the need for data sets taken in run conditions as homogeneous as possible. The calibration of the precision chambers of the ATLAS muon spectrometer is a typical example of a calibration procedure that can benefit from online data selection and extraction in order to fulfill this requirement. The precision chambers of the ATLAS muon spectrometer have been built with the Monitored Drift Tube (MDT) technology.

Manuscript received October 14, 2005. E. Pasqualucci, S. Falciano, A. De Salvo, A. Di Mattia, F. Marzano and L. Luminari are with INFN (Istituto Nazionale di Fisica Nucleare) sezione di Roma, P.le A. Moro 2, 00185 Roma (Italy). D. Orestano is with Dipartimento di Fisica, Universita’ di Roma Tre and INFN (Istituto Nazionale di Fisica Nucleare) sezione di Roma 3, via della Vasca Navale 84, 00146, Roma (Italy). B. Martin and C. Meirosu are with CERN, CH-1211 Genève 23 (Switzerland).


The requirements of high accuracy and low systematic error in muon momentum reconstruction can only be met if the calibrations are known with an accuracy of a few micrometers. The relation between the drift path and the measured time (the so-called r-t relation) depends on many parameters (e.g. temperature, hit rate, gas composition, signal thresholds and so on) that vary with time. It has to be measured from the data themselves, without the use of an external detector, using the auto-calibration technique. This technique relies on an iterative procedure applied to the same data sample, starting from a preliminary set of constants.

In order to obtain the required precision, the computation of the r-t relation requires a large number of non-parallel tracks crossing a calibration region, i.e. the region of an MDT chamber sharing the same r-t relation. The estimated number of required tracks is 10,000 to 100,000 per region.

The required data rate can be derived from the available data-taking time and the number of calibration regions. We assume that data are taken between two LHC fills (10 to 15 hours). The number of calibration regions is a compromise between keeping the scheme simple and the required statistics (and hence the CPU processing time) low, and reducing systematic errors by separately calibrating regions whose parameters may take very different values. Since there are 1200 chambers, each made of two multi-layers, a good choice of calibration granularity is obtained by dividing each multi-layer into 4 wire segments, giving 9600 calibration regions. Since each muon crosses 3 chambers, i.e. 6 regions (see Fig. 1), there are 1600 calibration towers. Taking into account the number of required tracks per region, we get a required data rate between 300 Hz and 3 kHz.

Other important parameters for the operation of this detector can be extracted from the drift time distributions, in particular the time corresponding to zero drift distance, called T0. This is a slowly varying quantity, which can be recomputed weekly using 10^4 muons per tube, corresponding to a minimum rate of 1 kHz.

We stress that the final event rate in ATLAS is about 200 Hz, the rate of events containing interesting muon tracks being about 40 Hz. Moreover, we are interested only in data from muon tracks (i.e. data fragments smaller than 1 kB), while the full event size is estimated to be about 1.6 MB. Hence, the calibration data cannot be taken from the final data set. On the other hand, special calibration runs involving only the muon detector and excluding the high level trigger selection would require ad-hoc tuning of the data acquisition parameters to deal with the high event rate and small data size.
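The granularity and rate figures above follow from simple counting. The short C++ sketch below reproduces them using only the numbers quoted in the text (here with a 15-hour gap between fills); it is an illustration of the arithmetic, not part of the calibration software.

#include <cstdio>

int main() {
    // Figures quoted in the text
    const int chambers       = 1200;  // MDT chambers
    const int multilayers    = 2;     // multi-layers per chamber
    const int segments       = 4;     // wire segments per multi-layer
    const int regionsPerMuon = 6;     // a muon crosses 3 chambers, i.e. 6 regions

    const int regions = chambers * multilayers * segments;  // 9600 calibration regions
    const int towers  = regions / regionsPerMuon;           // 1600 calibration towers

    // 1e4 to 1e5 tracks per region, collected between two LHC fills (~15 h assumed)
    const double fillGap_s = 15.0 * 3600.0;
    const double rateLow   = towers * 1.0e4 / fillGap_s;    // ~0.3 kHz
    const double rateHigh  = towers * 1.0e5 / fillGap_s;    // ~3 kHz

    std::printf("regions = %d, towers = %d, required rate = %.0f - %.0f Hz\n",
                regions, towers, rateLow, rateHigh);
    return 0;
}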


In addition, only a small fraction of the data taking time could be dedicated to such runs. All these difficulties can be overcome if the calibration data are extracted online, provided some additional requirements are fulfilled:
1) the data collection scheme must fit into the existing trigger/DAQ architecture;
2) the event rate must be sufficient for the calibration task;
3) the data fragments must contain only the interesting data;
4) the overhead of the tasks providing calibration data collection must be negligible;
5) the required bandwidth must already be available in the trigger/DAQ system;
6) some flexibility is desirable, in order to provide data streaming, data pre-selection and possibly seeding of the calibration procedures.


II. DATA ORGANIZATION

In the ATLAS muon detector, MDT chambers are read out by on-chamber ADCs and TDCs; the data are then concentrated into a CSM (Chamber Service Module). Each set of 6 (or 8) CSMs, i.e. 6 or 8 chambers, is connected via optical links to a Read-Out Driver (ROD). Fig. 1 shows the data organization for the muon spectrometer. The level-1 trigger has been designed to identify high momentum muon candidates. Starting from the hits in the trigger chambers, it defines a Region of Interest (RoI), consisting of a trigger tower as shown in the figure plus the "nearest" one, i.e. the one closest to the level-1 hit pattern. The RoI is then used to feed the level-2 trigger with data from the trigger and precision chambers in order to perform local data reconstruction. It is therefore natural to organize the detector readout into trigger towers.

Fig. 1. Data organization in the ATLAS muon spectrometer. The level-1 trigger identifies a muon candidate starting from the hits in the trigger chambers (in this figure, the red points are RPC hits in the barrel), and defines a Region of Interest of the detector containing the interesting hits to be processed by the level-2 trigger.

A ROD collects data from the chambers in the same trigger tower. Each ROD sends its event fragments via an optical link to a ROBin, i.e. an intelligent buffer sitting in a Read-Out System (ROS), where the data are stored until the second level trigger accepts or discards the event. Each ROS contains 4 ROBins, each of them connected to 3 RODs. For each event, only the data in the RoI are requested from the ROS and sent to the level-2 system. In this way, data transfer to the level-2 trigger is simplified and only data from chambers in the RoI are moved and reconstructed in the level-2 Processing Units.
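As a compact illustration of the readout fan-in described in this section, the sketch below models the CSM-to-ROS chain and the RoI-driven data request. The type and function names are invented for illustration only and do not correspond to the actual ATLAS DAQ software.

#include <vector>

struct CSM   { int chamberId; };                       // one CSM per MDT chamber
struct ROD   { int towerId; std::vector<CSM> csms; };  // 6 or 8 CSMs, i.e. one trigger tower
struct ROBin { std::vector<ROD> rods; };               // buffers the fragments of 3 RODs
struct ROS   { std::vector<ROBin> robins; };           // hosts 4 ROBins

// On a level-2 request, only the fragments belonging to the trigger towers
// inside the RoI are returned; everything else stays buffered in the ROBins
// until the level-2 decision arrives.
std::vector<const ROD*> requestRoIData(const ROS& ros, const std::vector<int>& roiTowers) {
    std::vector<const ROD*> fragments;
    for (const ROBin& robin : ros.robins)
        for (const ROD& rod : robin.rods)
            for (int tower : roiTowers)
                if (rod.towerId == tower)
                    fragments.push_back(&rod);
    return fragments;
}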

Fig. 2. Simplified picture of the ATLAS data acquisition system.

III. DATA SOURCES

Fig. 2 shows a simplified picture of the ATLAS data acquisition system. Data coming from the on-detector front-end electronics are collected by the RODs (Read-Out Drivers) at the level-1 trigger rate (75 to 100 kHz). Only data from an RoI are transferred to the level-2 trigger; fragments of accepted events are then collected by the ROSs and sent to an event building farm node (Sub-Farm Input, SFI). The full events are then analyzed by the event filter and sent to an SFO (Sub-Farm Output) process to be staged on disk.

In principle, data can be extracted at any level, either using the data sampling and monitoring facilities or creating ad-hoc systems. Analyzing the characteristics of each possible level gives the following picture:
1) At the ROD level, data are level-1 accepted and the granularity is a trigger tower. The muon rate is 23 to 38 kHz and the fragment size is less than 1 kB. Nevertheless, sampling data from the RODs on the VME bus is a heavy task, practically impossible at high rate. RoI information is not available at this level, implying the readout of all the muon RODs and a later event building step.
2) At the ROS level, data are level-2 accepted and the granularity is 12 trigger towers. The rate is of the order of 1 kHz. Data sampling at this level can significantly affect the data flow performance, and no RoI information is available to select data.
3) At the level-2 trigger, the granularity is an RoI. The muon rate is 12 to 23 kHz and data from both the trigger and precision chambers are available. Data can be pre-selected and seeding information for offline reconstruction is available. The data size is about 800 bytes.
4) At the SFI level, data are level-2 accepted. The total rate is of the order of 1 kHz (the muon rate is about 150 Hz), but the data size is 1.6 MB and the sampling task is heavy at high rate. No muon selection is available.
5) At the event filter level, the muon rate is 40 to 150 Hz and the event size is 1.6 MB. The overhead for data extraction is negligible.
6) At the SFO level, the muon rate is 40 Hz and the event size is 1.6 MB.
The most appropriate place to fulfill our requirements is therefore the level-2 Processing Unit: all and only the needed information is available there, the data rate is sufficient, pre-selection is possible, and tracking parameters are available to seed the data reconstruction.

IV. DATA EXTRACTION

The level-2 trigger algorithms run in a farm of about 500 nodes, divided into 20 racks. Each rack hosts a boot/file server for the level-2 nodes. Each node runs 3 L2PUs (Level-2 Processing Units). Fig. 3 shows the overall architecture of the proposed system. Data prepared in the L2PUs are sent to a collector in the local file server. The collector packs events together in order to optimize the data transfer to a global collector and, optionally, writes data to the local disk. Given a global muon rate of 12 kHz, the data rate to each local server is 480 kB/s. A global collector (calibration server) collects all the data from the local collectors and writes them to a staging disk. Data can then be sent to calibration farms (Tier-0 or Tier-2 in the LHC computing model) for processing. We stress that the calibration server and its network connections are the only additional hardware resources needed to implement the system.
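The bandwidth figures quoted here and in the abstract follow directly from the muon rate and the fragment size. The short check below is a back-of-envelope sketch using only the numbers given in the text.

#include <cstdio>

int main() {
    // Figures quoted in the text
    const double muonRateHz = 12e3;   // global level-2 muon rate at low luminosity
    const double fragmentB  = 800.0;  // pre-selected MDT fragment size in bytes
    const int    racks      = 20;     // level-2 racks, one local collector each

    const double perRack_kBs = muonRateHz / racks * fragmentB / 1e3;  // 480 kB/s
    const double total_MBs   = muonRateHz * fragmentB / 1e6;          // 9.6 MB/s

    std::printf("per-rack collector: %.0f kB/s, calibration server: %.1f MB/s\n",
                perRack_kBs, total_MBs);
    return 0;
}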

V. DATA DISTRIBUTION

The calibration procedure requires that each calibration process analyze the full statistics for a calibration region. Calibration data can easily be streamed according to the calibration region and sent to different processors of a local farm, or even distributed among several remote farms. Due to the large amount of CPU needed, the second option is currently under investigation. Three sites are candidates to host the Tier-2 farms to be used for muon detector calibration: Rome, Munich and the University of Michigan. The data transfer protocols to the remote farms are under study. Currently, the European sites are evaluating data transfer through the Tier-0/Tier-1/Tier-2 data path; data transfer through the UltraLight network is under study for the American site.

VI. PRELIMINARY MEASUREMENTS

We tested two data moving protocols for data extraction: the CORBA-based ATLAS monitoring protocol and plain TCP. All tests have been performed using a data generator emulating the data length distribution, the data arrival time distribution and the level-2 processing time. Preliminary measurements of the latency in data transmission and of the CPU occupancy have been performed using the test farm at INFN Rome. The farm is made of a file/boot server with 12 clients, plus a machine acting as calibration server. The aim of the measurements is to demonstrate the effectiveness of the model with the two protocols.
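For concreteness, the snippet below sketches what such a generator might look like for a single emulated L2PU. The exponential inter-arrival times, the Gaussian spread around the 800-byte mean and the fixed 10 ms processing time are illustrative assumptions, not the distributions actually used in the tests; the ~8 Hz per-L2PU rate follows from 12 kHz shared by roughly 1500 L2PUs (500 nodes times 3 L2PUs).

#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

int main() {
    std::mt19937 rng{42};
    std::exponential_distribution<double> interArrival(8.0);       // ~8 Hz per L2PU (assumed)
    std::normal_distribution<double>      fragmentSize(800.0, 100.0);  // bytes (assumed spread)

    for (int event = 0; event < 100; ++event) {
        // wait for the next emulated level-2 muon trigger
        std::this_thread::sleep_for(std::chrono::duration<double>(interArrival(rng)));
        // emulate the level-2 processing time (about 10 ms)
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        std::vector<char> fragment(
            static_cast<std::size_t>(std::max(1.0, fragmentSize(rng))));
        // here the fragment would be written to the ring buffer for the data reader
        (void)fragment;
    }
    return 0;
}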

Fig. 3. Global architecture of the proposed system.

In order to fulfill the requirements, the latency added to the muon level-2 trigger must be negligible with respect to the processing time (about 10 ms), and the load on the local servers must be small enough not to interfere with the servers' normal tasks.

The software architecture is shown in Fig. 4. Data prepared by the L2PUs are written to ring buffers. A data reader collects data from the local buffers and sends them to the local server. In this way, the L2PU is completely decoupled from the data moving system and the latency added to the level-2 trigger is only due to data preparation and output to the ring buffer. Preliminary measurements show an additional latency of the order of a few hundred microseconds.

Moreover, in case of problems in the data moving system, the L2PUs simply see a buffer-full condition and continue to work without sending data to it. The further degradation of the level-2 algorithm performance due to the reader CPU usage is negligible (less than 0.1% with both protocols). The local server packs the incoming fragments into large buffers to optimize data transmission to the calibration server. The two implementations of the server are quite different: the monitoring-based server is implemented as a standard C++ ATLAS monitoring application able to act as an event sampler on its internal buffer, while the TCP-based server is implemented as a CORBA application managing incoming TCP connections and re-sending data to another server. On the server side, the parameter affecting the level-2 system is the usage of the local boot/file server. Since the disk is generally not used and the data throughput is very low, the only possible concern is CPU usage. Preliminary measurements show a CPU usage of around 4% with both protocols, thus proving the effectiveness of the proposed model.
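A minimal sketch of the decoupling mechanism is given below, assuming a single-producer/single-consumer ring buffer with drop-on-full semantics as described above. The class and its parameters are illustrative; the actual ATLAS implementation decouples separate processes, whereas this in-process sketch only reproduces the non-blocking behaviour on the L2PU side.

#include <array>
#include <atomic>
#include <cstddef>
#include <optional>
#include <vector>

// The producer (L2PU) never blocks: if the data reader falls behind and the
// buffer fills up, the calibration fragment is simply dropped.
class CalibRingBuffer {
public:
    // Producer side (L2PU): returns false when the buffer is full (fragment dropped).
    bool tryPush(std::vector<std::byte> fragment) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) return false;  // buffer full
        slots_[head] = std::move(fragment);
        head_.store(next, std::memory_order_release);
        return true;
    }
    // Consumer side (data reader): empty optional when nothing is pending.
    std::optional<std::vector<std::byte>> tryPop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return std::nullopt;
        std::vector<std::byte> fragment = std::move(slots_[tail]);
        tail_.store((tail + 1) % N, std::memory_order_release);
        return fragment;
    }
private:
    static constexpr std::size_t N = 1024;  // buffer depth (arbitrary in this sketch)
    std::array<std::vector<std::byte>, N> slots_;
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};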

Fig. 4. Software structure for data extraction from the level-2 trigger. Three Level-2 Processing Units (L2PUs) run in each level-2 node. Data are prepared and written to a ring buffer; a data reader sends the data to a server program in the local file server. Data are then packed together, optionally written to disk and sent to a calibration server. The calibration server writes the data to a staging disk.

