Popularity framework to process dataset tracers and its application on dynamic replica reduction in the ATLAS experiment

Angelos Molfetas, Fernando Barreiro Megino, Andrii Tykhonov, Vincent Garonne, Simone Campana, Mario Lassnig, Martin Barisits, Gancho Dimitrov, Florbela Tique Aires Viegas

CERN, PH-ADP/DDM, 1211 Geneva, Switzerland
Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Deutsches Elektronen-Synchrotron, Hamburg, Germany

Abstract. The ATLAS experiment's data management system is constantly tracing file movement operations that occur on the Worldwide LHC Computing Grid (WLCG). Due to the large scale of the WLCG, direct statistical analysis of the traces is impossible in real time. Factors that contribute to the scalability problems include the capability for users to initiate on-demand queries, the high dimensionality of tracer entries combined with very low cardinality parameters, the large size of the namespace, and the rapid rate of file transactions occurring on the Grid. These scalability issues are alleviated through the adoption of an incremental model that aggregates data for all combinations occurring in selected tracer fields on a daily basis. Using this model it is possible to query relevant statistics about system usage on demand. We present an implementation of this popularity model in the experiment's distributed data management system, DQ2, and describe a direct application example of the popularity framework, an automated cleaning system, which uses the statistics to dynamically detect and remove unpopular replicas from grid sites. This paper describes the architecture employed by the cleaning system and reports on the results collected from a prototype during the first months of the ATLAS detector operation.
1. Introduction
The ATLAS experiment [1] utilises the WLCG grid to store and process data at numerous sites around the world. The experiment's distributed data management operations are undertaken by the DQ2 system, which manages petabytes of experimental data across over 750 storage elements. DQ2 is built on top of the EGEE gLite [2], NorduGrid ARC [3] and Open Science Grid (OSG) [4] middleware. DQ2 provides a middleware interface allowing users to interact with the data on the grid without requiring an awareness of the heterogeneous systems that make up the grid. Furthermore, the DQ2 system maintains a central catalog of all data stored on the ATLAS grid, which is organised into logical collections of files called datasets. DQ2 ensures consistency between the central catalog, local file catalogs at sites, and storage systems. Datasets that are replicated to sites on the grid are referred to as replicas. This paper describes the DQ2 popularity framework, and demonstrates its implementation as a DQ2 subsystem responsible for providing usage statistics on datasets on the grid.
Attribute          Description
UUID               Unique identifier shared by traces originating from the same operation
Event Type         Application type initiating the download, e.g. an analysis job
Tool Version       Release version of the grid application
Remote Site        Site where data is being downloaded/uploaded to
Local Site         Site where data is being downloaded/uploaded from
Time Start         Time the operation started
Time End           Time the operation ended
DUID               Unique dataset identifier
Version            Version of the dataset
Client State       Final state of the client; either done or an error code
File Size          Size of the file
GUID               Unique identifier for the file
Catalog Time       Time stamp when the file catalogs were queried
Transfer Time      Time stamp when the first byte arrives
Validation Start   Time stamp when validation began
Application ID     Operation ID, assigned by the application
Hostname           Name of the host
IP                 Host's IP address

Table 1. Attributes and descriptions for the DQ2 Tracer
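As a purely illustrative preview of the trace submission described in section 2, a per-file trace carrying these attributes might be assembled and sent over HTTP roughly as follows. The endpoint URL, the JSON encoding and the key names are assumptions for illustration only and do not reflect the actual DQ2 client.

import json
import time
import uuid
import urllib.request

# Hypothetical per-file trace; keys loosely mirror the attributes of table 1.
trace = {
    "uuid": str(uuid.uuid4()),       # shared by all traces of the same operation
    "eventType": "analysis",         # application type initiating the download
    "toolVersion": "x.y.z",
    "remoteSite": "SITE_B",
    "localSite": "SITE_A",
    "timeStart": time.time(),
    "timeEnd": time.time(),
    "duid": "<dataset-identifier>",
    "version": 1,
    "clientState": "DONE",
    "filesize": 123456789,
    "guid": "<file-identifier>",
    "hostname": "worker01.example.org",
    "ip": "192.0.2.10",
}

# Submit the trace to a (hypothetical) central tracer endpoint over HTTP.
request = urllib.request.Request(
    "https://tracer.example.org/traces",
    data=json.dumps(trace).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(request)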
Furthermore, this paper demonstrates an application of this system, a new DQ2 service, the replica reduction agent, which is responsible for removing unpopular replicas from the grid. The paper continues as follows: Section 2 discusses the DQ2 tracer framework, which is responsible for providing the raw tracer data to the popularity framework. Section 3 describes DQ2 Popularity. Section 4 then discusses an application of DQ2 Popularity, the Replica Reduction Agent, before the paper concludes.

2. DQ2 Trace system
All grid applications that move data on the ATLAS grid submit a hash table per file transaction to a central server, utilising the HTTP protocol. Each hash table contains the transaction parameters, which are summarised in table 1. Once received by the central tracer server, the hash tables are converted into Oracle database rows and inserted into the tracer table.

3. Popularity
The popularity framework generates quickly queryable dataset statistics. These queries can return how often datasets were used in a given time period, and can break this information down by dataset name, remote site, local site, and user. Figure 1 shows the interaction of the DQ2 Popularity subsystem with the rest of the DQ2 system. The popularity service works by aggregating traces into statistics about how often datasets were requested on the grid. These statistics are collected in terms of both file and dataset transfers. The popularity statistics are established for a number of attributes (dataset name, day, event type, duid, version, local site, remote site, user). Thus, a user can get information about how many times datasets were requested over the grid from a particular site or using a particular protocol. The system calculates these statistics for all existing combinations of the stated attributes, which allows the user to launch queries involving any number of attributes. For example, a user can request the popularity of all datasets with a dataset name matching the wild-card string data10_7TeV.* and belonging to a certain event type; another example is requesting the number of requests made for all datasets transferred between two specific sites.
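As an illustration of such a query, the following sketch selects popularity statistics for a wild-card dataset pattern and event type from a daily aggregation table. The table name t_popularity, its column names and the connection details are assumptions and do not correspond to the actual DQ2 schema.

# Illustrative sketch only: querying daily popularity aggregates for a
# wild-card dataset pattern and event type.
import datetime
import cx_Oracle

def dataset_popularity(conn, name_pattern, event_type, first_day, last_day):
    """Return (dataset, dataset_requests, file_transfers) rows for the period."""
    sql = """
        SELECT dataset,
               SUM(dataset_events) AS dataset_requests,
               SUM(file_events)    AS file_transfers
          FROM t_popularity
         WHERE dataset LIKE :pattern
           AND event_type = :etype
           AND report_day BETWEEN :first_day AND :last_day
         GROUP BY dataset
         ORDER BY dataset_requests DESC
    """
    cursor = conn.cursor()
    cursor.execute(sql, {"pattern": name_pattern, "etype": event_type,
                         "first_day": first_day, "last_day": last_day})
    return cursor.fetchall()

# Example usage: the shell wild card data10_7TeV.* becomes the SQL pattern below.
# conn = cx_Oracle.connect("user/password@popularity_db")   # illustrative DSN
# rows = dataset_popularity(conn, "data10_7TeV.%", "analysis",
#                           datetime.date(2011, 3, 1), datetime.date(2011, 3, 31))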
(Diagram: end users (DDM clients) and analysis jobs send traces to the Tracer Service, which stores them in the central Oracle database; the Popularity Service publishes summaries of unpopular datasets; the Replica Reduction Agent combines these with accounting information (e.g. disk space) and exchanges deletion instructions and deletion status with the Replica Service; the Tracer Service, Replica Service, Accounting and Popularity Service run as central services.)
Figure 1. Popularity and the Replica Reduction Agent

Since the popularity framework maintains popularity statistics for all the existing attribute query combinations, the number of rows generated is very high. To resolve this problem, the popularity framework aggregates traces at a daily granularity. In addition, since traces are collected at the file level, the popularity framework further reduces the number of rows inserted into the Oracle database by aggregating all transactions at the dataset level. For example, if a particular dataset is downloaded on ten different occasions (for the same dataset, between the same two sites, by the same user), resulting in 10,000 file transfers, this will create 10,000 rows in the tracer system. In the popularity system, however, this will be aggregated into a single row with two additional column entries, specifying that there were ten requests for this dataset and 10,000 file transfers (for the given set of attribute parameters). Figure 2 illustrates a simplified example of how multiple traces are aggregated into popularity entries. In this example, the first three rows share the same dataset, local site, remote site, and user, and consequently these three traces are combined into a single popularity entry. As all three rows share the same operation id, they all belong to the same transaction, and so the number of dataset events in the popularity row is set to one. As three different files are downloaded, the number of files parameter is set to three. The following three rows also share common parameters (dataset, local site, remote site, and user) and involve the movement of three files on the grid. However, they carry two distinct operation ids and therefore correspond to two separate transactions. Consequently, when these rows are aggregated, the number of dataset events is set to two instead.
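Conceptually, this daily aggregation is a group-by over the trace attributes. The following minimal sketch reproduces the example of figure 2 in plain Python; the field names follow the figure rather than the real tracer schema.

from collections import defaultdict

# Per-file traces as in the tracer table of figure 2 (field names follow the figure).
traces = [
    {"op_id": 55, "dataset": "XYB", "usr": "AM", "l_site": "SITE_A", "r_site": "SITE_B", "file": 13},
    {"op_id": 55, "dataset": "XYB", "usr": "AM", "l_site": "SITE_A", "r_site": "SITE_B", "file": 15},
    {"op_id": 55, "dataset": "XYB", "usr": "AM", "l_site": "SITE_A", "r_site": "SITE_B", "file": 16},
    {"op_id": 23, "dataset": "XYA", "usr": "AM", "l_site": "SITE_A", "r_site": "SITE_B", "file": 14},
    {"op_id": 24, "dataset": "XYA", "usr": "AM", "l_site": "SITE_A", "r_site": "SITE_B", "file": 17},
    {"op_id": 24, "dataset": "XYA", "usr": "AM", "l_site": "SITE_A", "r_site": "SITE_B", "file": 18},
]

def aggregate(traces):
    """Collapse per-file traces into one popularity row per attribute combination."""
    operations = defaultdict(set)   # distinct operation ids -> number of dataset events
    file_count = defaultdict(int)   # number of file transfers
    for t in traces:
        key = (t["dataset"], t["usr"], t["l_site"], t["r_site"])
        operations[key].add(t["op_id"])
        file_count[key] += 1
    return {key: {"dst_ev": len(operations[key]), "file_no": file_count[key]}
            for key in operations}

print(aggregate(traces))
# {('XYB', 'AM', 'SITE_A', 'SITE_B'): {'dst_ev': 1, 'file_no': 3},
#  ('XYA', 'AM', 'SITE_A', 'SITE_B'): {'dst_ev': 2, 'file_no': 3}}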
Tracer Table
ROW   OP ID   DATASET   USR   L_SITE   R_SITE   FILE
1     55      XYB       AM    SITE_A   SITE_B   13
2     55      XYB       AM    SITE_A   SITE_B   15
3     55      XYB       AM    SITE_A   SITE_B   16
4     23      XYA       AM    SITE_A   SITE_B   14
5     24      XYA       AM    SITE_A   SITE_B   17
6     24      XYA       AM    SITE_A   SITE_B   18

Popularity Report Table
ROW   DATASET   USR   L_SITE   R_SITE   DST EV   FILE NO
1     XYB       AM    SITE_A   SITE_B   1        3
2     XYA       AM    SITE_A   SITE_B   2        3
Figure 2. Aggregation of traces into popularity entries

4. Replica Reduction Agent

4.1. Background
The system is particularly critical given the evolution of storage space usage, which can be seen in figure 3. The upper green line shows the available space and the stacked coloured plot the used space, while the blue line shows the total space pledged by the sites. The figure demonstrates that storage resources in the ATLAS grid have been used up rapidly.

4.2. Solution
In order to ensure the optimal usage of grid storage resources, a dynamic replica reduction agent, called Victor, was designed that removes unpopular replicas from grid sites. It acquires the replica popularity information from the DQ2 Popularity system. It facilitates data analysis at sites by allowing a generous data distribution while keeping the accessible space filled with data that has proven to be the most popular for analysis and that is known to be replicated at other sites. Replicas on the grid are organised into two categories: primary replicas, which are mandated by the policy defined in the ATLAS Computing Model [?], and secondary replicas, which are made in addition to these requirements. The replica reduction agent must comply with the policy defined in the ATLAS Computing Model, and consequently has been configured to only delete replicas that are made in addition to the Computing Model requirements, i.e. secondary replicas.

4.3. Method
The replica reduction system has to perform three steps (as illustrated in figure 4):
(i) Selection of full sites
(ii) Selection of unpopular datasets
(iii) Publication of actions
Figure 3. Evolution of storage usage at large (T1) sites in the ATLAS Grid. The upper green line shows the available space and the stacked, coloured plot the used space.

4.3.1. Selection of full sites
The storage space information of each site (used and available storage space) is retrieved from the DDM Accounting [?] subsystem, and it is checked whether the used space exceeds certain thresholds. The thresholds have to be selected in a way that prevents a disk from becoming full and having to be taken out of the new data distribution during the clean-up, as the deletion from the storage element is executed only after a certain grace period and the replication of new data can be very fast. Operational experience has shown that the threshold has to be set between 10% and 24%, depending on the endpoint type and the activity that it serves. In addition, individual thresholds for particular sites with different needs can be set in the DDM Information Server and will be used by Victor.

4.3.2. Selection of unpopular datasets
The next step is to select the replicas that are going to be deleted from the full sites. This is done by an incremental algorithm in which successive deletion passes are made, each more permissive and allowing the deletion of more popular datasets, until the site can be cleaned or a maximal popularity parameter has been reached. The datasets in each step are sorted from older to newer replica creation date, and the cleaning is constrained to secondary replicas with creation dates older than 15 days. Replicas need to be at the site for a minimal period to prove their popularity, and a data flow in which sites simultaneously write and delete the same replicas has to be avoided. Once the list of replicas to delete has been decided, it is passed to the central deletion service in DQ2, which takes care of scheduling the physical deletion of the files after a certain grace period.

4.3.3. Publication of actions
Once the cleaning agent has performed its actions, they have to be published to inform the central operations team and the different cloud teams. At the moment a web page is generated on the fly, but as the use of the system spreads and historical statistics and more advanced monitoring are demanded, future plans include the generation of dynamic web pages that query the information on demand from the DDM Central Catalogues and, if needed, from a dedicated database backend.
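As a rough sketch of the logic described in sections 4.3.1 and 4.3.2, the agent's selection could look as follows. The fullness threshold, the popularity steps and all function and field names are illustrative assumptions rather than Victor's actual implementation.

import datetime

MIN_AGE_DAYS = 15                       # replicas must prove their popularity first
POPULARITY_STEPS = [0, 1, 2, 5, 10]     # illustrative: each pass allows more popular data

def site_is_full(used_bytes, total_bytes, threshold=0.90):
    """Section 4.3.1: a site is full when used space exceeds the configured threshold."""
    return used_bytes > threshold * total_bytes

def select_unpopular(replicas, bytes_to_free, today=None):
    """Section 4.3.2: incrementally pick old, secondary, unpopular replicas to delete."""
    today = today or datetime.date.today()
    selected, freed = [], 0
    for max_accesses in POPULARITY_STEPS:            # each pass is more permissive
        candidates = sorted(
            (r for r in replicas
             if r["secondary"]
             and (today - r["created"]).days > MIN_AGE_DAYS
             and r["accesses"] <= max_accesses
             and r not in selected),
            key=lambda r: r["created"])               # older replicas are deleted first
        for replica in candidates:
            selected.append(replica)
            freed += replica["size"]
            if freed >= bytes_to_free:
                return selected                       # enough space freed; site cleaned
    return selected   # maximal popularity parameter reached; free as much as possible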
(Diagram: Victor obtains space information from DDM Accounting and the DDM info server, and secondary replica popularity from DDM Popularity; it (1) selects full sites, (2) selects unpopular replicas on those sites and passes the replicas to delete to the DDM Deletion Service, and (3) publishes its decisions.)
Figure 4. Dynamic Replica Reduction Agent

5. Conclusion
This paper presented DQ2 Popularity, a system for calculating usage statistics for sets of data on the grid, and demonstrated an application based on it, a replica reduction agent which reduces the number of replicas on the grid. Calculating and storing statistics for every existing combination of attributes generates a very large number of combinations. Furthermore, many attributes have a small cardinality, rendering Oracle indexes ineffective. Consequently, in order to ensure that the popularity system scales effectively, it is necessary to use an incremental model in which tracer entries are aggregated on a daily basis.

References
[1] CERN 1994 ATLAS technical report Tech. rep. CERN
[2] Laure E, Hemmer F, Prelz F, Beco S, Fisher S, Livny M, Guy L, Barroso M, Buncic P, Kunszt P Z, Di Meglio A, Aimar A, Edlund A, Groep D, Pacini F, Sgaravatto M and Mulmo O 2004 Computing in High Energy Physics and Nuclear Physics 2004 p 826
[3] Ellert M, Grønager M, Konstantinov A, Kónya B, Lindemann J, Livenson I, Nielsen J L, Niinimäki M, Smirnova O and Wäänänen A 2007 Future Generation Computer Systems 23 219–240 URL http://portal.acm.org/citation.cfm?id=1232286.1232292
[4] Pordes R, Petravick D, Kramer B, Olson D, Livny M, Roy A, Avery P, Blackburn K, Wenaus T, Würthwein F, Foster I, Gardner R, Wilde M, Blatecky A, McGee J and Quick R 2007 Journal of Physics: Conference Series 78 011345