|
Fernandez Casani, A., Orduña, J. M., Sanchez, J., & Gonzalez de la Hoz, S. (2021). A Reliable Large Distributed Object Store Based Platform for Collecting Event Metadata. J. Grid Comput., 19(3), 39–19pp.
Abstract: The Large Hadron Collider (LHC) is about to enter its third run at unprecedented energies. The experiments at the LHC face computational challenges with enormous data volumes that need to be analysed by thousands of physics users. The ATLAS EventIndex project, currently running in production, builds a complete catalogue of particle collisions, or events, for the ATLAS experiment at the LHC. The distributed nature of the experiment data model is exploited by running jobs at over one hundred Grid data centers worldwide. Millions of files with petabytes of data are indexed, extracting a small quantity of metadata per event, which is conveyed by a data collection system in real time to a central Hadoop instance at CERN. After a successful first implementation based on a messaging system, some issues suggested performance bottlenecks for the challenging higher rates in the next runs of the experiment. In this work we characterize the weaknesses of the previous messaging system regarding complexity, scalability, performance and resource consumption. A new approach based on an object-based storage method was designed and implemented, taking into account the lessons learned and leveraging the ATLAS experience with this kind of system. We present the experiment that we ran for three months in the real production scenario worldwide in order to evaluate the messaging and object store approaches. The results of the experiment show that the new object-based storage method can efficiently support large-scale data collection for big data environments like the next runs of the ATLAS experiment at the LHC.
|
|
Mendez, V., Amoros, G., Garcia, F., & Salt, J. (2010). Emergent algorithms for replica location and selection in data grid. Futur. Gener. Comp. Syst., 26(7), 934–946.
Abstract: Grid infrastructures for e-Science projects are growing in magnitude terms. Improvements in data Grid replication algorithms may be critical in many of these infrastructures. This paper shows a decentralized replica optimization service, providing a general Emergent Artificial Intelligence (EAI) algorithm for the problem definition. Our aim is to set up a theoretical framework for emergent heuristics in Grid environments. Further, we describe two EAI approaches, the Particle Swarm Optimization PSO-Grid Multiswarm Federation and the Ant Colony Optimization ACO-Grid Asynchronous Colonies Optimization replica optimization algorithms, with some examples. We also present extended results with best performance and scalability features for PSO-Grid Multiswarm Federation.
|
|
Mendez, V., Amoros, G., & Kaci, M. (2011). A Decentralized Deployment Strategy and Performance Evaluation of LCG File Catalog Service. J. Grid Comput., 9(3), 345–354.
Abstract: The LHC Computing Grid (LCG), led by CERN, has solved with the LCG File Catalog (LFC) the major problem of scaling the data management catalog. However, additional performance issues should be faced to deploy a painless catalog service. With this aim, we present a decentralized LFC server configuration, and its performance evaluation compared with the traditional LFC deployment. A performance analysis is shown, including not only the catalog server, but also analysing the client side overhead. We find that the clients carry a relevant share of the overall LFC service workload. The experimental results show that the proposed LFC deployment for servers and clients improves the performance of the service.
|