Khosa, C. K., Sanz, V., & Soughton, M. (2022). A simple guide from machine learning outputs to statistical criteria in particle physics. SciPost Phys. Core, 5(4), 050–31pp.
Abstract: In this paper we propose ways to incorporate Machine Learning training outputs into a study of statistical significance. We describe these methods in supervised classification tasks using a CNN and a DNN output, and unsupervised learning based on a VAE. As use cases, we consider two physical situations where Machine Learning is often used: high-pT hadronic activity, and boosted Higgs in association with a massive vector boson.
|
Garcia Navarro, J. E., Fernandez-Prieto, L. M., Villaseñor, A., Sanz, V., Ammirati, J. B., Diaz Suarez, E. A., et al. (2022). Performance of Deep Learning Pickers in Routine Network Processing Applications. Seismol. Res. Lett., 93, 2529–2542.
Abstract: Picking arrival times of P and S phases is a fundamental and time‐consuming task for the routine processing of seismic data acquired by permanent and temporary networks. A large number of automatic pickers have been developed, but to perform well they often require the tuning of multiple parameters to adapt them to each dataset. Despite the great advance in techniques, some problems remain, such as the difficulty of accurately picking S waves and earthquake recordings with a low signal‐to‐noise ratio. Recently, phase pickers based on deep learning (DL) have shown great potential for event identification and arrival‐time picking. However, the general adoption of these methods for the routine processing of monitoring networks has been held back by factors such as the availability of well‐documented software, computational resources, and a gap in knowledge of these methods. In this study, we evaluate recent available DL pickers for earthquake data, comparing the performance of several neural network architectures. We test the selected pickers using three datasets with different characteristics. We found that the analyzed DL pickers (generalized phase detection, PhaseNet, and EQTransformer) perform well in the three tested cases. They are very efficient at ignoring large‐amplitude transient noise and at picking S waves, a task that is often difficult even for experienced analysts. Nevertheless, the performance of the analyzed DL pickers varies widely in terms of sensitivity and false discovery rate, with some pickers missing a significant percentage of true picks and others producing a large number of false positives. There are also variations in run time between DL pickers, with some of them requiring significant resources to process large datasets. In spite of these drawbacks, we show that DL pickers can be used efficiently to process large seismic datasets and obtain results comparable to or better than current standard procedures.
|
Conde, D., Castillo, F. L., Escobar, C., García, C., Garcia Navarro, J. E., Sanz, V., et al. (2023). Forecasting Geomagnetic Storm Disturbances and Their Uncertainties Using Deep Learning. Space Weather, 21(11), e2023SW003474–27pp.
Abstract: Severe space weather produced by disturbed conditions on the Sun results in harmful effects both for humans in space and in high-latitude flights, and for technological systems such as spacecraft or communications. Also, geomagnetically induced currents (GICs) flowing on long ground-based conductors, such as power networks, potentially threaten critical infrastructures on Earth. The first step in developing an alarm system against GICs is to forecast them. This is a challenging task given the highly non-linear dependencies of the response of the magnetosphere to these perturbations. In the last few years, modern machine-learning models have been shown to be very good at predicting magnetic activity indices. However, such complex models are on the one hand difficult to tune, and on the other hand they are known to bring along potentially large prediction uncertainties which are generally difficult to estimate. In this work we aim at predicting the SYM-H index characterizing geomagnetic storms multiple hours ahead, using public interplanetary magnetic field (IMF) data from the Sun-Earth L1 Lagrange point and SYM-H data. We implement a type of machine-learning model called long short-term memory (LSTM) network. Our goal is to estimate the prediction uncertainties coming from a deep-learning model in the context of forecasting the SYM-H index. These uncertainties will be essential to set reliable alarm thresholds. The resulting uncertainties turn out to be sizable at the critical stages of the geomagnetic storms. Our methodology also includes an efficient optimization of important hyper-parameters of the LSTM network and robustness tests.
|
Esser, F., Madigan, M., Sanz, V., & Ubiali, M. (2023). On the coupling of axion-like particles to the top quark. J. High Energy Phys., 09(9), 063–39pp.
Abstract: In this paper we explore the coupling of a light axion-like particle (ALP) to top quarks. We use high-energy LHC probes, and examine both the direct probe to this coupling in associated production of a top pair with an ALP, and the indirect probe through loop-induced gluon fusion to an ALP leading to top pairs. Using the latest LHC Run II data, we provide the best limit on this coupling. We also compare these limits with those obtained from loop-induced couplings in diboson final states, finding that the $t\bar{t}$+MET channel is the best current handle on this coupling.
|
Barenboim, G., Del Debbio, L., Hirn, J., & Sanz, V. (2024). Exploring how a generative AI interprets music. Neural Comput. Appl., 36, 17007–17022.
Abstract: We aim to investigate how closely neural networks (NNs) mimic human thinking. As a step in this direction, we study the behavior of artificial neuron(s) that fire most when the input data score high on some specific emergent concepts. In this paper, we focus on music, where the emergent concepts are those of rhythm, pitch and melody as commonly used by humans. As a black box to pry open, we focus on Google’s MusicVAE, a pre-trained NN that handles music tracks by encoding them in terms of 512 latent variables. We show that several hundred of these latent variables are “irrelevant” in the sense that they can be set to zero with minimal impact on the reconstruction accuracy. The remaining few dozen latent variables can be sorted by order of relevance by comparing their variance. We show that the first few most relevant variables, and only those, correlate highly with dozens of human-defined measures that describe rhythm and pitch in music pieces, thereby efficiently encapsulating many of these human-understandable concepts in a few nonlinear variables.
|