|
Otten, S., Caron, S., de Swart, W., van Beekveld, M., Hendriks, L., van Leeuwen, C., et al. (2021). Event generation and statistical sampling for physics with deep generative models and a density information buffer. Nat. Commun., 12(1), 2985–16pp.
Abstract: Simulating nature, and in particular processes in particle physics, requires expensive computations and sometimes would take much longer than scientists can afford. Here, we explore ways to a solution for this problem by investigating recent advances in generative modeling and present a study for the generation of events from a physical process with deep generative models. The simulation of physical processes requires not only the production of physical events, but also ensuring that these events occur with the correct frequencies. We investigate the feasibility of learning the event generation and the frequency of occurrence with several generative machine learning models to produce events like Monte Carlo generators. We study three processes: a simple two-body decay, and the processes $e^+e^- \to Z \to l^+l^-$ and $pp \to t\bar{t}$, including the decay of the top quarks and a simulation of the detector response. By buffering density information of encoded Monte Carlo events given the encoder of a Variational Autoencoder we are able to construct a prior for the sampling of new events from the decoder that yields distributions that are in very good agreement with real Monte Carlo events and are generated several orders of magnitude faster. Applications of this work include generic density estimation and sampling, targeted event generation via a principal component analysis of encoded ground truth data, anomaly detection and more efficient importance sampling, e.g., for the phase space integration of matrix elements in quantum field theories. Here, the authors report buffered-density variational autoencoders for the generation of physical events. This method is computationally less expensive than other traditional methods and, beyond accelerating the data generation process, it can help to steer the generation and to detect anomalies.
|
|
|
Otten, S., Rolbiecki, K., Caron, S., Kim, J. S., Ruiz de Austri, R., & Tattersall, J. (2020). DeepXS: fast approximation of MSSM electroweak cross sections at NLO. Eur. Phys. J. C, 80(1), 12–9pp.
Abstract: We present a deep learning solution to the prediction of particle production cross sections over a complicated, high-dimensional parameter space. We demonstrate the applicability by providing state-of-the-art predictions for the production of charginos and neutralinos at the Large Hadron Collider (LHC) at the next-to-leading order in the phenomenological MSSM-19 and explicitly demonstrate the performance for $pp \to \tilde{\chi}^+_1 \tilde{\chi}^-_1$, $\tilde{\chi}^0_2 \tilde{\chi}^0_2$ and $\tilde{\chi}^0_2 \tilde{\chi}^\pm_1$ as a proof of concept which will be extended to all SUSY electroweak pairs. We obtain errors that are lower than the uncertainty from scale and parton distribution functions with mean absolute percentage errors of well below 0.5% allowing a safe inference at the next-to-leading order with inference times that improve the Monte Carlo integration procedures that have been available so far by a factor of $\mathcal{O}(10^7)$ from $\mathcal{O}(\mathrm{min})$ to $\mathcal{O}(\mu\mathrm{s})$ per evaluation.
|
|
|
Panes, B., Eckner, C., Hendriks, L., Caron, S., Dijkstra, K., Johannesson, G., et al. (2021). Identification of point sources in gamma rays using U-shaped convolutional neural networks and a data challenge. Astron. Astrophys., 656, A62–18pp.
Abstract: Context. At GeV energies, the sky is dominated by the interstellar emission from the Galaxy. With limited statistics and spatial resolution, accurately separating point sources is therefore challenging. Aims. Here we present the first application of deep learning based algorithms to automatically detect and classify point sources from gamma-ray data. For concreteness we refer to this approach as AutoSourceID. Methods. To detect point sources, we utilized U-shaped convolutional networks for image segmentation and k-means for source clustering and localization. We also explored the Centroid-Net algorithm, which is designed to find and count objects. Using two algorithms allows for a cross check of the results, while a combination of their results can be used to improve performance. The training data are based on 9.5 years of exposure from the Fermi Large Area Telescope (Fermi-LAT) and we used source properties of active galactic nuclei (AGNs) and pulsars (PSRs) from the fourth Fermi-LAT source catalog in addition to several models of background interstellar emission. The results of the localization algorithm are fed into a classification neural network that is trained to separate the three general source classes (AGNs, PSRs, and FAKE sources). Results. We compared our localization algorithms qualitatively with traditional methods and find them to have similar detection thresholds. We also demonstrate the robustness of our source localization algorithms to modifications in the interstellar emission models, which presents a clear advantage over traditional methods. The classification network is able to discriminate between the three classes with typical accuracy of ~70%, as long as balanced data sets are used in classification training. We published online our training data sets and analysis scripts and invite the community to join the data challenge aimed to improve the localization and classification of gamma-ray point sources.
|
|
|
Stoppa, F., Bhattacharyya, S., Ruiz de Austri, R., Vreeswijk, P., Caron, S., Zaharijas, G., et al. (2023). AutoSourceID-Classifier: Star-galaxy classification using a convolutional neural network with spatial information. Astron. Astrophys., 680, A109–16pp.
Abstract: Aims. Traditional star-galaxy classification techniques often rely on feature estimation from catalogs, a process susceptible to introducing inaccuracies, thereby potentially jeopardizing the classification's reliability. Certain galaxies, especially those not manifesting as extended sources, can be misclassified when their shape parameters and flux solely drive the inference. We aim to create a robust and accurate classification network for identifying stars and galaxies directly from astronomical images. Methods. The AutoSourceID-Classifier (ASID-C) algorithm developed for this work uses 32x32 pixel single filter band source cutouts generated by the previously developed AutoSourceID-Light (ASID-L) code. By leveraging convolutional neural networks (CNN) and additional information about the source position within the full-field image, ASID-C aims to accurately classify all stars and galaxies within a survey. Subsequently, we employed a modified Platt scaling calibration for the output of the CNN, ensuring that the derived probabilities were effectively calibrated, delivering precise and reliable results. Results. We show that ASID-C, trained on MeerLICHT telescope images and using the Dark Energy Camera Legacy Survey (DECaLS) morphological classification, is a robust classifier and outperforms similar codes such as SourceExtractor. To facilitate a rigorous comparison, we also trained an eXtreme Gradient Boosting (XGBoost) model on tabular features extracted by SourceExtractor. While this XGBoost model approaches ASID-C in performance metrics, it does not offer the computational efficiency and reduced error propagation inherent in ASID-C's direct image-based classification approach. ASID-C excels in low signal-to-noise ratio and crowded scenarios, potentially aiding in transient host identification and advancing deep-sky astronomy.
|
|
|
Stoppa, F., Ruiz de Austri, R., Vreeswijk, P., Bhattacharyya, S., Caron, S., Bloemen, S., et al. (2023). AutoSourceID-FeatureExtractor: Optical image analysis using a two-step mean variance estimation network for feature estimation and uncertainty characterisation. Astron. Astrophys., 680, A108–14pp.
Abstract: Aims. In astronomy, machine learning has been successful in various tasks such as source localisation, classification, anomaly detection, and segmentation. However, feature regression remains an area with room for improvement. We aim to design a network that can accurately estimate sources' features and their uncertainties from single-band image cutouts, given the approximated locations of the sources provided by the previously developed code AutoSourceID-Light (ASID-L) or other external catalogues. This work serves as a proof of concept, showing the potential of machine learning in estimating astronomical features when trained on meticulously crafted synthetic images and subsequently applied to real astronomical data. Methods. The algorithm presented here, AutoSourceID-FeatureExtractor (ASID-FE), uses single-band cutouts of 32x32 pixels around the localised sources to estimate flux, sub-pixel centre coordinates, and their uncertainties. ASID-FE employs a two-step mean variance estimation (TS-MVE) approach to first estimate the features and then their uncertainties without the need for additional information, for example the point spread function (PSF). For this proof of concept, we generated a synthetic dataset comprising only point sources directly derived from real images, ensuring a controlled yet authentic testing environment. Results. We show that ASID-FE, trained on synthetic images derived from the MeerLICHT telescope, can predict more accurate features with respect to similar codes such as SourceExtractor and that the two-step method can estimate well-calibrated uncertainties that are better behaved compared to similar methods that use deep ensembles of simple MVE networks. Finally, we evaluate the model on real images from the MeerLICHT telescope and the Zwicky Transient Facility (ZTF) to test its transfer learning abilities.
|
|