Assessing the Predictability of Solar Energetic Particles with the use of Machine Learning techniques

Eleni Lavasa, Giorgos Giannopoulos, Athanasios Papaioannou, Anastasios Anastasiadis, IA Daglis, Angels Aran, David Pacheco, B Sanahuja
Solar Physics volume 296, Article number: 107 (2021)
Abstract. We present a consistent approach to the inherently imbalanced problem of binary prediction of solar energetic particle (SEP) events. The approach is based on solar flare data, coronal mass ejection (CME) data, and combinations thereof. We exploit several machine learning (ML) and conventional statistical techniques to predict SEPs. The methods used are logistic regression (LR), support vector machines (SVM), neural networks (NN) in the fully connected multi-layer perceptron (MLP) implementation, random forests (RF), decision trees (DTs), extremely randomized trees (XT) and extreme gradient boosting (XGB). We assess the methods employed and conclude that RF could be the prediction technique of choice for an optimal sample comprising both flares and CMEs. The best-performing method gives a Probability of Detection (POD) of 0.76(±0.06), a False Alarm Rate (FAR) of 0.34(±0.10), a true skill statistic (TSS) of 0.75(±0.05), and a Heidke skill score (HSS) of 0.69(±0.04). We further show that the most important features for the identification of SEPs in our sample are the CME speed, the CME width, and the flare soft X-ray (SXR) fluence.
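The four scores quoted in the abstract can all be derived from a binary confusion matrix. The following is a minimal Python sketch using the standard textbook definitions, not the authors' code. It assumes that FAR denotes the false alarm ratio FP/(TP+FP), which the abstract does not define explicitly. The class name `Confusion`, the function names, and the example counts are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion matrix for SEP / no-SEP forecasts (hypothetical helper)."""
    tp: int  # hits: SEP predicted and observed
    fn: int  # misses: SEP observed but not predicted
    fp: int  # false alarms: SEP predicted but not observed
    tn: int  # correct negatives


def pod(c: Confusion) -> float:
    """Probability of detection: fraction of observed events that were predicted."""
    return c.tp / (c.tp + c.fn)


def far(c: Confusion) -> float:
    """False alarm ratio (assumed meaning of FAR): fraction of alarms that were false."""
    return c.fp / (c.tp + c.fp)


def tss(c: Confusion) -> float:
    """True skill statistic: POD minus probability of false detection."""
    return pod(c) - c.fp / (c.fp + c.tn)


def hss(c: Confusion) -> float:
    """Heidke skill score: accuracy relative to random-chance forecasts."""
    num = 2 * (c.tp * c.tn - c.fp * c.fn)
    den = (c.tp + c.fn) * (c.fn + c.tn) + (c.tp + c.fp) * (c.fp + c.tn)
    return num / den


# Made-up, strongly imbalanced counts (not the paper's data).
# All denominators are assumed non-zero; no guard is included.
example = Confusion(tp=30, fn=10, fp=15, tn=945)

if __name__ == "__main__":
    for name, fn_ in (("POD", pod), ("FAR", far), ("TSS", tss), ("HSS", hss)):
        print(f"{name} = {fn_(example):.3f}")
```

TSS does not depend on the ratio of events to non-events, which is one reason it is commonly reported alongside HSS for imbalanced problems such as this one.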