In the SDOBenchmark dataset, each sample consists of 4 time steps (→) with 10 images (↓, 1, 2 ... 10):
- 256x256 px JPEG
- coming from 2 instruments onboard the SDO satellite: 8 channels from AIA, 2 from HMI
- missing images in many samples

The label is the peak emission ("flux") of the Sun within the following 24 hours:
- 1e-9 = "quiet sun"
- 1e-3 = largest flare in a decade
The dataset comes with 8'336 training samples and 886 test samples.
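The exact on-disk layout is documented in the repository; as a rough sketch (assuming the images have already been decoded into 2D arrays, and using zero-filling as one hypothetical way to handle the missing images), a sample can be assembled into a single tensor like this:

```python
import numpy as np

TIME_STEPS = 4   # time steps per sample (->)
CHANNELS = 10    # 8 AIA channels + 2 HMI channels
IMG_SIZE = 256   # 256x256 px JPEGs

def assemble_sample(images):
    """Stack the available images into a (4, 10, 256, 256) array.

    `images` maps (time_step, channel) -> 2D uint8 array.
    Missing images (common in this dataset) are left as zeros here;
    interpolation or an explicit mask are reasonable alternatives.
    """
    sample = np.zeros((TIME_STEPS, CHANNELS, IMG_SIZE, IMG_SIZE), dtype=np.float32)
    for (t, c), img in images.items():
        sample[t, c] = np.asarray(img, dtype=np.float32) / 255.0  # scale to [0, 1]
    return sample
```

For example, a sample with only a single image present still yields a full, fixed-shape tensor, which keeps batching simple downstream.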
We use the mean absolute error as a performance metric for this dataset.
Hints and tips about the data
- The data has gaps, but it's reasonably stable
- The data is not stationary
- The data is imbalanced, but the mean absolute error takes care of that for you in this case (the FAQ explains why)
- The data analysis gives a rough idea of the imbalance by counting the binned emissions
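Such a count can be reproduced quickly (a sketch, assuming the labels are available as a plain list of peak fluxes) with a histogram over logarithmically spaced bins spanning the label range from 1e-9 to 1e-3:

```python
import numpy as np

def binned_emission_counts(peak_fluxes, low=1e-9, high=1e-3, bins_per_decade=1):
    """Count labels in logarithmically spaced flux bins.

    With the defaults this gives one bin per decade between the
    "quiet sun" level (1e-9) and the largest flares (1e-3).
    """
    decades = int(round(np.log10(high) - np.log10(low)))
    edges = np.logspace(np.log10(low), np.log10(high), decades * bins_per_decade + 1)
    counts, _ = np.histogram(np.asarray(peak_fluxes), bins=edges)
    return edges, counts
```

On real labels, the lowest decades will dominate the counts, which is exactly the imbalance the hint above refers to.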
- SDO is a satellite mission orbiting Earth
- AIA and HMI are two instruments on SDO that record the activities of the sun
While SDO has been observing the Sun since 2010, we're using data from 2012 onward.
- Sample complexity: A single sample is a collection of images over four time steps
- Not many samples
- Regression problem
The easy answer is that we can't artificially produce or simulate more flares. But there is also
a strategic choice behind it:
Imagine you have to train a network to recognize cats. We'd give you a training set of 10 million images, recorded by cameras around our neighbourhood. While 10 million images is a great size for deep learning, you'd soon come to realize that all our images are from the same 100 cats. As a consequence, your model would overfit heavily and just recognize those 100 cats, e.g. by specific individual features. That's why we chose to preselect only a few of the most different images per cat.
The same is true for our images of Active Regions. It would learn to recognize these patches of the sun, and then "look up" in its neurons whether this specific patch will flare or not flare.
(This is also the reason why we have to make sure to have different cats / Active Regions in the training and test sets. Simply selecting images randomly would not be sufficient.)
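A minimal sketch of such a group-aware split, assuming a hypothetical `region_of` helper that maps a sample id to its Active Region (or cat):

```python
import random
from collections import defaultdict

def split_by_region(sample_ids, region_of, test_fraction=0.1, seed=0):
    """Split samples so that each Active Region lands in only one set.

    A plain random split over individual samples would leak region
    identity between train and test; splitting whole regions avoids that.
    `region_of` maps a sample id to its region id (hypothetical helper).
    """
    groups = defaultdict(list)
    for s in sample_ids:
        groups[region_of(s)].append(s)
    regions = sorted(groups)
    random.Random(seed).shuffle(regions)
    n_test = max(1, int(len(regions) * test_fraction))
    test_regions = set(regions[:n_test])
    train = [s for r in regions if r not in test_regions for s in groups[r]]
    test = [s for r in test_regions for s in groups[r]]
    return train, test
```

The same idea is available as `GroupShuffleSplit` in scikit-learn if you prefer a library implementation.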
Due to the logarithmic nature of the label peak_flux, the mean absolute error weighs errors on strong activity
much higher than errors on calm, low activity. While this matches our intentions, the data imbalance
(many more quiet samples than flares) helps to rebalance the otherwise predominant strong-flare contributions to some degree.
Further, while there are plenty of norms around, the mean absolute error is simple and available out of the box in practically all machine learning frameworks.
Other standard metrics (e.g. mean squared error) have less desirable characteristics for this prediction problem.
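A small worked example of this weighting effect: the same 50% relative error contributes vastly different amounts for a quiet-Sun sample and a strong flare, and squaring the error (MSE) exaggerates the gap further (the labels below are illustrative values at the two ends of the flux scale):

```python
# Same 50% relative error on a quiet-Sun label vs. a strong-flare label:
quiet_true, quiet_pred = 1e-9, 1.5e-9   # "quiet sun"
flare_true, flare_pred = 1e-3, 1.5e-3   # large flare

mae_quiet = abs(quiet_pred - quiet_true)   # ~5e-10
mae_flare = abs(flare_pred - flare_true)   # ~5e-4
# Under MAE the flare error dominates by about six orders of magnitude:
ratio_mae = mae_flare / mae_quiet          # ~1e6

mse_quiet = (quiet_pred - quiet_true) ** 2
mse_flare = (flare_pred - flare_true) ** 2
# Under MSE that ratio is squared (~1e12), so rare strong flares would
# dominate the metric even more aggressively:
ratio_mse = mse_flare / mse_quiet          # ~1e12
```

This is one way to see why MSE has less desirable characteristics here than MAE.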
Creating a solar flare prediction dataset requires a lot of domain knowledge and time. By providing an already existing dataset we hope to encourage machine learners to push the envelope of solar flare predictions. We therefore put a lot of effort into providing both great accessibility and high scientific quality.
Currently, it is still unknown how well models will perform on this dataset. Our goal is to create a benchmark as simple as
possible, yet without sacrificing scientific value.
We will gladly provide a more difficult prediction problem at a later stage.
Except for vertical flipping (upside down), I claim that all data augmentation will alter the underlying physics.
When you flip the images upside down, the solar rotation, the perspective distortions and the spherical projection remain the same. And there is no known difference between Active Regions in the upper and the lower hemispheres of the Sun.
Horizontal flipping, in contrast, would make the Sun appear to rotate in the other direction. It can still work if you process the images individually rather than as a time series.
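Assuming samples are stored as (time, channel, height, width) arrays, the physics-preserving vertical flip is then a single slice reversal:

```python
import numpy as np

def flip_vertical(sample):
    """Flip each image upside down.

    This is the one augmentation that leaves the physics intact:
    the apparent rotation direction, perspective distortion and
    spherical projection are unchanged, and no systematic difference
    between the upper and lower solar hemisphere is known.

    `sample` has shape (time, channel, height, width).
    """
    return sample[:, :, ::-1, :]
```

Applying it twice returns the original sample, so it can safely be used as a random 50/50 augmentation during training.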
As for random cropping, keep in mind that you might run into the same issue as mentioned in "Why are there so few training samples?".
On the SDOBenchmark GitHub issues page.
... for this dataset
| Model | Author | MAE | TSS | Notes |
|---|---|---|---|---|
| Fixed point baseline | Roman Bolzern | 1.53e-5 | 0.0 | ? |
| First competitive model | Roman Bolzern | 3.6e-5 | 0.45 | ? |
... for solar flare prediction in general
A non-exhaustive list of examples:
- Plenty of manual forecasts, many of them listed here
- The 24h prediction of solar monitor at solarmonitor.org
- Prediction of solar flares using signatures of magnetic field images
- Flare Prediction Using Photospheric and Coronal Image Data
- A random forest applied to HMI data (Arxiv here)
- Solar Flare Prediction using Multivariate Time Series Decision Trees