Centroid data

IRISreader provides access to the 53 Mg II k centroids found in the study:

B. Panos, L. Kleint, C. Huwyler, S. Krucker, M. Melchior, D. Ullmann, S. Voloshynovskiy. 2018
Identifying typical Mg II flare spectra using machine learning
Astrophysical Journal, Volume 861, Number 1

The centroids were created in a semi-supervised way to provide a dictionary of the different solar flare physics observed in the Mg II k line.

IRISreader provides functions to assign these centroids to arbitrary observations containing the Mg II k line window.

irisreader.data.mg2k_centroids.get_mg2k_centroids

irisreader.data.mg2k_centroids.get_mg2k_centroids(bins=216)

Returns the Mg II k centroids found in the study ‘Identifying typical Mg II flare spectra using machine learning’, by B. Panos et al. 2018.

The data contains 53 centroids with 216 wavelength bins between LAMBDA_MIN = 2793.8500976562500 and LAMBDA_MAX = 2799.3239974882454.

To assign an observed spectrum to a centroid, the spectrum has to be interpolated to the wavelength grid of the centroids, normalized by dividing it by its maximum, and then matched to the closest centroid with a 1-nearest-neighbour search.

Interpolation on a raster_cube instance:

raster_image = raster.get_interpolated_image_step(
        step = <step>,
        lambda_min = LAMBDA_MIN,
        lambda_max = LAMBDA_MAX,
        n_breaks = 216
        )

Normalization:

import numpy as np
raster_image /= np.max( raster_image, axis=1 ).reshape(-1,1)  # divide each row (spectrum) by its maximum

Nearest neighbour assignment:

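from sklearn.neighbors import NearestCentroid
# one class label per centroid: row i of `centroids` gets label i
knc = NearestCentroid()
knc.fit( X=centroids, y=list( range( centroids.shape[0] ) ) )
# each row of raster_image receives the label of its closest centroid
assigned_centroids = knc.predict( raster_image )

Parameters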

bins (int) – Number of bins to interpolate to (defaults to 216)

Returns

array with shape (53, bins), i.e. (53, 216) by default

Return type

mg2k_centroids

irisreader.data.mg2k_centroids.assign_mg2k_centroids

irisreader.data.mg2k_centroids.assign_mg2k_centroids(X, centroids=None)

Assigns the Mg II k centroids found in the study ‘Identifying typical Mg II flare spectra using machine learning’, by B. Panos et al. 2018, to the Mg II k spectra supplied in X. The centroids are assigned using a nearest neighbour procedure.

The spectra in X have to be interpolated to 216 wavelength bins between LAMBDA_MIN = 2793.8500976562500 and LAMBDA_MAX = 2799.3239974882454. For example:

X = raster.get_interpolated_image_step(
        step = <step>,
        lambda_min = LAMBDA_MIN,
        lambda_max = LAMBDA_MAX,
        n_breaks = 216
        )

Parameters
  • X (numpy.array) – interpolated raster image of shape (n_spectra, bins)

  • centroids (numpy.array) – If None, the centroids from the study above are used; otherwise pass an array of shape (n_centroids, n_bins). Important: the spectra in ‘X’ and in ‘centroids’ must cover the same wavelength region.

Returns

numpy vector with shape (X.shape[0],), one assigned centroid per spectrum in X

Return type

assigned_mg2k_centroids
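
The nearest neighbour assignment described above can be illustrated without IRISreader or scikit-learn. The following is a minimal sketch that interprets the procedure as a plain Euclidean nearest-centroid search; the function name `assign_nearest_centroid` and the toy arrays are hypothetical and only stand in for real centroids and spectra.

```python
import numpy as np

def assign_nearest_centroid(X, centroids):
    """Return, for each row of X, the index of the closest centroid (Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # distances[i, j] = distance between spectrum i and centroid j
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1)

# Synthetic stand-ins: 3 "centroids" and 4 "spectra" with 5 bins each
toy_centroids = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 0.0, 1.0]])
toy_X = np.array([[0.9, 0.1, 0.0, 0.0, 0.0],
                  [0.0, 0.1, 0.8, 0.1, 0.0],
                  [0.0, 0.0, 0.1, 0.2, 1.0],
                  [1.0, 0.0, 0.1, 0.0, 0.0]])

assigned = assign_nearest_centroid(toy_X, toy_centroids)
print(assigned)        # [0 1 2 0]
print(assigned.shape)  # (4,)
```

The result has one entry per row of the input, which is why both arrays must share the same number of bins and the same wavelength region.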

irisreader.data.mg2k_centroids.normalize

irisreader.data.mg2k_centroids.normalize(X)

Divides each row of X by its maximum to make sure that the maximum value per row is 1.

Parameters

X (numpy.array) – raster image to normalize
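
As an illustration of the row-wise normalization described above, here is a minimal NumPy sketch. It is not the library's implementation; the name `normalize_rows` and the sample values are hypothetical.

```python
import numpy as np

def normalize_rows(X):
    """Divide each row of X by its maximum (hypothetical stand-in for normalize)."""
    X = np.asarray(X, dtype=float)
    # keepdims=True keeps the row maxima as a column, so the division broadcasts row by row
    return X / np.max(X, axis=1, keepdims=True)

spectra = np.array([[1.0, 2.0, 4.0],
                    [10.0, 5.0, 2.5]])
normalized = normalize_rows(spectra)
print(normalized.max(axis=1))  # every row now peaks at 1
```

A row whose maximum is 0 would cause a division by zero in this sketch; how IRISreader treats such rows is not stated here.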

irisreader.data.mg2k_centroids.interpolate

irisreader.data.mg2k_centroids.interpolate(raster, step, bins=216, lambda_min=2793.85009765625, lambda_max=2799.3239974882454)

Returns an interpolated image step from the raster.

Parameters
  • raster (irisreader.raster_cube) – raster_cube instance

  • step (int) – image step in the raster

  • bins (int) – number of bins to interpolate to (defaults to 216)

  • lambda_min (float) – wavelength value where interpolation should start

  • lambda_max (float) – wavelength value where interpolation should stop

Returns

interpolated image step as an array with shape (n_spectra, bins)

Return type

numpy.array
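
To make the role of bins, lambda_min and lambda_max concrete, the sketch below resamples synthetic spectra onto an equally spaced wavelength grid with `numpy.interp`. This is an assumption about what the interpolation amounts to (linear interpolation, both endpoints included), not the method used by raster_cube; the name `interpolate_spectra` and the Gaussian test spectra are hypothetical.

```python
import numpy as np

LAMBDA_MIN = 2793.85009765625
LAMBDA_MAX = 2799.3239974882454

def interpolate_spectra(wavelengths, spectra, bins=216,
                        lambda_min=LAMBDA_MIN, lambda_max=LAMBDA_MAX):
    """Linearly resample each row of `spectra` onto `bins` equally spaced wavelengths."""
    # assumed target grid: equally spaced, endpoints included
    target = np.linspace(lambda_min, lambda_max, bins)
    return np.vstack([np.interp(target, wavelengths, row) for row in spectra])

# Synthetic input: 3 Gaussian "spectra" sampled on a wider, coarser wavelength grid
wavelengths = np.linspace(2790.0, 2805.0, 400)
spectra = np.vstack([np.exp(-0.5 * ((wavelengths - center) / 0.5) ** 2)
                     for center in (2795.0, 2796.5, 2798.0)])

image = interpolate_spectra(wavelengths, spectra)
print(image.shape)  # (3, 216)
```

Each row of the result has the same length as a default centroid, so it can be normalized and compared against the centroids directly.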

irisreader.data.mg2k_centroids.get_mg2k_centroid_table

irisreader.data.mg2k_centroids.get_mg2k_centroid_table(obs, centroids=None, lambda_min=2793.85009765625, lambda_max=2799.3239974882454, crop_raster=False)

Returns a data frame with centroid counts for each raster image of a given observation.

Parameters
  • obs (str) – Path to observation

  • centroids (numpy.array) – if None, the centroids defined in the above study will be used, otherwise an array of shape (n_centroids, n_bins) should be passed

  • lambda_min (float) – wavelength value where interpolation should start

  • lambda_max (float) – wavelength value where interpolation should stop

  • crop_raster (bool) – Whether to crop the raster before assigning centroids. If set to False, spectra that are -200 everywhere will be assigned to centroid 51, and spectra that are -200 in only part of the window will be assigned to the nearest centroid.

Returns

  • centroids_df (pd.DataFrame) – Data frame with image ids and assigned centroids

  • assigned_centroids (list) – List with array of assigned centroids for every raster image
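
The idea of counting centroids per raster image can be sketched as follows. The layout of the actual data frame is not specified here, so this example only builds a count matrix with one row per image and one column per centroid; `count_centroids` and the assignment values are hypothetical.

```python
import numpy as np

N_CENTROIDS = 53  # number of centroids stated in this document

def count_centroids(assigned_per_image, n_centroids=N_CENTROIDS):
    """Build an (n_images, n_centroids) matrix of centroid counts."""
    # bincount tallies how often each centroid index occurs in one image
    return np.vstack([np.bincount(a, minlength=n_centroids)
                      for a in assigned_per_image])

# Synthetic assignments for two raster images (values are made up)
assigned_per_image = [np.array([0, 0, 51, 3]),
                      np.array([3, 3, 3, 52, 0])]

counts = count_centroids(assigned_per_image)
print(counts.shape)  # (2, 53)
print(counts[1, 3])  # 3
```

Such a matrix could be wrapped in a pandas DataFrame with the image ids as index if a tabular view is preferred.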