karabo.sourcedetection

Overview

Karabo comes with a source detection algorithm based on PyBDSF. This module provides wrappers around PyBDSF for automatic source detection.

Classes

class SourceDetectionEvaluation(sky: SkyModel, ground_truth: ndarray[Any, dtype[float64]], assignments: ndarray[Any, dtype[float64]], sky_idxs: ndarray[Any, dtype[int64]], source_detection: ISourceDetectionResult)
__init__(sky: SkyModel, ground_truth: ndarray[Any, dtype[float64]], assignments: ndarray[Any, dtype[float64]], sky_idxs: ndarray[Any, dtype[int64]], source_detection: ISourceDetectionResult) -> None

Class that holds the mapping between a source detection result and the ground truth.

Parameters:
  • sky – SkyModel where the assignment comes from

  • ground_truth – 2xn array of pixel positions of ground truth

  • assignments

    jx3 np.ndarray where each row represents an assignment:

    • first column is the ground_truth index

    • second column is the predicted source_detection.detected_sources index

    • third column is the Euclidean distance of the assignment

  • sky_idxs – Sky sources indices of SkyModel from assignment

  • source_detection – SourceDetectionResult from a previous source-detection

classmethod automatic_assignment_of_ground_truth_and_prediction(ground_truth: ndarray[Any, dtype[int64]] | ndarray[Any, dtype[float64]], detected: ndarray[Any, dtype[int64]] | ndarray[Any, dtype[float64]], max_dist: float, top_k: int = 3) -> ndarray[Any, dtype[float64]]
Automatic assignment of the predicted sources (detected) to the ground truth (ground_truth).

The strategy is the following (similar to AUTOMATIC SOURCE DETECTION IN ASTRONOMICAL IMAGES, P.61, Marc MASIAS MOYSET, 2014): The distance between each predicted source and each ground truth source is calculated. Any distances > max_dist are not considered. The pair of predicted and ground truth source with the smallest distance is assigned. The assignment is repeated until every ground truth source has an assignment, if possible, without allowing double assignments from the predicted sources to the ground truth or vice versa. Each ground truth source is therefore assigned a predicted source if at least one was in range and that predicted source was not already assigned to another ground truth source. If there are duplicate sources (e.g. same source, different frequency), the duplicates are removed and the assignment is done on the remaining sources.

Parameters:
  • ground_truth – nx2 np.ndarray with the ground truth pixel coordinates of the catalog

  • detected – kx2 np.ndarray with the predicted pixel coordinates of the image

  • max_dist – maximal allowed euclidean distance for assignment (in pixel domain)

  • top_k – number of top predictions to be considered in scipy.spatial.KDTree. A small value could lead to imperfect results.

Returns:

nx3 np.ndarray where each row represents an assignment:

  • first column represents the ground truth index (the return is sorted by this column)

    a negative index means the row holds a predicted source with no allocated ground truth

  • second column represents the predicted index

    a negative index means the row holds a ground-truth source with no allocated prediction

  • third column represents the Euclidean distance of the assignment

    “inf” means no allocation between ground truth and prediction for that source
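The greedy strategy described above can be sketched in plain Python. This is an illustrative sketch only, not Karabo's implementation: it compares all pairs by brute force instead of querying scipy.spatial.KDTree, it skips the duplicate-removal step, and the function name greedy_assign is hypothetical. It assumes that a row with ground-truth index -1 holds an unmatched prediction and a row with predicted index -1 holds an unmatched ground-truth source.

```python
import math
from typing import List, Sequence, Tuple

Row = Tuple[int, int, float]  # (ground-truth index, predicted index, distance)


def greedy_assign(
    ground_truth: Sequence[Sequence[float]],
    detected: Sequence[Sequence[float]],
    max_dist: float,
) -> List[Row]:
    """Hypothetical helper: closest-pair-first one-to-one assignment."""
    # Collect every candidate pair whose distance does not exceed max_dist.
    candidates = []
    for gi, gt_pos in enumerate(ground_truth):
        for di, det_pos in enumerate(detected):
            dist = math.dist(gt_pos, det_pos)
            if dist <= max_dist:
                candidates.append((dist, gi, di))
    candidates.sort()  # closest pairs first

    used_gt, used_det = set(), set()
    rows: List[Row] = []
    for dist, gi, di in candidates:
        # Skip pairs that would create a double assignment.
        if gi in used_gt or di in used_det:
            continue
        used_gt.add(gi)
        used_det.add(di)
        rows.append((gi, di, dist))

    # Unmatched sources get index -1 for the missing partner and distance inf.
    rows += [(gi, -1, math.inf) for gi in range(len(ground_truth)) if gi not in used_gt]
    rows += [(-1, di, math.inf) for di in range(len(detected)) if di not in used_det]
    return sorted(rows, key=lambda row: row[0])  # sorted by ground-truth index


truth = [(10, 10), (50, 50), (90, 90)]
found = [(11, 10), (52, 50), (200, 200)]
for row in greedy_assign(truth, found, max_dist=5.0):
    print(row)
```

With these inputs, two ground-truth sources are matched, the third has no detection in range, and the detection at (200, 200) remains unmatched.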

static calculate_evaluation_measures(assignments: ndarray[Any, dtype[float64]]) -> Tuple[int, int, int]

Calculates the True Positive (TP), False Positive (FP) and False Negative (FN) counts of the ground truth and predictions:

  • TP are the detections associated with a source

  • FP are detections without any associated source

  • FN are sources with no associated detection

Parameters:

assignments – nx3 np.ndarray where each row represents an assignment. The array is expected to have the form returned by automatic_assignment_of_ground_truth_and_prediction. Therefore, the non-assigned sources must have a value of “-1”.

Returns:

TP, FP, FN
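The counting rule can be illustrated with a short sketch. It is not Karabo's code, and the function name count_tp_fp_fn is hypothetical. Following the TP/FP/FN definitions above, it reads a row whose first column is -1 as a detection without a source (FP) and a row whose second column is -1 as a source without a detection (FN); that reading is an assumption.

```python
from typing import Iterable, Sequence, Tuple


def count_tp_fp_fn(assignments: Iterable[Sequence[float]]) -> Tuple[int, int, int]:
    """Hypothetical helper: count TP, FP and FN from assignment rows."""
    tp = fp = fn = 0
    for gt_idx, pred_idx, _dist in assignments:
        if gt_idx < 0:
            fp += 1  # detection with no associated ground-truth source
        elif pred_idx < 0:
            fn += 1  # ground-truth source with no associated detection
        else:
            tp += 1  # detection matched to a ground-truth source
    return tp, fp, fn


rows = [
    (-1, 2, float("inf")),
    (0, 0, 1.0),
    (1, 1, 2.0),
    (2, -1, float("inf")),
]
print(count_tp_fp_fn(rows))  # (2, 1, 1)
```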

plot(exclude_img: bool = False, show_legend: bool = True, filename: str | None = None) -> None

Plot the detected sources as green ‘x’ markers and the ground-truth sources as red ‘o’ markers on the original image on which the source detection was performed.

class SourceDetectionResult(detected_sources: ndarray[Any, dtype[float64]], source_image: Image)
__init__(detected_sources: ndarray[Any, dtype[float64]], source_image: Image) -> None

Generic source detection result class. Provide your source detection result as an array with the following layout:

index | ra | dec | pos X (pixel) | pos Y (pixel) | total_flux | peak_flux
0     | 30 | 200 | 400           | 500           | 0.345      | 0.34540

Columns can be left empty if the specified value is not found by your source detection algorithm. More columns can also be added at the end, as they are not used by any internal algorithm.

Parameters:
  • detected_sources – detected sources in array

  • source_image – Image, where the source detection was performed on
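As a rough illustration of the expected column order, the sketch below builds one row of such an array in plain Python. The helper name make_source_row and the labels pos_x and pos_y are hypothetical, and using NaN for a value the detection algorithm did not provide is an assumption, not documented Karabo behaviour.

```python
import math
from typing import List

# Column order taken from the layout described above; labels are illustrative.
COLUMNS = ("index", "ra", "dec", "pos_x", "pos_y", "total_flux", "peak_flux")


def make_source_row(**values: float) -> List[float]:
    """Hypothetical helper: build one row, filling missing values with NaN."""
    unknown = set(values) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unknown column(s): {sorted(unknown)}")
    return [float(values.get(name, math.nan)) for name in COLUMNS]


row = make_source_row(
    index=0, ra=30, dec=200, pos_x=400, pos_y=500, total_flux=0.345, peak_flux=0.34540
)
print(row)

# A detector that only reports pixel positions leaves the other values as NaN.
partial = make_source_row(index=1, pos_x=10, pos_y=20)
print(partial)
```

A list of such rows could then be converted with numpy.asarray before being passed as detected_sources.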

classmethod detect_sources_in_image(image: Image, beam: BeamType | None = None, verbose: bool = False, **kwargs: Any) -> _SourceDetectionResultType | None

Detect sources in an astronomical image using the PyBDSF process_image function.

Parameters

cls : Type[SourceDetectionResultType]

The class on which this method is called.

image : Image or List[Image]

Image object for source detection. Can be a single image or a list of images.

beam : Optional[BeamType]

The full width at half maximum (FWHM) of the restoring beam: BMAJ (arcsec), BMIN (arcsec), BPA (degree). If None, tries to extract from image metadata.

verbose : bool, default False

Whether to produce verbose output.

n_splits : int, default 0

The number of parts to split the image into for processing. A value greater than 1 requires Dask.

overlap : int, default 0

The overlap between split parts of the image in pixels.

**kwargs : Any

Additional keyword arguments to pass to the PyBDSF process_image function.

Returns

Optional[List[SourceDetectionResultType]]

A list of detected sources, or None if all pixels in the image are blanked or on failure.

Raises

RuntimeError

If an unexpected error occurs during the source detection process.

Notes

The Dask client has to be created with the setting processes=False to avoid issues with PyBDSF multiprocessing. See a similar issue here: https://stackoverflow.com/questions/51485212/multiprocessing-gives-assertionerror-daemonic-processes-are-not-allowed-to-have

If ‘n_splits’ is greater than 1 and ‘use_dask’ is True, the image will be split into multiple parts and processed in parallel using Dask. Overlap can be specified to prevent edge artifacts.
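To make the role of the overlap concrete, the following sketch computes overlapping pixel ranges along one image axis. It only illustrates the idea of splitting with overlap; how Karabo actually cuts the image is not stated here, and the function name split_with_overlap is hypothetical.

```python
from typing import List, Tuple


def split_with_overlap(length: int, n_splits: int, overlap: int) -> List[Tuple[int, int]]:
    """Hypothetical helper: (start, stop) ranges covering [0, length) on one axis."""
    if n_splits <= 1:
        return [(0, length)]  # no splitting requested
    step = -(-length // n_splits)  # ceiling division: nominal tile size
    ranges = []
    for start in range(0, length, step):
        # Extend each tile by `overlap` pixels on both sides, clipped to the axis.
        lo = max(0, start - overlap)
        hi = min(length, start + step + overlap)
        ranges.append((lo, hi))
    return ranges


print(split_with_overlap(100, 4, 5))  # [(0, 30), (20, 55), (45, 80), (70, 100)]
```

A source lying on a nominal tile border then falls fully inside at least one tile as long as it is not wider than the overlap.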

If the ‘beam’ parameter is not provided, the method will attempt to extract the beam parameters from the image metadata. A warning is raised if beam parameters are not found.

The PyBDSF process_image function is called for source detection, which is particularly designed for radio astronomical images. For details on this function, refer to the PyBDSF documentation.
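Because the method may return None, callers need to handle that case before using the result. The sketch below shows one way to do so. It assumes, as the signature indicates, that a single result object or None is returned. The helper name detect_and_save is hypothetical, and the detector class and image are passed in as arguments so that the sketch makes no assumptions about import paths or about how an Image is constructed.

```python
from pathlib import Path
from typing import Any, Optional, Union


def detect_and_save(
    detector_cls: Any,
    image: Any,
    out_path: Union[str, Path],
    **kwargs: Any,
) -> Optional[Any]:
    """Hypothetical helper: run detection and save the result if there is one.

    detector_cls is expected to provide detect_sources_in_image(image, **kwargs)
    and its result is expected to provide write_to_file(path), as documented
    for SourceDetectionResult.
    """
    result = detector_cls.detect_sources_in_image(image, **kwargs)
    if result is None:
        # Documented case: all pixels are blanked or the detection failed.
        return None
    result.write_to_file(out_path)
    return result
```

With Karabo installed, a call could look like detect_and_save(SourceDetectionResult, image, "detection.zip", verbose=True), where image is an existing Karabo Image.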

get_source_image() -> Image | None

Return the source image on which the source detection was performed.

Returns:

Karabo Image or None (if not supplied)

has_source_image() -> bool

Check if a source image is present.

Returns:

True if present, False if not present

write_to_file(path: Path | str) -> None

Save the source detection result to a ZIP archive containing the .fits source image and the source-finding catalog.

Parameters:
  • path – path to save the zip archive as
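Since the output is an ordinary ZIP archive, its contents can be inspected with the Python standard library. The sketch below lists the member names of such an archive. The helper name list_archive_members is hypothetical, and the actual file names inside the archive are not specified here.

```python
import zipfile
from pathlib import Path
from typing import List, Union


def list_archive_members(path: Union[str, Path]) -> List[str]:
    """Hypothetical helper: return the file names stored in a ZIP archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling list_archive_members on a path previously passed to write_to_file should show the .fits image and the catalog file.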