karabo.data package
Submodules
karabo.data.external_data module
- class ContainerContents(remote_url: str, regexr_pattern: str)
Bases:
DownloadObject
- get_all(verbose: bool = True) List[str]
Gets all objects and returns their corresponding cache paths as a list.
- get_container_content() str
Gets the remote container-content as str.
- get_file_paths() List[str]
Applies regexr_pattern to container-objects.
- is_available() bool
Checks whether the container itself is available, not whether specific files are. The result depends on regexr_pattern.
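The pattern-based selection described for get_file_paths can be pictured with a small stand-alone sketch. It does not call Karabo: the listing text, the file names and the helper filter_container_paths are hypothetical, and the real class may parse the container content differently.

```python
import re
from typing import List


def filter_container_paths(container_content: str, regexr_pattern: str) -> List[str]:
    """Return every substring of the listing that matches the pattern.

    Hypothetical helper for illustration; not part of Karabo.
    """
    return [match.group(0) for match in re.finditer(regexr_pattern, container_content)]


# Hypothetical container listing with one object name per line.
listing = "Abell_133.fits\nAbell_194.fits\nREADME.txt\n"
fits_files = filter_container_paths(listing, r"Abell_\d+\.fits")
print(fits_files)
```

Only the names matching the pattern survive, which is why the availability check of a container depends on the pattern that was passed in.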
- class DiffuseEmissionHaslam408DownloadObject
Bases:
SingleFileDownloadObject
- class DownloadObject(remote_base_url: str)
Bases:
object
Download handler for remote files & dirs.
Important: There is a remote file-naming-convention, to be able to provide updates of cached dirs/files & to simplify maintainability. The convention for each object is <dirname><file|dir>_v<version>, where the version should be an integer, starting from 1. <dirname> should be the same as <file|dir>. The additional <dirname> is to have a single directory for each object, so that additional file/dir versions don’t disturb the overall remote structure.
The version of a downloaded object is determined by the current version of Karabo, meaning that they’re hard-coded. Because Karabo relies partially on remote-objects, we don’t guarantee their availability for deprecated Karabo versions.
- URL_SEP: Final = '/'
- static download(url: str, local_file_path: Path | str, verify: bool = True, verbose: bool = True) int
Downloads url to local_file_path through a GET-request.
- Args:
url: Resource to download.
local_file_path: Local file-path.
verify: Validate the server’s certificate?
verbose: Verbose?
- Returns:
Status code (currently always 200; any other outcome raises a RuntimeError).
- get_object(remote_file_path: str, verify: bool = True, verbose: bool = True) str
Gets the requested file-path of this object, downloading the resource and caching it on disk if that has not already been done.
- Args:
remote_file_path: Remote file-path, relative to its base url.
verify: Validate the server’s certificate if a download is needed?
verbose: Verbose download if it’s needed?
- Returns:
Local file-path of the remote-hosted object.
- static is_url_available(url: str) bool
Checks whether the url is available or not.
- Returns:
True if available, else False
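The download-then-cache behaviour described for get_object can be sketched as follows. This is a minimal illustration under stated assumptions, not Karabo's implementation: it uses urllib from the standard library, the helper get_cached and the cache layout are hypothetical, and a file:// URL stands in for the remote server so that the example runs offline.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def get_cached(url: str, cache_dir: Path) -> Path:
    """Download url into cache_dir unless a cached copy already exists.

    Hypothetical helper for illustration; not part of Karabo.
    """
    local_path = cache_dir / url.rsplit("/", 1)[-1]
    if not local_path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # The file is written only after the whole response has been read.
        with urlopen(url) as response:
            local_path.write_bytes(response.read())
    return local_path


with tempfile.TemporaryDirectory() as tmp:
    # A local file plays the role of the remote object (assumption).
    source = Path(tmp) / "remote" / "example_v1.txt"
    source.parent.mkdir()
    source.write_text("payload")
    cache = Path(tmp) / "cache"

    first = get_cached(source.as_uri(), cache)
    source.write_text("changed")  # the "remote" changes afterwards
    second = get_cached(source.as_uri(), cache)
    cached_text = second.read_text()
```

The second call returns the cached copy without touching the source again, which is why a new remote version needs a new name under the naming convention rather than an in-place update.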
- class ExampleHDF5Map
Bases:
SingleFileDownloadObject
- class GLEAMSurveyDownloadObject
Bases:
SingleFileDownloadObject
- class HISourcesSmallCatalogDownloadObject
Bases:
SingleFileDownloadObject
- class MALSSurveyV3DownloadObject
Bases:
SingleFileDownloadObject
- class MGCLSContainerDownloadObject(regexr_pattern: str)
Bases:
ContainerContents
- class MIGHTEESurveyDownloadObject
Bases:
SingleFileDownloadObject
- class SingleFileDownloadObject(remote_file_path: str, remote_base_url: str)
Bases:
DownloadObject
Abstract single object download handler.
- get(verify: bool = True, verbose: bool = True) str
- is_available() bool
karabo.data.obscore module
ObsCore Data Model.
https://ivoa.net/documents/ObsCore/
Recommended IVOA documents: https://www.ivoa.net/documents/index.html
- class FitsHeaderAxes(x: FitsHeaderAxis = <factory>, y: FitsHeaderAxis = <factory>, freq: FitsHeaderAxis = <factory>)
Bases:
object
Fits file axes description.
Needed for file-parsing for axis-position and unit-transformation.
- Args:
x: X/RA axis (default: axis=1, unit=deg) of image.
y: Y/DEC axis (default: axis=2, unit=deg) of image.
freq: Freq axis (default: axis=3, unit=Hz) of image.
- freq: FitsHeaderAxis
- class FitsHeaderAxis(axis: int, unit: UnitBase)
Bases:
object
Fits header axis dataclass.
Descriptive dataclass for .fits axis and unit allocation infos.
- Args:
axis: Axis number.
unit: Unit of value CRVAL and increment CDELT of axis in the .fits file.
- axis: int
- cdelt(header: Header) Quantity
CDELT{axis} increment at reference point in CTYPE{axis} unit.
- Args:
header: Header to extract increment from.
- Returns:
Value as astropy Quantity.
- crpix(header: Header) float
CRPIX{axis} location at reference point along axis.
- Args:
header: Header to extract location from.
- Returns:
Location of axis.
- crval(header: Header) Quantity
CRVAL{axis} value at reference point in CTYPE{axis} unit.
- Args:
header: Header to extract value from.
- Returns:
Value as astropy Quantity.
- ctype(header: Header) str
CTYPE{axis} unit type of axis.
This is just a str, not an astropy unit-name.
- Args:
header: Header to extract unit from.
- Returns:
Unit as str.
- naxis(header: Header) int
NAXIS{axis} length of axis.
- Args:
header: Header to extract length from.
- Returns:
Length of axis.
- unit: UnitBase
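The accessors above all read a header keyword whose name is the FITS keyword stem followed by the axis number. A minimal sketch of that lookup, using a plain dict in place of an astropy Header and plain floats in place of Quantity objects (both simplifications are assumptions made for illustration; axis_keyword and axis_extent are hypothetical helpers):

```python
from typing import Mapping, Union

HeaderValue = Union[int, float, str]


def axis_keyword(stem: str, axis: int) -> str:
    """Build a per-axis FITS keyword such as CRVAL1 or NAXIS3."""
    return f"{stem}{axis}"


def axis_extent(header: Mapping[str, HeaderValue], axis: int) -> float:
    """Extent covered by an axis: number of pixels times the absolute increment."""
    naxis = int(header[axis_keyword("NAXIS", axis)])
    cdelt = float(header[axis_keyword("CDELT", axis)])
    return naxis * abs(cdelt)


# Hypothetical header values for the first (RA) axis of an image.
header = {
    "NAXIS1": 2048,
    "CRPIX1": 1025.0,
    "CRVAL1": 250.0,
    "CDELT1": -0.001,
    "CTYPE1": "RA---SIN",
}
extent_deg = axis_extent(header, axis=1)
print(axis_keyword("CRVAL", 1), extent_deg)
```

In this sketch the result is in whatever unit the header uses for that axis; the unit field of FitsHeaderAxis is what tells the real accessors how to interpret such values.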
- class ObsCoreMeta(dataproduct_type: Literal['image', 'cube', 'spectrum', 'sed', 'timeseries', 'visibility', 'event', 'measurements'] | None = None, dataproduct_subtype: str | None = None, calib_level: Literal[0, 1, 2, 3, 4] | None = None, obs_collection: str | None = None, obs_id: str | None = None, obs_publisher_did: str | None = None, obs_title: str | None = None, obs_creator_did: str | None = None, target_class: str | None = None, access_url: str | None = None, access_format: str | None = None, access_estsize: int | None = None, target_name: str | None = None, s_ra: float | None = None, s_dec: float | None = None, s_fov: float | None = None, s_region: str | None = None, s_resolution: float | None = None, s_xel1: int | None = None, s_xel2: int | None = None, s_pixel_scale: float | None = None, t_min: float | None = None, t_max: float | None = None, t_exptime: float | None = None, t_resolution: float | None = None, t_xel: int | None = None, em_min: float | None = None, em_max: float | None = None, em_res_power: float | None = None, em_xel: int | None = None, em_ucd: str | None = None, o_ucd: str | None = None, pol_states: str | None = None, pol_xel: int | None = None, facility_name: str | None = None, instrument_name: str | None = None)
Bases:
object
IVOA ObsCore v1.1 metadata (TAP column names).
This doesn’t describe a full ObsCoreDM, only the mandatory and non-mandatory fields defined in the SRCNet Rucio database. You have to create the actual JSON sent to a specific ObsTAP service yourself.
The Args description below gives only a rough idea of the corresponding values. The ObsCore v1.1 documentation provides a more detailed description.
- Args:
- dataproduct_type: Logical data product type (image etc.). image, cube,
spectrum, sed, timeseries, visibility, event or measurements.
- dataproduct_subtype: Data product specific type defined by the ObsTAP provider.
This is not a useful value for global discovery, but within an archive.
- calib_level: Calibration level {0, 1, 2, 3, 4} (mandatory).
0: Raw instrumental data.
1: Instrumental data in a standard format (FITS, VOTable, etc.)
2: Calibrated, science ready measurements without instrument signature.
- 3: Enhanced data products like mosaics, drizzled images or heavily
processed survey fields. May represent a combination of data from multiple primary obs.
4: Analysis data products generated after scientific data manipulation.
- obs_collection: Name of the data collection (mandatory). Either registered
shortname, full registered IVOA identifier or a data provider defined shortname. Often used pattern: <facility-name>/<instrument-name>.
- obs_id: Observation ID (mandatory).
All data-products from a single observation should share the same obs_id. This is a free-form str-ID with no prescribed format. Must be unique to a provider.
- obs_publisher_did: Dataset identifier given by the publisher (mandatory).
IVOA dataset identifier. Must be a unique value within the namespace controlled by the dataset publisher (data center). ObsCoreMeta.get_ivoid may help to create this value.
obs_title: Brief description of dataset in free format.
obs_creator_did: IVOA dataset identifier given by the creator.
- target_class: Class of the target/object as in SSA.
Either SIMBAD-DB (see https://simbad.cds.unistra.fr/guide/otypes.htx Object type code), OR NED-DB types (see https://ned.ipac.caltech.edu/help/ui/nearposn-list_objecttypes).
access_url: URL used to access (download) dataset.
access_format: File content format (MIME type) (fits, jpeg, zip, etc.).
access_estsize: [kbyte] Estimated size of dataset from access_url.
- target_name: Astronomical object observed, if any. This is typically the name
of an astronomical object, but could be the name of a survey field.
s_ra: [deg] Central right ascension, ICRS.
s_dec: [deg] Central declination, ICRS.
- s_fov: [deg] Region covered by the data product. For a circular region, this
is the diameter. For most data products, the value should be large enough to include the entire area of the observation. For detailed spatial coverage, the s_region attribute can be used.
- s_region: Sky region covered by the data product (expressed in ICRS frame).
It’s an ‘spoly’ (spherical polygon) type, which is described in https://pgsphere.github.io/doc/funcs.html#funcs.spoly. Use spoly, scircle or spoint to create the formatted str. E.g. for spoly: {(204.712d,+47.405d),(204.380d,+48.311d),(202.349d,+49.116d), (200.344d,+48.458d),(199.878d,+47.521d),(200.766d,+46.230d), (202.537d,+45.844d),(204.237d,+46.55d)}.
- s_resolution: [arcsec] Smallest resolvable spatial resolution of data as FWHM.
If spatial frequency sampling is complex (e.g. interferometry), a typical value for spatial resolution estimate should be given.
s_xel1: Number of elements along the first spatial axis.
s_xel2: Number of elements along the second spatial axis.
- s_pixel_scale: Sampling period in world coordinate units along the spatial axis.
It’s the distance in WCS units between two pixel centers.
t_min: [d] Observation start time in Modified Julian Day (MJD).
t_max: [d] Observation stop time in Modified Julian Day (MJD).
- t_exptime: [s] Total exposure time. For simple exposures: t_max - t_min.
For data where the exposure is not constant over the entire data product, the median exposure time per pixel is a good way to characterize the typical value.
- t_resolution: [s] Minimal interpretable interval between two points along time.
This can be an average or representative value. For products with no sampling along the time axis, it could be set to exposure time or null.
t_xel: Number of elements along the time axis.
em_min: [m] Minimal spectral value observed, expressed as a vacuum wavelength.
em_max: [m] Maximum spectral value observed, expressed as a vacuum wavelength.
em_res_power: Spectral resolving power λ / δλ.
em_xel: Number of elements along the spectral axis.
- em_ucd: Nature of the spectral axis. This is an em (electromagnetic spectrum)
UCD (UCD-string see o_ucd), e.g. em.freq, em.wl or em.energy. Note: For ObsTAP implementation, the spectral axis coordinates are constrained as a wavelength quantity expressed in meters.
- o_ucd: UCD (semantic annotation(s)) of observable (e.g. phot.flux.density).
A UCD is a string containing ; separated words, which can be separated into atoms. The UCD-list is evolving over time and far too extensive to describe here. Please have a look at the recommended version of IVOA UCDlist document at https://www.ivoa.net/documents/index.html.
- pol_states: List of polarization states or NULL if not applicable.
Allowed: I, Q, U, V, RR, LL, RL, LR, XX, YY, XY, YX, POLI, POLA.
pol_xel: Number of polarization samples in pol_states.
facility_name: Name of the facility used for this observation.
instrument_name: Name of the instrument used for this observation.
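Two of the unit conventions in the Args list are easy to get wrong: t_min and t_max are Modified Julian Days, and em_min and em_max are vacuum wavelengths in meters even for radio data that is naturally described by frequency. The sketch below shows both conversions with the standard library only. The helper names and the example values are hypothetical, and the time conversion treats UTC days as exactly 86400 s long, so it ignores leap seconds.

```python
from datetime import datetime, timezone

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)
SPEED_OF_LIGHT_M_S = 299_792_458.0


def to_mjd(moment: datetime) -> float:
    """Convert a timezone-aware datetime to Modified Julian Day."""
    return (moment - MJD_EPOCH).total_seconds() / 86400.0


def freq_to_wavelength_m(freq_hz: float) -> float:
    """Convert a frequency in Hz to a vacuum wavelength in meters."""
    return SPEED_OF_LIGHT_M_S / freq_hz


# Hypothetical six-hour observation.
t_min = to_mjd(datetime(2000, 1, 1, 0, 0, tzinfo=timezone.utc))
t_max = to_mjd(datetime(2000, 1, 1, 6, 0, tzinfo=timezone.utc))
t_exptime = (t_max - t_min) * 86400.0  # [s], simple exposure

# Hypothetical band from 1.0 GHz to 1.5 GHz.
# The highest frequency gives the shortest wavelength, hence em_min.
em_min = freq_to_wavelength_m(1.5e9)
em_max = freq_to_wavelength_m(1.0e9)
print(t_min, t_max, t_exptime, em_min, em_max)
```

Note the swap on the spectral axis: the upper band edge in frequency becomes em_min, and the lower band edge becomes em_max.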
- access_estsize: int | None = None
- access_format: str | None = None
- access_url: str | None = None
- calib_level: Literal[0, 1, 2, 3, 4] | None = None
- check_ObsCoreMeta(*, verbose: bool = False) bool
Checks whether ObsCoreMeta is ready for serialization.
This doesn’t perform a full check of whether all field-values are valid. Currently supported checks:
- Presence of mandatory fields
- Polarization fields
- Axes fields
- Args:
verbose: Verbose?
- Returns:
True if ready, else False.
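The first of the supported checks, presence of mandatory fields, can be sketched on a plain dict. This is an illustration, not the Karabo implementation: the helper missing_mandatory is hypothetical, and the tuple below contains only the fields marked as mandatory in the Args description of ObsCoreMeta, so the real check may cover a different set.

```python
from typing import Any, Dict, List

# Assumption: the fields marked "(mandatory)" in the ObsCoreMeta Args description.
MANDATORY_FIELDS = ("calib_level", "obs_collection", "obs_id", "obs_publisher_did")


def missing_mandatory(fields: Dict[str, Any]) -> List[str]:
    """Return the mandatory field names that are absent or set to None."""
    return [name for name in MANDATORY_FIELDS if fields.get(name) is None]


# Hypothetical, partially filled metadata.
candidate = {"calib_level": 2, "obs_collection": "SKAO/SKA-Mid", "obs_id": None}
missing = missing_mandatory(candidate)
print(missing)
```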
- dataproduct_subtype: str | None = None
- dataproduct_type: Literal['image', 'cube', 'spectrum', 'sed', 'timeseries', 'visibility', 'event', 'measurements'] | None = None
- em_max: float | None = None
- em_min: float | None = None
- em_res_power: float | None = None
- em_ucd: str | None = None
- em_xel: int | None = None
- facility_name: str | None = None
- classmethod from_image(img: Image, *, fits_axes: FitsHeaderAxes = FitsHeaderAxes(x=FitsHeaderAxis(axis=1, unit=Unit('deg')), y=FitsHeaderAxis(axis=2, unit=Unit('deg')), freq=FitsHeaderAxis(axis=3, unit=Unit('Hz')))) Self
Update fields from Image.
- This function may not adjust each field for your needs. In addition, there
is no possibility to fill some mandatory fields because there is just no information available. Thus, you have to take care of some fields by yourself.
- Note: This function assumes that NAXIS, CRPIX, CRVAL & CDELT are present
and correctly specified for each axis-number given in fits_axes in the .fits file of img. Otherwise, this function might fail or produce corrupt values.
- Args:
img: Image instance.
- fits_axes: FitsHeaderAxes instance to specify axis-number and corresponding unit
to extract and transform information from the .fits file of img correctly.
- Returns:
ObsCoreMeta instance.
- classmethod from_visibility(vis: Visibility, *, calibrated: bool | None = None, tel: Telescope | None = None, obs: Observation | None = None) Self
Suggests fields from Visibility.
This function may not adjust each field for your needs. In addition, there is no possibility to fill all mandatory fields because there is just no information available. Thus, you have to take care of some fields by yourself.
Supported formats: OSKAR .vis files.
Not currently supported: CASA .ms measurement sets.
- Args:
vis: Visibility instance.
calibrated: Calibrated visibilities?
tel: Telescope to determine smallest spatial resolution.
obs: Observation to determine smallest spatial resolution.
- Returns:
ObsCoreMeta instance.
- classmethod get_ivoid(*, authority: str, path: str | None, query: str | None, fragment: str | None) str
Gets the IVOA identifier for ObsCoreMeta.obs_creator_did.
- IVOID according to IVOA ‘REC-Identifiers-2.0’. Do NOT specify RFC 3986
delimiters in the input-args, they’re added automatically.
Please open an issue if this is no longer up to date.
- Args:
- authority: Organization (usually a data provider) that has been granted
the right by the IVOA to create IVOA-compliant identifiers for resources it registers.
- path: Resource key. It’s a resource that is unique within the namespace
of an authority identifier.
query: According to RFC 3986.
fragment: According to RFC 3986.
- Returns:
IVOID.
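Because the delimiters are added automatically, the caller passes only the bare components. The sketch below shows how such components combine under the generic URI layout of RFC 3986 with the ivo scheme. The function build_ivoid and the example values are hypothetical; the real method may validate or encode its inputs in ways not shown here.

```python
from typing import Optional


def build_ivoid(
    authority: str,
    path: Optional[str] = None,
    query: Optional[str] = None,
    fragment: Optional[str] = None,
) -> str:
    """Assemble ivo://<authority>/<path>?<query>#<fragment> from bare parts.

    Hypothetical helper for illustration; not part of Karabo.
    """
    ivoid = f"ivo://{authority}"
    if path is not None:
        ivoid += f"/{path}"
    if query is not None:
        ivoid += f"?{query}"
    if fragment is not None:
        ivoid += f"#{fragment}"
    return ivoid


example_ivoid = build_ivoid("example.org", path="images", query="obs-001")
print(example_ivoid)
```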
- get_pol_states() List[Literal['I', 'Q', 'U', 'V', 'RR', 'LL', 'RL', 'LR', 'XX', 'YY', 'XY', 'YX', 'POLI', 'POLA']] | None
Parses the polarization states to _PolStatesListType.
- Returns:
List of polarization states if field-value is not None.
- instrument_name: str | None = None
- o_ucd: str | None = None
- obs_collection: str | None = None
- obs_creator_did: str | None = None
- obs_id: str | None = None
- obs_publisher_did: str | None = None
- obs_title: str | None = None
- pol_states: str | None = None
- pol_xel: int | None = None
- s_dec: float | None = None
- s_fov: float | None = None
- s_pixel_scale: float | None = None
- s_ra: float | None = None
- s_region: str | None = None
- s_resolution: float | None = None
- s_xel1: int | None = None
- s_xel2: int | None = None
- classmethod scircle(point: tuple[float, float], radius: float, *, ndigits: int = 3, suffix: str = 'd') str
Converts point & radius to scircle str for s_region.
scircle: https://pgsphere.github.io/doc/funcs.html
E.g. <(0d,90d),60d>
- Args:
point: RA,DEC [deg] point.
radius: Radius [deg].
ndigits: Number of digits to round.
suffix: Suffix for each number (e.g. “d” for deg).
- Returns:
scircle str.
- set_fields(**kwargs: Any) None
Set fields from kwargs if they’re valid.
- Args:
kwargs: Field names and values to set.
- set_pol_states(pol_states: List[Literal['I', 'Q', 'U', 'V', 'RR', 'LL', 'RL', 'LR', 'XX', 'YY', 'XY', 'YX', 'POLI', 'POLA']]) None
Sets pol_states from a pythonic interface to a str according to ObsCore.
Overwrites if pol_states already exists.
- Args:
pol_states: Polarization states.
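ObsCore stores the polarization states as a single string in which the labels are separated by "/" and the list carries a leading and a trailing "/". A minimal sketch of the conversion in both directions (the function names are hypothetical and no validation against the allowed labels is done here):

```python
from typing import List, Optional


def join_pol_states(states: List[str]) -> str:
    """Join labels into the ObsCore string form, e.g. /I/Q/U/V/."""
    return "/" + "/".join(states) + "/"


def split_pol_states(pol_states: Optional[str]) -> Optional[List[str]]:
    """Split the ObsCore string form back into a list, or return None."""
    if pol_states is None:
        return None
    # Empty pieces come from the leading and trailing delimiter.
    return [label for label in pol_states.split("/") if label]


encoded = join_pol_states(["I", "Q", "U", "V"])
decoded = split_pol_states(encoded)
pol_xel = len(decoded)  # number of polarization samples
print(encoded, decoded, pol_xel)
```

The surrounding delimiters allow a query such as LIKE '%/Q/%' to match a label without also matching longer labels that contain it.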
- classmethod spoint(point: tuple[float, float], *, ndigits: int = 3, suffix: str = 'd') str
Converts point to spoint str for s_region.
spoint: https://pgsphere.github.io/doc/funcs.html
E.g. (10d,20d)
- Args:
point: RA,DEC [deg] point.
ndigits: Number of digits to round.
suffix: Suffix for each number (e.g. “d” for deg).
- Returns:
spoint str.
- classmethod spoly(poly: Sequence[tuple[float, float]], *, ndigits: int = 3, suffix: str = 'd') str
Converts poly to spoly str for s_region.
spoly: https://pgsphere.github.io/doc/funcs.html
E.g. {(0,0),(1,0),(1,1)}
- Args:
poly: Consecutive RA,DEC [deg] poly tuples.
ndigits: Number of digits to round.
suffix: Suffix for each number (e.g. “d” for deg).
- Returns:
spoly str.
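The three s_region helpers differ only in the brackets they put around formatted points. The stand-alone sketch below reproduces the string shapes shown in the examples above. It is an approximation under assumptions: the number formatting (round to ndigits, drop trailing zeros, no explicit plus sign) is a guess, so the real methods may emit slightly different text for the same input.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def _fmt(value: float, ndigits: int = 3, suffix: str = "d") -> str:
    """Format one number: fixed rounding, trailing zeros removed, suffix appended."""
    text = f"{value:.{ndigits}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return f"{text}{suffix}"


def spoint(point: Point, ndigits: int = 3, suffix: str = "d") -> str:
    ra, dec = point
    return f"({_fmt(ra, ndigits, suffix)},{_fmt(dec, ndigits, suffix)})"


def scircle(point: Point, radius: float, ndigits: int = 3, suffix: str = "d") -> str:
    return f"<{spoint(point, ndigits, suffix)},{_fmt(radius, ndigits, suffix)}>"


def spoly(poly: Sequence[Point], ndigits: int = 3, suffix: str = "d") -> str:
    return "{" + ",".join(spoint(p, ndigits, suffix) for p in poly) + "}"


print(spoint((10, 20)))
print(scircle((0, 90), 60))
print(spoly([(204.712, 47.405), (204.38, 48.311), (202.349, 49.116)]))
```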
- t_exptime: float | None = None
- t_max: float | None = None
- t_min: float | None = None
- t_resolution: float | None = None
- t_xel: int | None = None
- target_class: str | None = None
- target_name: str | None = None
- to_dict(fpath: Path | str | None = None, *, ignore_none: bool = True) Dict[str, Any]
Converts this dataclass into a dict.
- Args:
fpath: File-path to write dump.
ignore_none: Ignore non-mandatory None fields?
- Returns:
Dataclass as dict.
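The effect of ignore_none can be illustrated on a small stand-in dataclass. This sketch reflects one reading of the parameter description, namely that None values are dropped unless the field is mandatory. MiniMeta, its three fields and the choice of obs_id as the only mandatory field are hypothetical simplifications, and writing to fpath is left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class MiniMeta:
    """Hypothetical three-field stand-in for ObsCoreMeta."""

    obs_id: Optional[str] = None
    s_ra: Optional[float] = None
    s_dec: Optional[float] = None


def to_dict(
    meta: MiniMeta,
    ignore_none: bool = True,
    mandatory: Tuple[str, ...] = ("obs_id",),
) -> Dict[str, Any]:
    """Convert to a dict, optionally dropping non-mandatory None values."""
    data = asdict(meta)
    if ignore_none:
        data = {k: v for k, v in data.items() if v is not None or k in mandatory}
    return data


full = to_dict(MiniMeta(s_ra=202.5), ignore_none=False)
compact = to_dict(MiniMeta(s_ra=202.5))
json_text = json.dumps(compact)
print(full, compact, json_text)
```

Keeping mandatory keys even when they are None makes a missing mandatory value visible in the serialized output instead of silently omitting it.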
karabo.data.src module
- class RucioMeta(namespace: str, name: str, lifetime: int, dataset_name: str | None = None, meta: Dict[str, Any] | ObsCoreMeta | None = None)
Bases:
object
Metadata dataclass to handle SKA SRC Ingestion for Rucio.
- This dataclass may go through some changes in the future in case
the Rucio service is also changing.
See https://gitlab.com/ska-telescope/src/ska-src-ingestion/-/tree/main.
- Args:
namespace: The Rucio scope in which the new file should be located.
- name: The name of the file within Rucio - the scope:name together is the
Data Identifier (DID).
lifetime: The lifetime in seconds for the new file to be retained.
- dataset_name: The Rucio dataset name the file will be attached to.
The dataset scope will be the same as that specified in namespace.
- meta: An object containing science metadata fields, which will be set against
the ingested file. This should be either a dict of ObsCoreMeta or an instance of ObsCoreMeta.
- dataset_name: str | None = None
- classmethod get_ivoid(*, authority: str = 'test.skao', path: str = '/~', namespace: str, name: str, fragment: str | None = None) str
Gets the IVOA identifier for ObsCoreMeta.obs_creator_did.
- SRCNet Rucio IVOID according to IVOA ‘REC-Identifiers-2.0’. Do NOT specify
RFC 3986 delimiters in the input-args, they’re added automatically.
Please open an issue if this is no longer up to date.
- Args:
- authority: Organization (usually a data provider) that has been granted
the right by the IVOA to create IVOA-compliant identifiers for resources it registers.
- path: Resource key. It’s a resource that is unique within the namespace
of an authority identifier.
namespace: RucioMeta.namespace.
name: RucioMeta.name (filename in Rucio).
fragment: According to RFC 3986.
- Returns:
IVOID.
- classmethod get_meta_fname(fname: TFilePathType) TFilePathType
Gets the metadata-filename of fname.
- It’s according to the Rucio metadata specification (if up-to-date). The
specification states that metadata is expected to be provided by two files: fname: data_name and metadata: data_name.<metadata_suffix>, where the suffix is set to meta.
- Args:
fname: Filename to create metadata filename from.
- Returns:
Metadata filename (or filepath if fname was also a filepath).
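The naming rule described above, data file data_name and metadata file data_name.meta, amounts to appending a suffix while preserving the input type. A minimal sketch (meta_fname is a hypothetical stand-in for get_meta_fname, and the suffix default follows the description above):

```python
from pathlib import Path
from typing import TypeVar

TPath = TypeVar("TPath", str, Path)


def meta_fname(fname: TPath, suffix: str = "meta") -> TPath:
    """Append .<suffix> to fname and return the same type that was passed in."""
    name = f"{fname}.{suffix}"
    if isinstance(fname, Path):
        return Path(name)
    return name


as_str = meta_fname("data/image.fits")
as_path = meta_fname(Path("image.fits"))
print(as_str, as_path)
```

The suffix is appended to the complete file name rather than replacing the existing extension, so image.fits and image.fits.meta sit next to each other.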
- lifetime: int
- meta: Dict[str, Any] | ObsCoreMeta | None = None
- name: str
- namespace: str
- to_dict(fpath: Path | str | None = None, *, ignore_none: bool = True) Dict[str, Any]
Converts this dataclass into a dict.
- Args:
- fpath: File-path to write dump. Consider using get_meta_fname
to get an fpath according to the Rucio specification.
ignore_none: Ignore None fields?
- Returns:
Dataclass as dict.