karabo.data package

Submodules

karabo.data.external_data module

class ContainerContents(remote_url: str, regexr_pattern: str)

Bases: DownloadObject

get_all(verbose: bool = True) List[str]

Gets all objects and returns their corresponding cache paths as a list.

get_container_content() str

Gets the remote container-content as str.

get_file_paths() List[str]

Applies regexr_pattern to container-objects.
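A minimal sketch of how a pattern could select objects from a container listing. The helper `filter_container_objects` and the sample listing are hypothetical, and the use of `re.search` is an assumption; the matching mode Karabo applies is not stated here.

```python
import re
from typing import List


def filter_container_objects(object_names: List[str], regexr_pattern: str) -> List[str]:
    """Return the names selected by `regexr_pattern` (hypothetical helper)."""
    pattern = re.compile(regexr_pattern)
    # Assumption: a name is selected if the pattern matches anywhere in it.
    return [name for name in object_names if pattern.search(name)]


# Hypothetical container listing, for illustration only.
listing = ["Abell_133_I.fits", "Abell_133_Q.fits", "README.txt"]
selected = filter_container_objects(listing, r"_I\.fits$")
```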

is_available() bool

Checks if the container itself is available, not specific files. Is dependent on the regexr_pattern.

class DiffuseEmissionHaslam408DownloadObject

Bases: SingleFileDownloadObject

class DownloadObject(remote_base_url: str)

Bases: object

Download handler for remote files & dirs.

Important: There is a remote file-naming-convention, to be able to provide updates of cached dirs/files & to simplify maintainability. The convention for each object is <dirname><file|dir>_v<version>, where the version should be an integer, starting from 1. <dirname> should be the same as <file|dir>. The additional <dirname> is to have a single directory for each object, so that additional file/dir versions don’t disturb the overall remote structure.

The version of a downloaded object is determined by the current version of Karabo, meaning that they’re hard-coded. Because Karabo relies partially on remote-objects, we don’t guarantee their availability for deprecated Karabo versions.

URL_SEP: Final = '/'
static download(url: str, local_file_path: Path | str, verify: bool = True, verbose: bool = True) int

Downloads url to local_file_path through a GET-request.

Args:

url: Resource to download.

local_file_path: Local file-path.

verify: Validate the server’s certificate?

verbose: Verbose?

Returns:

Status-code (currently, always 200, otherwise RuntimeError).

get_object(remote_file_path: str, verify: bool = True, verbose: bool = True) str

Gets the local file-path of the requested object. Downloads the resource and caches it on disk if that has not already been done.

Args:

remote_file_path: Remote file-path, relative to its base url.

verify: Validate the server’s certificate if download is needed?

verbose: Verbose download if it’s needed?

Returns:

Local file-path of the remote-hosted object.
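The download-then-cache behaviour described above can be sketched as follows. This is not Karabo's implementation: `get_cached_object`, the injected `download` callable, the cache layout and the object name `demo/demo_v1.txt` are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def get_cached_object(
    remote_file_path: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> str:
    """Return the local path of `remote_file_path`, downloading it only once."""
    local_path = cache_dir / remote_file_path
    if not local_path.exists():
        # Create parent directories for nested remote paths before downloading.
        local_path.parent.mkdir(parents=True, exist_ok=True)
        download(remote_file_path, local_path)
    return str(local_path)


# Usage with a fake downloader, so no network access is needed.
downloads: List[str] = []


def fake_download(remote_file_path: str, local_path: Path) -> None:
    downloads.append(remote_file_path)
    local_path.write_text("payload")


with tempfile.TemporaryDirectory() as tmp:
    first = get_cached_object("demo/demo_v1.txt", Path(tmp), fake_download)
    second = get_cached_object("demo/demo_v1.txt", Path(tmp), fake_download)
    cached_text = Path(first).read_text()
```

The second call finds the file on disk and skips the downloader, which is the property a cache of this kind is meant to provide.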

static is_url_available(url: str) bool

Checks whether the url is available or not.

Returns:

True if available, else False

class ExampleHDF5Map

Bases: SingleFileDownloadObject

class GLEAMSurveyDownloadObject

Bases: SingleFileDownloadObject

class HISourcesSmallCatalogDownloadObject

Bases: SingleFileDownloadObject

class MALSSurveyV3DownloadObject

Bases: SingleFileDownloadObject

class MGCLSContainerDownloadObject(regexr_pattern: str)

Bases: ContainerContents

class MIGHTEESurveyDownloadObject

Bases: SingleFileDownloadObject

class SingleFileDownloadObject(remote_file_path: str, remote_base_url: str)

Bases: DownloadObject

Abstract single object download handler.

get(verify: bool = True, verbose: bool = True) str
is_available() bool

karabo.data.obscore module

ObsCore Data Model.

https://ivoa.net/documents/ObsCore/

Recommended IVOA documents: https://www.ivoa.net/documents/index.html

class FitsHeaderAxes(x: ~karabo.data.obscore.FitsHeaderAxis = <factory>, y: ~karabo.data.obscore.FitsHeaderAxis = <factory>, freq: ~karabo.data.obscore.FitsHeaderAxis = <factory>)

Bases: object

Fits file axes description.

Needed for file-parsing for axis-position and unit-transformation.

Args:

x: X/RA axis (default: axis=1, unit=deg) of image.

y: Y/DEC axis (default: axis=2, unit=deg) of image.

freq: Freq axis (default: axis=3, unit=Hz) of image.

freq: FitsHeaderAxis
x: FitsHeaderAxis
y: FitsHeaderAxis
class FitsHeaderAxis(axis: int, unit: UnitBase)

Bases: object

Fits header axis dataclass.

Descriptive dataclass for .fits axis and unit allocation infos.

Args:

axis: Axis number.

unit: Unit of value CRVAL and increment CDELT of axis in the .fits file.

axis: int
cdelt(header: Header) Quantity

CDELT{axis} increment at reference point in CTYPE{axis} unit.

Args:

header: Header to extract increment from.

Returns:

Value as astropy Quantity.

crpix(header: Header) float

CRPIX{axis} location at reference point along axis.

Args:

header: Header to extract location from.

Returns:

Location of axis.

crval(header: Header) Quantity

CRVAL{axis} value at reference point in CTYPE{axis} unit.

Args:

header: Header to extract value from.

Returns:

Value as astropy Quantity.

ctype(header: Header) str

CTYPE{axis} unit type of axis.

This is just a str, not an astropy unit-name.

Args:

header: Header to extract unit from.

Returns:

Unit as str.

naxis(header: Header) int

NAXIS{axis} length of axis.

Args:

header: Header to extract length from.

Returns:

Length of axis.

unit: UnitBase
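To illustrate how an axis number maps to FITS header keywords, here is a minimal sketch. `AxisSketch` is a hypothetical stand-in for FitsHeaderAxis: it reads from a plain mapping instead of an astropy Header, keeps the unit as a str, and returns bare numbers rather than astropy Quantity objects.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class AxisSketch:
    """Hypothetical stand-in for FitsHeaderAxis."""

    axis: int
    unit: str

    def _key(self, name: str) -> str:
        # FITS keywords carry the axis number as a suffix, e.g. CRVAL3.
        return f"{name}{self.axis}"

    def naxis(self, header: Mapping[str, Any]) -> int:
        return int(header[self._key("NAXIS")])

    def crval(self, header: Mapping[str, Any]) -> float:
        return float(header[self._key("CRVAL")])

    def cdelt(self, header: Mapping[str, Any]) -> float:
        return float(header[self._key("CDELT")])


# Made-up header values for a frequency axis stored as axis 3.
header = {"NAXIS3": 4, "CRVAL3": 1.0e8, "CDELT3": 1.0e6, "CTYPE3": "FREQ"}
freq = AxisSketch(axis=3, unit="Hz")
values = (freq.naxis(header), freq.crval(header), freq.cdelt(header))
```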
class ObsCoreMeta(dataproduct_type: Literal['image', 'cube', 'spectrum', 'sed', 'timeseries', 'visibility', 'event', 'measurements'] | None = None, dataproduct_subtype: str | None = None, calib_level: Literal[0, 1, 2, 3, 4] | None = None, obs_collection: str | None = None, obs_id: str | None = None, obs_publisher_did: str | None = None, obs_title: str | None = None, obs_creator_did: str | None = None, target_class: str | None = None, access_url: str | None = None, access_format: str | None = None, access_estsize: int | None = None, target_name: str | None = None, s_ra: float | None = None, s_dec: float | None = None, s_fov: float | None = None, s_region: str | None = None, s_resolution: float | None = None, s_xel1: int | None = None, s_xel2: int | None = None, s_pixel_scale: float | None = None, t_min: float | None = None, t_max: float | None = None, t_exptime: float | None = None, t_resolution: float | None = None, t_xel: int | None = None, em_min: float | None = None, em_max: float | None = None, em_res_power: float | None = None, em_xel: int | None = None, em_ucd: str | None = None, o_ucd: str | None = None, pol_states: str | None = None, pol_xel: int | None = None, facility_name: str | None = None, instrument_name: str | None = None)

Bases: object

IVOA ObsCore v1.1 metadata (TAP column names).

This doesn’t describe a full ObsCoreDM, but the mandatory and the non-mandatory fields defined in the SRCNet Rucio database. You have to create the actual JSON to send to a specific ObsTAP service yourself.

The args-docstring provides just a rough idea of the corresponding values. A more detailed description is provided by the ObsCore-v1.1 documentation.

Args:
dataproduct_type: Logical data product type: image, cube, spectrum, sed, timeseries, visibility, event or measurements.

dataproduct_subtype: Data product specific type defined by the ObsTAP provider. This value is not useful for global discovery, only within an archive.

calib_level: Calibration level {0, 1, 2, 3, 4} (mandatory).
  • 0: Raw instrumental data.

  • 1: Instrumental data in a standard format (FITS, VOTable, etc.)

  • 2: Calibrated, science ready measurements without instrument signature.

  • 3: Enhanced data products like mosaics, drizzled images or heavily processed survey fields. May represent a combination of data from multiple primary obs.

  • 4: Analysis data products generated after scientific data manipulation.

obs_collection: Name of the data collection (mandatory). Either registered shortname, full registered IVOA identifier or a data provider defined shortname. Often used pattern: <facility-name>/<instrument-name>.

obs_id: Observation ID (mandatory).

All data-products from a single observation should share the same obs_id. This is just a str-ID with no prescribed format. Must be unique to a provider.

obs_publisher_did: Dataset identifier given by the publisher (mandatory).

IVOA dataset identifier. Must be a unique value within the namespace controlled by the dataset publisher (data center). ObsCoreMeta.get_ivoid may help creating this value.

obs_title: Brief description of dataset in free format.

obs_creator_did: IVOA dataset identifier given by the creator.

target_class: Class of the target/object as in SSA.

Either SIMBAD-DB (see https://simbad.cds.unistra.fr/guide/otypes.htx Object type code), OR NED-DB types (see https://ned.ipac.caltech.edu/help/ui/nearposn-list_objecttypes).

access_url: URL used to access (download) dataset.

access_format: File content format (MIME type) (fits, jpeg, zip, etc.).

access_estsize: [kbyte] Estimated size of dataset from access_url.

target_name: Astronomical object observed, if any. This is typically the name of an astronomical object, but could be the name of a survey field.

s_ra: [deg] Central right ascension, ICRS.

s_dec: [deg] Central declination, ICRS.

s_fov: [deg] Region covered by the data product. For a circular region, this is the diameter. For most data products, the value should be large enough to include the entire area of the observation. For detailed spatial coverage, the s_region attribute can be used.

s_region: Sky region covered by the data product (expressed in ICRS frame).

It’s an ‘spoly’ (spherical polygon) type, which is described in https://pgsphere.github.io/doc/funcs.html#funcs.spoly. Use spoly, scircle or spoint to create the formatted str. E.g. for spoly: {(204.712d,+47.405d),(204.380d,+48.311d),(202.349d,+49.116d), (200.344d,+48.458d),(199.878d,+47.521d),(200.766d,+46.230d), (202.537d,+45.844d),(204.237d,+46.55d)}.

s_resolution: [arcsec] Smallest resolvable spatial resolution of data as FWHM.

If spatial frequency sampling is complex (e.g. interferometry), a typical value for spatial resolution estimate should be given.

s_xel1: Number of elements along the first spatial axis.

s_xel2: Number of elements along the second spatial axis.

s_pixel_scale: Sampling period in world coordinate units along the spatial axis.

It’s the distance in WCS units between two pixel centers.

t_min: [d] Observation start time in Modified Julian Day (MJD).

t_max: [d] Observation stop time in Modified Julian Day (MJD).

t_exptime: [s] Total exposure time. For simple exposures: t_max - t_min.

For data where the exposure is not constant over the entire data product, the median exposure time per pixel is a good way to characterize the typical value.

t_resolution: [s] Minimal interpretable interval between two points along time.

This can be an average or representative value. For products with no sampling along the time axis, it could be set to exposure time or null.

t_xel: Number of elements along the time axis.

em_min: [m] Minimal spectral value observed, expressed as a vacuum wavelength.

em_max: [m] Maximum spectral value observed, expressed as a vacuum wavelength.

em_res_power: Spectral resolving power λ / δλ.

em_xel: Number of elements along the spectral axis.

em_ucd: Nature of the spectral axis. This is an em (electromagnetic spectrum) UCD (UCD-string see o_ucd), e.g. em.freq, em.wl or em.energy. Note: For ObsTAP implementation, the spectral axis coordinates are constrained as a wavelength quantity expressed in meters.

o_ucd: UCD (semantic annotation(s)) of observable (e.g. phot.flux.density).

A UCD is a string containing ; separated words, which can be separated into atoms. The UCD-list is evolving over time and far too extensive to describe here. Please have a look at the recommended version of IVOA UCDlist document at https://www.ivoa.net/documents/index.html.

pol_states: List of polarization states or NULL if not applicable.

Allowed: I, Q, U, V, RR, LL, RL, LR, XX, YY, XY, YX, POLI, POLA.

pol_xel: Number of polarization samples in pol_states.

facility_name: Name of the facility used for this observation.

instrument_name: Name of the instrument used for this observation.
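The em_min/em_max args above expect vacuum wavelengths in meters and t_min/t_max expect MJD, while radio data is usually described by frequencies and calendar timestamps. The following sketch shows one way to convert; the helper names are hypothetical and the MJD conversion ignores leap seconds and time-scale differences.

```python
from datetime import datetime, timezone
from typing import Tuple

SPEED_OF_LIGHT_M_S = 299_792_458.0  # exact by SI definition
UNIX_EPOCH_MJD = 40587.0  # MJD of 1970-01-01T00:00:00 UTC
SECONDS_PER_DAY = 86400.0


def freq_to_wavelength_m(freq_hz: float) -> float:
    """Vacuum wavelength [m] of a frequency [Hz]."""
    return SPEED_OF_LIGHT_M_S / freq_hz


def em_range_from_freqs(freq_min_hz: float, freq_max_hz: float) -> Tuple[float, float]:
    """Return (em_min, em_max) in meters for a frequency range in Hz."""
    # The highest frequency has the shortest wavelength, so the bounds swap.
    return freq_to_wavelength_m(freq_max_hz), freq_to_wavelength_m(freq_min_hz)


def datetime_to_mjd(moment: datetime) -> float:
    """MJD of a timezone-aware datetime (leap seconds ignored)."""
    return UNIX_EPOCH_MJD + moment.timestamp() / SECONDS_PER_DAY


em_min, em_max = em_range_from_freqs(100e6, 120e6)
t_min = datetime_to_mjd(datetime(2000, 1, 1, 12, tzinfo=timezone.utc))
```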

access_estsize: int | None = None
access_format: str | None = None
access_url: str | None = None
calib_level: Literal[0, 1, 2, 3, 4] | None = None
check_ObsCoreMeta(*, verbose: bool = False) bool

Checks whether ObsCoreMeta is ready for serialization.

This doesn’t perform a full check if all field-values are valid. Currently supported checks:

  • Presence of mandatory fields

  • Polarization fields

  • Axes fields

Args:

verbose: Verbose?

Returns:

True if ready, else False.
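As an illustration of the first check only, the sketch below tests a plain dict for the four fields marked "(mandatory)" in the args description. The helper names are hypothetical, and the polarization and axes checks of check_ObsCoreMeta are not reproduced.

```python
from typing import Any, Dict, List

# Fields marked "(mandatory)" in the ObsCoreMeta args description.
MANDATORY_FIELDS = ("calib_level", "obs_collection", "obs_id", "obs_publisher_did")


def missing_mandatory_fields(meta: Dict[str, Any]) -> List[str]:
    """Names of mandatory fields that are absent or None."""
    return [name for name in MANDATORY_FIELDS if meta.get(name) is None]


def is_ready(meta: Dict[str, Any]) -> bool:
    """True if every mandatory field has a value."""
    return not missing_mandatory_fields(meta)


# Made-up metadata with one mandatory field still missing.
partial = {"calib_level": 2, "obs_collection": "SKA/LOW", "obs_id": "demo-001"}
missing = missing_mandatory_fields(partial)
```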

dataproduct_subtype: str | None = None
dataproduct_type: Literal['image', 'cube', 'spectrum', 'sed', 'timeseries', 'visibility', 'event', 'measurements'] | None = None
em_max: float | None = None
em_min: float | None = None
em_res_power: float | None = None
em_ucd: str | None = None
em_xel: int | None = None
facility_name: str | None = None
classmethod from_image(img: Image, *, fits_axes: FitsHeaderAxes = FitsHeaderAxes(x=FitsHeaderAxis(axis=1, unit=Unit('deg')), y=FitsHeaderAxis(axis=2, unit=Unit('deg')), freq=FitsHeaderAxis(axis=3, unit=Unit('Hz')))) Self

Update fields from Image.

This function may not adjust each field for your needs. In addition, some mandatory fields cannot be filled because the information is simply not available. Thus, you have to take care of some fields yourself.

Note: This function assumes that NAXIS, CRPIX, CRVAL & CDELT are present and correctly specified in the .fits file of img for each axis-number given in fits_axes. Otherwise, this function might fail or produce corrupt values.

Args:

img: Image instance.

fits_axes: FitsHeaderAxes instance to specify the axis-number and corresponding unit to extract and transform information from the .fits file of img correctly.

Returns:

ObsCoreMeta instance.

classmethod from_visibility(vis: Visibility, *, calibrated: bool | None = None, tel: Telescope | None = None, obs: Observation | None = None) Self

Suggests fields from Visibility.

This function may not adjust each field for your needs. In addition, not all mandatory fields can be filled because the information is simply not available. Thus, you have to take care of some fields yourself.

Supported formats: OSKAR .vis files.

Not currently supported: CASA .ms measurement sets.

Args:

vis: Visibility instance.

calibrated: Calibrated visibilities?

tel: Telescope to determine smallest spatial resolution.

obs: Observation to determine smallest spatial resolution.

Returns:

ObsCoreMeta instance.

classmethod get_ivoid(*, authority: str, path: str | None, query: str | None, fragment: str | None) str

Gets the IVOA identifier for ObsCoreMeta.obs_creator_did.

IVOID according to IVOA ‘REC-Identifiers-2.0’. Do NOT specify RFC 3986 delimiters in the input-args, they’re added automatically.

Please open an issue if this is no longer up to date.

Args:
authority: Organization (usually a data provider) that has been granted the right by the IVOA to create IVOA-compliant identifiers for resources it registers.

path: Resource key. It’s a resource that is unique within the namespace of an authority identifier.

query: According to RFC 3986.

fragment: According to RFC 3986.

Returns:

IVOID.
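A rough sketch of how such an identifier can be assembled from delimiter-free parts, following the generic RFC 3986 layout scheme://authority/path?query#fragment with the ivo scheme. `build_ivoid` is a hypothetical helper, not the Karabo method, and the authority example.org is made up.

```python
from typing import Optional


def build_ivoid(
    authority: str,
    path: Optional[str] = None,
    query: Optional[str] = None,
    fragment: Optional[str] = None,
) -> str:
    """Join the parts with the RFC 3986 delimiters (hypothetical helper)."""
    ivoid = f"ivo://{authority}"
    # Each delimiter is added here, so the inputs must not contain it.
    if path is not None:
        ivoid += f"/{path}"
    if query is not None:
        ivoid += f"?{query}"
    if fragment is not None:
        ivoid += f"#{fragment}"
    return ivoid


ivoid = build_ivoid(authority="example.org", path="karabo-demo", query="obs-001")
```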

get_pol_states() List[Literal['I', 'Q', 'U', 'V', 'RR', 'LL', 'RL', 'LR', 'XX', 'YY', 'XY', 'YX', 'POLI', 'POLA']] | None

Parses the polarization states to _PolStatesListType.

Returns:

List of polarization states if field-value is not None.

instrument_name: str | None = None
o_ucd: str | None = None
obs_collection: str | None = None
obs_creator_did: str | None = None
obs_id: str | None = None
obs_publisher_did: str | None = None
obs_title: str | None = None
pol_states: str | None = None
pol_xel: int | None = None
s_dec: float | None = None
s_fov: float | None = None
s_pixel_scale: float | None = None
s_ra: float | None = None
s_region: str | None = None
s_resolution: float | None = None
s_xel1: int | None = None
s_xel2: int | None = None
classmethod scircle(point: tuple[float, float], radius: float, *, ndigits: int = 3, suffix: str = 'd') str

Converts point & radius to scircle str for s_region.

scircle: https://pgsphere.github.io/doc/funcs.html

E.g. <(0d,90d),60d>

Args:

point: RA,DEC [deg] point.

radius: Radius [deg].

ndigits: Number of digits to round.

suffix: Suffix for each number (e.g. “d” for deg).

Returns:

scircle str.
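The format can be sketched as below. `scircle_sketch` is a hypothetical re-implementation, and the number formatting (fixed rounding, trailing zeros stripped) is an assumption chosen so that the example `<(0d,90d),60d>` above is reproduced.

```python
from typing import Tuple


def _fmt(value: float, ndigits: int, suffix: str) -> str:
    """Round to `ndigits` decimals, drop trailing zeros, append `suffix`."""
    text = f"{value:.{ndigits}f}"
    if "." in text:
        # Only strip zeros behind the decimal point, never integer digits.
        text = text.rstrip("0").rstrip(".")
    return f"{text}{suffix}"


def scircle_sketch(
    point: Tuple[float, float], radius: float, *, ndigits: int = 3, suffix: str = "d"
) -> str:
    """Format center and radius as a pgSphere scircle string."""
    ra, dec = point
    center = f"({_fmt(ra, ndigits, suffix)},{_fmt(dec, ndigits, suffix)})"
    return f"<{center},{_fmt(radius, ndigits, suffix)}>"


circle = scircle_sketch((0, 90), 60)
```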

set_fields(**kwargs: Any) None

Set fields from kwargs if they’re valid.

Args:

kwargs: Field names and values to set.

set_pol_states(pol_states: List[Literal['I', 'Q', 'U', 'V', 'RR', 'LL', 'RL', 'LR', 'XX', 'YY', 'XY', 'YX', 'POLI', 'POLA']]) None

Sets pol_states from a pythonic interface to a str according to ObsCore.

Overwrites if pol_states already exists.

Args:

pol_states: Polarization states.
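ObsCore stores polarization states as one string with a leading and trailing slash, e.g. /I/Q/U/V/. The sketch below converts between that string and a Python list; the helper names are hypothetical, and the input order is simply preserved, which may differ from what set_pol_states does.

```python
from typing import List, Optional

ALLOWED_POL_STATES = (
    "I", "Q", "U", "V", "RR", "LL", "RL", "LR",
    "XX", "YY", "XY", "YX", "POLI", "POLA",
)


def encode_pol_states(states: List[str]) -> str:
    """Join states into an ObsCore-style string such as /I/Q/."""
    unknown = [state for state in states if state not in ALLOWED_POL_STATES]
    if unknown:
        raise ValueError(f"Unsupported polarization states: {unknown}")
    return "/" + "/".join(states) + "/"


def decode_pol_states(pol_states: Optional[str]) -> Optional[List[str]]:
    """Split an ObsCore-style string into a list, or return None."""
    if pol_states is None:
        return None
    # Empty pieces come from the leading and trailing slash.
    return [state for state in pol_states.split("/") if state]


encoded = encode_pol_states(["I", "Q", "U", "V"])
decoded = decode_pol_states(encoded)
```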

classmethod spoint(point: tuple[float, float], *, ndigits: int = 3, suffix: str = 'd') str

Converts point to spoint str for s_region.

spoint: https://pgsphere.github.io/doc/funcs.html

E.g. (10d,20d)

Args:

point: RA,DEC [deg] point.

ndigits: Number of digits to round.

suffix: Suffix for each number (e.g. “d” for deg).

Returns:

spoint str.

classmethod spoly(poly: Sequence[tuple[float, float]], *, ndigits: int = 3, suffix: str = 'd') str

Converts poly to spoly str for s_region.

spoly: https://pgsphere.github.io/doc/funcs.html

E.g. {(0,0),(1,0),(1,1)}

Args:

poly: Consecutive RA,DEC [deg] poly tuples.

ndigits: Number of digits to round.

suffix: Suffix for each number (e.g. “d” for deg).

Returns:

spoly str.
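A sketch of the point and polygon formats, under the same assumptions as for scircle: the helpers are hypothetical and the number formatting is a guess that strips trailing zeros. No explicit plus sign is emitted for positive declinations, unlike the longer spoly example in the s_region description.

```python
from typing import Sequence, Tuple


def _fmt(value: float, ndigits: int, suffix: str) -> str:
    """Round to `ndigits` decimals, drop trailing zeros, append `suffix`."""
    text = f"{value:.{ndigits}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return f"{text}{suffix}"


def spoint_sketch(
    point: Tuple[float, float], *, ndigits: int = 3, suffix: str = "d"
) -> str:
    """Format one RA,DEC pair as a pgSphere spoint string."""
    ra, dec = point
    return f"({_fmt(ra, ndigits, suffix)},{_fmt(dec, ndigits, suffix)})"


def spoly_sketch(
    poly: Sequence[Tuple[float, float]], *, ndigits: int = 3, suffix: str = "d"
) -> str:
    """Format consecutive RA,DEC pairs as a pgSphere spoly string."""
    if len(poly) < 3:
        raise ValueError("A polygon needs at least three points.")
    points = ",".join(spoint_sketch(p, ndigits=ndigits, suffix=suffix) for p in poly)
    return "{" + points + "}"


triangle = spoly_sketch([(0, 0), (1, 0), (1, 1)])
```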

t_exptime: float | None = None
t_max: float | None = None
t_min: float | None = None
t_resolution: float | None = None
t_xel: int | None = None
target_class: str | None = None
target_name: str | None = None
to_dict(fpath: Path | str | None = None, *, ignore_none: bool = True) Dict[str, Any]

Converts this dataclass into a dict.

Args:

fpath: File-path to write dump.

ignore_none: Ignore non-mandatory None fields?

Returns:

Dataclass as dict.
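The effect of ignore_none can be sketched with a small stand-in dataclass. `MetaSketch` and `to_dict_sketch` are hypothetical, and keeping mandatory fields even when they are None is an assumption read from the wording "non-mandatory None fields".

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional

# Assumption: mandatory fields stay in the dict even if they are None.
MANDATORY = {"obs_id"}


@dataclass
class MetaSketch:
    """Tiny stand-in for ObsCoreMeta with three of its fields."""

    obs_id: Optional[str] = None
    obs_title: Optional[str] = None
    s_ra: Optional[float] = None


def to_dict_sketch(meta: MetaSketch, *, ignore_none: bool = True) -> Dict[str, Any]:
    """Convert to a dict, optionally dropping non-mandatory None fields."""
    data = asdict(meta)
    if ignore_none:
        data = {
            key: value
            for key, value in data.items()
            if value is not None or key in MANDATORY
        }
    return data


compact = to_dict_sketch(MetaSketch(s_ra=1.5))
full = to_dict_sketch(MetaSketch(s_ra=1.5), ignore_none=False)
```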

karabo.data.src module

class RucioMeta(namespace: str, name: str, lifetime: int, dataset_name: str | None = None, meta: Dict[str, Any] | ObsCoreMeta | None = None)

Bases: object

Metadata dataclass to handle SKA SRC Ingestion for Rucio.

This dataclass may change in the future if the Rucio service changes.

See https://gitlab.com/ska-telescope/src/ska-src-ingestion/-/tree/main.

Args:

namespace: The Rucio scope in which the new file should be located.

name: The name of the file within Rucio. The scope:name together is the Data Identifier (DID).

lifetime: The lifetime in seconds for the new file to be retained.

dataset_name: The Rucio dataset name the file will be attached to.

The dataset scope will be the same as that specified in namespace.

meta: An object containing science metadata fields, which will be set against the ingested file. This should be either a dict of ObsCoreMeta or an instance of ObsCoreMeta.

dataset_name: str | None = None
classmethod get_ivoid(*, authority: str = 'test.skao', path: str = '/~', namespace: str, name: str, fragment: str | None = None) str

Gets the IVOA identifier for ObsCoreMeta.obs_creator_did.

SRCNet Rucio IVOID according to IVOA ‘REC-Identifiers-2.0’. Do NOT specify RFC 3986 delimiters in the input-args, they’re added automatically.

Please open an issue if this is no longer up to date.

Args:
authority: Organization (usually a data provider) that has been granted the right by the IVOA to create IVOA-compliant identifiers for resources it registers.

path: Resource key. It’s a resource that is unique within the namespace of an authority identifier.

namespace: RucioMeta.namespace.

name: RucioMeta.name (filename in Rucio).

fragment: According to RFC 3986.

Returns:

IVOID.

classmethod get_meta_fname(fname: TFilePathType) TFilePathType

Gets the metadata-filename of fname.

Follows the Rucio metadata specification (if up-to-date). The specification expects two files: the data file data_name and the metadata file data_name.<metadata_suffix>, where the suffix is set to meta.

Args:

fname: Filename to create metadata filename from.

Returns:

Metadata filename (or filepath if fname was also a filepath).
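The naming rule described above amounts to appending .meta to the full file name. A minimal sketch, assuming the input type (str or Path) is preserved in the return value; `meta_fname_sketch` and the `TFilePath` type variable are hypothetical names.

```python
from pathlib import Path
from typing import TypeVar

TFilePath = TypeVar("TFilePath", str, Path)


def meta_fname_sketch(fname: TFilePath, suffix: str = "meta") -> TFilePath:
    """Append `.<suffix>` to the file name, keeping the input type."""
    if isinstance(fname, Path):
        # with_name keeps the parent directory and replaces the last part.
        return fname.with_name(f"{fname.name}.{suffix}")
    return f"{fname}.{suffix}"


as_str = meta_fname_sketch("data/image.fits")
as_path = meta_fname_sketch(Path("data/image.fits"))
```

Note that the existing extension is kept, so image.fits becomes image.fits.meta rather than image.meta.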

lifetime: int
meta: Dict[str, Any] | ObsCoreMeta | None = None
name: str
namespace: str
to_dict(fpath: Path | str | None = None, *, ignore_none: bool = True) Dict[str, Any]

Converts this dataclass into a dict.

Args:
fpath: File-path to write dump. Consider using get_meta_fname to get an fpath according to the Rucio specification.

ignore_none: Ignore None fields?

Returns:

Dataclass as dict.

Module contents