Version 4 (modified by Malte Marquarding, 14 years ago)

--

Scantable Redesign Proposal

Summary

The current Scantable schema is biased towards rpfits-style data (or the structure of atnf/PKSIO/PKSrecord.h). With the PKSreader and the current Scantable format, a lot of metadata from the MeasurementSets gets lost. This proposal tries to rectify this without a major impact on the current use of the Scantable.

The proposal in summary:

  • Keep Scantable as the asap internal format
  • extend the schema to include all relevant (meta)data in the MeasurementSet (MS)
  • implement a direct MS->Scantable Filler, avoiding the (meta)data loss incurred by going through the PKSreader
  • use MS as the persistent asap storage container
    • implement a direct Scantable->MS Writer, avoiding the (meta)data loss incurred by going through the PKSwriter

optional performance/usability changes:

  • implement a direct ASDM->Scantable Filler (performance)
  • implement an upgrade task from the old Scantable schema to the new one (not required for standard use, as the Scantable is not assumed to be persistent)

Scantable vs MS

Even though both containers are based on casa::Table they use different philosophies in terms of access. The main differences are:

Frequency handling

  • MS has the SPECTRAL_WINDOW_ID for identifying similar frequency set-ups
  • Scantable has a combination of IFNO and FREQUENCY_ID to group similar frequencies.

The concept of an IF for grouping doesn't really exist in the MS (although there are entries in the SPECTRAL_WINDOW table which can be used for this). An IF, for example, can have multiple FREQUENCY_IDs, e.g. in the case of Doppler-tracked data. Frequency alignment then unifies these.
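The Scantable grouping described above can be illustrated with a minimal sketch. Only the column names IFNO and FREQUENCY_ID come from the text; the row layout, the values and the helper `frequency_ids_per_if` are hypothetical and are not part of asap.

```python
from collections import defaultdict

# Hypothetical main-table rows. Only IFNO and FREQUENCY_ID are real column
# names; the values are made up for illustration.
rows = [
    {"IFNO": 0, "FREQUENCY_ID": 0},
    {"IFNO": 0, "FREQUENCY_ID": 1},  # same IF, different frequency set-up
    {"IFNO": 0, "FREQUENCY_ID": 1},
    {"IFNO": 1, "FREQUENCY_ID": 2},
]


def frequency_ids_per_if(rows):
    """Collect the sorted list of FREQUENCY_IDs seen for each IFNO."""
    groups = defaultdict(set)
    for row in rows:
        groups[row["IFNO"]].add(row["FREQUENCY_ID"])
    return {ifno: sorted(ids) for ifno, ids in groups.items()}


freq_ids = frequency_ids_per_if(rows)
```

In this made-up data, IF 0 carries two FREQUENCY_IDs, which is the situation frequency alignment would have to unify.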

Handling of spectral channels

  • MS has two columns DATA and FLOAT_DATA containing n_pol x n_channel matrices of complex and float values respectively. In case of single-dish data the FLOAT_DATA column contains the spectra.
  • Scantable has only one column, SPECTRA, and one row per polarisation containing a float vector of length n_channel. Cross-correlations are stored as real and imaginary parts in separate rows. For example, for polarimetry data we have four rows containing XX, YY, Real(XY), Imag(XY).
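The difference between the two layouts can be sketched in plain Python. This is an illustration only: `ms_cell_to_scantable_rows` is a hypothetical helper, not an asap or casacore function, and it handles only the three products needed to produce the four rows named above.

```python
def ms_cell_to_scantable_rows(data, pol_labels):
    """Split an n_pol x n_channel matrix into one float vector per row.

    Hypothetical helper: XX and YY keep their real part, while any other
    product is split into separate Real(...) and Imag(...) rows.
    """
    rows = []
    for label, spectrum in zip(pol_labels, data):
        values = [complex(v) for v in spectrum]
        if label in ("XX", "YY"):
            rows.append((label, [v.real for v in values]))
        else:
            rows.append((f"Real({label})", [v.real for v in values]))
            rows.append((f"Imag({label})", [v.imag for v in values]))
    return rows


# Made-up 3 x 2 cell: XX, YY and XY for two channels.
cell = [
    [1.0, 2.0],
    [3.0, 4.0],
    [0.5 + 0.1j, 0.6 - 0.2j],
]
scantable_rows = ms_cell_to_scantable_rows(cell, ["XX", "YY", "XY"])
```

One matrix-valued cell becomes four rows, each holding a plain float vector, which is the shape the asap access methods assume.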

asap relies on these structures in almost all access methods. As such, changing the internal format to MS would mean almost a complete reimplementation of asap.

Referencing

  • Scantable relates to all sub-tables via an ID in the main table
  • MS relates to sub-tables via a time stamp.
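The two referencing styles listed above lead to different lookup code. The following is a minimal sketch under stated assumptions: the column name `SUBTABLE_ID`, the sub-table contents and both helper functions are hypothetical, and the time-based lookup assumes that a sub-table row applies from its start time until the next row begins.

```python
from bisect import bisect_right

# ID-based referencing (Scantable style): the main-table row stores a key.
# SUBTABLE_ID is a hypothetical column name used only for this sketch.
id_subtable = {0: "setup A", 1: "setup B"}


def lookup_by_id(main_row, subtable):
    """Return the sub-table entry whose key is stored in the main row."""
    return subtable[main_row["SUBTABLE_ID"]]


# Time-based referencing (MS style, as described above): sub-table rows are
# sorted by start time and the main row carries only its own TIME.
time_subtable = [(0.0, "setup A"), (10.0, "setup B"), (20.0, "setup C")]


def lookup_by_time(main_row, subtable):
    """Return the latest sub-table entry starting at or before TIME."""
    starts = [start for start, _ in subtable]
    index = bisect_right(starts, main_row["TIME"]) - 1
    if index < 0:
        raise LookupError("no sub-table row covers this time stamp")
    return subtable[index][1]


by_id = lookup_by_id({"SUBTABLE_ID": 1}, id_subtable)
by_time = lookup_by_time({"TIME": 12.5}, time_subtable)
```

The ID lookup is a direct key access, whereas the time lookup needs a search and a convention for which interval a time stamp falls into.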

Relevant Documentation

  • attach spreadsheet for impact on all asap functions/methods/tasks
  • attach image of Scantable schema