===== vex2difx =====

vex2difx is a program that takes a vex file (such as one produced by sched, with various tables based on observe-time data appended) and a configuration file (described below) and generates one or more .input files for use with difx. Each .input file is accompanied by a .calc file, which is used by calcif2 to generate the .delay and .uvw files needed at correlation time. vex2difx, along with calcif2, supersedes the functionality of vex2config and vex2model.

===== The vex2difx philosophy =====

Users and future developers of vex2difx should be aware of the approach used in designing vex2difx, which can be summarized as follows:

  - The output files should never need to be hand edited
  - Simple experiments should not require complicated configuration
  - All features implemented by mpifxcorr should be accessible
  - All experiments expressible by vex should be supported
  - The configuration file should be human and machine friendly
  - Command line arguments should not influence the processing of the vex file

Note that not all of these ideals have been completely reached as of now. It is not the intention of the developer to guess all possible future needs of this program. Most new features will be easy to implement, so send requests to the difx-users mailing list.

===== The vex file =====

The VLBI scheduling programs [[http://www.aoc.nrao.edu/software/sched/index.html|sched]] and sked both produce vex files that are used to control antennas for observations. Certain information that is not available prior to an observation needs to be provided to vex2difx in some way. One way is to append this data to the vex file. The alternative is to provide it in the .v2d file (as shown further down).
This information includes: - The Earth orientation parameters ($EOP block in the vex file, or EOP blocks in the .v2d file) - The antenna clock offsets ($CLOCK block in the vex file, or clock values in the ANTENNA blocks of the .v2d file) - The volume serial numbers for the recording media ($TAPELOG_OBS block, or file lists in the ANTENNA blocks of the .v2d file) Population of these three tables is necessarily a correlator/array specific operation and is the responsibility of the vex2difx user to arrange. Note, only formal vex files are supported as input to vex2difx. Similar looking ovex files used at some/all Mark4 correlators are not acceptable, however, with a small amount of work an ovex file can be hand converted to a valid vex file. It would not be hard to write a conversion script to do this automatically. ===== The configuration file ===== The configuration file consists of a number of global parameters that affect the way that jobs are created and several sections that can customize correlation on a per-source, per mode, or per scan basis. All parameters (those that are global and those that reside inside sections) are specified by a parameter name, the equal sign, and one value, or a comma-separated list of values, that cannot contain whitespace. Whitespace is not required except to keep parameter names, values, and section names separate. All parameter names and values are case sensitive except for source names and antenna names. The # is a comment character; any text after this on a line is ignored. ==== Parameter Types ==== * ''bool'' → A boolean value that can be True or False. Any value starting with 0, f, F, or - will be considered False and otherwise True. * ''float'' → A floating point number. Can be of the forms: 1.23, 1.2e-4, -12.6, 4 * ''int'' → An integer. * ''string'' → Any sequence of printable(non-whitespace) characters. Certain fields require strings of a maximum length or certain form. 
  * ''date'' → A number or string representing Universal Time. Several formats are supported:
    * Modified Julian Day : ''54345.341944''
    * Vex time format : ''2009y245d08h12m24s''
    * VLBA-like format : ''2009SEP02-08:12:24'' (Note the ''-'' between date and time!)
    * ISO 8601 format : ''2009-09-02T08:12:24''

==== Specifying data formats ====

The ''format'' parameter of an ANTENNA or DATASTREAM section in the ''.v2d'' file, or the ''track_frame_format'' within a vex TRACKS section, gives ''vex2difx'' the information needed to determine how the data is arranged on the media. In the past (before DiFX 2.5) the two sources of format information had different formatting options. With DiFX 2.5 a new unified format decoding infrastructure has been added that gives more flexibility. With this, formats in either vex or ''.v2d'' files can be specified in one of several ways:

  * ''<fmt>/<threads>/<size>/<bits>'' e.g., INTERLACEDVDIF/3:2:1:0/5032/2 or VDIF/7,8/8032/2 //VDIF only//
    * The comma separator for threads must be used within a ''.vex'' file
    * The colon separator for threads must be used within a ''.v2d'' file
  * ''<fmt>/<size>/<bits>'' e.g., VDIF/5032/2 //VDIF only//
  * ''<fmt><size>'' e.g., VDIF5032 //VDIF only//
  * ''<fmt>_<size>-<Mbps>-<nChan>-<bits>'' e.g., VDIF_5032-2048-4-2
  * ''<fmt>1_<fanout>'' e.g., VLBA1_4 //VLBA and Mark4 only//
  * ''<fmt>-<Mbps>-<nChan>-<bits>''
  * ''<fmt>1_<fanout>-<Mbps>-<nChan>-<bits>'' //VLBA and Mark4 only//

The format class, ''<fmt>'', can be one of the following:

  * VDIF
  * VDIFL (legacy VDIF)
  * VDIFC (VDIF with complex samples)
  * VDIFD (VDIF with double sideband complex samples)
  * INTERLACEDVDIF (explicitly multi-thread VDIF -- often interchangeable with VDIF)
  * Mark5B
  * KVN5B
  * VLBA
  * Mark4 or MKIV
  * S2
  * LBA
  * LBAVSOP

Some general tips:

  * A list of threads can be colon or comma separated.
  * If a list of threads is provided it is assumed that the format is INTERLACEDVDIF, even if only VDIF is used to specify the format class.
  * The size field always refers to the entire length of a data frame, including any headers.
* If the number of recorded channels provided is an integer multiple, //m//, of the thread count, then it is assumed that each thread has multiple channels. The ordering of the channels in the vex file is mapped to order of channels in vex as follows: the first //m// channels belong to the numerically first thread, the next //m// channels belong to the next thread, ... Note 1. this is not yet implemented in mpifxcorr, and 2. vex2 will provide a more natural way to proceed. ==== Global Parameters ==== Global parameters can be specified one or many per line such as: maxGap = 2000 # seconds or mjdStart = 52342.522 mjdStop=52342.532 ^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^ | vex | string | | REQUIRED | filename of the vex file to process; this is the only required parameter | | mjdStart | date | | obs. start | discard any scans or partial scans before this time | | mjdStop | date | | obs. stop | discard any scans or partial scans after this time | | break | date | | | mjd times of forced manual job breaks | | minSubarray | int | | 2 | don't make jobs for subarrays with fewer than this many antennas | | maxGap | float | sec | 180 | split an observation into multiple jobs if there are correlation gaps longer than this number | | tweakIntTime | bool | | False | Adjust (up to 40%) integration time to ensure integer blocks per send (newly re-enabled) | | maxSize | float | MB | 2000 | The maximum output fits file size, estimated | | singleScan | bool | | False | if True, split each scan into its own job | | singleSetup | bool | | True | if True, allow only one setup per job; True is required for FITS-IDI conversion | | maxLength | float | sec | 7200 | don't allow individual jobs longer than this amount of time | | minLength | float | sec | 2 | don't allow individual jobs shorter than this amount of time | | dataBufferFactor | int | | 32 | the mpifxcorr DATABUFFERFACTOR parameter; see mpifxcorr documentation | | nDataSegments | int | | 8 | the mpifxcorr 
NUMDATASEGMENTS parameter |
| jobSeries | string | | ''job'' | the base filename of .input and .calc files to be created |
| startSeries | int | | 1 | the default starting number for jobs created |
| sendLength | float | sec | 0 | roughly the amount of data to send at a time from datastream processes to core processes |
| sendSize | int | bytes | 5000000 | roughly the send size from datastream to core |
| antennas | string | | all ants. | a comma separated list of antennas to include in correlation |
| baselines | string | | all bls. | a comma separated list of baselines; see below |
| padScans | bool | | True | insert non-correlation scans in recording gaps to prevent mpifxcorr from complaining |
| invalidMask | int | | 0xFFFF | this bit-field selects which flag conditions are considered when writing the flag file: 1=Recording, 2=On source, 4=Job time range, 8=Antenna in job |
| visBufferLength | int | | 32 | number of visibility buffers to allocate in mpifxcorr |
| simFXCORR | bool | | False | simulate the VLBA hardware correlator integration and start times |
| overSamp | int | | | force all baseband channels to use the provided overSampling |
| mode | string | | normal | options: normal and profile; see section below |
| threadsFile | string | | | overrides the name of the threads file to use |
| nCore | int | | | with nThread, cause a .threads file to be written |
| nThread | int | | | Number of threads per core to write to .threads file |
| machines | string | | | a list of machine names used to populate a .machines file |
| maxReadSize | int | bytes | 25000000 | Max read size in bytes (larger values cause issues with Mk5 module playback) |
| minReadSize | int | bytes | 10000000 | Min read size in bytes (smaller values mean probable inefficiency) |

Note that the baselines parameter supports the following syntaxes: ''A1-A2'', ''A1+A2+A3-A4+A5'', ''A1-*'', ''A1+A2-*'', and so on. For each list member, all baselines consistent with an antenna match on both sides will be kept.
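The baseline selection rule above can be illustrated with a small Python sketch. This is not the vex2difx implementation; the function names are hypothetical, and it assumes that a baseline matches regardless of antenna order and that antenna names compare case-insensitively (as stated for antenna names in the configuration file rules).

```python
def parse_member(member):
    """Split one list member such as 'A1+A2-*' into (left, right) name sets."""
    left, sep, right = member.partition("-")
    if not sep or not left or not right:
        raise ValueError(f"malformed baseline member: {member!r}")
    return _names(left), _names(right)


def _names(side):
    # Antenna names are treated as case-insensitive, so normalise to upper case.
    return {name.upper() for name in side.split("+")}


def _side_matches(antenna, names):
    # '*' on a side matches every antenna.
    return "*" in names or antenna.upper() in names


def baseline_selected(ant1, ant2, spec):
    """Return True if baseline ant1-ant2 is kept by a comma separated spec.

    Assumption (not confirmed by the source): antenna order within a
    baseline does not matter, so both orientations are tried.
    """
    for member in spec.split(","):
        left, right = parse_member(member.strip())
        forward = _side_matches(ant1, left) and _side_matches(ant2, right)
        reverse = _side_matches(ant2, left) and _side_matches(ant1, right)
        if forward or reverse:
            return True
    return False
```

For instance, under these assumptions ''baseline_selected("BR", "MK", "BR+FD-*")'' is true while ''baseline_selected("HN", "MK", "BR+FD-*")'' is false.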
==== SOURCE sections ====

A source section can be used to change the properties of an individual source, such as its position or name. In the future this is where multiple correlation centers for a given source will be specified. A source section is enclosed in a pair of curly braces after the keyword SOURCE followed by the name of a source, e.g.:

  SOURCE 3C273 { source parameters go here }

or equivalently

  SOURCE 3c273 { source parameters go here }

^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^
| ra | | J2000 | | right ascension, e.g., 12h34m12.6s or 12:34:12.6 |
| dec | | J2000 | | declination, e.g., 34d12'23.1" or 34:12:23.1 |
| name | string | | | new name for source |
| calCode | char | | ' ' | calibration code, typically A, B, C for calibrators, G for a gated pulsar, or blank for normal target |
| naifFile | string | | | Path to a leap second kernel file for SPICE. Only used with near-field correlations |
| ephemObject | string | | | Name of the object from the ephemFile to be associated with this source. Only used for near-field correlations |
| ephemFile | string | | | Path to a planetary ephemeris file for SPICE. Only used with near-field correlations. bsp or tle files are allowed. |
| doPointingCentre | bool | | true | Whether the pointing centre should be correlated (only ever turned off for multi-phase centre) |
| addPhaseCentre | string | | | contains info on a source to add, with ra, dec and optionally name/calcode with no spaces, "/" separation and "@" in place of "=" e.g., "addPhaseCentre = name@1010-1212/RA@10:10:21.1/Dec@-12:12:00.34" |

==== ANTENNA sections ====

An antenna section allows properties of an individual antenna, such as position, name, or clock/LO offsets, to be adjusted.

^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^
| name | string | | | New name to assign to this antenna |
| polSwap | bool | | False | Swap the polarizations (i.e. 
L ←→ R) for this antenna |
| clockOffset | float | us | vex value | Overrides the clock offset value from the vex file; used in conjunction with clockEpoch |
| clockRate | float | us/s | vex value | Overrides the clock offset rate value from the vex file; used in conjunction with clockEpoch |
| clockEpoch | date | | vex value | Overrides the epoch of the clock rate value; must be present if clockRate or clockOffset parameter is set |
| deltaClock | float | us | 0.0 | Adds to the clock offset (either the vex value or the clockOffset above) |
| deltaClockRate | float | us/s | 0.0 | Adds to the clock rate (either the vex value or the clockRate above) |
| X | float | m | vex value | Change the X coordinate of the antenna location |
| Y | float | m | vex value | Change the Y coordinate of the antenna location |
| Z | float | m | vex value | Change the Z coordinate of the antenna location |
| format | string | | | Force format to be one of VLBA, MKIV, Mark5B, S2, VDIF or INTERLACEDVDIF, LBAVSOP, LBASTD |
| file | strings | | (none) | A comma separated list of files that will be copied verbatim to the DATA TABLE of the input file |
| filelist | string | | | A filename listing files for the DATA TABLE and optionally mjdStart and mjdStop for each |
| networkPort | int | | | the eVLBI network port to use. This forces NETWORK media type in .input |
| windowSize | int | | | TCP window size for eVLBI. 
Set to <0 for UDP | | UDP_MTU | int | | | Same as setting windowSize to negative of value | | vsn | string | | | Override the Mark5 Module to be used | | zoom | string | | | Uses the global zoom configuration with matching name for this antenna, e.g., zoom=Zoom1 will match the ZOOM block called Zoom1 | | addZoomFreq | string | | noparent=true | Adds a zoom band with specified freq/bw as shown: freq@1810.0/bw@4.0[/specAvg@4][/noparent@ftrue] | | freqClockOffs | string | us | | Adds clock offsets to each recorded frequency using the following format:freqClockOffs=f1,f2,f3,f4 (must be same length as number of recorded freqs, first value must be zero) | | loOffsets | string | Hz | | Adds LO offsets to each recorded frequency using the following format:loOffsets=f1,f2,f3,f4 (must be same length as number of recorded freqs) | | tcalFreq | int | Hz | 0 | Enables switched power detection at specified frequency | | phaseCalInt | int | MHz | 1 | Zero turns off phase cal extraction, positive value is the interval between tones to be extracted | | toneGuard | float | MHz | 0.125 of bandwidth | When using toneSelection ''smart'' or ''most'' don't select tones within this range of band edge, if possible | | toneSelection | string | | ''smart'' | Use an algorithm to choose tones for you. Read the code to learn more. | | sampling | string | | ''REAL'' | Set to ''COMPLEX'' for complex sampled data or ''COMPLEX_DSB'' for double sideband | | fake | bool | | False | enable a fake data source | | mjdStart | date | | obs. start | discard any data from this antenna before this time | | mjdStop | date | | obs. stop | discard any data from this antenna after this time | | machine | string | | | //Coming in ver. 2.5// if writing a .machines file, link this machine to this ANTENNA's datastream process | | datastreams | strings | | (none) | //Coming in ver. 
2.5// links to DATASTREAM sections; see below for more info |

The addZoomFreq parameter freq always specifies the **lower edge** of the frequency channel, regardless of whether the parent band is USB or LSB. The optional arguments for addZoomFreq control spectral averaging (currently constrained to be the same as for the parent band) and whether or not the parent band is still correlated; the default is that it is **not** correlated. These are more for potential future compatibility.

If it is intended to run difx2fits, "freqId" should be used in the SETUP section to select frequencies corresponding to parent band(s) for "addZoomFreq", in the case where only a subset of recorded IFs have zoom bands. This avoids bands of different width being present in the same output job, which will cause difx2fits to fail.

The "freqClockOffs" parameter is intended for fixing small differences between frequency subbands, introduced by e.g. different cabling to parallel backends. It **cannot** be used to fix large offsets (e.g. integer seconds) between frequency subbands. The reason is that the samples from different frequency subbands are interleaved, so one block of data is FFT'd and these small corrections are then applied after the FFT. At most, they can correct for offsets of one FFT duration; beyond that, there is no longer any overlap between the two antennas! **At present, there is no way to correct for large offsets (e.g. integer seconds) on some but not all frequency channels, other than multiple correlation passes with the different clocks and a messy post-processing combination of the results (e.g. 
SPLIT, DBCON in AIPS).**

Legal values for ''toneSelection'' are ''vex'', ''none'', ''all'', ''ends'', ''smart'', or ''most'':

|''smart''| write the 2 most extreme tones at least toneGuard from band edge [default]|
|''vex'' | write the tones listed in the vex file to FITS |
|''none'' | don't write any tones to FITS |
|''all'' | write all extracted tones to FITS |
|''ends'' | write the 2 most extreme tones to FITS |
|''most'' | write all tones not closer than toneGuard to band edge |

VDIF is primarily supported in DiFX2. If a format of simply VDIF is given, the frame size and number of bits will be assumed to be 5032 bytes and 2 bits, respectively. Otherwise, you can specify frame size and number of bits with a format line like: "format=VDIF/5032/2" (for 5032 bytes and 2 bits, again).

For interlaced VDIF (where multiple threads are present in one stream), presently only the following simple case is supported: all threads must have the same frame size, number of bits, and bandwidth, and each must contain exactly one subband. The INTERLACEDVDIF format takes a list of such threads and multiplexes them on the fly back into a single multiple-subband VDIF thread. For INTERLACEDVDIF, you must present a fully specified format line, which has one additional parameter compared to normal VDIF: a list of the threadIds which are to be muxed. For example, if you had 4 single subband VDIF threads interlaced in a file, and they had thread Ids of 0, 1, 16 and 17, and the order of the subbands (when compared to the list of channels in the vex file) was 0, 1, 16 and 17, then the format line would be: "format=INTERLACEDVDIF/0:1:16:17/1032/2" for 1032 byte frames and 2 bit sampling.

Please note that vex uses a clock sign convention that is positive for a formatter with its clock running fast (i.e., the second tick happens too early). The clockOffset and clockRate in this ANTENNA section, as well as FITS files, have the opposite sign convention.
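As an illustration of the VDIF format lines described earlier in this section, the following Python sketch splits a slash-separated descriptor into its parts. It is only a sketch of the rules stated in the text (default of 5032 bytes and 2 bits for a bare VDIF, a thread list implying INTERLACEDVDIF, INTERLACEDVDIF requiring a fully specified line); the function name and the returned dictionary keys are hypothetical and do not correspond to any vex2difx API.

```python
def parse_vdif_format(fmt, default_size=5032, default_bits=2):
    """Parse 'VDIF', 'VDIF/<size>/<bits>' or '<class>/<threads>/<size>/<bits>'.

    Returns a dict with hypothetical keys: class, threads, frame_size, bits.
    """
    parts = fmt.split("/")
    name = parts[0]
    if name not in ("VDIF", "INTERLACEDVDIF"):
        raise ValueError(f"not a VDIF format class: {name!r}")

    threads = []
    if len(parts) == 1:
        # Bare format class: only allowed for plain VDIF, defaults apply.
        size, bits = default_size, default_bits
    elif len(parts) == 3:
        size, bits = int(parts[1]), int(parts[2])
    elif len(parts) == 4:
        # Thread ids may be colon or comma separated.
        threads = [int(t) for t in parts[1].replace(",", ":").split(":")]
        size, bits = int(parts[2]), int(parts[3])
    else:
        raise ValueError(f"unrecognised format descriptor: {fmt!r}")

    if threads:
        # A thread list implies interlaced VDIF even if 'VDIF' was written.
        name = "INTERLACEDVDIF"
    elif name == "INTERLACEDVDIF":
        raise ValueError("INTERLACEDVDIF needs a fully specified format line")

    return {"class": name, "threads": threads, "frame_size": size, "bits": bits}
```

With the example from the text, ''parse_vdif_format("INTERLACEDVDIF/0:1:16:17/1032/2")'' yields the thread list 0, 1, 16, 17 with 1032 byte frames and 2 bits per sample.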
==== DATASTREAM sections (coming in DiFX 2.5) ==== New in upcoming DiFX version 2.5 will be support for multiple datastreams per antenna. Since vex1.5 does not have the concept of multiple datastreams per antenna the additional information must be provided explicitly in the .v2d file. Within .v2d files datastreams are linked to antennas. Logically speaking the datastreams are functions not only of antenna but also of setup; cases that have varying recording modes through an experiment invariably have changes to the datastreams, as used by DiFX, as well. Thus the implementation described here does not provide a fully general solution. In cases where this breaks down it is likely that multiple .v2d files, each acting on a subset of the setups used, will allow the needed flexibility. Note that when vex2 is fully supported, the STREAMS block within vex will give users access to the full generality of DiFX on a setup-by-setup basis. To enable multiple datastreams for an antenna, simply define 2 or more DATASTREAM sections (described below) and link them with the appropriate antenna by using the datastreams parameter of ANTENNA sections. By default if there are //N// DATASTREAMS defined for an antenna, each will get one //N//th of the channels with the order of the channels preserved, meaning that the order of the datastreams argument does matter. This can be overridden with an nBand parameter. ^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^ | format | string | | | the data format for this (see below for more details) | | sampling | string | | ''REAL'' | Set to ''COMPLEX'' for complex sampled data or ''COMPLEX_DSB'' for double sideband | | file | strings | | (none) | A comma separated list of files that will be copied verbatim to the DATA TABLE of the input file | | filelist | string | | | A filename listing files for the DATA TABLE and optionally mjdStart and mjdStop for each | | networkPort | int | | | the eVLBI network port to use. 
This forces NETWORK media type in .input | | windowSize | int | | | TCP window size for eVLBI. Set to <0 for UDP | | UDP_MTU | int | | | Same as setting windowSize to negative of value | | vsn | string | | | Override the Mark5 Module to be used | | fake | bool | | False | enable a fake data source | | nBand | int | | | number of bands (baseband channels) to assign to this datastream | | machine | string | | | if writing a ''.machines'' file, link this machine to this datastream's process | Specifying data formats is often tricky, especially in cases where vex doesn't properly support the particular format type (e.g., VDIF with vex1.5), or in the multiple datastream case. It is suggested to use a full format descriptor (e.g., "format=INTERLACEDVDIF/0:1:16:17/1032/2" rather than just "format=INTERLACEDVDIF") even if the information should be present to fill in the gaps. In general vex2difx tries to require minimal information, but sometimes its assumptions may differ from yours. ==== SETUP sections ==== Setup sections are enclosed in braces after the word SETUP and a name given to this setup section. The setup name is referenced by a RULE section (see below). A setup with the special name ''default'' will be applied to any scans not otherwise assigned to setups by rule sections. If no setup sections are defined, a setup called ''default'', with all default parameters, will be implicitly created and applied to all scans. The order of setup sections is immaterial. Note: The use of nChan (plus optionally specAvg) to set final (and FFT) spectral resolution is discouraged. It is maintained for backwards compatibility and convenience, but if you have different subband bandwidths, you **cannot** use nChan, and must instead use specRes (and FFTSpecRes, if you want to explicitly set the FFT spectral resolution, for example in multifield projects). 
^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^ | tInt | float | sec | 2 | integration time | | FFTSpecRes | float | MHz | 0.125 | spectral resolution of first stage FFTs | | specRes | float | MHz | 0.5 | spectral resolution of visibilities produced | | nChan | int | | 16 | number of post-averaged channels per spectral window; currently must be a power of 2. Do not use in combination with specRes/FFTSpecRes; nChan is only for convenience in simple cases (all stations have the same bandwidth for all subbands) | | doPolar | bool | | True | correlate cross hands when possible? | | subintNS | int | ns | 160000000 | The mpifxcorr SUBINT NS; should eventually be set to a smarter default | | guardNS | int | ns | 2000 | The mpifxcorr GUARD NS; 2000 is almost always fine, should eventually be adjusted automatically | | maxNSBetweenUVShifts | int | ns | 2000000000 | Used for multiphase centre stuff. if better time resolution than 1 threads portion of a subint is required | | maxNSBetweenACAvg | int | ns | 2000000000 | Used for STA dumping (transient searches) if better time resolution than 1 threads portion of a subint is required | | specAvg | int | | 8 | The spectral averaging to perform inside the correlator, at the end of a subint | | fringeRotOrder | int | | 1 | The fringe rotation order - 0=post-F, 1=linear, 2=quadratic | | strideLength | int | | 16 | The number of channels to "stride" for fringe rotation, fractional sample correction etc | | xmacLength | int | | 128 | The number of channels to "stride" for cross-multiply accumulations | | numBufferedFFTs | int | | 1 | The number of FFTs to do in a row for each datastream, before XMAC'ing them all | | postFFringe | bool | | False | do fringe rotation after FFT? 
|
| binConfig | string | | none | if specified, apply this pulsar bin configuration file to this setup |
| freqId | int list | | none | a comma separated list of integers that are freq table indexes to select which bands to correlate; default is to correlate all. **Note:** this should be used to select parent bands for zoom frequencies if difx2fits is to be run. |
| phasedArray | string | | | if specified, tells DiFX to produce a phased array output instead of cross correlations, using the setup specified in this phased array config file |

==== EOP sections ====

It is possible to specify the Earth Orientation Parameters (EOPs) through the .v2d file. Normally these values will be appended to the vex file, but there may be cases where a completely unmodified vex file is desired (eVLBI maybe?). Like ANTENNA and SOURCE sections, each EOP section has a name. The name must be in a form that can be converted directly to a date (see above for legal date formats). Conventional use suggests that these dates should correspond to 0 hours UT; deviation from this practice is at the user's risk. It is not advised to mix EOP values stored in the vex and .v2d files.

^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^
| tai_utc | float | sec | | TAI minus UTC; the leap-second count |
| ut1_utc | float | sec | | UT1 minus UTC; Earth rotation phase |
| xPole | float | asec | | X component of spin axis offset |
| yPole | float | asec | | Y component of spin axis offset |

Example section:

  EOP 55005 { tai_utc=34 ut1_utc=0.236958 xPole=0.10597 yPole=0.53906 }

==== RULE sections ====

A rule section is used to assign a setup to a particular source name, calibration code (currently not supported), scan name, or vex mode. The order of rule sections //does// matter as the order determines the priority of the rules. The first rule that matches a scan is applied to that scan. The correlator setup used for scans that match a rule is determined by the parameter called ''setup''.
A special setup name ''SKIP'' causes matching scans not to be correlated. Any parameters not specified are interpreted as fully inclusive. Note that multiple rule sections can reference the same setup section. Multiple values may be applied to any of the parameters except for ''setup''. This is accomplished by comma separation of the values in a single assignment or with repeated assignments. Thus

  RULE rule1 { source = 3C84,3C273 setup = BrightSourceSetup }

is equivalent to

  RULE rule2 { source = 3C84 3C273 setup = BrightSourceSetup }

which is in turn equivalent to

  RULE rule3 { source = 3C84 source = 3C273 setup = BrightSourceSetup }

The names given to rules (e.g., rule1, rule2 and rule3 above) are not used anywhere but are required to be unique.

^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^
| scan | string | | | one or more scan names, as specified in the vex file, to select with this rule |
| source | string | | | one or more source names, as specified in the vex file, to select with this rule |
| calCode | char | | | one or more calibration codes to select with this rule |
| mode | string | | | one or more modes as defined in the vex file to select with this rule |
| setup | string | | | The name of the SETUP section to use, or SKIP if this rule describes scans not to correlate |

Note that source names and calibration codes reassigned by source sections are not used. Only the names and calibration codes in the vex file are compared.

==== ZOOM sections ====

A zoom section specifies a list of zoom bands, using the same syntax used to add a zoom band to an individual antenna.
This zoom setup can then be selected by any number of antennas (making it simpler, with less typing, to add the same zoom setup to many antennas).

^ Parameter name ^ Type ^ Units ^ Default ^ Comments ^
| addZoomFreq | string | | | Adds a zoom band with specified freq/bw as shown: freq@1810.0/bw@4.0[/specAvg@4][/noparent@false] |

Defaults are that noparent is true (i.e., parent bands are not correlated), and spectral averaging is performed as specified for the parent band (e.g., based on specRes and FFTSpecRes).

An example ZOOM section that adds 4 zoom bands of width 8 MHz might look like:

  ZOOM zoom1 {
    addZoomFreq = freq@1600.49/bw@8.0
    addZoomFreq = freq@1608.49/bw@8.0
    addZoomFreq = freq@1616.49/bw@8.0
    addZoomFreq = freq@1624.49/bw@8.0
  }

==== COMMENT sections //coming in DiFX 2.5// ====

Starting with version 2.5.0 one can include COMMENT blocks in the .v2d file. These have no effect on the files written by vex2difx but allow comments (likely instructions to the person executing vex2difx) to be seen at the end of the output to the terminal. Each COMMENT block will make a new comment, separated by one line of whitespace in the output. A comment block starts with ''COMMENT {'' and ends with a ''}''. For example:

  COMMENT { Correlate this four times to see if we get the same answer. }

===== Modes =====

Currently vex2difx operates in one of two modes:

  * //normal// The output of vex2difx is a set of files useful for correlating the data. This is, as the name suggests, the normal mode of operation for vex2difx.
  * //profile// This mode specializes in making files useful for forming pulse profiles in preparation for pulsar gating. The difference compared to //normal// mode is that the standard autocorrelations are turned off and instead are computed as if they are cross correlations. This allows multiple pulsar bins to be stored. No formal cross correlations are performed. To be useful, one must create and specify a binconfig file and select only the pulsar(s) from the experiment.
===== Command line arguments ===== vex2difx is executed on the command line with: ''vex2difx'' [options] inputFile Although no command line options can change the way that vex2difx processes a file, there are some options that the user may find useful: * ''-h'' or ''--help'' Print usage information to the screen. This is the same as if no arguments were supplied to vex2difx. * ''-o'' or ''--output'' Writes a configuration file called //inputFile//''.params'' which is a valid configuration file identical to //inputFile//, but with all assumed defaults populated. This is useful to see what was actually assumed. * ''-v'' or ''--verbose'' Prints much more information to the screen. Use this option twice for even more information. * ''-d'' or ''--delete-old'' Deletes all output from previous runs of vex2difx with same prefix. This is most useful when rerunning and a smaller number of jobs are created. * ''-s'' or ''--strict'' Treat some warnings as errors and quit. ===== Reporting problems ===== If you have a problem with vex2difx, please email the difx users email group. 
Be sure to include the following in the email: - A description of the problem - The v2d file supplied to vex2difx - The vex file pointed to from the v2d file - the captured output when running vex2difx with extra verbosity (use options ''-v -v'') ===== Examples ===== ==== Trivial case ==== The following example demonstrates the simplest case where all defaults are assumed vex=trivial.vex ==== Simple case ==== The following is a more realistic case for a simple experiment vex=simple.vex SETUP default { nChan=64 tInt =3.0 } ==== Source coordinate change ==== This shows how to change the coordinates of two sources in a file vex=coords.vex SOURCE J1232+131 { ra=12h32m15.12s dec=13d07'12.5" } SOURCE PLANETX { ra=11h59m59.999s dec=-12d59'59.88" } SETUP default { nChan=128 } ==== Two setups ==== This is a more complicated file showing how to apply different correlator setups to different sources vex=twosetups.vex maxGap=1000 # don't split the jobs at every source change, # instead, make just 2 interleaved jobs antennas=BR,FD,HN,MK # select only these four antennas for now SETUP target { nchan=1024 tInt =1.2 } SETUP calibrator { nchan=32 tInt =4 } RULE calRule { source=J1234+1231,3C84,3C273 setup =calibrator } RULE targetRule { # note: not specifying any restrictions so all sources that don't # match above will match here setup = target } The above could have used a default setup rather than a catch-all rule and resulted in the same output. ===== Specifying media ===== vex2difx allows .input file generation for two types of media. A single .input file can have different media types for different stations. Ensuring specification of media is important as antennas with no media will be dropped from correlation. The default media choice is Mark5 modules. The TAPELOG_OBS table in the input vex file should list the time ranges valid for each module. Jobs will be split at Mark5 module boundaries; that is, a single job can only support a single Mark5 unit per station. 
All stations using Mark5 modules will have DATA SOURCE set to MODULE in .input files.

If file-based correlation is to be performed, the TAPELOG_OBS table is not needed and the burden of specifying media is moved to the .v2d file. The files to correlate are specified separately for each antenna in an ANTENNA block. Note that when specifying filenames, it is up to the user to ensure that full and proper paths to each file are provided and that the computer running the datastream for each antenna can "see" that file.

Two keywords are used to specify data files. They are not mutually exclusive, but using both for the same antenna is not recommended. The first is "file". The value assigned to "file" is one or more (comma separated) files. It is OK to have multiple "file" keywords per antenna; all files supplied will be stored in the same order internally. The second keyword is "filelist", which takes a single argument: a file containing the list of files to read. This "filelist" file only needs to be visible to vex2difx. It contains a list of filenames and optionally start and stop dates (in one of the formats listed above). Comments start with a # and end at the end of the line. As with the "file" keyword, the filenames listed must be in time order, even if start and stop dates are supplied. An example "filelist" file is below:

  # This is a comment. File list for MK for project BX123
  /data/mk/bx123.001.m5a 54322.452112 54322.511304
  /data/mk/bx123.002.m5a 54322.512012 54322.514121 # a short scan
  /data/mk/bx123.003.m5a 54322.766323 54322.812311

If times for a file are supplied, the file will be included in the .input file DATA TABLE only if the file time range overlaps with the .input file time range. If not supplied, the file will be included regardless of the .input file time range, which could incur a large performance penalty.
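The inclusion rule for "filelist" entries can be illustrated with a short Python sketch. This is not vex2difx's own code (which is C++); the names ''parse_filelist'' and ''files_for_job'' are hypothetical, times are assumed to be decimal MJD as in the example file, and treating touching (zero-length) overlaps as non-overlapping is an assumption.

```python
from typing import List, Optional, Tuple

# (filename, start MJD or None, stop MJD or None)
Entry = Tuple[str, Optional[float], Optional[float]]

# The example filelist from the text, used here as sample input.
SAMPLE = """\
# This is a comment. File list for MK for project BX123
/data/mk/bx123.001.m5a 54322.452112 54322.511304
/data/mk/bx123.002.m5a 54322.512012 54322.514121 # a short scan
/data/mk/bx123.003.m5a 54322.766323 54322.812311
"""


def parse_filelist(text: str) -> List[Entry]:
    """Parse filelist text: one file per line, optional start/stop, # comments."""
    entries: List[Entry] = []
    for line in text.splitlines():
        # Everything after '#' is a comment; the rest is whitespace separated.
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue  # blank or comment-only line
        if len(fields) >= 3:
            entries.append((fields[0], float(fields[1]), float(fields[2])))
        else:
            # Assumption: an entry without both times is treated as untimed.
            entries.append((fields[0], None, None))
    return entries


def files_for_job(entries: List[Entry], job_start: float, job_stop: float) -> List[str]:
    """Return files overlapping [job_start, job_stop], plus all untimed files."""
    selected = []
    for name, start, stop in entries:
        if start is None or stop is None:
            selected.append(name)  # no times supplied: always included
        elif start < job_stop and stop > job_start:
            selected.append(name)  # time ranges overlap
    return selected


if __name__ == "__main__":
    for name in files_for_job(parse_filelist(SAMPLE), 54322.50, 54322.52):
        print(name)
```

For a job spanning MJD 54322.50 to 54322.52, only the first two files of the example are selected. An untimed entry would be selected for every job, which is the situation the text warns about.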
A few sample ANTENNA blocks are shown below:

  ANTENNA MK { filelist=bx123.filelist.mk }
  ANTENNA OV
  {
    file=/data/ov/bx123.001.m5a,
         /data/ov/bx123.002.m5a,
         /data/ov/bx123.003.m5a
  }
  ANTENNA PT { file=/data/pt/bx123.003.m5a } # recording started late here
  ANTENNA default { networkPort = 320 } # all antennas without ANTENNA setups will get this

===== Splitting of jobs =====

Certain events cause a forced job break that could, in some cases, end up requiring many individual software correlations to complete processing of a project. Effort has been made to minimize the number of these cases. The following situations will cause a job break: a change in clock model at a station, a change of Mark5 module, a change in the number of channels or sub-bands, multiple simultaneous subarrays, and leap seconds. Future versions of vex2difx and DiFX may relax some of these circumstances. Some parameters (such as maxLength) have defaults that may cause more job splitting than is desired; these can be tuned.

===== Mark5B issues =====

The Mark5B format, including its 2048 Mbps extension, is now supported by vex2difx. The "track assignments" for Mark5B format have never been formally documented. vex2difx has adopted the track assignment convention used by Haystack. Formally speaking, Mark5B has no "tracks". Instead it stores up to 32 bitstreams in 32-bit words. The concept of "fanout" is no longer used with Mark5B. Instead, the equivalent operation of spreading one bitstream among 1 or more bits in each 32-bit word is performed automatically. Thus to specify a Mark5B mode, only three numbers are needed: total data bit rate (excluding frame headers), number of channels, and number of bits per sample (1 or 2). The number of bitstreams is the product of channels and bits. The $TRACKS section of the vex file is used to convey the bitstream assignments. Individually, the sign and magnitude bits for each channel are specified with fanout_def statements.
In unfortunate correspondence with existing practice, 2 is the first numbered bitstream and 33 is the highest. In 2-bit mode, all sign bits must be assigned to even numbered bitstreams and the corresponding magnitude bit must be assigned to the next highest bitstream. To indicate that the data is in Mark5B format, either a statement of the form ''track_frame_format = MARK5B;'' must be present in the appropriate $TRACKS section, or ''format = MARK5B'' must be present in each appropriate ANTENNA section of the .v2d file. As a concrete example, a complete $TRACKS section may resemble:

  $TRACKS;
  def Mk34112-XX01_full;
    fanout_def = A : &Ch01 : sign : 1 : 02;
    fanout_def = A : &Ch01 : mag : 1 : 03;
    fanout_def = A : &Ch02 : sign : 1 : 04;
    fanout_def = A : &Ch02 : mag : 1 : 05;
    fanout_def = A : &Ch03 : sign : 1 : 06;
    fanout_def = A : &Ch03 : mag : 1 : 07;
    fanout_def = A : &Ch04 : sign : 1 : 08;
    fanout_def = A : &Ch04 : mag : 1 : 09;
    fanout_def = A : &Ch05 : sign : 1 : 10;
    fanout_def = A : &Ch05 : mag : 1 : 11;
    fanout_def = A : &Ch06 : sign : 1 : 12;
    fanout_def = A : &Ch06 : mag : 1 : 13;
    fanout_def = A : &Ch07 : sign : 1 : 14;
    fanout_def = A : &Ch07 : mag : 1 : 15;
    fanout_def = A : &Ch08 : sign : 1 : 16;
    fanout_def = A : &Ch08 : mag : 1 : 17;
    fanout_def = A : &Ch09 : sign : 1 : 18;
    fanout_def = A : &Ch09 : mag : 1 : 19;
    fanout_def = A : &Ch10 : sign : 1 : 20;
    fanout_def = A : &Ch10 : mag : 1 : 21;
    fanout_def = A : &Ch11 : sign : 1 : 22;
    fanout_def = A : &Ch11 : mag : 1 : 23;
    fanout_def = A : &Ch12 : sign : 1 : 24;
    fanout_def = A : &Ch12 : mag : 1 : 25;
    fanout_def = A : &Ch13 : sign : 1 : 26;
    fanout_def = A : &Ch13 : mag : 1 : 27;
    fanout_def = A : &Ch14 : sign : 1 : 28;
    fanout_def = A : &Ch14 : mag : 1 : 29;
    fanout_def = A : &Ch15 : sign : 1 : 30;
    fanout_def = A : &Ch15 : mag : 1 : 31;
    fanout_def = A : &Ch16 : sign : 1 : 32;
    fanout_def = A : &Ch16 : mag : 1 : 33;
    track_frame_format = MARK5B;
  enddef;

===== VDIF issues =====

VDIF, including "Legacy VDIF", is
supported by vex2difx. (Legacy support was added in 2.3.) The "track assignments" for VDIF need to be clarified. VDIF has no "tracks" and channels are specifically stored "in order" within a bitstream. The concept of "fanout" is also no longer used. To indicate that the data is in VDIF format, either a statement of the form ''track_frame_format = VDIF;'' (for non-legacy data) or ''track_frame_format = VDIFL;'' (for legacy data) must be present in the appropriate $TRACKS section, or ''format = VDIF'' or ''format = VDIFL'' must be present in each appropriate ANTENNA section of the .v2d file. As a concrete example, a complete $TRACKS section may resemble:

TOBEADDED

===== About the source code =====

vex2difx is written in C++ and makes heavy use of the Standard Template Library. This makes applying standard algorithms (sorting, traversing, associating) to container members simple and less error-prone. An object-oriented approach is used. The base class for many of the classes is Interval, which is simply a pair of modified Julian days specifying a time interval. From this many other classes, such as scan, job, experiment, flag, ... are derived. These classes are defined and implemented in files in the vexdatamodel/ subdirectory of the source code. Deriving from Interval makes simple operations on Interval objects (such as sorting and determining overlap) apply automatically to the higher level objects.

The vex parsing library from the Field System was borrowed from Goddard Space Flight Center. This code is duplicated with little change within the vex/ subdirectory of the vex2difx source tree. Source file vexload.cpp contains the code that calls the vex parser routines to populate the VexData structure, which is then used as the model from which to make jobs.

vex2difx uses the difxio library for writing DiFX .input and .calc files. Currently the .flag files are written natively within vex2difx; however, this may change.
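The Interval design described in this section can be illustrated with a small sketch. The real classes are C++ and live in vexdatamodel/; this Python version only mirrors the idea, and the field and method names (''mjd_start'', ''mjd_stop'', ''overlaps'', ''duration'') and the contents of ''Scan'' are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Interval:
    """A time range given by two modified Julian days."""
    mjd_start: float
    mjd_stop: float

    def duration(self) -> float:
        """Length of the interval in days."""
        return self.mjd_stop - self.mjd_start

    def overlaps(self, other: "Interval") -> bool:
        """True if the two intervals share more than a single instant."""
        return self.mjd_start < other.mjd_stop and other.mjd_start < self.mjd_stop


@dataclass(order=True)
class Scan(Interval):
    """Hypothetical derived class: inherits ordering and overlap tests."""
    name: str = ""
    source: str = ""


def scans_in(job: Interval, scans: list) -> list:
    """Names of scans overlapping the job, in time order."""
    return [s.name for s in sorted(scans) if s.overlaps(job)]


if __name__ == "__main__":
    scans = [
        Scan(55000.50, 55000.52, "No0002", "3C84"),
        Scan(55000.60, 55000.62, "No0003", "3C273"),
        Scan(55000.40, 55000.45, "No0001", "J1234+1231"),
    ]
    print(scans_in(Interval(55000.44, 55000.51), scans))
```

Because ''Scan'' inherits from ''Interval'', sorting and overlap testing need no scan-specific code, which is the benefit the text attributes to the class hierarchy.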
To aid in diagnosing an experiment and in forming jobs, vex2difx keeps an internal list of //events//. An event could be the experiment starting or stopping, recording at a station starting or stopping, a leap second, an antenna joining or leaving a scan, and others. Event types are enumerated in the vextables.h source file.

Splitting of an experiment into one or more jobs is one of the main functions of vex2difx. The first step in this process is to divide the experiment into JobGroups. A JobGroup is a collection of scans that can be combined into one FITS file. Examples of cases where a JobGroup boundary must be made include a change in the number of spectral channels or polarizations. The JobGroup boundaries happen at exact times, dictated entirely by the scheduled scans. The second layer of splitting considers media changes. Often there is a gap between the end of recording on one Mark5 module and the beginning of recording on the next. vex2difx aims to be smart about choosing when to split jobs, to minimize the total number of jobs created.

===== vex2difx TODO list =====

List of remaining issues:

  * Add a "default" option for the antenna table
  * Improve handling of case mismatches (e.g., Ny vs. NY)
  * VDIF support
  * Better handle modes/setups that don't use all provided antennas
  * Extensive testing of many modes
  * Support pulsar polyco with gate open/close support for simple gating
  * Support nAntenna != nDatastream
  * Don't require $DIFX_VERSION (see bugs below)
  * Warn if source, scan, antenna, or mode name referenced from .v2d file is not in vex file
  * eVLBI support
  * Support time formats other than decimal MJD
  * Mark5B support
  * Set up correlation off disk files rather than modules
  * Ability to select which baselines are retained or dumped (akin to antennas= )
  * Support IF selection
  * VLBA hardware correlator time alignment option
  * Improve blocks per send calculation
  * Support for polarization swapping
  * Support for ANTENNA sections
  * Improved ra, dec parsing
  * Write ''.flag'' file indicating baselines/antennas to turf after correlation
  * Allow setting DataStream buffer as a total size in MB or seconds (e.g., 256 MB or 10sec)
  * Use links from GLOBAL and STATION tables to CLOCK, EOP and TAPELOG_OBS as appropriate (example file: gk022.vix)

===== BUGS =====

  * Core dumps if DIFX_VERSION is not set (fixed in DiFX-1.5 branch and trunk, 2010Mar04)
  * In eVLBI (NETWORK) mode, warns about missing media specification and does not add the Network Table to input file (fixed in trunk, 2010July)
  * Default AvChans is not 1
  * If an antenna is not listed in the global ''antennas'' list but has an ''ANTENNA'' section, vex2difx should just skip this section, not return an error.

===== Feature Requests =====

  * By default, the "CORE CONF FILENAME" does not need to include the experiment name. "threads" is good enough (CJP) (set with threadsFile= 2010July)
  * If only a single job is to be run, the output filename does not need the _# prefix. For the way ATNF eVLBI is run this is particularly messy (as each time DiFX is run the wall clock start time is added to the file name) (CJP). (enabled by setting startSeries=0 2010July03)
  * When correlating single pol antennas against dual pol, the (possible) crosspol products should be added to the baseline table, at least optionally. (CJP)
  * Simple VDIF support is needed (CJP)
  * Complex data type support is needed [boolean, default False] (CJP)
  * Support input files with MSDOS or Linux style EOL markers (WFB)