Overview of the DiFX GUI

The DiFX GUI was developed at the United States Naval Observatory as a front end to the DiFX software correlator. It was designed to minimize the effort required to conduct "normal" correlation tasks without crippling the flexibility sophisticated operators require for unique or unusual operations. While it was designed to make running the correlator easier, running the GUI (and reading this documentation) requires some understanding of DiFX and its components.

This document contains an end-to-end description of how the DiFX user interface can be used to run DiFX jobs. All steps in the process are described in roughly the sequence they would be employed in actually running a job. Because the GUI has many options that can cause branching paths in step-by-step instructions, what follows is only a description of a "sample" procedure for running a job. The specific needs of individual users, data and installations will likely require approaches that differ some, or possibly a lot. With this in mind, links are provided to other sections of the documentation that may provide additional details on each subject.

This document is not a comprehensive tutorial on all of the functionality of DiFX itself. A good place to start looking for that sort of thing is here.

What the GUI Does

The GUI allows an operator to perform basic tasks associated with DiFX correlation. It was created for the purpose of processing data from regular, fairly standardized geodetic experiments, and is probably best suited for that kind of work. Specifically, it allows a user starting with radio interferometry data and a ".vex" file to:

Create a ".v2d" file governing how the .vex file and data will be used to produce DiFX experiments and organize where results data from DiFX processing will be stored.
Run vex2difx and calcif2 (or variants) to produce the control files DiFX requires to run experiments (.input, .calc, etc.).
Determine how DiFX processing will be distributed amongst correlator resources (by creating .machines and .threads files).
Run (possibly large numbers of) DiFX jobs according to user-defined scheduling instructions.

The GUI also serves some ancillary purposes, including:

Monitoring the health and activity of DiFX cluster components (processors, Mark5 units, etc.) and showing the progress of DiFX jobs running on the cluster. In this capacity it is essentially a passive observer, serving as a somewhat more colorful replacement for the errormon, mk5mon and cpumon scripts that come with DiFX.
Running some controls on data and data modules, including copying data to disk, creating "file lists", and scanning module hard drives for problems.

Build and Installation

The DiFX GUI and its associated server process guiServer are part of the DiFX software tree. The GUI is written in Java, and is currently located in the sub-directory:

   applications/gui

The Java archive (".jar") file used to run the GUI itself is in:

   applications/gui/gui/dist/gui.jar

This directory also contains a number of other ".jar" files that are necessary to run the GUI. If you want to move the GUI to another disk location it is best to simply copy the complete contents of the "dist" directory.

The top-level of this documentation is here:

   applications/gui/doc/intro.html

The guiServer C++ application is also part of the DiFX software tree, in:

   applications/guiServer

Using DiFX build procedures (such as difxbuild) will compile guiServer and install it and the gui.jar file in the appropriate bin directories. The ".jar" files do not need compilation.

Running the GUI and Connecting to the DiFX Host

A single instance of the server application guiServer needs to be run on one of the processing nodes in the DiFX cluster (the processor running guiServer is often referred to in the documentation as the "DiFX Host" or the "DiFX head node"). GuiServer must be run by a user (not root for security reasons) that has read/write permissions over all data directories used by DiFX. This user must also be able to start processes on all other nodes using mpirun. It probably makes most sense to have the user that you normally use to run DiFX from the command line run guiServer.

GuiServer is run from the command line on the DiFX Host:

   guiServer [PORT #]

The optional port number is the TCP connection port used to communicate with the GUI. If it is not specified, guiServer will use a default port number (it uses the value given by the DIFX_MESSAGE_PORT environment variable, or 50200 if that is not available). As soon as it is started, guiServer will produce a message indicating the port it is using:

   server at port 50200
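
The port selection described above can be sketched in a few lines of Python. This is only an illustration of the documented order of precedence (command-line argument, then DIFX_MESSAGE_PORT, then 50200); the function name `resolve_port` is hypothetical and the code is not taken from guiServer itself.

```python
import os

DEFAULT_PORT = 50200  # fallback value stated in the text above


def resolve_port(cli_port=None, env=None):
    """Return the port guiServer would be expected to use, per the description above."""
    env = os.environ if env is None else env
    if cli_port is not None:
        return int(cli_port)            # an explicit command-line port wins
    env_value = env.get("DIFX_MESSAGE_PORT")
    if env_value:
        return int(env_value)           # otherwise the environment variable
    return DEFAULT_PORT                 # otherwise the documented default


print(resolve_port(env={}))  # 50200
```

If the port reported by guiServer differs from what you expect, checking DIFX_MESSAGE_PORT in the environment of the user running guiServer is a reasonable first step.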

The GUI itself is a Java program that can be run anywhere that a network connection to the DiFX cluster is available. Because the GUI and guiServer communicate using insecure TCP connections there must be no intervening firewalls between them (there are ways to deal with firewalls and in fact run the GUI anywhere - see Running DiFX Remotely). Run the GUI using its ".jar" (Java archive) file:

   java -jar [GUI DIST PATH]/gui.jar

The "GUI DIST PATH" is the location of the "dist" subdirectory in the gui portion of the DiFX installation tree. Once the GUI is running, the address of the DiFX Host (where guiServer is running) and the port number (what guiServer told you above) can be entered in the Settings menu to connect the two (see DiFX Control Connection in the Settings documentation for details). A proper connection will be pretty obvious - the GuiServer Connection Monitor will turn green, a "connection successful" message will appear, and, assuming you have mk5daemon operating properly, the GUI should start displaying information about the components of the DiFX cluster.
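
If the connection does not come up, it can help to confirm that the guiServer port is reachable from the machine running the GUI before digging into the Settings. The following is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, not part of the DiFX tools, and it only tests that a TCP connection can be opened, not that guiServer is the process answering.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


# Example call (host name is a placeholder for your DiFX Host):
# can_connect("difx-head.example.org", 50200)
```

A False result usually points to a wrong host name, a wrong port, a firewall, or guiServer not running.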

The GUI and guiServer can be started in any order - the GUI will connect as soon as a guiServer becomes available (for the most part - remote connections are sometimes more touchy about this). Any number of GUI sessions can be run simultaneously using the same guiServer, although there are considerations one should take into account to make sure ports are always available.

Run mk5daemon!

For the GUI to work properly it is important that the mk5daemon process be running on every DiFX hardware component (processors, MK5 units, etc.) in the DiFX cluster. The reason for this is that mk5daemon produces the periodic "heartbeats" for each component, including such information as CPU and memory load and read/write operations. Mk5daemon is also important because it is the only way the GUI knows that a component exists and is available as a resource - without it the component will not be utilized in DiFX processing. Your DiFX cluster may be set up such that mk5daemon is started by each component when it boots, but in the event it is not you will need to log into each component (use the DIFX_USER) and start it by typing:

   mk5daemon &

Soon after mk5daemon is run on a component, the component should appear in the GUI Hardware Monitor (see Monitoring the DiFX Cluster Hardware).

Running DiFX Remotely

The GUI/guiServer communications link, which handles all interaction between the GUI and DiFX, is based on insecure TCP socket connections. This works fine if you run the GUI on the same unrestricted LAN as the software correlator, but breaks down if you move outside firewalls that restrict direct TCP connections. To get around such restrictions, an "ssh tunnel" can be set up through a firewall as long as you can ssh to the firewall. Running a DiFX cluster that is located behind a firewall using a GUI running on a machine outside the firewall can be accomplished using the following steps (the order of which is unimportant):

Start guiServer - normally on the head node of the DiFX cluster. The TCP connection port will be referred to as the "connection port" in the following steps.
Start an ssh tunnel from the location where you wish to run the GUI through the firewall. This is done using an ssh command with some options:
```
ssh -N -L [LOCALPORT]:[WHAT FIREWALL CALLS DIFX HEADNODE]:[CONNECTION PORT] USER@FIREWALL
```
- USER@FIREWALL is how you would log into the firewall from your local machine.
- CONNECTION PORT is the TCP connection port used by guiServer on the head node of the DiFX cluster.
- WHAT FIREWALL CALLS DIFX HEADNODE is the node name of the head node as the firewall sees it (what you would use if logging into the head node from the firewall).
- LOCALPORT is a port on the machine where the GUI is being run.
Start the GUI on the machine outside the firewall. In the "DiFX Control Connection" section of the "Settings" window, set "DiFX Host" to "localhost" and "Control Port" to the value of LOCALPORT you used in the ssh tunnel. The "guiServer Connection" light should turn green and you should start seeing data from the nodes in the DiFX cluster.
Make sure Channel All Data is activated in the "DiFX Control Connection" section of the "Settings" window.
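
To make the pieces of the tunnel command concrete, here is a small Python sketch that assembles it from the four values described in the steps above. The function name and all of the example values (ports, node name, login) are hypothetical; only the `ssh -N -L` form itself comes from the text.

```python
import shlex


def build_tunnel_command(local_port, headnode, connection_port, user_at_firewall):
    """Assemble the ssh port-forwarding command described in the steps above."""
    forward = f"{local_port}:{headnode}:{connection_port}"
    return ["ssh", "-N", "-L", forward, user_at_firewall]


# Hypothetical values: replace with your own ports, node name and login.
cmd = build_tunnel_command(50300, "difx-head", 50200, "operator@gateway.example.org")
print(shlex.join(cmd))
# ssh -N -L 50300:difx-head:50200 operator@gateway.example.org
```

With a tunnel like this in place, the GUI would be pointed at "localhost" and port 50300 rather than at the head node directly.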

Changing Settings

The GUI has many options the user can set to govern processing, how data are stored, and where necessary components are located. These are all collected in the "Settings Window" which can be accessed from the "Settings/Show Settings" menu item. There are a large number of options in the Settings Window, but most of these needn't be touched on a job-by-job basis as long as the GUI is running smoothly and appears to be doing things correctly. Below is a short list of some of the settings that are more likely to require user changes, along with brief explanations of what they do (each item is linked to detailed documentation). A comprehensive list of all settings and their options is contained in the Settings Window in Detail documentation.

DiFX Host is the host name of the "head node" of the DiFX cluster - where guiServer should be running. The DiFX host name should be whatever the machine on which the GUI is run calls the DiFX head node - i.e. a "ping" of this host name from the GUI host should be successful.
Control Port is the port number at which the GUI will repeatedly attempt to make a connection to guiServer if such a connection does not exist. The control port should match the port number given by guiServer when it is started.
Run w/DiFX Version is the version of DiFX software that will be run by the GUI. This version does not need to match that of the GUI or guiServer, however it does need to be installed on the DiFX cluster.
DiFX Execute Script is a script on the DiFX processing nodes that is used to execute all DiFX and mpi commands. The script defines environment variables and performs any other necessary setup before running things. Most of the time the script selected automatically by the GUI should be fine.
Working Directory is the path under which new directories are stored for DiFX experiments that the user creates using the GUI. The "user" running guiServer needs to have write permission in this location or none of this will work.

Settings are preserved between GUI sessions, so once you have things set up and running properly you should be able to restart the GUI and have it run properly right away. You can also save specific setting configurations to files that can later be loaded. See Settings File.

Monitoring the DiFX Cluster Hardware

The GUI Hardware Monitor provides detail about the status of the hardware components (processors and data module drives) that comprise the DiFX cluster. Components are detected using their mk5daemon broadcasts, so only those components running that application will be visible. Data are continually updated at the broadcast rate of mk5daemon - "lights" associated with each component will flash green with every update, providing a "heartbeat" for each component. These lights will turn yellow, and then red if a component goes silent for too long.
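
The green/yellow/red behavior amounts to a timeout on the most recent mk5daemon broadcast. A rough sketch of that logic follows; the threshold values and the function name are assumptions made for illustration and are not the values the GUI actually uses.

```python
def heartbeat_color(seconds_since_last, warn_after=20.0, fail_after=60.0):
    """Map the time since the last mk5daemon broadcast to a light color.

    warn_after and fail_after are hypothetical thresholds in seconds.
    """
    if seconds_since_last < warn_after:
        return "green"     # component is broadcasting normally
    if seconds_since_last < fail_after:
        return "yellow"    # component has been quiet for a while
    return "red"           # component has been silent for too long


print(heartbeat_color(2.0))   # green
print(heartbeat_color(90.0))  # red
```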

Quite a bit of information is available for each component - a pull-down menu accessible through the arrow in the header for each component type provides a list from which you can turn data outputs on and off. The available items reflect the data fields provided by mk5daemon, and not all of them are currently filled.

The menu also allows some "hardware" control - components can be reset, shut down, and rebooted. Controls that perform the same operations on individual components are available using the arrows associated with each.

These hardware controls should be used with caution!

Monitoring DiFX Jobs Using the Queue Browser

The Queue Browser organizes DiFX jobs under a three-level hierarchy with "Experiments" at the top level, underneath which are "Passes" that in turn contain individual "Jobs".

An Experiment is usually bound to a single data set (one or more scans) collected over a specific time span - the results of a single observing session for instance. It can contain any number of "Passes" (including zero).
A Pass is used to contain a single analysis of a subset of the data within an Experiment. Often Experiments contain a "Clock Pass" run on a few scans to generate the time delays for each involved antenna, and a "Production Pass" run on all scans with those time delays in place.
Within each Pass is a series of Jobs, each controlling the processing of at least one scan.
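
One way to picture the three-level hierarchy is as nested containers. The classes below are a hypothetical model written for this document, not the GUI's own data structures; the pass and job names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    scans: List[str]                                    # each Job processes at least one scan


@dataclass
class Pass:
    name: str                                           # e.g. "Clock Pass" or "Production Pass"
    jobs: List[Job] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    passes: List[Pass] = field(default_factory=list)   # may be empty

    def job_count(self):
        """Total number of Jobs across all Passes."""
        return sum(len(p.jobs) for p in self.passes)


clock = Pass("Clock Pass", [Job("clock_01", ["128-1703"])])
experiment = Experiment("example_experiment", [clock, Pass("Production Pass")])
print(experiment.job_count())  # 1
```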

Adding Existing Experiments to the Queue Browser

Creating a New Experiment

To create a new experiment for DiFX processing, all that is required is a .vex file and appropriate data. The GUI can be used to perform (or facilitate) the various steps required to set up an experiment based on instructions from the user. In short, these steps amount to:

Setting up a location to do the processing
Creating a .v2d file to go with the .vex file
Running vex2difx and calcif2 to create .input and .im files

The GUI tries to be as flexible as possible about this, although it has a "preferred" way of arranging things such that running DiFX processing on the created experiments is possible through the GUI as well.

To create a new experiment, select the "Experiment/Create New..." menu item in the Queue Browser.
IMAGE
This will bring up the "Create New Experiment" window:
IMAGE
The purpose of this window is to allow the user to tailor a new experiment to meet their needs. It creates a "working" directory for the new experiment, allows the specification of data sources, and puts all relevant DiFX files (.v2d, .input, etc.) in the working directory from which they can be run (either through the GUI or by hand). Experimentation with different GUI settings while creating an experiment is not dangerous as the original .vex file and data files are not moved or altered in any way. If you mess up, delete the experiment and try again.

Naming the New Experiment and Putting It Somewhere

Getting .vex File Content

For a DiFX Experiment to be created and run, it must begin with a .vex file. Vex files should be provided with observed data and imported into the GUI where they are parsed to determine the structure of jobs that will be created, data sources that are necessary, and other things.

Editing .vex File Content

Once you have obtained .vex data from some source, the data are displayed in the ".vex File Editor" panel. This panel provides a (rather rudimentary) text editor that can be used to edit the .vex data by hand. The final edited text is used in the .vex file assigned to your created experiment, a copy of which is put in your working directory.

For any changes to apply, you must click the "Parse Content" button.

Some care should be taken when directly editing .vex data, as it is trivial to corrupt the .vex to the point where it can't be used (the DiFX operational paradigm says that users should never need to do this). However, editing this content does not alter the original source .vex file, only the final .vex file associated with the experiment, so playing around with things is not permanently harmful.

Correlation Tuning Parameters

The Correlation Tuning Parameters section includes values that can be changed to adjust the quality of the correlation results, and/or the total time processing takes. Adjusting many of these values is something of an art in itself, and the details of what things do and what their "best" values should be are not covered here (some talks at DiFX Users Meetings have covered the subject - slides can be viewed here).

Each item has an associated "apply" check box. If this box is not checked, no instructions regarding the item will be put in the .v2d file and vex2difx will be allowed to pick its own defaults. Unless you know what you are doing, don't check the apply box - let vex2difx pick the values! The GUI has default values for all items but they are not based on anything - they are essentially placeholders. The default values that are picked by vex2difx are far better.
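
The effect of the "apply" check box can be illustrated with a short sketch: only items marked as applied are written out, and everything else is simply omitted so that vex2difx falls back to its own defaults. The code below is an assumption-laden illustration, not the GUI's actual .v2d writer. The `SETUP` block layout and the parameter names `tInt` and `nChan` are used as examples only and should be checked against the vex2difx documentation, and the values shown are placeholders.

```python
def setup_block(params, name="default"):
    """Build a SETUP-style block from a dict of name -> (value, apply_flag).

    Items whose apply flag is False are left out entirely, so no instruction
    about them reaches the file.
    """
    lines = [f"SETUP {name}", "{"]
    for key, (value, apply) in params.items():
        if apply:
            lines.append(f"    {key} = {value}")
    lines.append("}")
    return "\n".join(lines)


# Placeholder values: only tInt is "applied" here.
example = setup_block({"tInt": (1.0, True), "nChan": (16, False)})
print(example)
```

In this example the output contains a `tInt` line but no `nChan` line, which is the behavior the text recommends for any value you are unsure about.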

Stations and Data Sources

Each antenna involved in the observations described by the .vex data triggers the creation of a panel in the "Stations" section with the two-letter station/antenna code used as the panel title. Each station panel contains a number of sub-sections: at least one Data Stream; Antenna; Site; and Settings.

In most cases users will only be concerned with changing the Data Stream settings. This section tells DiFX where the data for a particular station/antenna can be found.

Because filling out the Data Source section can be tedious, the DiFX GUI provides ways of pre-defining all Data Source settings for a station/antenna based on known and/or likely locations. See Antenna Defaults in the Settings Window "Job Creation Settings" section.

Station Settings

The Settings section contains settings for Tone, Phase Calibration Interval and Delta Clock. The Delta Clock value is often gleaned by running a "Clock Pass" on a subset of the experiment's data (see some sort of explanation here).

The Many Ways to Select Scans and Stations

When new .vex data are selected, the GUI begins with the assumption that all observations described in the data will be included in the new experiment. There are a number of ways of adjusting which of the scans that make up the observations are ultimately used, and which stations are used in which scans, most of which are part of the Experiment Editor. Scan and station selection changes must be reflected in the .v2d and .vex files that are created as part of the new experiment, so selections are part of the experiment creation process. The Scan Selection section of the Experiment Editor can be used at any time to view the scans that will be included in the experiment when it is created as well as which stations are used for each.

Some of the scan and station selection controls can work at cross-purposes - effectively they provide more than one way to cause a scan or a station to be used. When a conflict occurs, the GUI will give the most recent command precedence (if, for instance, a command is given that a scan be included in the final experiment when a previous command eliminated the scan, the GUI will include the scan).
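
The "most recent command wins" rule can be sketched as replaying the commands in order and letting later ones overwrite earlier ones. This is a simplified model for illustration; the function name is hypothetical and the second scan name is invented.

```python
def resolve_scan_selection(all_scans, commands):
    """Return the scans left included after applying commands oldest first.

    commands is an ordered list of (scan_name, include) pairs.
    """
    included = {scan: True for scan in all_scans}   # every scan starts out included
    for scan, include in commands:
        included[scan] = include                    # later commands overwrite earlier ones
    return [scan for scan in all_scans if included[scan]]


scans = ["128-1703", "128-1710"]
# The scan is first eliminated, then explicitly included again.
history = [("128-1703", False), ("128-1703", True), ("128-1710", False)]
print(resolve_scan_selection(scans, history))  # ['128-1703']
```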

Eliminating Stations in the "Source" `.vex` Data

Stations can be eliminated from individual scans by putting a "-1" in the "code" column within the appropriate "scan" section in the "source" .vex data. When the GUI encounters the "-1", it will remove the station from the scan. This duplicates hardware correlator behavior. Starting with the .vex file snippet below, the final .vex file will not include the station "Bd" because of the "-1" in the final column.

scan 128-1703;
    start = 2014y128d17h03m34s;
    mode = GEOSX8N.8F;
    source = 1846+322;
    station = Bd :    0 sec :     20 sec :     0 ft : 1A : &n : -1;
    station = Ho :    0 sec :     20 sec :     0 ft : 1A : &n : 1;
    station = Kk :    0 sec :     20 sec :     0 ft : 1A : &cw : 1;
    station = Ny :    0 sec :     20 sec :     0 ft : 1A : &ccw : 1;
    station = Ts :    0 sec :     20 sec :     0 ft : 1A : &cw : 1;
endscan;

For changes of this type to be recognized, the .vex file must be parsed by the Experiment Editor. You can edit the .vex file in the .vex File Editor or in your favorite text editor.

If you do not want the GUI to pay attention to the "-1" code in this way, un-check the Eliminate Stations With "-1" Code box in the Settings menu.
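
As a rough illustration of the "-1" rule, the sketch below drops any station line whose final colon-separated field is "-1" from a scan block like the one shown above (shortened to three stations). It is a simplified stand-in for the GUI's .vex parser, with hypothetical function names, and the `honor_minus_one` flag plays the role of the Settings check box.

```python
SCAN_TEXT = """\
scan 128-1703;
    start = 2014y128d17h03m34s;
    mode = GEOSX8N.8F;
    source = 1846+322;
    station = Bd :    0 sec :     20 sec :     0 ft : 1A : &n : -1;
    station = Ho :    0 sec :     20 sec :     0 ft : 1A : &n : 1;
    station = Kk :    0 sec :     20 sec :     0 ft : 1A : &cw : 1;
endscan;
"""


def kept_station_lines(scan_text, honor_minus_one=True):
    """Return the scan text without station lines that end in a -1 code."""
    kept = []
    for line in scan_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("station"):
            fields = [f.strip() for f in stripped.rstrip(";").split(":")]
            if honor_minus_one and fields[-1] == "-1":
                continue                      # drop this station from the scan
        kept.append(line)
    return "\n".join(kept)


def station_codes(scan_text):
    """List the station codes that appear in a scan block."""
    return [line.split("=")[1].split(":")[0].strip()
            for line in scan_text.splitlines()
            if line.strip().startswith("station")]


print(station_codes(kept_station_lines(SCAN_TEXT)))  # ['Ho', 'Kk']
```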

Eliminating Stations in the "Stations" Section

The "Stations" section is primarily set up to change parameters related to each antenna involved in an experiment, and to select the data sources associated with them (see above). However it also includes a check box that can be used to completely remove each station from the experiment. Any scans that no longer have enough stations to form a baseline (i.e. fewer than two) will be eliminated.
IMAGE
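
The consequence of removing a station can be sketched as follows: the station is taken out of every scan, and any scan left with fewer than two stations is dropped because it can no longer form a baseline. The function name and the second scan are hypothetical, and this is only a model of the rule stated above.

```python
def remove_station(scans, station):
    """Remove a station from all scans and drop scans that cannot form a baseline.

    scans maps a scan name to its list of station codes.
    """
    result = {}
    for name, stations in scans.items():
        remaining = [s for s in stations if s != station]
        if len(remaining) >= 2:            # a baseline needs two stations
            result[name] = remaining
    return result


scans = {
    "128-1703": ["Bd", "Ho", "Kk"],
    "128-1710": ["Bd", "Ny"],              # invented scan with only two stations
}
print(remove_station(scans, "Bd"))  # {'128-1703': ['Ho', 'Kk']}
```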

The "Scan/Station Timeline" Editor

The "Scan/Station Timeline" section provides a visual map of all scans and the stations used in them in a timeline. It allows the selection/deselection of individual stations within scans or the inclusion of data from different stations based on time.
IMAGE
Somewhat more complex explanation here.

Selecting by Source Using the "Sources" Editor

The "Sources" section shows all sources and the stations used to observe them. It allows sources to be selected and deselected, and stations to be selectively used or eliminated from sources.
IMAGE

All sources included in the .vex file are listed. Boxes show which sources are observed with which stations.
Hover over the name of a source to produce a tooltip that includes information about the source as well as the names of the scans used to observe it and the stations used for each of those scans (stations marked in red have been eliminated).
Hover over the boxes to produce a tooltip that includes which scans use the associated station on the associated source. A scan that appears in red text has been eliminated - either explicitly or because it lacks sufficient stations to form a baseline.
Use check boxes to add or eliminate a source. When a source is removed, all scans associated with it are eliminated from the final experiment. When a source is added, all scans associated with it are put into the final experiment (assuming they have sufficient stations to form a baseline).
Click on the boxes to add or eliminate a station from the observations of a given source. Scans will be added or eliminated from the final experiment based on whether changes give them enough stations to form a baseline.

The Sources section is something of a work in progress, and not something anyone uses at the USNO, so it is a little confused at this point as to what it wants to be. It was developed originally with the idea that astronomical observers would be interested in sources (in geodesy they are uninteresting). Suggestions are welcome.

Selecting Specific Scans With the "Scan Selection" Editor

At any time in the scan/station selection process, the "Scan Selection" editor will show which scans will be included in the final experiment (included scans are green, scans not included are gray). It allows the user to make selections on a scan-by-scan basis by clicking on individual scans, or by turning all scans on or off using the "Select All" and "Clear All" buttons.
IMAGE OF SCAN SELECTION PANEL WITH LABELS HERE
The Scan Selection Editor includes a "Time Limits" plot that shows all scans from the original .vex file as a time sequence (again, scans in green are included, those in gray are not included). The mouse wheel can be used to "zoom in" on different time limits, and the red and blue triangles can be grabbed and dragged to limit the final experiment in time. This widget is somewhat redundant with the Scan/Station Timeline Editor, but it may be useful to someone.
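
Limiting an experiment in time comes down to comparing scan times against two limits. The sketch below parses time stamps in the form used by the .vex snippet earlier in this document (for example 2014y128d17h03m34s) and keeps scans whose start time falls inside the limits. It is a simplification: the helper names are hypothetical, the second scan is invented, and only scan start times are considered, not scan durations.

```python
from datetime import datetime

VEX_TIME_FORMAT = "%Yy%jd%Hh%Mm%Ss"   # year, day of year, hours, minutes, seconds


def parse_vex_time(text):
    """Convert a time stamp such as 2014y128d17h03m34s to a datetime."""
    return datetime.strptime(text, VEX_TIME_FORMAT)


def scans_within(scans, start, stop):
    """Return names of scans whose start time lies between start and stop.

    scans maps a scan name to its start time string.
    """
    lo, hi = parse_vex_time(start), parse_vex_time(stop)
    return [name for name, t in scans.items() if lo <= parse_vex_time(t) <= hi]


scan_starts = {
    "128-1703": "2014y128d17h03m34s",
    "128-1810": "2014y128d18h10m00s",   # invented scan for the example
}
print(scans_within(scan_starts, "2014y128d17h00m00s", "2014y128d17h30m00s"))
# ['128-1703']
```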

Running DiFX Jobs

Running an Individual Job

Picking Data Sources and Processors

Running Jobs With the Scheduler

Using the Real-Time Monitor

If you wish to monitor running jobs through the GUI's real-time plotting capabilities, the DiFX application monitor_server needs to be running. This program provides a TCP server at which real-time data from running DiFX processes can be obtained. The absence of this process is not usually a problem - if you request the real-time plotting it should be started automatically. However if you find that real-time plotting isn't working, this could be a cause. For details, see the Real Time Job Monitor Documentation. Note that at this time real-time monitoring is best considered "experimental".
