Brian Jackel 2006-01-18

Riometer Data Flow

Introduction

Figure 1: Generic data flow

The NORSTAR riometer data acquisition and telemetry process consists of six stages illustrated in Figure 1 and listed below:

  1. remote acquisition
  2. remote transmission
  3. central reception
  4. central validation
  5. reconciliation
  6. revalidation

This process was developed to satisfy several kinds of constraints. First, it must operate effectively over a low-bandwidth high-latency satellite internet link. Second, it needs to work with multiple legacy data streams. Third, it should require a minimum of technical expertise and programming effort.

Remote data acquisition

A riometer unit at each remote field site measures the cosmic "noise" power at 30 MHz. Signal strength is converted non-linearly to a voltage (0V to 7.5V), which is digitized at 60 Hz into 12-bit values spanning a -10V to 10V range. These values are read by a computer over an RS-232 serial interface and time-stamped using the site GPS unit as a primary NTP reference.
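For reference, raw A/D codes can be mapped back to input volts with a simple linear rescaling. A minimal sketch, assuming the digitizer maps its -10V to 10V span uniformly onto codes 0 through 4095 (the choice of 4096 as divisor is an assumption, as it depends on the converter):

/* Convert a raw 12-bit A/D code to input volts, assuming a uniform
 * mapping of the 20V input span onto codes 0..4095. */
double a2d_to_volts(unsigned int code)
{
  return 20.0 * code / 4096.0 - 10.0;
}

Under this mapping a code of 2048 corresponds to roughly 0V, and one count corresponds to about 4.9 millivolts.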

Values are grouped into 1-second intervals from which the average and range are calculated. Every 60 seconds a record is written to disk; note that record boundaries are not necessarily aligned to minute boundaries. Each record should contain a header (20 bytes) and 60 data points (6 bytes each), for a total of 380 bytes per record.

#include <stdint.h>

struct data_header {
  char  stream_id[4];    /* should always be 'XRIO' */
  char  site_id[4];      /* eg. 'gill', 'RANK' (safest to fold case) */
  char  version;         /* currently '0' */
  char  sample_rate;     /* currently 1 (samples/second) */
  char  spare1[2];
  uint32_t  timestamp;   /* first sample in unix epoch (seconds since January 01, 1970) */
  uint16_t  datapoints;  /* currently 60, but always check */
  char  spare2[2];
} __attribute__((__packed__));

struct data_point {
  uint16_t  a2d_average; /* should be 0 to 4095 */
  uint16_t  a2d_range;   /* should be 0 to 4095 */
  uint16_t  npoints;     /* usually 60, sometimes 59 or 61, flag anything else */
} __attribute__((__packed__));
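Given these definitions, reading a record back from an hourly file is straightforward. The following sketch assumes the reader runs on the same architecture as the writer (the format description does not specify a byte order); read_record is a hypothetical helper name:

#include <stdio.h>
#include <string.h>

/* Read the next record from an open hourly file into hdr and a
 * caller-supplied array of (at least) 60 data points. Returns 1 on
 * success, 0 on end-of-file or an unexpected record layout. */
int read_record(FILE *fp, struct data_header *hdr, struct data_point *pts)
{
  if (fread(hdr, sizeof(*hdr), 1, fp) != 1)
    return 0;
  if (memcmp(hdr->stream_id, "XRIO", 4) != 0)
    return 0;
  if (hdr->datapoints != 60)      /* "currently 60, but always check" */
    return 0;
  return fread(pts, sizeof(*pts), 60, fp) == 60;
}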

Records are appended to hourly files and grouped together in a directory tree organized by date and site. The last path element is unnecessary at each remote site, but is very useful when aggregating data from multiple sites on a central server.

[monitor@calgu riometer]$ ls /data/riometer/2006/01/17/calg_xrio/
20060117_00_calg_xrio.dat  20060117_08_calg_xrio.dat  20060117_16_calg_xrio.dat
20060117_01_calg_xrio.dat  20060117_09_calg_xrio.dat  20060117_17_calg_xrio.dat
20060117_02_calg_xrio.dat  20060117_10_calg_xrio.dat  20060117_18_calg_xrio.dat
20060117_03_calg_xrio.dat  20060117_11_calg_xrio.dat  20060117_19_calg_xrio.dat
20060117_04_calg_xrio.dat  20060117_12_calg_xrio.dat  20060117_20_calg_xrio.dat
20060117_05_calg_xrio.dat  20060117_13_calg_xrio.dat  20060117_21_calg_xrio.dat
20060117_06_calg_xrio.dat  20060117_14_calg_xrio.dat  20060117_22_calg_xrio.dat
20060117_07_calg_xrio.dat  20060117_15_calg_xrio.dat  20060117_23_calg_xrio.dat
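The path for a given site and hour can be generated directly from the timestamp. A sketch, with the /data/riometer base directory taken from the listing above and the helper name hourly_path purely illustrative:

#include <stdio.h>
#include <time.h>

/* Build the hourly data file path for a site id and UTC time,
 * following the layout shown above:
 *   /data/riometer/YYYY/MM/DD/site_xrio/YYYYMMDD_HH_site_xrio.dat */
void hourly_path(char *buf, size_t len, const char *site, time_t t)
{
  struct tm tm;
  gmtime_r(&t, &tm);
  snprintf(buf, len,
    "/data/riometer/%04d/%02d/%02d/%s_xrio/%04d%02d%02d_%02d_%s_xrio.dat",
    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, site,
    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, tm.tm_hour, site);
}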

Remote data transmission

At each remote site, a watcher routine makes a copy of each new data record and transmits it as a single UDP packet to port 24985 on a telemetry server. (UDP is a connectionless, low-overhead, best-effort protocol.) Total bandwidth is roughly 51 bits/second (380 bytes every 60 seconds).
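A minimal sketch of the transmission step, assuming POSIX sockets; the destination address used here is a placeholder, and send_record is a hypothetical name:

#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Transmit one 380-byte record as a single UDP datagram. UDP is
 * fire-and-forget: a lost datagram is not retried here, which is why
 * the reconciliation stage described later is needed. */
int send_record(const void *record, size_t len)
{
  struct sockaddr_in dest;
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  ssize_t sent;

  if (fd < 0)
    return -1;
  memset(&dest, 0, sizeof(dest));
  dest.sin_family = AF_INET;
  dest.sin_port = htons(24985);                    /* telemetry server port */
  inet_pton(AF_INET, "192.0.2.1", &dest.sin_addr); /* placeholder address */
  sent = sendto(fd, record, len, 0, (struct sockaddr *)&dest, sizeof(dest));
  close(fd);
  return sent == (ssize_t)len ? 0 : -1;
}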

Central data reception

A central server listens to UDP port 24985 and receives all incoming packets. There should be one packet from each of 12 sites every 60 seconds. These are appended to hourly data files named according to arrival time and destination port, and organized in a directory tree by arrival date and source IP address. A current tree of all received UDP packets is maintained on the central server. The following example shows one day from a site which transmits two different telemetry streams, on UDP ports 24985 (riometer) and 25000 (status monitor) respectively.

[bjackel@cgsm-rt udp]$ ls 2006/01/17/136.159.51.71/
ut00_24985.dat  ut04_24985.dat  ut08_24985.dat  ut12_24985.dat  ut16_24985.dat  ut20_24985.dat
ut00_24985.idx  ut04_24985.idx  ut08_24985.idx  ut12_24985.idx  ut16_24985.idx  ut20_24985.idx
ut00_25000.dat  ut04_25000.dat  ut08_25000.dat  ut12_25000.dat  ut16_25000.dat  ut20_25000.dat
ut00_25000.idx  ut04_25000.idx  ut08_25000.idx  ut12_25000.idx  ut16_25000.idx  ut20_25000.idx
ut01_24985.dat  ut05_24985.dat  ut09_24985.dat  ut13_24985.dat  ut17_24985.dat  ut21_24985.dat
ut01_24985.idx  ut05_24985.idx  ut09_24985.idx  ut13_24985.idx  ut17_24985.idx  ut21_24985.idx
ut01_25000.dat  ut05_25000.dat  ut09_25000.dat  ut13_25000.dat  ut17_25000.dat  ut21_25000.dat
ut01_25000.idx  ut05_25000.idx  ut09_25000.idx  ut13_25000.idx  ut17_25000.idx  ut21_25000.idx
ut02_24985.dat  ut06_24985.dat  ut10_24985.dat  ut14_24985.dat  ut18_24985.dat  ut22_24985.dat
ut02_24985.idx  ut06_24985.idx  ut10_24985.idx  ut14_24985.idx  ut18_24985.idx  ut22_24985.idx
ut02_25000.dat  ut06_25000.dat  ut10_25000.dat  ut14_25000.dat  ut18_25000.dat  ut22_25000.dat
ut02_25000.idx  ut06_25000.idx  ut10_25000.idx  ut14_25000.idx  ut18_25000.idx  ut22_25000.idx
ut03_24985.dat  ut07_24985.dat  ut11_24985.dat  ut15_24985.dat  ut19_24985.dat  ut23_24985.dat
ut03_24985.idx  ut07_24985.idx  ut11_24985.idx  ut15_24985.idx  ut19_24985.idx  ut23_24985.idx
ut03_25000.dat  ut07_25000.dat  ut11_25000.dat  ut15_25000.dat  ut19_25000.dat  ut23_25000.dat
ut03_25000.idx  ut07_25000.idx  ut11_25000.idx  ut15_25000.idx  ut19_25000.idx  ut23_25000.idx
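The reception side can be sketched in a few dozen lines, again assuming POSIX sockets. This version omits hourly rollover, directory creation, and error handling, and hard-codes the output file names for brevity; the index line it writes approximates the layout described next:

#include <stdio.h>
#include <time.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Much-simplified reception loop: append each datagram to a .dat file
 * and write one descriptive line per packet to the matching .idx file. */
void receive_loop(int port)
{
  struct sockaddr_in addr = { 0 }, src;
  socklen_t srclen;
  unsigned char buf[65536];
  char stamp[32];
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  FILE *dat = fopen("ut00_24985.dat", "ab");
  FILE *idx = fopen("ut00_24985.idx", "a");
  long offset = ftell(dat);

  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  addr.sin_port = htons(port);
  bind(fd, (struct sockaddr *)&addr, sizeof(addr));

  for (;;) {
    time_t now;
    ssize_t n;
    srclen = sizeof(src);
    n = recvfrom(fd, buf, sizeof(buf), 0, (struct sockaddr *)&src, &srclen);
    if (n <= 0)
      continue;
    fwrite(buf, 1, n, dat);               /* append packet to data file */
    fflush(dat);
    now = time(NULL);
    strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", gmtime(&now));
    fprintf(idx, "%s  %d  %s:%u  %zd  %ld  %02x %02x %02x %02x\n",
            stamp, port, inet_ntoa(src.sin_addr),
            (unsigned)ntohs(src.sin_port), n, offset,
            buf[0], buf[1], buf[2], buf[3]);
    fflush(idx);
    offset += n;
  }
}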

Each data file ("*.dat") has an accompanying index file ("*.idx") with one line per packet received. Each line contains information including the arrival time, source IP address, total size, offset within the data file, and the first four bytes.

[bjackel@cgsm-rt udp]$ head -n 10 2006/01/17/136.159.51.71/ut00_24985.idx
2006-01-17 00:00:06.421153   24985  136.159.51.71:32769      380            0  58 52 49 4f 
2006-01-17 00:01:06.902381   24985  136.159.51.71:32769      380          380  58 52 49 4f 
2006-01-17 00:02:06.375104   24985  136.159.51.71:32769      380          760  58 52 49 4f 
2006-01-17 00:03:06.814618   24985  136.159.51.71:32769      380         1140  58 52 49 4f 
2006-01-17 00:04:06.285830   24985  136.159.51.71:32769      380         1520  58 52 49 4f 
2006-01-17 00:05:06.766924   24985  136.159.51.71:32769      380         1900  58 52 49 4f 
2006-01-17 00:06:06.228161   24985  136.159.51.71:32769      380         2280  58 52 49 4f 
2006-01-17 00:07:06.689275   24985  136.159.51.71:32769      380         2660  58 52 49 4f 
2006-01-17 00:08:06.150479   24985  136.159.51.71:32769      380         3040  58 52 49 4f 
2006-01-17 00:09:06.611591   24985  136.159.51.71:32769      380         3420  58 52 49 4f 
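Each index line can be pulled apart with a simple scan. A sketch, inferring the column layout from the sample above (parse_index_line is a hypothetical name):

#include <stdio.h>

/* Split one index line into its fields: arrival date and time,
 * destination port, source address:port, packet size, and byte offset
 * within the data file. The four hex bytes are left unparsed here.
 * Returns 1 on success. Buffer sizes are the caller's responsibility
 * (date needs 11 bytes, clock 16, source 22). */
int parse_index_line(const char *line, char *date, char *clock,
                     int *port, char *source, long *size, long *offset)
{
  return sscanf(line, "%10s %15s %d %21s %ld %ld",
                date, clock, port, source, size, offset) == 6;
}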

Index files can be used for a variety of diagnostic tasks. The following example shows that only 1437 out of a possible 1440 records were received from the Dawson riometer on January 17 2006. The daily average loss rate due to these three dropped packets is just over 0.2%.

[bjackel@cgsm-rt udp]$ wc -l  2006/01/17/206.172.47.234/ut??_24985.idx
    60 2006/01/17/206.172.47.234/ut00_24985.idx
    60 2006/01/17/206.172.47.234/ut01_24985.idx
    60 2006/01/17/206.172.47.234/ut02_24985.idx
    60 2006/01/17/206.172.47.234/ut03_24985.idx
    60 2006/01/17/206.172.47.234/ut04_24985.idx
    60 2006/01/17/206.172.47.234/ut05_24985.idx
    60 2006/01/17/206.172.47.234/ut06_24985.idx
    60 2006/01/17/206.172.47.234/ut07_24985.idx
    60 2006/01/17/206.172.47.234/ut08_24985.idx
    60 2006/01/17/206.172.47.234/ut09_24985.idx
    60 2006/01/17/206.172.47.234/ut10_24985.idx
    60 2006/01/17/206.172.47.234/ut11_24985.idx
    60 2006/01/17/206.172.47.234/ut12_24985.idx
    60 2006/01/17/206.172.47.234/ut13_24985.idx
    60 2006/01/17/206.172.47.234/ut14_24985.idx
    60 2006/01/17/206.172.47.234/ut15_24985.idx
    59 2006/01/17/206.172.47.234/ut16_24985.idx
    60 2006/01/17/206.172.47.234/ut17_24985.idx
    59 2006/01/17/206.172.47.234/ut18_24985.idx
    60 2006/01/17/206.172.47.234/ut19_24985.idx
    60 2006/01/17/206.172.47.234/ut20_24985.idx
    60 2006/01/17/206.172.47.234/ut21_24985.idx
    59 2006/01/17/206.172.47.234/ut22_24985.idx
    60 2006/01/17/206.172.47.234/ut23_24985.idx
  1437 total

From the same site during the entire month of December 2005 there were 125 packets missing of a possible 44640, for an average loss rate of 0.28%. Peak loss rate during the worst day (2005/12/16) was just over 2%. Most of this was in fact due to server downtime rather than network problems.

[bjackel@cgsm-rt udp]$ for i in 2005/12/*/206.172.47.234
 > do echo -n $i
 > wc -l $i/ut??_24985.idx | grep total
 > done
2005/12/01/206.172.47.234  1426 total
2005/12/02/206.172.47.234  1438 total
2005/12/03/206.172.47.234  1436 total
2005/12/04/206.172.47.234  1439 total
2005/12/05/206.172.47.234  1438 total
2005/12/06/206.172.47.234  1418 total
2005/12/07/206.172.47.234  1437 total
2005/12/08/206.172.47.234  1435 total
2005/12/09/206.172.47.234  1439 total
2005/12/10/206.172.47.234  1440 total
2005/12/11/206.172.47.234  1439 total
2005/12/12/206.172.47.234  1439 total
2005/12/13/206.172.47.234  1437 total
2005/12/14/206.172.47.234  1430 total
2005/12/15/206.172.47.234  1436 total
2005/12/16/206.172.47.234  1411 total
2005/12/17/206.172.47.234  1438 total
2005/12/18/206.172.47.234  1440 total
2005/12/19/206.172.47.234  1439 total
2005/12/20/206.172.47.234  1439 total
2005/12/21/206.172.47.234  1439 total
2005/12/22/206.172.47.234  1438 total
2005/12/23/206.172.47.234  1440 total
2005/12/24/206.172.47.234  1439 total
2005/12/25/206.172.47.234  1439 total
2005/12/26/206.172.47.234  1431 total
2005/12/27/206.172.47.234  1440 total
2005/12/28/206.172.47.234  1438 total
2005/12/29/206.172.47.234  1439 total
2005/12/30/206.172.47.234  1438 total
2005/12/31/206.172.47.234  1440 total

Central Validation

The validation process periodically checks a set of pointers to the most recent hourly files to determine when new records have arrived. Any new data are passed through an "invalidation" stage which checks for fundamental deviations from the expected data format. These currently include:

stream id
bytes 0-3 must be characters (all uppercase) identifying the data stream (currently 'XRIO')
site id
bytes 4-7 must be characters (any case) identifying the remote field site (eg. 'gill')
version
byte 8 must be a single character identifying the packet version (currently '0')
data size
bytes 16-17 should be an unsigned integer corresponding to the total number of data points (currently 60)
data
a version '0' record must consist of a 20 byte header plus 6 bytes for each data point (currently 380 bytes total)
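A sketch of these checks in C, using the structure definitions from the acquisition section (record_is_valid is a hypothetical name):

#include <ctype.h>
#include <string.h>

/* Apply the invalidation checks listed above to a raw record of
 * record_bytes total length. Returns 0 on the first failed check. */
int record_is_valid(const struct data_header *hdr, size_t record_bytes)
{
  int i;

  for (i = 0; i < 4; i++)          /* stream id: all uppercase */
    if (!isupper((unsigned char)hdr->stream_id[i]))
      return 0;
  if (memcmp(hdr->stream_id, "XRIO", 4) != 0)
    return 0;
  for (i = 0; i < 4; i++)          /* site id: letters of any case */
    if (!isalpha((unsigned char)hdr->site_id[i]))
      return 0;
  if (hdr->version != '0')         /* packet version */
    return 0;
  /* data size: 20-byte header plus 6 bytes per data point */
  if (record_bytes != sizeof(struct data_header)
                    + (size_t)hdr->datapoints * sizeof(struct data_point))
    return 0;
  return 1;
}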

Figure 2: Real-time data from Dawson

Any record which successfully passes all invalidation checks is provisionally declared "valid". This does not guarantee that all possible errors have been found, nor does it imply that the record contents are correct or reliable. For example, a pre-amplifier failure at a field site can result in the remote computer simply logging small voltage fluctuations within ±50 millivolts of zero. Such records are certainly not scientifically useful for ionospheric studies, but will be passed by the current validation algorithm.

Validated records are appended to hourly files using the same naming convention as at the remote field sites. They are also used to generate various kinds of summary information. One example is given in Figure 2, which contains a link to the most recently produced daily plot of real-time data from the Dawson field site.

Reconciliation

If the remote transmission and central reception stages operate over a perfect network connection, then the validated data files on the central server will be identical copies of those acquired at the remote field site. In reality some packets will inevitably be lost due to network problems, resulting in data gaps on the central server. Using data from Dawson on January 17 2006 as an example: the remote field site originally transmitted 1440 packets for a total of 547200 bytes, of which three packets (1140 bytes) were never received by the central server. These three packets must somehow be retrieved so that the files on the central server can be made identical to the originals on the remote field site computer.

Ideally this would require a minimum of 3 × 380 = 1140 bytes, an increase of roughly 0.2% on top of the 547200 bytes originally transmitted. In practice some additional bytes may be required to ensure data integrity (eg. checksums). Such increases may be balanced by using compression to reduce the total amount of data transferred. This is illustrated by the following example, where a standard tool (xdelta) is used to determine the differences between files on the central server and a copy of the original files from the remote site. A total of 1241 bytes is needed to store all the information required to fill the three gaps in the central data files.

[bjackel@cgsm-rt tmp]$ for i in remote/*.dat
 > do i=`basename $i`
 > cmp central/$i remote/$i || xdelta delta central/$i remote/$i $i.xdelta
 > done

[bjackel@cgsm-rt tmp]$ ls -al *.xdelta
-rw-rw-r--  1 bjackel bjackel 395 Jan 18 02:01 20060117_16_daws_xrio.dat.xdelta
-rw-rw-r--  1 bjackel bjackel 430 Jan 18 02:01 20060117_18_daws_xrio.dat.xdelta
-rw-rw-r--  1 bjackel bjackel 416 Jan 18 02:01 20060117_22_daws_xrio.dat.xdelta

[bjackel@cgsm-rt tmp]$ xdelta info 20060117_16_daws_xrio.dat.xdelta 
xdelta: version 1.1.3 found patch version 1.1 in 20060117_16_daws_xrio.dat.xdelta (compressed)
xdelta: output name:   20060117_16_daws_xrio.dat
xdelta: output length: 22800
xdelta: output md5:    08133801d9d416126128a5c6ec0b7091
xdelta: patch from segments: 2
xdelta: MD5                               Length  Copies  Used   Seq?  Name
xdelta: 0e625120c73401619e98bd8f06a059e8  296     4       296    yes   (patch data)
xdelta: de72c1b6abc12912abdfc2ceac8acfa9  22420   6       22504  no    20060117_16_daws_xrio.dat

In the preceding example the comparison was carried out using copies of the "good" data. Of course, it would be preferable to identify gaps without first downloading the entire data set as a reference.

One way would be to start from the expected record cadence and infer which records are missing. This information could then be used to determine the range(s) of bytes required from the original file. Finally, some mechanism such as HTTP Range requests could be used to transfer those bytes of interest from the remote site to the central server, where they could be inserted into the file gap(s), as sketched below.
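A sketch of the cadence-based inference, assuming records arrive at an exact 60-second spacing (print_missing_ranges is a hypothetical name; gaps at the very start or end of a file would need extra handling):

#include <stdio.h>
#include <stdint.h>

#define RECORD_BYTES 380   /* 20-byte header + 60 x 6-byte data points */

/* Given the header timestamps of the records actually present in a
 * local hourly file (in order), print the byte ranges that the
 * missing records occupy in the complete remote file, in the form
 * used by HTTP Range requests. */
void print_missing_ranges(const uint32_t *stamps, int count)
{
  long remote_index = 0;   /* position of stamps[i-1] in the remote file */
  int i;

  for (i = 1; i < count; i++) {
    long skipped = (long)(stamps[i] - stamps[i-1]) / 60 - 1;
    if (skipped > 0)
      printf("Range: bytes=%ld-%ld\n",
             (remote_index + 1) * RECORD_BYTES,
             (remote_index + skipped + 1) * RECORD_BYTES - 1);
    remote_index += (skipped > 0 ? skipped : 0) + 1;
  }
}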

An equivalent result can be obtained without any knowledge of the record format or rate. If the validation stage uses the same naming convention as the original acquisition, then standard tools can be used to synchronize the two sets of files. For example, the rsync program automatically identifies the three of the 24 hourly files which differ and transfers the information required to remove the gaps on the central server. Due to block-size issues it actually transfers more bytes than strictly required (3240 rather than 1140 bytes), but with compression this is reduced considerably (2264 bytes including overhead).

[bjackel@cgsm-rt daws_xrio]$ rsync --verbose --archive --compress --stats \
  rsync://206.172.47.234/data/riometer/2006/01/17/daws_xrio/* .
opening tcp connection to 206.172.47.234 port 873
receiving file list ... 
24 files to consider
delta transmission enabled
 ...
Number of files: 24
Number of files transferred: 24
Total file size: 547200 bytes
Total transferred file size: 547200 bytes
Literal data: 3240 bytes
Matched data: 543960 bytes
File list size: 668
Total bytes sent: 5235
Total bytes received: 2264

sent 5235 bytes  received 2264 bytes  789.37 bytes/sec
total size is 547200  speedup is 72.97

It is interesting to consider the case where an entire day of data must be recovered. Possible causes include the remote satellite dish being knocked out of alignment, or a failure of the central server. Simply using rsync without compression would transfer the complete day of data (547200 bytes) plus an additional 1664 bytes of overhead (eg. filenames, checksums, etc.), a negligible increase of 0.3%. Using the --compress flag to enable gzip compression before transmission produces a remarkable reduction, to 110908 bytes including overhead. In this case, retrospective recovery would use only 20% of the bandwidth required for real-time transmission.

Revalidation

The reconciliation stage only guarantees that the central server has data files identical to those at the remote field site. It is possible, at least in theory, that the original files were corrupt in a way which would be caught by one of the invalidation checks. Consequently, the validation process should be re-applied to any files modified during reconciliation.