Drifter Buoy Data and awk (or gawk)

Space-time datasets of drifter buoys in the world's oceans are available from: www.meds-sdmm.dfo-mpo.gc.ca/isdm-gdsi/drib-bder/svp-vcs/index-eng.asp. You'll probably want to select, download and use the 'raw' data. The download comes as a zipped CSV file. Some of these files are quite large.

The 'raw' data include a list of contents in the first 20 lines of the file:

Col 1  	Platform identifier (ARGOS #)
Col 2 	EXP$ - The originator's experiment number
Col 3 	WMO$ - WMO platform identifier number
Col 4 	Position year/month/day hour:minute (UTC)
Col 5 	Latitude of observation (+ve North)
Col 6 	Longitude of observation (+/- 180 deg +ve West of Greenwich)
Col 7 	QIDX - Position accuracy
Col 8 	Observation year/month/day hour:minute (UTC)
Col 9 	BATT - Battery voltage
Col 10 	DRG# - Drogue number
Col 11 	SSTP - Sea surface temperature (deg. C)
Col 12 	CNDC - Conductivity
Col 13 	PSAL - Salinity
Col 14 	ATMS - Atmospheric pressure
Col 15 	WDIR - Wind direction (degrees from true north)
Col 16 	WS8K - Wind speed at 8 kHz sample
Col 17 	WS2K - Wind speed at 2 kHz sample
Note: Missing value indicated by 999.9999
Followed by the data in comma separated columns.
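
Before relying on the 20-line header count, it may be worth checking it against your own download. The following is a minimal sketch, not part of the original recipe. It assumes that every data row carries all 17 comma-separated fields and that no header line does. The file name 'header_check.csv' and its contents are invented stand-ins so the example runs on its own; with a real download you would point awk at your own file instead.

```shell
# Stand-in file: 20 placeholder header lines plus one invented data row
# (the values are made up purely to exercise the command).
{
  i=1
  while [ "$i" -le 20 ]; do echo "Col $i placeholder header text"; i=$((i + 1)); done
  echo "12345,1,44001,2005/01/02 03:04,45.1234,123.4567,1,2005/01/02 03:04,12.5,1,10.5,999.9999,999.9999,999.9999,999.9999,999.9999,999.9999"
} > header_check.csv

# Report the line number of the first line with exactly 17 comma-separated fields.
first_data_line=$(awk -F',' 'NF == 17 { print NR; exit }' header_check.csv)
echo "first data line: $first_data_line"
```

If the number reported is not 21, the header in your file is not 20 lines long, and any script that skips a fixed number of lines should be adjusted to match.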

One might wish to extract only some variables for use in other processing. For instance, one might want only the ID, LAT, LON, and TIME; that is, columns 1, 5, 6, and 4 from the file. (NB the longitudes are positive west of the prime meridian, the opposite of normal usage.) You could extract those data and fix the longitudes in a spreadsheet program, provided the data set is small enough for the program to read, but this is an easier job for 'awk'.

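With an awk script like this:

BEGIN { FS = ","      # input fields are comma separated
        OFS = ", " }  # output fields are separated by a comma and a space
# Skip the 20 header lines, then flip the longitude sign so that east is positive.
# sprintf keeps four decimals (assumed from the missing-value code 999.9999);
# a plain print of the number would round it to six significant digits.
NR > 20 { lon = sprintf("%.4f", -$6)
          print $1, $5, lon, $4 }
in a file called 'script.awk', and the data in a file 'data.csv', the shell command:
awk -f script.awk < data.csv > extract.csv
will put the data you want in a file called 'extract.csv'.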
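
One side effect of negating every longitude is that a missing longitude (999.9999) also changes sign and no longer matches the documented missing-value code. The following is a sketch of one possible variant that passes missing values through untouched. Several things here are assumptions rather than facts about the dataset: that a missing longitude appears in column 6 as 999.9999, that four decimal places match the precision in the file, and the names 'sample.csv' and 'skip', which are hypothetical. The two data rows are invented purely so the example can be run as-is.

```shell
# Stand-in for data.csv: 20 placeholder header lines, then two invented
# data rows. The second row has a missing longitude (999.9999).
{
  i=1
  while [ "$i" -le 20 ]; do echo "header line $i"; i=$((i + 1)); done
  echo "12345,1,44001,2005/01/02 03:04,45.1234,123.4567,1,2005/01/02 03:04,12.5,1,10.5,999.9999,999.9999,999.9999,999.9999,999.9999,999.9999"
  echo "12345,1,44001,2005/01/02 09:04,45.2000,999.9999,1,2005/01/02 09:04,12.5,1,10.6,999.9999,999.9999,999.9999,999.9999,999.9999,999.9999"
} > sample.csv

# skip = number of header lines to ignore (hypothetical variable name).
awk -v skip=20 '
BEGIN { FS = ","; OFS = ", " }
NR > skip {
    if ($6 + 0 == 999.9999)
        lon = $6                      # missing value: leave unchanged
    else
        lon = sprintf("%.4f", -$6)    # flip sign so that east is positive
    print $1, $5, lon, $4
}' sample.csv > extract.csv

cat extract.csv
```

Passing the header length in with -v keeps the script usable if a different download turns out to have a longer or shorter header.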