Chapter 2: DATA SET BASICS

Ch2 Sec1. OVERVIEW

Ferret accepts input data from both ASCII and binary files and recognizes two standardized, self-describing data formats: NetCDF and TMAP. Network Common Data Format (NetCDF) is the suggested method of data storage.

SET DATA_SET or just SET DATA specifies a data set for access. ASCII and binary files can be read using SET DATA/EZ (also known as "FILE"). To unambiguously specify the format of a data set, include the extension .cdf or .des in its name, or use the qualifier /FORMAT=CDF.

To examine what each data set consists of (variables, grids, etc.) after specifying them with SET DATA, use SHOW DATA. This command displays the variables in the data set and over what geographical and time ranges they are defined.

Here is an example of Ferret's output:

    yes? SET DATA coads_climatology
    yes? SHOW DATA
         currently SET data sets:
        1> /home/e1/tmap/fer_dsets/descr/coads_climatology.des  (default)
     name     title                         I         J         K        L
     SST      SEA SURFACE TEMPERATURE      1:180     1:90      1:1       1:12
     AIRT     AIR TEMPERATURE              1:180     1:90      1:1       1:12
     SPEH     SPECIFIC HUMIDITY            1:180     1:90      1:1       1:12
     WSPD     WIND SPEED                   1:180     1:90      1:1       1:12
     UWND     ZONAL WIND                   1:180     1:90      1:1       1:12
     VWND     MERIDIONAL WIND              1:180     1:90      1:1       1:12
     SLP      SEA LEVEL PRESSURE           1:180     1:90      1:1       1:12

If multiple data sets have been requested in a single Ferret session, the last requested will be the default data set. To specify other data sets, use the name of the data set or the number of the set as given by the SHOW DATA statement. For example:

yes? LIST/D=2  temp

will list the data for the variable "temp" in data set number 2 as displayed by SHOW DATA/BRIEF, while

yes? LIST temp[D=levitus_climatology] - temp[D=coads_climatology]

will list the differences between the variable "temp" in data set "levitus_climatology" and data set "coads_climatology."

If a filename begins with a number, Ferret does not recognize it, but the file may be specified using its Unix pathname, e.g.

yes? use "./123"

or

yes? file/var=a "./45N_180W.dat"



Ch2 Sec2. NETCDF DATA

The Network Common Data Format (NetCDF) is an interface to a library of data access routines for storing and retrieving scientific data. NetCDF allows the creation of data sets which are self-describing and platform-independent. NetCDF was created under contract with the Division of Atmospheric Sciences of the National Science Foundation and is available from the Unidata Program Center in Boulder, Colorado (unidata.ucar.edu).

See the chapter "Converting Data to NetCDF" (p. 195), for a complete description of how to create NetCDF data sets or how to convert existing data sets into NetCDF.

To output a variable in NetCDF, simply use:

yes?  LIST/FORMAT=CDF variable_name

LIST/FORMAT=CDF (alias SAVE) can also be used with abstract variables:

yes? SAVE/FILE=example.cdf/I=1:100 sin(I/100)

This will create a file named example.cdf.

The current region and data sets determine the variable names in the saved file and the range over which they are saved. Saved data can then be accessed as follows:

yes? USE example

(USE is an alias for SET DATA/FORMAT=CDF)

If a filename is not specified, Ferret will generate one. (See command SET LIST/FILE in the Commands Reference section, p. 299). An example of converting TMAP-formatted data to NetCDF goes as follows:

yes? SET DATA coads_climatology
yes? SAVE/L=1 sst,airt,uwnd,vwnd

These commands will save sst, airt, uwnd, and vwnd at the first time step over their entire regions to a NetCDF file named by Ferret.

One advantage of using NetCDF is that users on a different system (e.g., VMS instead of Unix) with different software (e.g., an analysis tool other than Ferret) can share data easily without substantial conversion work. NetCDF files are self-describing; with a simple command the size, shape and description of all variables, grids and axes can be seen.

With Ferret version 5.1, the internal functioning of netCDF reads has been changed when "strides" are involved. Suppose that CDFVAR represents a variable from a NetCDF file. In version 5.0 and earlier, the command PLOT CDFVAR[L=1:1000:10] would have read the entire array of 1000 points from the file; Ferret's internal logic would then have subsampled every 10th point from the resulting array in a manner that was consistent for NetCDF variables, ASCII variables, user-defined variables, etc. In V5.1, strides applied to netCDF variables are given special treatment: subsampling is done by the netCDF library. The primary benefit of this is to make network access to remote data sets via DODS more efficient. A remote satellite image of size, say, 1000x1000 points x 8 bit depth (8 megabits, i.e. 1 megabyte) can efficiently be previewed using

SHADE DODS_VAR[i=1:1000:10,j=1:1000:10]
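
The index arithmetic behind a strided request can be checked outside Ferret. The Python sketch below is only an illustration (the helper name strided_indices is hypothetical and not part of Ferret); it lists the 1-based indices that a lo:hi:stride range selects and counts how many points the SHADE example above would request.

```python
def strided_indices(lo, hi, stride):
    """Return the 1-based indices selected by a lo:hi:stride range (hi inclusive)."""
    return list(range(lo, hi + 1, stride))

# L=1:1000:10 selects every 10th point, starting from the first.
picked = strided_indices(1, 1000, 10)
print(len(picked), picked[0], picked[-1])

# i=1:1000:10, j=1:1000:10 on a 1000x1000 image
points_requested = len(picked) ** 2
points_in_file = 1000 * 1000
print(points_requested, points_in_file // points_requested)
```

With a stride of 10 along both axes, only one point in a hundred has to cross the network.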

If a grid or axis from a netCDF file is used in the definition of a LET-defined variable (e.g. LET my_X = X[g=sst[D=coads_climatology]]), that variable definition will be invalidated when the data set is canceled (CANCEL DATA coads_climatology, in the preceding example). There is a single exception to this behavior: netCDF files such as climatological_axes.cdf, which define grids or axes that are not actually used by any variables. These grids and axes will remain defined even after the data set itself has been canceled. They may be deleted with explicit use of CANCEL GRID or CANCEL AXIS.


Ch2 Sec2.1. Multi-file NetCDF data sets

Ferret supports collections of NetCDF files that are regarded as a single NetCDF data set. Such data sets are referred to as "MC" (multi CDF) data sets.  They are particularly useful to manage the outputs of numerical models.  MC data sets use a descriptor file, in the style of TMAP-formatted data sets. The data set is referred to inside Ferret by the name of this descriptor file.

A collection of NetCDF files is suitable to form a multi-file data set if

1)    The files are connected through their time axis—each file represents one or more time snapshots of the variables it contains.

2)    Each file is self-documenting with respect to the time axis of the variables—even if the time axis represents only a single point. (All of the time axes must be identically encoded with respect to units and date of the time origin.)

3)    All non-time-dependent variables in the data set must be contained in the first file of the data set (or those variables will not appear in the merged, MC, data set).

A typical MC descriptor file may be found in the chapter "Converting Data to NetCDF", in the section "Creating a multi-NetCDF data set" (p. 214).


Ch2 Sec2.2. Non-standard NetCDF data sets

As discussed in the chapter "Converting Data to NetCDF" (p. 195), Ferret expects netCDF files to adhere to the COARDS conventions (http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html). If the files do not adhere to the COARDS conventions, Ferret will still attempt to access them. Often, the user can use Ferret controls for regridding, reshaping, and otherwise transforming data to recover the intended file contents.

Here are a few common ways in which NetCDF files may deviate from the COARDS standard and how one may cope with those situations in Ferret.

In the COARDS conventions an axis (a.k.a. "coordinate variable") must have monotonically-increasing coordinate values. If the coordinates are disordered or repeating in a netCDF file, then Ferret will present the coordinates to the user (in SHOW DATA) as a dependent variable, whose name is the axis name, and it will substitute an axis of the index values 1, 2, 3, ... Note that Ferret will apply this same behavior when files have long irregular axis definitions that exceed Ferret's axis memory capacity.

If the coordinates of an axis are monotonically decreasing, instead of increasing, Ferret will transparently reverse both the axis coordinates and the dependent variables that are defined upon that axis. Note that if Ferret writes a reverse-ordered variable to a new netCDF file (with the SAVE command), the coordinates and data in the output file will be in monotonically increasing coordinate order—reversed from the input file.
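
The two cases described above (disordered versus monotonically decreasing coordinates) can be pictured with the following Python sketch. It only mimics the behavior described in the text; the function name and the form of its return value are invented for illustration and say nothing about how Ferret implements this internally.

```python
def normalize_axis(coords, values):
    """Mimic the documented handling of a 1-D coordinate variable.

    Returns (axis, values, note). Hypothetical helper, not Ferret code.
    """
    pairs = list(zip(coords, coords[1:]))
    if all(a < b for a, b in pairs):
        return list(coords), list(values), "increasing: used as is"
    if all(a > b for a, b in pairs):
        # Reverse coordinates and data together so they stay aligned.
        return list(coords)[::-1], list(values)[::-1], "decreasing: reversed"
    # Disordered or repeating: fall back to an index axis 1, 2, 3, ...
    index_axis = list(range(1, len(coords) + 1))
    return index_axis, list(values), "disordered: index axis substituted"

print(normalize_axis([30, 20, 10], [1.0, 2.0, 3.0]))
print(normalize_axis([10, 10, 20], [1.0, 2.0, 3.0]))
```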

If the values of a dependent variable are reversed, but there is no associated coordinate axis, then attach a minus sign to the corresponding axis orientation in the USE/ORDER= qualifier to designate that the variable(s) should be reversed along the corresponding axis. (Feature not yet implemented as of 5/5/99)

The COARDS standard specifies that variable names should begin with a letter and be composed of letters, digits, and underscores. In files where the variable names contain other characters, references to those variable names in Ferret must be enclosed in single quotes.
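
As an illustration of the naming rule, the Python sketch below tests a name against the pattern "a letter followed by letters, digits, or underscores" and wraps it in single quotes otherwise. The function name is hypothetical, and the sketch assumes that any name failing the pattern needs quoting.

```python
import re

# COARDS-style name: a letter followed by letters, digits, or underscores.
COARDS_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def ferret_reference(name):
    """Return the name as it would be typed, quoting it only when needed."""
    if COARDS_NAME.fullmatch(name):
        return name
    return "'" + name + "'"

for name in ["sst", "wind-speed", "2m_temp"]:
    print(ferret_reference(name))
```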

The COARDS standard specifies that if any or all of the dimensions of a variable have the interpretations of "date or time" (a.k.a. "T"), "height or depth" (a.k.a. "Z"), "latitude" (a.k.a. "Y"), or "longitude" (a.k.a. "X") then those dimensions should appear in the relative order T, then Z, then Y, then X in the CDL definition corresponding to the file. In files where the axis ordering has been permuted, the command qualifier USE/ORDER= (Command Reference, p. 292) allows the user to inform Ferret of the correct permutation of coordinates. Note that if Ferret writes a permuted variable to a new netCDF file (with the SAVE command), the coordinates and data in the output file will be in standard X-Y-Z-T ordering (as indicated in the user's /ORDER specification), permuted from the original file ordering. See the Command Reference (p. 243) for a complete description of the ORDER qualifier.


Ch2 Sec3. TMAP-FORMATTED DATA

As of Ferret version 2.30, NetCDF is the suggested format for data storage (see the chapter "Converting Data to NetCDF," p. 195). This section describing TMAP information is included only for users who already work with data in TMAP format.

To access TMAP-formatted data sets use

SET DATA_SET TMAP_set1, TMAP_set2, ...

TMAP_setn must be the name of a descriptor file for a data set that is in TMAP "GT" (grids-at-timesteps) or "TS" (time series) format. ("Ferret" format and "TMAP" format are synonyms.)

If the directory portion of the filename is omitted the environment variable FER_DESCR will be used to provide a list of directories to search. The order of directories in FER_DESCR determines the order of directory searches. If the extension is omitted a default of ".des" will be assumed (if the filename has more than one period, the extension must be given explicitly).

Descriptors

For every TMAP-formatted data set there is a descriptor file containing summary information about the contents of the data set. This includes variable names, units, grids, and coordinates. When the command SET DATA_SET is given to Ferret pointing to a GT-formatted or TS-formatted data set, it is the name of the descriptor file that must be specified.


Ch2 Sec4. BINARY DATA

Ferret can read binary data files that are formatted with and without FORTRAN record length headers (binary files without FORTRAN record length formatting are also known as "stream" files).


Ch2 Sec4.1. FORTRAN-structured binary files

Files containing record length information are created by FORTRAN programs using the  ACCESS="SEQUENTIAL" (the FORTRAN default) mode of file creation and also by Ferret using LIST/FORMAT=unf. Files that contain FORTRAN record length headers must have all data aligned on a 4-byte boundary. Suppose "rrrr" represents 4 bytes of record length information and "dddd" represents a 4-byte data value. Then FORTRAN-structured files are organized in one of the following two ways:


Ch2 Sec4.2. Records of uniform length

A FORTRAN-structured file with records of uniform length (3 single-precision floating point data values per record in this figure) looks like this:

rrrr dddd dddd dddd rrrr ...

FORTRAN code that creates a data file of this type might look something like this (sequential access is the default and need not be specified in the OPEN statement):

REAL VAR1(10), VAR2(10), VAR3(10)
...
OPEN(UNIT=20,FORM='UNFORMATTED',ACCESS='SEQUENTIAL',FILE='MYFILE.DAT')
...
DO 10 I=1,10
    WRITE (20) VAR1(I), VAR2(I), VAR3(I)
10 CONTINUE
...

To access data from this file, use

yes? SET DATA/EZ/FORMAT=UNF/VAR=var1,var2,var3/COL=3  myfile.dat    or,
yes? FILE/FORMAT=UNF/VAR=var1,var2,var3/COLUMNS=3  myfile.dat

This is very similar to accessing ASCII data with the addition of the /FORMAT=unf qualifier. The /COLUMNS= qualifier tells Ferret the number of data values per record. Although optional in the above example, this qualifier is required if the number of data values per record is greater than the number of variables being read (examples follow in section "ASCII Data").
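
To make the "rrrr dddd dddd dddd rrrr" layout concrete, the following Python sketch writes and re-reads such records in memory. It assumes 4-byte little-endian record markers and REAL*4 data; the marker size and byte order actually depend on the compiler and machine that wrote the file, and the function names are illustrative only.

```python
import struct

def pack_record(values):
    """Pack REAL*4 values as one sequential record: length, data, length."""
    payload = struct.pack("<%df" % len(values), *values)
    marker = struct.pack("<i", len(payload))  # record length in bytes
    return marker + payload + marker

def read_records(blob):
    """Split a byte string into records of REAL*4 values."""
    records, pos = [], 0
    while pos < len(blob):
        (nbytes,) = struct.unpack_from("<i", blob, pos)
        data = struct.unpack_from("<%df" % (nbytes // 4), blob, pos + 4)
        (trailer,) = struct.unpack_from("<i", blob, pos + 4 + nbytes)
        assert trailer == nbytes, "leading and trailing markers must agree"
        records.append(list(data))
        pos += 4 + nbytes + 4
    return records

# Three values per record, as in WRITE (20) VAR1(I), VAR2(I), VAR3(I)
blob = pack_record([1.0, 2.0, 3.0]) + pack_record([4.0, 5.0, 6.0])
print(len(blob), read_records(blob))
```

Each record of three values occupies 20 bytes here: 12 bytes of data plus a 4-byte marker on either side.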


Ch2 Sec4.3. Records of non-uniform length

A FORTRAN-structured file with variable-length records might look like this:

rrrr dddd dddd rrrr
rrrr dddd rrrr
rrrr dddd dddd dddd dddd rrrr
etc.

With care, it is possible to read a data file containing variable-length records which was created using the simplest unformatted FORTRAN OPEN statement and a single WRITE statement for each variable. Use /FORMAT=stream to read such files. Note that sequential access is the FORTRAN default and does not need to be specified in the OPEN statement:

REAL VAR1(1000), VAR2(500)
...
OPEN (UNIT=20, FORM="UNFORMATTED", FILE="MYFILE.DAT")
...
WRITE (20) VAR1
WRITE (20) VAR2
...

Use the qualifier /SKIP to skip past the record length information (/SKIP arguments are in units of words), and define a grid which will not read past the data values. The  /COLUMNS= qualifier can be used when reading multiple variables to specify the number of words separating the start of each variable:

yes? DEFINE AXIS/X=1:500:1  xaxis
yes? DEFINE GRID/X=XAXIS  mygrid
yes? FILE/FORMAT=stream/SKIP=1003/GRID=mygrid/VAR=var2  myfile.dat

The argument 1003 is the sum of the 1000 data words in record 1, plus 2 words of record length information surrounding the data values in record 1 (variable var1), plus 1 word of record information preceding the data in record 2.
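
The /SKIP count can be derived mechanically from the record lengths. The Python sketch below reproduces the arithmetic for the example above, counting in 4-byte words; the function name is hypothetical, and one word of record-length information on each side of every record is assumed.

```python
def words_to_skip(record_lengths, target):
    """Words preceding the data of record `target` (0-based).

    Every record is stored as: 1 length word, its data words, 1 length word.
    """
    skipped = 0
    for nwords in record_lengths[:target]:
        skipped += 1 + nwords + 1   # a complete earlier record
    return skipped + 1              # leading length word of the target record

# Record 1 holds VAR1 (1000 words); record 2 holds VAR2 (500 words).
print(words_to_skip([1000, 500], target=1))
print(words_to_skip([1000, 500], target=0))
```

The first call gives the 1003 used with /SKIP above; the second shows that even the first variable needs one word skipped.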


Ch2 Sec4.4. Stream binary files

Files without embedded record length information are created by FORTRAN programs using ACCESS="DIRECT" in OPEN statements and by C programs using the C stdio library. These files can contain a mix of integer and real numbers. The following types can be read from an unstructured file:

FORTRAN      C         Size in bytes

INTEGER*1    char      1
INTEGER*2    short     2
INTEGER*4    int       4
REAL*4       float     4
REAL*8       double    8
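
The sizes in the table can be cross-checked with Python's struct module, whose standard-size format codes correspond to the same C types. The mapping from FORTRAN type names to format codes below is written out by hand for illustration.

```python
import struct

# FORTRAN type -> (C type, struct format code); "<" selects standard sizes.
TYPES = {
    "INTEGER*1": ("char", "b"),
    "INTEGER*2": ("short", "h"),
    "INTEGER*4": ("int", "i"),
    "REAL*4": ("float", "f"),
    "REAL*8": ("double", "d"),
}

sizes = {name: struct.calcsize("<" + code) for name, (_, code) in TYPES.items()}
for name, (ctype, _) in TYPES.items():
    print(name, ctype, sizes[name])
```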


Ch2 Sec4.4.1. Simple stream files

Suppose "dddd" represents a 4-byte data value. Then a stream (or "direct access") binary file of FORTRAN REAL*4 or C floats is:

dddd dddd dddd dddd dddd dddd ...

The structure of the records is implied by the program accessing the data.  FORTRAN code which generates a direct access binary file might look like this:

REAL*4 MYVAR(10,5)
...
C Use RECL=40 on machines that specify the record length in bytes
OPEN(UNIT=20, FILE="myfile.dat", ACCESS="DIRECT", RECL=10)
...
DO 100 j = 1, 5
100    WRITE (20,REC=j) (MYVAR(i,j),i=1,10)
....

Use the following Ferret commands to read variable "myvar" from this file:

yes? DEFINE AXIS/X=1:10:1 x10
yes? DEFINE AXIS/Y=1:5:1 y5
yes? DEFINE GRID/X=x10/Y=y5 g10x5
yes? FILE/VAR=MYVAR/GRID=g10x5/FORMAT=stream  myfile.dat

If the file consisted of a set of FORTRAN REAL*8 or C doubles, then the data would look like:

dddddddd dddddddd dddddddd ...

and the following Ferret commands would read the data into "myvar":

yes? DEFINE AXIS/X=1:10:1 x10
yes? DEFINE AXIS/Y=1:5:1 y5
yes? DEFINE GRID/X=x10/Y=y5 g10x5
yes? FILE/VAR=MYVAR/GRID=g10x5/FORMAT=stream/type=r8  myfile.dat

Note the addition of the "type" qualifier. See the FILE command (p. 267) for more details.

Since Ferret represents all variables as REAL*4, some precision is lost when reading in REAL*8 or INTEGER*4 values. Also, some REAL*8 numbers cannot be represented as REAL*4 numbers; the internal Ferret value of such a number is system dependent.
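
The loss of precision is a property of 4-byte floating point storage in general and can be seen outside Ferret. The Python sketch below (the helper name is illustrative) pushes 8-byte values through 4-byte storage and back. It does not attempt to show the out-of-range case, whose result the text above describes as system dependent.

```python
import struct

def as_real4(value):
    """Round-trip a Python float (8 bytes) through 4-byte storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# 0.1 has no exact binary representation; REAL*4 keeps fewer digits of it.
print(as_real4(0.1) == 0.1)

# Integers above 2**24 are not all representable in REAL*4.
print(as_real4(float(2**24 + 1)))
```

The second case is why INTEGER*4 values can also change when they are read: a 4-byte float carries only 24 bits of mantissa.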


Ch2 Sec4.4.2. Mixed stream files

Ferret can read binary files that contain a mix of numbers of different types. However, a given Ferret variable can only be of one type. Say you have a file containing a mix of REAL*8 and REAL*4 numbers:

dddddddd dddd dddddddd dddd dddddddd ...

The following would successfully read the file:

yes? FILE/VAR=MYDOUBLE,MYFLOAT/GRID=somegrid/FORMAT=stream/type=r8,r4  myfile.dat

while:

yes? FILE/VAR=MYDOUBLE/GRID=someothergrid/FORMAT=stream/type=r8,r4  myfile.dat

would fail.
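
The layout of such a mixed file can be emulated with Python's struct module. In the sketch below, which assumes little-endian data with no padding between values, each group of 12 bytes holds one REAL*8 followed by one REAL*4, so two variable names and two types are needed to account for every byte. The sample values are made up.

```python
import struct

# One REAL*8 followed by one REAL*4, repeated; "<" means no padding.
PAIR = struct.Struct("<df")

blob = b"".join(PAIR.pack(d, f) for d, f in [(1.5, 10.0), (2.5, 20.0), (3.5, 30.0)])

mydouble, myfloat = [], []
for d, f in PAIR.iter_unpack(blob):
    mydouble.append(d)
    myfloat.append(f)

print(PAIR.size, mydouble, myfloat)
```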

Stream files with byte-swapped numbers can be read with the /swap qualifier; the /order and /skip qualifiers are also available (see chapter "Data Set Basics", section "Reading ASCII files," p. 37, for more details on /order and /skip).


Ch2 Sec5. ASCII DATA

To access ASCII data files use

yes? SET DATA/EZ   ASCII_file_name   or equivalently
yes? FILE   ASCII_file_name

The following are qualifiers to SET DATA/EZ or FILE:

Qualifier      Description

/VARIABLES     names the variables in the file
/TITLE         associates a title with the data set
/GRID          indicates multi-dimensional data and units
/COLUMNS       tells how many data values are in each record
/FORMAT        specifies the format of the file
/SKIP          skips initial records of the file
/ORDER         specifies order of axes (which varies fastest)

Use command SET VARIABLE to individually customize the variables.


Ch2 Sec5.1. Reading ASCII files

Below are several examples of reading ASCII data properly. (Uniform record length, FORTRAN-structured binary data are read similarly with the addition of the qualifier /FORMAT=unf. See the chapter "Data Set Basics", section "Binary Data," p. 33, for other binary types.) First, we look briefly at the relationship between Ferret and standard matrix notation.

Linear algebra uses established conventions in matrix notation. In a matrix A(i,j), the first index denotes a (horizontal) row and the second denotes a (vertical) column.

A11   A12   A13   ...   A1n
A21   A22   A23   ...   A2n
...
Am1   Am2   Am3   ...   Amn

Matrix A(i,j)

X-Y graphs follow established conventions as well, which are that X is the horizontal axis (and in a geographical context, the longitude axis) and increases to the right, and Y is the vertical axis (latitude) and increases upward (Ferret provides the /DEPTH qualifier to explicitly designate axes where the vertical axis convention is reversed).

In Ferret, the first index of a matrix, i, is associated with the first index of an (x,y) pair, x. Likewise, j corresponds to y. Element Am2, for example, corresponds graphically to x=m  and y=2.

By default, Ferret stores data in the same manner as FORTRAN—the first index varies fastest. Use the qualifier /ORDER to alter this behavior. The following examples demonstrate how Ferret handles matrices.

Example 1—1 variable, 1 dimension

1a) Consider a data set containing the height of a plant at regular time intervals, listed in a single column:

2.3
3.1
4.5
5.6
. . .

To access, name, and plot this variable properly, use the commands

yes? FILE/VAR=height plant.dat
yes? PLOT height

1b) Now consider the same data, except listed in four columns:

2.3   3.1   4.5   5.6
5.7   5.9   6.1   7.2
. . .

Because there are more values per record (4) than variables (1), use:

yes? FILE/VAR=height/COLUMNS=4 plant4.dat
yes? PLOT height
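
Conceptually, reading a single variable with /COLUMNS=4 means that the values are taken record by record and strung together into one series. A rough Python equivalent, for illustration only:

```python
text = """\
2.3   3.1   4.5   5.6
5.7   5.9   6.1   7.2
"""

# Read all values in file order, ignoring where the line breaks fall.
height = [float(tok) for line in text.splitlines() for tok in line.split()]
print(height)
```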

Example 2—2 variables, 1 dimension

2a) Consider a data set containing the height of a plant and the amount of water given to the plant, measured at regular time intervals:

2.3 20.4
3.1 31.2
4.5 15.7
5.6 17.3
. . .

To read and plot this data use

yes? FILE/VAR="height,water" plant_wat.dat
yes? PLOT height,water

2b) The number of columns need be specified only if the number of columns exceeds the number of variables. If the data are in six columns

2.3 20.4   3.1 31.2   4.5 15.7   
5.6 17.3 ...

use

yes? FILE/VAR="height,water"/COLUMNS=6 plant_wat6.dat
yes? PLOT height,water
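
With two variables and six columns, successive values alternate between the variables in the order in which they were named. The Python sketch below mirrors that assignment for the first four measurement pairs; it illustrates the idea and is not a description of Ferret's reader.

```python
text = """\
2.3 20.4   3.1 31.2   4.5 15.7
5.6 17.3
"""

values = [float(tok) for tok in text.split()]
names = ["height", "water"]

# Values alternate between the variables in the order they were named.
series = {name: values[k::len(names)] for k, name in enumerate(names)}
print(series["height"])
print(series["water"])
```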

Example 3—1 variable, 2 dimensions

3a) Consider a different situation: a greenhouse with three rows of four plants and a file with a single column of data representing the height of each plant at a single time (successive values represent plants in a row of the greenhouse):

3.1
2.6
5.4
4.6
3.5
6.1
. . .

If we want to produce a contour plot of height as a function of position in the greenhouse, axes will have to be defined:

yes? DEFINE AXIS/X=1:4:1 xplants
yes? DEFINE AXIS/Y=1:3:1 yplants
yes? DEFINE GRID/X=xplants/Y=yplants gplants
yes? FILE/VAR=height/GRID=gplants greenhouse_plants.dat
yes? CONTOUR height

When reading data the first index, x, varies fastest.  Schematically, the data will be assigned as follows:

        x=1        x=2         x=3        x=4   
y=1     3.1        2.6         5.4        4.6
y=2     3.5        6.1 . . .
y=3    . . .
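
The assignment in the schematic follows from the rule that the first index varies fastest. As a small Python sketch (the function name is hypothetical), the n-th value in the file lands at the following grid position:

```python
nx = 4  # points on the x axis (xplants)
values = [3.1, 2.6, 5.4, 4.6, 3.5, 6.1]  # first six values from the file

def grid_position(n, nx):
    """Map the n-th value in the file (0-based) to 1-based (x, y)."""
    return n % nx + 1, n // nx + 1

for n, v in enumerate(values):
    x, y = grid_position(n, nx)
    print("x=%d y=%d height=%.1f" % (x, y, v))
```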

3b) If the file in the above example has, instead, 4 values per record:

3.1   2.6   5.4   4.6
3.5   6.1  . . .

then add /COLUMNS=4 to the FILE command:

yes? FILE/VAR=height/COLUMNS=4/GRID=gplants greenhouse_plants.dat

Example 4—2 variables, 2 dimensions

Like Example 3, consider a greenhouse with three rows of four plants each and a data set with the height of each plant and the length of its longest leaf:

3.1     0.54
2.6     0.37
5.4     0.66
4.6     0.71
3.5     0.14
6.1     0.95
.        .
.        .

Again, axes and a grid must be defined:

yes? DEFINE AXIS/X=1:4:1 xht_leaf
yes? DEFINE AXIS/Y=1:3:1 yht_leaf
yes? DEFINE GRID/X=xht_leaf/Y=yht_leaf ght_leaf
yes? FILE/VAR="height,leaf"/GRID=ght_leaf greenhouse_ht_lf.dat
yes? SHADE height
yes? CONTOUR/OVER leaf

The above commands create a color-shaded plot of height in the greenhouse, and overlay a contour plot of leaf length. Schematically, the data will be assigned as follows:

        x=1         x=2         x=3         x=4
     ht ,  lf    ht ,  lf
y=1   3.1, 0.54   2.6, 0.37   5.4, 0.66   4.6, 0.71
y=2   3.5, 0.14   6.1, 0.95 . . .
y=3   . . .

Example 5—2 variables, 3 dimensions (time series)

Consider the same greenhouse with height and leaf length data taken at twelve different times. The following commands will create a three-dimensional grid and a plot of the height and leaf length versus time for a specific plant.

yes? DEFINE AXIS/X=1:4:1 xplnt_tm
yes? DEFINE AXIS/Y=1:3:1 yplnt_tm
yes? DEFINE AXIS/T=1:12:1 tplnt_tm
yes? DEFINE GRID/X=xplnt_tm/Y=yplnt_tm/T=tplnt_tm gplant2
yes? FILE/VAR="height,leaf"/GRID=gplant2 green_time.dat
yes? PLOT/X=3/Y=2 height, leaf

Example 6—1 variable, 3 dimensions, permuted order (vertical profile)

Consider a collection of oceanographic measurements made to a depth of 1000 meters. Suppose that the data file contains only a single variable, salt. Each record contains a vertical profile (11 values) of a particular x,y (long,lat) position. Supposing that successive records are successive longitudes, the data file would look as follows (assume the equivalencies are not in the file):

              z=0   z=100 z=200 . . .
x=30W,y=5S  35.89 35.90 35.93 35.97 36.02 36.05 35.96 35.40 35.13 34.89 34.72
x=29W,y=5S  35.89 35.91 35.94 35.97 36.01 36.04 35.94 35.39 35.13 34.90 34.72
            . . .

Use the qualifier /DEPTH when defining the Z axis to indicate positive downward, and /ORDER when setting the data set to properly read in the permuted data:

yes? DEFINE AXIS/X=30W:25W:1/UNIT=degrees salx
yes? DEFINE AXIS/Y=5S:5N:1/UNIT=degrees saly
yes? DEFINE AXIS/Z=0:1000:100/UNIT=meters/DEPTH salz
yes? DEFINE GRID/X=salx/Y=saly/Z=salz salgrid
yes? FILE/ORDER=zxy/GRID=salgrid/VAR=sal/COL=11 sal.dat
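
The effect of /ORDER=zxy can be pictured as a change in which index varies fastest while the file is read. The Python sketch below is an illustration only, with a hypothetical function name; it maps the position of a value in the file to grid indices for the axes defined above (6 longitudes, 11 latitudes, 11 depths).

```python
nx, ny, nz = 6, 11, 11  # points on salx, saly, salz as defined above

def position_zxy(n):
    """Map the n-th value in the file (0-based) to 1-based (i, j, k)
    when z varies fastest, then x, then y."""
    k = n % nz
    i = (n // nz) % nx
    j = n // (nz * nx)
    return i + 1, j + 1, k + 1

print(position_zxy(0))        # first value of the first profile
print(position_zxy(10))       # last value of the first profile
print(position_zxy(11))       # first value of the second record
print(position_zxy(nx * nz))  # first record at the next latitude
```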



Ch2 Sec6. TRICKS TO READING BINARY AND ASCII FILES

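Since binary and ASCII files are found in a bewildering variety of non-standardized formats, a few tricks may help with reading difficult cases. One such trick, illustrated below, uses Unix soft links to give a single file several names, so that it can be opened as several Ferret data sets, each read with a different /SKIP value.

Unix commands: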

    ln -s my_data my_dat.v1
    ln -s my_data my_dat.v2
    ln -s my_data my_dat.v3

Ferret commands:

    yes? FILE/SKIP=0/VAR=v1 my_dat.v1
    yes? FILE/SKIP=100/VAR=v2 my_dat.v2
    yes? FILE/SKIP=200/VAR=v3 my_dat.v3


Ch2 Sec7. ACCESS TO REMOTE DATA SETS WITH DODS

DODS, the Distributed Oceanographic Data System, allows users to access data from anywhere on the Internet using a variety of client/server methods, including Ferret. Employing technology similar to that used by the World Wide Web, DODS and Ferret create a powerful tool for the retrieval, sampling, analysis and display of datasets, regardless of size or data format (though there are data format limitations).

For more information on DODS, please see the DODS home page at

    http://unidata.ucar.edu/packages/dods/

Similar to the WWW, DODS is an emerging technology and is under development. As a result, it is likely that the details with which things are accomplished will be changing.

Datasets are accessed through Ferret using their raw Uniform Resource Locator (URL) address. For example, to access the COADS climatology, hosted at PMEL:

yes? use "http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc"

Once the dataset has been initialized, it is used just like any other local dataset.

yes? list/x=140w/y=2n/t="16-Feb" sst
           SEA SURFACE TEMPERATURE (Deg C)
           LONGITUDE: 141W
           LATITUDE: 1N
           TIME: 15-FEB 16:29
           DATA SET: http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc
        26.39

We have developed some general scripts (available at http://ferret.wrc.noaa.gov/Ferret/Dods/) which assist in both finding and using data with Ferret and DODS. The following session illustrates their use with datasets located on our server at PMEL.

yes? go dods
-------------------------------------------------------
DODS data suppliers currently known to your Ferret installation:
* * * * * * * * in .
dods_cdc.jnl: "Use" a dods data set from NOAA/CDC
dods_fsu.jnl: "Use" a dods data set from FSU
dods_jpl.jnl: "Use" a dods data set from JPL
dods_pmel.jnl: "Use" a dods data set from NOAA/PMEL
dods_uri.jnl: "Use" a dods data set from URI/GSO

The techniques for cataloging and organizing distributed DODS data are still under development. The Ferret scripts to assist with DODS access are subject to change.

yes? go dods_pmel
-------------------------------------------------------
The available data sets are:
   coads_climatology.nc levitus_climatology.nc

The base URL is: http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/

yes? go dods_pmel coads_climatology.nc
yes? sh data/br
currently SET data sets:
1> http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc (default)

yes? list/x=140w/y=2n/t="16-Apr" sst
       SEA SURFACE TEMPERATURE (Deg C)
       LONGITUDE: 141W
       LATITUDE: 1N
       TIME: 16-APR 13:27
       DATA SET: http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc
            27.07


To find out more about a particular dataset, or to debug problems, there are three elements of the dataset which may be accessed via a web browser. To access this information, append .dds, .das, or .info to the dataset name. For example:

http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc.dds

DDS stands for Dataset Descriptor Structure; this URL returns a text description of the data set's structure.

http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc.das

DAS stands for Dataset Attribute Structure and this will return a text description of attributes assigned to the variables in the data set.

http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc.info

This will return a text description of the variables in the dataset.
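
The three descriptive addresses are formed simply by appending a suffix to the dataset URL. A trivial Python sketch (the function name is made up for illustration):

```python
BASE = "http://ferret.wrc.noaa.gov/cgi-bin/nph-nc/data/coads_climatology.nc"

def info_urls(dataset_url):
    """Return the three descriptive URLs for a DODS dataset URL."""
    return {suffix: dataset_url + "." + suffix for suffix in ("dds", "das", "info")}

for suffix, url in info_urls(BASE).items():
    print(suffix, url)
```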

One of the most powerful aspects of DODS is the ease with which it allows for the sharing of data. With just a few simple steps, anyone running a web server can also be a DODS data server, thereby allowing data set access to anyone with an Internet connection.

Simply copying a few precompiled binaries into the cgi-bin directory of an already configured httpd server is all it takes to become a DODS server. Once the server is configured, adding or removing data sets is as simple as copying them to the server data directory or deleting them from that directory.

This ability has such immense potential that it bears extra emphasis. Imagine that within seconds of finishing a model run, a remote colleague is able to look at your dataset with whatever DODS client he or she desires, be it Ferret, Matlab, or another tool. There is no need for you to package up the data or for your colleague to download or reformat it; it is ready to be analyzed right away.

DODS can also cache frequently accessed DODS-served datasets, which produces a quicker response when requesting the remote data.

The first time you access a DODS data set, a file called .dodsrc will be created in the user's home directory. It will contain:

USE_CACHE=1     ! Turn cache on or off - 0=off, 1=on
MAX_CACHE_SIZE=20   ! max cache size in Mbytes
MAX_CACHED_OBJ=5    ! max cache entry size in Mbytes
IGNORE_EXPIRES=0    ! 0: Honor expiration information from servers,1: ignore them
CACHE_ROOT=/home/myname/.dods_cache/  !pathname to cache directory
DEFAULT_EXPIRES=86400    ! expiration time, in seconds, of cached data

Also created will be a .dods_cache directory, which by default (as shown above) is created in the user's home directory. This is where all the cached information is stored. To clear the DODS cache, simply delete the .dods_cache directory and all of its contents (for example, rm -r ~/.dods_cache). This directory will be recreated and repopulated with caching information the next time data is accessed via DODS and caching is turned on. Of course, all of the above values can be modified to better suit individual needs, and will be incorporated the next time Ferret is run. For example, to turn caching off, simply set USE_CACHE to 0 and restart Ferret.
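
Because the file is a plain list of KEY=VALUE lines with "!" comments, it is easy to inspect from a script. The Python sketch below parses text in the format of the listing above; the parsing rules are inferred from that listing alone, not from a DODS specification, and the function name is hypothetical.

```python
SAMPLE = """\
USE_CACHE=1     ! Turn cache on or off - 0=off, 1=on
MAX_CACHE_SIZE=20   ! max cache size in Mbytes
CACHE_ROOT=/home/myname/.dods_cache/  !pathname to cache directory
DEFAULT_EXPIRES=86400    ! expiration time, in seconds, of cached data
"""

def parse_dodsrc(text):
    """Parse KEY=VALUE lines, dropping anything after a '!' comment mark."""
    settings = {}
    for line in text.splitlines():
        line = line.split("!", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

settings = parse_dodsrc(SAMPLE)
print(settings["USE_CACHE"], settings["CACHE_ROOT"])
print(int(settings["DEFAULT_EXPIRES"]) // 3600, "hours")
```

The default expiration of 86400 seconds corresponds to 24 hours.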

For more detailed information on setting up a DODS server, please see the DODS home page (http://unidata.ucar.edu/packages/dods).


ferret_ug@pmel.noaa.gov

Last modified: September 27, 2000