To specify the characteristics of ASCII and Table data, we use a construct called a schema when creating metadata. A "schema" is really just a block of text that describes the arrangement of data within a file. The structure of a schema block in the datamanager.txt configuration file is:
schema
statements
...
endschema
When schemas are specified in the Data Source Wizard or FormatDisplay, the keywords schema and endschema should be omitted, as should any quotes. The types of statements that are allowed depend on whether the schema describes ASCII or Table data (HDF-EOS Point files also use schemas).
Valid statements for ASCII data are labels, edit descriptors, and goto. Edit descriptors look very much like Fortran FORMAT statements. They are made up of lists of field descriptors. Each field descriptor is followed by a field width on the right and can be preceded by a repeat count on the left. The currently valid field descriptors are F and N. F denotes floating-point fields, and N denotes integers.
Skip descriptors can also be included. The characters "x" or "X" indicate a blank space. The "/" character causes a line to be skipped. These descriptors can also be preceded by repeat counts.
Edit descriptors can be nested indefinitely within parentheses. Nested edit lists can also be preceded by repeat counts. Commas are allowed as delimiters between descriptors. Though they are not required, they can help make the meaning of an edit descriptor clearer.
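The rules above (repeat counts, field widths, skip descriptors, nesting, optional commas) can be illustrated with a short Python sketch that expands an edit descriptor into a flat list of steps. This is not WebWinds code. The function name expand and the (kind, width) step format are hypothetical, and the sketch assumes that descriptor letters are case-insensitive and that an omitted repeat count means 1.

```python
import re

TOKEN = re.compile(
    r"\s*,?\s*(?:(\d*)([FfNn])(\d+)(?:\.\d+)?"  # field: count, F or N, width[.m]
    r"|(\d*)([xX/(])"                            # skip or open paren, with count
    r"|(\)))"                                    # close paren
)

def expand(descriptor):
    """Expand an edit descriptor into a flat list of (kind, width) steps."""
    pos = 0

    def parse_list(depth):
        nonlocal pos
        steps = []
        while pos < len(descriptor):
            match = TOKEN.match(descriptor, pos)
            if match is None:
                if descriptor[pos:].strip(", \t"):
                    raise ValueError(f"cannot parse {descriptor[pos:]!r}")
                pos = len(descriptor)
                break
            pos = match.end()
            count, kind, width, skip_count, skip_kind, close = match.groups()
            if close:
                if depth == 0:
                    raise ValueError("unmatched ')'")
                return steps
            if kind:
                # F or N field, repeated 'count' times (assumed default: 1)
                steps += [(kind.upper(), int(width))] * int(count or 1)
            elif skip_kind == "(":
                # nested list: expand it once, then repeat the result
                steps += parse_list(depth + 1) * int(skip_count or 1)
            else:
                # 'x' skips one character, '/' skips to the next line
                steps += [(skip_kind.lower(), 1)] * int(skip_count or 1)
        if depth > 0:
            raise ValueError("missing ')'")
        return steps

    return parse_list(0)
```

For instance, expand("2(F5.2,x)") yields two F fields of width 5, each followed by a one-character skip. The part after the decimal point is accepted but ignored in this sketch.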
Input data sources can be described in the file datamanager.txt. For example:
DataSource "N7t" file ga840101.n7t format "Ascii"
schema
"3/"
"Label1:"
"11(x,25F3/)"
"x,13F3/"
"goto Label1"
endschema
After the schema keyword, the first line is "3/". This tells WebWinds to skip three lines.
Next comes a label. "Label1:" is like a statement label in any programming language. It denotes a place that can be jumped to from a goto statement.
The next two statements are edit descriptors. The first one, "11(x,25F3/)", tells WebWinds to read eleven lines, each containing a superfluous character followed by 25 three-digit floating-point numbers. The "/" is necessary to make WebWinds move to the next line for input. The next descriptor, "x,13F3/", just reads a single line, skipping the first character and reading thirteen three-digit floats.
Note that the two edit descriptors also work if they're combined into one descriptor, like this:
schema "3/" "Label1:" "11(x,25F3/),x,13F3/" "goto Label1" endschema
The line "goto Label1" tells the input method to go back and repeat the edit descriptor block for the next block of input.
lon        lat    temperature
                  (C)
-130.123   10.2   24.2
-132.304    2.1   27.0
-150.777    2.2   26.8
   .         .      .
   .         .      .
   .         .      .
As you can see, the data are arranged into columns and rows. The columns are separated by white space. Either blanks or tab characters are allowed. The table reader expects that all the columns are full. If there are any empty cells, the data that follows will be read into the wrong columns. Table data can be
Although both types of files contain formatted data, Tables are structurally quite different from ASCII files. In general, only certain columns of a Table are to be read in. Although each column is treated as a distinct entity in WebWinds, the user specifies, via the schema statements, which columns are to be read and, of those columns, which are to be treated as metadata and which one (or ones) is the data.
As an example, let's say we have an ASCII table that has 3 columns. The first two columns are, respectively, longitude and latitude, and the third column is temperature data. The following schema will cause the data to be read:
schema
"columns 3 1 2"
endschema
This instructs the table reader to read columns 3, 1 and 2. It also instructs it to make the data in column 3 fit onto a grid generated by the data in columns 1 and 2. (If the last two columns are reversed, the axes will be reversed when an image is created from the data.) As was the case for ASCII data, the "/" descriptor causes a line to be skipped.
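As a rough illustration of what "columns 3 1 2" selects, the Python sketch below pulls whitespace-separated columns out of a small table by their 1-based column numbers. It is not the WebWinds table reader and does no gridding. The function name read_columns and its skip_lines parameter are hypothetical, with skip_lines standing in for a leading "/" statement such as "2/".

```python
def read_columns(text, columns, skip_lines=0):
    """Return one list of floats per requested 1-based column, in the order given."""
    lines = text.splitlines()[skip_lines:]
    # split() with no argument treats both blanks and tabs as separators
    rows = [line.split() for line in lines if line.strip()]
    return [[float(row[c - 1]) for row in rows] for c in columns]

sample = """lon lat temperature
(C)
-130.123 10.2 24.2
-132.304\t2.1\t27.0
-150.777 2.2 26.8
"""

# Column 3 is the data; columns 1 and 2 supply the two axes.
data, x_axis, y_axis = read_columns(sample, [3, 1, 2], skip_lines=2)
```

Because the sketch splits on white space only, an empty cell shifts every later value on that row one column to the left, which is the failure mode the table reader is described as having.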
The above example would produce a 2-D data set. If more than 1 column contains data, the other columns can be added in to produce a 3-D data set. To do this, use the concat descriptor.
Let's say that in the above example column 4 is data at some other altitude, depth, or other third-dimension element. The concat keyword can be used to tell the table reader that more than one column is data. If the schema is like this:
schema
"columns concat ( 3 4 ) 1 2"
endschema
The table reader will recognize that columns 3 and 4 are data. When the data is dropped into an Image, two slices will be rendered: the first slice represents the data in column 3 and the second the data in column 4. If you want these reversed, reverse the numbers in the schema. Note that this approach allows different types of data slices to be stacked into one data cube.
The table reader can also generate a 3-D data cube where all 3 axes are specified by dependent variables. For example, it will read four columns, treating three of them as dependent variables (the X, Y, and Z axes, respectively). When the data are gridded, they will be gridded in three dimensions, and the gridding algorithm will try to preserve aspect ratios that make sense. Let's say that column 4 holds the third-dimension values and column 5 holds the data; then use the following schema:
schema
"2/"
"columns 5 2 1 4"
endschema
Here, columns 2, 1 and 4 will be treated as metadata for the X, Y, and Z axes respectively. The data in column 5 will be gridded into a cube generated with the data in columns 2, 1 and 4 by the gridding algorithm.
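The three forms of the columns statement shown in this section can be summarized with a small Python sketch that splits a statement into data columns and axis columns. The function name parse_columns is hypothetical. The sketch assumes only the syntax that appears in the examples here: a single data column, or a concat ( ... ) group, followed by the axis columns.

```python
def parse_columns(statement):
    """Split a 'columns' statement into (data_columns, axis_columns)."""
    # Pad parentheses so "concat(3 4)" and "concat ( 3 4 )" tokenize alike
    tokens = statement.replace("(", " ( ").replace(")", " ) ").split()
    if len(tokens) < 2 or tokens[0] != "columns":
        raise ValueError("expected 'columns' followed by column numbers")
    tokens = tokens[1:]
    if tokens[0] == "concat":
        if len(tokens) < 2 or tokens[1] != "(" or ")" not in tokens:
            raise ValueError("expected 'concat ( ... )'")
        close = tokens.index(")")
        data = [int(t) for t in tokens[2:close]]       # stacked data slices
        axes = [int(t) for t in tokens[close + 1:]]    # axis (metadata) columns
    else:
        data = [int(tokens[0])]                        # first number is the data
        axes = [int(t) for t in tokens[1:]]            # the rest are the axes
    return data, axes
```

With this reading, "columns 3 1 2" gives one data column and two axes, "columns concat ( 3 4 ) 1 2" gives two data slices on the same two axes, and "columns 5 2 1 4" gives one data column on three axes.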
For example, consider the following schema:
schema "start:" "3f24/" "goto start" endschema
Each line in the data file holds three floating-point numbers. The Ascii reader does not try to be excessively clever about how much data could be read in. It processes only one schema statement at a time, so it simply allocates a buffer with only three elements.
This example was actually used on a file containing only a million double-precision numbers, and the resulting number of interrupts for internal data transfers more than tripled the time to read the file. The entire file took over 10 minutes to read.
However, the buffer computation tallies repeat counts across nested descriptors, so for a simple format such as this, you could use a statement such as "37(11(3f24))". The buffer is now 37*11*3 = 1221 elements long. In this case, the numbers were chosen to be consistent with the data dimensions. This change reduced the time to read this file to less than three minutes.
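The buffer arithmetic can be checked with a short Python sketch that counts the F and N fields in one schema statement, multiplying through nested repeat counts. It only mirrors the tally as described here and is not the WebWinds implementation. The function name buffer_elements is hypothetical, and an omitted repeat count is assumed to mean 1.

```python
import re

def buffer_elements(descriptor):
    """Count the F and N fields one edit descriptor reads, honoring repeat counts."""
    tokens = re.findall(r"(\d*)\s*([FfNn]\d+(?:\.\d+)?|\(|\)|[xX/])", descriptor)
    stack = [[1, 0]]  # each entry: [repeat count of the group, fields counted so far]
    for count, kind in tokens:
        repeat = int(count or 1)
        if kind == "(":
            stack.append([repeat, 0])
        elif kind == ")":
            if len(stack) == 1:
                raise ValueError("unmatched ')'")
            group_repeat, total = stack.pop()
            stack[-1][1] += group_repeat * total
        elif kind[0] in "FfNn":
            stack[-1][1] += repeat
        # 'x' and '/' skip input and occupy no buffer elements
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0][1]

small = buffer_elements("3f24/")          # 3
large = buffer_elements("37(11(3f24))")   # 37 * 11 * 3 = 1221
```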
Standard Fortran defines the syntax for the F descriptor as
Fw.m,
where w is the full width of the field, and m is the width of the mantissa.
WebWinds accepts this syntax, but currently does not process the mantissa
width. The text-to-double converter that WebWinds uses keys on the decimal
point, if it is available, thus processing as many significant digits as
it can. Otherwise, it behaves as if m were equal to zero.
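The practical difference shows up when a field contains no decimal point. The Python sketch below contrasts the behavior described here with standard Fortran input, where the last m digits of such a field are taken as the fractional part. Both function names are hypothetical, Python's float() stands in for the WebWinds text-to-double converter, and exponent forms are not handled.

```python
def webwinds_style(field):
    """Convert a field as described above: use the decimal point if present,
    otherwise treat the digits as a whole number (as if m were 0)."""
    return float(field)

def fortran_style(field, m):
    """Convert a field the way Fortran Fw.m input does: with no decimal point
    in the field, the rightmost m digits are the fractional part."""
    if "." in field:
        return float(field)
    return float(field) / 10 ** m

a = webwinds_style("12345")      # 12345.0
b = fortran_style("12345", 2)    # 123.45
c = webwinds_style("123.45")     # 123.45, same under either rule
```

Data files written without explicit decimal points therefore need the scaling applied elsewhere if they are read with an F descriptor in WebWinds.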