Data Model
Developing point data readers for RAMADDA
Point Data Documentation
 
4.2 Text Point Data
RAMADDA can provide rich support for structured CSV or text data files. First of all, download and install the pointtools.zip release. All of the examples below, once saved to disk, can be processed with:
    sh <install path>/pointtools/pointchecker.sh  value.csv
Here are some examples of different point data readers.
4.2.0 Simple CSV Examples
If you have text formatted data that you want to ingest into RAMADDA you can either generate the data in a standard CSV text format or specify a set of metadata properties in an external properties file.

The "standard CSV" format has any number of "#" delimited comment and property lines at the beginning of the file followed by any number of data records. The properties are defined in the header with:

#comment
#property name=property value
#property name=property value
#
value1,value2,valueN
value1,value2,valueN
...
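As a rough illustration of this layout, the following Python sketch separates the header properties from the data records. It is not RAMADDA's reader: the function name `read_standard_csv` is hypothetical, and treating every `#` line that contains `=` as a property is a simplifying assumption.

```python
def read_standard_csv(text):
    """Split "standard CSV" text into (properties, records).

    Assumption: a header line that starts with "#" and contains "="
    is a property; any other "#" line is a comment.
    """
    properties = {}
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            name, sep, value = line[1:].partition("=")
            if sep:
                properties[name.strip()] = value.strip()
            continue
        # Everything else is a comma separated data record.
        records.append([cell.strip() for cell in line.split(",")])
    return properties, records


sample = """#a comment line
#fields=value1,value2
#field.value2.missing=-999.0
#
-0.93,100.0
-0.23,-999.0
"""
props, rows = read_standard_csv(sample)
print(props["fields"])  # value1,value2
print(len(rows))        # 2
```

A comment that happens to contain `=` would be misread as a property by this sketch, so a real reader needs a stricter rule.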
Here is a simple example with just a single column value:
value.csv
#fields=value[unit="some unit"]
-0.931363
-0.930391
The only property that is required is the fields property - a comma separated list of field identifiers, each optionally followed by a set of attributes contained within "[" and "]".
    fieldname[attr1="value1" attr2="value2" ...]
Here is a simple example with 2 columns. The second column has a missing value defined.
2values.csv
#fields=value1[unit="some unit"],value2[unit="some unit" missing="-999.0"]
-0.93,100.0
-0.23,-999.0
-1.93,-999.0
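To make the `fieldname[attr1="value1" attr2="value2" ...]` syntax concrete, here is a minimal Python sketch that turns a fields value into names and attribute dictionaries. The `parse_fields` helper and its regular expressions are illustrative assumptions, not RAMADDA's actual parser, and they do not handle attribute values that contain `]`.

```python
import re

# A field name, an optional [attribute list], then a comma or end of string.
FIELD_RE = re.compile(r'\s*([A-Za-z_][\w.]*)\s*(?:\[(.*?)\])?\s*(?:,|$)')
# One attr="value" pair inside the brackets.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_fields(spec):
    """Return a list of (field name, attribute dict) tuples."""
    fields = []
    pos = 0
    while pos < len(spec):
        match = FIELD_RE.match(spec, pos)
        if match is None:
            raise ValueError(f"cannot parse fields at offset {pos}: {spec[pos:]!r}")
        name = match.group(1)
        attr_text = match.group(2) or ""
        fields.append((name, dict(ATTR_RE.findall(attr_text))))
        pos = match.end()
    return fields


spec = 'value1[unit="some unit"],value2[unit="some unit" missing="-999.0"]'
for name, attrs in parse_fields(spec):
    print(name, attrs)
```

Commas inside the brackets are safe here because a field only ends at a comma that follows the closing `]`.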
An alternative way to specify attributes of fields is with other named properties as shown below.
2values_alt.csv
#fields=value1,value2
#field.value1.unit=some unit
#field.value2.unit=some other unit
#field.value2.missing=-999.0
-0.93,100.0
-0.23,-999.0
-1.93,-999.0
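The named-property form can be folded back into per-field attributes. The Python sketch below shows one way to do that, assuming the properties have already been read into a dictionary; `field_attributes` is a hypothetical helper and it only handles plain field names in the fields value.

```python
def field_attributes(properties):
    """Collect field.<name>.<attribute> properties into one dict per field."""
    names = [name.strip() for name in properties["fields"].split(",")]
    attributes = {name: {} for name in names}
    for key, value in properties.items():
        parts = key.split(".")
        # Only keys of the form field.<known field>.<attribute> are used.
        if len(parts) == 3 and parts[0] == "field" and parts[1] in attributes:
            attributes[parts[1]][parts[2]] = value
    return attributes


properties = {
    "fields": "value1,value2",
    "field.value1.unit": "some unit",
    "field.value2.unit": "some other unit",
    "field.value2.missing": "-999.0",
}
attrs = field_attributes(properties)
print(attrs["value2"]["missing"])  # -999.0
```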
4.2.1 Date/Time
To mark a field as a date/time, give it the attribute type="date". Use format="date format" to specify the date format. Here is an example with a time and a single value.
time_value.csv
#fields=date[type="date" format="yyyy-MM-dd"],value
2001-01-01,-0.931363
2001-02-01,-0.930391
2001-03-01,-0.95
2001-04-01,-0.96
Here we have a file where the date and time are in different columns. The isdate and istime attributes specify that the time field is created from both of the columns. The dateformat specifies the format to use.
datetime.csv
#
#dateformat=yyyy/MM/dd HH:mm:ss
#fields=date[isdate="true"],time[istime="true"],value
2012/10/12 14:11:17.14 -0.931363
2012/10/12 14:11:17.24 -0.930391
If you have fields with the names yyyy (or year), month (or mm), day, hour, minute, second (or a subset of them) then RAMADDA will figure out the date/time of the records from the column values.
yymmddhhmmss.csv
#fields=yyyy,month,day,hour,minute,second,value
2001,01,01,01,00,00,-0.931363
2001,02,01,01,00,00,-0.930391
2001,03,01,01,00,00,-0.95
2001,04,01,01,00,00,-0.96
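The name-based rule can be pictured with a short Python sketch. It assumes the column names listed above and, as a further assumption that the text does not state, fills missing components with the first month or day and with zero for the time of day. `record_time` is a hypothetical helper.

```python
from datetime import datetime

# Recognized column names mapped to datetime components.
COMPONENTS = {
    "yyyy": "year", "year": "year",
    "month": "month", "mm": "month",
    "day": "day", "hour": "hour",
    "minute": "minute", "second": "second",
}


def record_time(field_names, row):
    """Build a datetime from whichever date/time columns are present."""
    # Assumed defaults for components that have no column.
    parts = {"month": 1, "day": 1, "hour": 0, "minute": 0, "second": 0}
    for name, cell in zip(field_names, row):
        component = COMPONENTS.get(name.lower())
        if component is not None:
            parts[component] = int(cell)
    if "year" not in parts:
        raise ValueError("no yyyy/year column found")
    return datetime(**parts)


names = ["yyyy", "month", "day", "hour", "minute", "second", "value"]
row = "2001,02,01,01,00,00,-0.930391".split(",")
print(record_time(names, row))  # 2001-02-01 01:00:00
```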
4.2.2 Georeferenced Data
If you have georeferenced data then specify latitude and longitude columns. Please, please, please, use decimal degrees east -180 to 180 and decimal degrees north -90 to 90.
latlon_value.csv
#fields=latitude,longitude,value
40,-107,-0.931363
45,-110,-0.930391
40,-107,-0.95
35,-120,-0.96
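If your source data uses 0 to 360 longitudes, one way to bring the values into the requested range before writing the CSV is sketched below in Python. The helper names are hypothetical, and the check is something you would run yourself rather than something RAMADDA is known to do.

```python
def normalize_longitude(lon):
    """Map a longitude in degrees east to the range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def check_position(lat, lon):
    """Validate the latitude and return (lat, normalized longitude)."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    return lat, normalize_longitude(lon)


print(check_position(40.0, 253.0))   # (40.0, -107.0)
print(check_position(40.0, -107.0))  # (40.0, -107.0)
```

Note that this mapping sends exactly 180 to -180, which is the same meridian.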
Here is a georeferenced time series:
latlon_time_value.csv
#fields=latitude,longitude,date[type="date" format="yyyy-MM-dd"],value
40,-107,2001-01-01,-0.931363
45,-110,2001-02-01,-0.930391
40,-107,2001-03-01,-0.95
35,-120,2001-04-01,-0.96
You can specify different coordinate reference systems with the crs property. For UTM coordinates, specify x and y fields along with the UTM zone and a north/south flag. Here is data in UTM zone 58 South:
utm58s_rgbi.csv
#crs=utm
#utm.zone=58
#utm.north=false
#fields=x,y,elevation,red,green,blue,intensity
449929.47,  1382815.76, 21.01,  67, 66, 61, 0
449929.45,  1382815.77, 21.00,  67, 66, 61, 0
449929.47,  1382815.77, 21.02,  78, 77, 72, 0
449929.13,  1382815.69, 20.94,  89, 86, 77, 0
449929.16,  1382815.71, 20.94,  90, 90, 82, 0

Here is data in a WGS84 ellipsoid:
wgs84_rgbi.csv
#crs=wgs84
#fields=x,y,z,r,g,b,intensity
-2313174.974,-3717949.974,4622885.034,4,4,4,1166
-2313175.009,-3717949.961,4622885.028,2,2,2,1799
-2313175.001,-3717950.058,4622884.669,2,2,2,1196
-2313175.012,-3717949.889,4622884.824,4,4,4,2659
-2313175.284,-3717950.819,4622883.842,4,4,4,1663
-2313175.097,-3717950.210,4622884.419,6,6,6,2101
-2313175.074,-3717949.930,4622884.744,5,5,4,1598
-2313175.198,-3717950.351,4622884.298,5,5,4,1937
-2313175.079,-3717949.814,4622884.857,3,3,3,1302
-2313175.195,-3717950.237,4622884.345,4,4,4,1425
You can also specify EPSG coordinate systems by setting the crs property to:
crs=epsg:<epsg code>
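As an example of picking an EPSG code, WGS 84 UTM zones are registered as EPSG 32601 to 32660 (northern hemisphere) and 32701 to 32760 (southern hemisphere). The Python sketch below derives the code from a zone and north/south flag. It assumes the data are on the WGS 84 datum, which the UTM example in this document does not state, and `utm_to_epsg` is a hypothetical helper.

```python
def utm_to_epsg(zone, north):
    """Return the EPSG code of the WGS 84 / UTM zone (assumes WGS 84 datum)."""
    if not 1 <= zone <= 60:
        raise ValueError(f"UTM zone must be 1..60, got {zone}")
    return (32600 if north else 32700) + zone


# UTM zone 58 South
print(f"crs=epsg:{utm_to_epsg(58, north=False)}")  # crs=epsg:32758
```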

4.2.3 Site Based Data
If you have text values then specify the attribute type="string" for the field.
site_time_value.csv
#
#fields=latitude,longitude,site[type="string"],date[type="date" format="yyyy-MM-dd"],value
40,-107,site id 1,2001-01-01,-0.931363
45,-110,site id 2,2001-02-01,-0.930391
40,-107,site id 1,2001-03-01,-0.95
35,-120,site id 3,2001-04-01,-0.96
It's often the case that a single file holds data for one site, and the site and lat/long are implicit in a header, etc. For these cases we want to be able to access the site and location as we read the data, so we define a fake field with a value="..." attribute.
fixed_site.csv
#
#fields=latitude[value="40"],longitude[value="-107"],site[type="string" value="site id"],date[type="date" format="yyyy-MM-dd"],value
2001-01-01,-0.931363
2001-02-01,-0.930391
2001-03-01,-0.95
2001-04-01,-0.96
Likewise, you can specify the time value:
fixed_time.csv
#
#fields=latitude[value="40"],longitude[value="-107"],site[type="string" value="site id"],date[type="date" format="yyyy-MM-dd" value="2001-01-01"],value
-0.931363
-0.930391
-0.95
-0.96
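One way to picture fixed-value fields: a field that carries a value attribute takes that constant for every record, and the remaining fields consume the cells of each row in order. The Python sketch below illustrates this reading of the examples. `expand_record` is a hypothetical helper and the field list is written out by hand.

```python
def expand_record(fields, row):
    """Merge constant value="..." fields with the cells read from one row."""
    cells = iter(row)
    record = {}
    for name, attrs in fields:
        if "value" in attrs:
            record[name] = attrs["value"]  # fixed for every record
        else:
            record[name] = next(cells)     # taken from the file
    return record


fields = [
    ("latitude", {"value": "40"}),
    ("longitude", {"value": "-107"}),
    ("site", {"value": "site id"}),
    ("date", {"format": "yyyy-MM-dd"}),
    ("value", {}),
]
record = expand_record(fields, ["2001-01-01", "-0.931363"])
print(record["site"], record["date"], record["value"])
```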
You can also specify a pattern that is applied to the text in the header to extract out latitude, longitude, elevation, etc.
patternexample.csv
#
#fields=Site_Id[type="string" pattern="ID:\s(.*)ARGOS:"], Latitude[pattern="..."], Longitude[pattern="..."], Elevation[pattern="..."], value
# Year: 2011 Month: 02 ID: KMS ARGOS: 21364 Name: Kominko-Slade
# Lat: 79.47S Lon: 112.11W Elev: 1801m
1
2
3
4
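The Site_Id pattern can be tried out against the header text with an ordinary regular expression. In this Python sketch the first capture group supplies the field value. Trimming the captured text and the `extract` helper are assumptions made for illustration, not documented RAMADDA behavior.

```python
import re

header = (
    "Year: 2011 Month: 02 ID: KMS ARGOS: 21364 Name: Kominko-Slade\n"
    "Lat: 79.47S Lon: 112.11W Elev: 1801m\n"
)


def extract(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


site_id = extract(r"ID:\s(.*)ARGOS:", header)
print(site_id)  # KMS
```

Without the trim, the captured text would keep the space before "ARGOS:".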

4.2.4 Property Files
If you have text point files that are in some pre-existing format (i.e., you can't add "#" properties) then you can specify the properties in an external properties file. Let's assume we have a simple CSV file with 4 columns:
example1.csv
latitude, longitude, date, value
40.0,-107,2012/10/12,  -0.931363
45.0,-110.0,2012/10/12, -0.930391
We can read this file with just a point.properties file. The key properties are delimiter, skiplines and fields.
example1.csv.properties

# Define the column delimiter
delimiter=,
# number of lines in header to skip
skiplines=1
# fields definition
fields=latitude,longitude,date[type="date" format="yyyy/MM/dd"],value[unit="some unit"]
example1_alt.csv.properties

# number of lines in header to skip
skiplines=1
# An alternative method to define the attributes of the fields
# define the field names here
fields=latitude,longitude,date,value
date.type=date
date.format=yyyy/MM/dd
# define the attributes with
# field.<field name>.<attribute name>=...
field.latitude.unit=degrees
field.longitude.unit=degrees
field.date.type=date
field.date.format=yyyy/MM/dd
field.value.unit=some unit
# One can also set searchable and chartable attributes for ramadda's use
field.value.searchable=true
field.value.chartable=true
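To show how delimiter and skiplines interact with the data file, here is a minimal Python sketch that reads a properties file and then applies it to the CSV text. The helper names are hypothetical, and falling back to a comma when no delimiter is given is an assumption rather than documented RAMADDA behavior.

```python
def parse_properties(text):
    """Read name=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def read_records(csv_text, props):
    """Skip the header lines, then split each remaining line into cells."""
    delimiter = props.get("delimiter") or ","  # assumed default
    skip = int(props.get("skiplines", "0"))
    lines = csv_text.splitlines()[skip:]
    return [
        [cell.strip() for cell in line.split(delimiter)]
        for line in lines
        if line.strip()
    ]


props_text = """# Define the column delimiter
delimiter=,
# number of lines in header to skip
skiplines=1
fields=latitude,longitude,date,value
"""
csv_text = """latitude, longitude, date, value
40.0,-107,2012/10/12,  -0.931363
45.0,-110.0,2012/10/12, -0.930391
"""
records = read_records(csv_text, parse_properties(props_text))
print(records[0])
```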

4.2.5 Integrating with RAMADDA
You can upload an arbitrary CSV point file together with its accompanying properties file. When you are logged in, go to File->New Entry. Under the Point Data list choose Text Point Data, then specify the properties in the text field or upload the properties file.

You can also define a new entry type in RAMADDA for your point data. Embed the properties in a property tag named record.properties and install the types.xml file as a plugin.
exampletypes_.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<types>

    <type
     description="Test data"
     handler="org.ramadda.data.services.PointTypeHandler"
     name="type_point_test"
     super="type_point">

         <property name="record.file.class" value="org.ramadda.data.point.text.CsvFile"/>

         <property name="record.properties">
delimiter=
position.required=false
skiplines=1
dateformat=yyyy/MM/dd HH:mm:ss
fields=date[isdate="true"],time[istime="true"],value[chartable="true" unit="some unit"]
     </property>

      </type>

</types>



 
