The Godiva2 web portal for environmental data visualization
The latest version of this document can be found at www.resc.rdg.ac.uk/twiki/bin/view/Resc/GodivaTwo. For a PDF of the latest version of this document, go to http://www.resc.rdg.ac.uk/twiki/bin/genpdf/Resc/GodivaTwo.
Introduction
Godiva2 (http://lovejoy.nerc-essc.ac.uk:8080/Godiva2) is a prototype "next generation" environmental data portal that uses Google Maps technology to display interactive visualizations of gridded data (such as satellite images/measurements and numerical model output). It is being developed by the Reading e-Science Centre, which is hosted at the NERC Environmental Systems Science Centre (ESSC) at the University of Reading, UK. Godiva2 is an AJAX application (i.e. a dynamic, responsive web application that "feels" more like a desktop application).
Godiva2 is being run alongside the original Godiva data portal. The original Godiva portal has more functionality than Godiva2: it allows animations to be created, and data can be extracted in a variety of dimensions (x-y maps, x-t Hovmöller diagrams, x-y-z-t data blocks etc). By contrast, Godiva2 can currently display only maps (x-y plots). Another major difference is that Godiva2 currently allows data only to be visualized: it does not allow the original data to be downloaded.
The main aim of Godiva2 is to allow data to be visualized as quickly as possible, in an interactive fashion. Data can be loaded and displayed on a Google Map with a single mouse click. The user can then navigate around the dataset by dragging the map display around, zooming in and changing the timestep and depth/height level that is being displayed. The user can change the extents of the colour scale to allow phenomena (such as ocean eddies) to be displayed clearly. All changes that the user makes are immediately reflected in the map display, without the need for further mouse clicks or a browser refresh. By contrast, the original Godiva site requires that the user select all parameters in advance, before the data are displayed after a delay of a few seconds. Godiva2 is therefore more interactive than the original Godiva portal, but less feature-rich. Both sites will be maintained for the foreseeable future.
Having selected the field of interest (e.g. sea temperature from the NCOF 1 degree global model), the depth, time and colour scale, the user can click "Open in Google Earth" to display the data in Google Earth. Other sources of Google Earth data (e.g. ARGO float locations, sea-ice extents and many more) can then be displayed alongside the selected data on a 3-D globe.
System architecture
The system consists of four basic components:
- the web interface
- a metadata server
- a server for Google Maps (GM) images
- a server for Google Earth (GE) images
Note: The server components of the architecture are in the process of being replaced by an OGC-compliant Web Map Service. This serves metadata and images to both Google Maps and Google Earth, as well as a large number of other clients that support the WMS specification. This page will be updated when the changes go live.
At present, Godiva2 only displays data that are held at ESSC. However, it would be possible for other institutions to host a server that provides images and metadata to Godiva2 (see below). In other words, the web interface and the server components are only loosely coupled to each other.
The three server components (metadata, GM images, GE images) can be hosted in the same application server (from now on we shall assume that these components are served together by a data provider). They are all HTTP servers. At ESSC, the application server is Tomcat and the server components are Java servlets and Java Server Pages (JSPs) that use the GADS library for handling numerical model output. However, we are considering moving towards an Apache/mod_python solution to take advantage of the CDAT toolkit. The Godiva2 interface does not care how the back-end servers are implemented: it simply makes requests via HTTP GET for images and metadata. All that is important is that the servers return images and metadata in the correct format.
Note that the Godiva2 servers do not serve actual data, only metadata and images. The diagram below illustrates a possible situation in which there are three providers of data to the Godiva2 interface. The web interface (HTML and Javascript code) is downloaded from ESSC's web server to the client. The data providers have previously registered with the web interface and the interface contains references to the data providers. When the client opts to view data that is provided by, say, "Data provider 2", the client's browser makes requests for metadata and images directly to the data provider: these requests do not go through ESSC's servers. In this way the architecture is very scalable.
Data structure
In order to understand how the system works, it is important to understand how the underlying data are logically structured. At present the GADS library is used internally to handle data and metadata operations. GADS uses the following logical data structure: A Dataset (e.g. "NCOF 1 degree Global") contains one or more Variables (fields) such as "sea_water_potential_temperature". A Variable exists on a particular Grid, which consists of one or more Axes. Note that GADS only understands rectilinear grids of orthogonal axes in latitude-longitude space. When we move to using CDAT this assumption can be relaxed and more grid types and projections (e.g. curvilinear, rotated grids) can be handled.
For some projects and institutions, this structure is not sufficient. For example, in ensemble studies, one might logically associate several Ensembles with a single Dataset. Additionally, if Godiva2 is extended to display data from multiple institutions, one might have an "Institution" entity that "owns" several Datasets or Ensembles.
Godiva2 needs to be able to identify a particular variable for display. Allowing for the fact that different institutions and projects will have different logical organizations of their data, Godiva2 will soon be modified to accept arbitrary hierarchies of datasets. One can model this hierarchy as a set of files and folders in a filesystem. Containers (such as Ensembles and Datasets) are modelled as logical folders, whereas displayable data (i.e. Variables) are modelled as files. For example:
/UKMO/NCOF_1_Degree_Global/sea_water_potential_temperature
/ECMWF/ERA-40/Ensemble_member_1/sea_water_salinity
These are simple unique identifiers for variables (or "coverages" in the terminology of the Open Geospatial Consortium).
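As a minimal sketch of how such an identifier could be taken apart, the Python below splits a path into its containers (the logical folders) and its variable (the logical file). The names VariableId and parse_variable_id are hypothetical and are not part of Godiva2 or GADS.

```python
from typing import NamedTuple, Tuple


class VariableId(NamedTuple):
    """Hypothetical holder for a parsed identifier (not a Godiva2 type)."""
    containers: Tuple[str, ...]  # logical "folders": institution, dataset, ensemble...
    variable: str                # the displayable "file"


def parse_variable_id(path: str) -> VariableId:
    """Split a path-like identifier into its containers and its variable."""
    if not path.startswith("/"):
        raise ValueError("identifier must start with '/': %r" % path)
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("need at least one container and a variable: %r" % path)
    return VariableId(tuple(parts[:-1]), parts[-1])
```

Because the number of containers is not fixed, the same function handles both two-level and three-level examples given above.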
System components
Web interface
The Web interface consists of a single HTML web page (which is actually implemented as a JSP) and associated Javascript files. These have been mostly hand-coded, but take advantage of a couple of third-party Javascript libraries for handling common tasks. The interface requires a fairly modern browser to work properly: it has been tested with Internet Explorer 6, Mozilla Firefox 1.0 and above and Opera 8.5. It has not yet been tested on Safari. Care has been taken to ensure that browser compatibility issues are minimized.
Metadata server
The metadata server component serves information about the data holdings at a particular data provider. In the current version of Godiva2, this component consists of the following Java Server Pages:
- getDatasets.jsp: This returns a list of the datasets that are held by the data provider.
- getVariables.jsp: This returns a list of the variables (fields) in a particular dataset.
- getVariableDetails.jsp: This returns details about a particular variable. These details currently consist of a list of the depth/height levels in the variable's grid and a suggested scale range for the variable (e.g. for a sea water temperature variable in degrees Celsius, this suggested scale range might be -10 to 40 degrees). The suggested scale range is used to generate the default colour scale, and is simply guessed by the server in many cases.
- getCalendar.jsp: This returns details of the dates for which data exist. Note that the current version of Godiva2 assumes that there is a maximum of one recorded snapshot per day.
In the current prototype
getDatasets.jsp and
getVariableDetails.jsp return XML data.
getDatasets.jsp and
getCalendar.jsp return HTML markup, which is displayed directly. This structure is not very neat (it does not separate data from presentation adequately) and will be improved.
Note that, to allow logical data structures other than the simple Dataset-Variable structure that GADS understands (see above), getDatasets.jsp and getVariables.jsp will ultimately be replaced by a single function that allows the user to navigate through an arbitrary hierarchical structure (e.g. /institution/dataset/ensemble_member/variable).
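One way such a navigation function might behave is sketched below in Python: given the full set of variable identifiers held by a provider, it returns the entries directly beneath a chosen folder. This is only an illustration under our own assumptions; the function name list_children and the trailing-slash convention for containers are hypothetical, and the eventual server function may look quite different.

```python
from typing import Iterable, List


def list_children(variable_ids: Iterable[str], folder: str) -> List[str]:
    """Return the sorted names directly beneath `folder`.

    Containers are returned with a trailing '/', variables without,
    so that a client can tell folders from displayable data.
    """
    prefix = folder.rstrip("/") + "/"
    children = set()
    for vid in variable_ids:
        if not vid.startswith(prefix):
            continue
        rest = vid[len(prefix):]
        name, sep, _ = rest.partition("/")
        if name:
            # sep is '/' if something follows (a container), '' for a variable
            children.add(name + sep)
    return sorted(children)
```

For example, calling it with the folder "/" would list the top-level institutions, and calling it with a dataset or ensemble member would list the variables it contains.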
Google Maps image server
The Google Maps (GM) GUI component displays images as a set of tiles in Mercator projection. Each tile is 256x256 pixels and is downloaded individually from the image server. In Godiva2, the GM image server is implemented as a Java servlet. Its job is to accept requests for images (via HTTP GET), extract the relevant data from the data store and render them as a PNG image according to the user's parameters. (PNG is chosen as an image format because it supports transparency.)
The request for an image is in the form of a URL that contains the name of the dataset, variable, z level, timestep, colour scale min/max, the x position of the tile on the Google Map, the y position of the tile and the zoom level. For example, the URL:
http://lovejoy.nerc-essc.ac.uk:8080/Godiva2/GmapsServlet?dataset=MERSEA_NATL_ANAL ...
... &variable=sea_water_potential_temperature&z=0&t=1084&scale=-10.0,35.0&x=1&y=1&zoom=2
generates an image tile representing sea water temperature over an area roughly corresponding to the North Atlantic from the UK Met Office's 1/9 degree FOAM model:
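A request of this form could be assembled as in the following Python sketch. The parameter names are those that appear in the example URL above; the helper name gmaps_tile_url is hypothetical, and the z and t values are simply passed through because their exact meaning is defined by the server.

```python
from urllib.parse import urlencode


def gmaps_tile_url(base, dataset, variable, z, t, scale_min, scale_max, x, y, zoom):
    """Build a tile request URL in the style of the Godiva2 example."""
    params = [
        ("dataset", dataset),
        ("variable", variable),
        ("z", z),
        ("t", t),
        ("scale", "%s,%s" % (scale_min, scale_max)),  # colour scale min,max
        ("x", x),        # tile column, supplied by the Google Maps component
        ("y", y),        # tile row
        ("zoom", zoom),  # zoom level
    ]
    # Keep the comma in the scale parameter unescaped, as in the example URL.
    return base + "?" + urlencode(params, safe=",")
```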
The last three parameters in the URL (x=1&y=1&zoom=2) are generated automatically by the Google Maps component and are translated by the server into latitude-longitude extents. Note that "missing values" (i.e. pixels representing land) are transparent, meaning that the underlying satellite image on the Google Map can show through.
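The translation from tile position to latitude-longitude extents can be sketched as follows. This Python example assumes the widely used Mercator tiling scheme, in which zoom level 0 is a single tile covering the world, each zoom level doubles the number of tiles in each direction, and tile (0, 0) is at the top left. The document does not state which convention the servlet uses, so treat this as an illustration rather than the server's actual code; under this assumption the example tile x=1, y=1, zoom=2 spans roughly the North Atlantic, which matches the description above.

```python
import math


def tile_bounds(x, y, zoom):
    """Return (west, south, east, north) in degrees for a Mercator tile.

    Assumes 2**zoom tiles along each axis, with tile (0, 0) at the top left.
    """
    n = 2 ** zoom

    def lon(col):
        # Longitude is linear in the tile column.
        return col / n * 360.0 - 180.0

    def lat(row):
        # Latitude follows the inverse Mercator transform of the tile row.
        return math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * row / n))))

    return (lon(x), lat(y + 1), lon(x + 1), lat(y))
```

Note that the northern edge of a tile is not a round number of degrees in general, because Mercator tiles are equally spaced in projected y rather than in latitude.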
Google Earth image server
The Google Earth (GE) image server component is very similar to the GM image server. It is also implemented as a Java servlet and also produces images in response to an HTTP GET request. Most of the parameters in the request URL are the same as those in the GM image server request. The main difference is that Google Earth requests images over arbitrary lat-long extents (it does not split images up into tiles as Google Maps does). Also, GE requires images in lat-long projection (Plate Carree), rather than the Mercator projection that GM requires.
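The practical consequence of the two projections is in how image rows map to latitude. The Python sketch below (function names are our own, for illustration only) computes the latitude at the centre of each pixel row for the same extent under each projection: in Plate Carree the rows are equally spaced in degrees, whereas in Mercator they are equally spaced in projected y and so bunch together towards the poles.

```python
import math


def plate_carree_row_lats(south, north, height):
    """Latitude of each pixel-row centre, top row first: equal steps in degrees."""
    step = (north - south) / height
    return [north - (i + 0.5) * step for i in range(height)]


def mercator_row_lats(south, north, height):
    """As above, but rows are equally spaced in Mercator y, not in latitude."""
    def to_y(lat):
        return math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))

    def to_lat(y):
        return math.degrees(2 * math.atan(math.exp(y)) - math.pi / 2)

    y_south, y_north = to_y(south), to_y(north)
    step = (y_north - y_south) / height
    return [to_lat(y_north - (i + 0.5) * step) for i in range(height)]
```

A server that renders both kinds of image therefore needs to sample the underlying data at different latitudes for the same bounding box.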
When the user clicks on "Open in Google Earth" on the Godiva2 interface, a KML file is generated automatically. This KML file can then be opened in Google Earth (on some systems this will happen automatically). The KML file contains a NetworkLink that contains a link to another KML file, which in turn contains the link to the image server. This apparently-complicated mechanism is necessary to ensure that when the user rotates the Google Earth globe, or zooms into it, the image is automatically refreshed. (In fact, this refreshing currently occurs rather too often: this may improve with the move to KML 2.1.)
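The outer KML file could look something like the output of the Python sketch below. This is an assumption-laden illustration, not the file that Godiva2 actually generates: the function name, the inner KML URL and the choice of refresh settings are all hypothetical. It uses the Url element of NetworkLink with viewRefreshMode set to onStop, which asks Google Earth to re-fetch the link when the camera stops moving.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://earth.google.com/kml/2.0"


def network_link_kml(name, inner_kml_url):
    """Return a KML document containing one view-refreshed NetworkLink."""
    kml = ET.Element("kml", xmlns=KML_NS)
    link = ET.SubElement(kml, "NetworkLink")
    ET.SubElement(link, "name").text = name
    url = ET.SubElement(link, "Url")
    # The second KML file, which in turn points at the image server.
    ET.SubElement(url, "href").text = inner_kml_url
    # Re-fetch whenever the user stops rotating or zooming the globe.
    ET.SubElement(url, "viewRefreshMode").text = "onStop"
    # Wait this many seconds after the camera stops (assumed value).
    ET.SubElement(url, "viewRefreshTime").text = "1"
    return ET.tostring(kml, encoding="unicode")
```

Increasing the viewRefreshTime value is one possible way of reducing how often the image is refreshed.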
An example of a URL used to generate a Google Earth image is:
http://lovejoy.nerc-essc.ac.uk:8080/Godiva2/GEarthPicServlet?dataset=MERSEA_NATL_ANAL ...
... &variable=sea_water_potential_temperature&z=0&t=1084&scale=-10,35&BBOX=-90,0,-30,60
Note that the bounding box of the picture is given as a BBOX argument (which is generated automatically by Google Earth on the basis of the current field of view on the GE globe). This produces the following picture:
Hosting a Godiva2 server
If an institution wishes to allow its data to be viewed through the Godiva2 interface, it will need to become a "data provider", i.e. it will need to run a metadata server, and image servers for Google Maps and Google Earth. As mentioned above, at ESSC these servers are implemented as Java servlets and JSPs within a single Tomcat server. However, any HTTP server system could be used provided that the interfaces remain the same (indeed ESSC might move to an Apache/mod_python system: see ImplementingGodiva2WithCDAT).
If the institution already holds data in CF-compliant NetCDF format then the process of becoming a data provider should be fairly straightforward, as the data files already contain all the necessary metadata. The metadata used in Godiva2 is described in the "Metadata server" section above.
In summary, a potential data provider will only need to host a single HTTP server (such as Tomcat or Apache). It is probably best if this can be a dedicated server, as the process of generating images "on the fly" for Google Maps and Google Earth can place a significant load on the server if there is a reasonable amount of activity. It is expected that the Google Maps image server will be the most active component because each GM display consists of several tiles, each of which has to be generated dynamically. When the user drags the map or zooms in, another set of tiles needs to be produced.
Other sources of Google Earth data
Future enhancements
Please contact resc@rdgNOSPAMPLEASE.ac.uk with any suggestions for enhancing the Godiva2 system. Here is a list of potential new features that we have in mind:
More general hierarchies of datasets and variables
See above.
Display of animated data
Currently neither Google Maps nor Google Earth can display animations (e.g. animated GIFs). This feature might appear in future versions. In the meantime it would be possible to display animations in a limited manner, for example by refreshing the image on the Google Map or Google Earth globe every second or so, moving forward by one timestep each time. This needs further investigation.
Use of OGC standards
Neither of the image-serving components complies with any particular standard. However, both Google Maps and Google Earth are able to obtain images from Web Map Servers (WMSs). The Web Map Service is an open specification for serving map images, from the Open Geospatial Consortium (OGC). A possible future modification to the image servers used in Godiva2 would be to make them compatible with the WMS specification for greater interoperability, so that they could be used outside Godiva2. The web interface would need to be modified to make WMS requests instead of the current custom requests; this would mean that the Godiva2 interface could display data from the many WMS servers that exist.
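To give a feel for what a WMS request looks like compared with the custom requests described earlier, the Python sketch below builds a GetMap URL using the standard WMS 1.1.1 parameter names. The server address and layer name used with it would be hypothetical; TIME and ELEVATION are the optional dimension parameters that WMS defines, which we assume would carry the timestep and depth/height level.

```python
from urllib.parse import urlencode


def wms_getmap_url(base, layer, bbox, width, height, time=None, elevation=None):
    """Build a WMS 1.1.1 GetMap URL for a lat-long (EPSG:4326) PNG image.

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees.
    """
    params = [
        ("SERVICE", "WMS"),
        ("VERSION", "1.1.1"),
        ("REQUEST", "GetMap"),
        ("LAYERS", layer),
        ("STYLES", ""),          # empty means the server's default style
        ("SRS", "EPSG:4326"),    # plain latitude-longitude
        ("BBOX", ",".join(str(v) for v in bbox)),
        ("WIDTH", width),
        ("HEIGHT", height),
        ("FORMAT", "image/png"),
        ("TRANSPARENT", "TRUE"),  # let the base map show through missing data
    ]
    if time is not None:
        params.append(("TIME", time))
    if elevation is not None:
        params.append(("ELEVATION", elevation))
    return base + "?" + urlencode(params, safe=",:/")
```

Note that the BBOX parameter has the same form as the one that Google Earth already supplies to the GE image server.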
Security
The Godiva2 server is currently completely unsecured. ESSC has an agreement with the UK Met Office whereby images of data can be provided freely but actual data need to be secured. This would be fine if we only ever wished to serve images of UKMO data, or other datasets with similar licences. However, this will not in general be acceptable for data providers.
In future, we intend to secure the Godiva2 server by requiring users to log in via an HTTPS (encrypted) page. This will give the user a security token (a long string of characters). In order to obtain images, data or metadata from the Godiva2 data providers, the user will need to use this token (which will be time-limited and linked to the user's IP address so that others cannot use it). This will also help to monitor usage. CAS (http://www.ja-sig.org/products/cas/) could be used for this.
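The idea of a time-limited token that is tied to an IP address can be illustrated with the Python sketch below. This is not how CAS works and is not a planned Godiva2 implementation: it is simply one common approach, in which the server signs the user name, IP address and expiry time with a secret key (an HMAC) so that the token cannot be altered or reused from another address. All names and the token layout are our own assumptions, and user names are assumed not to contain the "|" separator.

```python
import hashlib
import hmac
import time

# Assumption: a secret known only to the server(s) that issue and check tokens.
SECRET = b"replace-with-a-real-secret"


def _sign(payload):
    return hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user, ip, lifetime_s=3600, now=None):
    """Return 'user|ip|expiry|signature' valid for lifetime_s seconds."""
    expires = int((time.time() if now is None else now) + lifetime_s)
    payload = "%s|%s|%d" % (user, ip, expires)
    return payload + "|" + _sign(payload)


def check_token(token, ip, now=None):
    """True only if the token is untampered, unexpired and presented from `ip`."""
    try:
        user, token_ip, expires, sig = token.split("|")
        expires = int(expires)
    except ValueError:
        return False
    expected = _sign("%s|%s|%d" % (user, token_ip, expires))
    current = time.time() if now is None else now
    # compare_digest avoids leaking information through comparison timing
    return (hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
            and token_ip == ip
            and current < expires)
```

A data provider would call check_token with the address from which each image or metadata request arrives.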
This mechanism cannot prevent denial-of-service (DoS) attacks, in which an attacker grabs a user's security token, spoofs the user's IP address and makes repeated requests to the server. The attacker will not actually receive any data from the server (because the IP address has been spoofed) but this can cause the servers to become heavily loaded. More elaborate mechanisms can prevent this, but many service providers simply accept this as a known risk. The chances of this occurring to a Godiva2 site are probably low.
More efficient use of large datasets in Google Earth
See http://earth.google.com/kml/kml_21tutorial.html.
Intercomparison
A potentially-useful feature would be the ability to display two datasets on screen simultaneously and allow them to be compared. This could be done by: (1) displaying two Google Maps, with a different dataset in each map; or (2) displaying more than one layer of data on a single Google Map, and allowing the user to "fade" between them by setting the transparency of the layers.
In the case of (1), the two Google Maps could be linked such that dragging or zooming one map would cause the same effect in the other map, ensuring that they always display identical fields of view.
Simple data processing
One could imagine being able to perform simple data-processing operations on the Godiva2 site. These might include:
- Differencing fields (selecting two datasets and displaying the difference between them, which is calculated dynamically)
- Calculating averages of data over a certain time period.
Some of these things would require fundamental changes to the Godiva2 architecture. For example, to calculate the difference of two datasets that are hosted by different providers, we would need a mechanism to transfer actual data (not metadata or images) securely between providers.
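The operations themselves are simple once the data are in one place. The Python sketch below shows differencing and time-averaging on small 2-D grids held as lists of rows, with None standing for missing values (e.g. land). It assumes that both fields are already on the same grid; the function names are hypothetical and the sketch says nothing about how the data would be transferred between providers.

```python
def difference(a, b):
    """Cell-by-cell a - b for two 2-D grids; None marks missing data."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [
        [None if (va is None or vb is None) else va - vb
         for va, vb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def time_mean(snapshots):
    """Cell-by-cell mean over equally shaped grids, ignoring missing values."""
    rows, cols = len(snapshots[0]), len(snapshots[0][0])
    result = []
    for i in range(rows):
        row = []
        for j in range(cols):
            values = [s[i][j] for s in snapshots if s[i][j] is not None]
            # A cell that is missing in every snapshot stays missing.
            row.append(sum(values) / len(values) if values else None)
        result.append(row)
    return result
```

The resulting grid could then be rendered by the image servers in the same way as any other field.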
Collaboration with the NERC DataGrid
The NERC DataGrid are investigating similar issues of securely providing, processing and visualizing data and we are looking into how we can best collaborate with them in this respect.