Metadata and GCEP

When a large group of people collaborate, it is important that it is easy to see and understand each other's results. It should also be easy for individuals to remember when, where and why a particular experiment was run. It has therefore been suggested that we use some form of standardised metadata across GCEP. One possibility is to write our own; another is to adopt an existing standard.

Within CGAM, Katherine Bouton is working on Numerical Model Metadata (NMM) XML. It is very thorough, documenting everything from the original code base the model was generated from to the final scientific parameters used in the model run. It is also largely automatic, harvesting most of the data it needs by itself. This is important, as scientists are generally averse to filling in metadata forms themselves. Once a model run has been archived in NMM XML, it can be found through a searchable catalogue. The NMM XML project is still at an early stage: most of the XML is set up, and there are a few tools for harvesting metadata that work with the UM. At the moment there are no special features for grid applications and ensembles. However, I have talked to Katherine and she says that the final standard has not yet been decided, so we could have some input there. She is keen to become involved in GCEP, as we could provide good feedback for her.

Also in CGAM, Jonathan Gregory is one of the creators of the Climate and Forecast (CF) Metadata Convention. It is a standardised way of describing climate and forecast model runs to facilitate building applications with powerful extraction, regridding, and display capabilities. It is an extension of COARDS and was developed for netCDF. This might be a disadvantage if we have some data only in PP format. Also, I am not sure to what extent initial conditions are described, which is important in GCEP. Jonathan is out of the department at the moment, but as far as I can see CF does not have any special support for grid applications and ensembles either.

It is important that we choose a standard that is widely used, so that any applications we build are useful to others and so that others can see our metadata. NMM XML has been considered by BADC and climateprediction.net. They are also part of the Global Organization for Earth System Science Portals (GO-ESSP), which will help them develop NMM XML to international standards. CF has been adopted by PCMDI and *MIP, PRISM, ESMF, NCAR, the Hadley Centre, GFDL and various EU projects. Whichever metadata we decide to go for, I think it is important that the final aim is to integrate it with the multi-disciplinary Geography Markup Language (GML) and the standards of the Open Geospatial Consortium Inc (OGC). An article about this can be found here.

Any comments? Do you know of any other possible metadata standards that could be useful? -- LeonHermanson - 07 Jun 2006

: Hmm, well, I would prefer to stick with the one the Hadley Centre are using. Except it's for netCDF :-( Is it a good idea to have different (incompatible) doc schemes? - William

A clarification: The difference between CF and NMM is that CF only describes the data, while NMM describes both data and the process (including the model) which created it. NMM uses CF standards to describe the data, so they are in fact compatible. I guess the choice for us centres on how comprehensive we want our metadata to be. - Leon
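
As a rough illustration of that distinction, here is a minimal Python sketch. The `standard_name` and `units` attributes are genuine CF attribute names, but the `RunDescription` class and all of its fields are hypothetical stand-ins for the process-level information NMM is said to capture; they are not taken from the NMM XML schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataDescription:
    """CF-style description: says what the data is, nothing about its origin."""
    standard_name: str  # real CF attribute name
    units: str          # real CF attribute name


@dataclass
class RunDescription:
    """Hypothetical process-level record (field names are assumptions).

    It wraps CF-style data descriptions and adds how the data was produced.
    """
    model_name: str
    initial_conditions: str
    ensemble_member: int
    variables: List[DataDescription] = field(default_factory=list)


def describes_process(record) -> bool:
    """True only for records that carry information about the model run."""
    return isinstance(record, RunDescription)


# A data-only description, and a run record that reuses it for its data part.
temperature = DataDescription(standard_name="air_temperature", units="K")
run = RunDescription(
    model_name="UM",
    initial_conditions="unspecified",  # placeholder value, not real metadata
    ensemble_member=1,
    variables=[temperature],
)
```

In this sketch the run record contains the data descriptions rather than replacing them, which mirrors the point that the two schemes are compatible rather than competing.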

Topic revision: r3 - 08 Jun 2006 - 09:38:04 - LeonHermanson
 