The NERC Cluster Grid
A Grid is a collection of computational resources connected in a way that allows them to be shared across administrative domains. Shared resources can include High Performance Computing (HPC) and High Throughput Computing (HTC) clusters, data storage facilities and scientific instruments. Grids support the operation of Virtual Organisations that cross geographical and institutional boundaries.
The NERC Cluster Grid aims to foster collaboration between Natural Environment Research Council (NERC) institutions by making it easier for them to share their HPC and HTC cluster resources. The following key features are required for the NERC Cluster Grid:
- Load and performance monitoring
- Minimal data footprint on remote clusters
- Easy job submission and control
- Security
Job submission and control facilities are provided by the Grid Remote eXecution (G-Rex) software, which is being developed at the Reading e-Science Centre (ReSC). G-Rex is the successor to Styx Grid Services (SGS) [1,2,3]. It is lightweight middleware that allows applications residing on remote computational resources to be launched and controlled as if they were running on the user's own computer.
The NEMO ocean model has been deployed as a G-Rex service on HPC clusters at the Environmental Systems Science Centre (ESSC), the Proudman Oceanographic Laboratory (POL) and the British Antarctic Survey (BAS). Users with a G-Rex account can run NEMO on any of these clusters without making significant changes to their model setup or scripts.
There are plans to use G-Rex as the core grid middleware in two current NERC e-science projects: GCEP ("Grid Coupled Ensemble Prediction") and GCOMS ("Global Coastal Ocean Modelling System"). This will involve deploying the HadCM3 coupled climate model and the POLCOMS ocean model as G-Rex services on a wide range of compute resources around the country, including clusters in the NERC Cluster Grid.
The Ganglia grid monitoring system (http://ganglia.sourceforge.net/) has been installed at ReSC. The Ganglia Web Frontend for the NERC Cluster Grid shows load and performance statistics for the clusters at ESSC, POL and BAS. It can be viewed at the following URL:
http://www.resc.reading.ac.uk/ganglia/
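As a rough illustration of the kind of per-cluster load summary that Ganglia makes possible, the Python sketch below parses a small XML report written in the style of Ganglia's output and totals the one-minute load and CPU count for each cluster. The sample report is hand-written and heavily simplified, the cluster and host names and all values are invented, and the function name summarise_load is hypothetical; none of this is taken from the NERC Cluster Grid installation.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample in the style of a Ganglia XML report.
# Cluster names, host names and values are invented for illustration only.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.0.5" SOURCE="gmetad">
  <GRID NAME="example-grid">
    <CLUSTER NAME="cluster-a">
      <HOST NAME="node1.example">
        <METRIC NAME="load_one" VAL="0.5" TYPE="float"/>
        <METRIC NAME="cpu_num" VAL="4" TYPE="uint16"/>
      </HOST>
      <HOST NAME="node2.example">
        <METRIC NAME="load_one" VAL="1.5" TYPE="float"/>
        <METRIC NAME="cpu_num" VAL="4" TYPE="uint16"/>
      </HOST>
    </CLUSTER>
    <CLUSTER NAME="cluster-b">
      <HOST NAME="node1.example">
        <METRIC NAME="load_one" VAL="0.25" TYPE="float"/>
        <METRIC NAME="cpu_num" VAL="2" TYPE="uint16"/>
      </HOST>
    </CLUSTER>
  </GRID>
</GANGLIA_XML>"""


def summarise_load(xml_text):
    """Return {cluster name: (total one-minute load, total CPU count)}."""
    root = ET.fromstring(xml_text)
    summary = {}
    for cluster in root.iter("CLUSTER"):
        total_load = 0.0
        total_cpus = 0
        for host in cluster.iter("HOST"):
            for metric in host.iter("METRIC"):
                name = metric.get("NAME")
                if name == "load_one":
                    total_load += float(metric.get("VAL"))
                elif name == "cpu_num":
                    total_cpus += int(metric.get("VAL"))
        summary[cluster.get("NAME")] = (total_load, total_cpus)
    return summary


if __name__ == "__main__":
    # Print one line per cluster: name, summed load, summed CPU count.
    for cluster_name, (load, cpus) in summarise_load(SAMPLE_XML).items():
        print(cluster_name, load, cpus)
```

Comparing the summed load with the CPU count gives a quick indication of which cluster has spare capacity, which is the sort of information a user might want before choosing where to submit a job.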
More information about the NERC Cluster Grid and G-Rex can be found in the following poster and presentation material from recent meetings.
--
DanBretherton - 27 Nov 2007