Inferno Grid Diary

Here are some notes about my (JonBlower's) experiences with installing and using an Inferno Grid.

Background

I am using the Inferno Grid distribution dated 30/08/2004, from a CD supplied by Vita Nuova. I have highlighted problems (with the software or the documentation) in red.

Installing the server software

Linux

I am installing the Inferno scheduler on a Linux server that is permanently "up": lovejoy.nerc-essc.ac.uk (192.171.166.111). This is also our Web Services server but doesn't see very high traffic (a better, permanent home might have to be found).
  • Copied the Inferno Grid CD contents to /users/resc/InfernoGridCD (using scp)
  • Note that the instructions in D:\tutorial\install.htm are not quite correct. To install the Inferno Grid, you have to create an installation directory, then run the appropriate script file. I ran:
mkdir /users/resc/infernogrid
sh /users/resc/InfernoGridCD/server/install/Linux-grid-386.sh /users/resc/infernogrid
  • Edited /users/resc/infernogrid/grid/master/schedaddr to contain the text tcp!localhost!infsched. Make sure that there is no newline at the end of the file!
  • Edited /users/resc/infernogrid/grid/master/startsched.sh so that all the paths were correct. The full text of the new script is:
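#!/bin/sh
# Start the Inferno Grid scheduler inside a hosted Inferno (emu) instance.
INFERNO_HOME=/users/resc/infernogrid
PATH=$INFERNO_HOME/Linux/386/bin:$PATH
export PATH
# Optional "-a address": record the scheduler address before starting.
case $1 in
-a)     case $2 in
        "")     echo gridsched: missing address for -a >&2; exit 1;;
        # This line runs in the host shell, so the path must be under INFERNO_HOME;
        # printf avoids the trailing newline that echo would add.
        *)      printf '%s' "$2" >"$INFERNO_HOME/grid/master/schedaddr" || exit 1;;
        esac
        shift 2 ;;
-*)     echo gridsched: unknown option $1 >&2; exit 1 ;;
esac
# Run the connection server (ndb/cs), then the scheduler; append all output to verbose.out.
emu -r$INFERNO_HOME -pmain=134217728 -pheap=134217728 \
        /dis/sh.dis -c \
        'ndb/cs; /grid/master/startsched $*' "$@" \
        < /dev/null 2>&1 | tee -a $INFERNO_HOME/grid/master/verbose.out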
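
The "no newline" requirement on schedaddr is easy to get wrong with a text editor, so here is a minimal sketch of writing and checking the file from the Linux shell. The helper names write_schedaddr and has_no_trailing_newline are my own (hypothetical) and are not part of the Inferno Grid distribution; the demo writes to a temporary directory rather than to the real installation.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Inferno Grid distribution.

# Write the address in $1 to the file in $2 with no trailing newline.
write_schedaddr() {
    # printf '%s' emits its argument exactly; echo would append a newline
    printf '%s' "$1" > "$2"
}

# Exit status 0 only if the file is non-empty and its last byte is not a newline.
has_no_trailing_newline() {
    # command substitution strips a trailing newline, leaving an empty string
    [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]
}

# Demo in a scratch directory (assumes mktemp is available, as on most Linux systems)
demo_dir=$(mktemp -d)
write_schedaddr 'tcp!localhost!infsched' "$demo_dir/schedaddr"
if has_no_trailing_newline "$demo_dir/schedaddr"; then
    echo "schedaddr OK"
else
    echo "schedaddr ends with a newline" >&2
fi
```

For the real file, the second argument would be /users/resc/infernogrid/grid/master/schedaddr.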

A note about Windows

Apparently it is not recommended to install the scheduler under Windows: it is not yet properly tested, and some vagaries of the underlying Windows filesystem mean that it doesn't work properly. This will (probably) be fixed in future releases.

Installing the client software

Windows

In the first instance I'm installing the client (worker node) software on Windows as this will probably be the majority platform in a Campus Grid.
  • I installed the client software on my Windows 2000 desktop (dual processor AMD Athlon 2600+ with 1 GB RAM, marlow.nerc-essc.ac.uk) using D:\client\install\setup.exe from the CD. I accepted the default destination directory of C:\VNClient.
  • This installs a cut-down version of Inferno (32 MB instead of 130 MB). The 32 MB still includes directories for other architectures, which are probably redundant.
    • I'm pretty sure it's safe to remove the FreeBSD, Debian, Linux, MacOSX, Solaris and Irix directories, which reduces the total size of the installation from 32 MB to 3.6 MB.
  • Edited the file C:\VNClient\grid\slave\schedaddr to read tcp!192.171.166.111!infsched. Make sure that there is no newline at the end of the file!
  • Copied the file C:\VNClient\grid\slave\tasks\test.job into C:\VNClient\grid\slave\. This allows arbitrary code to be run on this worker node, and I am only doing this because we are in a trusted network!
  • Repeated this installation on my Windows XP (~1 GHz, 768 MB RAM) laptop, called lichen.nerc-essc.ac.uk.

Linux

I then rebooted my laptop (lichen) into Red Hat Linux 9 and installed the client software as follows:
  • Logged on as user inferno, then created a new directory, /home/inferno/infernogridclient.
  • Mounted the Inferno Grid CD, then ran /mnt/cdrom/client/install/Linux-grid-386.sh /home/inferno/infernogridclient.
  • Ran the new Inferno instance with /home/inferno/infernogridclient/Linux/386/bin/emu -r/home/inferno/infernogridclient.
  • Started the grid slave software from the Inferno shell with /grid/slave/startslave.
  • The new node appeared in the Monitor (running on another machine) but the CPU speed didn't appear - maybe Inferno can't automatically determine the CPU speed of a Linux machine?
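
The Linux client steps above can be sketched as a small script. This is only an illustration under assumptions: install_grid_client is a hypothetical helper of my own, and it invokes the installer via sh (as I did for the server) so that it does not depend on the execute bit being honoured on the CD mount.

```shell
#!/bin/sh
# Hypothetical helper wrapping the manual install steps; not part of the distribution.
# $1 = root of the mounted CD (or a copy of it), $2 = directory to install into
install_grid_client() {
    cd_root=$1
    target=$2
    installer=$cd_root/client/install/Linux-grid-386.sh
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer" >&2
        return 1
    fi
    # The installer expects the installation directory to exist already
    mkdir -p "$target" || return 1
    sh "$installer" "$target"
}

# Example, using the paths from the notes above:
#   install_grid_client /mnt/cdrom /home/inferno/infernogridclient
#   /home/inferno/infernogridclient/Linux/386/bin/emu -r/home/inferno/infernogridclient
# and then, at the Inferno shell prompt:
#   /grid/slave/startslave
```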

Starting the server/scheduler

  • Back on lovejoy (the scheduler machine), I started the scheduler from the (Linux) command line with /users/resc/infernogrid/grid/master/startsched.sh. Got the message "+++++++++++++++ start Thu Dec 09 14:16:13 GMT 2004". All appears well.
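
Because startsched.sh pipes everything through tee -a into verbose.out, the start message should also end up in that log. Here is a rough sketch for pulling out the most recent one. The helper name last_sched_start is hypothetical, and the second start line in the demo log is made-up sample data.

```shell
#!/bin/sh
# Prefix of the start message quoted above
banner='+++++++++++++++ start'

# Hypothetical helper: print the most recent start banner found in the log file $1.
last_sched_start() {
    # -F matches the pattern as a fixed string, so '+' is not treated as a regex operator
    grep -F -e "$banner" "$1" | tail -n 1
}

# Demo on a scratch log; only the first line comes from a real run
demo_log=$(mktemp)
printf '%s\n' \
    "$banner Thu Dec 09 14:16:13 GMT 2004" \
    "some other scheduler output" \
    "$banner Fri Dec 10 09:00:00 GMT 2004" > "$demo_log"
last_sched_start "$demo_log"
```

On lovejoy the log to inspect would be /users/resc/infernogrid/grid/master/verbose.out.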

Starting the client software

  • Installed the client software as a Windows service on both machines (marlow and lichen) by running C:\VNClient\grid\slave\install_service.bat on each machine in turn.
  • The service will start automatically on next reboot, but I started the service manually using Control Panel->Administrative Tools->Services
    • Note that the service can be started straight after installation by adding the line "net start InfernoGridService" to the end of install_service.bat

Monitoring the Inferno Grid

To monitor the grid, there is an Inferno GUI application. This can be run (from either the scheduler or a worker node) as follows:
  • Start Inferno. On the scheduler (Linux) machine I ran: /users/resc/infernogrid/Linux/386/bin/emu -r/users/resc/infernogrid -g800x600. On the client (Windows) machine I used the shortcut in Start Menu -> Programs -> Vita Nuova Grid Client -> VNClient
  • At the command prompt that appears, type ndb/cs and press Enter. (Is this necessary? Does this get run automatically?)
  • Then type wm/wm. The Inferno window manager should pop up.
  • Open a shell window and run scheduler/monitor -A tcp!192.171.166.111!infsched.
    • Note that on the Client nodes, the window manager has a shortcut to "Monitor", which can be used to start the monitor.
  • When I ran the monitor, I could see entries for both worker nodes (marlow and lichen) and another entry marked (M) denoting the machine that I'm running the monitor on. All appears well.

Installing across many more nodes

  • In Computer Science, we (ChristopherChapman, IanBland, JonBlower) installed the Inferno Grid client software on several Windows and Linux boxes. This proceeded with few problems.
  • In the case of Windows, we installed on one machine from CD, set the contents of the schedaddr file and copied test.job into C:\VNClient\grid\slave. Then we copied the whole VNClient directory to each machine in turn and started the service on each. (Chris added the line "net start InfernoGridService" to the end of install_service.bat so that the service starts straight away after it is installed.)
  • On Linux (a Fedora Core 2 machine called swallow.rdg.ac.uk), we created a new user and group, both called inferno, and installed the Grid software as above under /opt/inferno. We adapted the scheduler's startup script into a script for starting the grid slave:
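#!/bin/sh
# Start the Inferno Grid slave inside a hosted Inferno (emu) instance.
INFERNO_HOME=/opt/inferno
PATH=$INFERNO_HOME/Linux/386/bin:$PATH
export PATH
# Optional "-a address": record the scheduler address before starting.
case $1 in
-a)     case $2 in
        "")     echo gridslave: missing address for -a >&2; exit 1;;
        # This line runs in the host shell, so the path must be under INFERNO_HOME;
        # printf avoids the trailing newline that echo would add.
        *)      printf '%s' "$2" >"$INFERNO_HOME/grid/slave/schedaddr" || exit 1;;
        esac
        shift 2 ;;
-*)     echo gridslave: unknown option $1 >&2; exit 1 ;;
esac
# Run the connection server (ndb/cs), then the slave; append all output to verbose.out.
emu -r$INFERNO_HOME -pmain=134217728 -pheap=134217728 \
        /dis/sh.dis -c \
        'ndb/cs; /grid/slave/startslave $*' "$@" \
        < /dev/null 2>&1 | tee -a $INFERNO_HOME/grid/slave/verbose.out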
  • Some questions arose:
    • The Inferno Grid monitor does not correctly display the CPU speed or memory for Fedora Core 2, and does not display the CPU speed for Red Hat 9. Can we correct this?
    • Can we run the grid slave startup script (above) as a proper daemon, and add it to the system startup scripts?

-- JonBlower - 09 Dec 2004

Topic revision: r7 - 13 Dec 2004 - 20:42:00 - JonBlower
 