Styx Grid Services (SGSs) are a means for wrapping command-line programs and allowing them to be used remotely over the Internet. When deployed as an SGS, a program can be run from anywhere on the Internet exactly as if it were a local program. (Note that the Styx Grid Services system will soon be renamed VEIGA: the Very Easy Interface to Grid Applications. Watch this space!)
The Styx Grid Services software is built on top of, and bundled with, the JStyx library, a pure-Java implementation of the Styx protocol for distributed systems. See the JStyx documentation for more details.
Styx Grid Services are useful when you want to run a program on a machine that is not your own (they are somewhat analogous to Web Services).
Styx Grid Services are very simple to install and use and require a minimum of software (just a Java virtual machine and the JStyx libraries).
SGSs can be composed into "workflows", in which a number of SGSs, perhaps in different locations, can be combined using very simple shell scripts to create distributed applications. Data can be streamed directly between the services along the shortest network path.
Let us consider a simple distributed application, consisting of two Styx Grid Services. The first is called calc_mean and reads a set of input files from a set of scientific experiments, calculates their mean and outputs the result as a file. The second SGS is called plot and it might be deployed in a completely different location from the first service. It takes a single input data file and turns it into a graph. The shell script (workflow) that would be used to take a set of input files, calculate their mean and plot a graph of the result would be:
calc_mean input*.dat -o mean.dat
plot -i mean.dat -o graph.gif
The important thing to note is that this is exactly the same script as would be used to run the programs if they were installed locally. The input files for each service have been detected and uploaded automatically and the output files have been automatically downloaded.
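To get a feel for the shape of this workflow without installing anything, the sketch below defines local stand-ins for the two programs as shell functions. These stand-ins are assumptions made purely for illustration: they are not the real calc_mean and plot, they simply average every number found in the input files and draw a text bar into graph.txt instead of producing a GIF. The test data is also invented.

```shell
#!/bin/sh
# Hypothetical local stand-ins for calc_mean and plot.
# They are NOT the real services; they only mimic the command-line shape.

# calc_mean FILE... -o OUTPUT : write the mean of all numbers in the inputs.
calc_mean() {
    cm_inputs=""
    cm_out=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -o) cm_out="$2"; shift 2 ;;
            *)  cm_inputs="$cm_inputs $1"; shift ;;
        esac
    done
    # $cm_inputs is left unquoted on purpose so it splits into file names
    # (this toy version assumes file names without spaces).
    awk '{ for (i = 1; i <= NF; i++) { sum += $i; n++ } }
         END { if (n > 0) printf "%.2f\n", sum / n }' $cm_inputs > "$cm_out"
}

# plot -i INPUT -o OUTPUT : write the value followed by a bar of '#' marks.
plot() {
    pl_src=""
    pl_dst=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -i) pl_src="$2"; shift 2 ;;
            -o) pl_dst="$2"; shift 2 ;;
            *)  shift ;;
        esac
    done
    awk '{ bar = ""; for (i = 0; i < int($1); i++) bar = bar "#"; print $1, bar }' \
        "$pl_src" > "$pl_dst"
}

# Invented sample data in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1 2 3\n' > input1.dat
printf '4 5 6\n' > input2.dat

# The workflow itself, with the same two-step shape as the script above.
calc_mean input*.dat -o mean.dat
plot -i mean.dat -o graph.txt
cat graph.txt
```

With the sample data above, mean.dat holds 3.50 and graph.txt holds the value followed by three '#' marks. The point of the sketch is only that the two workflow lines at the end do not change depending on where the programs behind calc_mean and plot actually run.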
The intermediate file mean.dat can be passed directly between the two services (i.e. without being downloaded by the client) with a small change to the script:
calc_mean input*.dat -o mean.dat.sgsref
plot -i mean.dat.sgsref -o graph.gif
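The idea of handing over a reference instead of the data can be illustrated with a toy model. The text here does not say what a .sgsref file contains or how the services exchange the data, so everything below is an assumption: the sketch supposes the file holds a single line locating the data, uses two scratch directories to stand in for the remote host and the client, and uses hypothetical produce and consume functions in place of the real programs.

```shell
#!/bin/sh
# Toy model of passing an intermediate file by reference.
# ASSUMPTION: a ".sgsref" file holds one line that locates the data.
# The real file format and transfer mechanism are not described here.

remote=$(mktemp -d)   # stands in for storage on the first service's host
client=$(mktemp -d)   # stands in for the user's working directory

# produce OUTPUT : hypothetical producer that writes the value 3.50.
produce() {
    case "$1" in
        *.sgsref)
            data="$remote/$(basename "$1" .sgsref)"
            echo "3.50" > "$data"     # data stays on the "remote" side
            echo "$data" > "$1"       # client receives only its location
            ;;
        *)
            echo "3.50" > "$1"        # ordinary case: data lands on the client
            ;;
    esac
}

# consume INPUT : hypothetical consumer that follows a reference if given one.
consume() {
    case "$1" in
        *.sgsref) cat "$(cat "$1")" ;;
        *)        cat "$1" ;;
    esac
}

produce "$client/mean.dat.sgsref"
result=$(consume "$client/mean.dat.sgsref")
echo "$result"
```

In this model the client directory only ever holds the small reference file, while the consumer still reads the full data. That mirrors the behaviour described above, where the intermediate file is not downloaded by the client.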
Here are some publications about the SGS system in (roughly) decreasing order of usefulness:
Jon Blower, Andrew Harrison, Keith Haines, Styx Grid Services: Lightweight, easy-to-use middleware for scientific workflows, accepted for oral presentation at the International Conference on Computational Science 2006, and for publication in Lecture Notes in Computer Science.
Jon Blower, Keith Haines, Ed Llewellin, Styx Grid Services: lightweight, easy-to-use middleware for e-Science, presented in the UK e-Science booth at SuperComputing, Seattle, 15-17 November 2005.
Jon Blower, Keith Haines, Ed Llewellin, Data streaming, workflow and firewall-friendly Grid Services with Styx, Proceedings of the UK e-Science All Hands Meeting, 19-22 September 2005.