PRECIS is designed to run on PCs running the Linux operating system, including desktops, laptops and some types of servers. We refer to these as "shared memory" systems. This reflects PRECIS's philosophy of being easy to use and having low infrastructure requirements.
A desktop PC running a standard Linux distribution (e.g. Red Hat, openSUSE, Debian) will run PRECIS (including PRECIS 2.0) with no problem. The faster the CPU, the better (e.g. a Core i7 will run PRECIS faster than a Core i3).
Cluster systems sometimes have software installed that distributes running processes across the cluster's CPUs as efficiently as possible. The larger the cluster (e.g. a supercomputer), the more likely this is to be the case. We refer to these as "distributed memory" systems. Clusters that use this distribution software (also referred to as scheduling, queueing or job control software) will not run PRECIS in its default configuration. They can run PRECIS, but an expert software engineer from the PRECIS team has to install it manually. This may take up to 2 days, depending on how complex the cluster's job control system is, and we do not provide this service free of charge. Sometimes, however, cluster systems let Linux itself control the distribution of running processes.
Example: Here at the Met Office we have about 40 shared Linux dual quad-core "shared memory" servers which any internal employee can log into and use. If I want to run PRECIS in multiprocessor mode on one of these servers, using all 8 cores at 100% CPU usage, I can do so with the default PRECIS installation. However, if anyone else logs in to the server and runs software with high CPU utilisation, it will slow down my PRECIS experiments.
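To make the shared-server point concrete, here is a minimal sketch (not part of PRECIS) of how you might check whether other users are already keeping the cores busy before starting a multiprocessor run. It assumes a Linux or other Unix system where Python's `os.getloadavg()` is available. The function names `cores_free` and `report` are made up for illustration, and treating the 1-minute load average as "number of busy cores" is only a rough approximation.

```python
import math
import os


def cores_free(load_1min: float, total_cores: int) -> int:
    """Roughly estimate idle cores from the 1-minute load average.

    The load average is rounded up, so the estimate errs on the side of
    reporting fewer idle cores.
    """
    busy = min(total_cores, math.ceil(load_1min))
    return total_cores - busy


def report() -> str:
    """Summarise core count and current load on this machine."""
    total = os.cpu_count() or 1
    load_1min, _, _ = os.getloadavg()  # Unix only
    free = cores_free(load_1min, total)
    return f"{total} cores, 1-min load {load_1min:.2f}, roughly {free} idle"


if __name__ == "__main__":
    print(report())
```

If the estimate is well below the number of cores you plan to use, someone else is probably running a heavy job and your experiment will run more slowly.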
Thus, answering the question "Will PRECIS run on a cluster?" is not simple. Ideally, we would ask for remote access to the cluster (via ssh) to investigate the system setup ourselves.
We also need to know:
* The configuration of the compute nodes (e.g. how many cores per node, total number of nodes, memory etc).
* The type of interconnect.
* The amount of disk storage.
* The type of job control system, if known (e.g. Sun Grid Engine, Portable Batch System).
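If it helps, some of this information can be collected with a short script run on one node. The following is a rough sketch, not an official PRECIS tool. It assumes a Linux node with `/proc/meminfo`, and it guesses the job control system from which submit command is on the PATH (`qsub` for Sun Grid Engine or Portable Batch System; `sbatch` for Slurm, which is my addition and not mentioned above). All function names are hypothetical. It cannot tell you the total number of nodes or the type of interconnect, so your system administrator would still need to supply those.

```python
import os
import shutil

# Submit commands used as a heuristic for guessing the job control system.
# This mapping is an assumption for illustration, not an exhaustive list.
SCHEDULER_COMMANDS = {
    "qsub": "Sun Grid Engine or Portable Batch System",
    "sbatch": "Slurm",
}


def total_memory_gib(meminfo_path="/proc/meminfo"):
    """Return total RAM in GiB from /proc/meminfo, or None if unavailable."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    kib = int(line.split()[1])  # value is reported in kB
                    return kib / (1024 ** 2)
    except OSError:
        pass
    return None


def detect_schedulers(which=shutil.which):
    """List job control systems whose submit command is found on the PATH."""
    return [name for cmd, name in SCHEDULER_COMMANDS.items() if which(cmd)]


def describe_node(path="/"):
    """Collect core count, memory, disk space and scheduler hints for this node."""
    usage = shutil.disk_usage(path)
    return {
        "cores": os.cpu_count(),
        "memory_gib": total_memory_gib(),
        "disk_total_gib": usage.total / 1024 ** 3,
        "disk_free_gib": usage.free / 1024 ** 3,
        "schedulers": detect_schedulers(),
    }


if __name__ == "__main__":
    for key, value in describe_node().items():
        print(f"{key}: {value}")
```

Run it on a compute node rather than a login node if the two differ, and pass the path of the file system where model output would be stored to `describe_node`.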
David