FAQ


  1. I would like to request an account on XYZ.
  2. I would like to request an account on all the clusters.

  3. Please send this kind of request to cluster-admin@democritos.it, specifying either:
    Here you can find some documentation about local clusters:
    READ IT and subscribe to the cluster users' mailing list:


  4. Houston, we have a problem...

  5. Please write to cluster-admin@democritos.it.
    E-mail about cluster problems sent to personal addresses (e.g. Moreno/Marco/...) will be ignored.


  6. I wrote to cluster-admin@..., but I received a message that says: "Your message is being held until the list moderator can review it for approval"

  7. That's fine: your message is held until one of us approves your post (which should happen as soon as possible).


  8. [briareo/hokule] I cannot find a file/directory on the scratch filesystem. What happened?
  9. [briareo/hokule] I get the message "Stale NFS file handle" when I try to access the /scratch filesystem. What does it mean?

  10. The GPFS filesystem may be down or unmounted; contact cluster-admin@democritos.it.


  11. [briareo/hokule] There's a hung job in the queue. What can I do?

  12. The PBS service (the pbs_mom daemon) has probably stopped on the remote node. An automatic script checks the status of the daemons every hour.
    Just wait, or contact cluster-admin@democritos.it.


  13. I tried to link my application against the MKL libraries, but the routine DGxxxx is missing. What should I do?

  14. Remember that the MKL libraries contain only a subset of the full LAPACK routines, so please link MKL and LAPACK together like this:
    ifc myprog.f -L/usr/local/intel/mkl/lib/32 -L/usr/local/lib/ -lifclapack_std ...
    g77 myprog.f -L/usr/local/intel/mkl/lib/32 -L/usr/local/lib/ -lgnulapack_std ...

    where std stands for standard, i.e. the standard LAPACK package downloadable from www.netlib.org.


  15. [mulo/somaro] May I use the MKL libraries on the openMosix cluster?

  16. Sure, why not?
    Note, however, that MKL is a threaded library, so applications linked against it CANNOT migrate to free (or less busy) CPUs. To make them migratable, tell your application to use a single thread by setting the OMP_NUM_THREADS environment variable to 1. Your application will then migrate without any problem.

    To define the OMP_NUM_THREADS environment variable:

    for [t]csh: setenv OMP_NUM_THREADS 1
    for [ba]sh: export OMP_NUM_THREADS=1
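
    If you would rather not change your whole environment, the setting can be applied to a single command. The following is only a sketch: the function name run_single_thread is made up for illustration, and it relies on the standard env utility.

```shell
#!/bin/bash
# Hypothetical helper (the name is illustrative, not a cluster tool):
# run one command with OMP_NUM_THREADS=1 without touching the rest
# of the shell environment.
run_single_thread() {
    env OMP_NUM_THREADS=1 "$@"
}
```

    For example, "run_single_thread ./myprog" would start myprog with a single thread while the rest of your session keeps its current settings.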


  17. [mulo/somaro] I get the "cp: skipping file `foo', as it was replaced while being copied" message when trying to use cp under mfs.

  18. This is a known bug of cp when working on oMFS (or an oMFS bug in the implementation of some *stat function handler).
    As a simple workaround we have installed a safe cp command as /bin/mfscp and aliased it for all users.
    If you still get the above error, you have aliased the cp command in your own rc files; to avoid the problem, use /bin/mfscp in your cp aliases (and use the same trick in scripts or makefiles that work on /mfs).
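
    In scripts, where aliases are normally not expanded, one possible way to apply the same trick is a small wrapper function. This is a sketch only: safe_cp and the MFSCP variable are illustrative names, and it assumes /bin/mfscp accepts the same arguments as cp.

```shell
#!/bin/bash
# Path of the safe copy command mentioned above; MFSCP is a hypothetical
# override variable, not something the cluster defines.
MFSCP=${MFSCP:-/bin/mfscp}

# Copy with /bin/mfscp when it is installed, otherwise fall back to cp.
# Assumption: mfscp takes the same arguments as cp.
safe_cp() {
    if [ -x "$MFSCP" ]; then
        "$MFSCP" "$@"
    else
        cp "$@"
    fi
}
```

    A script working on /mfs would then call "safe_cp src dst" wherever it previously called cp.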


  19. [mulo/somaro] I want to run a job that produces huge output. Which filesystem should I use?

  20. If your job is I/O intensive (frequent I/O operations), consider using mfs (the openMosix filesystem).
    Create your own directory under /mfs/<PREFEREDNODE>/local_scratch/ (e.g. mkdir /mfs/5/local_scratch/foo) and then run your job bound to that (remote) filesystem (see below for a simple example).
    Keep in mind that in this case you no longer benefit from openMosix load balancing (this is a sort of "manual balancing").
    To see which nodes have few or no jobs doing I/O on mfs, you can use this command line:
    	ompsinfo -A -a mfs | grep -v /
    
    Note that mfs is not a stable filesystem: too many concurrent I/O-bound jobs may crash the node involved in the I/O operations.

    If your job only produces a large output (without intensive I/O), the local scratch directory is all you need.


    Example of using the openMosix filesystem:
    As a script:
    	#!/bin/bash
    	cd /mfs/5/local_scratch/foo/
    	myexe 1>./output.log 2>./output.err
    On the command line:
    	myexe >/mfs/5/local_scratch/foo/output.log
    If your program has its own command-line option for the output file:
    	myexe --output /mfs/5/local_scratch/foo/output.log
    
    
    Here you can find another example.
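
    When choosing a node, it can help to tally how many processes are already bound to each mfs node. The function below is a sketch under assumptions: mfs_load is a made-up name, and it assumes the ompsinfo output starts with a header line containing an MFS column, as in the sample output shown elsewhere in this FAQ. If your output looks different, adapt it.

```shell
#!/bin/bash
# Hypothetical helper: read ompsinfo-style output on stdin and print
# "<mfs node> <number of processes>" for every node in use.
# Assumes a header line with a column named MFS; processes whose MFS
# field is "/" (not bound to mfs) are skipped.
mfs_load() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "MFS") col = i; next }
        col && $col != "/" { count[$col]++ }
        END { for (n in count) print n, count[n] }
    ' | sort -n
}
```

    It would be used as "ompsinfo -A -a mfs | mfs_load"; nodes that do not appear in the result have no processes bound to mfs.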


  21. [mulo/somaro] I run my job over mfs as suggested, but it does not seem to migrate. What's wrong?

  22. Your job probably uses threads or something else that cannot migrate.
    If your program is linked against the MKL libraries, it uses threads (see the MKL question above).
    Launching such jobs bound to a specific node is useless unless they can migrate to that node.
    	somaro:~# ompsinfo -a mfs,ppid -u xyz
    	CMD                PID  PPID NODE NMIGS MFS LOCK CANTMOVE
    	sh                2903     1    0     0   5    0 migratable
    	sshd              5356  5352    0     0   /    0 migratable
    	tcsh              5357  5356    0     0   /    0 migratable
    	b                 5366  2903    0     0   5    0 clone_vm
    	b                 5367  5366    0     0   5    0 clone_vm
    	b                 5368  5367    0     0   5    0 clone_vm
    	b                 5369  5367    0     0   5    0 clone_vm
    
    In this example there are several threads (at least two, one per processor, plus their parents) running on the master node instead of node005.

    My advice is to set the OMP_NUM_THREADS environment variable to 1 in your scripts or in your environment, as described in the MKL answer above.
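
    To spot the offending processes quickly, the listing above can be filtered on its CANTMOVE column. This is a sketch only: non_migratable is an illustrative name, and it assumes the header line shown in the sample output (with PID and CANTMOVE columns).

```shell
#!/bin/bash
# Hypothetical filter: read ompsinfo-style output on stdin and print
# "<pid> <command> <reason>" for every process whose CANTMOVE field is
# anything other than "migratable". Column positions are taken from
# the header line, so extra columns do not matter.
non_migratable() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "PID") p = i
                if ($i == "CANTMOVE") c = i
            }
            next
        }
        p && c && $c != "migratable" { print $p, $1, $c }
    '
}
```

    For example, "ompsinfo -a mfs,ppid -u xyz | non_migratable" would list only the clone_vm processes from the sample above.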


  23. [mulo/somaro] Mathematica complains about some fonts that are not installed. It works, but the output is hard to read...

  24. Try a script like this:
    	xset fp+ tcp/somaro.sissa.it:7100
    	xhost + somaro.sissa.it
    	ssh [<USERNAME>@]somaro mathematica
    



M. Baricevic, March 2004
Last modified: Tue, 16 Mar 2004 - 19:41:49 CET