Not able to execute matrix multiplication


#16

If it is asking for a password, it means your passwordless ssh to subu-desktop is not set up. SciDB needs passwordless ssh access to ALL machines in the cluster, including the one you are running on. Here is the link to the documentation on one way to configure your machine for passwordless ssh:
http://www.paradigm4.com/HTMLmanual/14.8/scidb_ug/apas01s01.html#d0e31875

Please set up your passwordless ssh to your machine (subu-desktop) and test it again as suggested by donghui in the previous post.
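For reference, a minimal sketch of that setup (assuming OpenSSH; the user and host names subu/subu-desktop are taken from this thread - adjust to yours):

```shell
# Create a passphrase-less key if one doesn't exist yet, then authorize it
# for logins to this same machine (single-machine cluster case).
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# This should now work without a password prompt:
# ssh subu@subu-desktop echo ok
```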


#17

As you suggested, I established passwordless ssh connections to [color=#0000FF]subu-desktop, scidb@localhost, scidb@127.0.0.1[/color]. Even after establishing passwordless ssh to these machines, the same error appears when I try to execute the [color=#0000FF]gemm[/color] command. The error generated is as follows:

[color=#FF0000]AFL% gemm(m2x3,m3x2,z);
SystemException in file: src/mpi/MPISlaveProxy.cpp function: checkLauncher line: 59
Error id: scidb::SCIDB_SE_INTERNAL::SCIDB_LE_OPERATION_FAILED
Error description: Internal SciDB error. Operation ‘MPI launcher process already terminated’ failed.[/color]


#18

Just to be absolutely clear - are you able to passwordlessly login to your machine via ssh?
The error seems to be pointing to a problem with MPI. Here is the doc entry for debugging the MPI issues:
http://www.paradigm4.com/HTMLmanual/14.8/scidb_ug/apbs01.html

Take a look at the article above and see if those suggestions fix your problem. Please let us know how it goes.


#19

I had already tried all the possibilities explained in MPI Issues except the following:

[color=#0000BF]Log in once: ssh scidb@. At the following prompt, answer yes.

Are you sure you want to continue connecting (yes/no)?[/color]

Here I can't locate the worker.

One more doubt: can you explain how to passwordlessly log in to my machine via ssh? I don't have any LAN set up here.


#20

I think your configuration is for a single machine (subu-desktop). If this is true, this is the only machine you need passwordless ssh for. “Worker” typically refers to another machine that is different from the one with the default coordinator, but still part of the cluster.
To test passwordless access, you need to login as follows:
ssh “your username”@subu-desktop (or ssh "your username"@127.0.0.1)

If you are prompted for a password, then passwordless ssh is not set up properly (be sure to omit the double quotes in the user name - I included them only to make my point).
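A quick non-interactive way to test this (BatchMode makes ssh fail instead of prompting, so a broken setup shows up immediately; user/host names as above):

```shell
# Exits non-zero instead of asking for a password if key auth is not set up.
ssh -o BatchMode=yes -o ConnectTimeout=5 subu@subu-desktop echo ok \
  || echo "passwordless ssh to subu-desktop is NOT working"
```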


#21

Yes, it is possible for me to log in passwordlessly to the system using both
ssh subu@subu-desktop and ssh subu@127.0.0.1


#22

Alright - try the gemm query again. Did you make sure the /etc/hosts file is OK and does not have any extraneous entries?


#23

The error still appears while executing gemm.

The /etc/hosts content is listed below:

[color=#FF0000]127.0.0.1 localhost
192.168.1.7 subu-desktop

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters[/color]


#24

Let’s look at the logs again, see viewtopic.php?f=11&t=1478#p3403


#25

Please see the tar file [color=#0000FF]all_20141112220950.tar [/color]uploaded via ftp


#26

From the mpi logs it seems like there is still a connection/networking problem. While we are looking for possible resolutions, please run these debugging commands and post all results:
[ul]
[li]lsb_release -a[/li]
[li]sudo ufw status[/li]
[li]ssh subu-desktop echo test[/li]
[/ul]
The first command checks your exact OS version. The second checks the Ubuntu firewall (do you know of any other firewall software running on your machine?). The third re-verifies passwordless ssh one more time.


#27

[color=#0000FF]subu@subu-desktop:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.5 LTS
Release: 12.04
Codename: precise[/color]

[color=#FF0000]subu@subu-desktop:~$ sudo ufw status
[sudo] password for subu:
Status: inactive[/color]

[color=#4000FF]subu@subu-desktop:~$ ssh subu-desktop echo test
test[/color]


#28

Try running the following:
dpkg -l | grep mpich2

echo subu-desktop > hosts.txt
/opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f hosts -n 4 uname -a
/opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f hosts -n 4 hostname


#29

The output generated after running the commands specified in the previous post is as follows:

[color=#FF0000]subu@subu-desktop:~$ dpkg -l | grep mpich2
ii scidb-14.8-libmpich2-1.2 1.2.1.1-4 Shared libraries for MPICH2
ii scidb-14.8-libmpich2-dev 1.2.1.1-4 Development files for MPICH2
ii scidb-14.8-mpich2 1.2.1.1-4 Implementation of the MPI Message Passing Interface standard[/color]

[color=#0000FF]subu@subu-desktop:~$ echo subu-desktop > hosts.txt[/color]

[color=#FF4000]subu@subu-desktop:~$ /opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f hosts -n 4 uname -a
[mpiexec@subu-desktop] HYDU_parse_hostfile (./utils/args/args.c:169): unable to open host file: hosts
[mpiexec@subu-desktop] mfile_fn (./ui/mpiexec/utils.c:195): error parsing hostfile
[mpiexec@subu-desktop] match_arg (./ui/mpiexec/utils.c:1054): match handler returned error
[mpiexec@subu-desktop] HYD_uii_mpx_get_parameters (./ui/mpiexec/utils.c:1262): argument matching returned error

Usage: ./mpiexec [global opts] [exec1 local opts] : [exec2 local opts] : …

Global options (passed to all executables):

Global environment options:
-genv {name} {value} environment variable name and value
-genvlist {env1,env2,…} environment variable list to pass
-genvnone do not pass any environment variables
-genvall pass all environment variables (default)

Other global options:
-f {name} file containing the host names
-wdir {dirname} working directory to use

Local options (passed to individual executables):

Local environment options:
-env {name} {value} environment variable name and value
-envlist {env1,env2,…} environment variable list to pass
-envnone do not pass any environment variables
-envall pass all environment variables (default)

Other local options:
-n/-np {value} number of processes
{exec_name} {args} executable name and arguments

Hydra specific options (treated as global):

Bootstrap options:
-bootstrap bootstrap server to use
-bootstrap-exec executable to use to bootstrap processes
-enable-x/-disable-x enable or disable X forwarding

Proxy options (only needed for persistent mode):
-boot-proxies boot proxies to run in persistent mode
-boot-foreground-proxies boot foreground proxies (persistent mode)
-shutdown-proxies shutdown persistent mode proxies
-proxy-port port for proxies to listen (boot proxies)
-use-persistent use persistent mode proxies to launch

Communication sub-system options:
-css communication sub-system to use

Resource management kernel options:
-rmk resource management kernel to use

Hybrid programming options:
-ranks-per-proc assign so many ranks to each process
-enable/-disable-pm-env process manager environment settings

Process-core binding options:
-binding process-to-core binding mode
-bindlib process-to-core binding library (plpa)

Checkpoint/Restart options:
-ckpoint-interval checkpoint interval
-ckpoint-prefix checkpoint file prefix
-ckpointlib checkpointing library (blcr)
-ckpoint-restart restart a checkpointed application

Other Hydra options:
-verbose verbose mode
-info build information
-print-rank-map print rank mapping
-print-all-exitcodes print exit codes of all processes[/color]

[color=#4000FF]subu@subu-desktop:~$ /opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f hosts -n 4 hostname
[mpiexec@subu-desktop] HYDU_parse_hostfile (./utils/args/args.c:169): unable to open host file: hosts
[mpiexec@subu-desktop] mfile_fn (./ui/mpiexec/utils.c:195): error parsing hostfile
[mpiexec@subu-desktop] match_arg (./ui/mpiexec/utils.c:1054): match handler returned error
[mpiexec@subu-desktop] HYD_uii_mpx_get_parameters (./ui/mpiexec/utils.c:1262): argument matching returned error

(… identical mpiexec usage text as in the previous command's output …)[/color]


#30

Please run the corrected commands:

echo subu-desktop > hosts.txt
/opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f ./hosts.txt -n 4 uname -a
/opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f ./hosts.txt -n 4 hostname
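For clarity, the earlier failure in #29 was only a filename mismatch: the file was written as hosts.txt but mpiexec was given -f hosts. A small sketch that keeps the two in sync (mpiexec path as used in this thread; that line is left commented since it only exists on a SciDB install):

```shell
# Write the hostfile and hand mpiexec the same path.
# /tmp is used here in case the home directory is not writable.
HOSTFILE=/tmp/hosts.txt
echo subu-desktop > "$HOSTFILE"
cat "$HOSTFILE"   # sanity check: should print subu-desktop
# /opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f "$HOSTFILE" -n 4 hostname
```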


#31

Output obtained while running the commands given in the previous post:

[color=#0000FF]subu@subu-desktop:~$ echo subu-desktop > hosts.txt
bash: hosts.txt: Read-only file system[/color]

[color=#FF4000]subu@subu-desktop:~$ /opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f ./hosts.txt -n 4 uname -a
Linux subu-desktop 3.8.0-44-generic #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Linux subu-desktop 3.8.0-44-generic #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Linux subu-desktop 3.8.0-44-generic #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Linux subu-desktop 3.8.0-44-generic #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux[/color]

[color=#8000BF]subu@subu-desktop:~$ /opt/scidb/14.8/3rdparty/mpich2/bin/mpiexec.hydra -f ./hosts.txt -n 4 hostname
subu-desktop
subu-desktop
subu-desktop
subu-desktop[/color]


#32

It still seems like some sort of an ssh problem. What username did you specify for the -u switch during the cluster install? Was it scidb or some other username?


#33

subu@subu-desktop:~$ echo subu-desktop > hosts.txt
bash: hosts.txt: Read-only file system

Seems like a problem … it looks like you cannot write to ~subu/. Assuming subu is the user under whom SciDB runs, this is pretty non-standard and we have never tested such an environment. Can you write to /tmp? (touch /tmp/foo as the SciDB user). If not, that is a problem for sure.
Also, make sure that your iptables and SELinux are either configured correctly or turned off. See paradigm4.com/HTMLmanual/14. … 01s02.html.
Anyway, the first step is to make sure that the mpiexec.hydra command from my previous post works correctly.
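A quick sketch of that writability check, covering both the home directory and /tmp (run it as the user SciDB runs under):

```shell
# Try to create and remove a scratch file in each directory.
for d in "$HOME" /tmp; do
  f="$d/.scidb_write_test.$$"
  if touch "$f" 2>/dev/null; then
    echo "$d is writable"
    rm -f "$f"
  else
    echo "$d is NOT writable"
  fi
done
```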


#34

Hi,
Thank you for your support. I am planning to reinstall SciDB. Last time I followed the instructions in the video presentation on the Paradigm4 site. Are there any additional preparations to be made before the installation? Any suggestions?
Regards,
Subu


#35

One thing that may help is to shut down all Postgres services other than Postgres 8.4; that will make it easier to configure Postgres 8.4 during the installation. Otherwise, please follow the instructions for the cluster install located here:
https://github.com/Paradigm4/deployment/tree/master