Keep in mind that a SciDB process can be memory-hungry. A micro instance may not have enough RAM.
When setting up EC2 clusters, the big headaches are IP addresses, password-less SSH and ports. EC2 has VPC and “Placement Group” features that may help ease the pain with IP addresses.
You need to make sure Postgres is running on the coordinator and that its pg_hba.conf file allows the other instances to connect.
22 for SSH. Instances need to be able to SSH to each other without a password as the scidb user.
5432 for Postgres on the coordinator, accessible to all instances
All instances need to be able to communicate with all other instances on the SciDB ports:
The SciDB coordinator runs on port 1239
Other instances run on ports 1240 and up: the first instance on each machine uses 1240, and each additional instance increments the port by 1.
So if you are running 8 instances on each machine, you would need to open 1239-1247
If you are running 1 instance on each machine, open 1239-1240
8080 on coordinator accessible to the outside world (for HTTP clients if you want to use it)
8787 on coordinator accessible to the outside world (for RStudio server if you want to use it)
8888 on coordinator accessible to the outside world (for IPython notebook if you want to use it)
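The SciDB port arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule described in this post (coordinator on 1239, instances from 1240 upward); the function name `scidb_port_range` is made up for the example and is not part of SciDB.

```python
COORDINATOR_PORT = 1239  # from the post: the SciDB coordinator listens here


def scidb_port_range(instances_per_machine: int) -> tuple[int, int]:
    """Return the inclusive (low, high) port range to open between machines.

    Hypothetical helper, not a SciDB API: it just encodes the rule that
    instances on each machine start at 1240 and increment by 1.
    """
    if instances_per_machine < 1:
        raise ValueError("need at least one instance per machine")
    first_instance_port = COORDINATOR_PORT + 1
    last_instance_port = first_instance_port + instances_per_machine - 1
    return COORDINATOR_PORT, last_instance_port


for n in (1, 8):
    low, high = scidb_port_range(n)
    print(f"{n} instance(s) per machine: open {low}-{high}")
```

The two cases in the loop are the same ones worked out above (1 and 8 instances per machine), so the printed ranges can be compared directly against the numbers in this post.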
For high-performance linear algebra, SciDB uses ScaLAPACK/MPI. We recently discovered that MPI will pick a random port on every query.
So to run linear algebra (operators gemm and gesvd) you need to open nearly all ports:
10000-65535 accessible to all instances. Note: you don’t need to open them to the outside world, just for the instances to talk to each other.
If you don’t do this, operators gemm/gesvd won’t work, but the rest of the system will run OK.
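Once the security group is set up, a quick way to see whether the fixed ports are reachable from one instance to another is a plain TCP connect test. The sketch below uses only the Python standard library; the helper names `port_is_open` and `check_ports` are invented for this example. It can only confirm ports where something is already listening (SSH, Postgres, running SciDB instances), so it will not tell you anything about the random MPI ports.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host: str, ports) -> dict:
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, from a worker instance you might call `check_ports("10.0.0.12", [22, 5432, 1239, 1240])`, where `10.0.0.12` stands in for your coordinator's private IP. A `False` can mean either that the port is blocked or that nothing is listening on it yet, so run the check while Postgres and SciDB are up.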
Let us know how it goes!