Install SciDB on multiple EC2 nodes


#1

Hi,

Has anyone installed SciDB on multiple EC2 nodes? Once a stopped EC2 instance is restarted, its IP address changes. As a result, it seems that I have to reset the IP addresses of the worker nodes in the SciDB configuration file and then reinitialize SciDB. Is there any way to avoid this reconfiguration?

Thanks!

-Yi


#2

Same question here. The official documentation is here: https://paradigm4.atlassian.net/wiki/display/ESD/Installing+SciDB+Community+Edition

It says that at the “pre-installation” step, users need to specify the host IPs.
In EC2, when instances are restarted, will we need to redo the pre-installation and rebuild everything again?

-Don


#3

There are various features in EC2 and other cloud providers that let an IP address persist across restarts. Some providers offer a kind of virtual subnet with fixed IP addresses (on EC2, instances in a VPC keep their private IP across stop/start), and others let you reserve a set of static IPs (Elastic IPs on EC2). You could also set up some kind of DNS and use hostnames instead of IPs in the config.ini.
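
If you go the hostname route, it may be worth checking that every name resolves on each node before starting SciDB. Here is a minimal Python sketch of such a check. The hostnames are hypothetical placeholders, and the `resolver` parameter is only there so the function can be tried out without real DNS; by default it uses the standard `socket.gethostbyname`.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def resolve_hosts(
    hostnames: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Map each hostname to its resolved IPv4 address, or None on failure."""
    results: Dict[str, Optional[str]] = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            results[name] = None
    return results


if __name__ == "__main__":
    # Hypothetical node names; substitute whatever you put in config.ini.
    for host, ip in resolve_hosts(["scidb-coordinator", "scidb-worker1"]).items():
        print(host, "->", ip if ip else "DOES NOT RESOLVE")
```

Any name reported as unresolved would need a DNS record or an /etc/hosts entry on that node.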


#4

So there’s no (easy) way to reconfigure the node IPs after the initial deployment?


#5

Well… it’s fairly easy to use DNS and use hostnames instead of IPs.

If you must reconfigure the IPs in SciDB itself, you could update the config.ini file, then update the “instance” table in the SciDB Postgres catalog to match, and then restart SciDB. That should work as well.
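
For the config.ini half of that, a rough Python sketch is below. It assumes the server entries look like `server-<N>=<host>,<number>`; that line shape, the section name, and the sample addresses are assumptions for illustration, so check them against your own file. It does not touch the Postgres catalog; the “instance” table still has to be updated separately.

```python
import re
from typing import Dict

# Assumed entry shape: "server-<N>=<host>,<number>" (verify against your config.ini).
SERVER_LINE = re.compile(r"^(\s*server-\d+\s*=\s*)([^,\s]+)(.*)$")


def rewrite_server_hosts(config_text: str, host_map: Dict[str, str]) -> str:
    """Replace the host part of server-N entries according to host_map.

    Lines that are not server-N entries, or whose host is not in host_map,
    are returned unchanged. A trailing newline in the input is not preserved.
    """
    out = []
    for line in config_text.splitlines():
        m = SERVER_LINE.match(line)
        if m and m.group(2) in host_map:
            line = m.group(1) + host_map[m.group(2)] + m.group(3)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Made-up sample; old IP on the left of the map, new IP on the right.
    sample = "[mydb]\nserver-0=10.0.0.1,0\nserver-1=10.0.0.2,1"
    print(rewrite_server_hosts(sample, {"10.0.0.2": "10.0.0.9"}))
```

Back up the original config.ini before writing the result over it.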


#6

Some of my colleagues point out that you can also use the Linux /etc/hosts file to map fixed hostnames to the current IP addresses.
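
As an illustration of the /etc/hosts approach, the small Python sketch below turns a name-to-IP mapping into lines in the standard hosts format (IP address, whitespace, hostname). The hostnames and addresses are invented for the example; after a restart you would regenerate the lines with the new IPs and put them in /etc/hosts on every node, leaving config.ini untouched.

```python
from typing import Dict, List


def hosts_entries(ip_by_name: Dict[str, str]) -> List[str]:
    """Return /etc/hosts-style lines ("<ip>\t<hostname>"), sorted by hostname."""
    return [f"{ip}\t{name}" for name, ip in sorted(ip_by_name.items())]


if __name__ == "__main__":
    # Hypothetical cluster; replace with your own names and current IPs.
    current_ips = {
        "scidb-coordinator": "10.0.0.11",
        "scidb-worker1": "10.0.0.12",
    }
    print("\n".join(hosts_entries(current_ips)))
```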


#7

Thanks! Will try! -Don


#8

Just out of curiosity: what if a user wants to add a node to, or remove a node from, the SciDB cluster? Would that require a complete redeployment?


#9

The P4 Enterprise Edition allows for Elasticity - adding new nodes on the fly. This video shows an example. In the Community Edition this is harder.

P4 does provide an Academic License to interested researchers - check with info@paradigm4.com


#10

Thanks! Then I guess extending the SciDB AMI on EC2 (which I believe is a single-node deployment) to multiple instances won’t be trivial. I’d better compile from source on a cluster of EC2 instances.