What is the core data storage medium?


#1

Hello all,

I am confused about where exactly the data is stored in SciDB. Installing SciDB also installed PostgreSQL. Is SciDB more like a management system built around the SciDB engine that uses PostgreSQL to store the data, or is PostgreSQL only responsible for the SciDB system catalog data?
In short, what is the storage medium in SciDB for storing those arrays?

Thanks.


#2

Postgres only stores the metadata - array names, dimensions, attributes, residency and, in EE, user accounts, etc. Postgres is just the “catalog”, and we aspire to phase it out.

The SciDB config file (usually /opt/scidb/16.9/etc/config.ini) specifies a “base-path”, AKA the “data directory”. That's where the data is stored. You can also use the “data-dir-prefix” directories to split that storage among multiple drives on each node.
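
If you want to check these settings from a script, here is a minimal Python sketch that reads them with the standard `configparser` module. It is not a SciDB tool. The section name `mydb`, the paths, the helper name `storage_settings`, and the exact spelling of the `data-dir-prefix-...` keys are assumptions made for illustration, so compare them with your own config.ini.

```python
import configparser

# Hypothetical config contents; a real file would live at a path like
# /opt/scidb/16.9/etc/config.ini and will contain different values.
SAMPLE_CONFIG = """
[mydb]
base-path=/home/scidb/data
data-dir-prefix-0-0=/disk1/scidb
data-dir-prefix-0-1=/disk2/scidb
"""

def storage_settings(config_text):
    """Return the base-path and any data-dir-prefix entries per section."""
    # Interpolation is disabled so '%' characters in paths are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    settings = {}
    for section in parser.sections():
        options = dict(parser.items(section))
        settings[section] = {
            "base-path": options.get("base-path"),
            # Assumption: every key starting with "data-dir-prefix" names
            # an extra storage directory.
            "data-dirs": sorted(
                value for key, value in options.items()
                if key.startswith("data-dir-prefix")
            ),
        }
    return settings

print(storage_settings(SAMPLE_CONFIG))
```

To run it against a real installation, read the file contents first and pass the text to `storage_settings`.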


#3

Thank you for the fast response.
I understand that the base path is where the data is stored, but what format is the data stored in? For example, MongoDB uses JSON files.
Here is a screenshot of my data directory. Do those numbers work as codes for separating different types of data?


I really appreciate your help.


#4

Well, the format is binary and specific to SciDB, optimized for representing dense and sparse arrays. There’s some information here: SciDB’s MAC(tm) Storage Explained

As for your screenshot: in the datastores directory you will see files corresponding to each array ID.
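
To see how much disk each array takes, a rough Python sketch like the one below can total the file sizes per ID. It assumes that the numeric part of each file name before the first dot is the array ID. That naming rule, the helper name `datastore_sizes`, and the directory you pass in are all assumptions to check against your own data directory.

```python
from pathlib import Path

def datastore_sizes(datastores_dir):
    """Map numeric file-name stems (assumed to be array IDs) to bytes on disk."""
    sizes = {}
    for path in Path(datastores_dir).rglob("*"):
        if not path.is_file():
            continue
        # Assumption: the text before the first dot is the array ID.
        stem = path.name.split(".")[0]
        if stem.isdigit():
            array_id = int(stem)
            sizes[array_id] = sizes.get(array_id, 0) + path.stat().st_size
    return sizes
```

Call it with the path of the datastores directory under your base-path. Files whose names do not start with a number are ignored.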


#5

Thanks for the explanation. If you hadn't pointed that out, it might have taken me some time to realize that the numbers are array IDs.