SciDB aggregate operations


#1

Hi,
I’m a newbie. I want to know whether SciDB aggregate operations can execute on multiple CPUs in parallel.
Thanks.


#2

Hi @pengji,

Yes, SciDB aggregate operations can execute on multiple CPUs in parallel. To enable this, set the number of instances per server in the config.ini file and choose an appropriate chunk size for the array. The chunk is the atomic unit of I/O within SciDB, so the chunk size determines how finely the work can be divided among instances.
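To illustrate why the chunk size matters for parallelism, here is a minimal Python sketch, not SciDB code. The function names `split_into_chunks` and `assign_chunks` are hypothetical, and the round-robin placement is only an assumption made for illustration; SciDB's actual chunk placement policy may differ.

```python
def split_into_chunks(values, chunk_size):
    """Cut a 1-D list into consecutive chunks of at most chunk_size cells."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def assign_chunks(chunks, num_instances):
    """Hypothetical round-robin placement of chunks onto instances.

    This is an illustrative assumption, not SciDB's real distribution scheme.
    """
    placement = {instance: [] for instance in range(num_instances)}
    for index, chunk in enumerate(chunks):
        placement[index % num_instances].append(chunk)
    return placement


values = list(range(10))

# Small chunks: every one of the 4 instances receives some data.
fine = assign_chunks(split_into_chunks(values, chunk_size=2), num_instances=4)

# One huge chunk: only a single instance has anything to work on.
coarse = assign_chunks(split_into_chunks(values, chunk_size=10), num_instances=4)

busy_fine = sum(1 for chunks in fine.values() if chunks)
busy_coarse = sum(1 for chunks in coarse.values() if chunks)
print(busy_fine, busy_coarse)
```

In this toy model, a chunk that is too large leaves most instances idle, while very small chunks would add per-chunk overhead, so the chunk size has to be balanced against the number of instances.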

Jonathan Rivers


#3

Thank you. I would like to read the source code for this part, but I don’t understand it well and can’t find the code that handles concurrent execution. Could you tell me something about the code?


#4

The concurrency is achieved by running multiple SciDB instance processes. Note that AggregatePartitioningOperator::execute(), for example, is called by every SciDB instance, each with its own portion of the data. You can then see a merging step (redistributeToRandomAccess()) that merges the partially aggregated data from the different instances.
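The partial-aggregate-then-merge pattern described above can be sketched in Python as follows. This is not the SciDB implementation: the names `partial_aggregate`, `merge_partials`, and `parallel_avg` are hypothetical, and threads stand in for the separate SciDB instance processes purely to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(chunks):
    """Work done by one simulated instance: a partial (sum, count) over its chunks."""
    total = 0
    count = 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    return total, count


def merge_partials(partials):
    """Merge step: combine the partial states from all instances into one."""
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


def parallel_avg(per_instance_chunks):
    """Run the partial aggregation on each simulated instance, then merge.

    Assumes at least one instance and at least one cell overall.
    """
    with ThreadPoolExecutor(max_workers=len(per_instance_chunks)) as pool:
        partials = list(pool.map(partial_aggregate, per_instance_chunks))
    total, count = merge_partials(partials)
    return total / count


# Three simulated instances, each holding its own chunks of the array.
data = [
    [[1, 2], [3]],   # instance 0
    [[4, 5, 6]],     # instance 1
    [[7]],           # instance 2
]
print(parallel_avg(data))  # prints 4.0
```

Carrying a (sum, count) pair rather than a per-instance average is what makes the merge correct: averaging the averages would give the wrong answer when instances hold different numbers of cells.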