Performance difference

I am trying to understand the performance difference when aggregating data along different dimensions (the first versus the last). Consider this array:

iquery -nq "create array foo <a:double>[d0=1:790599,7906,0, d1=1:7906,7906,0];"
iquery -naq "store(build(foo, d0+d1), foo);"

I ran these queries on SciDB 16.9 (my environment consists of 16 nodes):

time iquery -naq "consume(aggregate(foo, sum(a), d0));"
Query was executed successfully

real	0m34.103s
user	0m0.020s
sys	0m0.000s

time iquery -naq "consume(aggregate(foo, sum(a), d1));"
Query was executed successfully

real	1m21.017s
user	0m0.012s
sys	0m0.004s

Why is there such a difference between these two queries? In simple terms, how does SciDB process aggregates: does it iterate over the input chunks in order and write the data into the output chunks (possibly out of order), or does it gather data from the input chunks out of order and write each output chunk in order? Or neither?

Another scenario is this:

time iquery -naq "consume(between(foo, null, null, 1000, 7000));"
Query was executed successfully

real	0m3.168s
user	0m0.016s
sys	0m0.000s


time iquery -naq "consume(subarray(foo, null, null, 1000, 7000));"
Query was executed successfully

real	0m35.528s
user	0m0.012s
sys	0m0.016s

Why is subarray so much slower than between?

Hi,

The first example, involving aggregates, can be explained by how data payloads are stored within chunks. Our data is stored in row-major order, so reading in the column-wise direction tends to be slower because consecutive values along that direction are not adjacent in memory.
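As a loose analogy (not SciDB code), the same row-major effect shows up in NumPy, which also stores arrays in row-major order by default. Summing along the last axis (like grouping by d0) reads contiguous memory, while summing along the first axis (like grouping by d1) strides across rows:

```python
import numpy as np

# Toy stand-in for foo's chunk payload; values 0..11 laid out row-major.
a = np.arange(12, dtype=float).reshape(3, 4)

# Group by axis 0 (sum over the last axis): reads contiguous runs.
row_sums = a.sum(axis=1)   # analogous to aggregate(foo, sum(a), d0)

# Group by axis 1 (sum over the first axis): strided, non-adjacent reads.
col_sums = a.sum(axis=0)   # analogous to aggregate(foo, sum(a), d1)

print(row_sums)  # sums of [0..3], [4..7], [8..11]
print(col_sums)  # sums of each column, 4 elements apart in memory
```

On an array this small both are instant, of course; the stride penalty only matters at SciDB-chunk scales.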

The second example can be explained by the fact that the between() operator returns an array of the same shape as the input, with the values that don't satisfy the between criteria nulled out. Since the output chunks are necessarily on the same instance before and after, the copies go very quickly. The subarray() operator creates an output array of a completely different shape - smaller - with the qualifying data re-indexed from zero. Those chunks will likely be copied between instances, and that data transfer is slower.
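The shape difference can be sketched with a NumPy analogy (again, not SciDB code): a between()-like operation masks cells in place and keeps the original shape and coordinates, while a subarray()-like operation slices out a smaller array whose origin is re-indexed to zero:

```python
import numpy as np

foo = np.arange(1, 13, dtype=float).reshape(3, 4)

# between()-like: same shape, out-of-box cells nulled (NaN here).
# Chunks keep their coordinates, so they can stay on the same instance.
between_like = np.full_like(foo, np.nan)
between_like[0:2, 0:2] = foo[0:2, 0:2]

# subarray()-like: a new, smaller array re-indexed from zero.
# In SciDB this re-indexing can force chunks to move between instances.
subarray_like = foo[0:2, 0:2].copy()

print(between_like.shape)   # unchanged shape
print(subarray_like.shape)  # smaller, re-origined shape
```

The extra cost of subarray() is not the slicing itself but the redistribution implied by the new coordinate system.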

Hope this helps and thanks for using SciDB.

-Jason Kinchen