I am trying to find out why SciDB has relatively slow performance.
I have stored a 1024x1024 color image in SciDB using the following commands:
csv2scidb -s 1 -p NNNNN < test.csv > test.scidb
iquery -a -q "CREATE ARRAY Michalis <r:int64, g:int64, b:int64> [i=0:1023,1000,0, j=0:1023,1000,0]"
iquery -a -q "CREATE ARRAY MichalisFlat < i:int64, j:int64, r:int64, g:int64, b:int64 > [v=0:*,1000000,0]"
iquery -q "LOAD MichalisFlat FROM '/var/www/html/stra/test.scidb'"
iquery -a -q "redimension_store(MichalisFlat,Michalis)"
test.csv looks like
Since the chunk size is 1000 for both i and j, a full chunk holds 10^6 cells at 24 bytes each (three int64 attributes), which is roughly 24 MB. In other words, this image is essentially a single chunk, if I am correct.
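The chunk arithmetic above can be checked with a short calculation. This is only a minimal sketch: it assumes 8 bytes per int64 value and ignores any storage overhead, compression, or the way SciDB actually lays out attributes on disk. `chunk_layout` is a hypothetical helper written for this post, not part of SciDB.

```python
import math

BYTES_PER_CELL = 3 * 8  # three int64 attributes (r, g, b), 8 bytes each (assumed)


def chunk_layout(dim_length, chunk_length):
    """Return (chunks per dimension, total chunks) for a square 2-D array."""
    per_dim = math.ceil(dim_length / chunk_length)
    return per_dim, per_dim * per_dim


per_dim, total = chunk_layout(1024, 1000)
full_chunk_bytes = 1000 * 1000 * BYTES_PER_CELL  # one completely filled chunk
image_bytes = 1024 * 1024 * BYTES_PER_CELL       # the whole image

print(per_dim, total)      # 2 4
print(full_chunk_bytes)    # 24000000
print(image_bytes)         # 25165824
```

Because 1024 is slightly larger than 1000, the grid comes out as 2 x 2: one full 1000 x 1000 chunk of about 24 MB plus three small edge chunks. With a chunk length of 10, the same function gives 103 x 103 = 10609 chunks.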
When I run the same query multiple times, the time needed is exactly the same each time, as if there were no cache.
I consistently get the same time (~3.3s!) for this query:
With smaller chunk sizes of 10 for both i and j, the same query goes up to ~8.3s!
These times are pure query time, with no printing involved.
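For anyone who wants to reproduce the comparison, repeated timings can be collected with a small harness like the one below. It is a rough sketch, not how the numbers above were measured: it times the whole client process from the outside, so it includes client start-up and result transfer on top of the query time. `time_command` is a hypothetical helper, and the demo call uses a stand-in command so the script runs without SciDB installed.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=5):
    """Run cmd `repeats` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output so that printing does not dominate the measurement.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in command (a no-op Python process) used only for demonstration.
demo = time_command([sys.executable, "-c", "pass"], repeats=3)
print(["%.3f" % t for t in demo])

# Against SciDB this would be called as, for example:
#   time_command(["iquery", "-q", "SELECT r,g,b FROM Michalis"])
```

If the first run is consistently no slower than the later ones, that supports the observation that nothing is being cached between runs.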
Also I found something that is not quite right - could be a bug or something not implemented:
This query takes ~3.3s:
SELECT r,g,b FROM Michalis
while this query, which partitions by 1x1 (I would expect the partitioning to be dropped and the query transformed into the one above)
takes ~16.8s. I guess there is no query optimizer right now, right?
Thanks for any insight
Edit: I am using SciDB 3.6 on a VM, and although I don’t care about the absolute times themselves, I do care about how they relate to each other.