How to insert a 2-dim array into a 3-dim array


#1

Hi,
I have a 3-dimensional array {t,x,y} and I want to add a 2-dimensional array {x,y} to it, resulting in a 3-dimensional {t+1,x,y} array.

Edit: I solved this problem. Scroll to the end of this post for the solution.

I can append the 2-dim array at the end of the 3-dim array with:

create array base <v:uint8> [t=1:*:0:1000; x=0:2:0:1000; y=0:2:0:1000]

store(build(<v:uint8>[t=1:2:0:1000; x=0:2:0:1000; y=0:2:0:1000],(t-1)*9+(x*3+y)), base)

// produces:
{t,x,y} v
{1,0,0} 0
{1,0,1} 1
{1,0,2} 2
{1,1,0} 3
{1,1,1} 4
{1,1,2} 5
{1,2,0} 6
{1,2,1} 7
{1,2,2} 8
{2,0,0} 9
{2,0,1} 10
{2,0,2} 11
{2,1,0} 12
{2,1,1} 13
{2,1,2} 14
{2,2,0} 15
{2,2,1} 16
{2,2,2} 17

create array toInsert <v:uint8> [i=0:2:0:1000; j=0:2:0:1000]

store(build(<v:uint8>[i=0:2:0:1000; j=0:2:0:1000], 3), toInsert)

// produces:
{i,j} v
{0,0} 3
{0,1} 3
{0,2} 3
{1,0} 3
{1,1} 3
{1,2} 3
{2,0} 3
{2,1} 3
{2,2} 3

// append toInsert at end of base to produce a {t+1,x,y} array:

insert(redimension(apply(toInsert, t, 3, x, i, y, j), base), base)

// produces:
{t,x,y} v
{1,0,0} 0
{1,0,1} 1
{1,0,2} 2
{1,1,0} 3
{1,1,1} 4
{1,1,2} 5
{1,2,0} 6
{1,2,1} 7
{1,2,2} 8
{2,0,0} 9
{2,0,1} 10
{2,0,2} 11
{2,1,0} 12
{2,1,1} 13
{2,1,2} 14
{2,2,0} 15
{2,2,1} 16
{2,2,2} 17
{3,0,0} 3
{3,0,1} 3
{3,0,2} 3
{3,1,0} 3
{3,1,1} 3
{3,1,2} 3
{3,2,0} 3
{3,2,1} 3
{3,2,2} 3
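
For readers who don't have SciDB at hand, here is a minimal Python sketch of what the append does. It is only a model: the arrays are plain dicts keyed by coordinates, and the helper names (build_base, append_slice) are hypothetical. Unlike the query above, which hardcodes t=3, the sketch computes the next free t.

```python
def build_base():
    # Mirrors build(..., (t-1)*9 + (x*3 + y)) for t=1..2, x=0..2, y=0..2.
    return {(t, x, y): (t - 1) * 9 + (x * 3 + y)
            for t in range(1, 3) for x in range(3) for y in range(3)}


def append_slice(base, slice_2d):
    # Lift the 2-D slice to the first unused t, then merge it into a copy
    # of the base (the role played by apply + redimension + insert).
    next_t = max(t for t, _, _ in base) + 1
    lifted = {(next_t, i, j): v for (i, j), v in slice_2d.items()}
    merged = dict(base)
    merged.update(lifted)
    return merged


base = build_base()
to_insert = {(i, j): 3 for i in range(3) for j in range(3)}
result = append_slice(base, to_insert)
```

The resulting dict has the same 27 cells as the output listed above.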

Fine so far, but what if I want to insert it somewhere in the middle of the 3-dim array?
I have some ideas, but they all seem slow.

For example:

  1. save the 3-dim array into a binary file
  2. save the 2-dim array into a binary file
  3. create a new binary file from the subarray of the 3-dim array for t=[0,insertIndex]
  4. append the data of the 2-dim array as t=[insertIndex+1,insertIndex+1]
  5. append the rest of the 3-dim array, i.e. the subarray for the original t=[insertIndex+1,end]
  6. load the new binary file into a new 3-dim array.

My other approach would be a loop of appends, i.e.:

  1. get the 3-dim subarray for t=[0, insertIndex]
  2. append the toInsert array
  3. for each t=insertIndex+1…end:
    get the 2-dim subarray for t of the base array & append it.

Is there a faster approach than these?
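
To make the second idea concrete, here is a rough Python sketch of the loop of appends. It is a model only: the 3-dim array is a list of 2-dim slices ordered by t, and insert_by_rebuild is a hypothetical helper name.

```python
def insert_by_rebuild(slices, new_slice, insert_index):
    # Step 1: take the slices for t = 0..insert_index.
    rebuilt = list(slices[:insert_index + 1])
    # Step 2: append the slice to insert.
    rebuilt.append(new_slice)
    # Step 3: append the remaining slices one by one.
    for remaining in slices[insert_index + 1:]:
        rebuilt.append(remaining)
    return rebuilt


example = insert_by_rebuild(["t0", "t1", "t2", "t3"], "new", 1)
```

Every slice gets copied once, including the ones before the insert position, which is why this feels wasteful.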

solution:

// create 3-dim base array {t=0:3,x=0:2;y=0:2}
create array base <v:uint8> [t=0:*:0:1000; x=0:2:0:1000; y=0:2:0:1000]

store(build(<v:uint8>[t=0:3:0:1000; x=0:2:0:1000; y=0:2:0:1000],t*9+(x*3+(y+1))), base)

// create 2-dim insert array {x=0:2;y=0:2}
create array toInsert <v:uint8> [i=0:2:0:1000; j=0:2:0:1000]

store(build(<v:uint8>[i=0:2:0:1000; j=0:2:0:1000], 2), toInsert)

// shift the tail of base up by one: copy the previous [t=2:3] into [t=3:4], so that base becomes {t=0:4; x=0:2; y=0:2}:
insert(redimension(apply(cast(subarray(base,2,0,0,3,3,3),<v:uint8>[q=0:1;x=0:2;y=0:2]),t,q+3),base), base)

// insert the insert array into base:
insert(redimension(apply(toInsert, t, 2, x, i, y, j), base), base)

// the output is:
{t,x,y} v
{0,0,0} 1
{0,0,1} 2
{0,0,2} 3
{0,1,0} 4
{0,1,1} 5
{0,1,2} 6
{0,2,0} 7
{0,2,1} 8
{0,2,2} 9
{1,0,0} 10
{1,0,1} 11
{1,0,2} 12
{1,1,0} 13
{1,1,1} 14
{1,1,2} 15
{1,2,0} 16
{1,2,1} 17
{1,2,2} 18
{2,0,0} 2
{2,0,1} 2
{2,0,2} 2
{2,1,0} 2
{2,1,1} 2
{2,1,2} 2
{2,2,0} 2
{2,2,1} 2
{2,2,2} 2
{3,0,0} 19
{3,0,1} 20
{3,0,2} 21
{3,1,0} 22
{3,1,1} 23
{3,1,2} 24
{3,2,0} 25
{3,2,1} 26
{3,2,2} 27
{4,0,0} 28
{4,0,1} 29
{4,0,2} 30
{4,1,0} 31
{4,1,1} 32
{4,1,2} 33
{4,2,0} 34
{4,2,1} 35
{4,2,2} 36
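
The same two steps (shift the tail up by one, then overwrite the freed slice) can be sketched in plain Python. This is only a model of the queries above, with arrays as dicts keyed by coordinates and hypothetical helper names (build_base, insert_slice_at).

```python
def build_base():
    # Mirrors build(..., t*9 + (x*3 + (y+1))) for t=0..3, x=0..2, y=0..2.
    return {(t, x, y): t * 9 + (x * 3 + (y + 1))
            for t in range(4) for x in range(3) for y in range(3)}


def insert_slice_at(base, slice_2d, at):
    result = dict(base)
    # Step 1: copy every slice with t >= at one position up
    # (the subarray + cast + apply(t, q+3) + redimension + insert query).
    shifted = {(t + 1, x, y): v for (t, x, y), v in base.items() if t >= at}
    result.update(shifted)
    # Step 2: overwrite the slice at t = at with the 2-D array
    # (the final insert query).
    result.update({(at, i, j): v for (i, j), v in slice_2d.items()})
    return result


to_insert = {(i, j): 2 for i in range(3) for j in range(3)}
result = insert_slice_at(build_base(), to_insert, at=2)
```

Only the slices from the insert position onward are rewritten. Slices t=0 and t=1 are left alone.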

#2

Hi @mojioms, thanks for the nice writeup.

I would keep an eye on the chunk sizing, because 1000x1000x1000 chunks could get heavy if you don’t have tons of RAM. You may want to use a chunk length of 1 along t, for example:
store(build(<v:uint8>[t=0:3:0:1; x=0:2:0:1000; y=0:2:0:1000],t*9+(x*3+(y+1))), base)
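
As a back-of-the-envelope check, the largest number of cells a chunk can hold is the product of its chunk lengths. The Python sketch below does that arithmetic. The helper names are hypothetical, and the byte figure assumes dense data with 1 byte per uint8 cell and ignores compression and any per-chunk overhead.

```python
from math import prod


def cells_per_chunk(chunk_lengths):
    # Upper bound on the number of cells in one chunk.
    return prod(chunk_lengths)


def dense_chunk_bytes(chunk_lengths, bytes_per_cell=1):
    # Rough size of a fully populated chunk; uint8 is 1 byte per cell.
    return cells_per_chunk(chunk_lengths) * bytes_per_cell


original = cells_per_chunk((1000, 1000, 1000))  # 1_000_000_000 cells
suggested = cells_per_chunk((1, 1000, 1000))    # 1_000_000 cells
```

For the tiny example arrays in this thread the bound is never reached, but it matters once x and y span thousands of cells.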

Also, if the data is sparse, you may want to erase the target slice before your last step:
delete(base, t=2)
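
A small Python model of why this matters, assuming insert only writes the cells that are present in the source and leaves all other destination cells untouched. Arrays are dicts keyed by coordinates, and the helper names are hypothetical.

```python
def merge_insert(base, new_cells):
    # Cells present in new_cells overwrite the base; all others are kept.
    merged = dict(base)
    merged.update(new_cells)
    return merged


def delete_slice(base, t_value):
    # Drop every cell of the slice t == t_value.
    return {coords: v for coords, v in base.items() if coords[0] != t_value}


base = {(2, 0, 0): 19, (2, 0, 1): 20, (3, 0, 0): 28}
sparse_new = {(2, 0, 0): 2}  # the new slice has no cell at (2, 0, 1)

without_delete = merge_insert(base, sparse_new)
with_delete = merge_insert(delete_slice(base, 2), sparse_new)
```

Without the delete, the old value at (2,0,1) survives inside the new slice. With it, the slice at t=2 contains only the newly inserted cells.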

Sometimes when we need to inflate datasets artificially, we use something like this:

That works much faster if your “shift distance” is a multiple of your chunk length. So, for example, it would work with my suggested schema above where t has a chunk length of 1.
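
To illustrate the alignment argument, here is a small Python model (not SciDB code). It assumes chunks along the dimension start at coordinate 0, and destination_chunks is a hypothetical helper that lists which chunk indices the cells of one source chunk land in after a shift.

```python
def destination_chunks(chunk_start, chunk_length, shift):
    # The source chunk covers [chunk_start, chunk_start + chunk_length).
    first = (chunk_start + shift) // chunk_length
    last = (chunk_start + chunk_length - 1 + shift) // chunk_length
    return list(range(first, last + 1))


misaligned = destination_chunks(0, 1000, 1)  # shift by 1, chunk length 1000
aligned = destination_chunks(0, 1, 1)        # shift by 1, chunk length 1
```

When the shift is a multiple of the chunk length, each source chunk maps onto exactly one destination chunk. Otherwise its cells are split across two chunks.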


#3

Hi @apoliakov,
When using this insertion on real data (x=0:10k,y=0:10k), the chosen chunk size has a large impact on the performance of the insertion. The smaller the chunk size, the faster the insertion.
However, I noticed that the chunk size also affects the contents of the resulting array. Chunk sizes that are too small lead to wrong results for between queries. Is that intended?


#4

Hi @mojioms - yes the chunk size definitely affects performance - thus allowing users to tune for different cases - but you should not get a wrong result!

If you are getting a wrong result somewhere, we would appreciate a reproducible example so we can fix it ASAP. Thank you for your help!


#5

Hi @apoliakov,
I produced a small reproducible example on Github: https://github.com/MojioMS/testSciDB

Let me know if you have any questions or different results.


#6

Thank you for the detailed report, @mojioms! Appreciate it. We are investigating.