Merge and Subarray Problem


#1

Hi Experts,
I’m having a problem with merge + subarray: the order in which I apply merge and subarray seems to change the result. Example below:

Say I have 2 sparsely populated arrays with identical schemas, each holding different data.

$ iquery -aq "show(tmp1)"
[("tmp1<a_RA:float NULL,a_DEC:float NULL,a_ID:int32 NULL,b_RA:float NULL,b_DEC:float NULL,b_ID:int32 NULL> [RAVAL(int32)=36001,1800,300,DECVAL(int32)=18001,900,150]")]

$ iquery -aq "show(tmp2)"
[("tmp2<a_RA:float NULL,a_DEC:float NULL,a_ID:int32 NULL,b_RA:float NULL,b_DEC:float NULL,b_ID:int32 NULL> [RAVAL(int32)=36001,1800,300,DECVAL(int32)=18001,900,150]")]

Then in one small section of these 2 arrays, there are some objects:

$ iquery -aq "subarray(tmp1,19320,14340,19330,14350)"

[[{0,0}(),{0,1}(),{0,2}(),{0,3}(),{0,4}(),{1,0}(),{1,1}(),{1,2}(),{1,3}(),{1,4}(),{2,0}(),{2,1}(),{2,2}(),{2,3}(193.266,53.4889,8688,null,null,null),{2,4}()]]

$ iquery -aq "subarray(tmp2,19320,14340,19330,14350)"

[[{0,0}(),{0,1}(),{0,2}(),{0,3}(),{0,4}(),{1,0}(),{1,1}(),{1,2}(),{1,3}(),{1,4}(),{2,0}(),{2,1}(null,null,null,193.237,53.4576,8688),{2,2}(),{2,3}(),{2,4}()]]

Now for the interesting part: if I do subarray first and then merge, I get the desired outcome of merge, with 2 filled cells in the merged array.

$ iquery -aq "merge(subarray(tmp1,19320,14340,19330,14350),subarray(tmp2,19320,14340,19330,14350))"
[[{0,0}(),{0,1}(),{0,2}(),{0,3}(),{0,4}(),{1,0}(),{1,1}(),{1,2}(),{1,3}(),{1,4}(),{2,0}(),{2,1}(null,null,null,193.237,53.4576,8688),{2,2}(),{2,3}(193.266,53.4889,8688,null,null,null),{2,4}()]]

However, if I do merge(tmp1,tmp2) first and then subarray, I only get the object that was in tmp1:

$ iquery -aq "subarray(merge(tmp1,tmp2),19320,14340,19330,14350)"
[[{0,0}(),{0,1}(),{0,2}(),{0,3}(),{0,4}(),{1,0}(),{1,1}(),{1,2}(),{1,3}(),{1,4}(),{2,0}(),{2,1}(),{2,2}(),{2,3}(193.266,53.4889,8688,null,null,null),{2,4}()]]

And as expected, if I swap the order of tmp1 and tmp2 in merge, I get the object that was in tmp2:

$ iquery -aq "subarray(merge(tmp2,tmp1),19320,14340,19330,14350)"
[[{0,0}(),{0,1}(),{0,2}(),{0,3}(),{0,4}(),{1,0}(),{1,1}(),{1,2}(),{1,3}(),{1,4}(),{2,0}(),{2,1}(null,null,null,193.237,53.4576,8688),{2,2}(),{2,3}(),{2,4}()]]

Can you help explain why?

What is the correct way to do it if I want to merge tmp1 and tmp2?

Thanks

-Yushu


#2

Yushu, in this case you are hitting a problem where merge does not work with non-integer dimensions. If you cast your RAVAL and DECVAL to int64 and declare them as [RAVAL=0:36000,1800,300,…], I’m certain the problem will go away.

In SciDB, any dimension that is not int64 is treated as “non-integer”. That means SciDB creates a special hidden array that holds the dimension values and uses a lookup table to find them. This mechanism is necessary if you want your dimensions to be strings or doubles, but in your case you have int32. In fact, there is really NEVER a reason to declare dimensions as int32 (or (u)int16 or (u)int8); for these cases just use int64. The values of an int64 dimension are not stored, so there is no loss of space.
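To make the lookup-table idea concrete, here is a minimal Python sketch. It is an illustration only, not SciDB's actual implementation: the class name `DimensionLookup` and its methods are hypothetical, and the sketch simply assumes that a value's rank among the sorted distinct values serves as its integer coordinate.

```python
from bisect import bisect_left


class DimensionLookup:
    """Hypothetical stand-in for the hidden mapping array.

    Holds the sorted distinct dimension values; a value's index in that
    list plays the role of its integer coordinate (an assumption made
    for illustration only).
    """

    def __init__(self, values):
        self.values = sorted(set(values))

    def to_position(self, value):
        # Binary search for the value; fail if it is not a known coordinate.
        i = bisect_left(self.values, value)
        if i == len(self.values) or self.values[i] != value:
            raise KeyError(value)
        return i

    def to_value(self, position):
        # Reverse lookup: integer coordinate back to the dimension value.
        return self.values[position]


# A string-valued dimension needs such a table to reach integer positions.
names = DimensionLookup(["m31", "m13", "m87"])
position = names.to_position("m31")
```

An int64 dimension needs no such table, because the coordinate is already the integer position.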

We are still working on non-integer dimensions (NIDs, as we call them) and on handling them properly in all operators.
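For reference, merge is cell-wise: where the first array has a non-empty cell, that cell wins; otherwise the cell from the second array is used. With plain integer dimensions, merging before or after subarray should therefore give the same two filled cells. The Python sketch below models sparse arrays as dicts to show this. The helper functions and the absolute coordinates (19322, 14343) and (19322, 14341) are assumptions for illustration, derived from the subarray outputs in post #1 as if the dimensions were integer.

```python
def merge(left, right):
    """Cell-wise merge: keep left's cell where present, else take right's."""
    result = dict(right)
    result.update(left)
    return result


def subarray(array, low, high):
    """Keep cells inside the inclusive box [low, high], re-based to origin 0."""
    (r0, d0), (r1, d1) = low, high
    return {
        (r - r0, d - d0): cell
        for (r, d), cell in array.items()
        if r0 <= r <= r1 and d0 <= d <= d1
    }


# Sparse arrays as {(RAVAL, DECVAL): attribute tuple}; None stands for null.
# Coordinates are assumed, reconstructed from the outputs in post #1.
tmp1 = {(19322, 14343): (193.266, 53.4889, 8688, None, None, None)}
tmp2 = {(19322, 14341): (None, None, None, 193.237, 53.4576, 8688)}

low, high = (19320, 14340), (19330, 14350)

sub_then_merge = merge(subarray(tmp1, low, high), subarray(tmp2, low, high))
merge_then_sub = subarray(merge(tmp1, tmp2), low, high)

# Both orders yield the same two filled cells, at {2,1} and {2,3}.
assert sub_then_merge == merge_then_sub
```

The order of the merge arguments only matters where both arrays fill the same cell, which is not the case for tmp1 and tmp2 here.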