What are the best practices for array design?


#1

Hi,

I am going through this tutorial: viewtopic.php?f=18&t=1204

It has a good example under the section “Array Data model and Redimension” on how to add new attributes to an existing array if the dimensions of these new attributes match those of the existing array. I have a different requirement and I was wondering what is the right way to model it:

I have an array with dimensions A,B. I have another array with dimensions B,C. Since these arrays share a dimension, is there a way to have dimension C in the first array as well and get rid of the second array? If I have just one array with dimensions A,B,C, what would be the general coordinates of the attributes that came from the second array? Would they be {null,b,c}? And what would be the general coordinates of the attributes that belonged to the first array? Would they be {a,b,null}? If I keep the data in separate arrays, I will have to join them, which is perhaps not an issue in SciDB. But I’d like to know what the preferred design is here.

PS: Is there a tutorial somewhere specifically on array design best practices? I would appreciate one that takes a relational schema and walks step by step through how to represent it as an array in SciDB.


#2

A 2D array can be turned into a 3D array in which the new dimension has a single coordinate.
In your case, you can add dimension C to the first array and dimension A to the second array. Now that both arrays are three-dimensional, merge them. The result is a 3D array that has values only along two faces. The added dimension gets coordinate 0 rather than null, so cells from the first array end up at {a,b,0} and cells from the second at {0,b,c}. See the running example below:

$ iquery -a

AFL% create array one<v:int64>[A=1:2,2,0,B=1:2,2,0];
Query was executed successfully

AFL% create array two<v:int64>[B=1:2,2,0,C=1:2,2,0];
Query was executed successfully

AFL% store(build(one,A+B), one);
{A,B} v
{1,1} 2
{1,2} 3
{2,1} 3
{2,2} 4

AFL% store(build(two,B*C), two);
{B,C} v
{1,1} 1
{1,2} 2
{2,1} 2
{2,2} 4

AFL% create array template<v:int64>[A=0:2,1,0,B=0:2,1,0,C=0:2,1,0];
Query was executed successfully

AFL% merge(redimension(adddim(one,C), template), redimension(adddim(two,A), template));
{A,B,C} v
{0,1,1} 1
{0,1,2} 2
{0,2,1} 2
{0,2,2} 4
{1,1,0} 2
{1,2,0} 3
{2,1,0} 3
{2,2,0} 4
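
To make the resulting layout easier to picture, here is a rough sketch in plain Python that mimics the merge above with dictionaries. It does not use SciDB or any SciDB client; the names PLACEHOLDER, build_one, build_two, merge_faces, face_one and face_two are made up for illustration. It assumes, as in the output above, that the added dimension sits at coordinate 0 and that the real coordinates start at 1, so the two faces never overlap.

```python
# Plain-Python model of the merged 3D array; dicts stand in for SciDB arrays.
# All names here are hypothetical and are not part of any SciDB API.

PLACEHOLDER = 0  # coordinate taken by the added dimension in the output above


def build_one():
    """Model of array 'one': cell {A,B} holds A+B for A, B in 1..2."""
    return {(a, b): a + b for a in (1, 2) for b in (1, 2)}


def build_two():
    """Model of array 'two': cell {B,C} holds B*C for B, C in 1..2."""
    return {(b, c): b * c for b in (1, 2) for c in (1, 2)}


def merge_faces(one, two):
    """Place 'one' on the C=0 face and 'two' on the A=0 face of an {A,B,C} array."""
    merged = {(a, b, PLACEHOLDER): v for (a, b), v in one.items()}
    # The faces are disjoint because real coordinates start at 1, not 0.
    merged.update({(PLACEHOLDER, b, c): v for (b, c), v in two.items()})
    return merged


def face_one(merged):
    """Recover the cells that came from 'one' (the C=0 face)."""
    return {(a, b): v for (a, b, c), v in merged.items()
            if c == PLACEHOLDER and a != PLACEHOLDER}


def face_two(merged):
    """Recover the cells that came from 'two' (the A=0 face)."""
    return {(b, c): v for (a, b, c), v in merged.items()
            if a == PLACEHOLDER and c != PLACEHOLDER}


if __name__ == "__main__":
    merged = merge_faces(build_one(), build_two())
    # Print cells in coordinate order, in the same style as the iquery output.
    for (a, b, c), v in sorted(merged.items()):
        print(f"{{{a},{b},{c}}} {v}")
```

Filtering on the placeholder coordinate is what lets you get either original array back out of the single 3D array, which is the trade-off against keeping two arrays and joining them on B.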

P.S. Regarding the tutorial you asked about: we don’t have one yet, but we plan to write one.