Retrieving chunks from all other instances


#1

Hello,

I am developing an operator that takes two arrays as input, say Op(A, B).
A is very large and is evenly distributed across all instances, but B is small and normally fits in one chunk.
Every instance needs the information in B to process A correctly.
Is there an easy way to make all chunks of B available on every instance?
Looking through the SciDB source code, it seems that redistribute() can do that. If so, how would I use it in my operator?


#2

Hello,
Yes, redistribute() is one way to go. Note, though, that it is deprecated (as of 14.12) and is being replaced by a set of redistributeXXX() functions. A good usage example is in src/query/ops/PhysicalSG.cpp. In general, it should look something like this:

PartitioningSchema ps = psReplication;        // replicate B to every instance
shared_ptr<Array> srcArray = B;               // or however you access B
InstanceID instanceId = ALL_INSTANCE_MASK;
shared_ptr<CoordinateTranslator> distMapper;  // left empty
const size_t instanceIdShift = 0;
const bool enforceDataIntegrity = true;
shared_ptr<Array> distributedB = redistributeToRandomAccess( // or just redistribute() without the last parameter
    srcArray,
    query,  // provided in the execute() method of Op
    ps,
    instanceId,
    distMapper,
    instanceIdShift,
    shared_ptr<PartitioningSchemaData>(),
    enforceDataIntegrity);

So, distributedB is a MemArray (not persisted) that is present on all the instances.
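
To make the effect of psReplication concrete, here is a toy model in plain C++. It does not use any SciDB code: ChunkPos, LocalChunks and replicateToAll() are made-up names for illustration, and a chunk is reduced to a string. The sketch only shows the end state, in which each instance holds every chunk of B no matter where the chunks started.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins, NOT SciDB types: a "chunk" is a string payload keyed by a
// chunk position, and an "instance" is the set of chunks it holds locally.
using ChunkPos = int;
using LocalChunks = std::map<ChunkPos, std::string>;

// Model of replication: collect the chunks held by all instances and hand
// every instance a full copy of the union.
std::vector<LocalChunks> replicateToAll(const std::vector<LocalChunks>& perInstance)
{
    assert(!perInstance.empty());  // need at least one instance

    LocalChunks all;
    for (const LocalChunks& local : perInstance) {
        all.insert(local.begin(), local.end());
    }
    // One identical copy of the complete chunk set per instance.
    return std::vector<LocalChunks>(perInstance.size(), all);
}
```

With three instances where only the first and the third hold a chunk of B, the result has three entries and each one contains both chunks. In the real operator, the redistribute call is what performs this exchange between instances.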

#3

Looks good, thank you very much!

Y