About the logical and physical parts of an operator's source


#1

Hi, to whom it may concern,

I am working on a custom operator that reads an array into memory, calls external C or Python scripts to run a clustering algorithm on it, and returns the result as the operator’s output. I have seen several previous posts on this forum about writing custom operators.

But I am quite new to SciDB, and I didn’t find any documentation about the source code or the API for SciDB development, so I could only read the code to understand how operators are developed. I have mainly read the code in examples/bestmatch, as well as the aggregate operator. Even so, I am still confused about logicalXXX.cpp and physicalXXX.cpp.

What’s the reason for splitting an operator into these two files? I saw that the operator "avg" only has a logicalXXX.cpp, and that its inferSchema() calls an existing function in the aggregate class to do the computation. Does that mean I could manipulate the data inside inferSchema(), after loading the SciDB array into a memory array (a true array that I can compute on directly)? Thanks.

Also, some physicalXXX.cpp files have a function named execute(). Is it always run when the operator is called in a query? And where should we call our main computation function: in inferSchema() or in execute()?


#2

Hello,

You need to provide both Logical… and Physical… The inferSchema() method just tells the system “what is the shape of the output array”. It does not do the actual computation. The actual computation is done in execute(). You can rely on execute() being called on each instance with that instance’s portion of the input array. avg() is a bad example because the optimizer rewrites avg(Array, attribute, dimension) into "aggregate(Array, avg(attribute), dimension)".
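To make the split concrete, here is a toy sketch in plain C++. It does not use the SciDB headers: Schema, LogicalSum, PhysicalSum and runOnAllInstances are hypothetical names made up for illustration, and the real base classes and method signatures in the SciDB source differ. It only shows the idea that the logical half answers "what shape is the output" without touching data, while the physical half does the work on each instance's portion.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Toy stand-ins, NOT SciDB classes. Every name here is hypothetical.
struct Schema {
    std::string name;
    std::size_t cellCount;  // stands in for the "shape" of an array
};

// Logical half: used while the query is planned. It sees schemas only,
// never cell values, so no computation on data can happen here.
struct LogicalSum {
    Schema inferSchema(const Schema& input) const {
        (void)input;  // a sum always yields one cell, whatever the input size
        return Schema{"sum_result", 1};
    }
};

// Physical half: used while the query runs. It is called once per instance
// and receives only that instance's portion of the input.
struct PhysicalSum {
    double execute(const std::vector<double>& localPortion) const {
        return std::accumulate(localPortion.begin(), localPortion.end(), 0.0);
    }
};

// Toy coordinator (an assumption of this sketch, not how SciDB merges
// results): run execute() on each instance's portion and add the partials.
double runOnAllInstances(const PhysicalSum& op,
                         const std::vector<std::vector<double>>& perInstance) {
    assert(!perInstance.empty());  // the sketch expects at least one instance
    double total = 0.0;
    for (const auto& portion : perInstance) {
        total += op.execute(portion);
    }
    return total;
}
```

In this sketch, a clustering routine would go where std::accumulate is, inside execute(), and inferSchema() would only describe the array that the clustering step returns.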

Also, have you seen this: viewtopic.php?f=18&t=1099 ?

  • Alex Poliakov

#3

Hello! Thanks for your reply!
I have the same question about logicalXXX.cpp and physicalXXX.cpp, because I want to add a new UDF like indexlookup().
I would like to know more about it, but the link you posted is unavailable.
Could you update the URL or share some other information with us?
Thanks so much!!!(^o^)/