Regrid on a UDO


#1

Hello,
This is probably an easy question about how to combine a user-defined operator (UDO) with regrid in a single query.

I started playing around with user-defined operators and was able to write one, but I got stuck on how to use ‘regrid’ so that my operator is called with a different subarray as input for each grid cell within the same query.

Taking as an example the trades array from the example queries provided with the latest package, I can call:

"myop( project( between( trades, 10, 1000, null, 10, 9000, null ), price, volume) )"

and successfully get an output array with a single row and 4 attributes:

{i} BSA,BSR,ASA,ASR
{0} 8,8,4,4

In that example, myop() takes as input the price and volume of all trades between 1000 and 9000 milliseconds for symbol_id 10.

How do I change my query so that myop() is applied to each 3000-millisecond block, the way regrid() would apply an aggregate?

An example output would look like:

{i} BSA,BSR,ASA,ASR
{0} 2,2,1,1
{1} 3,2,3,4
{2} 1,1,2,2
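
To make the requested behaviour concrete, here is a minimal sketch in plain Python (not SciDB code) of applying a multi-output operator once per 3000 ms block. Everything in it is an assumption for illustration: the `summarize` function is a hypothetical stand-in for myop(), the sample trades are made up, and the bucket origin of 1000 ms is chosen only so that the 1000 to 9000 range yields three cells as in the example output above.

```python
from collections import defaultdict


def bucket_index(t_ms, origin_ms, width_ms):
    """Map a timestamp to the index of its fixed-width block."""
    return (t_ms - origin_ms) // width_ms


def apply_per_bucket(trades, origin_ms, width_ms, op):
    """Group (time, price, volume) rows by block and run `op` once per group."""
    groups = defaultdict(list)
    for t_ms, price, volume in trades:
        groups[bucket_index(t_ms, origin_ms, width_ms)].append((price, volume))
    return {i: op(rows) for i, rows in sorted(groups.items())}


def summarize(rows):
    """Hypothetical stand-in for myop(): returns several values per group."""
    prices = [p for p, _ in rows]
    volumes = [v for _, v in rows]
    return (len(rows), sum(volumes), min(prices), max(prices))


# Made-up sample rows: (time in ms, price, volume).
trades = [(1000, 10.0, 100), (2500, 10.5, 200), (4200, 10.2, 50), (8999, 9.9, 300)]
result = apply_per_bucket(trades, origin_ms=1000, width_ms=3000, op=summarize)
```

Each key of `result` corresponds to one output cell, and each value is the tuple of attributes the operator produced for that block.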

As a side note, a much simpler solution would be to create 4 aggregate functions (e.g. BSA(), BSR(), ASA(), and ASR()) and then call them in the query, but aggregates with multiple arguments are not allowed.

Any suggestions on how I could proceed?

Thanks
mikej


#2

Hi Mike

Interesting. There are two ways to proceed.

  1. Open up the code for regrid (Aggregator.h), learn it, and structure your UDO to do the same thing. In other words, you’ll be re-writing a special case of regrid with a particular purpose in mind. Much of it will be copy-paste code.

  2. Define four UDAs and, to get around the one-argument requirement, combine the price and volume pair into a single value. The easiest would be something like

apply(array, pv, string(price)+'_'+string(volume))

And then teach your UDA to parse such strings… That’s the crude first-attempt method. That’s what I would do to start.
The better way is to define your own UDT for price-volume. We have some examples of that in the src tree (point and rational).
Multi-argument aggregates are on the roadmap for some point in the future.

I recommend number 2 as it's quicker and simpler. The only reason to pick number 1 is if you have some very data-specific optimizations that are vital to you.
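
The string-packing idea in option 2 can be sketched as follows. This is plain Python for illustration only, not SciDB plugin code: `pack` mirrors the apply() expression above, while `unpack`, `VwapAggregate`, and its method names are hypothetical, and the volume-weighted average price is just an example computation that needs both values (it is not one of BSA, BSR, ASA, or ASR).

```python
def pack(price, volume):
    """Mirror of string(price)+'_'+string(volume) from the apply() call."""
    return f"{price}_{volume}"


def unpack(pv):
    """Split a packed 'price_volume' string back into two numbers."""
    price_text, volume_text = pv.split("_")
    return float(price_text), float(volume_text)


class VwapAggregate:
    """Hypothetical single-argument aggregate over packed strings."""

    def __init__(self):
        self.notional = 0.0  # running sum of price * volume
        self.volume = 0.0    # running sum of volume

    def accumulate(self, pv):
        """Fold one packed value into the running state."""
        price, volume = unpack(pv)
        self.notional += price * volume
        self.volume += volume

    def merge(self, other):
        """Combine two partial states, e.g. from different chunks."""
        self.notional += other.notional
        self.volume += other.volume

    def final(self):
        """Return the result, or None if no values were seen."""
        return self.notional / self.volume if self.volume else None


agg = VwapAggregate()
for price, volume in [(10.0, 100), (20.0, 300)]:
    agg.accumulate(pack(price, volume))
vwap = agg.final()
```

The aggregate only ever sees one argument, the packed string, so it fits the one-argument restriction at the cost of formatting and parsing every value. A dedicated price-volume UDT would avoid that round trip.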


#3

Thank you,
I remember reading about multi-argument aggregates in another thread as well. Do you think this will be in a 13.x release?


#4

MikeJ?

What are you doing with those price/volume numbers? Can you manipulate them to produce some value that can be pushed into an aggregate?

Or are you trying to do something like calculate a pairwise aggregate between the pairs? (i.e., are price and volume covariant?)

Curious.

Paul


#5

Hi Paul,

I used the price/volume array from the sample data in the UDO examples to illustrate the problem I am trying to solve, because it makes the problem easier to explain.

My dataset is a little more complicated: each cell has more attributes (e.g. the size and price of the bid and ask, in addition to the trade price and size at the time the trade took place).

But the problem I have is identical to the simpler example I described: how to use the existing aggregate logic with my UDO.

With the UDO I have, I can take an array as input and output an array with a single dimension, a single cell, and multiple attributes. I was trying to see how easy it would be to use aggregation to output multiple cells, one for each subarray of the input array.

I thought about UDAs, but with single-argument aggregates they are not flexible enough. Multi-argument aggregates would make it easier, but a UDO is still preferred as it solves some problems that aggregates cannot.

For now I am looking into the first solution provided by Alex, writing the aggregation code within the UDO.
.mike