Example UDA: penmax


#1

Attached is a tarball containing an example User-Defined Aggregate (UDA), with some comments, as a first step toward documenting the API. The API is still under development and subject to change. Current possible use cases include:

  • aggregates over User-Defined Types (UDTs), e.g. given a set of “points”, compute a centroid
  • approximate medians and quantiles
  • exact medians and quantiles when applied to smaller sets or windows of bounded size
  • checksums, e.g. CRC
  • application-specific windowing computations

Note: some examples of UDTs (e.g. “point”) and UDFs are provided in the “examples” directory of the source tree.

This is a crude example that can easily be templatized and further optimized. The library is named penmax; the aggregate it registers, pen_max, returns the second-largest element of a set.
pen_max.tar (20 KB)
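
For readers without the tarball, here is a minimal standalone sketch of the idea behind a "second largest" aggregate. It does not use the SciDB API at all: PenMaxState, accumulate, merge and finalResult are hypothetical names chosen for illustration, and the real plugin may organize its state differently. The point is only that keeping the two largest values seen so far is enough state both to add one value and to combine two partial states.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the aggregate state (not the plugin's real layout):
// the two largest values seen so far, each empty until a value arrives.
struct PenMaxState {
    std::optional<int64_t> largest;
    std::optional<int64_t> second;
};

// Fold one input value into the state.
// Duplicates count as separate elements here, so {5, 5} yields 5.
inline void accumulate(PenMaxState& s, int64_t v) {
    assert(!s.second || s.largest);  // invariant: 'second' is never set alone
    if (!s.largest || v > *s.largest) {
        s.second = s.largest;        // old maximum becomes the runner-up
        s.largest = v;
    } else if (!s.second || v > *s.second) {
        s.second = v;
    }
}

// Combine a partial state computed elsewhere into dst.
// Feeding src's two retained values through accumulate is sufficient,
// because nothing smaller than them can be in the overall top two.
inline void merge(PenMaxState& dst, const PenMaxState& src) {
    if (src.largest) accumulate(dst, *src.largest);
    if (src.second)  accumulate(dst, *src.second);
}

// Final answer: empty when fewer than two values were seen.
inline std::optional<int64_t> finalResult(const PenMaxState& s) {
    return s.second;
}
```

Accumulating 1..5 into one state and 6..10 into another, then merging, gives 9, which matches the aggregate(foo, pen_max(val)) result shown below.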

$ make SCIDB_SOURCE_DIR=/home/apoliakov/workspace/scidb_trunk 
g++ -Dexample_EXPORTS -pedantic -W -Wextra -Wall -Wno-strict-aliasing -Wno-long-long -Wno-unused-parameter -fPIC -D__STDC_FORMAT_MACROS -Wno-system-headers -isystem -O2 -g -DNDEBUG -ggdb3  -D__STDC_LIMIT_MACROS -I. -DPROJECT_ROOT="\"/home/apoliakov/workspace/scidb_trunk\"" -I"/opt/scidb/13.3/include" -I"/home/apoliakov/workspace/scidb_trunk/include" -I/usr/include/google/protobuf -I/usr/include/postgresql -o plugin.cpp.o -c plugin.cpp
g++ -Dexample_EXPORTS -pedantic -W -Wextra -Wall -Wno-strict-aliasing -Wno-long-long -Wno-unused-parameter -fPIC -D__STDC_FORMAT_MACROS -Wno-system-headers -isystem -O2 -g -DNDEBUG -ggdb3  -D__STDC_LIMIT_MACROS -I. -DPROJECT_ROOT="\"/home/apoliakov/workspace/scidb_trunk\"" -I"/opt/scidb/13.3/include" -I"/home/apoliakov/workspace/scidb_trunk/include" -I/usr/include/google/protobuf -I/usr/include/postgresql -o PenMax.cpp.o -c PenMax.cpp
g++ -Dexample_EXPORTS -pedantic -W -Wextra -Wall -Wno-strict-aliasing -Wno-long-long -Wno-unused-parameter -fPIC -D__STDC_FORMAT_MACROS -Wno-system-headers -isystem -O2 -g -DNDEBUG -ggdb3  -D__STDC_LIMIT_MACROS -I. -DPROJECT_ROOT="\"/home/apoliakov/workspace/scidb_trunk\"" -I"/opt/scidb/13.3/include" -I"/home/apoliakov/workspace/scidb_trunk/include" -I/usr/include/google/protobuf -I/usr/include/postgresql -o libpenmax.so plugin.cpp.o PenMax.cpp.o -L"/home/apoliakov/workspace/scidb_trunk/lib" -shared -Wl,-soname,libpenmax.so -L.

$ cp libpenmax.so ~/workspace/scidb_trunk/stage/install/lib/scidb/plugins/

$ iquery -aq "load_library('penmax')"
Query was executed successfully

$ iquery -aq "list('aggregates')"
No,name
0,"ApproxDC"
1,"avg"
2,"count"
3,"max"
4,"min"
5,"pen_max"
6,"stdev"
7,"sum"
8,"var"

$ iquery -aq "create array foo <val:int64 null> [x=1:10,10,0]"
Query was executed successfully

$ iquery -aq "store(build(foo, x), foo)"
x,val
1,1
2,2
3,3
4,4
5,5
6,6
7,7
8,8
9,9
10,10

$ iquery -aq "aggregate(foo, pen_max(val))"
i,val_pen_max
0,9

$ iquery -aq "window(foo,1,0, pen_max(val))"
x,val_pen_max
1,?1
2,1
3,2
4,3
5,4
6,5
7,6
8,7
9,8
10,9
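
The window output above can be reproduced outside SciDB with a small sketch. It assumes that the arguments 1,0 mean one preceding and zero following cell along x (which is consistent with the output), and it models the null in the first row (printed as ?1) as an empty std::optional. The function names penMax and windowPenMax are made up for this illustration and are not part of the plugin.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Second-largest value of data[lo..hi] (inclusive),
// or an empty optional when the range holds fewer than two values.
std::optional<int64_t> penMax(const std::vector<int64_t>& data,
                              std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi < data.size());
    std::optional<int64_t> largest, second;
    for (std::size_t i = lo; i <= hi; ++i) {
        const int64_t v = data[i];
        if (!largest || v > *largest) {
            second = largest;
            largest = v;
        } else if (!second || v > *second) {
            second = v;
        }
    }
    return second;
}

// For every cell, apply penMax to the window
// [i - preceding, i + following], clipped to the array bounds.
std::vector<std::optional<int64_t>> windowPenMax(const std::vector<int64_t>& data,
                                                 std::size_t preceding,
                                                 std::size_t following) {
    std::vector<std::optional<int64_t>> out;
    out.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        const std::size_t lo = (i >= preceding) ? i - preceding : 0;
        const std::size_t hi = std::min(i + following, data.size() - 1);
        out.push_back(penMax(data, lo, hi));
    }
    return out;
}
```

With the values 1..10 and windowPenMax(data, 1, 0), the first cell has a one-element window and therefore no second-largest value; every later cell sees {x-1, x} and yields x-1.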

#2

Is this the best and latest documentation available on UDAs? Does the example work with SciDB 15.7?


#3

I updated the Makefile to use the 15.7 Boost include directory and C++14. See the updated Makefile here. It does not compile, apparently because of API changes in SciDB. The error I get is:

PenMax.cpp: In member function 'virtual void scidb::PenMaxAggregate::initializeState(scidb::Value&)':
PenMax.cpp:133:15: error: 'class scidb::Value' has no member named 'setVector'
         state.setVector(stateSize);
               ^
PenMax.cpp:134:15: error: 'class scidb::Value' has no member named 'setZero'
         state.setZero();
               ^
In file included from /opt/scidb/15.7/3rdparty/boost/include/boost/filesystem/path_traits.hpp:23:0,
                 from /opt/scidb/15.7/3rdparty/boost/include/boost/filesystem/path.hpp:25,
                 from /opt/scidb/15.7/3rdparty/boost/include/boost/filesystem.hpp:16,
                 from /usr/src/scidb-15.7.0.9267/include/system/Utils.h:36,
                 from /usr/src/scidb-15.7.0.9267/include/system/Config.h:45,
                 from /usr/src/scidb-15.7.0.9267/include/query/Operator.h:58,
                 from PenMax.cpp:24:

#4

Hi,

Update plugin.cpp to correspond to 15.7, add #include <cstring> to PenMax.cpp, and change initializeState(Value& state) as follows:

virtual void initializeState(Value& state)
{
    // state.setVector(stateSize);
    // state.setZero();
    memset(state.setSize(stateSize), 0, stateSize);
}
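
To show what that one line does without needing the SciDB headers, here is a small standalone sketch. FakeValue is a hypothetical stand-in, not scidb::Value. It only assumes, as the snippet above implies, that setSize resizes the buffer and returns a pointer to it. The stand-in fills the buffer with a non-zero byte on purpose, to mimic memory whose contents are not guaranteed, so the effect of memset is visible. The <cstring> include is what declares memset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for scidb::Value; mimics only the call used above.
class FakeValue {
public:
    // Assumed behaviour: resize the buffer and return a pointer to it.
    // The 0xAB fill imitates memory with unspecified contents.
    void* setSize(std::size_t n) {
        buf_.assign(n, 0xAB);
        return buf_.data();
    }
    const unsigned char* data() const { return buf_.data(); }
    std::size_t size() const { return buf_.size(); }

private:
    std::vector<unsigned char> buf_;
};

// Same pattern as the fix: size the state, then zero every byte of it.
void initializeState(FakeValue& state, std::size_t stateSize) {
    assert(stateSize > 0);  // avoid calling memset on an empty buffer
    std::memset(state.setSize(stateSize), 0, stateSize);
}
```

After initializeState(v, 16) the stand-in holds 16 zero bytes, which is the combined effect the old setVector/setZero pair appears to have had.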

I tested it with your Makefile and it works like a charm.

BR.
Janne


#5

Customized window operation