Aggregation using redimension


#1

Hello, I am using SciDB 14.12.

I am trying to aggregate on an array, but I am getting inconsistent results. Here is an example.

CONSTRUCT

m = scidb("build(<val:int32>[i=0:19,20,0], random()%1000)")
m = bind(m, "x", "random()%3")
m = bind(m, "y", "random()%5")
m = scidbeval(m)

Now we have an array with 20 cells and three attributes. Here are the first ten cells:

  val x y
0 206 0 3
1 610 2 4
2 112 2 0
3  94 1 4
4 853 1 2
5 206 0 4
6 182 1 2
7 183 1 1
8 941 0 0
9 822 0 3

This works:

z <- redimension(m, "<val:int32> [x=0:3,4,0, y=0:5,6,0]")

z[]
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 941 113 . 206 206 .
[2,] 499 183 853 . 94 .
[3,] 112 . . 16 610 .
[4,] . . . . . .

Now I would like to sum the values that collapse into the same <x,y> coordinate. For example, rows 4 and 6 have the same x and y:
row val x y
4 853 1 2
6 182 1 2
so in the result I would like to get M[1, 2] = 853 + 182 = 1035.
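
To make the expected result concrete, here is a minimal local sketch in base R (no SciDB involved) that sums val over each (x, y) pair for the ten sample rows listed above. The data frame `d` and the variable `expected` are illustrative names only, and the remaining ten cells of the array are not included, so the numbers will not match the full SciDB output.

```r
# Ten sample rows copied from the listing above (cell index omitted).
d <- data.frame(
  val = c(206, 610, 112, 94, 853, 206, 182, 183, 941, 822),
  x   = c(0, 2, 2, 1, 1, 0, 1, 1, 0, 0),
  y   = c(3, 4, 0, 4, 2, 4, 2, 1, 0, 3)
)

# Fix the factor levels so the result has the same 4 x 6 shape as the
# target schema [x=0:3, y=0:5], including coordinates with no cells.
d$x <- factor(d$x, levels = 0:3)
d$y <- factor(d$y, levels = 0:5)

# Sum val over every (x, y) pair. Coordinates with no cells come out as 0
# here, whereas SciDB would leave them empty.
expected <- xtabs(val ~ x + y, data = d)
print(expected)

# Rows 4 and 6 both land on x = 1, y = 2.
expected["1", "2"]   # 853 + 182 = 1035
```

This is only a reference for what the aggregated array should contain; it says nothing about how SciDB evaluates the query.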

Here are two ways of doing this that I expected to work, but neither does:

1) FUN="sum"

z <- redimension(m, "<val:int32> [x=0:3,4,0, y=0:5,6,0]", FUN="sum")
Error in redimension(m, "<val:int32> [x=0:3,4,0, y=0:5,6,0]", FUN = "sum") :
FUN must be a function

2) FUN=sum

z <- redimension(m, "<val:int32> [x=0:3,4,0, y=0:5,6,0]", FUN=sum)
Error in scidbquery(query, afl, async = FALSE, save = "lcsv+", release = 0, :
UserException in file: src/query/ops/redimension/LogicalRedimension.cpp function: inferSchema line: 152
Error id: scidb::SCIDB_SE_INFER_SCHEMA::SCIDB_LE_ATTRIBUTE_DOESNT_EXIST
Error description: Error during schema inferring. Attribute with id x does not exist in array ''.

What is the correct way of doing this in SciDB?

Thanks,
Ohad.


#2

I spent quite a while on this, and finally found a solution:

  z <- redimension(m, "<s:int64 null> [x=0:3,3,0, y=0:5,6,0]", FUN="sum(val) as s")

Another, less elegant solution is:

AGGR <- aggregate(m, by=list("x", "y"), FUN="sum(val) as v", eval=TRUE)
A <- AGGR[0:3, 0:5][]

I hope this helps others trying to use SciDB.
Ohad.