Counting occurrences


#1

Hello,
I have been trying to use an equivalent of the “table” R function in SciDB-R, without success. For example, the following standard R code:

> x <- c(1, 2, 1, 1, 2)
> table(x)
x
1 2
3 2

This produces a table of the number of appearances of each value: 1 appears three times and 2 appears twice. How can I do this from SciDB-R? I found an example in the AFL query language, but could not find a way to do this in R.

I would like to use this to count occurrences in each row of a two-dimensional matrix.
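
In plain R, on an in-memory matrix, the per-row version might look like the following sketch. The matrix `m` and the helper name `lv` are made up for illustration, and this is base R only, not SciDB-R:

```r
# Hypothetical small in-memory matrix; the values are illustrative only.
m <- matrix(c(1, 2, 1,
              2, 2, 1,
              1, 1, 1), nrow = 3, byrow = TRUE)

# Fix the set of levels up front so every row reports the same columns,
# including values that do not occur in that row (count 0).
lv <- sort(unique(as.vector(m)))

# apply() over rows returns one column per input row, so transpose the
# result to get one output row per input row.
row_counts <- t(apply(m, 1, function(r) table(factor(r, levels = lv))))
print(row_counts)
```

Each row of `row_counts` then holds the number of times each value appears in the corresponding row of `m`.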

Thanks,
Ohad.


#2

Hey Ohad,

Sure, the aggregate() function should do it. Here’s a count grouped by dimension, on a sparsely populated 1000x1000 matrix:

> scidbconnect()
> m = scidb("build(<val:double>[i=0:99999,100000,0], random())")
> m = bind(m, "x", "random()%1000")
> m = bind(m, "y", "random()%1000")
> m = redimension(m, "<val:double> [x=0:999,1000,0, y=0:999,1000,0]")
> m = scidbeval(m, temp=TRUE)
> count(m)
[1] 95179
> aggregate(m, FUN=count, by="x")[]
   [1]  96  94  89  89  98 105  98  82 100  98 ...

Here’s a count grouped by attribute:

> v = scidb("build(<val:double>[i=1:10000000,1000000,0], random()%10)")
> v = scidbeval(v, temp=TRUE)
> aggregate(v, FUN="count(*)", by="val")[]
    count val
0 1000793   0
1 1000620   1
2  999486   2
3  999931   3
4  999081   4
5  998826   5
6 1000485   6
7 1000942   7
8  999755   8
9 1000081   9

For some reason the R-friendly form FUN=count doesn’t work in the second case, but the SciDB form FUN="count(*)" does. I’m investigating.
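
For a quick sanity check on a small sample, the same kind of grouping can be reproduced locally in base R. This is only a sketch with made-up data (the vector `val` here is generated locally, not pulled from SciDB):

```r
# Illustrative local data, not retrieved from SciDB.
set.seed(1)
val <- sample(0:9, 1000, replace = TRUE)

# table() gives the per-value counts; as.data.frame() reshapes them into
# one row per distinct value, with a val column and a count column.
counts <- as.data.frame(table(val = val), responseName = "count")
print(counts)
```

The counts should sum to the number of elements, which is an easy property to compare against the SciDB result.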