 # Best way to sum times series of 2D Boolean masks

#1

Hi devs and users, I found a use case for which I would like to use SciDB but am unsure on how to proceed and would like to ask for some advice. I’m faced with a time-series (sampled each half hour) of a 2-dimensional Boolean mask. I wish to aggregate these Boolean masks (and sum the individual i, j indices) over an arbitrary time-series range.

The result would be a two-dimensional array of counts (number of times i, j has value True).

I hope I made myself clear!

Best and thank you for the help.

#2

Sure, this is using 16.9 with the AIO plugin. For simplicity, we will use SciDB sparsity and not store the 0 values. You could store those if you want to distinguish between `0` and `NULL` - for example if you have missing data.

Make an example TSV file like so. We’ll use the simplest time scale possible:

``````\$ cat /tmp/sample_data.tsv
t	x	y	value
0	1	3	1
0	5	2	1
0	7	9	1
1	1	3	1
1	5	2	1
2	5	2	1
2	9	3	1
``````

Load it into an array:

``````\$ iquery -aq "
store(
redimension(
apply(
t, int64(a0), x, int64(a1), y, int64(a2), value, bool(int64(a3))
),
<value:bool> [t=0:*:0:*, x=0:*:0:*, y=0:*:0:*]
),
test_data
)"
{t,x,y} value
{0,1,3} true
{0,5,2} true
{1,1,3} true
{1,5,2} true
{2,5,2} true
{0,7,9} true
{2,9,3} true
``````

And finally aggregate:

``````\$ iquery -aq "aggregate(test_data, count(*), x,y)"
{x,y} count
{1,3} 2
{5,2} 3
{7,9} 1
{9,3} 1
``````