Getting 1st Non-Null cell of the array?


#1

Hi Experts,

Just wondering how do I get the 1st non-null cell out of an array?
E.g. if it’s a 1-D array, I want to do firstvalue(array) and return the 1st cell of the array that’s not empty.
(Same for 2D arrays, which is the top-left most cell)

Thanks

-Yushu


#2

Hi Yushu!

Did you mean NULL or EMPTY? Remember these are different in SciDB. I think you mean empty so I’ll tell you about empty. We can discuss NULL later.
Here’s a hypothetical dummy 2D example:

iquery -aq "create array foo<val:int64> [x=1:10000,1000,0,y=1:10000,1000,0]"
iquery -anq "store(build_sparse(foo,x+y, x>=11 and y>=3141), foo)"

So the right answer is “{11,3141}[3152]”

Here’s a two-query solution. First find out the coordinates. Then find the value at those coordinates:

iquery -aq "aggregate(apply(foo, x_val, x, y_val, y),min(x_val) as min_x, min(y_val) as min_y)"
[(11,3141)]
iquery -aq "between(foo, 11,3141,11,3141)"
[[{11,3141}(3152)]]

Here’s a slow one-query solution. It’s for educational purposes. It’s a good case of using cross() but it’s slow because it performs a lot of boolean comparisons:

iquery -aq "
filter(
  cross(
    foo, 
    aggregate(apply(foo, x_val, x, y_val, y),min(x_val) as min_x, min(y_val) as min_y)
  ), 
  x=min_x and y=min_y
)"
[[[{11,3141,0}(3152,11,3141)]]];

And here’s a fast one-query solution. It’s fast because join() on dimensions is fast in SciDB:

iquery -aq "
join(
  foo,
  redimension(
    apply(
      substitute(
        aggregate(apply(foo, x_val, x, y_val, y),min(x_val) as min_x, min(y_val) as min_y), 
        build(<val:int64> [i=0:0,1,0], 0), 
        min_x, min_y
      ), 
      val, true
    ), 
    <val:bool> [min_x=1:10000,1000,0,min_y=1:10000,1000,0]
  )
)"
[[{11,3141}(3152,true)]]

Note the apply is superficial (so we have one attribute) and the substitute is also tacky. We’re hoping we can simplify this kind of syntax soon!
Are these informative?