SciDB limit on number of operands allowed in a filter operation


#1

TL;DR

As of my testing on the latest version of SciDB, a SciDB expression may have no more than 400 operands.

Details

I was curious about how many operands are allowed in a filter operation. I could not find any documentation on this, so I forced SciDB into giving me the answer by growing a filter expression until the parser rejected it:

iquery -aq "show(FEATURE)"
# {i} schema
# {0} # 'FEATURE<name:string> [feature_id=0:*:0:1000000]'

time iquery -aq "op_count(filter(FEATURE,  
            ( feature_id = 2835 ) OR ( feature_id = 3150 ) OR ( feature_id = 4491 ) OR ( feature_id = 7736 ) 
            OR ( feature_id = 8756 ) OR ( feature_id = 10264 ) OR ( feature_id = 10749 ) 
            OR ( feature_id = 12232 ) OR ( feature_id = 13116 ) OR ( feature_id = 15009 ) 
            OR ( feature_id = 15340 ) OR ( feature_id = 15797 ) OR ( feature_id = 15862 ) 
            OR ( feature_id = 16427 ) OR ( feature_id = 17700 ) OR ( feature_id = 20350 ) OR ( feature_id = 23677 ) OR ( feature_id = 24253 ) OR ( feature_id = 24928 ) OR ( feature_id = 28647 ) OR ( feature_id = 31173 ) OR ( feature_id = 36351 ) OR ( feature_id = 38549 ) OR ( feature_id = 38571 
           ...
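A query of this shape is easy to generate rather than type by hand. Below is a minimal sketch in Python, assuming the array name `FEATURE` and dimension `feature_id` from the `show(FEATURE)` output above. The helper name `build_filter_query` is hypothetical and only illustrates the string building; it does not talk to SciDB.

```python
def build_filter_query(array, dim, ids):
    """Return an AFL op_count(filter(...)) query that ORs one equality test per id.

    Hypothetical helper: it only assembles the query text in the same
    shape as the hand-written query above.
    """
    if not ids:
        raise ValueError("need at least one id to build a filter expression")
    # One "( dim = value )" comparison per id, joined with OR.
    terms = " OR ".join(f"( {dim} = {int(i)} )" for i in ids)
    return f"op_count(filter({array}, {terms}))"


if __name__ == "__main__":
    # Example with the first few ids from the query above.
    print(build_filter_query("FEATURE", "feature_id", [2835, 3150, 4491]))
```

The resulting string can be passed to `iquery -aq "..."`. Lengthening the id list step by step is one way to find the point at which the parser starts rejecting the expression.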

After increasing the number of OR'ed expressions to a suitably high number, I got:

UserException in file: src/query/parser/Translator.cpp function: checkDepthExpression line: 4162
Error id: scidb::SCIDB_SE_PARSER::SCIDB_LE_EXPRESSION_HAS_TOO_MANY_OPERANDS
Error description: Error during query parsing. 
A SciDB expression may have no more than 400 operands.

Recording here for others.


#2

How about putting your 400+ operands into another array and doing a join?


#3

@senya72 Yup, that is the other way. Something like:

cross_join(
    FEATURE as A,
    redimension(
        build(<feature_id:int64>[idx=1:5,100000,0],
              '[(250515),(252063),(317513),(390260),(9168)]', true),
        <idx:int64>[feature_id=0:*,1000000,0]) as B,
    A.feature_id, B.feature_id)
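The join query can also be generated from a list of ids. Here is a rough sketch in Python that reproduces the structure of the query above. The function name `build_cross_join_query` and its parameters are hypothetical, the chunk sizes default to the values used in this post, and dropping duplicate ids is my own precaution so that each id maps to a single coordinate after the redimension.

```python
def build_cross_join_query(array, dim, ids, build_chunk=100000, dim_chunk=1000000):
    """Return an AFL cross_join query that selects the cells of `array` at `ids`.

    Hypothetical helper: it only assembles the query text, mirroring the
    build / redimension / cross_join pattern shown above.
    """
    # Drop duplicates while keeping the original order (assumed precaution).
    unique_ids = list(dict.fromkeys(int(i) for i in ids))
    if not unique_ids:
        raise ValueError("need at least one id to build the lookup array")
    # Literal for build(): one single-attribute cell per id.
    literal = "[" + ",".join(f"({i})" for i in unique_ids) + "]"
    lookup = (
        "redimension("
        f"build(<{dim}:int64>[idx=1:{len(unique_ids)},{build_chunk},0],"
        f"'{literal}', true), "
        f"<idx:int64>[{dim}=0:*,{dim_chunk},0])"
    )
    return f"cross_join({array} as A, {lookup} as B, A.{dim}, B.{dim})"


if __name__ == "__main__":
    print(build_cross_join_query(
        "FEATURE", "feature_id", [250515, 252063, 317513, 390260, 9168]))
```

Unlike the filter expression, the list of ids here lives inside a single string literal rather than in one comparison per id.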

I posted this mainly because I wanted the limit on the number of operands in an expression to be documented.

One reason I went looking for this: generating the filter query requires only trivial AFL knowledge, which makes it a good fit for users who are new to AFL. The join approach uses cross_join, redimension, build with a literal, chunk sizes, etc., so it requires a bit more knowledge.