I’m working with SciDB and need to implement a complex query on two arrays that produces one array; all three are stored in the database. The input arrays are:
The output array is:
The queries that I need to execute are the following. I’m building them as Python format strings and executing the final query with iquery:
cross_forest = "cross(Forest,Forest)"
apply_filt = "apply(%s, flux_sq, flux*flux_2)" % cross_forest
forest_cross = "project(%s, flux_sq, dist, dist_2)" % apply_filt
# Alias the nested expression (FC is an arbitrary name) so its dimensions can be referenced in the join
cross_join_theta_and_forest = "cross_join(%s as FC, Theta, FC.index, Theta.index, FC.index_2, Theta.index_2)" % forest_cross
calc_r = "apply(%s, r_parallel, dist100 * theta3 , r_transverse, dist+dist_2+theta)" % cross_join_theta_and_forest
filter_r = "filter(%s, r_parallel + r_transverse<100)" % calc_r
project_r = "project(%s, flux_sq, r_parallel, r_transverse)" % filter_r
final_step = "redimension(%s, Bins, sum(flux_sq) as total_flux)" % project_r
iquery -aq "<final_step>;"
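A minimal sketch of how the steps above could be assembled into one nested AFL expression and handed to iquery from Python. The function names `build_query` and `run_iquery`, the `dry_run` switch, and the `FC` alias are hypothetical and only for illustration; the operator arguments are copied from the post as written, and the sketch assumes `iquery` is on the PATH when `dry_run` is turned off.

```python
import subprocess


def build_query():
    """Compose the steps from the post into one nested AFL expression."""
    cross_forest = "cross(Forest,Forest)"
    apply_filt = "apply(%s, flux_sq, flux*flux_2)" % cross_forest
    forest_cross = "project(%s, flux_sq, dist, dist_2)" % apply_filt
    # Assumption: alias the nested expression as FC so the join can name its dimensions.
    cross_join_theta_and_forest = (
        "cross_join(%s as FC, Theta, FC.index, Theta.index, FC.index_2, Theta.index_2)"
        % forest_cross
    )
    # The expressions below are copied verbatim from the post.
    calc_r = (
        "apply(%s, r_parallel, dist100 * theta3 , r_transverse, dist+dist_2+theta)"
        % cross_join_theta_and_forest
    )
    filter_r = "filter(%s, r_parallel + r_transverse<100)" % calc_r
    project_r = "project(%s, flux_sq, r_parallel, r_transverse)" % filter_r
    return "redimension(%s, Bins, sum(flux_sq) as total_flux)" % project_r


def run_iquery(afl, dry_run=True):
    """Return the iquery command for the expression, or run it if dry_run is False."""
    cmd = ["iquery", "-aq", afl + ";"]
    if dry_run:
        return cmd
    # Passing a list (no shell) avoids quoting problems with the long query string.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(run_iquery(build_query())))
```

Built this way, the whole chain is a single expression sent in one iquery call, and no `store(...)` appears anywhere in it.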
My problem is that I don’t need the intermediate arrays, yet it seems that I have to fetch and store each of them, which takes a very long time. For example, the first command I would have to execute is:
iquery -aq "store(cross(Forest, Forest), forest_cross);"
and so on.
Is there any way for me to avoid fetching and saving the intermediate results?
Any other suggestions to speed up the process are welcome as well!