Are TEMP arrays materialized?


#1

Hi, I’m starting to use SciDB. I have a question about the behaviour of TEMP arrays. Are they materialized, or are they only a reference to another query?

Let me explain with an example. If I execute the following query:

store(apply(filter(SR,somefilter),something),FR);

is it the same as executing this sequence?

store(filter(SR,somefilter),TEMP_A);
store(apply(TEMP_A,something), FR);
remove(TEMP_A);

Will TEMP_A (an already defined TEMP array) be materialized in the database, or is it just a reference to the first partial query? Many thanks.


#2

Cattanisimone,

In SciDB, both a TEMP array and a non-TEMP array are materialized, in the sense that if you store() into either of them, the full content of the array will be computed. So no, store(apply(filter(…))) is not the same as store(filter(…)) + store(apply(…)) + remove(TEMP_A): the second form computes and stores the intermediate result in TEMP_A.

A TEMP array may be faster than a non-TEMP array for the following reason. While a non-TEMP array must be hardened to disk upon transaction commit, a TEMP array may live entirely in memory (unless there is memory pressure, in which case the buffer manager will swap out some in-memory chunks to disk).

Donghui


#3

Thanks, that is what I thought. I have another related question (please let me know if it would be better to open a new post in the forum): if I have a partial result that I need to use twice in subsequent queries, how can I optimize the computation? In other words, in the following query, does the SciDB optimizer recognize that ‘subqueryA’ has already been computed?

op1( subqueryA , op2 ( subqueryA) );

Thank you again

Simone


#4

Currently, the SciDB optimizer does not recognize that ‘subqueryA’ was already computed. So in your query, subqueryA will be computed twice.

It may be better to:

create temp array T …
store(subqueryA, T)
op1(T, op2(T))
remove(T)
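
To make the difference concrete, here is a toy model in plain Python. It is not SciDB code and does not use any SciDB API; subquery_a, op1 and op2 are hypothetical stand-ins I made up for illustration. It only counts how many times the shared subquery gets evaluated in each form.

```python
# Toy model only: plain Python, not SciDB code or a SciDB API.
# All names below are hypothetical stand-ins for illustration.
calls = {"subquery_a": 0}


def subquery_a():
    """Stand-in for an expensive subquery; counts its own evaluations."""
    calls["subquery_a"] += 1
    return [1, 2, 3, 4]


def op2(arr):
    """Stand-in operator: double every value."""
    return [2 * x for x in arr]


def op1(left, right):
    """Stand-in operator: add two arrays element by element."""
    return [a + b for a, b in zip(left, right)]


# Inline form, like op1(subqueryA, op2(subqueryA)):
# the subquery appears twice, so it is evaluated twice.
inline_result = op1(subquery_a(), op2(subquery_a()))
inline_calls = calls["subquery_a"]

# Materialized form, like store(subqueryA, T) followed by op1(T, op2(T)):
# the subquery is evaluated once and the stored result is reused.
calls["subquery_a"] = 0
t = subquery_a()
stored_result = op1(t, op2(t))
stored_calls = calls["subquery_a"]
```

Both forms give the same result, but the inline form evaluates the subquery twice and the materialized form only once. Whether the extra store() pays off in SciDB depends on how expensive subqueryA is compared with writing T.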