What I do is as follows:

It's somewhat redundant to say it, but my habit is always to create a script file that lists the queries that make up the benchmark. Because it's a good idea to minimize the "noise" in your measurement, I try to set up my queries so that their run-times are longer than 100 seconds. This might mean taking a short-running query and repeating it (with some variation) enough times to get a "phase" that lasts the requisite 100 seconds. The idea is to get a script that you can run and re-run as you change scale factors, physical design choices, and SciDB tuning options.
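One way to stretch a short-running query into a long-enough phase is to generate the repetitions from the shell. This is just a sketch: the array name `A`, the attribute `attr`, and the repetition count of 50 are made-up placeholders you would tune until the phase runs past 100 seconds.

```shell
# Generate one benchmark "phase" file by repeating a short query with a
# varying parameter. A, attr, and the count of 50 are hypothetical;
# increase the count until the whole phase takes ~100 seconds.
for i in $(seq 1 50); do
  echo "aggregate ( filter ( A, attr > $i ), count(*) );"
done > phase_Q1.afl
```

The varying threshold keeps SciDB from serving every repetition out of the same cached result.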
Then I use SciDB itself to record the per-query times. Before running the benchmark script, (drop and re-)create the following array:
CREATE ARRAY Timings
< time : datetime,
  what : string >
[ Step=0:*,1,0 ];
Then, between each of the queries Q1, Q2 ... Q10, add a version of the following query, changing the string stored in the what attribute to record what's going on at each step.
insert (
  redimension (
    apply (
      build ( < Step : int64 > [ R=0:0,1,0 ],
              iif ( high ( 'Timings', 'Step' ) < 0,
                    0,
                    high ( 'Timings', 'Step' ) + 1 ) ),
      time, now(),
      what, 'Step ' + string ( Step )  -- Replace this with a per-step message.
    ),
    Timings
  ),
  Timings
);
So your benchmark script ends up looking like this:
remove ( Timings );
create array Timings ...
insert ( ... 'Initialize' ..., Timings );
-- Q1 runs here.
insert ( ... 'Q1' ..., Timings );
-- Q2 runs here.
insert ( ... 'Q2' ..., Timings );
What you end up with is a Timings array containing the time at which each query phase begins and ends. Then you can use the following SciDB query to produce a list showing how long each phase took:
apply (
  join (
    Timings,
    window ( Timings, 1, 0, min ( time ) as start, max ( time ) as end )
  ),
  duration, end - start
);
This approach isn't for everyone. Others prefer to create a bash script with some variation on the following:
iquery -aq "query one"
iquery -aq "query two"
But then you need to write a bunch of gnarly scripts to parse the times out of the output. I'm lazy, so I just use SciDB.
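If you do go the bash route, the parsing doesn't have to be too gnarly. Here is a sketch, assuming you wrap each iquery call with `date +%s` and append one "label start end" line per query to a log file; the log format and the file name bench.log are my own invention:

```shell
# Assumed log format, one line per query: "<label> <start_epoch> <end_epoch>"
# (produced by capturing `date +%s` before and after each iquery call).
# The two lines below stand in for real timings.
printf 'Q1 100 212\nQ2 212 339\n' > bench.log

# Reduce the log to per-query durations in seconds.
awk '{ printf "%s %d\n", $1, $3 - $2 }' bench.log
# prints:
# Q1 112
# Q2 127
```

The same awk one-liner works no matter how many phases the benchmark has, since it just subtracts field 2 from field 3 on every line.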