Real-time visualisation of incoming data


Hello. I'm creating a system for seismic monitoring at several sites. Each site has 3 seismographs; each seismograph sends 50 acceleration readings per half second (150 readings if you count the x/y/z axes separately). The operating program reads the accelerations from those 3 seismographs, calculates speed and distance, and puts all that data into a database (currently not every half second, but every 3 seconds). So each second there are 50 × 3 × 6 = 900 readings (100 acceleration, 100 speed and 100 distance values for each of X, Y and Z) from each seismograph => 2700 readings from the 3 of them.
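To make the arithmetic concrete, here is a quick Python check of the rates above (my own restatement of the figures in the post; the variable names are just for illustration):

```python
# Sanity-check of the data rates described above.
readings_per_half_sec = 50    # raw acceleration samples per axis per seismograph
axes = 3                      # x, y, z
quantities = 3                # acceleration, speed, distance
seismographs = 3

# 50 samples/half-second -> 100/second, times 3 quantities, times 3 axes:
per_seismograph_per_sec = readings_per_half_sec * 2 * quantities * axes
total_per_sec = per_seismograph_per_sec * seismographs

# Raw acceleration values actually arriving from the sensors every 0.5 s:
raw_per_half_sec = readings_per_half_sec * axes * seismographs
bytes_per_half_sec = raw_per_half_sec * 8   # stored as IEEE double precision

print(per_seismograph_per_sec)   # 900
print(total_per_sec)             # 2700
print(raw_per_half_sec)          # 450
print(bytes_per_half_sec)        # 3600
```

So the raw stream is only about 3.6 KB per half second per site, which is a very modest write volume for any database.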

Currently I'm using Firebird to store the data (it is written as separate BLOBs every 3 seconds, so every 3 seconds there are 3 axes × 3 seismographs = 9 new entries in the DB) and a node.js-based server for visualisation (it looks for new database entries every 3 seconds and draws them).

Will SciDB allow writing that data every 0.5 seconds and reading all new data every 0.5 seconds (i.e. writing/reading 450 double-precision numbers every 0.5 s)? What's the upper limit? Is there a good way to get at all of the newly arriving data? Maybe some other in-RAM DB that feeds the new data both to SciDB and to the visualisation software?


SciDB performance depends on a lot of factors, so I'm not sure about an upper bound, but in general it should easily be able to handle your data insertion/output workload. For example, let's say new data appears in the file /tmp/450.bin as a binary blob of 450 double-precision numbers, and suppose we create an array called 'd' to hold the data:
iquery -aq "create_array(d,<x:double>[i=0:*,1000,0])"

Then, consider:

time iquery -aq "apply(insert(input(<x:double>[i=0:*,1000,0],'/tmp/450.bin',0,'(double)'),d),y,sin(x))"
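To exercise that query without real sensors, a test file in the right shape can be generated with a few lines of Python. This assumes SciDB's '(double)' binary format is raw native-endian IEEE doubles (so plain struct packing produces a compatible file); random values stand in for real accelerations:

```python
import random
import struct

# Fake half-second batch: 450 acceleration readings as raw doubles.
values = [random.gauss(0.0, 1.0) for _ in range(450)]
blob = struct.pack("=450d", *values)     # 450 native-endian IEEE doubles

# Write the blob where the example query expects it.
with open("/tmp/450.bin", "wb") as f:
    f.write(blob)

print(len(blob))   # 3600 bytes = 450 * 8
```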

This query loads the data from the file, inserts it into the existing array 'd', applies a simple computation to it, and returns the results in ASCII form to the console. Runs on a small SciDB installation on a workstation average about 0.1 s, including the return of the ASCII output.

Alternatively, you could send your output back into another file or pipe for consumption by another process, something like:

time iquery -naq "save(apply(insert(input(<x:double>[i=0:*,1000,0],'/tmp/450.bin',0,'(double)'),d),y,sin(x)),'/tmp/output.bin',0,'(double,double)')"

This was less consistent for me, probably because of the output disk. It averaged about 0.25s per run on my workstation.
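On the consuming side, the visualisation process could unpack that binary output directly. A minimal sketch, assuming the saved array lays out each cell as an (x, y) pair of native doubles (an assumption, not verified against a real SciDB install); for illustration the sketch first fabricates a small output file the way the query would:

```python
import math
import struct

# Fabricate a small /tmp/output.bin: one (x, y = sin(x)) pair per cell.
xs = [0.1 * i for i in range(10)]
with open("/tmp/output.bin", "wb") as f:
    for x in xs:
        f.write(struct.pack("=dd", x, math.sin(x)))

# The visualisation process streams the pairs back out for plotting:
record = struct.Struct("=dd")            # two native doubles per cell
with open("/tmp/output.bin", "rb") as f:
    pairs = list(record.iter_unpack(f.read()))

print(len(pairs))   # 10
```

A real consumer would poll (or tail) the file every 0.5 s instead of reading it once.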

We are about to release a JDBC interface (in March) that you could also use, which will likely be faster.

I hope this helps,

Bryan Lewis