I am in the process of designing a remote monitoring system for renewable energy hybrid microgrids. The data I will be collecting is time series - streams of measurements taken from sensors deployed in the field. Thus, I have been looking into various specialised time series databases such as OpenTSDB, InfluxDB, KairosDB, Riak TS, etc. At first, as I started reading about each of these offerings, each one seemed to be the ideal solution - the greatest thing since sliced bread - much more suitable than Postgres or MySQL. However, after a bit more digging through the discussion forums, I have found compelling reasons not to go with any of the above.
When I came across SciDB and learned that it uses multidimensional arrays to store data, I had an “a-ha!” moment. It seems to me that multidimensional arrays are the ideal way to handle time series data. Yes, time is a single-dimensional phenomenon, but the cyclical nature of the solar system we live in means that we do not use it this way. We constantly chop and slice time into recurring “dimensions” - years, months, days, hours, minutes, seconds, milliseconds, etc. And we naturally want to query a stream of time series data by aggregating along these dimensions, looking at the big picture first, and then zooming into the detail (at least I do in my use case).
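To make the idea concrete, here is a rough sketch in plain Python (not SciDB, and not the API of any of the databases mentioned above) of what I mean by treating calendar units as dimensions. The names `time_coords` and `rollup` and the sample readings are made up purely for illustration. The only assumption is that each timestamp becomes a tuple of coordinates, and that zooming out means aggregating over a shorter prefix of that tuple.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def time_coords(ts):
    """Split a timestamp into (year, month, day, hour, minute) coordinates."""
    return (ts.year, ts.month, ts.day, ts.hour, ts.minute)


def rollup(readings, depth):
    """Average readings over the first `depth` time dimensions.

    depth=3 gives daily means, depth=4 gives hourly means, and so on.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for ts, value in readings:
        key = time_coords(ts)[:depth]
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical sample data: one sensor reading per minute for two hours.
start = datetime(2016, 1, 1, 0, 0, tzinfo=timezone.utc)
readings = [(start + timedelta(minutes=i), float(i)) for i in range(120)]

daily = rollup(readings, depth=3)   # the big picture
hourly = rollup(readings, depth=4)  # one zoom level further in
```

A zoomable chart would then pick `depth` from the current zoom level. Whether a real array database gains anything from storing the data this way, rather than along a single time dimension, is exactly what I am unsure about.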
So, I am wondering whether others have used SciDB as a time series database (TSDB) and can share their experiences. Is there a definitive guide to using SciDB as a TSDB? Has anyone done performance comparisons between SciDB and any of the specialised time series databases mentioned above? Does my idea of splitting minutes, hours, days, months, etc. into different dimensions even make sense? Does it make it easier to, say, build a zoomable time series chart for a web app? Is Paradigm4 interested in conquering the burgeoning market for time series databases which the “Internet of Things” craze is spawning?