Data analysis tasks with SciDB


#1

Hi, I have installed SciDB and tried it out. The array features are pretty nice.
I have a question: what is a good way to run data analysis or data mining algorithms with SciDB, such as clustering or dimensionality reduction (PCA, SVD)?
Should I use a client API (Python, C, …) to access the data and process it, or write UDFs? (It seems UDFs can’t take arrays as input.)

I would like to know the proper way to do data analysis tasks on SciDB data, and what advantages SciDB offers in that situation.

Thank you in advance,
jl


#2

Hello!

We are actively working on data analysis algorithms. Maybe around release 12.11 we will have what you want.
Technically, a UDF cannot take an array because it is a scalar operation. For array processing we support user-defined operators. Writing your own operator is a more complex task, but it is possible. The documentation has some information about it, and many examples of how to implement operators can be found in the SciDB sources (src/query/ops/…). You can pick the one closest to your needs and change its behaviour.
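
To illustrate the scalar vs. array distinction, here is a rough sketch in plain Python. It does not use the SciDB API at all: nested lists stand in for an array, and the names apply_scalar and center_columns are made up for this example. A scalar function only ever sees one cell value, while something like column centering (the usual first step of PCA) needs the whole array, which is why it has to be an operator rather than a UDF.

```python
# Illustrative only: nested lists stand in for a SciDB array.
# None of these names belong to the SciDB API.

def add_one(value):
    """Scalar 'UDF': receives a single cell value, returns a single value."""
    return value + 1.0


def apply_scalar(array, func):
    """Apply a scalar function to every cell independently."""
    return [[func(v) for v in row] for row in array]


def center_columns(array):
    """Array-level 'operator': each output cell depends on the mean of its
    whole column, so the function must see the entire array."""
    n_rows = len(array)
    means = [sum(col) / n_rows for col in zip(*array)]
    return [[v - m for v, m in zip(row, means)] for row in array]


data = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
shifted = apply_scalar(data, add_one)   # cell-by-cell, no global knowledge
centered = center_columns(data)         # needs per-column means
```

Here add_one could be written as a UDF, since it never looks beyond one value. center_columns could not, because it needs the column means before it can produce any output cell.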
Feel free to ask us any questions you have.