Hi, new here. At the risk of showing my ignorance, I'd like to get opinions on the pros and cons of building a large-scale scientific computing solution on SciDB vs. Spark, which is also very popular at the moment.
For example, what use cases are typically a bad fit for each?
To put things in context: for the problem at hand, the data set may not fit on one machine, in either memory or disk, but there are ways around that by preprocessing it into chunks. On the other hand, if SciDB really does solve the problem of sparse tensor storage and direct computation on it, then that preprocessing, along with the algorithmic limitations that come with it, would be superfluous.
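To illustrate the kind of chunking preprocessing I mean, here is a minimal sketch in Python using scipy.sparse (the function name and block size are just illustrative, not from any particular system):

```python
# Illustrative sketch of "chop it up" preprocessing: splitting a sparse
# matrix into row blocks that could each fit on a single machine.
import scipy.sparse as sp

def chunk_rows(mat, block_rows):
    """Split a sparse matrix into CSR row blocks of at most block_rows rows."""
    mat = mat.tocsr()  # CSR makes row slicing cheap
    return [mat[i:i + block_rows] for i in range(0, mat.shape[0], block_rows)]

# Example: a 10x4 sparse matrix split into blocks of 4 rows -> 3 chunks.
m = sp.random(10, 4, density=0.3, format="csr", random_state=0)
chunks = chunk_rows(m, 4)

# Stacking the chunks back together recovers the original matrix.
assert (sp.vstack(chunks) != m).nnz == 0
```

The catch, of course, is that once the data is in blocks, any algorithm that needs cross-block access has to manage that itself, which is exactly the limitation I'd hope a native sparse-array store would remove.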
Thank you in advance!