This is a tough problem, because the term “binary” can cover a lot of things.
Here’s where we are:
- We have the basic CSV load utility, and a “proprietary” (read: convenient for us) external format. These are designed for ease of use rather than for performance or adherence to any kind of standard. The idea is that in a lot of places, passing around CSV files is the “way these things are handled”.
- We have a couple of people working on loaders / file access layers for specific “binary” formats: the ones that are common in the science data community, for example HDF5 and FITS. Why these? Because there are people using SciDB with data in these formats. The FITS loader is going to be added to the trunk as an example.
- The design of the engine is such that adding another external format isn’t that hard. At least two groups have already done this for their formats with (relatively) little help from anyone here. We also prototype ideas for storage manager designs by writing drivers that talk to external (i.e. non-SciDB storage manager) files.
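Until a loader exists for a particular binary format, one workaround that fits the CSV path described above is to convert the binary records to CSV first. The following is only a minimal sketch under stated assumptions: the record layout (two little-endian 32-bit integers followed by a 64-bit float), the column names, and the function names `binary_to_csv` and `demo` are all hypothetical and are not part of SciDB. It uses only the Python standard library.

```python
import csv
import io
import struct

# Hypothetical record layout (an assumption for illustration only):
# little-endian int32 x, int32 y, float64 value -> 16 bytes per record.
RECORD = struct.Struct("<iid")


def binary_to_csv(src, dst):
    """Read fixed-width records from a binary stream and write them as CSV rows.

    Returns the number of records converted.
    """
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["x", "y", "value"])  # assumed column names
    count = 0
    while True:
        chunk = src.read(RECORD.size)
        if not chunk:
            break  # clean end of input
        if len(chunk) != RECORD.size:
            raise ValueError("truncated record at end of input")
        writer.writerow(RECORD.unpack(chunk))
        count += 1
    return count


def demo():
    """Convert two in-memory records and return (count, csv_text)."""
    raw = b"".join(RECORD.pack(x, y, v) for x, y, v in [(0, 0, 1.5), (0, 1, 2.25)])
    out = io.StringIO()
    n = binary_to_csv(io.BytesIO(raw), out)
    return n, out.getvalue()
```

This trades performance for simplicity, which is the same trade the CSV utility itself makes. A real format such as FITS or HDF5 has headers and metadata that a fixed-width reader like this does not handle, which is why dedicated loaders are being written for them.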