SciDB-Py and compression


#1

Hi,

it says in the SciDB-Py docs that SciDB-Py now supports compression when transferring arrays from SciDB to Python if a sufficiently new version of shim is installed on the server.
I gave this a try (I set “sdb.default_compression = 1” right after connecting) but got an error when wrapping an array that already exists in SciDB:

Traceback (most recent call last):
  File "MyCompressionTest.py", line 21, in <module>
    gridnodes = sdb.wrap_array("SpatialGridNode")
  File "/usr/lib/python2.7/site-packages/scidbpy/interface.py", line 305, in wrap_array
    schema = self._show_array(scidbname, fmt='csv')
  File "/usr/lib/python2.7/site-packages/scidbpy/interface.py", line 272, in _show_array
    return self._execute_query(query, **kwargs)
  File "/usr/lib/python2.7/site-packages/scidbpy/interface.py", line 1684, in _execute_query
    result = self._shim_read_lines(session_id, n, compressed)
  File "/usr/lib/python2.7/site-packages/scidbpy/interface.py", line 1782, in _shim_read_lines
    text_result = unzip(text_result)
  File "/usr/lib/python2.7/site-packages/scidbpy/interface.py", line 58, in unzip
    raise ValueError("Could not unzip: %s" % repr(payload))
ValueError: Could not unzip: "schema\n'SpatialGridNode<lat:float,lon:float,cellid:int64 NULL DEFAULT null,land_flag:bool> [sgrid=0:*,1,0,gpi=0:3264390,250000,0]'\n"

Does this mean that unzipping doesn’t succeed because the payload is not actually zipped?
How can I verify that my shim installation supports compression? I installed the “http://paradigm4.github.io/shim/shim-14.8-1-experimental.x86_64.rpm” binary package (on my Ubuntu 12.04 server); when that didn’t work, I tried compiling and installing shim myself but it didn’t improve matters.


#2

Hi,

shim has not been tagged yet because I’m still testing a few things, but you can check the commit version with:

wget -O - -q localhost:8080/version

replacing “localhost” with the IP address of the machine that shim is installed on. The last few characters of the most recent version should be 18c0.
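
If you would rather do the same check from Python, here is a rough sketch. It assumes the version string follows the usual git-describe pattern (tag, commit count, "g" plus an abbreviated commit hash, optional "-dirty"); fetch_shim_version and commit_suffix are names made up for illustration and are not part of shim or SciDB-Py.

```python
import re

try:  # Python 3
    from urllib.request import urlopen
except ImportError:  # Python 2.7
    from urllib2 import urlopen


def fetch_shim_version(host="localhost", port=8080, timeout=5):
    """Return the text served by shim's /version endpoint (hypothetical helper)."""
    url = "http://%s:%d/version" % (host, port)
    response = urlopen(url, timeout=timeout)
    try:
        return response.read().decode("ascii", "replace").strip()
    finally:
        response.close()


def commit_suffix(version):
    """Extract the abbreviated commit hash from a git-describe style string.

    Assumes the form <tag>-<n>-g<hash>[-dirty]; returns None for anything else.
    """
    match = re.search(r"-g([0-9a-f]+)(?:-dirty)?$", version.strip())
    return match.group(1) if match else None
```

Calling commit_suffix(fetch_shim_version("your-scidb-host")) and comparing the result with "18c0" mirrors the wget check above. A matching hash only tells you which commit the binary was built from, not that compression is actually enabled.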

If you compile shim from source you can check that things are working by running

make alltests

One of those tests exercises compression.

I’ll experiment with the Python package later today to try to narrow down the problem. Compression is a rather new shim option (put in by request recently).


#3

Hey there,

that error definitely looks like your version of shim is older and not compressing the transfer. Unfortunately SciDB-Py does not currently detect whether the installed version of shim supports compression: it just blindly tries to decompress the result, and fails if the data aren’t compressed.
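
As a stopgap on the client side, something along these lines could tolerate an uncompressed reply. This is only a sketch and not what SciDB-Py does today: maybe_decompress is a hypothetical helper, and it assumes shim sends gzip- or zlib-framed data when compression is switched on.

```python
import zlib


def maybe_decompress(payload):
    """Return (data, was_compressed) for a bytes payload.

    Hypothetical helper: tries to inflate the payload and falls back to
    returning it unchanged when it does not look like compressed data.
    """
    try:
        # wbits = MAX_WBITS | 32 lets zlib auto-detect a gzip or zlib header.
        return zlib.decompress(payload, zlib.MAX_WBITS | 32), True
    except zlib.error:
        # No valid header (or a corrupt stream): treat the payload as plain data.
        return payload, False
```

Plain text such as the schema string in the traceback above fails the header check, so it comes back unchanged with was_compressed set to False. The caller can then warn that the server ignored the compression request instead of raising.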

chris


#4

Thanks :smile:
My version right now seems to be v14.8-14-g18c0-dirty - it’s from the scidb-14.12-dev-tools Ubuntu binary package. So 18c0 is a substring of that but compression doesn’t seem to work. (Also, lately I’m getting a 404 when accessing the web dashboard but that may be another matter.)
Should I use paradigm4.github.io/shim/ubuntu_ … _amd64.deb instead?

Finally, a small (hopefully?) feature request: could you put the version number somewhere in the web interface so that it’s really obvious which version is in use? :smile:

So what version do you recommend?


#5

Bryan is the best person to advise on which shim version to install to get compression (it’s a fairly recent feature, and I don’t know offhand whether there are binary builds for it yet).