Upload NumPy Array to SciDB keeping the NumPy Array's indices


#1

Hello all,
This forum seems a bit dead, but I’m going to try my luck once more.

I’m trying to upload data to SciDB using SciDB-Py 16.9.1. The data I’m trying to upload is a NumPy array which, for the sake of simplicity, can be:

In [1]: myarray = np.array([[1,2],[2,4]])

In [2]: myarray
Out[2]: 
array([[1, 2],
       [2, 4]])

In [3]: myarray.shape
Out[3]: (2, 2)


Notice that the values of this array can be accessed in the following way:

In [4]: myarray[0,0]
Out[4]: 1

In [5]: myarray[1,1]
Out[5]: 4

My problem starts when I try to upload data like this to SciDB. If I use SciDB’s input operator to create an array from the NumPy array, the data is not indexed in the same way, as you can see below:

In [6]: db.input('<x:int64 not null>[i,j]',upload_data=myarray)[:]
Out[6]: 
array([(0, 0, 1), (0, 1, 2), (0, 2, 2), (0, 3, 4)],
      dtype=[('i', '<i8'), ('j', '<i8'), ('x', '<i8')])

In [7]: db.input('<x:int64 not null>[i,j]',upload_data=myarray)[0]
Out[7]: (0, 0, 1)

In [8]: db.input('<x:int64 not null>[i,j]',upload_data=myarray)[3]
Out[8]: (0, 3, 4)

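What seems to happen is that the values are read in row-major order (j varies fastest, like the inner loop of two nested for loops), but because the dimensions have no bounds, j never reaches a maximum, so i never increments and all four values end up along j.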
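For anyone who wants to see the two orderings side by side without a SciDB connection, here is a minimal NumPy-only sketch. It assumes the upload is a plain row-major flattening of the array, which is what Out[6] suggests; the variable names `flat`, `unbounded` and `cells` are only illustrative.

```python
import numpy as np

myarray = np.array([[1, 2], [2, 4]])

# Row-major (C order) flattening: the last index (j) varies fastest.
flat = myarray.ravel(order="C")

# With no bounds on j, every value lands at i = 0 (this mimics Out[6]).
unbounded = [(0, k, int(v)) for k, v in enumerate(flat)]

# With the shape known, each flat position maps back to an (i, j) pair.
ii, jj = np.unravel_index(np.arange(flat.size), myarray.shape)
cells = [(int(i), int(j), int(v)) for i, j, v in zip(ii, jj, flat)]

print(unbounded)
print(cells)
```

The second list is the indexing I am after: the same values in the same order, but with i incrementing once j has covered a full row.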

What I would like to happen is:

In [7]: db.input('<x:int64 not null>[i,j]',upload_data=myarray)[0]
Out[7]: (0, 0, 1)

In [8]: db.input('<x:int64 not null>[i,j]',upload_data=myarray)[3]
Out[8]: (1, 1, 4)

Notice the difference between the two Out[8] results.

Does anyone know how to solve this?

Thank you.


#2

Ok guys, never mind. I got it to work already. I’ll post it here in case someone needs it in the future.

In [9]: db.input('<x:int64 not null>[i=0:1;j=0:1]',upload_data=myarray)[:]
Out[9]: 
array([(0, 0, 1), (0, 1, 2), (1, 0, 2), (1, 1, 4)],
      dtype=[('i', '<i8'), ('j', '<i8'), ('x', '<i8')])

And as a test:

In [10]: db.input('<x:int64 not null>[i=0:1;j=0:1]',upload_data=myarray)[0]
Out[10]: (0, 0, 1)

In [11]: db.input('<x:int64 not null>[i=0:1;j=0:1]',upload_data=myarray)[1]
Out[11]: (0, 1, 2)

In [12]: db.input('<x:int64 not null>[i=0:1;j=0:1]',upload_data=myarray)[2]
Out[12]: (1, 0, 2)

In [13]: db.input('<x:int64 not null>[i=0:1;j=0:1]',upload_data=myarray)[3]
Out[13]: (1, 1, 4)

So all we have to do is set the limits of our dimensions beforehand, in this case [i=0:1;j=0:1], and we are done.
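If you do not want to type the bounds by hand, they can be derived from the array's shape. The following is only a sketch: `schema_for` and the `SCIDB_TYPES` mapping are hypothetical helpers I made up for illustration, not part of SciDB-Py, and the dtype-to-type mapping is an assumption you should check against your SciDB version.

```python
import numpy as np

# Hypothetical mapping from NumPy dtype names to SciDB attribute types.
# This is an assumption for illustration; extend or correct it as needed.
SCIDB_TYPES = {
    "int64": "int64",
    "int32": "int32",
    "float64": "double",
    "float32": "float",
}


def schema_for(arr, attr="x", dim_names="ijklmn"):
    """Build a schema string with explicit 0-based bounds for every dimension."""
    if arr.ndim > len(dim_names):
        raise ValueError("not enough dimension names for this array")
    scidb_type = SCIDB_TYPES[arr.dtype.name]
    # One "name=0:size-1" entry per dimension, separated by ';' as in the post.
    dims = ";".join(
        f"{name}=0:{size - 1}" for name, size in zip(dim_names, arr.shape)
    )
    return f"<{attr}:{scidb_type} not null>[{dims}]"


myarray = np.array([[1, 2], [2, 4]], dtype=np.int64)
schema = schema_for(myarray)
print(schema)

# With a live connection, the string would be passed exactly as above:
# db.input(schema, upload_data=myarray)[:]
```

The dtype is set explicitly to int64 here because the default integer type of `np.array` can differ between platforms, and the attribute type in the schema has to match the uploaded data.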


#3

Thanks @davv-deimos for posting on the forum. Glad that you found the answer to your question.

As mentioned in this post, we are not always able to answer immediately. But we (at Paradigm4) do monitor the forum and try to be as responsive as possible. And then there are other good folks in the community that contribute answers on the forum.

Look forward to hearing more from you.


#4

Hi @Kriti_Sen_Sharma :)

Glad to hear that people are still active here. Would you mind taking a quick look at this post that I made the other day? I still have not been able to figure it out by myself…