I wonder if anyone has tried scidbpy for persisting data to remote nodes. This should be a common workload and I/O pattern for parallel algorithms. Here's my use case:
I have a 4 GB numpy array and I want to store it across 4 instances on 2 physical nodes (i.e., 2 instances per node). I split the data along the 4th dimension, whose range is [0, 288). The code looks like the following ('0' and '1' are local instance IDs; '4294967298' and '4294967299' are remote instance IDs):
data_sdb_0 = sdb.from_array(data[:,:,:,0:72],0,name='mri_4_0')
data_sdb_1 = sdb.from_array(data[:,:,:,72:144],1,name='mri_4_1')
data_sdb_2 = sdb.from_array(data[:,:,:,144:216],4294967298,name='mri_4_2')
data_sdb_3 = sdb.from_array(data[:,:,:,216:288],4294967299,name='mri_4_3')
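For reference, here is how I produce the four equal slices along the 4th dimension with numpy (a minimal sketch; the small zero-filled array stands in for the real 4 GB MRI volume, whose actual leading dimensions I've omitted):

```python
import numpy as np

# Hypothetical stand-in for the real MRI volume; only the 4th dimension
# (length 288) matters for the split.
data = np.zeros((2, 2, 2, 288), dtype=np.float32)

# Split into 4 equal chunks along axis 3: [0,72), [72,144), [144,216), [216,288)
chunks = np.split(data, 4, axis=3)

for i, chunk in enumerate(chunks):
    print(i, chunk.shape[3])  # each chunk covers 72 positions along axis 3
```

Each element of `chunks` corresponds to one of the `data[:,:,:,a:b]` slices passed to `sdb.from_array` above.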
The first two calls, for the local instances, completed fairly quickly (roughly 80 seconds each), yet the 3rd call appears to hang indefinitely. Any ideas or suggestions?
PS: I was able to store an array to remote nodes via AQL without any issues.