Parallel loading with 15.7


#1

Hello,

I have two questions and would appreciate any answers or comments:

  1. Are there any improvements to parallel loading / redimensioning since v14.12? We ran our experiments on a very large data set on a cluster with 14.12 a long time ago, and I remember that both loading the data and redimensioning were very slow.

  2. Is there an efficient way to generate a very large 2-D dense matrix of random numbers?

Thanks!


#2

Hi.

For parallel load(), there were no substantial changes made in 15.7. However, make sure your 14.12 binaries were updated after 27-Jan-2015. At that time the 14.12 bits were respun to fix a parallel binary load bug introduced by some minor refactoring.

Also, 15.7 now includes the load_tools add-on pre-installed. This was formerly found at github.com/Paradigm4/load_tools but is going to be included and better supported going forward.

Regarding redimension(), improvements were made to memory management which should reduce fragmentation. There was also an optimization that can eliminate the need to create auxiliary temp data structures in some circumstances. So it’s definitely worth rerunning your redimension() benchmarks.

As to the second question, I will have to defer to others. Should be able to get you an answer soon.

Cheers,
Mike Leibensperger


#3

Thank you, Mike, for your reply. I will wait for your answer to the second question.