Out of memory caused by BIG chunks


#1

Hello, SciDB experts. I had my first out-of-memory error today.

SystemException in file: src/query/executor/SciDBExecutor.cpp function: executeQuery line: 232 Error id: scidb::SCIDB_SE_NO_MEMORY::SCIDB_LE_MEMORY_ALLOCATION_ERROR Error description: Not enough memory. Error 'std::bad_alloc' during memory allocation. Failed query id: 1101749883243
I did this: $ iquery -nq "SELECT * into MerraMonthly from MerraRaw"

as an attempt to redimension one instance of gridded climate data.

This started out as a question, but as I was typing this, I played with the system on another screen.
So now, instead of a plea for help, this is just a story. I learnt something from my own stupid mistakes. Since I have no pride, I'll share it in the hope that someone else may find it useful.

I want 388 months of climate data stored in an array. There are 20+ attributes (temp, humidity, wind velocity components and many others…). There are 42 height levels, 361 latitude cells and 540 longitude cells. The natural grid for this set is 42x361x540 per time slice. For each time slice I am reading 42x361x540 = 8187480 cells, each with 20+ attributes.

I load the data into the raw array, where the future dimensions (day, height, xdim, ydim) are stored as plain attributes.

AFL% show (MerraRaw); [("MerraRaw<day:int64,height:int64,xdim:int64,ydim:int64,H:float,T:float,U:float,V:float,QV:float,O3:float,Cov_U_V:float,Cov_U_T:float,Cov_V_T:float,Cov_U_QV:float,Cov_V_QV:float,vsts:float,Var_H:float,Var_T:float,Var_U:float,Var_V:float,Var_QV:float,Var_O3:float> [Line=0:*,1000000,0]")]

This is the destination into which I want to redimension:

AFL% show(MerraMonthly); [("MerraMonthly<H:float,T:float,U:float,V:float,QV:float,O3:float,Cov_U_V:float,Cov_U_T:float,Cov_V_T:float,Cov_U_QV:float,Cov_V_QV:float,vsts:float,Var_H:float,Var_T:float,Var_U:float,Var_V:float,Var_QV:float,Var_O3:float> [day=197701:*,100000,0,height=0:41,10000,0,xdim=0:539,10000,0,ydim=0:360,10000,0]")]

And therein lay my problem. I cut and pasted the dimension specs for the array, and ended up with big chunks everywhere.
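To put a rough number on "big", here is a back-of-the-envelope sketch in Python. It assumes 4 bytes per float value and ignores any overhead or compression inside SciDB, so treat the result as an order of magnitude only. It also assumes the day values are yyyymm numbers starting at 197701, in which case all 388 months fall inside the first day chunk of length 100000.

```python
# Rough estimate of how many cells one chunk can hold under the
# MerraMonthly schema quoted above.
# Assumptions: 4 bytes per float value, no SciDB overhead or compression,
# and all 388 months landing in a single chunk along the day dimension.

# (low, high, chunk_length) for each bounded dimension, copied from the schema
dims = {
    "height": (0, 41, 10000),
    "xdim": (0, 539, 10000),
    "ydim": (0, 360, 10000),
}


def effective_extent(low, high, chunk_length):
    """Number of cells a chunk can actually span along one bounded dimension."""
    return min(chunk_length, high - low + 1)


# Cells of one time slice that end up in the same chunk
cells_per_slice = 1
for low, high, chunk_length in dims.values():
    cells_per_slice *= effective_extent(low, high, chunk_length)

months = 388
bytes_per_float = 4
cells_all_months = cells_per_slice * months
gib_per_attribute = cells_all_months * bytes_per_float / 2**30

print("cells per time slice in one chunk:", cells_per_slice)
print("cells for all months in one chunk:", cells_all_months)
print("approx. GiB per attribute:", round(gib_per_attribute, 1))
```

Because every chunk length is larger than its dimension, the whole 42x361x540 grid of a time slice sits in one chunk, and so does every other time slice that gets loaded.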

The initial SciDB load into the raw array seemed to work, see:

AFL% analyze(MerraRaw); [("Cov_U_QV","-0.013443","1e+15",5938,8187480),("Cov_U_T","-331.577","1e+15",418113,8187480),("Cov_U_V","-474.918","1e+15",392122,8187480),("Cov_V_QV","-0.02533","1e+15",6445,8187480),("Cov_V_T","-239.375","1e+15",389741,8187480),("H","-154.186","67939.6",289148,8187480),("O3","0","1e+15",20,8187480),("QV","1e-06","1e+15",3143,8187480),("T","187.166","1e+15",324587,8187480),("U","-76.7378","1e+15",358230,8187480),("V","-53.1816","1e+15",397954,8187480),("Var_H","44.0004","1.81675e+06",216093,8187480),("Var_O3","0","1e+15",2,8187480),("Var_QV","0","1e+15",22,8187480),("Var_T","0.089355","1e+15",50320,8187480),("Var_U","0.490723","1e+15",133413,8187480),("Var_V","0.334473","1e+15",127228,8187480),("day","198001","198001",1,8187480),("height","0","41",42,8187480),("vsts","-210.56","1e+15",363751,8187480),("xdim","0","539",540,8187480),("ydim","0","360",361,8187480)]
The raw load is done with:

$ cat /tmp/junk | csv2scidb -c 1000000 -p NNNNNNNNNNNNNNNNNNNNNN > /tmp/load_pipe &
$ time iquery -anq "load ( MerraRaw, '/tmp/load_pipe')"

The command iquery -nq "SELECT * into MerraMonthly from MerraRaw" causes the memory error.
When I changed the MerraMonthly schema to one almost, but not quite, entirely like the one quoted above, with only the chunk lengths reduced:

day=197701:*,100,0, height=0:41,100,0, xdim=0:539,100,0, ydim=0:360,100,0]
and re-executed the select from raw into monthly, the memory error did not show up again.

Maybe there is already an operator or function that will look at an array schema and ask “do you know what you are doing?” when the chunk sizes get out of hand. In isolation, this array could probably live and be well. But when it interacts with other arrays, the implicit internal joins that some operations perform, even if temporary, can be huge, yes?

Maybe a sanity-check operator could take pairs of arrays and list potential dangers, considering the kinds of pairwise operations that can be defined.
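For a single array, the kind of check I have in mind could look roughly like the Python sketch below. This is not a SciDB operator. The function names and the limit are made up for illustration, and the parser only handles the simple name=low:high,chunk,overlap form seen in the schemas in this post. It computes an upper bound on cells per chunk, which overstates things when a dimension is sparsely populated.

```python
import re

# Matches one dimension spec of the form name=low:high,chunk_length,overlap
DIM_PATTERN = re.compile(r"(\w+)=(-?\d+):(\*|-?\d+),(\d+),(\d+)")


def parse_dims(spec):
    """Return a list of (name, low, high, chunk_length); high is None for '*'."""
    dims = []
    for name, low, high, chunk, _overlap in DIM_PATTERN.findall(spec):
        high_value = None if high == "*" else int(high)
        dims.append((name, int(low), high_value, int(chunk)))
    return dims


def max_cells_per_chunk(spec):
    """Upper bound on the number of cells in one chunk of the given schema."""
    cells = 1
    for _name, low, high, chunk in parse_dims(spec):
        # A chunk cannot be longer than a bounded dimension itself
        extent = chunk if high is None else min(chunk, high - low + 1)
        cells *= extent
    return cells


def chunks_too_big(spec, limit):
    """True when the upper bound exceeds a caller-chosen limit."""
    return max_cells_per_chunk(spec) > limit


original = ("day=197701:*,100000,0,height=0:41,10000,0,"
            "xdim=0:539,10000,0,ydim=0:360,10000,0")
revised = ("day=197701:*,100,0,height=0:41,100,0,"
           "xdim=0:539,100,0,ydim=0:360,100,0")

# Illustrative limit only, not a SciDB recommendation
LIMIT = 10**9

for label, spec in (("original", original), ("revised", revised)):
    cells = max_cells_per_chunk(spec)
    verdict = "WARNING: very large chunks" if chunks_too_big(spec, LIMIT) else "ok"
    print(label, cells, verdict)
```

A pairwise version would need to know how each operator combines chunks from its inputs, which I do not, so this only covers the single-array case.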

What do you think?

George


#2

George,

Thank you for reporting this. And many other problems!
Is this 12.3 or the 12.7 prototype?


#3

It is my pleasure :smile: I really want this thing to work well!
This was with 12.3. I have installed 12.7 but not exercised it yet.

-George