Error running loadcsv.py with null values


#1

Hi - I am trying to do a parallel load, and I get an error that looks like this:

ERROR

Load failed.
UserException in file: src/query/ops/input/InputArray.cpp function: end line: 196
Error id: scidb::SCIDB_SE_IMPORT_ERROR::SCIDB_LE_FILE_IMPORT_FAILED
Error description: Import error. Import from file ‘myd_2013.dlf’ (instance 0) to array ‘myd_2013’ failed at line 2, column 134, offset 139, value=‘null’: Number of errors exceeds threshold.

I have also had trouble with nulls produced by the csv2scidb program. When I tried loading that way, I had to remove the word null from the .scidb file it produced before I could load it with the iquery load command; the file would not load as generated. I don’t see any switches on csv2scidb or on loadcsv.py to handle nulls. Since loadcsv.py actually runs csv2scidb, my guess is that the same thing is happening here. The directions for csv2scidb say that you can just leave columns blank in the CSV file, which I did, but csv2scidb then outputs the word null, which can’t be loaded.

Any suggestions?


#2

Did you set the attributes in the target array to be OK with null values?

For example:

CREATE ARRAY Load_Array
<
   attribute_name_1  : int64  NULL DEFAULT null
>
[ I ... ]

By default, SciDB array attributes don’t allow nulls (missing codes). You need to be explicit about allowing them.
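
If it helps, here is a minimal Python sketch for finding which columns in the CSV contain blank fields. It is not part of SciDB: `columns_with_blanks` is a hypothetical helper name, the sample data is made up, and it assumes the CSV has a header row. Since csv2scidb writes null for a blank column (as described in the first post), the columns it reports are the attributes that would need to allow nulls in the target array.

```python
import csv
import io


def columns_with_blanks(csv_text, has_header=True):
    """Return the names of columns that contain at least one empty field.

    Hypothetical helper for illustration; not a SciDB utility.
    Without a header row, columns are reported by their index as a string.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    if has_header:
        header, data = rows[0], rows[1:]
    else:
        header, data = [str(i) for i in range(len(rows[0]))], rows

    blank = set()
    for row in data:
        for i, field in enumerate(row):
            # Whitespace-only fields are treated as blank too.
            if i < len(header) and field.strip() == "":
                blank.add(i)
    return [header[i] for i in sorted(blank)]


# Made-up sample: 'temp' is blank in row 2, 'flag' is blank in row 1.
sample = "id,temp,flag\n1,20.5,\n2,,1\n3,19.0,0\n"
print(columns_with_blanks(sample))  # ['temp', 'flag']
```

For a real file, read its contents into a string first (for example with `open(path).read()`) and pass that in.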


#3

No, I had not; I overlooked that. Sorry for the obvious question. I went back and looked at the CREATE ARRAY statement documentation, which clearly says I need to specify whether nulls are allowed. Thanks for the response.