Each chunk being processed twice


#1

Hi,

I noticed that each chunk is processed twice.
I guessed it is because of SciDB’s multithreading (which would still be odd: why should the same chunk be operated on several times?).
Anyhow, I tried setting the following configuration variables to turn off multithreading.

result-prefetch-queue-size=1
result-prefetch-threads=1
operator-threads=1

But it is still happening.

Please help me out!

Thank you,
MJ


#2

Hi,

I noticed that it happens when I use “store”.

For example, iquery -aq "multiply(m,transpose(m))" runs “multiply” once, but when I run iquery -aq "store(multiply(m,transpose(m)),n)", “multiply” is performed twice. Am I doing something wrong?
Please help me out!

MJ


#3

Hi, MJ,

I think it has to do with the hidden empty tag attribute. If you have some array foo <val:double> [x,y], there is a hidden attribute, foo <val:double, empty_tag:indicator>, which actually stores the information about which cells are populated with values. So that would show up in multiply, because multiply has to return all chunks (val and empty_tag) to the caller. Without this attribute you cannot have sparse arrays.

To confirm that this is the case, print the AttributeID of the chunk in your debug statements.
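As a rough illustration of why a debug statement can fire twice per chunk position, here is a toy Python model. It is not SciDB code or SciDB's API: the function name chunk_visits, the chunk positions, and the assumption that the empty tag gets the next attribute ID after the visible attributes are all made up for the example.

```python
def chunk_visits(visible_attributes, chunk_positions, emptyable=True):
    """Toy model: list every (attribute_id, attribute_name, chunk_position)
    that an operator would have to produce for an array.

    Assumption for this sketch: the hidden empty tag is appended after
    the visible attributes and therefore gets the last attribute ID.
    """
    attributes = list(visible_attributes)
    if emptyable:
        attributes.append("empty_tag")  # hidden attribute, not declared by the user
    visits = []
    for position in chunk_positions:
        for attribute_id, name in enumerate(attributes):
            visits.append((attribute_id, name, position))
    return visits


# One visible attribute, two chunk positions (hypothetical coordinates).
visits = chunk_visits(["val"], [(0, 0), (0, 1)])
for attribute_id, name, position in visits:
    print(f"chunk at {position}: attribute id {attribute_id} ({name})")
```

In this model each chunk position appears twice, once per attribute ID, which is the pattern that printing the AttributeID would reveal if the empty tag is indeed the cause.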

In the current version there actually is a hidden “create not empty array foo <val:double> …” command, but that syntax is going away. The semantics there are broken.

Also, multiply() itself is no longer present in 13.12, for example. Consider trying gemm(). For sparse, consider an aggregate(apply(cross_join())) type query.


#4

In that case, is there any way I can use just one attribute? In my code, every array has just one attribute.

Thanks again!
MJ


#5

Hi,

Regarding multiplication of sparse matrices, you mentioned that it can be done using “aggregate(apply(cross_join()))”.

Could you give me more detail on how to do that?

Thank you!!


#6

Here you are:

$ iquery -aq "create array left_matrix <val:double> [r=1:2,2,0, c=1:3,3,0]"
Query was executed successfully

$ iquery -aq "store(build(left_matrix, r),left_matrix)"
{r,c} val
{1,1} 1
{1,2} 1
{1,3} 1
{2,1} 2
{2,2} 2
{2,3} 2

$ iquery -aq "create array right_matrix <val:double> [r=1:3,3,0, c=1:4,4,0]"
Query was executed successfully

$ iquery -aq "store(build(right_matrix, c),right_matrix)"
{r,c} val
{1,1} 1
{1,2} 2
{1,3} 3
{1,4} 4
{2,1} 1
{2,2} 2
{2,3} 3
{2,4} 4
{3,1} 1
{3,2} 2
{3,3} 3
{3,4} 4

$ iquery -aq "aggregate(apply(cross_join(left_matrix, right_matrix, left_matrix.c, right_matrix.r), m, left_matrix.val*right_matrix.val), sum(m) as val, left_matrix.r, right_matrix.c)"
{r,c} val
{1,1} 3
{1,2} 6
{1,3} 9
{1,4} 12
{2,1} 6
{2,2} 12
{2,3} 18
{2,4} 24

Operator cross_join is not yet as smart as it could be. Right now, it works by fully replicating the second argument on every instance. So, whenever possible, put the smaller array as the second argument to cross_join.
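For readers who want to see what the query above computes, here is a minimal Python sketch of the same join, multiply, and sum pattern on sparse data. It only mirrors the logic; it does not use SciDB, and the function name sparse_matmul and the dict-of-coordinates representation are assumptions chosen for the example.

```python
from collections import defaultdict


def sparse_matmul(left, right):
    """Multiply two sparse matrices stored as {(row, col): value} dicts.

    Mirrors aggregate(apply(cross_join(...))): join on left.c == right.r,
    compute the product m, then sum m grouped by (left.r, right.c).
    """
    # Index the right matrix by its row so the join is a simple lookup.
    right_by_row = defaultdict(list)
    for (r, c), value in right.items():
        right_by_row[r].append((c, value))

    result = defaultdict(float)
    for (r, c), left_value in left.items():
        # "cross_join" step: pair this cell with right cells where right.r == left.c
        for right_c, right_value in right_by_row.get(c, []):
            # "apply" step computes m; "aggregate" step sums it per (r, right_c)
            result[(r, right_c)] += left_value * right_value
    return dict(result)


# Same data as build(left_matrix, r) and build(right_matrix, c) above.
left_matrix = {(r, c): float(r) for r in range(1, 3) for c in range(1, 4)}
right_matrix = {(r, c): float(c) for r in range(1, 4) for c in range(1, 5)}

result = sparse_matmul(left_matrix, right_matrix)
for (r, c) in sorted(result):
    print(f"{{{r},{c}}} {result[(r, c)]:g}")
```

Only cells that are actually present take part in the join, which is why this formulation suits sparse arrays: empty cells cost nothing.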


#7

Thank you!

I have now tried A*transpose(A) for a 3000x3000 sparse array A (90 non-zero entries).

It killed SciDB (see the error below).

SystemException in file: src/network/BaseConnection.h function: receive line: 294
Error id: scidb::SCIDB_SE_NETWORK::SCIDB_LE_CANT_SEND_RECEIVE
Error description: Network error. Cannot send or receive network messages.

I was actually using SciDB 12.12. Is the error because of that, or is there some other problem?

Thank you again!
MJ


#8

Yes, it’s most likely because of 12.12. A lot has changed since then - you will not get good numbers there.
I recommend using 13.12.


#9

Hi,

I actually only have 12.12, with several custom operators. Do you think I can install the newer version separately and keep the old version working?

Thanks.


#10

We never tested 13.12 working together with 12.12.
You could try…

  1. Install them into different directories; that happens by default.
  2. Name the configurations differently - different config name, different PG user.
  3. Use base_port to make sure they listen on different ports. Make sure the base_port for 13.12 and the base_port for 12.12 are at least NUM_INSTANCES apart, as instances listen on base_port+1, base_port+2,…
  4. Put the storage directories in different locations.

Use iquery -p to specify the right port.
It should work, but I can’t say 100%.
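To sanity-check the base_port spacing from step 3 before starting both installations, a small Python sketch like the one below can help. It assumes, as stated above, that instances listen on base_port+1 through base_port+NUM_INSTANCES; the function names and the port numbers in the usage lines are hypothetical.

```python
def instance_ports(base_port, num_instances):
    """Ports used by one installation, assuming instances listen on
    base_port+1 .. base_port+num_instances (as described in step 3)."""
    return set(range(base_port + 1, base_port + num_instances + 1))


def overlapping_ports(base_a, num_a, base_b, num_b):
    """Return the sorted list of ports that both installations would claim."""
    return sorted(instance_ports(base_a, num_a) & instance_ports(base_b, num_b))


# Hypothetical values: 4 instances each, base ports 4 apart -> no clash.
print(overlapping_ports(1239, 4, 1243, 4))

# Base ports only 2 apart -> the two installations would collide.
print(overlapping_ports(1239, 4, 1241, 4))
```

An empty list means the two port ranges are disjoint under that assumption; any listed port needs a larger gap between the two base_port values.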


#11

Hi,

Thank you for your guidance.

Unfortunately, after I tried to install SciDB 13.12, the existing SciDB 12.12 no longer compiles.

It seems that the Boost library versions are conflicting, since the only thing I did was install Boost 1.54. (Even after installing 1.54, SciDB 13.12 still reports an error like “Boost 1.54 is not found”.)

Anyhow, the bigger problem here is that SciDB 12.12 no longer compiles (errors from the Boost library).
e.g., CMakeFiles/preparecdashreport.dir/src/cdashreportapp.cpp.o: In function `boost::program_options::basic_command_line_parser<char>::run()':
/usr/local/include/boost/program_options/detail/parsers.hpp:107: undefined reference to `boost::program_options::detail::cmdline::get_canonical_option_prefix()'
collect2: ld returned 1 exit status

Could you let me know which boost library is compatible with 12.12? I think I should install the boost library again.


#12

Hi,

I posted earlier that SciDB 12.12 no longer compiles after I installed the Boost 1.54 library and SciDB 13.12.

The errors look like this:

scidb-12.12.0/tests/harness/src/helper.cpp:366: undefined reference to `boost::filesystem::path::extension() const'
CMakeFiles/preparecdashreport.dir/src/helper.cpp.o: In function `is_regular':
/usr/local/include/boost/filesystem/operations.hpp:315: undefined reference to `boost::filesystem::detail::status(boost::filesystem::path const&, boost::system::error_code*)'

Please help me out!!
MJ


#13

OK, have you tried this?
cd 12_12/deployment
./deploy.sh prepare_toolchain localhost

All the necessary things should be in those scripts.