SVD in SciDB-R


#1

Hello,

I tried ‘svd’ with the following example from the documentation:

install_github("SciDBR","Paradigm4")
library("scidb")

x <- as.scidb(matrix(rnorm(500*500),500))
iquery('svd(x)',return=TRUE)

But it does not work:

Error in scidbquery(query, afl, async = FALSE, save = "&save=lcsv+", release = 0) :
SystemException in file: src/query/OperatorLibrary.cpp function: createLogicalOperator line: 85
Error id: scidb::SCIDB_SE_QPROC::SCIDB_LE_LOGICAL_OP_DOESNT_EXIST
Error description: Query processor error. Logical operator 'svd' does not exist.
Failed query id: 1101730719345

Do I need to load a library? If so, how can I load it?

Thanks,
Meng


#2

Hi Meng,

You need to have the dense_linear_algebra library loaded.
Your syntax also wasn’t quite right. The SciDB operator is called “gesvd”, and it accepts the input array and one argument: ‘left’, ‘center’, or ‘right’.
The easiest way to do it is to use the R svd function, i.e.:

> x <- as.scidb(matrix(rnorm(500*500),500))
> svd(x)
$u
A reference to a  500x500 SciDB array

$d
Reference to a SciDB vector of length 500 

$v
A reference to a  500x500 SciDB array
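
For readers unsure what the three components are, here is a minimal base-R sketch on an ordinary in-memory matrix. It does not touch SciDB at all; the 5x5 size and the variable names are chosen only for illustration. With a SciDB array, the same `svd()` call returns references to arrays (as printed above) rather than the values themselves.

```r
# Base-R sketch (no SciDB connection): what u, d and v mean.
set.seed(1)
x <- matrix(rnorm(5 * 5), 5)   # small stand-in for the 500x500 example
s <- svd(x)                    # list with components u, d, v

# The factors reproduce x: x = u %*% diag(d) %*% t(v)
x_rebuilt <- s$u %*% diag(s$d) %*% t(s$v)
max_abs_err <- max(abs(x - x_rebuilt))   # should be at rounding-error level
```

The singular values in `s$d` come back sorted in decreasing order, which is why truncating to the first few components is a common way to approximate the matrix.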

Paradigm4 also has a proprietary “tsvd” extension, which is a good algorithm for truncated SVD of sparse matrices.


#3

Hi Alex,

Thanks for your reply. I would still like to run the SVD in SciDB itself. When I do it in R, it seems the result is cut off at 1000 records.

So how can I load the dense linear algebra library? I tried iquery("load_library('dense_linear_algebra')"), but that does not seem to be the right way to go.

Thanks,
Meng


#4

install_github("SciDBR","Paradigm4")
library("scidb")

x <- as.scidb(matrix(rnorm(500*500),500))

Don’t do this: iquery('svd(x)',return=TRUE)

Instead, just use:

s <- svd(x)

See also the help on the svd function in the R package, for example:

help("svd", "scidb")


#5

Thanks!

I understand now and can run svd() in R. However, when I then tried tsvd, it did not seem to work.
I also managed to load the dense linear algebra library, but I only found gesvd, which handles dense arrays. It looks to me as though the array coordinates have to start at zero. Is there a function that can handle sparse arrays?
Is there a way to shift or redefine the array coordinates once the data have been loaded into the template array? I am currently using ‘subarray’, which might not always be practical.

However, when I tried gesvd, I got the following error:
iquery("gesvd(new1,'U')")
Error in scidbquery(query, afl) :
UserException in file: src/linear_algebra/scalapackUtil/ScaLAPACKLogical.cpp function: checkScaLAPACKInputs line: 110
Error id: DLA::SCIDB_SE_INFER_SCHEMA::DLA_ERROR41
Error description: Error during schema inferring. ChunkInterval is too small.
Failed query id: 1103046567470

My array looks like this:

iquery('dimensions(new1)',return=TRUE)
  No    name start length chunk_interval chunk_overlap low high  type
1  0     Gn1     0     82             82             0   0   81 int64
2  1 timeint     0     30             30             0   0   29 int64

I look forward to getting some answers.
In addition, svd() in R works, but I find it takes much longer than PCA (e.g. prcomp()) run in R on a plain R array.

Regards
Meng
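
As background to the svd() versus prcomp() comparison, the following base-R sketch shows how the two are related on an ordinary matrix: PCA as computed by `prcomp()` corresponds to an SVD of the column-centered data. The 20x4 matrix is made up for illustration, and nothing here measures or explains the SciDB timings.

```r
# Base-R sketch (no SciDB): prcomp() results recovered from an SVD
# of the column-centered matrix.
set.seed(1)
x <- matrix(rnorm(20 * 4), 20, 4)   # hypothetical data: 20 rows, 4 columns
n <- nrow(x)

p <- prcomp(x)                      # centers columns by default, no scaling
xc <- scale(x, center = TRUE, scale = FALSE)
s <- svd(xc)

# Standard deviations of the principal components from the singular values
sdev_from_svd <- s$d / sqrt(n - 1)
# Loadings match the right singular vectors up to sign
loadings_match <- all.equal(abs(unname(p$rotation)), abs(s$v))
```

So a like-for-like comparison would apply svd() to the centered matrix; the two calls otherwise factor different inputs.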


#6

Hi, it seems that sweep() also doesn’t work?


#7

The sweep function is supported.

Here is a tiny example that centers a random 5x5 SciDB matrix:

library("scidb")
scidbconnect()
A = build('random()', c(5,5), eval=TRUE)
B = sweep(A, 2, apply(A, 2, mean))
apply(B, 2, mean)[]
[1] 2.384186e-08 0.000000e+00 0.000000e+00 -4.768372e-08 4.768372e-08
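
The same centering can be tried on an ordinary R matrix, which may help when checking what sweep() is expected to return. This is a base-R sketch with made-up data and no SciDB connection; the SciDB example above prints values near 1e-08 rather than exact zeros, which is normal rounding error.

```r
# Base-R sketch (no SciDB): center the columns of a 5x5 matrix with sweep().
set.seed(1)
A <- matrix(runif(25), 5, 5)          # stand-in for the random SciDB matrix
col_means <- apply(A, 2, mean)        # one mean per column (MARGIN = 2)

# sweep() subtracts by default (FUN = "-"): each column loses its own mean
B <- sweep(A, 2, col_means)
centered_means <- apply(B, 2, mean)   # all approximately zero
```

Note that the third argument must be the vector of statistics (here `col_means`), not a function; passing a function where a vector is expected is one way to get a "cannot coerce type 'closure'" error in R.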


#8

Thanks Bryan,

However, I got the following error…

A = build('random()', c(5,5), eval=TRUE)
A
A reference to a 5x5 SciDB array
B = sweep(A, 2, apply(A, 2, mean))

Error in is.scidb(STATS) :
error in evaluating the argument ‘x’ in selecting a method for function ‘is.scidb’: Error in as.vector(x, “character”) :
cannot coerce type ‘closure’ to vector of type ‘character’

Could you help me to have a look?

Thanks
Meng


#9

Dear Meng,

Your code runs fine for me. Perhaps you don’t have the most recent version of the scidb package for R? You can always install that with (from R):

library("devtools")
install_github("SciDBR", "Paradigm4", quick=TRUE)


#10

Hello,

I have run into another problem:
I can no longer run svd, gemm, or gesvd (they worked earlier).

No matter which of these I run:
svd(A5)
iquery('store(gemm(gemm(A5,B5, build(A5,0)),C5, build(A5,0)),product5)')
iquery("gesvd(product5,'U')")

I get the same error:

Error in scidbquery(query, afl) :
SystemException in file: src/mpi/MPISlaveProxy.cpp function: checkLauncher line: 59
Error id: scidb::SCIDB_SE_INTERNAL::SCIDB_LE_OPERATION_FAILED
Error description: Internal SciDB error. Operation 'MPI launcher process already terminated' failed.

Also, I am trying to do matrix inversion in SciDB. Are there better ways than the one listed in the documentation (using gemm and gesvd)?

Thanks a lot,
Meng
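
For reference, the idea behind inverting a matrix from its SVD can be sketched in base R on a small in-memory matrix. This is only an illustration of the math: the 3x3 matrix is made up, `%*%` stands in for the matrix multiplications, and how the steps map onto gemm and gesvd calls in SciDB is an assumption that is not tested here.

```r
# Base-R sketch (no SciDB): invert a square matrix from its SVD.
# If A = U diag(d) V^T with all d > 0, then A^-1 = V diag(1/d) U^T.
A <- matrix(c(4, 1, 0,
              1, 3, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)   # hypothetical, nonsingular

s <- svd(A)
A_inv <- s$v %*% diag(1 / s$d) %*% t(s$u)

# Check against the identity and against R's own solver
residual <- max(abs(A %*% A_inv - diag(3)))
same_as_solve <- all.equal(A_inv, solve(A))
```

This only works when every singular value is comfortably above zero; for nearly singular matrices, the small values in `s$d` are usually dropped, which gives a pseudo-inverse instead of a true inverse.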