Performance Tuning


#1

Okay, I’ve got two installations of SciDB 16.9, each running on an EC2 m3.large instance. The first I installed fresh by following the SciDB 16.9 CE Installation Guide, without deviating from any of its steps. The second is an AMI that I obtained from the EC2 community.

I ran a single query on both machines:
store(build(<val:double>[f=0:9999,2000,0,d=0:127,128,0],double(random()%100)/100),db)

On the machine with the fresh installation, execution took 7.688 seconds. On the AMI machine, however, it took only 1.513 seconds!

I looked at the log files for both queries. On the first machine, the query seems to have spent almost the whole seven seconds between these two entries:
2017-02-26 08:27:32,574 [0x7f4fd0be9700] [DEBUG]: DBArray::DBArray ID=304, UAID=303, ps=1, desc=public.ACCOUNT_152@1<val:double> [f=0:9999 (4611686018427387903:-4611686018427387903):0:2000; d=0:127 (4611686018427387903:-4611686018427387903):0:128] ArrayId: 304 UnversionedArrayId: 303 Version: 1 Flags: 0 Distro: dist: hash ps: 1 ctx: redun: 0 off: {} shift: 0 res: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] <val:double,EmptyTag:indicator NOT NULL>
2017-02-26 08:27:39,349 [0x7f4fd0cea700] [DEBUG]: handleReplicaChunk: received eof
As you can see, the timestamp jumps from :32 to :39.

I compared this with the log on the AMI machine. The same DEBUG line was there, but with no delay after it:
2017-02-27 14:02:33,055 [0x7fb47b8b1700] [DEBUG]: DBArray::DBArray ID=20, UAID=19, ps=1, desc=public.ACCOUNT_10@1<val:double> [f=0:9999 (4611686018427387903:-4611686018427387903):0:2000; d=0:127 (4611686018427387903:-4611686018427387903):0:128] ArrayId: 20 UnversionedArrayId: 19 Version: 1 Flags: 0 Distro: dist: hash ps: 1 ctx: redun: 0 off: {} shift: 0 res: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] <val:double,EmptyTag:indicator NOT NULL>
2017-02-27 14:02:33,067 [0x7fb47bbb4700] [DEBUG]: handleReplicaChunk: received eof
It took twelve milliseconds to jump to the replica handling.
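
For anyone who wants to measure such gaps rather than eyeball them, here is a minimal Python sketch. It assumes every log entry starts with a timestamp in the `YYYY-MM-DD HH:MM:SS,mmm` form seen above; the function name `gap_seconds` is made up for illustration and is not part of SciDB.

```python
from datetime import datetime

# Timestamp layout assumed from the log entries quoted above.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # len("2017-02-26 08:27:32,574")


def parse_ts(line):
    """Parse the leading timestamp of one log entry."""
    return datetime.strptime(line[:TS_LENGTH], LOG_TS_FORMAT)


def gap_seconds(first_line, second_line):
    """Return the elapsed time in seconds between two log entries."""
    return (parse_ts(second_line) - parse_ts(first_line)).total_seconds()


# Entries shortened to their prefixes; only the timestamps matter here.
fresh = gap_seconds(
    "2017-02-26 08:27:32,574 [0x7f4fd0be9700] [DEBUG]: DBArray::DBArray ID=304",
    "2017-02-26 08:27:39,349 [0x7f4fd0cea700] [DEBUG]: handleReplicaChunk: received eof",
)
ami = gap_seconds(
    "2017-02-27 14:02:33,055 [0x7fb47b8b1700] [DEBUG]: DBArray::DBArray ID=20",
    "2017-02-27 14:02:33,067 [0x7fb47bbb4700] [DEBUG]: handleReplicaChunk: received eof",
)
print(f"fresh install: {fresh:.3f} s, AMI: {ami:.3f} s")
```

On the entries quoted above this gives a gap of 6.775 s for the fresh installation against 0.012 s for the AMI.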

I find this strange, because on both machines I’m using the exact same config file! So performance should be the same on both, but it’s not.

Does the fresh installation of 16.9 lack any features that boost performance?


#2

Hello.

Interesting result. I can think of a few things that may account for it.

  1. Check the build type for whether or not it’s optimized. The easiest way to check is with --version:
$ scidb --version
SciDB Version: 16.9.0
Build Type: RelWithDebInfo
Commit: db1a98f
Copyright (C) 2008-2015 SciDB, Inc

RelWithDebInfo (“rel” meaning “release”) is optimized. Assert or Debug are not optimized. You can specify a build type when you run run.py setup and it is also displayed as part of the output of that command. I’d wager most likely your custom installation is not a “rel” build and that accounts for it.

  2. Check disk speeds since this is a write query. In our benchmarking we found the write and read performance of EC2 drives to be quite interesting. Depending on which option you choose, you sometimes get a “burst” of IO credits - so your IO goes fast but only for a few operations, and then slows down. This makes EC2 benchmarking a very interesting exercise sometimes.

More info here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html
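
A rough way to look at the disk side is to time a series of synced writes and watch whether throughput drops after the first few rounds. The Python sketch below is only an illustration under that assumption: `timed_writes` is a hypothetical helper, the block size and round count are arbitrary, and a real burst-credit effect on EBS may take far more data to show up than this writes.

```python
import os
import tempfile
import time


def timed_writes(directory, block_mb=8, rounds=5):
    """Write `rounds` blocks of `block_mb` MiB into a temp file in
    `directory`, syncing after each one, and return MiB/s per round."""
    block = os.urandom(block_mb * 1024 * 1024)
    rates = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(rounds):
                start = time.perf_counter()
                f.write(block)
                f.flush()
                os.fsync(f.fileno())  # force the data out to the device
                elapsed = max(time.perf_counter() - start, 1e-9)
                rates.append(block_mb / elapsed)
    finally:
        os.remove(path)  # do not leave the scratch file behind
    return rates


if __name__ == "__main__":
    # Point this at the volume that holds the SciDB data directory.
    for i, rate in enumerate(timed_writes(tempfile.gettempdir()), start=1):
        print(f"round {i}: {rate:.1f} MiB/s")
```

Running it on both machines against the same kind of volume gives a quick comparison, though it is no substitute for a proper IO benchmark.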

I’d bet it’s more likely (1). Let us know if this makes sense.
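
If you want to script the build-type check, something like the following Python sketch would do. It parses text in the shape of the `--version` output shown above; the helper names are invented for illustration, and treating any build type that starts with “Rel” as optimized is an assumption based on the description above, not a rule documented by SciDB.

```python
import re

# Sample text copied from the --version output quoted above.
SAMPLE_VERSION_OUTPUT = """\
SciDB Version: 16.9.0
Build Type: RelWithDebInfo
Commit: db1a98f
Copyright (C) 2008-2015 SciDB, Inc
"""


def build_type(version_output):
    """Return the value of the 'Build Type:' line, or None if absent."""
    match = re.search(r"^Build Type:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1) if match else None


def is_optimized(version_output):
    """Assumption: build types starting with 'Rel' are optimized;
    anything else (for example Assert or Debug) is not."""
    found = build_type(version_output)
    return found is not None and found.lower().startswith("rel")


print(build_type(SAMPLE_VERSION_OUTPUT), is_optimized(SAMPLE_VERSION_OUTPUT))
```

To run it against a live installation, pass in the captured output of `scidb --version`, for example via `subprocess.run(["scidb", "--version"], capture_output=True, text=True).stdout`.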


#3

You’re absolutely correct. The difference was in the build type.

The debug build is extremely slow at computation compared to the release build. I will finish my work on the AMI image rather than on the fresh installation.
Thank you very much for clarifying things.