EC2 vCPU-s vs. real cores



[Not really a SciDB usage topic, but related enough that I am posting it here.]

EC2 describes its machine types in terms of vCPU-s (virtual CPU-s). The hourly rate depends on, among other things, the number of vCPU-s on a machine.

For example,

  • m4.4xlarge with 16 vCPU-s is billed at $0.958 per hour
  • m4.10xlarge with 40 vCPU-s is billed at $2.394 per hour
    [Source]

Now, does one vCPU equate to a full core, or to half a core? I have been doing some digging around.

First, here is a benchmark that runs gzip on EC2 instances:
https://www.pythian.com/blog/virtual-cpus-with-amazon-web-services/
The blogger’s tests showed that a vCPU equals half a core.

I ran the tests suggested in the above link on an m4.10xlarge EC2 machine. The throughput figures below are per gzip thread, as reported by dd.

Run 1 gzip thread on Processor 0:         136 MB/s
Run 2 gzip threads on Processor 0:         68 MB/s
Run 2 gzip threads on Processor 0, 1:     136 MB/s
Run 2 gzip threads on Processor 0, 20:     86 MB/s

The only difference from the blogger’s result is that they reported consecutive processors as hyper-threaded siblings (e.g. {0,1} and {2,3} on a 4 vCPU machine). My tests suggest that on this 2*N vCPU machine, processors i and (N+i) belong to the same core: the pairs {0,20} and {1,21} both drop to about 86 MB/s per thread, while {0,1} stays at 136 MB/s.
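The pairing can also be cross-checked without a benchmark, since Linux exposes the sibling threads of each logical CPU under sysfs. The following is a minimal sketch, assuming the guest kernel populates /sys/devices/system/cpu/cpuN/topology/thread_siblings_list; the function name thread_siblings and its optional root-directory argument are made up for this example.

```shell
#!/usr/bin/env bash
# Print each logical CPU next to the logical CPUs that share its physical core.
# The optional argument (an alternative sysfs root) exists only for this sketch.
thread_siblings() {
  local root=${1:-/sys/devices/system/cpu} dir f
  for dir in "$root"/cpu[0-9]*; do
    f=$dir/topology/thread_siblings_list
    [ -r "$f" ] || continue          # skip CPUs without topology information
    printf '%s: %s\n' "${dir##*/}" "$(cat "$f")"
  done
}

thread_siblings
```

If the i and (N+i) pairing holds, cpu0 should list 0,20 on a 40 vCPU machine. `lscpu -e` shows the same mapping in its CORE column.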

This suggests that the 40 vCPU-s reported by Amazon actually equal 20 real cores.

Yes, hyper-threading does bring a benefit (86 MB/s per thread instead of 68 MB/s), but it is definitely less than the 136 MB/s per thread obtained on two separate cores.
(Note: this is an ongoing investigation, so comments are welcome.)
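To put the hyper-threading benefit in aggregate terms, the per-thread figures can be multiplied by the thread count and compared with the single-thread rate. This is only a small arithmetic sketch using the rounded numbers from the summary above; the helper name speedup is made up for this example.

```shell
#!/usr/bin/env bash
# speedup BASE PER_THREAD THREADS
# Aggregate throughput of THREADS threads, relative to one thread running alone.
speedup() {
  awk -v base="$1" -v per="$2" -v n="$3" \
    'BEGIN { printf "%d MB/s aggregate, %.2fx\n", per * n, per * n / base }'
}

speedup 136 68 2     # two threads on vCPU 0          -> 136 MB/s aggregate, 1.00x
speedup 136 86 2     # two threads on vCPUs 0 and 20  -> 172 MB/s aggregate, 1.26x
speedup 136 136 2    # two threads on vCPUs 0 and 1   -> 272 MB/s aggregate, 2.00x
```

So for this gzip workload the second hyper-thread adds roughly a quarter of a core, not a full one.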

I also found the following on an AWS page:

Some independent software vendor (ISV) licensing is based on the number of virtual cores an instance provides. To assist you with virtual core licensing calculations for ISV software, the following tables shows the virtual cores provided by Amazon EC2 Instances and Amazon RDS DB Instances.

EC2 Instance Type    Virtual Core Count
m4.10xlarge          20
https://aws.amazon.com/ec2/virtualcores/

Here is another relevant article describing the problem:
https://blogs.oracle.com/partnertech/en/entry/cpu_utilization_of_multi_threaded

Here is the raw shell dump of my test:

ubuntu@ip-172-31-50-2:~$ taskset -pc 0 $$
pid 1920's current affinity list: 0,20
pid 1920's new affinity list: 0
ubuntu@ip-172-31-50-2:~$ dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
2170552320 bytes (2.2 GB) copied, 15.9747 s, 136 MB/s
ubuntu@ip-172-31-50-2:~$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
[1] 2205
[2] 2207
ubuntu@ip-172-31-50-2:~$ 2170552320 bytes (2.2 GB) copied, 31.9454 s, 67.9 MB/s
2170552320 bytes (2.2 GB) copied, 31.9787 s, 67.9 MB/s
[1]-  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
[2]+  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
ubuntu@ip-172-31-50-2:~$ taskset -pc 0,1 $$
pid 1920's current affinity list: 0
pid 1920's new affinity list: 0,1
ubuntu@ip-172-31-50-2:~$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
[1] 2214
[2] 2216
ubuntu@ip-172-31-50-2:~$ 2170552320 bytes (2.2 GB) copied, 15.9378 s, 136 MB/s
2170552320 bytes (2.2 GB) copied, 15.9469 s, 136 MB/s

[1]-  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
[2]+  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
ubuntu@ip-172-31-50-2:~$ taskset -pc 0,20 $$
pid 1920's current affinity list: 0,1
pid 1920's new affinity list: 0,20
ubuntu@ip-172-31-50-2:~$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
[1] 2234
[2] 2236
ubuntu@ip-172-31-50-2:~$ 2170552320 bytes (2.2 GB) copied, 25.2956 s, 85.8 MB/s
2170552320 bytes (2.2 GB) copied, 25.2981 s, 85.8 MB/s

[1]-  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
[2]+  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
ubuntu@ip-172-31-50-2:~$ taskset -pc 1,21 $$
pid 1920's current affinity list: 0,20
pid 1920's new affinity list: 1,21
ubuntu@ip-172-31-50-2:~$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
[1] 2243
[2] 2245
ubuntu@ip-172-31-50-2:~$ 2170552320 bytes (2.2 GB) copied, 25.2869 s, 85.8 MB/s
2170552320 bytes (2.2 GB) copied, 25.2887 s, 85.8 MB/s

[1]-  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
[2]+  Done                    dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
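
For anyone who wants to repeat this on another instance type, the session above can be condensed into a small helper. This is a sketch, not the exact procedure I ran: it pins each pipeline with `taskset -c LIST command` instead of changing the affinity of the login shell, and the function name gzip_bench and its optional size argument are made up for this example.

```shell
#!/usr/bin/env bash
# gzip_bench CPUS THREADS [MIB]
# Start THREADS concurrent dd | gzip pipelines restricted to the logical CPUs
# in CPUS (a taskset list such as "0" or "0,20"), then wait for all of them.
# MIB is the amount of zero data per thread; it defaults to 2070 as above.
gzip_bench() {
  local cpus=$1 threads=$2 mib=${3:-2070} i
  for ((i = 0; i < threads; i++)); do
    # The inner shell and everything it starts inherit the affinity set by taskset.
    taskset -c "$cpus" bash -c \
      'dd if=/dev/zero bs=1M count="$1" 2> >(grep bytes >&2) | gzip -c > /dev/null' \
      _ "$mib" &
  done
  wait
}

# Example calls (each takes a while at the default size):
#   gzip_bench 0 1       # one thread on vCPU 0
#   gzip_bench 0 2       # two threads sharing vCPU 0
#   gzip_bench 0,1 2     # two threads on vCPUs 0 and 1
#   gzip_bench 0,20 2    # two threads on vCPUs 0 and 20
```

Each thread prints its own dd summary line, so the per-thread MB/s can be read off directly as in the dump.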