Benchmarking another server
After I published my previous blog entries about the dual Xeon
machine, I was offered the chance to run similar benchmarks on a
server with four 3 GHz Xeons (thanks to the City of Largo and Dave
Richards). Every CPU is running (or at least the Linux kernel thinks
so) at 2.4 GHz; we still do not know why it is not 3 GHz. The Xeons
are of an older generation though, based on the Pentium IV
architecture rather than the new Core architecture (at least that is
what I can read from the Linux kernel). The system is equipped with
16 GB of RAM.
The system runs openSUSE 10.2, so I was able to spot several issues
that people using this (very good, BTW) version will see. I fixed them
all in my build system (I'll fix them properly in the following days)
and I was finally able to build without manual intervention (this time
both DEBs and RPMs).
The first two builds were done to get an overview of the speed and to
get motivated to bring the total time down ;-)) Both were done with a
cold ccache.
The system was preconfigured with hyperthreading turned on, so the
first idea was to measure the effect of this BIOS option on the build
time. The builds used only one process, and the result was terrible
8) The full build took 8 hours and 1 minute. With hyperthreading off,
it took only 7 hours and 9 minutes. So turning hyperthreading off
actually made the build faster: we saved 52 minutes! But a
single-process build is probably the worst possible scenario for HT
(for more info and benchmarks about HT, see IBM developerWorks'
excellent article "Hyper-Threading speeds Linux").
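To put the hyperthreading difference in relative terms, here is a
small sketch that only redoes the arithmetic on the two build times
quoted above. The variable names are mine and nothing here is
measured; it just shows that 52 minutes is roughly a tenth of the
HT-on build time.

```python
from datetime import timedelta

# Build times quoted in the post (single process, cold ccache).
ht_on = timedelta(hours=8, minutes=1)
ht_off = timedelta(hours=7, minutes=9)

# Absolute saving in minutes.
saved = ht_on - ht_off
saved_minutes = saved.total_seconds() / 60

# Dividing one timedelta by another yields a plain float ratio.
relative = saved / ht_on

print(f"saved {saved_minutes:.0f} minutes ({relative:.1%} of the HT-on time)")
# prints: saved 52 minutes (10.8% of the HT-on time)
```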
The next set of build numbers was generated with hyperthreading turned
off, with ccache already populated by the previous builds, and using
the -P argument to build (not dmake). As I read today, this is not
module parallelism but directory parallelism.
I ran 16 builds and measured the time of each one (done with -P1,
-P2, ..., -P16). The result is in the following graph:
There are several important points about the graph. The best
performance in this configuration (four dual processors) was achieved
with -P13 or -P14. Adding more processes doesn't make sense.
The fastest build took 33 minutes (both DEB and RPM installation sets
and the en-US language pack in both formats).
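The -P1 to -P16 sweep described above can be sketched as a small
timing harness. This is not the script I used, only a minimal
illustration under some assumptions: time_runs, fastest and
make_command are hypothetical names, and the command in the demo is a
placeholder that does nothing. For a real run, make_command(n) would
have to return your actual build invocation with -P<n> appended.

```python
import subprocess
import sys
import time


def time_runs(make_command, levels):
    """Run make_command(n) once for each n in levels; return {n: seconds}."""
    timings = {}
    for n in levels:
        start = time.perf_counter()
        # check=True aborts the sweep if a build fails, so a broken
        # build is never recorded as a "fast" one.
        subprocess.run(make_command(n), check=True)
        timings[n] = time.perf_counter() - start
    return timings


def fastest(timings):
    """Return the parallelism level with the shortest wall-clock time."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    # Placeholder command (an assumption): a Python process that exits
    # immediately. Replace it with the real build command plus -P<n>.
    demo = time_runs(lambda n: [sys.executable, "-c", "pass"], range(1, 4))
    for n, seconds in sorted(demo.items()):
        print(f"-P{n}: {seconds:.3f} s")
    print(f"fastest: -P{fastest(demo)}")
```

Like the numbers in this post, a single run per level is noisy; if two
neighbouring levels come out close (as -P13 and -P14 did here), the
harness cannot tell you which one is really better.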
For the fun of it, I measured the same statistics on my regular build
server, a single Pentium IV 2.8 GHz machine:
So even on a single-processor machine, it does make sense to run
several directory builds in parallel.
I'll collect the same statistics from these two machines, but using
the other parallel build method, in-directory parallelism (dmake -P),
and will blog about it tomorrow.
(And as always: these are not accurate statistics; I'm not filtering
the data, I'm not doing repeated measurements, etc.)