
Effect of Optimizer in gcc on Intel/AMD and Power8

What effect can the gcc optimizer have?

On Intel/AMD I ran my intpoly program (with -n0) twice, once without and once with the optimizer. The optimized build was about 3.5 times faster.

  1. no optimizer: 7.84s
  2. -O3: 2.25s

gcc for Intel/AMD is version 4.8.2.

On Power8 I again ran intpoly (with -n0). Here the speed-up is more than a factor of 8 (eight).

  1. no optimizer: 28.58s
  2. -O3: 3.31s

gcc for Power8 is also 4.8.2.
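
The speed-up factors follow directly from the four timings above. As a minimal sketch of the arithmetic (the dictionary layout and the `speedup` helper are my own illustration, not part of intpoly):

```python
# Run times in seconds, copied from the measurements above.
timings = {
    "Intel/AMD": {"no_opt": 7.84, "O3": 2.25},
    "Power8": {"no_opt": 28.58, "O3": 3.31},
}


def speedup(unoptimized_s, optimized_s):
    """Return how many times faster the optimized run is."""
    return unoptimized_s / optimized_s


# Speed-up factor per platform: unoptimized time divided by -O3 time.
factors = {cpu: speedup(t["no_opt"], t["O3"]) for cpu, t in timings.items()}

for cpu, factor in factors.items():
    print(f"{cpu}: {factor:.2f}x")
# Intel/AMD: 3.48x
# Power8: 8.63x
```

Without the optimizer Power8 is far slower than Intel/AMD (28.58s vs. 7.84s), but with -O3 the gap shrinks to 3.31s vs. 2.25s, which is why the factor on Power8 is so much larger.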

The effect is less pronounced for floating point: there the optimizer gave only a factor of 3 on Power8 and a factor of 2 on Intel/AMD. So the effect of the optimizer depends both on whether the code is integer or floating-point and on the CPU architecture.

For my Power8 tests I used the free test drive on RunAbove, which I learned about from the Phoronix article "RunAbove: A POWER8 Compute Cloud With Offerings Up To 176 Threads".

Interestingly enough, intpoly on Power8 showed the same effect regarding multiple cores as described in CPU Usage Time Is Dependant on Load.

Update 19-Jun-2016: RunAbove no longer offers Power8 servers; their offer is now closed (dead link: https://www.runabove.com/ibm-power-8.xml).