What effect can the optimizer have for
On Intel/AMD I ran my intpoly program (with -n0) once with and once without the optimizer. The optimizer gave a speed-up of about a factor of 3.
- no optimizer: 7.84s
gcc for Intel/AMD is version 4.8.2.
- no optimizer: 28.58s
gcc for Power8 is also 4.8.2.
The effect is less pronounced for floating point: there the optimizer gave only a factor of 3 on Power8 and a factor of 2 on Intel/AMD. So the effect of the optimizer depends both on whether the code is integer or floating-point and on the CPU architecture.
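To reproduce this kind of comparison on your own machine, a small micro-benchmark along the following lines may help. It is only a sketch and not the intpoly program itself: the file name bench.c, the function names horner_u64 and horner_f64, the BENCH_MAIN macro, the coefficients, and the repetition count are all made up for illustration. It times the same polynomial evaluation (Horner's rule) once in integer and once in floating-point arithmetic.

```c
/* bench.c: hypothetical micro-benchmark, NOT the intpoly program. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Horner's rule in unsigned 64-bit arithmetic; coef[i] belongs to x^i.
   Overflow wraps modulo 2^64, which is well defined for unsigned types. */
uint64_t horner_u64(const uint64_t *coef, int n, uint64_t x)
{
    uint64_t r = 0;
    for (int i = n - 1; i >= 0; --i)
        r = r * x + coef[i];
    return r;
}

/* The same evaluation in double precision. */
double horner_f64(const double *coef, int n, double x)
{
    double r = 0.0;
    for (int i = n - 1; i >= 0; --i)
        r = r * x + coef[i];
    return r;
}

#ifdef BENCH_MAIN
int main(void)
{
    const uint64_t ci[] = {1, 2, 3, 4, 5, 6, 7, 8};
    const double cd[] = {1, 2, 3, 4, 5, 6, 7, 8};
    const long reps = 50000000L;   /* arbitrary; adjust to your machine */
    uint64_t si = 0;
    double sd = 0.0;

    clock_t t0 = clock();
    for (long k = 0; k < reps; ++k)
        si += horner_u64(ci, 8, (uint64_t)k);
    clock_t t1 = clock();
    for (long k = 0; k < reps; ++k)
        sd += horner_f64(cd, 8, (double)k * 1e-9);
    clock_t t2 = clock();

    /* Print the sums so the optimizer cannot discard the loops entirely. */
    printf("integer: %.2fs (sum %llu)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC, (unsigned long long)si);
    printf("double:  %.2fs (sum %g)\n",
           (double)(t2 - t1) / CLOCKS_PER_SEC, sd);
    return 0;
}
#endif
```

Building it twice, for example with `gcc -std=c99 -O0 -DBENCH_MAIN bench.c -o bench_O0` and `gcc -std=c99 -O3 -DBENCH_MAIN bench.c -o bench_O3`, and running both binaries gives the two timings to compare. The choice of -O3 is an assumption here; the measurements above do not depend on this sketch, and a toy loop like this may react to the optimizer quite differently from a real program.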
For my Power8 tests I used the free test drive on RunAbove, which I learned about from the Phoronix article "RunAbove: A POWER8 Compute Cloud With Offerings Up To 176 Threads".
intpoly on Power8 showed the same effect regarding multiple cores as described in "CPU Usage Time Is Dependant on Load".
Update 19-Jun-2016: RunAbove no longer offers Power8 servers; that offer is now closed.