AMD Bulldozer CPU Architecture Overview

Architecture per core:
[Image: Bulldozer1]

At the chip level, 8 cores per CPU:
[Image: BulldozerChipLevel]

Registers per core:
[Image: BulldozerRegisters]

Unfortunately, I could not find clock cycle counts per opcode on AMD’s website, which seems to be almost unmaintained and has lots of dead links. The clock cycles below were measured experimentally by Agner Fog; I took the values listed for Bulldozer in his Instruction Tables.

Opcode          Clock cycles
MOV             1
ADD, SUB        1
AND, OR, XOR    1
CMP             1
MUL             4
DIV             16
FABS            2
FADD, FSUB      5-6
FLD             2
FMUL            5-6
FDIV            10-42
FSQRT           10-52
FSIN            65-210
FCOS            ~160
FSINCOS         95-160
FPTAN           95-245
FPATAN          60-440

Due to out-of-order execution, timings may vary in actual programs.
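
To get a feeling for what these numbers mean, here is a rough sketch in Python that adds up the table values for a short instruction sequence. It assumes that every instruction has to wait for the previous one, so the costs simply add up, which is exactly the case that out-of-order execution can improve on. The names CYCLES and estimate_cycles are my own for this illustration, and the single value ~160 for FCOS is treated as both lower and upper bound.

```python
# Clock cycles per opcode, copied from the table above as (min, max) pairs.
CYCLES = {
    "MOV": (1, 1),
    "ADD": (1, 1), "SUB": (1, 1),
    "AND": (1, 1), "OR": (1, 1), "XOR": (1, 1),
    "CMP": (1, 1),
    "MUL": (4, 4),
    "DIV": (16, 16),
    "FABS": (2, 2),
    "FADD": (5, 6), "FSUB": (5, 6),
    "FLD": (2, 2),
    "FMUL": (5, 6),
    "FDIV": (10, 42),
    "FSQRT": (10, 52),
    "FSIN": (65, 210),
    "FCOS": (160, 160),   # the table only gives an approximate value (~160)
    "FSINCOS": (95, 160),
    "FPTAN": (95, 245),
    "FPATAN": (60, 440),
}


def estimate_cycles(opcodes):
    """Return (min, max) total cycles for a strictly serial sequence.

    Assumption: each instruction depends on the previous one, so the
    per-opcode values from the table are simply summed.
    """
    lo = sum(CYCLES[op.upper()][0] for op in opcodes)
    hi = sum(CYCLES[op.upper()][1] for op in opcodes)
    return lo, hi


if __name__ == "__main__":
    # Load two values, multiply them, take the square root.
    print(estimate_cycles(["FLD", "FLD", "FMUL", "FSQRT"]))  # (19, 62)
```

The wide range in the result comes almost entirely from FSQRT, which shows how much a single slow x87 instruction can dominate a short sequence.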

See the following programmer’s manuals from AMD:

  1. AMD64 Architecture Programmer’s Manual, Volume 1: Application Programming
  2. AMD64 Architecture Programmer’s Manual, Volume 3: General-Purpose and System Instructions
  3. AMD64 Architecture Programmer’s Manual, Volume 5: 64-Bit Media and x87 Floating-Point Instructions