AMD Bulldozer CPU Architecture Overview

At the chip level, there are 8 cores per CPU:

Unfortunately, I could not find per-opcode clock cycle counts on AMD's website, which seems to be almost unmaintained and has lots of dead links. The values below were gathered experimentally by Agner Fog; I took the figures measured for Bulldozer from his Instruction Tables.

| Opcode | Clock cycles |
|---|---|
| MOV | 1 |
| ADD, SUB | 1 |
| AND, OR, XOR | 1 |
| CMP | 1 |
| MUL | 4 |
| DIV | 16 |
| FABS | 2 |
| FADD, FSUB | 5-6 |
| FLD | 2 |
| FMUL | 5-6 |
| FDIV | 10-42 |
| FSQRT | 10-52 |
| FSIN | 65-210 |
| FCOS | ~160 |
| FSINCOS | 95-160 |
| FPTAN | 95-245 |
| FPATAN | 60-440 |

Due to out-of-order execution, timings may vary in actual programs.
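
As a rough illustration of how the table can be used, here is a minimal Python sketch that adds up the cycle counts for a short instruction sequence. It assumes every instruction has to wait for the result of the previous one (a fully dependent chain), so the sum is only a back-of-the-envelope estimate; independent instructions can overlap thanks to out-of-order execution. The names `CYCLES` and `chain_cycles` are made up for this example, and only a subset of the table is included.

```python
# Clock cycles copied from the table above (Bulldozer values).
# Each entry is (min, max); single values are stored as (n, n).
CYCLES = {
    "MOV": (1, 1),
    "ADD": (1, 1),
    "SUB": (1, 1),
    "MUL": (4, 4),
    "DIV": (16, 16),
    "FADD": (5, 6),
    "FMUL": (5, 6),
    "FDIV": (10, 42),
}


def chain_cycles(opcodes):
    """Sum (min, max) cycles, assuming each opcode waits on the previous one."""
    lo = sum(CYCLES[op][0] for op in opcodes)
    hi = sum(CYCLES[op][1] for op in opcodes)
    return lo, hi


if __name__ == "__main__":
    # Integer chain: 1 + 4 + 1 + 16 cycles
    print(chain_cycles(["MOV", "MUL", "ADD", "DIV"]))
    # Floating-point chain: the wide FDIV range dominates the estimate
    print(chain_cycles(["FMUL", "FADD", "FDIV"]))
```

The integer chain comes out at 22 cycles, while the floating-point chain lands anywhere between 20 and 54 cycles, which shows how much the wide ranges for instructions like FDIV affect any estimate.
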
See the following programmer's manuals from AMD:
