x86 Overview

I get a lot of questions concerning the proliferation of x86 compatible machines. I decided I might save myself some time by hacking out a quick and dirty explanation page I could refer people to. That is what this is, and it probably contains a boatload of errors, but these are the answers you'd get, more or less, if you asked me in person the following questions:

Reading about the crusades has satisfied a great deal of my hunger for religious warfare, and so I don't attempt to compare these chips here.


Instruction Set Overview

I get a lot of questions on the state of x86 compatible architectures. The x86 world is made up of a *lot* of differing chips. The first classification would be by the native data length, called IA32 and IA64. This is, I think, Intel Architecture 32 bit and Intel Architecture 64 bit. This page will only cover IA32, since that is what most people have these days. I'll just mention that the 64 bit machines include Intel's Itanium and forthcoming McKinley processors, as well as AMD's forthcoming 'Hammer' series (strictly speaking, the Hammer chips use AMD's own 64 bit extension of x86, called x86-64, rather than Intel's IA64). A big reason people are talking about 64 bit machines is that they can natively use memories in excess of 4GB, which is something IA32 does not handle natively . . .
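The 4GB figure falls straight out of the pointer width: a 32 bit address can name 2^32 distinct bytes. Here is a minimal Python sketch of that arithmetic. The function name addressable_bytes is made up for illustration, and the sketch ignores whatever address space the OS or chipset reserves for itself.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses a flat pointer of this width can name."""
    return 2 ** address_bits

GIB = 2 ** 30  # bytes in one (binary) gigabyte

# 32 bit pointers top out at 4 GB.
print(addressable_bytes(32) // GIB)

# 64 bit pointers reach far beyond that.
print(addressable_bytes(64) // GIB)
```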

OK, here are the IA32 x86 compatibles I'm aware of, categorized by company, and then ordered roughly by release date (I leave out companies that are no longer real players, such as Cyrix or Centaur):

So, all these chips have one thing (at least) in common: they use the x86 instruction set architecture (ISA). What this means is that the same code will run on all of them (though it may run slower or faster). However, the newer chips contain ISA extensions, which, if the programmer uses them, will result in an executable that is not portable across all x86 platforms. NOTE: floating point instructions using the standard FPU (Floating Point Unit) are often said to be x87 compatible, since this is what the FPU was called.

x86 ISA Extensions:

MMX
Set of "MultiMedia eXtensions" to the x86 ISA. Mainly new instructions for integer performance; it does not include prefetch (that arrived with 3DNow! and SSE). For Intel, all chips starting with the Pentium MMX processor possess these extensions. For AMD, all chips starting with the K6 possess these extensions.
SSE
Streaming SIMD (Single Instruction Multiple Data) Extensions. SSE is a superset of MMX (i.e., a chip with SSE automatically possesses MMX). These instructions are used to speed up single precision (32 bit) floating point arithmetic. By operating on 4 single precision values with one instruction, they allow for a theoretical peak of 4 FLOPs (FLoating point OPerations) every cycle (e.g., a 500MHz PIII can theoretically perform 2 GFLOPS (2 billion FLoating point Operations Per Second)). The results returned by SSE are IEEE compliant (as are classical x86 floating point results). For Intel, all chips listed starting with the Pentium III possess SSE extensions. For AMD, all chips starting from Athlon4 possess SSE.
3DNow!
AMD's extension to MMX that does almost the exact same thing SSE does, except the single precision arithmetic is not IEEE compliant (i.e. it is not as fault-tolerant as x86 arithmetic). It is also a superset of MMX (but not of SSE; 3DNow! was released before SSE). It is supported only on AMD, starting with the K6-2 chip.
Enhanced 3DNow!
An extension to 3DNow! starting with the Athlon onward. Some additional prefetch commands (essentially, they added support for SSE-style prefetch, I think), and some other stuff I really don't know a whole lot about.
3DNow! Professional
AMD's extension that is essentially Enhanced 3DNow! + SSE. Available on AMD chips starting with the Athlon4.
SSE2
New instructions that perform double precision floating point arithmetic. Allows for 2 double precision FLOPs every cycle. For Intel, supported on the Pentium 4. Not supported by any released AMD chip.
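Because of these extensions, a program that wants to stay portable usually checks what the chip supports before using any of them. Below is a rough Python sketch that picks the extensions out of a flags line like the one Linux prints in /proc/cpuinfo. The flag spellings (mmx, sse, sse2, 3dnow, 3dnowext) are the ones Linux uses as far as I know, so treat them as an assumption; the function name and the sample line are made up for illustration.

```python
# Map from cpuinfo-style flag names to the extension names used above.
# Assumption: these spellings match what the Linux kernel prints.
FLAG_TO_EXTENSION = {
    "mmx": "MMX",
    "sse": "SSE",
    "sse2": "SSE2",
    "3dnow": "3DNow!",
    "3dnowext": "Enhanced 3DNow!",
}

def extensions_from_flags(flags_line):
    """Return the set of ISA extensions named in a cpuinfo-style flags line."""
    # Accept either "flags : a b c" or a bare "a b c" list.
    _, _, flags = flags_line.rpartition(":")
    present = set(flags.split())
    found = {ext for flag, ext in FLAG_TO_EXTENSION.items() if flag in present}
    # 3DNow! Professional is essentially Enhanced 3DNow! plus SSE.
    if {"Enhanced 3DNow!", "SSE"} <= found:
        found.add("3DNow! Professional")
    return found

# Made-up sample line, roughly what an Athlon4 might report.
SAMPLE = "flags : fpu tsc cx8 cmov mmx fxsr sse mmxext 3dnowext 3dnow"
print(sorted(extensions_from_flags(SAMPLE)))
```

On a Linux box you would feed it the real flags line from /proc/cpuinfo instead of SAMPLE.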


A Rose by 10^6 Names

There are an enormous number of names for each of these chips. There are usually at least two for each chip (a pre-release name, and a final name) and then there are names for various subcategories. I'll try to discuss some of these here.

Linux/GNU categories
These are very generic categories, and my best guess as to their meaning is:

Note that due to the various extensions, these chip categories are only binary compatible if you use strict x86 instructions (no SSE or 3DNow!, etc).

Rundown of selected AMD chips

Rundown of selected Intel chips

Unfortunately, this doesn't even scratch the surface of the naming game. For instance, for every Intel class above, they define an extra category called "Xeon", which is usually the same chip with a slightly bigger L2 and a much bigger price tag. I think AMD is planning on playing this game as well . . .


Peak Floating Point Performance Overview

Peak floating point performance is given in FLOPS (FLoating point Operations Per Second). It can be derived by multiplying the clock rate of a chip (e.g., in MHz) by some constant. It also varies depending on what instruction set you are using (as explained above). With no ISA extensions, all the IA32 Intel architectures can do at most 1 FLOP per cycle (e.g., a 500MHz PIII can theoretically get at most 500 MFLOPS). For all AMD machines before the Athlon, this number is actually less than one. For Athlon and later, however, it is 2 (e.g., a 500MHz Athlon has a 1 GFLOPS theoretical peak). Here is a table listing some of the newer chips, and the constant to multiply the clock rate by to get peak (an entry of 0 indicates that chip does not have the given ISA extension) FOR SINGLE PRECISION ARITHMETIC:

CHIP              x87   SSE   3DNOW!
Pentium            1     0      0
Pentium II         1     0      0
Pentium III        1     4      0
Pentium 4          1     4      0
Athlon             2     0      4
Enhanced Athlon    2     0      4
Athlon4            2     4      4
AthlonMP           2     4      4
Multiplier of clock rate to get single precision peak
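To make the use of the table concrete, here is a small Python sketch that looks up the multiplier and applies it to a clock rate. The dictionary is copied straight from the single precision table; the names SP_FLOPS_PER_CYCLE and peak_mflops are just ones I picked for illustration.

```python
# Single precision FLOPs per cycle, copied from the table above.
# A 0 means the chip does not have that ISA extension.
SP_FLOPS_PER_CYCLE = {
    "Pentium":         {"x87": 1, "SSE": 0, "3DNow!": 0},
    "Pentium II":      {"x87": 1, "SSE": 0, "3DNow!": 0},
    "Pentium III":     {"x87": 1, "SSE": 4, "3DNow!": 0},
    "Pentium 4":       {"x87": 1, "SSE": 4, "3DNow!": 0},
    "Athlon":          {"x87": 2, "SSE": 0, "3DNow!": 4},
    "Enhanced Athlon": {"x87": 2, "SSE": 0, "3DNow!": 4},
    "Athlon4":         {"x87": 2, "SSE": 4, "3DNow!": 4},
    "AthlonMP":        {"x87": 2, "SSE": 4, "3DNow!": 4},
}

def peak_mflops(chip, isa, mhz):
    """Theoretical peak in MFLOPS: FLOPs per cycle times clock rate in MHz."""
    return SP_FLOPS_PER_CYCLE[chip][isa] * mhz

print(peak_mflops("Pentium III", "x87", 500))  # 500 MFLOPS
print(peak_mflops("Pentium III", "SSE", 500))  # 2000 MFLOPS, i.e. 2 GFLOPS
print(peak_mflops("Athlon", "x87", 500))       # 1000 MFLOPS, i.e. 1 GFLOPS
```

The three printed cases are the 500MHz examples already worked in the text, so they make a handy sanity check.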

Here's the same table for double precision (64 bit) arithmetic:

CHIP              x87   SSE2
Pentium            1     0
Pentium II         1     0
Pentium III        1     0
Pentium 4          1     2
Athlon             2     0
Enhanced Athlon    2     0
Athlon4            2     0
AthlonMP           2     0
Multiplier of clock rate to get double precision peak
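One thing the double precision table makes easy to see is which instruction set gives each chip its best peak. The Python sketch below picks the winner per chip. The numbers are copied from the table; the names DP_FLOPS_PER_CYCLE and best_isa are made up for illustration, and breaking ties in favor of x87 is my own choice (it is the code that runs everywhere).

```python
# Double precision FLOPs per cycle, copied from the table above.
# A 0 means the chip does not have that ISA extension.
DP_FLOPS_PER_CYCLE = {
    "Pentium":         {"x87": 1, "SSE2": 0},
    "Pentium II":      {"x87": 1, "SSE2": 0},
    "Pentium III":     {"x87": 1, "SSE2": 0},
    "Pentium 4":       {"x87": 1, "SSE2": 2},
    "Athlon":          {"x87": 2, "SSE2": 0},
    "Enhanced Athlon": {"x87": 2, "SSE2": 0},
    "Athlon4":         {"x87": 2, "SSE2": 0},
    "AthlonMP":        {"x87": 2, "SSE2": 0},
}

def best_isa(chip):
    """Return (isa, flops_per_cycle) with the highest double precision peak.

    Ties go to x87, since plain x87 code runs on every chip listed.
    """
    table = DP_FLOPS_PER_CYCLE[chip]
    best = "x87"
    for isa, rate in table.items():
        if rate > table[best]:
            best = isa
    return best, table[best]

for chip in DP_FLOPS_PER_CYCLE:
    isa, rate = best_isa(chip)
    print(chip, isa, rate)
```

Per the table, only the Pentium 4 gets anything out of SSE2; at equal clock rates that just brings it level with the Athlon's x87 peak of 2.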