3
votes

As I have always understood it, AMD built their CPUs by reverse engineering Intel's instruction set and now pay Intel to use their instruction set, and Intel do the same for AMD's 64-bit instructions.

This is how Windows can be installed on both types of CPUs without needing to purchase a specific build, such as a version compiled for ARM, and so all apps, games etc. work in the same way, interchangeably across CPUs...

However lately some things have been making me question some of this...

Firstly, I've noticed some games have been a bit laggy on my system (AMD) and after reading it turns out the game is optimised for Intel CPUs...

Also, OSX is sold on Intel CPUs but after discovering the hackintosh community it turns out it is possible but very hard to get OSX to run on AMD. This is because again OSX is designed for Intel...

After these things..

What does it mean to be optimised for Intel or AMD? How can it be possible to be different / optimised for one but not the other, if they are meant to be drop-in replacements for each other? I.e. both support the same instructions etc.

5
This site is for questions related to programming (code) and programmers' tools. What specific programming question can we help you with? – Ken White
AMD never "reverse engineered" anything; that's complete and utter nonsense -- internet BS. This myth is repeated over and over again without citing any credible source. Reverse engineering hardware like that would be highly illegal in many countries; in fact, if it were true, Intel would have already sued the crap out of AMD, probably even bought the entire company after suing it into bankruptcy. Just because some criminal activities might be possible doesn't mean they are legal. AMD created their own solutions whilst building upon shared knowledge. End of story. – specializt

5 Answers

7
votes

They implement the same ISA, but with different performance characteristics because the microarchitecture is different.

For details, see Agner Fog's microarch pdf and the other links from the tag wiki, e.g. David Kanter's Haswell microarchitecture writeup vs. his writeup of AMD Bulldozer.

Agner Fog's instruction tables also show you exactly how fast each instruction is on each CPU. For example, imul r64, r64/m64, imm32 has 6-cycle latency and a throughput of one per 4 cycles on AMD Bulldozer-family. On Intel SnB-family, it has 3-cycle latency with a throughput of one per cycle.

So when tuning for AMD, it would be worth replacing a 64-bit multiply by a constant with a couple of shifts / adds if possible. On Intel, it's probably only worth it if you can get the job done in one or two shift / lea instructions.
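
As a rough illustration of that trade-off, here is a minimal sketch of a multiply by the constant 10 written both ways. The function names are made up for this example, and in practice a compiler usually makes this choice itself based on the tuning target, so treat this as a picture of the idea rather than a recommendation to hand-write it.

```c
#include <stdint.h>

/* Plain form: the compiler is free to emit imul or its own shift/add sequence. */
uint64_t mul10_plain(uint64_t x) {
    return x * 10u;
}

/* Hand-written strength reduction: 10*x = 8*x + 2*x,
 * i.e. two shifts and one add instead of one multiply. */
uint64_t mul10_shift_add(uint64_t x) {
    return (x << 3) + (x << 1);
}
```

Both functions return the same value for every input, because unsigned arithmetic wraps modulo 2^64 in either form. Which one is faster depends on the multiply latency and throughput of the CPU it runs on.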

AMD's designs also have a notably weaker cache hierarchy, and lower single-threaded throughput due to using pairs of cores that are permanently split instead of Intel's Hyperthreading dynamic sharing of resources between two hardware threads on the same core. IIRC, AMD is planning to change that for their next microarchitecture. Some of this is stuff you can't really "optimize for", it's just AMD being slower. :(


So they run the same code, because that's what it means to be the same architecture.

Some CPUs support ISA extensions (new instructions) that the other doesn't. e.g. XOP is AMD-only, while AVX2 and BMI2 are (so far) Intel-only, so code that wants to use more than a common baseline has to check for support at runtime.
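
A runtime check could look something like the sketch below. It assumes GCC or Clang targeting x86, where the __builtin_cpu_init and __builtin_cpu_supports builtins are available (the set of accepted feature names depends on the compiler version). The struct and function names are hypothetical, and on any other compiler or architecture this sketch simply reports no extensions.

```c
#include <stdbool.h>

/* Hypothetical summary of the extensions mentioned above. */
struct isa_features {
    bool avx2;
    bool bmi2;
    bool xop;
};

/* Ask the CPU which extensions it supports.
 * Assumption: GCC/Clang builtins on x86; everywhere else all flags stay false. */
struct isa_features detect_isa_features(void) {
    struct isa_features f = { false, false, false };
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.avx2 = __builtin_cpu_supports("avx2") != 0;
    f.bmi2 = __builtin_cpu_supports("bmi2") != 0;
    f.xop  = __builtin_cpu_supports("xop") != 0;
#endif
    return f;
}
```

A program would call detect_isa_features() once at startup and only take a code path that uses, say, AVX2 when the corresponding flag is set.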

Wikipedia's AMD Excavator article is not very up to date. Hardware has been out for a while now, but the article still says it's "expected to have" AVX2 and BMI2. Agner Fog hasn't tested it and updated his instruction tables yet, either.

2
votes

When I first saw this question it had more downvotes than upvotes. But I think it is a reasonable question related to system performance and the differences between AMD and Intel processors. I think there are a couple of points worth addressing.

ISA Licensing

As I have always understood it, AMD built their CPUs by reverse engineering Intel's instruction set and now pay Intel to use their instruction set, and Intel do the same for AMD's 64-bit instructions.

I don't know the full history of the AMD and Intel license agreement for x86, but this is a bit of an oversimplification. Currently AMD and Intel have a cross-licensing agreement that allows both of them to implement the same ISA. For instance, the 64-bit extensions to the x86 ISA were developed by AMD back when Intel was pushing the Itanium ISA. Regardless, it is true that both AMD and Intel support the same core x86 ISA now, and they generally have extensions to it that are compatible with each other.

Overall performance

Firstly, I've noticed some games have been a bit laggy on my system (AMD) and after reading it turns out the game is optimised for Intel CPUs...

The overall performance of program execution depends on three basic things: the number of instructions that need to be executed, the frequency of the CPU (clock speed), and the number of instructions executed per cycle (per clock tick). Currently high-end Intel CPUs tend to have better overall performance than AMD CPUs, even when executing the exact same application that does not have any specific optimizations. So it's likely that if the game is slow on your system, it is just because the CPU is too slow, rather than because it's optimized for a particular microarchitecture. There could also be other factors (the GPU tends to matter the most for gaming), but debugging the performance of a game isn't going to be on-topic for Stack Overflow, unless you are a game developer trying to understand a specific coding problem.
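
Those three factors combine as time = instructions / (instructions per cycle × clock frequency). The sketch below just evaluates that formula; the function name and the numbers in the usage note are made up for illustration, not measurements of any real CPU.

```c
/* Estimated run time in seconds for a program that executes
 * `instructions` instructions at `ipc` instructions per cycle
 * on a core clocked at `hz` cycles per second. */
double exec_seconds(double instructions, double ipc, double hz) {
    return instructions / (ipc * hz);
}
```

For example, with hypothetical figures, 6e9 instructions at 2 instructions per cycle on a 3 GHz core take about 1 second, while the same program at 1 instruction per cycle on the same clock takes about 2 seconds. That is how two CPUs at the same clock speed can differ in performance on identical code.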

CPU Specific Optimizations

What does it mean to be optimised for Intel or AMD? How can it be possible to be different / optimised for one but not the other, if they are meant to be drop-in replacements for each other? I.e. both support the same instructions etc.

Although Intel and AMD both develop CPUs that run x86 applications, the internal microarchitecture of the CPUs is different. And there is not simply an Intel microarchitecture or an AMD microarchitecture. Instead, each company has various different groups of CPUs that it develops under a specific microarchitecture. So a program could be optimized for Skylake (an Intel microarchitecture) or Bulldozer (an AMD microarchitecture).

When the compiler is generating code, it can make very minor tweaks that might benefit one microarchitecture more than another. If a developer doesn't know what the target CPU family is, then it might make sense not to target a specific microarchitecture and simply generate code that is expected to perform the best overall. But if the developer knows which microarchitecture the program will run on, then it can be possible to get a slight performance improvement by specializing for that microarchitecture.

Usually these performance gains are pretty small compared to the baseline optimization. One exception is when a new feature like SSE4 is available. In that case it could make a big difference for certain workloads that are able to take advantage of the new feature. But even then the optimization is more specific to that feature than to a specific processor vendor, since both AMD and Intel support SSE4 now.

0
votes

Software compatibility with processors is ensured by the fact that they can be queried for availability of certain well-defined instructions or instruction groups. (The instruction sets are extremely volatile these days; this can be a nightmare for developers.)

So even among the Intel family, programs can run at quite different performance, depending on what the processor supports and how the software exploits it.

-1
votes

Basically, there is a difference in the processing. AMD and Intel pay each other fees for using each other's patents. That does not mean that both have the same design. The base instruction set is the same, but each has additional CPU-specific instructions that are basically emulated on the other CPU (at least most of them). As a result, software using Intel's additional (optimized) instructions might be slower on AMD, and the other way round. Additionally, there is no guarantee that all instructions will be emulated on both CPUs. There can be slight differences.

Hope this clarifies it a little ;-)

-1
votes

SIMD instructions are very different, and for some tasks (like games) they can make a difference. See this answer for a specific example: https://stackoverflow.com/a/17355341/126995

If you really want to, you can create several versions of your inner-loop algorithms and use cpuid at runtime to select the best implementation for the platform. Some people do just that, e.g. the people developing the x264 video codec definitely do:

int x264_intra_satd_x9_4x4_ssse3( uint8_t *, uint8_t *, uint16_t * ); // Intel 2006+, AMD 2011+
int x264_intra_satd_x9_4x4_sse4( uint8_t *, uint8_t *, uint16_t * ); // Both around 2006 but slightly different instructions
int x264_intra_satd_x9_4x4_avx( uint8_t *, uint8_t *, uint16_t * ); // Intel 2011, AMD around 2012
int x264_intra_satd_x9_4x4_xop( uint8_t *, uint8_t *, uint16_t * ); // AMD only
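
The selection step itself can be sketched roughly as follows. This is not x264's actual code: the kernel, the flag names and select_sad are all hypothetical, and the "tuned" version is plain C standing in for a real SIMD implementation. A real program would fill the flags from cpuid.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel type: sum of absolute differences over n bytes. */
typedef uint32_t (*sad_fn)(const uint8_t *a, const uint8_t *b, size_t n);

static uint32_t absdiff(uint8_t x, uint8_t y) {
    return (uint32_t)(x > y ? x - y : y - x);
}

/* Baseline version that runs on any CPU. */
uint32_t sad_c(const uint8_t *a, const uint8_t *b, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += absdiff(a[i], b[i]);
    return sum;
}

/* Stand-in for a tuned version (plain C unrolled by two, not real SIMD). */
uint32_t sad_unrolled(const uint8_t *a, const uint8_t *b, size_t n) {
    uint32_t s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += absdiff(a[i], b[i]);
        s1 += absdiff(a[i + 1], b[i + 1]);
    }
    if (i < n)                      /* odd element left over */
        s0 += absdiff(a[i], b[i]);
    return s0 + s1;
}

/* Hypothetical feature flags; a real program would derive these from cpuid. */
enum { CPU_SSSE3 = 1, CPU_AVX = 2 };

/* Pick the best available implementation once, at startup. */
sad_fn select_sad(unsigned cpu_flags) {
    if (cpu_flags & CPU_AVX)
        return sad_unrolled;
    return sad_c;
}
```

The caller stores the returned pointer and uses it in the hot loop, so the feature check is paid once rather than on every call. Every implementation must produce the same result; only the speed differs.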

For many projects, doing that, i.e. optimizing for all of them, is prohibitively expensive. So the software gets optimized only for the most popular architecture(s).

This page http://store.steampowered.com/hwsurvey?platform=pc (click on Other Settings) reports that:

  • 99.95% have SSE3
  • 91.04% have SSSE3
  • 84.76% have SSE4.1
  • 81.60% have SSE4.2
  • 67.56% have AVX (mostly Intel, I think)
  • 22.05% have SSE4a (that’s AMD only)

If you’re managing a project and you have a choice of how to spend your budget: would you specifically optimize your software for the 67% of users who have AVX or for the 22% of users who have SSE4a?


AMD implemented SSE4a before it implemented SSSE3. 22.83% of the users use AMD, and since 22.05% of the users have SSE4a, it's safe to say nearly all AMD users have SSE4a. I think we can conclude that the majority of users without SSSE3 are AMD K10 users. This is the main reason that SSE3 is becoming the baseline and not SSSE3.