I write quite a bit of code in 64-bit x86_64 assembly language, and I am about to begin another large function library to provide all conventional bitwise, shift, logical, arithmetic, and math operators and functions for the s0128, s0256, s0512, s1024 signed integer types and the f0128, f0256, f0512, f1024 floating-point types.
I have AMD FX-8150 (bulldozer) CPUs in both my computers (ubuntu64 and win7-64). After reviewing the operations my code needs to perform, I find that a great number of the recent bit manipulation instructions will be extremely helpful.
However, when I read various documents, including the official AMD documents on their website, I find endless contradictions about whether certain instructions and instruction sets are supported by bulldozer CPUs (FX-8150) and/or piledriver (FX-8350). The confusion is especially common with regard to the various recent bit manipulation instructions and instruction sets, and the FMA3 and FMA4 instruction sets.
I know some of the AMD documents are wrong, because I've been programming with FMA3 and FMA4 instructions on my FX-8150 and they work just fine, while the AMD document comparing bulldozer and piledriver contradicts this.
Given that ALL sources of documentation I can find appear to be wrong to some degree about this issue, does anyone out there know which instructions and/or instruction sets work on piledriver (FX-8350) but not bulldozer (FX-8150)?
Since my problem is the validity of documentation out there, please don't just point me at some document unless you know for sure it is correct. The best answers would come from programmers who have tested these instructions and instruction sets on their bulldozer [and piledriver] CPUs.
FMA3 instruction, that generated a SIGILL (illegal instruction), while FMA4 instructions work fine (and indeed, I have dozens of FMA4 instructions in my code). Of course if you look at the AMD document I gave a link to above, it claims the bulldozer CAN execute FMA3 instructions (wrong), but CANNOT execute FMA4 instructions (wrong). Now onto bit-oriented instructions. – honestann
FX-8150). After laborious checks of the various CPUID bits, they seem mostly accurate (but a royal pain to understand). One strange one is an FMA bit (that contains false on my bulldozer FX-8150) while it does execute FMA4 but not FMA3 instructions. But I did find another FMA bit in the second set (with the 0x80000000 prefix) that's set to 1. Overall, CPUID does seem fairly reliable, while the documentation for bulldozer out there in the world is massively inconsistent and largely wrong. – honestann
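For anyone who wants to repeat the CPUID check described in the comment above, here is a minimal sketch in C. It assumes GCC or Clang (for the `<cpuid.h>` helpers `__get_cpuid`, `__get_cpuid_max` and `__cpuid_count`) and uses the feature-bit positions as I understand them from the AMD and Intel CPUID documentation: FMA3 in leaf 0x00000001 ECX bit 12, FMA4 in leaf 0x80000001 ECX bit 16, and so on. The names `cpu_regs`, `cpu_features`, `decode_features`, `read_cpuid` and `print_features` are made up for this example. Treat it as a starting point for your own testing, not as an authoritative list.

```c
#include <stdint.h>
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>          /* GCC/Clang helper header (assumption) */
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0        /* non-x86 build: registers stay zero */
#endif

/* Raw register values from the three CPUID leaves of interest. */
typedef struct {
    uint32_t leaf1_ecx;     /* CPUID 0x00000001, ECX            */
    uint32_t leaf7_ebx;     /* CPUID 0x00000007 subleaf 0, EBX  */
    uint32_t ext1_ecx;      /* CPUID 0x80000001, ECX            */
} cpu_regs;

/* Decoded feature flags (1 = reported as supported, 0 = not). */
typedef struct {
    int fma3, popcnt;           /* standard leaf 1            */
    int bmi1, bmi2;             /* structured leaf 7          */
    int lzcnt, xop, fma4, tbm;  /* extended leaf 0x80000001   */
} cpu_features;

/* Extract bit n of v as 0 or 1. */
int feature_bit(uint32_t v, unsigned n) { return (int)((v >> n) & 1u); }

/* Pure decoding step: turns raw register values into flags. */
cpu_features decode_features(cpu_regs r)
{
    cpu_features f;
    f.fma3   = feature_bit(r.leaf1_ecx, 12);  /* the "first" FMA bit  */
    f.popcnt = feature_bit(r.leaf1_ecx, 23);
    f.bmi1   = feature_bit(r.leaf7_ebx, 3);
    f.bmi2   = feature_bit(r.leaf7_ebx, 8);
    f.lzcnt  = feature_bit(r.ext1_ecx, 5);    /* a.k.a. ABM           */
    f.xop    = feature_bit(r.ext1_ecx, 11);
    f.fma4   = feature_bit(r.ext1_ecx, 16);   /* the "second" FMA bit */
    f.tbm    = feature_bit(r.ext1_ecx, 21);
    return f;
}

/* Query the CPU; leaves the CPU does not implement are left as zero. */
cpu_regs read_cpuid(void)
{
    cpu_regs r = {0, 0, 0};
#if HAVE_CPUID
    unsigned a = 0, b = 0, c = 0, d = 0;
    if (__get_cpuid(1, &a, &b, &c, &d))
        r.leaf1_ecx = c;
    if (__get_cpuid_max(0, 0) >= 7) {
        __cpuid_count(7, 0, a, b, c, d);
        r.leaf7_ebx = b;
    }
    if (__get_cpuid(0x80000001u, &a, &b, &c, &d))
        r.ext1_ecx = c;
#endif
    return r;
}

/* Print one line per feature. */
void print_features(cpu_features f)
{
    printf("FMA3   %d\nFMA4   %d\nXOP    %d\nTBM    %d\n",
           f.fma3, f.fma4, f.xop, f.tbm);
    printf("BMI1   %d\nBMI2   %d\nLZCNT  %d\nPOPCNT %d\n",
           f.bmi1, f.bmi2, f.lzcnt, f.popcnt);
}
```

To use it, call `print_features(decode_features(read_cpuid()));` from `main`. Decoding is kept separate from the CPUID query so that register dumps posted by other people (for example from a piledriver machine you don't own) can be fed through `decode_features` by hand. A set bit only says what the CPU reports, so actually executing one instruction from each set remains the final check.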