[LLVMdev] Macro-op fusion experiment
Jakob Stoklund Olesen
stoklund at 2pi.dk
Fri Apr 8 10:27:50 PDT 2011
On Apr 8, 2011, at 9:56 AM, NAKAMURA Takumi wrote:
>>> 8B C3 mov eax, ebx
>>> 03 C1 add eax, ecx
>>> 8B C3 03 C1 add eax, ebx, ecx
> In my understanding, the twoaddr pass tends to emit such a sequence.
Yes, it always does, and the coalescer tries very hard to eliminate the copy.
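
To make the copy-then-add shape concrete, here is a minimal sketch of the rewrite in Python. It is a toy model and not LLVM's actual pass: the tuple instruction format and the name to_two_address are hypothetical, and only a commutative "add" is handled.

```python
from typing import List, Tuple

# Toy instruction: (opcode, destination, sources). Hypothetical format,
# not LLVM's MachineInstr.
Instr = Tuple[str, str, Tuple[str, ...]]


def to_two_address(instrs: List[Instr]) -> List[Instr]:
    """Rewrite three-address adds so the destination is also the first source."""
    out: List[Instr] = []
    for op, dst, srcs in instrs:
        if op == "add" and len(srcs) == 2:
            a, b = srcs
            if dst == b and dst != a:
                # add is commutative: swap operands so no copy is needed
                # and the second operand is not clobbered.
                a, b = b, a
            if dst != a:
                # Tie the destination to the first source with a copy.
                out.append(("mov", dst, (a,)))
            out.append(("add", dst, (dst, b)))
        else:
            out.append((op, dst, srcs))
    return out


if __name__ == "__main__":
    for instr in to_two_address([("add", "eax", ("ebx", "ecx"))]):
        print(instr)
```

Running it on eax = add ebx, ecx yields the mov/add pair quoted above. If ebx is not needed afterwards, assigning both values to the same register makes the copy a no-op, which is the case the coalescer looks for.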
> I don't have a Sandy Bridge machine, so I have not measured.
> Prior processors (Intel and AMD) might spend one ALU to execute the "mov",
> so the mov and the add would form a dependency chain.
I think you will find it is more complicated than that. A 'mov' usually doesn't need an ALU resource.
You should read about 'reservation station' style register renaming.
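
As a rough illustration of how renaming can take a register-to-register mov off the ALU, here is a toy rename table in Python. It is a sketch under stated assumptions, not a model of any particular processor: the Renamer class and its methods are hypothetical, physical registers are never freed, and whether a given core handles mov this way is not established in this thread.

```python
class Renamer:
    """Toy rename stage: maps architectural registers to physical ones."""

    def __init__(self, arch_regs):
        # Each architectural register starts with its own physical register.
        self.table = {reg: i for i, reg in enumerate(arch_regs)}
        self.next_phys = len(arch_regs)
        self.alu_uops = []  # work that would actually be sent to an ALU

    def mov(self, dst, src):
        # Assumption: the move is handled purely by aliasing dst to the
        # physical register that already holds src. No ALU uop is issued.
        self.table[dst] = self.table[src]

    def add(self, dst, src):
        # Two-address add: dst = dst + src. Reads the current mappings and
        # writes a freshly allocated physical register.
        a, b = self.table[dst], self.table[src]
        result = self.next_phys
        self.next_phys += 1
        self.alu_uops.append(("add", result, a, b))
        self.table[dst] = result


if __name__ == "__main__":
    r = Renamer(["eax", "ebx", "ecx"])
    r.mov("eax", "ebx")
    r.add("eax", "ecx")
    print(r.alu_uops)
```

In this model the add reads the physical register behind ebx directly, so the mov/add pair produces a single ALU operation and the add does not wait on a separate mov result.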