[PATCH] D80466: [X86] Improve i8 + 'slow' i16 funnel shift codegen
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 26 14:45:01 PDT 2020
efriedma added inline comments.
================
Comment at: llvm/test/CodeGen/X86/fshl.ll:22-23
+; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT: shll $8, %eax
+; X86-NEXT: orl %edx, %eax
+; X86-NEXT: andb $7, %cl
----------------
craig.topper wrote:
> RKSimon wrote:
> > foad wrote:
> > > Would it be worth trying to generate just `movb %al, %dh` instead of zext+shll+orl?
> > Yes, that might be useful, but it should probably be done generally. I don't know much about the hi-byte move logic; @craig.topper might be able to advise?
> I think you'd have to jump through some hoops to get the register allocator to do it. You'd need an INSERT_SUBREG to force the join. Possibly even a pseudo instruction on 64-bit to force NOREX on the other register to avoid an encoding issue.
>
> I'm not sure it makes sense to write an h register on modern Intel CPUs. It guarantees a merge uop needs to be generated when bits 15:8 and 7:0 are both read by the consuming instruction.
On processors that don't have special rename machinery for 8-bit registers, the `movb` should simply save an instruction, if it's legal. On big Intel cores, even if it doesn't save a uop, the code should still be smaller.
That said, even if it's profitable in this exact case, the register allocation constraints to make it work are really tight; it's probably only worthwhile if the values are already in ABCD registers.
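For readers following the thread, here is a minimal C sketch of what the checked sequence computes. It assumes that `%eax` holds the first (high) operand and `%edx` the zero-extended second operand, which the quoted context does not show. The function names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of an 8-bit funnel shift left: concatenate x:y,
   shift left by z modulo 8, and keep the top 8 bits. Operands are
   promoted to int, so y >> 8 (when the amount is 0) is simply 0. */
uint8_t fshl8_reference(uint8_t x, uint8_t y, uint8_t z) {
    unsigned s = z % 8u;
    return (uint8_t)((x << s) | (y >> (8u - s)));
}

/* Model of the quoted X86 sequence. Which register holds which operand
   is an assumption made for illustration. */
uint8_t fshl8_widened(uint8_t x, uint8_t y, uint8_t z) {
    uint32_t wide = ((uint32_t)x << 8) | y; /* shll $8, %eax ; orl %edx, %eax */
    wide <<= (z & 7u);                      /* andb $7, %cl ; shll %cl, %eax */
    return (uint8_t)(wide >> 8);            /* read bits 15:8, i.e. %ah */
}

/* Compare both models over every input; returns the number of mismatches. */
unsigned fshl8_count_mismatches(void) {
    unsigned bad = 0;
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned y = 0; y < 256; ++y)
            for (unsigned z = 0; z < 256; ++z)
                if (fshl8_reference((uint8_t)x, (uint8_t)y, (uint8_t)z) !=
                    fshl8_widened((uint8_t)x, (uint8_t)y, (uint8_t)z))
                    ++bad;
    return bad;
}

/* Aborts if the two models ever disagree. */
void fshl8_self_check(void) {
    assert(fshl8_count_mismatches() == 0);
}
```

The `movb %al, %dh` idea discussed above would build the same 16-bit concatenation in a single instruction instead of the zext + `shll $8` + `orl` steps modeled in `fshl8_widened`.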
================
Comment at: llvm/test/CodeGen/X86/fshl.ll:26
+; X86-NEXT: shll %cl, %eax
+; X86-NEXT: movb %ah, %al
; X86-NEXT: retl
----------------
We should probably prefer `shrl $8, %eax` over `movb %ah, %al` here.
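As a rough illustration of why the two instructions are interchangeable for this return value, the C sketch below models both on a 32-bit register value and compares only the low byte, since an i8 result is returned in `%al`. The helper names are hypothetical. The sketch says nothing about the relative cost of the two forms.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `movb %ah, %al`: bits 7:0 are replaced by bits 15:8,
   bits 31:8 are left untouched. */
uint32_t model_movb_ah_al(uint32_t eax) {
    return (eax & 0xFFFFFF00u) | ((eax >> 8) & 0xFFu);
}

/* Model of `shrl $8, %eax`: the whole register shifts right by 8. */
uint32_t model_shrl_8(uint32_t eax) {
    return eax >> 8;
}

/* Only the low byte (%al) carries the i8 return value. */
uint8_t low_byte(uint32_t eax) {
    return (uint8_t)eax;
}

/* Aborts if the two models ever leave a different value in %al. */
void check_low_bytes_agree(void) {
    for (uint32_t v = 0; v <= 0xFFFFFu; ++v)
        assert(low_byte(model_movb_ah_al(v)) == low_byte(model_shrl_8(v)));
}
```

The upper bits of `%eax` differ between the two forms, but they are not part of the i8 result.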
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80466/new/
https://reviews.llvm.org/D80466