[PATCH] D63318: [DAGCombine] Teach DAGCombine to fold the aext + select pattern

Tue Jun 18 21:53:27 PDT 2019

steven.zhang marked an inline comment as done.
steven.zhang added inline comments.

================
Comment at: llvm/test/CodeGen/X86/cmov-promotion.ll:130
+; CMOV-NEXT:    movl $4294967295, %eax # imm = 0xFFFFFFFF
+; CMOV-NEXT:    cmovneq %rcx, %rax
 ; CMOV-NEXT:    retq
----------------
spatel wrote:
> steven.zhang wrote:
> > craig.topper wrote:
> > > steven.zhang wrote:
> > > > craig.topper wrote:
> > > > > This is slightly worse. Maybe don't do this when zext is free?
> > > > I check the scheduling information for cmovneq and cmovnel, both latency are 1. I didn't catch the "slightly worse" you mean ... Could you explain it more, as I am new to X86 instr. Thank you!
> > > Cmovneq’s encoding is 1 byte longer than cmovnel. The 64-bit size requires a REX prefix.
> > I wonder if we can do this inside X86 target, as it seems a valid improvement for x86. For cmovneq, if the high 32bit is zero, use cmovnel ?
> The problem is larger than just this transform or x86. We do the same transform in instcombine, so we need to check constant values to reverse it.
> 
> But there's no reason to make the problem worse by not using the existing TLI hook suggested by Craig. If we just add one more clause to the 'if' check, we avoid this test diff without changing any others:
> 
>     if (isa<ConstantSDNode>(Op1) && isa<ConstantSDNode>(Op2) &&
>         (Opcode != ISD::ZERO_EXTEND || !TLI.isZExtFree(N0.getValueType(), VT))) {
> 
Ah, sorry, I didn't get the point. Sure, it makes sense.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63318/new/

https://reviews.llvm.org/D63318