[PATCH] D54640: [DAGCombiner] narrow truncated binops

Thu Nov 29 10:38:40 PST 2018

andreadb added inline comments.

================
Comment at: test/CodeGen/X86/clear-lowbits.ll:898-900
 ; X64-BMI2-NEXT:    movzwl %di, %eax
-; X64-BMI2-NEXT:    movl $16, %ecx
-; X64-BMI2-NEXT:    subl %esi, %ecx
+; X64-BMI2-NEXT:    movb $16, %cl
+; X64-BMI2-NEXT:    subb %sil, %cl
----------------
craig.topper wrote:
> craig.topper wrote:
> > andreadb wrote:
> > > Not related to this patch: do you know why we need a movzwl here?
> > > 
> > > Only the lower 16-bits of EAX are 'defined' on exit from this function. Users of this function would not rely on the upper bits of EAX. So, - unless I am missing something - it is redundant to zero extend `%val`.
> > > 
> > I think it needs to be zero extended because we're using 32-bit shifts and you need zeros in the upper bits to guarantee that shifting right will shift 0s into the lower 16 bits.
> Though we also shift back left by the same amount. So it is redundant, but SimplifyDemandedBits stopped when it saw the shift amount on the shl was non-constant. Maybe for the non-constant case, SimlifyDemandedBits should see if the input is a SHR with the same amount and then propagated demanded bits to the shr input.
mm.. but isn't the second shift (the shlxl) using the same shift count?
Even if the shift right introduces non zero bits, the second shift would get rid of those extra bits anyway (unless I am reading the code completely wrong... which is probably the case, since I am tired).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54640/new/

https://reviews.llvm.org/D54640