[PATCH] D54640: [DAGCombiner] narrow truncated binops

Thu Nov 29 10:08:46 PST 2018

andreadb added a comment.

LGTM.

At the beginning, I was a bit concerned by the increase in the number partial register writes.
So I started experimenting with your patch fo a bit. In practice, I couldn't come up with an obvious case where your patch introduces extra partial register stalls. Also, none of the tests modified by your patch really suffers from partial register stalls.
The checks perfomed by your transform are quite conservative (the new combine rule is limited to pre-legalization, and it only affects binary opcodes where one of the operands is a constant). So, I am happy with it.

================
Comment at: test/CodeGen/X86/clear-lowbits.ll:898-900
 ; X64-BMI2-NEXT:    movzwl %di, %eax
-; X64-BMI2-NEXT:    movl $16, %ecx
-; X64-BMI2-NEXT:    subl %esi, %ecx
+; X64-BMI2-NEXT:    movb $16, %cl
+; X64-BMI2-NEXT:    subb %sil, %cl
----------------
Not related to this patch: do you know why we need a movzwl here?

Only the lower 16-bits of EAX are 'defined' on exit from this function. Users of this function would not rely on the upper bits of EAX. So, - unless I am missing something - it is redundant to zero extend `%val`.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54640/new/

https://reviews.llvm.org/D54640