[PATCH] D21774: [X86] Transform setcc + movzbl into xorl + setcc

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 28 15:41:14 PDT 2016


mkuper added a comment.

Thanks for the feedback, Dave!

In http://reviews.llvm.org/D21774#469301, @DavidKreitzer wrote:

> Hi Michael,
>
> Thanks for working on this! Ultimately I am okay with this approach, but I have a few high level comments.
>
> (1) The advantage of the pseudo-SETcc instruction approach over this approach is that the dependence of the SETcc on the XOR is explicit.


Right, I agree it would be better to have an explicit dependence. It's just that I'm not a fan of either of the two ways we currently have to achieve this (modifying SETCC or introducing a pseudo).

> In your transformed code, I think there is nothing to prevent this:

> 

>   v1 = MOV32r0

>   <CC setter>

>   v2 = SETcc

>   v3 = INSERT_SUBREG(v1, v2)

> 

> 

> from being transformed into this

> 

>   <CC setter>

>   v2 = SETcc

>   v1 = MOV32r0

>   v3 = INSERT_SUBREG(v1, v2)

> 

> 

> I'd expect this to be a reasonably common occurrence, since the RA might choose to regenerate the MOV32r0 at its use to save on register pressure. (Regenerating the MOV32r0 eliminates the register conflicts with the operands of the CC setter.) And this would, ironically, have the effect of using an extra register & an extra movb instruction since v1 and v2 could no longer be assigned the same register.


I actually don't think this happens often, but you're right, conceptually it's not something we should be relying on, and it may be fatal.

> (2) Do you have any performance data on the patch?


No, I've verified it fixes the simple cases, but didn't really run comprehensive performance checks, as I assumed generating these xors is pure goodness. :-)
Will do, either for this patch or for a different one, depending on whether we go with this or with pseudos.


================
Comment at: lib/Target/X86/X86FixupSetCC.cpp:1
@@ +1,2 @@
+//===---- X86FixupSetCC.cpp - optimize usage of LEA instructions ----------===//
+//
----------------
DavidKreitzer wrote:
> Please fix the cut & paste error here.
Right, thanks.

================
Comment at: test/CodeGen/X86/fp128-compare.ll:11
@@ -10,3 +10,3 @@
 ; CHECK:       callq __gttf2
-; CHECK:       setg  %al
-; CHECK:       movzbl %al, %eax
+; CHECK:       xorl  %ecx, %ecx
+; CHECK:       setg  %cl
----------------
DavidKreitzer wrote:
> This looks like a problem ...
It's the same issue as pcmpestr - we are constrained because both the input of the instruction defining eflags, and the eventual output of the setcc must be eax.

The full code is:

```
# BB#0:                                 # %entry
	pushq	%rax
.Ltmp1:
	.cfi_def_cfa_offset 16
	callq	__getf2
	xorl	%ecx, %ecx
	testl	%eax, %eax
	setns	%cl
	movl	%ecx, %eax
	popq	%rcx
	retq

```
I should regenerate the test with the update script.


http://reviews.llvm.org/D21774





More information about the llvm-commits mailing list