[PATCH] D127801: [InstCombine] convert mask and shift of power-of-2 to cmp+select
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 15 08:18:37 PDT 2022
spatel added a comment.
In D127801#3584097 <https://reviews.llvm.org/D127801#3584097>, @bcl5980 wrote:
> We need to assume `X < BitWidth` https://alive2.llvm.org/ce/z/-GjVsu
Ah, yes, good point. Expanding back to the shift form requires an extra safety check against an over-shift in IR. Here's another version of the proof that shows this:
https://alive2.llvm.org/ce/z/E_KCpi
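To make the over-shift hazard concrete, here is a small Python model (my own sketch, not part of the patch) of the two forms, treating an out-of-range shift amount as poison:

```python
BITS = 32
MASK = (1 << BITS) - 1

def src(x):
    # shl i32 2, %x ; and i32 ..., 64
    # A shift amount outside [0, BITS) is poison in IR, modeled as None.
    if not (0 <= x < BITS):
        return None
    return ((2 << x) & MASK) & 64

def tgt(x):
    # icmp eq i32 %x, 5 ; select -- defined for every input
    return 64 if x == 5 else 0
```

For every in-range x the two agree; only the select form stays defined when x >= 32, which is why rebuilding the shift form needs the extra `X < BitWidth` check.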
And this is a hard-coded example of what the backend does currently when it creates a shift:
https://alive2.llvm.org/ce/z/6EfMnj
But in the backend, the target can specify the exact behavior for an over-shift - it does not have to be undefined (create poison). If a target returns 0 for an over-shift, then no masking or extra cmp+select may be needed.
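For instance, here is a sketch of such a target's semantics (hypothetical helper, not real backend code): if an over-shift yields 0, the plain mask already produces the same result as the select for every input:

```python
BITS = 32
MASK = (1 << BITS) - 1

def shl_zero_on_overshift(val, amt):
    # Hypothetical target semantics: a shift amount >= bit width yields 0.
    # (This is NOT what x86 does -- x86 masks the amount mod 32.)
    if amt >= BITS:
        return 0
    return (val << amt) & MASK

# With these semantics the masked shift matches the select form everywhere:
for x in range(256):
    assert (shl_zero_on_overshift(2, x) & 64) == (64 if x == 5 else 0)
```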
Do you have an example for a target where the code looks worse with the cmp+select IR?
I tried some tests with x86 and AArch64, and they do not look worse to me with the cmp+select IR. Here is an example:
define i32 @src(i32 %x) {
%shl = shl i32 2, %x
%r = and i32 %shl, 64
ret i32 %r
}
define i32 @tgt(i32 %x) {
%i = icmp eq i32 %x, 5
%r = select i1 %i, i32 64, i32 0
ret i32 %r
}
% llc -o - sh2.ll -mtriple=x86_64
src:
movl %edi, %ecx
movl $2, %eax
shll %cl, %eax
andl $64, %eax
tgt:
xorl %eax, %eax
cmpl $5, %edi
sete %al
shll $6, %eax
% llc -o - sh2.ll -mtriple=aarch64
src:
mov w8, #2
lsl w8, w8, w0
and w0, w8, #0x40
tgt:
cmp w0, #5
cset w8, eq
lsl w0, w8, #6
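Stepping back to the IR fold itself: when both constants are powers of two (as in the patch title), the masked shift is nonzero for exactly one shift amount, which is what justifies the cmp+select form. A quick brute-force Python check over an 8-bit type (my own sketch, not the actual InstCombine code) confirms this for all in-range shift amounts:

```python
BITS = 8
MASK = (1 << BITS) - 1

# Model of the fold: (2**a << x) & 2**b  -->  (x == b - a) ? 2**b : 0,
# checked exhaustively for in-range shift amounts over an 8-bit type.
for a in range(BITS):
    for b in range(BITS):
        for x in range(BITS):                      # x < BITS: no over-shift
            masked = (((1 << a) << x) & MASK) & (1 << b)
            selected = (1 << b) if x == b - a else 0
            assert masked == selected, (a, b, x)
```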
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127801/new/
https://reviews.llvm.org/D127801