[PATCH] D123453: [InstCombine] Fold mul nuw+lshr to a single multiplication when the latter is a factor

Wed Apr 20 00:13:32 PDT 2022

bcl5980 added a comment.

  void test(unsigned* pa, unsigned* pb, unsigned short i)
  {
      unsigned a = i * 100;
      unsigned b = a >> 2;
      *pa = a;
      *pb = b;
  }

I wrote a test to check whether the backend can optimize the pattern when the one-use check is removed.
This is the AArch64 baseline, with the one-use check:

  // %bb.0:                               // %entry
  	and	w8, w2, #0xffff
  	mov	w9, #100
  	mul	w8, w8, w9
  	lsr	w9, w8, #2
  	str	w8, [x0]
  	str	w9, [x1]
  	ret
                                          // -- End function

This is the AArch64 result when we remove the one-use check:

  "?test@@YAXPEAI0G@Z":                   // @"?test@@YAXPEAI0G@Z"
  // %bb.0:                               // %entry
  	mov	w8, #100
  	and	w9, w2, #0xffff
  	mov	w10, #25
  	mul	w8, w9, w8
  	mul	w9, w9, w10
  	str	w8, [x0]
  	str	w9, [x1]
  	ret
                                          // -- End function

This is the X86 baseline, with the one-use check:

  "?test@@YAXPEAI0G@Z":                   # @"?test@@YAXPEAI0G@Z"
  # %bb.0:                                # %entry
  	movzwl	%r8w, %eax
  	imull	$100, %eax, %eax
  	movl	%eax, (%rcx)
  	shrl	$2, %eax
  	movl	%eax, (%rdx)
  	retq
                                          # -- End function

This is the X86 result when we remove the one-use check:

  "?test@@YAXPEAI0G@Z":                   # @"?test@@YAXPEAI0G@Z"
  # %bb.0:                                # %entry
  	movzwl	%r8w, %eax
  	imull	$100, %eax, %r8d
  	leal	(%rax,%rax,4), %eax
  	leal	(%rax,%rax,4), %eax
  	movl	%r8d, (%rcx)
  	movl	%eax, (%rdx)
  	retq
                                          # -- End function

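Without the one-use check, both targets emit a second multiply (or an equivalent lea sequence) in place of the single shift. It is not easy for the backend to recover the shift form, because it would have to loop over the users of the mul's operand 0 to find the related multiply.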
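As a sanity check on the fold itself (independent of the one-use question), the sketch below compares the multiply-then-shift form with the folded form for every 16-bit input, which is the input range of the test function above. The function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Original form from the test: multiply by 100, then logical shift right by 2. */
uint32_t mul_then_lshr(uint16_t i) {
    uint32_t a = (uint32_t)i * 100u; /* at most 6553500, so no unsigned wrap */
    return a >> 2;
}

/* Folded form: 100 == 25 << 2, so the shift cancels exactly. */
uint32_t folded_mul(uint16_t i) {
    return (uint32_t)i * 25u;
}

/* Hypothetical helper: counts the inputs on which the two forms disagree. */
unsigned count_mismatches(void) {
    unsigned mismatches = 0;
    for (uint32_t i = 0; i <= UINT16_MAX; ++i) {
        if (mul_then_lshr((uint16_t)i) != folded_mul((uint16_t)i))
            ++mismatches;
    }
    return mismatches;
}
```

Both forms agree on the whole range, so the only open question in this comment is cost: with a second user of the multiply, the folded form keeps two multiplies alive instead of one multiply and one shift.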

In D123453#3459579 <https://reviews.llvm.org/D123453#3459579>, @spatel wrote:

> 2. There's an existing fold for: `(X * C2) << C1 --> X * (C2 << C1)`
>
> ...and it does not check for one-use. For consistency, we probably don't want the one-use restriction here either. If there's already a multiply in the pattern before this transform, another one is probably fine? The backend could theoretically decompose it back to shift (but I don't think we have that transform currently).

Can we add a one-use check for that pattern as well? https://godbolt.org/z/x3bo7q54j
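For reference, here is a rough sketch of why the quoted shl fold needs no overflow flag while the lshr fold in this patch requires nuw. It uses 32-bit unsigned arithmetic in C, which wraps modulo 2^32. The function names and the constants 25, 100 and 2 are chosen for illustration only.

```c
#include <stdint.h>

/* (X * C2) << C1 with C2 = 25, C1 = 2. */
uint32_t mul_then_shl(uint32_t x) { return (x * 25u) << 2; }

/* X * (C2 << C1) = X * 100; equal to the above even when the product wraps. */
uint32_t shl_folded(uint32_t x) { return x * 100u; }

/* (X * 100) >> 2: bits lost to wraparound cannot be recovered by the shift. */
uint32_t mul_then_lshr32(uint32_t x) { return (x * 100u) >> 2; }

/* X * 25: only matches mul_then_lshr32 when X * 100 does not wrap (nuw). */
uint32_t lshr_folded(uint32_t x) { return x * 25u; }
```

For x = 50000000 the product x * 100 wraps, so mul_then_lshr32 and lshr_folded differ, while mul_then_shl and shl_folded still agree.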

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123453/new/

https://reviews.llvm.org/D123453