[PATCH] D150670: [WebAssembly] Disable generation of fshl/fshr for rotates

Paulo Matos via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue May 16 07:53:07 PDT 2023


pmatos added a comment.

Proposal to fix https://github.com/llvm/llvm-project/issues/62703

This not ready to land yet but it should open a discussion on if this is a good fix for this issue.
If we want to go this route we need: support for 64bits, tests.

A quick summary of why this is needed. The function :

  inline u32 bswap(u32 x) {
      return __builtin_rotateleft32((x & 0xFF00FF00), 8) 
          | __builtin_rotateright32((x & 0x00FF00FF), 8);
    }

generates really poor code in WebAssembly. A rotate should be generated but instead we get quite a large amount of code just to set up constants and perform a shift. The issue is that the rotateleft is lowered to a fshl(%and, %and, 8). During instcombine, the second argument is simplified since %and is the result of a bitwise operation. However, this results in the wasm backend not being able to generate the rotate since the first and second arguments to the fshl are not any longer the same. The generated code is:

  bswap:                                  # @bswap
  	.functype	bswap (i32) -> (i32)
  # %bb.0:                                # %entry
  	local.get	0
  	i32.const	24
  	i32.shl
  	local.get	0
  	i32.const	65280
  	i32.and
  	i32.const	8
  	i32.shl
  	i32.or
  	local.get	0
  	i32.const	8
  	i32.shr_u
  	i32.const	65280
  	i32.and
  	local.get	0
  	i32.const	24
  	i32.shr_u
  	i32.or
  	i32.or
                                          # fallthrough-return
  	end_function

With the fix this becomes:

  bswap:                                  # @bswap
  	.functype	bswap (i32) -> (i32)
  # %bb.0:                                # %entry
  	local.get	0
  	i32.const	16711935
  	i32.and
  	i32.const	8
  	i32.rotr
  	local.get	0
  	i32.const	-16711936
  	i32.and
  	i32.const	8
  	i32.rotl
  	i32.or
                                          # fallthrough-return
  	end_function

The alternative way to implement something like this would be to block the instcombine on fshl/fshr instructions for the wasm backend, however adding target dependent stuff to instcombine feels even more icky than this patch. I welcome suggestions on how to improve the patch, since the hack in `emitRotate` is also not great, and I have not seen other places lowering a generic builtin into a target specific intrinsic.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150670/new/

https://reviews.llvm.org/D150670



More information about the cfe-commits mailing list