[PATCH] D47980: [InstCombine] Fold (x << y) >> y -> x & (-1 >> y)

Mon Jun 11 10:01:39 PDT 2018

rampitec added a comment.

In general, two shifts are better for AMDGPU. Any immediate shift amount can be folded directly into the shift instruction, whereas the rather large mask produced by this change would require either an extra 4 bytes in the encoding or, even worse, a move and a register to hold it.
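For reference, the two forms being compared can be modeled in a few lines. This is only a sketch, assuming 32-bit unsigned values and a logical right shift; the helper names (two_shifts, and_with_mask, check_all_shift_amounts) are made up for illustration and are not part of the patch.

```python
MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit unsigned register


def two_shifts(x, y):
    """(x << y) >> y, with 32-bit wraparound and a logical right shift."""
    return ((x << y) & MASK32) >> y


def and_with_mask(x, y):
    """x & (-1 >> y), where -1 is the all-ones 32-bit value."""
    return x & (MASK32 >> y)


def check_all_shift_amounts(x):
    """True if both forms agree for every in-range shift amount 0..31."""
    return all(two_shifts(x, y) == and_with_mask(x, y) for y in range(32))


if __name__ == "__main__":
    x = 0xDEADBEEF
    print(check_all_shift_amounts(x))
    # For a constant shift amount, the mask is a wide constant,
    # while the shift amount itself is a small immediate.
    print(hex(MASK32 >> 8))
```

The two forms compute the same value; the point above is about what each one costs once the shift amount is a constant, since the mask form then carries a full-width constant instead of a small immediate.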

What's the rationale for the folding?

In addition, as the tests suggest, we would expect the pattern to be folded into a bfe instruction, but https://reviews.llvm.org/D48005 shows that the result is at best a "bfm" (with an extra register to hold the mask) and an "and". I.e., it basically shows a regression for our target. There would probably be no concern if the sequence were converted to a bfe as expected.

Repository:
  rL LLVM

https://reviews.llvm.org/D47980