[PATCH] D125750: [InstCombine] fold fake floating point vector extract to shift+trunc.

Tue May 24 06:33:45 PDT 2022

spatel added a comment.

In D125750#3533124 <https://reviews.llvm.org/D125750#3533124>, @jacquesguan wrote:

> In D125750#3525445 <https://reviews.llvm.org/D125750#3525445>, @spatel wrote:
>
>> Is there a motivating codegen example for this transform? Or some other IR transform that will fire as a result of this transform?
>>
>> In the case with a shift, we have an extra IR instruction, so this would be a rare fold that increases instruction count. Maybe that's justifiable just because we want to keep the symmetry with the other patterns, but it would be better to show some kind of win from this patch.
>
> Mostly, it is much cheaper to use a scalar shift + cast than a bitcast + vector extractelement, even though the former may need one more instruction in LLVM IR. For example, on RISC-V the former would be lowered to 3 scalar instructions, but the latter would first move the value from a GPR to a vector register and then use 2 vector instructions to extract the element. That is truly much more expensive, even without counting the vector configuration instruction that must be inserted before using vector instructions.

Yes, I understand the codegen motivation. I should have been more explicit though - that's generally not enough to justify an IR canonicalization if the backend could just as easily do this transform.

We really want to show that the change in IR leads to an improvement in analysis and/or results in even more optimization. Double-check, but we could probably add a test like this:
https://alive2.llvm.org/ce/z/77k-Zg
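
For readers following the thread, here is a minimal C++ model of the two forms being compared. It is only a sketch and not code from the patch. The function names are hypothetical, and it assumes a little-endian host, a 64-bit source integer, and 32-bit IEEE-754 floats. The first function models bitcast + extractelement; the second models shift + trunc + bitcast.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

// Models `bitcast i64 %x to <2 x float>` followed by `extractelement`.
// The lane order seen here depends on the host's endianness.
float extract_via_vector(std::uint64_t x, unsigned lane) {
    float lanes[2];
    std::memcpy(lanes, &x, sizeof lanes);
    return lanes[lane];
}

// Models `lshr` + `trunc` + scalar `bitcast`.
// The shift amount `32 * lane` matches the vector form on little-endian only.
float extract_via_shift(std::uint64_t x, unsigned lane) {
    std::uint32_t bits = static_cast<std::uint32_t>(x >> (32u * lane));
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Compares two floats by bit pattern so that NaN payloads also compare equal.
bool same_bits(float a, float b) {
    return std::memcpy(&a, &a, 0), std::memcmp(&a, &b, sizeof a) == 0;
}
```

On a little-endian host, both functions should return the same bit pattern for lane 0 and lane 1. For lane 0 the shift amount is zero, so the scalar form reduces to trunc + bitcast; only the nonzero-lane case carries the extra shift instruction.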

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125750