[PATCH] D111077: [LV] Support converting FP add to integer reductions.
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 7 02:38:04 PDT 2021
dmgreen added a comment.
In D111077#3040938 <https://reviews.llvm.org/D111077#3040938>, @fhahn wrote:
> In D111077#3040417 <https://reviews.llvm.org/D111077#3040417>, @dmgreen wrote:
>
>> Interesting idea. Are these two bits of code always the same?
>> https://godbolt.org/z/EfPKPTMdf
>
> I think both cases above should be the same. But I think we can construct slight variations where they would not be. E.g. consider a loop where the induction variable starts at 0 and is incremented and overflow is allowed. If `n` is negative, removing the loop and converting `n` to a float would yield a negative number, but the loop version would always return a positive number. I might be missing some subtleties when it comes to sign handling; perhaps @scanon has further thoughts.
Yep, I was ignoring the negative numbers :) I meant more about the general idea of converting the loop to straight line code.
>> Should we be doing this more generally, outside the vectorizing reductions?
>
> I think it might be worthwhile to convert such reductions outside the vectorizer in some cases. My motivation for starting in LV is that it should be clearly profitable if it allows vectorization. For general loops without vectorization, it might not be profitable I think, e.g. for loops that only execute once, due to the conversion overhead.
As far as I can tell from this code: https://godbolt.org/z/caPszPafr
The trace through when n==1 would be
cmp w1, #1
b.lt .LBB0_3
cmp w1, #1
b.ne .LBB0_4
mov w8, wzr
movi d0, #0000000000000000
b .LBB0_7
sub w8, w1, w8
fmov s1, #1.00000000
subs w8, w8, #1
fadd s0, s0, s1
b.ne .LBB0_8
ret
vs straight line code with no branches:
bic w8, w1, w1, asr #31
mov w9, #1266679808
scvtf s0, w8
fmov s1, w9
fminnm s0, s0, s1
ret
And that's not including vectorization. It's kind of like a "high cost expansion" from SCEV (though, to be fair, as far as I understand we wouldn't always rewrite high-cost exit values, even if doing so would mean deleting the loop (?)). That is what made me wonder whether we should be doing it generally, not just in the vectorizer. (Not that I have anything against this patch: it looks pretty sensible and doesn't complicate the reduction code any more than it already is. It seems to fit quite well.)
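For reference, here is a rough C++ sketch of what the two assembly sequences above appear to compute. The function names `loop_sum` and `closed_form` are made up for illustration, and the loop shape is an assumption read back from the assembly rather than copied from the godbolt link. The constant 1266679808 (0x4B800000) is the bit pattern of 2^24 = 16777216.0f, the point at which adding 1.0f to a float no longer changes it, and the `bic` with `asr #31` clamps a negative `n` to 0.

```cpp
#include <algorithm>

// Assumed shape of the original loop: one float add of 1.0f per iteration.
// For n <= 0 the loop body never runs and the result is 0.0f.
float loop_sum(int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += 1.0f;
    return s;
}

// Straight-line equivalent, mirroring the branch-free assembly:
//   bic    -> clamp negative n to 0
//   scvtf  -> convert the clamped integer to float
//   fminnm -> cap at 2^24, where repeated "+ 1.0f" stops changing the sum
float closed_form(int n) {
    float clamped = static_cast<float>(std::max(n, 0));
    return std::min(clamped, 16777216.0f);
}
```

Under these assumptions the two functions agree for every signed 32-bit `n`; a different induction variable or exit condition (such as the wrapping case mentioned above) would need a different clamp.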
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D111077/new/
https://reviews.llvm.org/D111077