[PATCH] D98708: [LoopVectorize] relax FMF constraint for FP induction

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 17 06:08:58 PDT 2021


dmgreen accepted this revision.
dmgreen added a comment.

In D98708#2631545 <https://reviews.llvm.org/D98708#2631545>, @spatel wrote:

> In D98708#2631295 <https://reviews.llvm.org/D98708#2631295>, @david-arm wrote:
>
>> Hi @dmgreen, yes of course you're right. I'd forgotten about the nsz requirement. It's definitely needed at compile time for vectorising FP reduction loops, i.e. `clang -freassociative-math -fno-trapping-math -fno-signed-zeroes`. I guess adding a check for nsz here is consistent with that?
>
> Yes - clang derived its requirements from gcc, so we've passed that into the optimizer in some places (instcombine at least). I don't know of any practical examples where you could have FP reassociation and still guarantee sign-of-zero, but maybe I'm not being imaginative. :)
> So currently there's no easy way (starting from C/C++ at least) to have IR that has `reassoc` without `nsz`.
>
> Ok if I push this change, so we're consistent within the vectorizer? Then, I'll push a follow-up (we'll need a pile of new regression tests) to add the `nsz` requirement for both induction and reduction. That way, we'll be conservatively correct in requiring the extra flag, and we'll match the expected IR coming out of clang.
>
> Note that the FMF requirements for fmul/fadd reduction/induction are different than the fmin/fmax patterns that we've also recently updated; fmin/fmax require `nnan` and `nsz` to rearrange, but not `reassoc` (since there's no FP math involved in those ops).

Yeah that sounds fine to me.

I was thinking more in terms of Alive - how reassoc might imply nsz or not, mathematically. It seems it would not general, but may be mistaken.

I'm not sure where nsz for a vanilla fadd reduction would be needed though. Reassoc might well be enough. For predication + inloop reductions (which no-one uses yet for floats) it would be important if the identity value was 0.0 and we generated:

  %x = select %cond, %load, {0.0, 0.0,...}
  %y = vecreduce.fadd(%x)
  %z = fadd %x, %phi

But if we change that to -0.0 instead, would there be another need of nsz?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98708/new/

https://reviews.llvm.org/D98708



More information about the llvm-commits mailing list