[PATCH] D45453: [InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMF

Wed Apr 11 12:00:24 PDT 2018

wristow added a comment.

Thanks for the clarifications Sanjay!

In https://reviews.llvm.org/D45453#1064271, @spatel wrote:

> ...
>  Yes. Example from fast-math.ll:
>
>   ; (X + C1) + C2 => X + (C1 + C2)
>   define float @fold5_reassoc(float %f1, float %f2) {
>   ; CHECK-LABEL: @fold5(
>   ; CHECK-NEXT:    [[ADD1:%.*]] = fadd reassoc float [[F1:%.*]], 9.000000e+00
>   ; CHECK-NEXT:    ret float [[ADD1]]
>   ;
>     %add = fadd float %f1, 4.000000e+00
>     %add1 = fadd reassoc float %add, 5.000000e+00
>     ret float %add1
>   }
>  
>
>
> So put something like that under the existing test to show the minimal FMF requirement. Or just replace the 'fast' in the existing test with 'reassoc'.
>
> > I have a concern that meaningful transformations based on the 'reassoc' flag alone may require both -instcombine and -reassociate.
>
> There could be some kind of symbiotic behavior here, but this is really saying that we don't have a well-defined division of labor between these passes, right?

Yes, a lack of a well-defined division of labor between these passes is exactly my concern.  I'll look into adding or modifying tests.  Depending on possible interaction, that may not be fruitful.  Ultimately, that should be addressed.

To be explicit, my immediate goal is to make reassociation "work" when unrelated FMF are disabled.  For example:

  float foo(float arg, float x) { return (arg + x) - x; }

should simply return `arg` when compiled at `-O2 -ffast-math -fno-reciprocal-math`.  But as it currently stands, since `-fno-reciprocal-math` makes `isFast()` false, disabling the reciprocal transformation incorrectly prevents the reassoication.

So although ultimately the -instcombine/-reassociate division of labor should be addressed, I'd prefer to make this small step in improving the FMF status without doing that more substantial piece of work.

>> So in a sense, an instruction that has only 'reassoc' set on it isn't complete, in that it cannot be generated by Clang.  I feel like this is an IEEE thing, rather than a C/C++ thing (so other language front-ends ought to do something similar).  But from this perspective, is checking only 'reassoc' OK?  Or is should we change tests that have only 'reassoc' on an instruction to also have 'nsz'?  Which then leads me to the problem/question that `-fno-trapping-math` doesn't have a bit in FMF, so it creates a new dimension to the problem/question.
>> 
>> Anyway, I took the interpretation that when the 'reassoc' bit is set, it in a sense means the equivalent of at least `-fassociative-math -fno-signed-zeros -fno-trapping-math` from a C/C++ command-line perspective.  I'm not sure how we should interpret this.  As I write this comment, I feel like I've opened a can of worms.
> 
> Hopefully, not. :)
> 
> We need to be careful here - just because clang is behaving some way doesn't mean other front-ends are doing the same. We defined the LLVM FMF (maybe not precisely enough in the LangRef) so 'reassoc' and 'nsz' are independent bits.

OK, I can certainly be on board with that.  My point about it feels like an IEEE thing rather than a C/C++ thing was to imply that other front-ends really should be doing the same.  But in any case, I do like that clarity of (for example) 'reassoc' and 'nsz' explicitly being understood to be independent bits (or more directly, that 'reassoc' alone isn't generally enough to enable reassoication).  Further, that clarification helps to keep the can of worms closed.

> ...
>  So a few ways forward:
> 
> 1. Easiest: change the FMF requirements in this patch to reassoc + nsz. We're confident that this code only deals with fadd/fsub, so { nnan , ninf , arcp , afn , contract } shouldn't be in play.
> 2. More involved: audit (and add tests for) all of the potential transforms enabled by FAddCombine and determine when only 'reassoc' is needed.
> 3. Ideal: audit instcombine and the reassociate pass and split up the transforms, so we don't have redundancies - or at least the redundancies are justified and documented. Note that this goes beyond just the FAddCombine chunk of code. We also do reassociation in instcombine under the name SimplifyAssociativeOrCommutative(), and I think FAddCombine() has redundant code for things that are handled there too.

Short term, I'll intend to do (1) -- this approach will let me meet my goal of making reassociation work when unrelated FMF are disabled (at least it will in some cases).  In doing that, I'll experiment with updating the tests in fast-math.ll, like you suggest.
Also in doing that, I'll look into (2) at least somewhat, to see if there are additional changes that clearly would go well as part of this sort of change.

Longer term, I'll look into (3).  Not sure when I'll get to that, though.

https://reviews.llvm.org/D45453