[PATCH] D45453: [InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMF

Wed Apr 11 07:31:31 PDT 2018

spatel added a comment.

In https://reviews.llvm.org/D45453#1063950, @wristow wrote:

> A couple of short comments/questions, and one longer one:
>
> In https://reviews.llvm.org/D45453#1063225, @spatel wrote:
>
> > ... the remaining big blob is FAddCombine() - which is poorly named because as shown in this patch, it's also called from visitFSub()...
>
>
> Should we rename it to FAddSubCombine() in this work (assuming this patch moves forward, given the other questions here)?

I'd prefer more specific like reassociateFAddFSub(), but I wouldn't bother unless we want to keep that code in its current form.

>> ... I'd still like to have dedicated tests in instcombine for whatever is happening in FAddCombine() though, so add to the tests from https://reviews.llvm.org/rL170471 to demonstrate at least some of the diffs without needing -reassociate?
> 
> To be sure I'm understanding, you mean demonstrate some diffs in those tests using the flag 'reassoc' rather than 'fast' (without needing -reassociate), right?

Yes. Example from fast-math.ll:

  ; (X + C1) + C2 => X + (C1 + C2)
  define float @fold5_reassoc(float %f1, float %f2) {
  ; CHECK-LABEL: @fold5(
  ; CHECK-NEXT:    [[ADD1:%.*]] = fadd reassoc float [[F1:%.*]], 9.000000e+00
  ; CHECK-NEXT:    ret float [[ADD1]]
  ;
    %add = fadd float %f1, 4.000000e+00
    %add1 = fadd reassoc float %add, 5.000000e+00
    ret float %add1
  }

So put something like that under the existing test to show the minimal FMF requirement. Or just replace the 'fast' in the existing test with 'reassoc'.

> I have a concern that meaningful transformations based on the 'reassoc' flag alone may require both -instcombine and -reassociate.

There could be some kind of symbiotic behavior here, but this is really saying that we don't have a well-defined division of labor between these passes, right?

> My longer question relates to:
> 
>> ... but if others are confident that it's just reassociation (without need for nsz/nnan/ninf), then I suppose this is ok to loosen up.
> 
> In a sense, I do feel we should check 'nsz' in addition to 'reassoc' (and maybe others).  And similarly, the tests in the other files that I modified in this patch possibly should only do the transformation if 'reassoc' and 'nsz' (and others?) are set.  But when starting with C/C++ (when `-ffast-math` isn't specified), we only will set 'reassoc' if some other switches in addition to `-fassociative-math` are passed.  Specifically:
>  `-fassociative-math -fno-signed-zeros -fno-trapping-math`
> 
> (details at https://reviews.llvm.org/D39812).
> 
> So in a sense, an instruction that has only 'reassoc' set on it isn't complete, in that it cannot be generated by Clang.  I feel like this is an IEEE thing, rather than a C/C++ thing (so other language front-ends ought to do something similar).  But from this perspective, is checking only 'reassoc' OK?  Or is should we change tests that have only 'reassoc' on an instruction to also have 'nsz'?  Which then leads me to the problem/question that `-fno-trapping-math` doesn't have a bit in FMF, so it creates a new dimension to the problem/question.
> 
> Anyway, I took the interpretation that when the 'reassoc' bit is set, it in a sense means the equivalent of at least `-fassociative-math -fno-signed-zeros -fno-trapping-math` from a C/C++ command-line perspective.  I'm not sure how we should interpret this.  As I write this comment, I feel like I've opened a can of worms.

Hopefully, not. :)

We need to be careful here - just because clang is behaving some way doesn't mean other front-ends are doing the same. We defined the LLVM FMF (maybe not precisely enough in the LangRef) so 'reassoc' and 'nsz' are independent bits.

I've made several recent changes to the FMF requirements in instsimplify/instcombine assuming that 'reassoc' does not imply anything beyond the ability to rearrange math ops. If the sign-of-zero doesn't match the IEEE requirements, then the transform needs 'nsz' too. Trapping isn't a concern because we assume a default FP environment (see recent edits to the LangRef).

So a few ways forward:

1. Easiest: change the FMF requirements in this patch to reassoc + nsz. We're confident that this code only deals with fadd/fsub, so { nnan , ninf , arcp , afn , contract } shouldn't be in play.
2. More involved: audit (and add tests for) all of the potential transforms enabled by FAddCombine and determine when only 'reassoc' is needed.
3. Ideal: audit instcombine and the reassociate pass and split up the transforms, so we don't have redundancies - or at least the redundancies are justified and documented. Note that this goes beyond just the FAddCombine chunk of code. We also do reassociation in instcombine under the name SimplifyAssociativeOrCommutative(), and I think FAddCombine() has redundant code for things that are handled there too.

https://reviews.llvm.org/D45453