[PATCH] D55506: [RFC v2] Allow target to handle STRICT floating-point nodes

Mon Jun 3 14:57:28 PDT 2019

andrew.w.kaylor added a comment.

In D55506#1527992 <https://reviews.llvm.org/D55506#1527992>, @uweigand wrote:

> In D55506#1527651 <https://reviews.llvm.org/D55506#1527651>, @andrew.w.kaylor wrote:
>
> > Why is the FPBarrierChain getting involved with the BarrierChain for memory objects? Even without this patch I think we might be entangling the strict FP nodes with other chained nodes more than we should.
>
>
> The BarrierChain tracks calls, volatile/atomic memory accesses, and UnmodeledSideEffects.  I do believe that FP exceptions must indeed be kept stable relative to those.
>
> My patch does *not* entangle FP exceptions with regular memory accesses, which do *not* touch BarrierChain.

I see. Thanks for the explanation.

> 
> 
>> Are there any cases where mayRaiseFPException should not imply hasUnmodeledSideEffects? If we checked mayRaiseFPException() inside the hasUnmodeledSideEffects() implementation a lot of these changes wouldn't be needed, and it might help with people who didn't think about FP exceptions in future changes.
> 
> Well, my last version did have FP exceptions implying UnmodeledSideEffects, and you didn't like that :-)   Specifically, that would mean that FP exceptions could not be moved across any normal memory access (since UnmodeledSideEffect instructions cannot) ...

Sorry about that. I forgot what I said before. I've just re-read my earlier comments, and I think that line of reasoning was sound. As I said before, hasUnmodeledSideEffects is -- at least in theory -- a stronger barrier than mayRaiseFPExceptions. Looking at your current patch, I had the impression that hasUnmodeledSideEffects and mayRaiseFPExceptions were always coinciding, but even if that is currently the case it isn't necessarily required to be so. I apologize for the noise.

>> There is no way to distinguish between fpexcept.maytrap and fpexcept.strict after pattern matching. We could probably live with that, but it's less than ideal.
> 
> Well, to be honest I don't really see what the difference between those two at the MI level would be.  Can you explain e.g. how scheduling restrictions ought to differ?  If we have a need for that, we can just add a second MI flag.

With regard to scheduling there is no difference between strict and maytrap. The difference comes when we want to eliminate instructions. For example, if we have something like this:

  BB1:
    FMP %0, %1
    JCC %BB2, 4
    JMP %BB3
  BB2:
    JMP %BB3
  BB3:
    ...

Under "fpexcept.strict" we cannot eliminate the compare because it may raise an exception (though we can still get rid of the JCC), but under "fpexcept.maytrap" we can eliminate the compare because our only promise is that we won't raise spurious exceptions.

>> I'd like to see a solution that could handle rounding mode also. For rounding mode it will be even more important to have more than a single on/off flag. Do you have any ideas for that?
> 
> As I mentioned in the description, I believe rounding modes can be handled completely in the back-end, by simply modeling the FPC register (which I already committed on SystemZ).  Every FP instruction (strict or not) is modeled as a user of FPC, all assembler instruction that modify it (which are only a few special-purpose instruction) are modeled as a definition, and function calls within a FENV_ACCESS section should be marked by the front-end as having a variant ABI, which will cause the back-end to have them marked as clobbering FPC  (that last part is still missing, but is completely independent of everything that is done by this patch).  What aspect of rounding modes do you think is not covered by this approach?

First, I'd like to avoid modeling the FP control and status registers when we aren't in a constrained mode. I haven't done any tests to measure the effect of modeling these registers, but it does place some constraints on the backend that don't need to be there in the unconstrained mode and at least has the potential to degrade performance.

Second, there are some additional optimizations we could make in the cases where we know the rounding mode. Constant folding is an example. I'm not sure there are any backend optimizations that do constant folding after instruction selection, but there was a review recently where this exact issue came up for constant folding during ISel and the conclusion was that we could handle it if we knew the rounding mode but if not we'd have to just block the folding.

Third, there are some architectures where the rounding mode can be included as an operand. This is a potentially confusing point so I'm going to be verbose. I've said before that the rounding mode in constrained instructions is only a descriptive hint to the optimizer, declaring what the rounding mode is at that point rather than a  prescriptive operator that sets the rounding mode. However, in the case where we do know the rounding mode (i.e. the rounding mode operand is something other than "round.dynamic") we can use that information to select an instruction form that has a rounding mode operand. How important or unimportant this capability is will depend on the architecture, but in some cases I think it could be significant.

With regard to the patch as a whole, let me say that my biggest concern at this point is to make sure we aren't going toward a dead end. If we think that all of the issues I've raised can likely be addressed in some variation of the patch you've proposed here then I would be in favor of committing this patch now without solving all of those problems. I just don't want to put something in place that will need to be ripped out later in order to make progress. Of course I understand that there's a balance to be found here. I know some of what we have already will need to be replaced, and that's just the way things go. That is to say, I just want to make sure we're headed in more or less the right direction.

Thanks for your patience and persistence.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55506/new/

https://reviews.llvm.org/D55506