[PATCH] D67105: [TargetLowering] Fix another potential FPE in expandFP_TO_UINT

Sun Sep 8 07:09:40 PDT 2019

uweigand added a comment.

In D67105#1661418 <https://reviews.llvm.org/D67105#1661418>, @kpn wrote:

> Well, I like it. The problem is solved, the pseudocode is I find a little easier to read, and the System/Z instruction sequence looks like a sideways move performance-wise.

Thanks for having a look.

> An outsider's view is that the X86 instructions look worse, but in the strict case I think correctness should win. If that strictness isn't required then targets can still use the old non-strict sequence. I am surprised the X86 instruction sequences are that different.

Note that the large differences in instruction sequences are really only in the vector-constrained-intrinsics.ll test, and those are all due to the fact that the test case uses a constant argument to the intrinsics, and  --as discussed above--  while we do not currently perform constant folding for constrained intrinsics, after expanding the intrinsic to the "strict" sequence, constant folding still occurs on part of that resulting sequence, and it just so happens that the "old" sequence allows for more such folding than the "new" sequence.  I had initially tried to implement a sort of "partial constant folding" that would get us to the same level as before, but given the discussion with Eli above, I now also think this doesn't make much sense: we really should do the full constant folding, and until such time as this is implemented, we can probably live with the lack of this (accidental) "partial" constant folding we had previously.

The difference in the fp-intrinsic.ll test, which doesn't use constant arguments and therefore didn't have this accidental partial folding to begin with, are more in line with the SystemZ changes, and appear to direcly correspond to the changes in IR.  But I'd certainly appreciate X86 experts giving their opinion on the quality of those sequences, in particular since on X86, we're using the "strict" expansion by default even for regular FP opcodes.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67105/new/

https://reviews.llvm.org/D67105