[PATCH] D143106: [SDAG] fix miscompiles caused by using ValueTracking matchSelectPattern to create FMINIMUM/FMAXIMUM

Thu Feb 2 07:09:53 PST 2023

spatel added a comment.

In D143106#4099017 <https://reviews.llvm.org/D143106#4099017>, @samparker wrote:

> It looks like it could cause some unexpected performance hiccups though... So, for the targets that would return the input NaN, do you think it would be worth implementing a target API for ValueTracking so that it could make a decision, or is this something that have to be sunk into the DAGBuilder? Are there any other uses of matchSelectPattern that could also be making a bogus decision?

I agree that this seems likely to cause some perf regressions based on the test diffs. If we can find a way to avoid those, that works for me, but it's such a mess, I figured we should just remove this hunk of wrong code as a first step.

Let me try to summarize where we stand:

1. The select pattern in IR guarantees that we would return a NaN input bitwise exactly. The FMINIMUM/FMAXIMUM intrinsics/nodes don't guarantee that, and as noted in D143056 <https://reviews.llvm.org/D143056>, the targets that implement this operation tend to return a canonical NaN rather than propagate a NaN payload. Given that, changing semantics in IR or SDAG to accommodate this transform does not seem viable.

2. The select pattern in IR differs in its treatment of -0.0, so we have potential for real numeric miscompiles as shown in the test diffs here. It seems likely that we will add even more MIN/MAX node variations to match target behavior, but I'm not sure what that behavior is yet, so I don't know what a good solution will be. As an example of an existing codegen semantic variation of min/max, note that x86 has target-specific FMAXC/FMINC nodes.

3. We could look for 'nsz' and 'nnan' on the IR to make the transform sound, but it requires a questionable combination of FMF on the fcmp and select: https://alive2.llvm.org/ce/z/vjV9AC . This might get better with the `nofpclass` attribute proposed in D139902 <https://reviews.llvm.org/D139902>.

4. I don't think we need to worry about SNaN with this transform because we're not using strict math -- strict variants of the MIN/MAX nodes already exist -- but I'm also not sure how that path works yet. I was ignoring that complication for now. There is an attempt to clarify SNaN behavior with a LangRef edit in D143074 <https://reviews.llvm.org/D143074> in case that part wasn't clear enough.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143106/new/

https://reviews.llvm.org/D143106