<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/xhtml; charset=utf-8">
</head>
<body>
<div style="font-family:sans-serif"><div style="white-space:normal">
<p dir="auto">On 28 Oct 2021, at 20:22, Kaylor, Andrew wrote:</p>
</div>
<div style="white-space:normal"><blockquote style="border-left:2px solid #3983C4; color:#3983C4; margin:0 0 5px; padding-left:5px"><p dir="auto">Hi everyone,<br>
<br>
This is related to the recent thread about fp-contract and front end pragma controls, but I want to generalize the discussion in terms of how the target-independent codegen in the backend is implemented.<br>
<br>
Until sometime in 2017 (I think) the fast-math flags were not passed through to the Selection DAG, and so the only ways to control floating-point behavior were through settings in the TargetOptions or by setting function attributes. Since 2017, however, the fast-math flags have been attached to floating-point nodes in the selection DAG. This leads to some ambiguous situations where the TargetOptions or function attributes can override the absence of fast-math flags on individual nodes. An example of this is the fp-contract setting. If a source file is compiled with clang using the '-ffp-contract=fast' setting but the file contains either "#pragma STDC FP_CONTRACT OFF" or "#pragma clang fp contract(off)" the front end will generate IR without the 'contract' fast-math flag set, but the X86 backend will generate an FMA instruction anyway.<br>
<br>
<a href="https://godbolt.org/z/dov6EcE8G">https://godbolt.org/z/dov6EcE8G</a><br>
<br>
This is particularly bad in the case of CUDA, because CUDA uses fp-contract=fast by default. So, the user's code can explicitly say "don't generate fma here" and the compiler will respond, "meh, I think I will anyway."<br>
<br>
<a href="https://godbolt.org/z/c4h1nK9M3">https://godbolt.org/z/c4h1nK9M3</a></p>
</blockquote></div>
<div style="white-space:normal">
<p dir="auto">I don’t think the argument about <code>-ffp-contract</code> is directly applicable to the implementation-level decision here. For better or worse, we can maintain the existing <code>-ffp-contract</code> semantics on top of any representation, e.g. by just marking all FP instructions as fast-contractable and dropping all the IR that would come from pragmas. Your core point is that the current representation allows unnecessary ambiguities that lead to bugs, and that seems pretty indisputable.</p>
</div>
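<div style="white-space:normal">
<p dir="auto">To make concrete why contraction is a semantic choice and not just an optimization detail, here is a minimal sketch in Python (chosen only for brevity; it is not LLVM code). It models a fused multiply-add as an exact multiply-and-add that is rounded once, and compares it with the ordinary form that rounds the product first. The helper names <code>mul_add_unfused</code> and <code>mul_add_fused</code> are made up for this illustration, and the exact-rational arithmetic is an assumption standing in for a hardware FMA.</p>
</div>

```python
from fractions import Fraction


def mul_add_unfused(a, b, c):
    # Two roundings: the product is rounded to double, then the sum is.
    return a * b + c


def mul_add_fused(a, b, c):
    # One rounding: evaluate a*b+c exactly with rationals, then round the
    # final result to double once. This stands in for a hardware FMA.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product, 1 - 2**-54, is not a double.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

print("unfused:", mul_add_unfused(a, b, c))
print("fused:  ", mul_add_fused(a, b, c))
```

<div style="white-space:normal">
<p dir="auto">With these inputs the unfused form produces exactly 0.0, because the product rounds to 1.0 before the add, while the fused form produces -2^-54. A user who writes a pragma to turn contraction off is asking for the first result, so a backend that fuses anyway changes the observable value.</p>
</div>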
<div style="white-space:normal"><blockquote style="border-left:2px solid #3983C4; color:#3983C4; margin:0 0 5px; padding-left:5px"><p dir="auto">There are other cases where the backend code will check for TargetOptions::UnsafeFPMath for things like reassociation that can be represented using fast-math flags.<br>
<br>
That brings me to the RFC part of my message. I'd like to start updating the backend so that it doesn't do things like this. As a general principle, I would say, "All semantics must be represented in the IR and the backend must respect the IR semantics." And a corollary: "Anything which can be represented at the instruction level must be represented at the instruction level." This corollary would eliminate potential conflicts between function attributes (like "unsafe-fp-math") and individual IR instructions.</p>
</blockquote></div>
<div style="white-space:normal">
<p dir="auto">It’s unclear to me whether you’re proposing this as a rule just for the Selection DAG or also for LLVM IR. Selection DAG is an internal IR of the backends, and strengthening rules there doesn’t have very many downsides. If you’re also proposing that LLVM IR should represent these with instruction-level rather than function-level flags, that’s going to be a lot more disruptive for frontends, because most frontends don’t need to support things like local pragmas and may not be setting instruction-level flags right now. That doesn’t mean we can’t still pursue that, but your proposal should be clear that that’s the goal.</p>
</div>
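<div style="white-space:normal">
<p dir="auto">The difference between the two readings can be sketched with a toy model, again in Python and not in terms of any real LLVM API. Everything here is hypothetical: <code>FPInst</code>, <code>may_contract_legacy</code>, <code>may_contract_strict</code> and <code>apply_function_level_default</code> are illustrative names only. The legacy check lets a global option override a missing per-instruction flag; the strict check looks only at the instruction; and the last helper shows what a frontend without local pragmas would have to start doing if function-level settings stopped being honored.</p>
</div>

```python
from dataclasses import dataclass, field


@dataclass
class FPInst:
    # Hypothetical stand-in for a floating-point instruction or DAG node.
    opcode: str
    flags: set = field(default_factory=set)


def may_contract_legacy(inst, global_fast):
    # Models the "back door": a global or function-level setting permits
    # fusion even when the instruction lacks the 'contract' flag.
    return global_fast or "contract" in inst.flags


def may_contract_strict(inst):
    # Models the proposed rule: only the per-instruction flag matters.
    return "contract" in inst.flags


def apply_function_level_default(insts, flag):
    # What a frontend without local pragmas could do instead of relying on
    # a function attribute: stamp the setting onto every FP instruction.
    for inst in insts:
        inst.flags.add(flag)
    return insts


# An fmul emitted inside a "contract off" region: no 'contract' flag.
pragma_off = FPInst("fmul")
print(may_contract_legacy(pragma_off, global_fast=True))
print(may_contract_strict(pragma_off))

# A frontend with a function-wide "fast" setting marks every instruction.
body = apply_function_level_default([FPInst("fmul"), FPInst("fadd")], "contract")
print(all(may_contract_strict(inst) for inst in body))
```

<div style="white-space:normal">
<p dir="auto">In this model the legacy check allows fusion for the pragma-off instruction while the strict check does not, which is the ambiguity under discussion. The stamping helper is the extra work a move to instruction-level flags in LLVM IR would push onto frontends.</p>
</div>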
<div style="white-space:normal"><blockquote style="border-left:2px solid #3983C4; color:#3983C4; margin:0 0 5px; padding-left:5px"><p dir="auto">As a first step toward this goal, I've prepared a patch which closes the back door for fp-contract control.<br>
<br>
<a href="https://reviews.llvm.org/D112760">https://reviews.llvm.org/D112760</a><br>
<br>
This patch is currently incomplete, inasmuch as I didn't update failing tests for several target architectures. I did update the X86 and AMDGPU tests to provide examples of how they can be made to work. I will fix the rest if we decide this is the correct direction. There is a failing CUDA test in the clang front end that I think will require a different approach involving some driver changes to get clang to generate IR for the semantics it intends rather than setting an option and counting on the backend to disregard the IR.</p>
</blockquote></div>
<div style="white-space:normal">
<p dir="auto">Yeah, Clang shouldn’t be producing inconsistent IR like that.</p>
<p dir="auto">John.</p>
</div>
</div>
</body>
</html>