<div dir="ltr"><div dir="ltr">I see that this approach is not supported, so I am trying to work out another solution.<div>Nevertheless, I'd like to address some of the comments, just for reference.</div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Oct 4, 2019 at 1:43 AM Kaylor, Andrew <<a href="mailto:andrew.kaylor@intel.com">andrew.kaylor@intel.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div class="gmail-m_-3151700063811608419WordSection1">
<p class="MsoNormal">I’d like to emphasize that the constrained intrinsics prevent optimizations *<b>by default</b>*. We have a plan to go back and teach individual optimizations how to handle these intrinsics. </p></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US"><div class="gmail-m_-3151700063811608419WordSection1"><p class="MsoNormal"> The idea is that if an optimization knows nothing
about the constrained intrinsics then it won’t try to transform them, but if an optimization has been taught to handle the intrinsics correctly then it isn’t limited by anything other than the semantics of the constraints. Once we’ve updated an optimization
pass, it will be able to do everything with a constrained intrinsic that has the “relaxed” settings (“fpexcept.ignore” and “fpround.tonearest”) that it would be able to do with the regular operation.</p></div></div></blockquote><div><br></div><div>This work is necessary for any approach, but for the current one it is vital. Because constrained intrinsics are used throughout the entire function body, the amount of code on which the solution must work both correctly and fast is larger. The performance drop makes this solution unsuitable for many users: they would not use it until performance comes close to that of code without constrained intrinsics. In contrast, basic block attributes limit the constrained intrinsics to only part of the function code, which would make it easier to bring the solution to a state suitable for production code.</div><div><br></div><div>Of course, when reasoning about performance, it would be nice to have numbers. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US"><div class="gmail-m_-3151700063811608419WordSection1"><p class="MsoNormal">
</p>
<p class="MsoNormal">This philosophy is key to the way that we’re approaching FPENV support. One of the primary goals is that any optimization that isn’t specifically aware of the mechanisms we’re using will automatically get conservatively correct behavior.
The problem with relying on basic block attributes is that it requires teaching all current optimizations to look for the attribute.</p></div></div></blockquote><div><br></div><div>All of these optimizations must eventually be modified in the current approach as well. If a transformation makes a dangerous instruction or basic block move, it must be taught to process constrained intrinsics correctly; otherwise it becomes a source of performance loss.</div><div><br></div><div>But you are right, implementing basic block attributes requires implementing a mechanism that checks the validity of instruction and basic block moves. Once it is implemented, finding the places where transformations require modification becomes simpler.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US"><div class="gmail-m_-3151700063811608419WordSection1"><p class="MsoNormal"><u></u><u></u></p>
<div dir="ltr" class="gmail_attr">On Fri, Oct 4, 2019 at 1:54 AM Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov">jdoerfert@anl.gov</a>> wrote:<br></div><p class="MsoNormal"><u></u></p><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The way I understood it, the constraint intrinsics are not the only<br>
problem but the regular ones can be. That is, optimizations will move<br>
around/combine/replace/... regular floating operations in the presence<br>
of constraint intrinsics because they do not impact each other (other<br>
than def-use). If that understanding is correct, and this is a problem,<br>
then I doubt that we want basic block attributes.</blockquote></div></div></blockquote><div> <br></div><div>Basic block attributes allow function code to be partitioned into realms, in each of which FP operations are represented either by constrained intrinsics or by regular nodes. Code that moves instructions checks whether a particular instruction is allowed to cross a realm boundary. This mechanism prevents constrained intrinsics from being mixed with regular FP nodes, but still allows optimizations such as inlining.</div><div><br></div><div><div dir="ltr" class="gmail_attr">On Thu, Oct 3, 2019 at 10:45 PM Doerfert, Johannes <<a href="mailto:jdoerfert@anl.gov">jdoerfert@anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 10/03, Serge Pavlov wrote:<br>> <br>
> Outlining is an interesting solution but unfortunately it is not an option<br>
> for processors for machine learning. Branching is expensive on them and<br>
> some processors do not have call instruction, all function calls must be<br>
> eventually inlined.<br>
<br>
Would "really late" inlining be an option?</blockquote><div><br></div><div>Late inlining means fewer optimization opportunities. If the resulting code represents a single function (as in the case of kernels), it is usually more profitable to inline early.</div></div><div><br></div><div>Thanks,</div><div>--Serge</div></div></div>