<div class="gmail_quote">On Sun, Apr 15, 2012 at 1:02 AM, Duncan Sands <span dir="ltr"><<a href="mailto:baldrick@free.fr">baldrick@free.fr</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Dmitry,<div class="im"><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

    The kinds of transforms I think can reasonably be done with the current<br>

    information are things like: x + 0.0 -> x; x / constant -> x * (1 / constant) if<br>

    constant and 1 / constant are normal (and not denormal) numbers.<br>


<br>

The particular definition is not that important, as the fact that this<br>

definition exists :) I.e. I think we need a set of transformations to be defined<br>

(as enum the most likely, as Renato pointed out) and an interface, which accepts<br>

"fp-model" (which is "fast", "strict" or whatever keyword we may end up) and the<br>

particular transformation and returns true or false, depending on whether the<br>

definition of fp-model allows this transformation or not. So the transformation<br>

would request, for example, if reassociation is allowed or not.<br>

</blockquote>

<br></div>

at some point each optimization will have to decide if it is going to be applied<br>

or not, so that's not really the point.  It seems to me that there are many many<br>

possible optimizations, and putting them all as flags in the metadata is out of<br>

the question.  What seems reasonable to me is dividing transforms up into a few<br>

major (and orthogonal) classes and putting flags for them in the metadata.<div class="im"><br></div></blockquote><div>The decision whether to apply an optimization should be based on a strict definition of what is allowed, not on each optimization's own interpretation of the "fast" fp-model (for example). Say that, after the "fast" fp-model has been widely adopted in the compiler, you suddenly realize that the definition is wrong and that allowing some type of transformation is a bad idea (because it is incompatible with some compiler, does not take some corner cases into account, or for whatever other reason). Then you'll have to go and fix one million places where the decision is made.</div>

<div><br></div><div>Alternatively, by defining classes of transformations and making each optimization query for a particular class of transformation, you keep it under control.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Another point, important from practical point of view, is that fp-model is<br>

almost always the same for any instructions in the function (or even module) and<br>

tagging every instruction with fp-model metadata is quite a substantial waste of<br>

resources.<br>

</blockquote>

<br></div>

I measured the resource waste and it seems fairly small.<div class="im"><br>

<br>

 So it makes sense to me to have a default fp-model defined for the<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

function or module, which can be overwritten with instruction metadata.<br>

</blockquote>

<br></div>

That's possible (I already discussed this with Chandler), but in my opinion is<br>

only worth doing if we see unreasonable increases in bitcode size in real code.</blockquote><div><br></div><div>What is reasonable is not defined by absolute numbers alone (0.8% or any other number). Does it make sense to increase bitcode size by 1% if it's used only by math library writers and a couple of other people who reeeeally care about precision *and* performance at the same time and are knowledgeable enough to restrict precision on particular instructions only? In my experience it is extremely rare for people to want more than compiler flags to control fp accuracy and to be ready to deal with pragmas (when they are available).</div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I also understand that clang generally derives GCC switches and fp precision<br>

switches are not an exception, but I'd like to point out that there's a far more<br>

orderly way of defining fp precision model (IMHO, of course :-) ), adopted by MS<br>

and Intel Compiler (-fp-model [strict|precise|fast]). It would be nice to have<br>

it adopted in clang.<br>

<br>

But while adding MS-style fp-model switches is different topic (and I guess<br>

quite arguable one), I'm mentioning it to show the importance of an idea of<br>

abstracting internal compiler fp-model from external switches<br>

</blockquote>

<br></div>

The info in the meta-data is essentially a bunch of external switches which<br>

will then be used to determine which transforms are run.<div class="im"><br>

<br>

 and exposing<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

a querying interface to transformations. Transformations shouldn't care about<br>

particular model, they need to know only if particular type of transformation is<br>

allowed.<br>

</blockquote>

<br></div>

Do you have a concrete suggestion for what should be in the metadata?<br></blockquote><div><br></div><div>I would define a set of transformations, such as (I can help with a more complete list if you prefer):</div><div><ul>

<li>reassociation</li><li>x+0.0 => x</li><li>x*0.0 => 0.0</li><li>x*1.0 => x</li><li>a/b => a*(1/b)</li><li>a*b+c => fma(a,b,c)</li><li>ignoring NaNs in compares, i.e. (a &lt; b) => !(a >= b)</li><li>value-unsafe transformations (for aggressive fp optimizations, like a*b+a*c => a*(b+c)) and others of that kind.</li>

</ul><div>and several aliases for the "strict", "precise" and "fast" models (which are effectively combinations of the flags above).</div></div><div><br></div><div>That way the metadata would be able to say "fast", "fast, but no fma allowed", or "strict, but fma allowed", i.e. the metadata should be a base level plus an optional set of adjustments from the list above.</div>
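<div><br></div><div>A minimal sketch of this idea in C++, to make the shape of the interface concrete. All names here (FPTransform, FPModel, isTransformAllowed) are hypothetical and not an existing LLVM API, and the contents of the "precise" alias in particular are an assumption made only for illustration. It also assumes a function-level default model that an instruction may override:</div>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not an existing LLVM interface.

// One bit per class of floating-point transformation from the list above.
enum FPTransform : std::uint32_t {
  Reassociation   = 1u << 0, // (a+b)+c => a+(b+c)
  AddZero         = 1u << 1, // x+0.0 => x
  MulZero         = 1u << 2, // x*0.0 => 0.0
  MulOne          = 1u << 3, // x*1.0 => x
  DivToReciprocal = 1u << 4, // a/b => a*(1/b)
  FuseMulAdd      = 1u << 5, // a*b+c => fma(a,b,c)
  IgnoreNaNs      = 1u << 6, // (a<b) => !(a>=b)
  ValueUnsafe     = 1u << 7  // a*b+a*c => a*(b+c)
};

using FPTransformSet = std::uint32_t;

// Aliases are just named flag combinations. Their exact contents are an
// assumption of this sketch, not a settled definition.
constexpr FPTransformSet Strict  = 0;
constexpr FPTransformSet Precise = MulOne;
constexpr FPTransformSet Fast    = 0xFFu;

// A model is a base level plus optional adjustments in either direction.
struct FPModel {
  FPTransformSet Base;
  FPTransformSet Enabled;  // classes allowed in addition to Base
  FPTransformSet Disabled; // classes forbidden even if Base allows them

  constexpr FPModel(FPTransformSet B, FPTransformSet E = 0,
                    FPTransformSet D = 0)
      : Base(B), Enabled(E), Disabled(D) {}

  constexpr FPTransformSet effective() const {
    return (Base | Enabled) & ~Disabled;
  }

  // The only question a transformation ever asks.
  constexpr bool allows(FPTransform T) const {
    return (effective() & T) != 0;
  }
};

// Function-level default, overridden by the instruction's model if present.
inline bool isTransformAllowed(const FPModel &FunctionDefault,
                               const FPModel *InstructionOverride,
                               FPTransform T) {
  const FPModel &M = InstructionOverride ? *InstructionOverride
                                         : FunctionDefault;
  return M.allows(T);
}

// Usage: "fast, but no fma allowed" and "strict, but fma allowed".
inline void checkExamples() {
  FPModel FastNoFma(Fast, 0, FuseMulAdd);
  FPModel StrictWithFma(Strict, FuseMulAdd);
  assert(FastNoFma.allows(Reassociation) && !FastNoFma.allows(FuseMulAdd));
  assert(StrictWithFma.allows(FuseMulAdd) && !StrictWithFma.allows(AddZero));
}
```

<div>With something like this, changing what "fast" means is a one-line change to the alias, and no optimization needs to be touched.</div>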

<div><br></div><div>And, again, I think this should be a function-level model, unless specified otherwise on the instruction, as that will be the case in 99.9999% of compilations.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

Ciao, Duncan.<br></blockquote><div> </div><div>Dmitry.</div></div>