<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<div class="moz-cite-prefix">On 4/24/13 10:56 AM, Chris Lattner
wrote:<br>
</div>
<blockquote
cite="mid:AE19DACE-53C4-40CD-9FFD-D6433536D080@apple.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div>Sorry for chiming in late on this thread. </div>
<div><br>
</div>
<div>MHO is that this is still better to do on IR than in codegen.
Of course, I'm also of the opinion that we should take
        Chandler's improvement as well (properly done, it's been long
enough ago that I forget the specific objections) because making
the compiler better shouldn't wait for some theoretical
improvement that never happens. More below.</div>
<div><br>
</div>
      On Apr 21, 2013, at 1:21 AM, Andrew Trick &lt;<a
        moz-do-not-send="true" href="mailto:atrick@apple.com">atrick@apple.com</a>&gt;
wrote:
<div>
<blockquote type="cite">
<div style="letter-spacing: normal; orphans: auto; text-align:
start; text-indent: 0px; text-transform: none; white-space:
normal; widows: auto; word-spacing: 0px;
-webkit-text-stroke-width: 0px;">This is different than
Chandler's case, because we know the MI "early" if-converter
doesn't currently handle Arnold's optimization. Also, Arnold
has not yet proposed any target level heuristics that
attempt to predict cpu behavior, which was the main
objection.<br>
</div>
</blockquote>
<div><br>
</div>
<div>This specific transformation (if converting stores) has two
goals: 1) improve micro architectural performance
characteristics (less branch prediction etc), and 2) enable
mid-level optimizations to remove loads and stores.</div>
<div><br>
</div>
<div>In my mental cost model, eliminating loads and stores is
always goodness: it is generally always good for performance,
and it unblocks other secondary optimizations. They are a
great canonicalization.</div>
<div><br>
</div>
<div>Because this is a canonicalization of this sort, it seems
clearly good to do on IR, and early. Doing something like
this at the codegen level specifically for micro-architectural
reasons could also make sense, but I don't see that
eliminating the usefulness of doing it early as well.</div>
</div>
</blockquote>
    Introducing a "select" at the IR level does not necessarily mean
    that CodeGen will lower the "select" to a predicated instruction
    like cmov.<br>
    cmov is not necessarily inexpensive; for example, on Pentium 4 the
    latency of cmov is about 10+ cycles. <br>
    If the compiler blindly converts a well-predicted branch to cmov on
    this platform, it will only see degradation. <br>
<br>
    That said, I think it makes some sense to force if-conversion at the
    IR level if the algorithm relies on straight-line code. <br>
<br>
<blockquote
cite="mid:AE19DACE-53C4-40CD-9FFD-D6433536D080@apple.com"
type="cite">
<div>
<div><br>
</div>
<blockquote type="cite">
<div style="letter-spacing: normal; orphans: auto; text-align:
start; text-indent: 0px; text-transform: none; white-space:
normal; widows: auto; word-spacing: 0px;
-webkit-text-stroke-width: 0px;">That said, we should have a
reason to if-convert before lowering other than optimizing
for a machine's cpu pipeline.<br>
<br>
Are we all convinced that if-converting a single store is
the proper canonical form?<br>
</div>
</blockquote>
<div><br>
</div>
I am, at least in this specific benchmark's case. You *can't*
legally do the if conversion if you are introducing a memory
      access that otherwise would not have happened. Doing this can have
lots of semantic effects. In order for this to be *legal* at
all (ignoring profitability) you have to prove that a subsequent
store is happening to the memory location.</div>
<div><br>
</div>
<div>In this case, the *profitability* comes down to being able to
obviously, locally, eliminate a load from the address. It's
true that GVN/PRE can eliminate part of the load in principle,
but in practice this doesn't happen.<br>
</div>
</blockquote>
    GVN gets rid of all the loads for the cases this if-cvt is trying to
    catch. <br>
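    <br>
    To make the legality and profitability points above concrete, here
    is a minimal C sketch of the shape being discussed. It is not taken
    from Arnold's patch: the function names are hypothetical, and it
    assumes the unconditional store sits just before the branch, which
    is what makes the location known to be written on every path. <br>

```c
/* Branchy form: *p is always written with a, then conditionally
 * overwritten with b. The final read is a load after the merge point. */
int store_branchy(int *p, int cond, int a, int b) {
    *p = a;          /* unconditional store: *p is written on every path */
    if (cond)
        *p = b;      /* the single conditional store */
    return *p;       /* load from the same address */
}

/* If-converted form: the two stores collapse into one unconditional
 * store of a select, and the load is replaced by the selected value.
 * No memory access is introduced that the branchy form did not have. */
int store_select(int *p, int cond, int a, int b) {
    int v = cond ? b : a;   /* the "select" */
    *p = v;                 /* one unconditional store */
    return v;               /* no load needed */
}
```

    Without the unconditional store to *p, the second form would write
    memory on a path where the original did not, which is the illegal
    case. Whether the select ends up as a cmov or as a branch is still
    up to CodeGen. <br>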
</body>
</html>