<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 9, 2016 at 8:07 AM, Philip Reames <span dir="ltr">&lt;<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<br>
<br>
<div>On 02/09/2016 06:57 AM, Jonas Wagner
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi,
<div><br>
</div>
<div>I'm coming back to this old thread with data about the
performance of NOPs. Recall that I was considering
transforming NOP instructions into branches and back, in order
to enable code dynamically. One use case for this was
enabling/disabling individual sanitizer checks (ASan, UBSan)
on demand.</div>
<div><br>
</div>
<div>I wrote a pass which takes an ASan-instrumented program
and replaces each ASan check with an
llvm.experimental.patchpoint intrinsic. This intrinsic inserts
a NOP of configurable size. It otherwise has no effect on the
program semantics. It does prevent some optimizations,
presumably because instructions cannot be moved across the
patchpoint.</div>
<div><br>
</div>
<div>Some results:</div>
<div>- On SPEC, patchpoints introduce an overhead of ~25%
compared to a version where ASan checks are removed.</div>
<div>- This is almost half of the cost of the checks themselves.</div>
<div>- The results are similar for NOPs of size 1 and 5 bytes.</div>
<div>- Interestingly, the results are similar for NOPs of 0
bytes, too. These are patchpoints that don't insert any code
and only inhibit optimizations. I've only tested this on one
benchmark, though.</div>
<div><br>
</div>
<div>To summarize, only part of the cost of NOPs is due to
executing them. Their effect on optimizations is significant,
too. I guess this would hold for branches and sanitizer checks
as well.</div>
</div>
</blockquote></span>
I don't think you can really draw strong conclusions from the
experiments you described. What you've ended up measuring is essentially
the impact of not optimizing across patchpoints at the check
locations. This doesn't really tell you much about what a check
(which is likely to inhibit optimization much less) costs over a nop
at the same position. <br>
<br>
One bit of data you could extract from the experiment as constructed
would be the relative cost of extra nops. You do mention that the
results are similar for sizes of 1 and 5 bytes, but "similar" is very vague
in this context. Are the results statistically indistinguishable?
Or is there a small but noticeable slowdown? (Numbers
would be great here.)</div></blockquote><div><br></div><div>In the same vein, try inserting 1, 2, 3, 4, 5, 6, ... nops and measuring the performance impact (the total size of the nops is also interesting, but is more difficult to measure reliably). I've used this kind of technique successfully in the past, e.g. for measuring the cost of "stat" syscalls on Windows. I call the technique "stuffing". Basically, plot the performance degradation as you insert more and more redundant stuff (e.g. 1 nop, 2 nops, 3 nops, etc.). If the result is a strong linear trend, you can pretty confidently extrapolate backward to the "0 nop" case to see the overhead of inserting 1 nop.</div><div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><span class=""><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Best,</div>
<div>Jonas</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div class="gmail_quote">
<div dir="ltr">On Thu, Jan 21, 2016 at 11:52 PM Jonas Wagner
&lt;<a href="mailto:jonas.wagner@epfl.ch" target="_blank">jonas.wagner@epfl.ch</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div>
<p style="margin:1.2em 0px!important">Hello,</p>
</div>
</div>
<div dir="ltr">
<div>
<blockquote style="margin:1.2em 0px;border-left-width:4px;border-left-style:solid;border-left-color:rgb(221,221,221);padding:0px 1em;color:rgb(119,119,119);quotes:none">
<blockquote style="margin:1.2em 0px;border-left-width:4px;border-left-style:solid;border-left-color:rgb(221,221,221);padding:0px 1em;color:rgb(119,119,119);quotes:none">
<p style="margin:1.2em 0px!important">There is
some data on this, e.g., in <a href="http://dslab.epfl.ch/proj/asap/#publications" target="_blank">“High System-Code Security
with Low Overhead”</a>. In this work we found
that, for ASan as well as other instrumentation
tools, most overhead comes from the checks.
Especially for CPU-intensive applications, the
cost of maintaining shadow memory is small.</p>
</blockquote>
<p style="margin:1.2em 0px!important">How did you
measure this? If it was measured by removing the
checks before optimization happens, then what you
may have been measuring is not the execution
overhead of the branches (which is what would be
eliminated by nop’ing them out) but the effect on
the optimizer.</p>
</blockquote>
</div>
</div>
<div dir="ltr">
<div>
<p style="margin:1.2em 0px!important">Interesting.
Indeed this was measured by removing some checks and
then re-optimizing the program.</p>
<p style="margin:1.2em 0px!important">I’m aware that
checks may have some impact on optimization. For
example, I’ve seen cases where much less inlining
happens because functions with checks are larger. Do
you know of other concrete examples? This is definitely
something I’ll have to be careful about. Philip
Reames confirms this, too.</p>
<p style="margin:1.2em 0px!important">On the other
hand, we’ve also found that the benefit from
removing a check is roughly proportional to the
number of cycles spent executing that check’s
instructions. Our model of this is not very precise,
but it shows that the cost of executing the check’s
instructions matters.</p>
<p style="margin:1.2em 0px!important">I'll try to
measure this, and will come back when I have data.</p>
<p style="margin:1.2em 0px!important">Best,<br>
Jonas</p>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
</span></div>
</blockquote></div><br></div></div>
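<div>A minimal sketch of the "stuffing" extrapolation described in the reply above: fit a straight line to runtime versus number of inserted nops, check how linear the trend is, and extrapolate back to the "0 nop" case. The function name <code>fit_line</code> and all the numbers are hypothetical and only illustrate the arithmetic; they are not measurements from this thread.</div>
<pre>
```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r_squared close to 1.0 indicates a strong linear trend.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# HYPOTHETICAL data: nops inserted per check site, and runtime in seconds.
nop_counts = [1, 2, 3, 4, 5, 6]
runtimes = [10.2, 10.4, 10.6, 10.8, 11.0, 11.2]

slope, baseline, r2 = fit_line(nop_counts, runtimes)

# The intercept is the extrapolated "0 nop" runtime; the slope is the
# estimated cost of each additional nop, including the first one.
print(f"estimated 0-nop runtime: {baseline:.3f} s")
print(f"estimated cost per nop:  {slope:.3f} s")
print(f"R^2 of the linear fit:   {r2:.4f}")
```
</pre>
<div>The extrapolation is only meaningful if the fit is close to linear, so it is worth checking R^2 (or simply looking at the plot) before trusting the intercept.</div>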