<div dir="ltr">Hi,<div><br></div><div>I'm coming back to this old thread with data about the performance of NOPs. Recalling that I was considering transforming NOP instructions into branches and back, in order to dynamically enable code. One use case for this was enabling/disabling individual sanitizer checks (ASan, UBSan) on demand.</div><div><br></div><div>I wrote a pass which takes an ASan-instrumented program, and replaces each ASan check with an llvm.experimental.patchpoint intrinsic. This intrinsic inserts a NOP of configurable size. It has otherwise no effect on the program semantics. It does prevent some optimizations, presumably because instructions cannot be moved across the patchpoint.</div><div><br></div><div>Some results:</div><div>- On SPEC, patchpoints introduce an overhead of ~25% compared to a version where ASan checks are removed.</div><div>- This is almost half of the cost of the checks themselves.</div><div>- The results are similar for NOPs of size 1 and 5 bytes.</div><div>- Interestingly, the results are similar for NOPs of 0 bytes, too. These are patchpoints that don't insert any code and only inhibit optimizations. I've only tested this on one benchmark, though.</div><div><br></div><div>To summarize, only part of the cost of NOPs is due to executing them. Their effect on optimizations is significant, too. I guess this would hold for branches and sanitizer checks as well.</div><div><br></div><div>Best,</div><div>Jonas</div><div><br></div><div><br></div><div><div class="gmail_quote"><div dir="ltr">On Thu, Jan 21, 2016 at 11:52 PM Jonas Wagner <<a href="mailto:jonas.wagner@epfl.ch">jonas.wagner@epfl.ch</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><p style="margin:1.2em 0px!important">Hello,</p></div></div><div dir="ltr"><div>

<blockquote style="margin:1.2em 0px;border-left-width:4px;border-left-style:solid;border-left-color:rgb(221,221,221);padding:0px 1em;color:rgb(119,119,119);quotes:none">

<blockquote style="margin:1.2em 0px;border-left-width:4px;border-left-style:solid;border-left-color:rgb(221,221,221);padding:0px 1em;color:rgb(119,119,119);quotes:none">

<p style="margin:1.2em 0px!important">There is some data on this, e.g, in <a href="http://dslab.epfl.ch/proj/asap/#publications" target="_blank">“High System-Code Security with Low Overhead”</a>. In this work we found that, for ASan as well as other instrumentation tools, most overhead comes from the checks. Especially for CPU-intensive applications, the cost of maintaining shadow memory is small.</p>

</blockquote>

<p style="margin:1.2em 0px!important">How did you measure this? If it was measured by removing the checks before optimization happens, then what you may have been measuring is not the execution overhead of the branches (which is what would be eliminated by nop’ing them out) but the effect on the optimizer.</p>

</blockquote>

</div></div><div dir="ltr"><div><p style="margin:1.2em 0px!important">Interesting. Indeed this was measured by removing some checks and then re-optimizing the program.</p>

<p style="margin:1.2em 0px!important">I’m aware of some impact checks may have on optimization. For example, I’ve seen cases where much less inlining happens because functions with checks are larger. Do you know other concrete examples? This is definitely something I’ll have to be careful about. Philip Reames confirms this, too.</p>

<p style="margin:1.2em 0px!important">On the other hand, we’ve also found that the benefit from removing a check is roughly proportional to the number of cycles spent executing that check’s instructions. Our model of this is not very precise, but it shows that the cost of executing the check’s instructions matters.</p><p style="margin:1.2em 0px!important">I'll try to measure this, and will come back when I have data.</p>

<p style="margin:1.2em 0px!important">Best,<br>Jonas</p>

<div title="MDH:SGVsbG8sPGJyPjxicj4mZ3Q7Jmd0OyBUaGVyZSBpcyBzb21lIGRhdGEgb24gdGhpcywgZS5nLCBp

biA8YSBocmVmPSJodHRwOi8vZHNsYWIuZXBmbC5jaC9wcm9qL2FzYXAvI3B1YmxpY2F0aW9ucyI+

IkhpZ2ggU3lzdGVtLUNvZGUgU2VjdXJpdHkgd2l0aCBMb3cgT3ZlcmhlYWQiPC9hPi4gSW4gdGhp

cyB3b3JrIHdlIGZvdW5kIHRoYXQsIGZvciBBU2FuIGFzIHdlbGwgYXMgb3RoZXIgaW5zdHJ1bWVu

dGF0aW9uIHRvb2xzLCBtb3N0IG92ZXJoZWFkIGNvbWVzIGZyb20gdGhlIGNoZWNrcy4gRXNwZWNp

YWxseSBmb3IgQ1BVLWludGVuc2l2ZSBhcHBsaWNhdGlvbnMsIHRoZSBjb3N0IG9mIG1haW50YWlu

aW5nIHNoYWRvdyBtZW1vcnkgaXMgc21hbGwuPGRpdj48YnI+Jmd0OyZuYnNwO0hvdyBkaWQgeW91

IG1lYXN1cmUgdGhpcz8gSWYgaXQgd2FzIG1lYXN1cmVkIGJ5IHJlbW92aW5nIHRoZSBjaGVja3Mg

YmVmb3JlIG9wdGltaXphdGlvbiBoYXBwZW5zLCB0aGVuIHdoYXQgeW91IG1heSBoYXZlIGJlZW4g

bWVhc3VyaW5nIGlzIG5vdCB0aGUgZXhlY3V0aW9uIG92ZXJoZWFkIG9mIHRoZSBicmFuY2hlcyAo

d2hpY2ggaXMgd2hhdCB3b3VsZCBiZSBlbGltaW5hdGVkIGJ5IG5vcCdpbmcgdGhlbSBvdXQpIGJ1

dCB0aGUgZWZmZWN0IG9uIHRoZSBvcHRpbWl6ZXIuPGJyPjxkaXY+PGJyPjwvZGl2PjxkaXY+SW50

ZXJlc3RpbmcuIEluZGVlZCB0aGlzIHdhcyBtZWFzdXJlZCBieSByZW1vdmluZyBzb21lIGNoZWNr

cyBhbmQgdGhlbiByZS1vcHRpbWl6aW5nIHRoZSBwcm9ncmFtLjwvZGl2PjxkaXY+PGJyPjwvZGl2

PjxkaXY+SSdtIGF3YXJlIG9mIHNvbWUgaW1wYWN0IGNoZWNrcyBtYXkgaGF2ZSBvbiBvcHRpbWl6

YXRpb24uIEZvciBleGFtcGxlLCBJJ3ZlIHNlZW4gY2FzZXMgd2hlcmUgbXVjaCBsZXNzIGlubGlu

aW5nIGhhcHBlbnMgYmVjYXVzZSBmdW5jdGlvbnMgd2l0aCBjaGVja3MgYXJlIGxhcmdlci4gRG8g

eW91IGtub3cgb3RoZXIgY29uY3JldGUgZXhhbXBsZXM/IFRoaXMgaXMgZGVmaW5pdGVseSBzb21l

dGhpbmcgSSdsbCBoYXZlIHRvIGJlIGNhcmVmdWwgYWJvdXQuIFBoaWxpcCBSZWFtZXMgY29uZmly

bXMgdGhpcywgdG9vLjwvZGl2PjxkaXY+PGJyPjwvZGl2PjxkaXY+T24gdGhlIG90aGVyIGhhbmQs

IHdlJ3ZlIGFsc28gZm91bmQgdGhhdCB0aGUgYmVuZWZpdCBmcm9tIHJlbW92aW5nIGEgY2hlY2sg

aXMgcm91Z2hseSBwcm9wb3J0aW9uYWwgdG8gdGhlIG51bWJlciBvZiBjeWNsZXMgc3BlbnQgZXhl

Y3V0aW5nIHRoYXQgY2hlY2sncyBpbnN0cnVjdGlvbnMuIE91ciBtb2RlbCBvZiB0aGlzIGlzIG5v

dCB2ZXJ5IHByZWNpc2UsIGJ1dCBpdCBzaG93cyB0aGF0IHRoZSBjb3N0IG9mIGV4ZWN1dGluZyB0

aGUgY2hlY2sncyBpbnN0cnVjdGlvbnMgbWF0dGVycy48L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2

PkJlc3QsPC9kaXY+PGRpdj5Kb25hczwvZGl2PjwvZGl2Pg==" style="min-height:0;width:0;max-height:0;max-width:0;overflow:hidden;font-size:0em;padding:0;margin:0"></div></div></div></blockquote></div></div></div>