[cfe-dev] [LLVMdev] proposal for exploiting undefined behavior much more aggressively
Dean Sutherland
dsutherland at cert.org
Mon Jul 30 07:51:30 PDT 2012
There's actually one possibly interesting use case where it's reasonable to have "exploiting undefined behavior" as a *goal*. Maximally perverse results from code that depends on undefined behavior can be useful for users who are attempting to stamp out undefined behavior in their code. This is only desirable, of course, when it's been explicitly requested by the user.
You can find an example of this in the GNAT Ada front end. Finding a safe elaboration order for static variables is undecidable, so Ada compilers use pretty-good heuristics to address the problem. Smart programmers also test their code in a mode where the compiler reverses its heuristic and *pessimizes* the elaboration order. When your code works both ways, you're in pretty good shape (on that issue, at least). One could imagine similar treatment for UB.
I rather expect that this use case isn't particularly interesting for Clang.
Dean F. Sutherland
On Jul 27, 2012, at 4:44 AM, Chandler Carruth wrote:
On Fri, Jul 27, 2012 at 1:35 AM, <annulen at yandex.ru> wrote:
27.07.12, 03:30, "Chris Lattner" <clattner at apple.com>:
>On Jul 26, 2012, at 9:58 AM, John Regehr wrote:
>> http://blog.regehr.org/archives/761
>
>It's an interesting post, but I'd like to point out that it is a non-goal for the project to be actively hostile to users of the compiler. :) It is useful to have debugging tools for people who really care, but 'exploiting' undefined behavior just for the sake of breaking code is a non-goal.
>
>A specific example is code like this (which is quite common):
>
>int ftoi(float F) {
> return *(int*)&F;
>}
>
>This is a violation of the C spec, due to type-based aliasing issues (the right approach is to use a union). That said, we go out of our way to not break this sort of idiom, because the intent is obvious to the compiler and breaking it would be actively hostile to a widely used pattern in dusty deck code.
>
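For readers who want the well-defined form of the conversion quoted above, here is a minimal sketch in C. The function names ftoi_memcpy and ftoi_union are made up for illustration and do not come from the thread. The sketch assumes a 32-bit float, and the resulting bit pattern depends on the target's floating-point format.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, as on mainstream targets. */
_Static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Hypothetical name. Copies the object representation of F into an
   integer; memcpy never violates the type-based aliasing rules. */
int32_t ftoi_memcpy(float F) {
  int32_t I;
  memcpy(&I, &F, sizeof I);
  return I;
}

/* Hypothetical name. Union-based punning, the approach suggested in the
   quoted message; this is allowed in C99 and later, but not in C++. */
int32_t ftoi_union(float F) {
  union { float f; int32_t i; } U;
  U.f = F;
  return U.i;
}
```

Compilers such as Clang and GCC typically lower the memcpy version to a single register move at normal optimization levels, so the well-defined form should not cost anything in practice.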
This behavior could be made optional, e.g. if someone has checked the code and found it UB-free he could allow aggressive UB exploiting.
I think we're approaching this a bit backwards.
We should first see if there is some non-trivial performance gain to be had by leveraging the undefined behavior. If so, we should then evaluate whether it is reasonable to hope for real world programmers to avoid such undefined behavior, and whether the code we have today is reasonably free of it.
If all three of these hold true (the first is actually the one I find least likely to be true in many cases), only then should we implement an optimization which leverages the UB, and we should first make some attempt to implement reasonable warnings and runtime checking of the UB to help users who run afoul of the optimization.
I think[1] that the primary issue is that it should never be the *goal* to exploit undefined behavior. The goal should be faster generated code, smaller generated code, or some other valuable thing for a compiler. Then, if the undefined behavior gives a particular opportunity to reach that goal, we should consider taking that opportunity. To simply willfully transform code with undefined behavior into ludicrous constructs is to put the cart before the horse.
[1]: Of course, perhaps Chris is thinking something else. ;] This is just my two cents.
-Chandler