[LLVMdev] load widening conflicts with AddressSanitizer

Duncan Sands baldrick at free.fr
Wed Jan 25 00:27:10 PST 2012


Hi Chandler,

> On Tue, Jan 24, 2012 at 1:36 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>     ASAN *can* be modified this way (it will actually make instrumentation ~10%
>     cheaper).
>     But this mode will miss some bugs that the current mode finds.
>     I've seen at least a couple of such *real* bugs.
>
>     And these bugs are not only about exploitability, but also about correctness.
>     If a program reads garbage, there is no simple way to statically prove that
>     this garbage does not affect the program's behavior.
>
>
> We could go back to my original proposed fix -- come up with a specific way to
> model a "read past the end" in the LLVM IR. Essentially capture the act of load
> widening in the IR. I'm imagining 'load i8* %ptr as i64'. This is a load that
> reads an i64 value from memory, but only fills the low 8 bits with a value, the
> high bits are undef.

How about moving the load widening transformation to the code generators
instead (which is in essence where you are moving it)?  Unlike your suggestion,
this would have the disadvantage that you wouldn't get "knock on" improvements
from the transform of the kind that the IR optimizers can do.  But are those
common/significant?
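
For concreteness, here is a rough C sketch of what the widened form amounts to
at the source level. The names sum_narrow and sum_widened are made up for
illustration, and this is not what any LLVM pass actually emits. It only shows
that the wide load touches six bytes the program never uses, which is the kind
of access a checker would report if those bytes lay past the end of the object.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical narrow form: reads exactly the two bytes the program uses. */
static uint32_t sum_narrow(const unsigned char *p) {
    return (uint32_t)p[0] + (uint32_t)p[1];
}

/* Hypothetical widened form: one 8-byte load, of which only the first two
 * bytes are ever used. The caller must guarantee that 8 bytes are readable
 * at p; the other six bytes play the role of the "undef" bits. */
static uint32_t sum_widened(const unsigned char *p) {
    uint64_t wide;
    unsigned char bytes[sizeof wide];

    memcpy(&wide, p, sizeof wide);      /* the single wide load */
    memcpy(bytes, &wide, sizeof wide);  /* byte view, independent of endianness */

    /* Only bytes 0 and 1 feed the result, so the rest cannot affect it. */
    return (uint32_t)bytes[0] + (uint32_t)bytes[1];
}
```

Both functions return the same value whatever the trailing six bytes hold, so
the extra bytes are read but never observed.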

Ciao, Duncan.

>
> Then the analysis knows that the program cannot rely on the value in the high
> bits, and does not flag an error. The code generator knows that the high bits
> can be undef, and can emit the widened load to memory as appropriate. We can
> teach the optimizers to *try* to transform C code forming these patterns into
> such a load for canonicalization, and we can provide a __builtin_....(...)
> syntax for explicitly performing such a load.
>
> PS: The syntax might equally well be "load i64* %ptr as i8"; it all depends on
> what the most natural way to form this pattern in IR is -- produce an i8 value
> or an i64 with undef high bits? require the pointer to be the wide type or the
> narrow type? I'd have to look at how these would play out in IR to tell what the
> best pattern is...



