[LLVMdev] load widening conflicts with AddressSanitizer

Tue Jan 24 13:53:41 PST 2012

On Tue, Jan 24, 2012 at 1:36 PM, Kostya Serebryany <kcc at google.com> wrote:

> ASAN *can* be modified this way (it will actually make instrumentation
> ~10% cheaper).
> But this mode will miss some bugs that the current mode finds.
> I've seen at least a couple of such *real* bugs.
>
> And these bugs are not only about exploitability, but also about
> correctness.
> If a program reads garbage, there is no simple way to statically prove
> that this garbage does not affect the program's behavior.
>

We could go back to my original proposed fix -- come up with a specific way
to model a "read past the end" in the LLVM IR. Essentially, capture the act
of load widening in the IR. I'm imagining 'load i8* %ptr as i64'. This is a
load that reads an i64 value from memory but only fills the low 8 bits
with a value; the high bits are undef.

Then the analysis knows that the program cannot rely on the value in the
high bits, and does not flag an error. The code generator knows that the
high bits can be undef, and can emit the widened load from memory as
appropriate. We can teach the optimizers to *try* to transform C code
forming these patterns into such a load for canonicalization, and we can
provide a __builtin_....(...) syntax for explicitly performing such a load.
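
To make the intended semantics concrete, here is a rough C sketch of what
'load i8* %ptr as i64' would mean. It is only a model, not proposed syntax or
an existing API: the names widened_load, load_i8_as_i64 and observed_value
are made up for illustration, and the sketch assumes the caller guarantees
that all 8 bytes are addressable (which is exactly what a real widened load
cannot promise).

```c
#include <stdint.h>

/* Hypothetical model of the proposed "load i8* %ptr as i64":
 * all 8 bytes are read, but only the low 8 bits are defined. */
typedef struct {
    uint64_t bits;    /* raw result of the wide read */
    uint64_t defined; /* mask of the bits the program may rely on */
} widened_load;

/* Reads 8 bytes starting at p and assembles them in little-endian order,
 * so byte p[0] lands in the low 8 bits regardless of the host's endianness.
 * Assumption: the caller guarantees that p[0..7] are all addressable. */
static widened_load load_i8_as_i64(const unsigned char *p) {
    widened_load r = {0, 0xFFu};
    for (int i = 0; i < 8; ++i)
        r.bits |= (uint64_t)p[i] << (8 * i);
    return r;
}

/* The only value the program is allowed to observe: the defined low bits.
 * Whatever was read into the high 56 bits is masked away. */
static uint8_t observed_value(widened_load w) {
    return (uint8_t)(w.bits & w.defined);
}
```

Two buffers that agree in their first byte but differ in the following seven
give the same observed_value, even though the raw bits differ. That is the
property a checker could use to avoid flagging the wide read.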

PS: The syntax might equally well be 'load i64* %ptr as i8'; it all depends
on what the most natural way to form this pattern in IR is -- produce an i8
value or an i64 with undef high bits? Require the pointer to be the wide
type or the narrow type? I'd have to look at how these would play out in IR
to tell what the best pattern is...