[LLVMdev] load widening conflicts with AddressSanitizer
John Criswell
criswell at illinois.edu
Tue Jan 24 13:08:15 PST 2012
On 1/24/12 2:31 PM, Duncan Sands wrote:
> Hi Kostya,
>
>> As far as I can see the C and C++ standards are not relevant. ASAN works on
>> LLVM IR, not on C or C++. Lots of different languages have LLVM frontends. I
>> personally turn Ada and Fortran into LLVM IR all the time for example. Clearly
>> the C standard is not relevant to LLVM IR coming from such languages. What
>> matters is how LLVM IR is defined. As far as I know this construct is perfectly
>> valid in LLVM IR.
The issue here is that a load that reads data past the end of an alloca
can occur at the LLVM IR level in one of three ways:
1) Because the program at the original source-code level does it and is
incorrect.
2) Because the program at the original source-code level does it and is
correct (although that must be a pretty wacky language).
3) Load-widening introduces it when processing loads from allocas that
are properly aligned.
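To make case 3 concrete, here is a minimal C sketch of what widening does to a pair of narrow loads. The function names and the 3-byte object are hypothetical, and the memcpy only stands in for a single wide load; this is not output captured from LLVM. The caller is assumed to provide 4 readable bytes, which models the padding that follows a 4-byte-aligned alloca.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { OBJECT_SIZE = 3, WIDE_LOAD_SIZE = 4 };

/* Narrow version: two 1-byte loads, both inside the 3-byte object. */
unsigned sum_narrow(const unsigned char *obj) {
    assert(obj != NULL);
    return (unsigned)obj[0] + (unsigned)obj[OBJECT_SIZE - 1];
}

/* Widened version: one 4-byte load covering bytes 0..3, followed by
   byte extraction. Byte 3 lies past the end of the 3-byte object; it
   is read but its value is never used. */
unsigned sum_widened(const unsigned char *storage) {
    unsigned char wide[WIDE_LOAD_SIZE];
    assert(storage != NULL);
    memcpy(wide, storage, WIDE_LOAD_SIZE); /* stands in for one wide load */
    return (unsigned)wide[0] + (unsigned)wide[OBJECT_SIZE - 1];
}
```

Both functions return the same value whatever byte 3 holds, which is why the widened load is harmless to the program yet looks like an out-of-bounds read to a checker.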
As things stand today, an analysis cannot look at the LLVM IR alone and
know which of these three cases is causing the load to read data past the
end of the memory object. As such, tools like SAFECode and ASAN don't know
when to relax their run-time checks to permit such out-of-bounds reads;
they either have to relax them for all such loads (in which case a bug in
the C source code might slip through), or they have to report every such
load (and report false positives for correct C programs).
I assume Kostya's new attribute is a way to permit the LLVM IR to
specify whether such an out-of-bounds read is intentional or not.
I don't think we should bother with an attribute.
Load-widening does not introduce exploitable code into the
program on commonly-used machines and operating systems(*), and
incorrect C source code that exhibits identical
behavior isn't exploitable, either. SAFECode can be enhanced so that
the run-time checks for loads relax their guarantees for aligned allocas
that are subject to load-widening; I imagine ASAN can be similarly modified.
We won't catch some bugs in C/C++ code, but that's a natural consequence
of deciding to permit certain out-of-bounds loads at the LLVM IR level,
IMHO.
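As a rough sketch of what such a relaxed load check could look like (hypothetical function names and parameters, not actual SAFECode or ASAN code), assume the checker knows the object's size and alignment and that a widened load never reads beyond the object's size rounded up to that alignment:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Strict check: the access [offset, offset + width) must lie entirely
   inside the object. */
bool strict_check(size_t obj_size, size_t offset, size_t width) {
    return offset <= obj_size && width <= obj_size - offset;
}

/* Relaxed check for loads: additionally accept a load that starts
   inside the object and ends no later than the object's size rounded
   up to its alignment. Assumes align is a power of two and that
   obj_size + align does not overflow. */
bool relaxed_load_check(size_t obj_size, size_t align,
                        size_t offset, size_t width) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t padded = (obj_size + align - 1) & ~(align - 1);
    if (strict_check(obj_size, offset, width))
        return true;
    return offset < obj_size && width <= padded - offset;
}
```

With a 3-byte object aligned to 4, a 4-byte load at offset 0 passes the relaxed check but an 8-byte load does not. A source-level bug that reads the padding byte passes as well, which is the trade-off described above; stores would keep using the strict check.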
My two cents.
-- John T.
(*) All bets are off for unconventional systems, though.
>>
>>
>> Asan will not work for Fortran and Ada anyway (at least, out of the box).
>> I am not even sure that anything like asan is needed for Ada (it has bounds
>> checking built-in, the dynamic memory allocation is much more restrictive).
>> The tool is rather specific to C/C++ (and ObjectiveC probably, although we have
>> almost no tests for ObjectiveC, nor much knowledge in it).
>> Yes, the IR transformations are done on the LLVM level, but the asan run-time
>> library heavily depends on the C/C++ semantics and even implementation,
>> and you can't really separate the asan instrumentation pass from the run-time.
> it's pretty disappointing to hear that asan is basically just for C. But since
> it is, I won't bother you anymore about this attribute (though I still don't
> like it much).
>
> Ciao, Duncan.