[llvm-dev] BoundsChecking Pass

Nuno Lopes via llvm-dev llvm-dev at lists.llvm.org
Mon May 23 15:06:35 PDT 2016


>> It's true there's little documentation about it (only mentioned in: 
>> http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#available-checks). 
>> You can run it with 'clang -fsanitize=bounds' or 'opt -bounds-checking'.
>> The BoundsChecking pass, AddressSanitizer and BaggyBoundsCheck are all 
>> different code bases, each exploring a different set of tradeoffs.  The 
>> goal of the BoundsChecking pass was that the runtime penalty should be 
>> low enough to enable usage in production.
>>
>> Some information about the BoundsChecking pass:
>> - It is intra-procedural only. If you dereference a pointer that was 
>> passed as argument, then it is not checked (with some exceptions).
>> - It supports heap allocations, provided that these allocations are done 
>> using 1) standard functions that LLVM recognizes (malloc, new, strdup, 
>> etc.) or 2) functions annotated with alloc_size 
>> (https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html)
>> - It's helpful to compile with -O2, otherwise the pass will get confused 
>> very quickly.  The design of the analysis assumes at least a few 
>> simplifications were done before.
>
>
> OK, I just compiled it with -O2 and the heap-overflow protection was 
> triggered. However, I don't know which simplifications are required for 
> the pass to run correctly.

Most optimizations/analyses expect SROA (mem2reg) to be run, otherwise the 
IR is too messy to analyze. InstCombine also does nice cleanups.  These two 
are always a good idea to run, at least.
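
To make the quoted points about intra-procedural checking and alloc_size 
concrete, here is a minimal C sketch. The names my_alloc, sum_local and 
read_at are hypothetical and exist only for illustration. Whether a given 
access actually gets a guard depends on the LLVM version and on which 
simplifications ran first, so treat the comments as the expected behaviour 
under the description above, not as a guarantee.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator wrapper. alloc_size(1) states that the returned
   object has as many bytes as the first argument. */
__attribute__((alloc_size(1)))
static void *my_alloc(size_t n) {
    return malloc(n);
}

/* Allocation and accesses live in the same function, so the object size
   (count * sizeof(int)) is visible to an intra-procedural analysis. */
static int sum_local(size_t count) {
    int *a = my_alloc(count * sizeof(int));
    if (a == NULL)
        return -1;
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        a[i] = (int)i;   /* candidate for a guard: size is known here */
        sum += a[i];
    }
    free(a);
    return sum;
}

/* `p` arrives as an argument, so its object size is generally unknown
   inside this function and the access would typically go unchecked. */
static int read_at(const int *p, size_t i) {
    return p[i];
}
```

Such a file could be built with the command mentioned earlier in the thread, 
e.g. 'clang -O2 -fsanitize=bounds', to see which accesses receive guards.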


>> - Sometimes LLVM transforms loops into intrinsics, like memcpy or memset. 
>> Right now these are not checked (though they should be).
>> - Guards are mostly not hoisted out of loops by LLVM; this needs 
>> improvement, otherwise performance may suffer quite a bit.
>
> Are you still working on it? If so, what are you trying to do? 
> I would like to work on this pass during the summer (until the end of 
> August). It would be great if you could guide me a little =)

I'm not actively working on it at the moment, but I'm still interested.
I can certainly provide guidance and review patches.


>> - The analysis code is in lib/Analysis/MemoryBuiltins.cpp
>
> I have a question on this. As I read the code, I was wondering how the 
> run-time part is implemented. I was looking for something like a 
> redefinition of the malloc and free functions, but I found nothing of the 
> kind. Now I'm wondering whether it all comes down to what the 
> ObjectSizeOffsetEvaluator class does at run time? That class is used to 
> get the size and offset of the current array pointer.

No, malloc/free functions are not redefined. LLVM simply knows that 
'malloc(x)' returns an object with size 'x' and offset 0. It then has to 
propagate this information all over (think of fat pointers).  An alternative 
would be, as you say, to replace malloc/free, and this is in fact the 
approach taken by the two other passes you mentioned.  BoundsChecking has no 
runtime; everything needed is inlined within the user's code.
ObjectSizeOffsetEvaluator builds expressions to give you the object 
size/offset at any given location.  For example, if you have a loop 
iterating over a pointer variable, this class will create code that tracks 
how object size/offset evolves throughout the loop iterations.
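
As a rough illustration of the fat-pointer idea, the C sketch below models 
the (size, offset) pair by hand. It is not the pass's implementation: the 
type tracked_ptr and all function names are hypothetical, offsets are kept 
unsigned for simplicity, and the real pass emits the equivalent comparisons 
inline as IR instead of calling helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of what is tracked for one pointer. */
typedef struct {
    char  *base;    /* start of the underlying object */
    size_t size;    /* object size in bytes */
    size_t offset;  /* byte offset of the pointer into the object */
} tracked_ptr;

/* malloc(x) yields an object with size x and offset 0. */
static tracked_ptr tracked_malloc(size_t x) {
    tracked_ptr p = { malloc(x), x, 0 };
    return p;
}

/* Pointer arithmetic changes only the offset; the size is unchanged. */
static tracked_ptr tracked_add(tracked_ptr p, size_t bytes) {
    p.offset += bytes;
    return p;
}

/* Guard evaluated before an access of `width` bytes. */
static bool access_in_bounds(tracked_ptr p, size_t width) {
    return p.offset <= p.size && width <= p.size - p.offset;
}

/* A loop stepping a pointer through an object: the offset evolves on
   every iteration and the guard is re-evaluated each time. Returns the
   number of accesses that pass the guard. */
static size_t count_checked_accesses(size_t object_size, size_t stride) {
    assert(stride > 0);
    tracked_ptr p = tracked_malloc(object_size);
    if (p.base == NULL)
        return 0;
    size_t n = 0;
    while (access_in_bounds(p, stride)) {
        n++;
        p = tracked_add(p, stride);
    }
    free(p.base);
    return n;
}
```

With a 10-byte object and 4-byte accesses, the guard passes at offsets 0 and 
4 and fails at offset 8, where only 2 bytes remain.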


Nuno 


