[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests

Mon Nov 26 15:24:09 PST 2012

Hi Kevin,

Thanks for your interest and for your deep analysis.

Unfortunately, your approach doesn't catch all bugs and is vulnerable to an 
attack.
Consider the following case:

 ...................... | ----- obj --- | |
  end ^                        ptr ^   ^ end-of-memory

The scenario is as follows:
 - an object is allocated in the last page of the address space
 - obj is byte addressable (e.g., a char buffer)
 - ptr points to the last few bytes of the address space (with a large 
offset, but starting in bounds)
 - the information read/written is large and therefore there's an overflow 
in the memory addresses that are accessed.

In this case, you'll have ptr > lowerbound and end < upperbound. The bad 
part is that end < ptr.

I thought a bit on the subject and I couldn't find any solution with just 2 
branches.
The exception I found is if the size of an allocated object is always 
smaller than half of the address space (which seems a reasonable assumption 
to me, but I didn't discussed it with anyone else).

BTW, LLVM is supposed to be able to eliminate one of the comparisons on x86, 
since we can take advantage of the overflow flag that is set by the 'sub' 
instruction.  Last time I checked (in July), LLVM was not performing this 
optimization, but it may have been fixed in the meantime. There are still 3 
jumps, though.

Nuno

----- Original Message ----- 
From: "Schoedel, Kevin P" <kevin.p.schoedel at intel.com>
To: <llvmdev at cs.uiuc.edu>
Sent: Monday, November 26, 2012 3:56 PM
Subject: [LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests

>I am investigating changing BoundsChecking to use address-based rather
> than size- & offset-based tests.
>
> To explain, here is a short code sample cribbed from one of the tests:
>
>  %mem = tail call i8* @calloc(i64 1, i64 %elements)
>  %memobj = bitcast i8* %mem to i64*
>  %ptr = getelementptr inbounds i64* %memobj, i64 %index
>  %4 = load i64* %ptr, align 8
>
> Currently, the IR for bounds checking this load looks like this:
>
>  %size = mul i64 8, %elements
>  %offset = mul i64 %index, 8
>  %objsize = sub i64 %size, %offset
>
>  %cmp2 = icmp ult i64 %size, %offset
>  %cmp3 = icmp ult i64 %objsize, 8
>  %cmp1 = icmp slt i64 %offset, 0
>
>  %9 = or i1 %cmp2, %cmp3
>  %11 = or i1 %cmp1, %9
>  br i1 %11, label %trap, label %12
>
>                      ┆       ┆
>                      │       │
>    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶
>    ↑                 ┃       ┃              ↑      ↑
>    ·                 ┇       ┇              ·      ·
>    ·                 ┃       ┃              ·      ·
>    ·        ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶    ObjSize  ·
>    ·        ↑        ┃       ┃       ↑      ·      ·
>    ·     data item   ┃       ┃  NeededSize  ·    Size
>    ·        ↓        ┃       ┃       ↓      ↓      ·
>    ·        ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶      ·
>    ·                 ┃       ┃              ↑      ·
>  memory object       ┇       ┇            Offset   ·
>    ↓                 ┃       ┃              ↓      ↓
>    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶
>                      │       │
>                      │       │  Cmp1: Offset ≥ 0 (unless constant)
>                      │       │  Cmp2: Size ≥ Offset
>                      ┆       ┆  Cmp3: ObjSize ≥ NeededSize
>
>
> I am looking at generating IR like this:
>
>  %upperbound = getelementptr inbounds i64* %memobj, i64 %elements
>  %end = getelementptr inbounds i64* %ptr, i64 1
>
>  %ilowerbound = ptrtoint i64* %memobj to i64
>  %iupperbound = ptrtoint i64* %upperbound to i64
>  %iptr = ptrtoint i64* %ptr to i64
>  %iend = ptrtoint i64* %end to i64
>
>  %cmpl = icmp ult %iptr, %ilowerbound
>  %cmpu = icmp ult %iupperbound, %iend
>
>  %9 = or i1 %cmpl, %cmpu
>  br i1 %11, label %trap, label %12
>
>                      ┆       ┆
>                      │       │
>    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┢━━━━━━━┪╶╶╶╶╶UpperBound
>    ↑                 ┃       ┃
>    ·                 ┇       ┇
>    ·                 ┃       ┃
>    ·        ╴╴╴╴╴╴╴╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶End
>    ·        ↑        ┃       ┃
>    ·     data item   ┃       ┃
>    ·        ↓        ┃       ┃
>    ·        ╴╴╴╴Ptr╴╴┣━━━━━━━┫╶╶╶╶╶╶╶╶╶╶╶╶Ptr
>    ·                 ┃       ┃
>  memory object       ┇       ┇
>    ↓                 ┃       ┃
>    ╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┡━━━━━━━┩╶╶╶╶╶LowerBound
>                      │       │
>                      │       │  Cmp1: Ptr ≥ LowerBound
>                      ┆       ┆  Cmp2: UpperBound ≥ End
>
> Potential advantages that I see for address-based tests:
> - Always performs two comparisons, rather than (sometimes) three;
> - No need to recompute Offset and ObjSize for varying Ptr within the
>  same memory object;
> - getelementptr, for UpperBound and End, is typically very cheap on
>  contemporary processors (and ptrtoint typically free).
>
> I have prototyped address-based testing (with one wart that makes it
> not production grade) but not done benchmarking and analysis. Before
> I do that, I'm looking for any reasons that this method of bounds
> checking would be a "no go".
>
> -- 
> Kevin Schoedel kevin.p.schoedel at intel.com +1-519-772-2580
> SSG-DPD-ECDL-DMP - Intel Dynamic Mobility and Parallelism