[PATCH] D75695: [StackProtector] Catch direct out-of-bounds when checking address-takenness

Wed Mar 11 10:13:07 PDT 2020

efriedma added inline comments.

================
Comment at: llvm/lib/CodeGen/StackProtector.cpp:202
+      APInt MaxOffset(TypeSize, DL.getTypeStoreSize(
+                        AI->getType()->getPointerElementType()));
+      if (!GEP->accumulateConstantOffset(DL, Offset) || Offset.uge(MaxOffset))
----------------
john.brawn wrote:
> efriedma wrote:
> > john.brawn wrote:
> > > efriedma wrote:
> > > > I think AllocaInst has a getAllocatedType, or something like that?
> > > > 
> > > > Should use getTypeAllocSize, since that's what actually determines the number of bytes allocated.
> > > > I think AllocaInst has a getAllocatedType, or something like that?
> > > AI isn't necessarily an AllocaInst, e.g. we could be looking at a GEP of a GEP or a GEP of a PHI.
> > > 
> > > > Should use getTypeAllocSize, since that's what actually determines the number of bytes allocated.
> > > Will do.
> > > 
> > Oh, I see, the function recurses.
> > 
> > I don't think the recursion works correctly in general.  The values that actually matter are the current offset, determined recursively, the size of the original alloca, and the size of any memory operations using the result of the GEP.  Using the size of the GEP's operand type doesn't work correctly in general; in particular, if there's another GEP or bitcast in the recursive chain of uses, you can come up with the wrong result.
> I think adding a check that bitcast doesn't bitcast to a larger type should be enough to handle this. If we have a GEP of a GEP then each one will make sure that the offset is within the bounds of the specific type it's handling. Doing it like this does mean that for something like
> 
> ```
> int fn() {
>   int arr[4][4];
>   return arr[0][5];
> }
> ```
> we say it's an out-of-bounds access when it doesn't actually access outside of the underlying object, but I don't think handling this kind of case is worth the added complexity.
That's still not enough given the way you're doing the check.  Consider if someone writes something like

```
struct X { int a[10][10]; int b; };
struct X x;
int (*p)[10] = &x.a[10];
int z = a[5];
```

This lowers to two GEPs; neither is "out-of-bounds" in the sense you're checking.

You could try to enforce that each index of each GEP is in the valid range, I guess.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75695/new/

https://reviews.llvm.org/D75695