[clang] [BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C (PR #90786)

Dan Liew via cfe-commits cfe-commits at lists.llvm.org
Thu May 9 15:58:49 PDT 2024


delcypher wrote:

@bwendling

> I've been thinking about this restriction. Why is this necessary? My assumption was that applying counted_by to a pointer causes a bounds check on an index into the pointer rather than its underlying type.

@rapidsna Please add additional points if I accidentally miss something.

I can try to explain the restrictions from the perspective of the `-fbounds-safety` perspective. Essentially the reason these constructs are illegal in `-fbounds-safety` is because at runtime we need to be able to construct a wide pointer from `__counted_by()` pointers.  In the cases I've made illegal in this PR, this is either impossible to do correctly if we don't know the size of the pointee.

What I mean specifically by wide pointer is `__bidi_indexable` which is `-fbounds-safety`'s wide pointer that can be indexed positively and negatively. In `-fbounds-safety`'s model all pointers on the stack are `__bidi_indexable`  by default and all `__counted_by()` pointers are implicitly converted to wide pointers when they are used inside a function.

`-fbounds-safety`'s `__bidi_indexable` is **very roughly** equivalent to this C++ code

```c++
template <class T> struct WidePtr {
  T *ptr; // The address that is dereferenced
  // The bounds of the memory that forms the range of valid bytes to access
  // [lower_bound, upper_bound)
  void *lower_bound;
  void *upper_bound;

  WidePtr(T * /*__counted_by(count)*/ p, size_t count) {
    ptr = p;
    lower_bound = (void *)p;
    // NEED sizeof(T) here!
    upper_bound = ((char *)p) + sizeof(T) * count;
  }

  T& operator*() {
    check_bounds();
    return *ptr;
  }

  void check_bounds() {
    if (ptr < lower_bound)
      __builtin_trap();

    uintptr_t LastByteAccessedPlusOne;
    // NEED sizeof(T) here!
    if (__builtin_uadd_overflow(ptr, sizeof(T), &LastByteAccessedPlusOne))
      __builtin_trap();

    if (LastByteAccessedPlusOne > upper_bound)
      __builtin_trap();
  }

  // Lots of other operators are overloaded (e.g. +, -[], ++, -- and ->).
};
```

We need to know what `sizeof(T)` is in two places:

1. Everywhere you want to do a bounds check. E.g. when dereferencing the pointer.
2. When converting a `T* __counted_by()` pointer to a `WidePointer<T>` (or `T* __bidi_indexable` in `-fbounds-safety` parlance). If you look at the constructor **we need to know** `sizeof(T)` when computing the upper bound of the wide pointer. Given that this conversion happens whenever a pointer of the type `T* __counted_by()` is used inside a function in the `-fbounds-safety` model,  the model requires that `T` have a known size. Technically this PR is being even stricter. It's forbidding declaring a `T* __counted_by()` even if it isn't used. However, this is a restriction we can lift later once more of the implementation is in place.

I've tried to annotate your code below with the specific reasons they are forbidden.

```c
// Both `foo` and `bar` are legal
struct foo;
struct bar {
  int a;
  int fam[] __counted_by(a);
};

struct x {
    // Illegal in `-fbounds-safety`. We can't compute the upper bound of this pointer
    // because `sizeof(void)*count` isn't valid. In `-fbounds-safety` the attribute you
    // would use here is `__sized_by()` which is a byte count rather than an element count.
    void *p __counted_by(count); 
    
    // Illegal in `-fbounds-safety`. We can't compute the upper bound of this pointer because
    // `sizeof(struct foo)*count` isn't valid. In particular `sizeof(struct foo)` is invalid because
    // `struct foo` is an incomplete type.
    struct foo *f __counted_by(count);
    
    // Illegal in `-fbounds-safety`. While we can technically compute `sizeof(struct bar)*count`
    // that computation is completely wrong unless the `bar::a` is `0`. To actually get the correct
    // count we would need to walk every `struct bar` object in the buffer pointed to by `b` to
    //  read `b::a`. This could be expensive (and prone to race conditions) so we simply
    // don't support this right now.
    struct bar *b __counted_by(count);
    
    int count;
};
```

https://github.com/llvm/llvm-project/pull/90786


More information about the cfe-commits mailing list