[llvm] [clang] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Fri Jan 19 16:36:58 PST 2024


zygoloid wrote:

> My answer for the question "what's the semantics of GCC's builtin X?" has always been "whatever GCC does." It's the best we can rely upon. But then we get into situations like this, where you and @nikic have one interpretation of their documentation and I have another. I can point to their behavior to back up my claim, but in the end it's probably not exactly clear even to GCC.

[@nikic demonstrated](https://github.com/llvm/llvm-project/pull/78526#issuecomment-1900439850) that our current behavior is already compatible with GCC's behavior. If GCC's behavior is the spec, then we are allowed to return 48 rather than only 40 or -1 (or presumably 0 if `argc` is out of bounds) for the original example, because in some cases GCC does so.

> My concern is that we want to use this for code hardening. Without precise object sizes, we're hampered in our goal. The unfortunate reality is that we can only get that size via these `__builtin_[dynamic_]object_size` functions.

That's a totally understandable desire, but I think it's not realistic to expect precise *sub*object sizes in the same cases that GCC can provide them, due to the different architectural choices in the two compilers. If we had a mid-level IR for Clang that still had frontend information, we could do better by evaluating BOS (`__builtin_object_size`) there, so maybe that's one long-term path forward to consider. In the short term, while there are cases where we won't be able to match GCC, I think Clang should do better than it currently does in the frontend, specifically in cases like the one in the bug report where there's an obviously better answer that doesn't require any sophisticated analysis to discover.

> > Here, `f` ideally would return 4, but at the LLVM IR level, `p` and `q` are identical values and the `&p->a` operation is a no-op. In cases like this, the best we can realistically do is to return 8.
> 
> The sub-object for `&p->a` and even `&p->b` is `struct X`, not the integers themselves. If you want that, you'll have to use casts: `&((char *)p->b)[2];`. (I had to take care to get that correct.) So `f` should return `8` (note it's likely to get `8` from the `alloc_size` attribute on `malloc` in your example).

GCC disagrees with you: https://godbolt.org/z/s4P74oEqx
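
For readers without the earlier message at hand, here is a rough sketch of the kind of case under discussion. It is a reconstruction, not the exact code from the thread or the godbolt link: `struct X` with two `int` members is assumed from the quoted text, and the function names are made up for illustration. The values returned depend on the compiler and optimization level. For type 1, a subobject-precise answer would be `sizeof(int)`, a whole-object answer would be `sizeof(struct X)`, and `(size_t)-1` means "unknown". All of these are valid upper bounds on the remaining size, so this sketch only checks the bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed layout, inferred from the quoted discussion. */
struct X {
    int a;
    int b;
};

/* Type 1 query directly on the member address: a frontend can see
 * that the closest enclosing subobject is the int `a`. */
size_t size_via_member_address(void) {
    struct X *p = malloc(sizeof(struct X));
    if (!p)
        return (size_t)-1;
    size_t n = __builtin_object_size(&p->a, 1);
    free(p);
    return n;
}

/* Same query through a copied pointer. In LLVM IR, `q` and `p` are the
 * same value, so the subobject information may no longer be available
 * by the time the builtin is evaluated. */
size_t size_via_copied_pointer(void) {
    struct X *p = malloc(sizeof(struct X));
    if (!p)
        return (size_t)-1;
    int *q = &p->a;
    size_t n = __builtin_object_size(q, 1);
    free(p);
    return n;
}

/* Type 0 query on the whole allocation, for comparison. The size can
 * come from the alloc_size information on malloc when optimizing. */
size_t size_whole_object(void) {
    struct X *p = malloc(sizeof(struct X));
    if (!p)
        return (size_t)-1;
    size_t n = __builtin_object_size(p, 0);
    free(p);
    return n;
}
```

Comparing `size_via_member_address()` with `size_via_copied_pointer()` under GCC and Clang at different optimization levels is one way to see where the two compilers diverge; no particular result is asserted here.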

https://github.com/llvm/llvm-project/pull/78526

