[clang] [clang][Sema] Fix type of an statement expression ending with an atomic type (PR #119711)

Aaron Ballman via cfe-commits cfe-commits at lists.llvm.org
Tue Feb 4 11:32:23 PST 2025


Alejandro =?utf-8?q?Álvarez_Ayllón?=,
Alejandro =?utf-8?q?Álvarez_Ayllón?=,
Alejandro =?utf-8?q?Álvarez_Ayllón?Message-ID:
In-Reply-To: <llvm.org/llvm/llvm-project/pull/119711 at github.com>


================
@@ -15919,10 +15919,17 @@ ExprResult Sema::ActOnStmtExprResult(ExprResult ER) {
   if (Cast && Cast->getCastKind() == CK_ARCConsumeObject)
     return Cast->getSubExpr();
 
+  auto Ty = E->getType().getUnqualifiedType();
----------------
AaronBallman wrote:

> > > As far as this patch goes, my longstanding thought is that _Atomic is clearly a qualifier.
> > 
> > 
> > Errrr, I don't agree. I think `_Atomic` is a type specifier. `_Atomic int` and `int` are not the same type
> 
> `const int` and `int` are also not the same type.

Maybe I'm being imprecise, sorry for that. `const int` and `int` are both "morally" of type `int` in that the allowed accesses to either type will give the same code generation and behavior. There's no difference between read accesses to `int` and `const int`: https://godbolt.org/z/GzhKWr8W6

I think `int` and `_Atomic int` are not morally the same types; they have different behaviors due to the accesses: https://godbolt.org/z/xf8WEzaM5, they can have different sizes, they can have different alignments, etc.

I guess I view `int` vs `_Atomic int` being closer in behavior to `int` vs `long long int`.

> > (heck, they can even have different sizes and alignments!).
> 
> Size and alignment is the (back-justified) reason we draw the line where we currently do. It is not a particularly sensible way to do it. Most of our existing qualifiers affect code generation and ABI, just in ways that happen to not affect size and alignment. Several qualifiers affect the ABI rules for containing aggregates, for example.

Do you have an example of impacting the ABI rules in C?

> > It's somewhat like a qualifier in that it impacts how you access the underlying object (so in that regard, it is a bit like `volatile int` and `int`), but I think the fundamental thing is that you rarely want `int` semantics directly when the type is spelled `_Atomic int`.
> 
> I'm not quite sure what you're saying here.

Loading an int to get its value is a different operation than loading an atomic int to get its value; the latter may require entirely different instruction selection. Probably `int` is a bad example, perhaps `struct` values makes the distinction more clear?

> On the implementation level, the compiler can _never_ ignore qualifiers when performing an access to an object; the whole point of qualifiers is that they affect the semantics of accesses. On the language design level, I agree with you that programmers should never perform normal accesses to `_Atomic` objects instead of using the intrinsics. However, that is an argument that it was an unfortunate language design decision that `_Atomic` behaves like a qualifier. In the language that we've actually got, `_Atomic` has most of the characteristics of qualifiers, and there is no benefit to the compiler implementation in pretending otherwise.

Maybe I'm odd (it's known to happen), but I guess I view `_Atomic int` and `int` in C the same way as I view `std::atomic<int>` and `int` in C++. While there's a lot of similarities between how atomics work and how qualifiers work, I instinctively view them as completely separate types.

> The common characteristics of qualifiers:
> 
>     * They convey an aspect of how a value is stored or accessed other than the abstract value itself.

Agreed.

>     * They are typically disallowed or silently dropped at the top level in type contexts that primarily describe an abstract value rather than storage, such as casts, return values, and parameters (when determining the function type).

Agreed, at least sort of. Qualifiers on parameters and return types are deeply weird (IMO): https://godbolt.org/z/zxo3b9WK8

>     * R-value expressions represent an abstract value and are (for the most part) unqualified at the top level. Using a qualified l-value in a context that requires an r-value drops all qualifiers as part of the conversion to an r-value. The qualifiers typically affect the code-generation of the load.

Agreed.

>     * Assignment consumes an abstract value and therefore expects an r-value on the RHS, which (per above) is of unqualified type. The qualifier typically affects code-generation of the store.

Agreed.

>     * The qualifier (if it can apply to structs and unions at all) propagates along member accesses, such that the type of `x.member` has all the top-level qualifiers of both `x` and the member.

Agreed.

> Most of these apply to `_Atomic`. The most notable exception is the last, which C makes UB. But it is not uncommon for qualifiers to have tweaks to various of these rules, like how C++ allows qualified return values of class type for some reason.

Another notable exception is that `_Atomic` is not silently dropped at the top level in type for return values and parameters when determining the function type, as mentioned above.

> The subtyping rules for CV qualifiers — the ability to e.g. convert an `int**` to `const int * const *` — do not generally work for other qualifiers. For example, even just considering semi-standardized qualifiers, you clearly cannot add an address space in a nested position this way. I believe you have to understand this as a useful special case for the CV qualifiers. (It really should not work for `restrict`, although I can't remember what rule we actually implement.)

Agreed.

> > > So my suggestion is that common functions like getUnqualifiedType() should just be looking through _Atomic, which would have fixed this bug before it happened. If there's a need for a narrower API that only looks through non-_Atomic qualifiers (or perhaps some subset of "non-ABI-affecting" qualifiers?), we can add that with a nice name that suggests when it should be used.
> > 
> > 
> > On the one hand, this specific bug has bitten us so many times I want to agree with your proposed approach.
> > On the other hand, that's going to be a massive undertaking that's pretty risky.
> 
> Yeah, I don't disagree with this, and I'm not trying to force the problem to be solved immediately. I would like to try to get consensus on the right thing to do, though.

Agreed, and I appreciate the conversation on this!

> > We have times when we definitely _do not_ want the qualified type to drop the atomic qualifier. For example, when merging function types, we call `getUnqualifiedType()` to determine whether two parameters are compatible. `void func(const int i);` and `void func(int i);` are compatible redeclarations, while `void func(_Atomic int i);` and `void func(int i);` are not. There are other times when we do want the atomic qualifier dropped. It's a frustrating mess and I'm certain we have numerous bugs in this area.
> 
> In this case, I have to wonder if the standard is in the same place Clang is — if this was really an intentional allowance for `_Atomic T` as a parameter in function types or just an oversight. It's hard to imagine what an atomic argument could possibly mean. Perhaps it was an oversight that they no longer feel comfortable fixing.

Doesn't it matter in the same way it matters for `const` parameters: it impacts the accesses to the parameter within the function body itself? One contrived use case is:
```
thrd_t spawn_thread_to_do_things_to(_Atomic int *i);

void func(_Atomic int i) {
  thrd_t t = spawn_thread_to_do_things_to(&i);
  i = 12;
  thrd_join(t, NULL);
}
```

> > So the question to me becomes: do we fix this narrow issue and kick the can down the road? Or do we try to find someone willing to undergo the more significant cleanup?
> 
> If we agree that the compiler probably ought to be defaulting to treating atomic types as qualifiers, I think the path is:
> 
>     1. Introduce a function that looks through `_Atomic` as well as the other qualifiers.
> 
>     2. Make a best effort to switch existing call sites of `getUnqualifiedType()` that immediately also look through `_Atomic` to instead use the common function. Probably ought to be the same patch as (1).

I think we've already done this though. We have `getAtomicUnqualifiedType()` which we use when we want to get a non-atomic, unqualified type.

>     3. Rework this patch to just use the new function.
> 
>     4. Introduce a function that explicitly does not look through `_Atomic`.

Which is `getUnqualifiedType()` currently.

>     5. Gradually go through the remaining `getUnqualifiedType()` uses and figure out which of the explicit functions they should use, adding tests as necessary.
> 
>     6. When that's done, decide if we've got compelling evidence that `getUnqualifiedType()` should just default to looking through `_Atomic`.
> 
> 
> I think asking Alejandro to do steps 1-3 as part of this bug fix is reasonable, and then the rest can be an open project.

So I think what I'm getting is: we should audit all existing calls to `getUnqualifiedType()` and `getAtomicUnqualifiedType()` to determine whether we're correctly handling atomics or not, and if there are more uses of `getAtomicUnqualifiedType()` after correcting issues, then maybe we want `getUnqualifiedType()` to look through atomics and rename `getAtomicUnqualifiedType()` to something else?

https://github.com/llvm/llvm-project/pull/119711


More information about the cfe-commits mailing list