[clang] [lldb] [Clang] Introduce OverflowBehaviorType for fine-grained overflow control (PR #148914)

Oliver Hunt via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 14 17:27:00 PDT 2025


ojhunt wrote:

> > > This one, however, I think it not what we want, but I may be missing something.
> > > ```
> > > int __ob_trap src = bigger_than_255;
> > > ...
> > > char dest = src;
> > > ```
> > 
> > 
> > I don't think that's correct, it's equivalent to saying
> > ```
> > int64 __ob_trap src = bigger_than_2_to_32;
> > ...
> > int dest = src;
> > ```
> > 
> > Should not trap, and similarly
> 
> Yes, this should not trap.
> 
> > ```
> > void f(int);
> > int64 __ob_trap src = bigger_than_2_to_32;
> > f(src);
> > ```
> > Would also not trap.
> 
> Also correct, this should not trap.

Then I have a fundamental disagreement here. I am unsure what the semantics are meant to be to allow this to happen?

It implies _any_ implicit conversion drops the obt annotations, and reverts to UB overflow. The above `f(src)` indicates that an implicit narrowing cast drops obt annotations, e.g instead of being equivalent to `f((int)(int __ob_mode)src)` it becomes `f((int)(int64_t)src)` -- e.g overflow becomes UB again.

> 
> > It also raises questions about what happens with
> > ```
> > int __ob_wrap src = bigger_than_2_to_32;
> > ...
> > char dest = src;
> > ```
> 
> `src` takes the wrapped value of `bigger_than_2_to_32` without complaint since its Overflow Behavior is to wrap. Then `dest` gets a potentially truncated value from `src` because `dest` does not specify any Overflow Behavior, so it falls back to the behavior in the standard: it is unsigned, so it wraps.

The above are signed not unsigned, I _believe_ this means that the behavior is UB.
> 
> > If ob_trap is ignored, ob_wrap should similarly be ignored, and thus we have UB overflow again.
> 
> I don't agree: the UB is related to using non-OBT variables. That is unchanged. (And to be really careful, it's actually not UB, but rather the larger set of "Overflow Behavior": unsigned overflow isn't UB.)

The above examples were signed types for the specific reason of requiring consideration of this behavior's interaction in cases where overflow is UB. The behavior you are requesting means that the overflow is UB for signed types, despite explicit annotations.
 
> For the OBT variable, overflows must follow a specific behavior on assignment: OBT indicates the behavior when _unable to represent a given value in the variable's storage_ It cannot mean anything else; it's not a conceptual enforcement: it needs to be about managing the failure to represent a value in a given storage. The storage is exactly the variable itself, so only that variable can be involved.

The conversion is from an annotated type to an unannotated type. The annotations are a safety and security  so having the silent/implicit behavior be the unsafe option seems the entirely incorrect approach. we have decades of evidence now saying "implicit unsafe behavior by default" is the wrong choice.

> Specifically, for roll out in Linux, we need to be able to migrate to new types, which means we can't have assignments to a non-OBT variable suddenly gain instrumentation.
> 
> For example, if a codebase has this:
> 
> ```
> int big = runtime_1024;
> ...
> char byte = big;
> ```
> 
> If we make only `byte` an trapping OBT, there is no surprise: we're asking that `byte` be specifically instrumented to trap if the value to be assigned cannot be represented:

No, you're asking for trapping behavior if an overflow would occur, and here that trap would occur. There is nothing in that assignment that indicates an overflow is expected, or that the input is expected be out of bounds during correct operations. Honestly (and this is the problem) there's nothing here to indicate that narrowing may even be occurring.

> 
> ```
> int big = runtime_1024;
> ...
> char __ob_trap byte = big; // traps: "byte" would overflow
> ```

Why should this overflow trap when the prior does not: In the first example a person has annotated the type to say "the high bits are important and should not be silently discarded" and you're saying they should be discarded silently, so why is it not just as reasonable here? The author of `big` has not said the high bits are important so why do they matter now?

> 
> If we make only `big` an OBT, we cannot have `byte` trap because it is _not_ marked to care about overflow behavior.
> 
> ```
> int __ob_trap big = runtime_1024;
> ...
> char byte = big; // must not trap: "big" did not overflow!
> ```
> 
> OBT needs to apply only to OBT annotated variable assignments, either explicit (`=`) or implicit (arithmetic intermediaries). The only place where we pollute this is during arithmetic operations where non-OBT gains OBT during implicit casting/promotion. The way OUT of that is the `__strict` (or `__strong typedef`) behavior where no implicit casts can involve an annotated variable.
> 
> If we don't follow what I'm describing, this becomes meaningless:
> 
> ```
> int __ob_trap big = runtime_1024;
> ...
> char __ob_wrap byte = big; // is this supposed to wrap or trap? I say wrap -- we're describing how "byte" behaves
> ```
> 
> How can we perform an assignment where we _want_ a wrapping behavior if suddenly `ob_trap` dominates even from the RHS?

Im not sure I understand? Why would `(type __ob_wrap)expression` not work?

> 
> And this gets into the explicit cast stuff I mentioned:
> 
> ```
> int big = runtime_1024;
> ...
> return (char __ob_trap)big;
> ```
> 
> This creates a `char __ob_trap` intermediate, and gets assigned the value of `big`, so it must trap. Explicit casts must behave as assignments for OBT. Without this, adding `__strong typedef` becomes impossible as there is no way to perform the overflow checking to get matching sizes for arithmetic:
> 
> ```
> __strong typedef u16 __ob_trap tu16;
> 
> int big;
> u8 __ob_wrap small;
> ...
> tu16 result = (tu16)big * (tu16)small;
> ```
> 
> This needs to catch any `big` value that cannot be represented by `u16` at cast time, otherwise we silently lose visibility into the overflow.

I don't understand what you are saying/asking for? None of these examples seem to be changed by having the default behavior of implicit conversions being unsafe?

the above example behaves as seems expected to me:
```
tu16 result = (tu16)big * (tu16)small;
```
becomes (because <int size values are widened to int)
```
tu16 result = (tu16)((int __ob_trap)(tu16)big * (int)(tu16 __ob_trap)small); 
```
if the int arithmetic overflows it traps, rather than being UB - because the overflow trapping behaviour has propagated - then truncation back down to u16 similarly captures anything outside of those bounds.

If we adopt a "silently discard the trapping behavior" a typo like

```
u16 result = (tu16)big * (tu16)small;
```

will silently discard any overflow, e.g big = 65535, small = 2 -> produces 131070 as `int __ob_trap` which is in bounds, but outside of the result bounds, but by unintentionally dropping the qualifier developer loses the overflow protection, and might hit ub? (I can never remember the spec ordering of signed->unsigned and narrowing conversions)

If an author has a value of type `wider_t __obt_trap` and wants to change to a narrower wrapping type, they'd simply cast through `wider_t __obt_wrap` first - separating out the ambiguous semantics of narrowing vs wrap/trap.

If the _default_ case of implicit conversion of a trap-on-overflow type is that narrowing silently discards the overflow it implies that discarding that overflow is safe. That runs counter to my experience of what people want when using trapping arithmetic, or code that has had to manually perform such logic.

It essentially means that any interface that is not using the widest arithmetic type will implicitly (and silently) ignore overflow from/during prior computation if at any point in that computation the types involved become wider than the result type, and that loss can occur due to unexpected/unrelated changes elsewhere. Take:

```cpp
unsigned some_func_or_struct_field_or_variable();
void *alloc(unsigned);
... {
  unsigned __ob_trap v = ...;
  return alloc(v * some_func_or_struct_field_or_variable());
}
```

Currently an overflow in the multiply traps, but now someone comes along and does

```cpp
-unsigned some_func_or_struct_field_or_variable();
-size_t some_func_or_struct_field_or_variable();
```

Now the overflow is silently ignored, because the overflow arithmetic is performed on size_t, and excess bits are silently dropped due to the implicit narrowing.

https://github.com/llvm/llvm-project/pull/148914


More information about the cfe-commits mailing list