[PATCH] D100037: [clang][UBSan] Passing underaligned pointer as a function argument is undefined behaviour

Wed Apr 7 11:36:09 PDT 2021

lebedev.ri added a comment.

In D100037#2674610 <https://reviews.llvm.org/D100037#2674610>, @rsmith wrote:

>> This is somewhat related to the following RFC by @rjmccall:
>> https://lists.llvm.org/pipermail/llvm-dev/2016-January/094012.html
>
> It looks to me like this direction is in conflict with that RFC. In particular, it says:
>
>> It is not undefined behavior to create a pointer that is less aligned
>>  than its pointee type.  Instead, it is only undefined behavior to
>>  access memory through a pointer that is less aligned than its pointee
>>  type.
>
> ... and gives as an example that passing an underaligned pointer to a function that doesn't perform an underaligned access is OK under the proposed rule.

Yes. We now have a motivational reason to do so, it wouldn't be "just expose the UB for the sake of miscompiling the program".

> Putting this behind a `-fstrict-*` flag (as this patch does) seems OK to me, but I think enabling it by default on any target would be inconsistent with the RFC and would need broader discussion and a change in policy.

I suppose so.

> I think it is reasonable to add stricter UBSan checks for underaligned pointers, but that should be available regardless of what properties our implementation happens to treat as defined; part of the point of UBSan is to check for portability issues and to check whether problems would arise as a prerequisite for enabling the corresponding `-fstrict` flag. So I do not think the checks should be gated behind the `-fstrict` flag. I don't think this is analogous to `-fwrapv`, which actually opts into a different set of language rules where integer overflow is defined; this is more like `-fstrict-enums`, for which we will still sanitize even in cases where we happen to choose to not optimize, or the sanitizer for floating-point division by zero, for which we still sanitize (because it's formally UB) but don't include in `-fsanitize=undefined` (because our implementation defines it). Generally-speaking, things controlled by `-fstrict-*` flags are checked by UBSan and in particular by `-fsanitize=undefined` regardless of whether the `-fstrict-*` flag is specified, and I think there are some cases where `-fsanitize=undefined` already diagnoses misalignment that is valid under John's RFC, so adding the new checks to the `-fsanitize=undefined` sanitizer group (regardless of `-fstrict`) seems consistent to me.

Aha. I was also very uneasy about doing the check behind the `-fstrict`.
Will change it to be unconditional!

> If we want to add a sanitizer that checks for misaligned pointers existing, instead of checking (as our current alignment sanitizer does) for misaligned accesses, then I think it would be more precise (and add a lot fewer checks) to enforce the language rule as written, that is, check for pointer and reference casts to a more-aligned type, rather than checking function arguments. I assume your intent here is to eventually add LLVM parameter attributes indicating the function argument is correctly-aligned, which is why you're checking on call boundaries?

It's complicated. I would like this to be more strict, and catch any misaligned pointers, yes.
But, two things:

1. it's going to be much more noisy than only dealing with function arguments :)
2. even if we are okay with that, handling this *just* on the callers side (when the pointer is created) is a wrong solution here. we still need to do this on the callee side. because we can't be sure that the caller will be compiled with sanitizer, but we *know* that the callee will be micompiled by us.

So at most i could handle it in more cases, but this one should stay..
Does this make sense?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100037/new/

https://reviews.llvm.org/D100037