[PATCH] D100037: [clang][UBSan] Passing underaligned pointer as a function argument is undefined behaviour

Wed Apr 7 11:16:04 PDT 2021

rsmith added a comment.

> This is somewhat related to the following RFC by @rjmccall:
> https://lists.llvm.org/pipermail/llvm-dev/2016-January/094012.html

It looks to me like this direction is in conflict with that RFC. In particular, it says:

> It is not undefined behavior to create a pointer that is less aligned
>  than its pointee type.  Instead, it is only undefined behavior to
>  access memory through a pointer that is less aligned than its pointee
>  type.

... and gives as an example that passing an underaligned pointer to a function that doesn't perform an underaligned access is OK under the proposed rule. Putting this behind a `-fstrict-*` flag (as this patch does) seems OK to me, but I think enabling it by default on any target would be inconsistent with the RFC and would need broader discussion and a change in policy.

I think it is reasonable to add stricter UBSan checks for underaligned pointers, but that should be available regardless of what properties our implementation happens to treat as defined; part of the point of UBSan is to check for portability issues and to check whether problems would arise as a prerequisite for enabling the corresponding `-fstrict` flag. So I do not think the checks should be gated behind the `-fstrict` flag. I don't think this is analogous to `-fwrapv`, which actually opts into a different set of language rules where integer overflow is defined; this is more like `-fstrict-enums`, for which we will still sanitize even in cases where we happen to choose to not optimize, or the sanitizer for floating-point division by zero, for which we still sanitize (because it's formally UB) but don't include in `-fsanitize=undefined` (because our implementation defines it). Generally-speaking, things controlled by `-fstrict-*` flags are checked by UBSan and in particular by `-fsanitize=undefined` regardless of whether the `-fstrict-*` flag is specified, and I think there are some cases where `-fsanitize=undefined` already diagnoses misalignment that is valid under John's RFC, so adding the new checks to the `-fsanitize=undefined` sanitizer group (regardless of `-fstrict`) seems consistent to me.

If we want to add a sanitizer that checks for misaligned pointers existing, instead of checking (as our current alignment sanitizer does) for misaligned accesses, then I think it would be more precise (and add a lot fewer checks) to enforce the language rule as written, that is, check for pointer and reference casts to a more-aligned type, rather than checking function arguments. I assume your intent here is to eventually add LLVM parameter attributes indicating the function argument is correctly-aligned, which is why you're checking on call boundaries?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100037/new/

https://reviews.llvm.org/D100037