[PATCH] D67122: [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour

Roman Lebedev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Sep 6 06:32:39 PDT 2019


lebedev.ri marked 5 inline comments as done.
lebedev.ri added inline comments.
Herald added a subscriber: ychen.


================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:4703-4720
+    // 2) The sign of the difference between the computed address and the base
+    // pointer matches the sign of the total offset.
+    llvm::Value *ValidGEP;
+    auto *NoOffsetOverflow = Builder.CreateNot(OffsetOverflows);
+    if (SignedIndices) {
+      auto *PosOrZeroValid = Builder.CreateICmpUGE(ComputedGEP, IntPtr);
+      auto *PosOrZeroOffset = Builder.CreateICmpSGE(TotalOffset, Zero);
----------------
vsk wrote:
> lebedev.ri wrote:
> > lebedev.ri wrote:
> > > This makes me ick every time i look at it.
> > > I wonder if this can be sanely rewritten via `.with.overflow` intrinsic..
> > Hm, 
> > ```
> > Name: add unsigned
> >   %computed = add i8 %base, %offset
> >   %PosOrZeroValid = icmp uge i8 %computed, %base
> > =>
> >   %agg = uadd_overflow i8 %base, %offset
> >   %computed = extractvalue i8 %agg, 0
> >   %ov = extractvalue i1 %agg, 1
> >   %PosOrZeroValid = xor i1 %ov, -1
> > 
> > Name: sub unsigned
> >   %computed = add i8 %base, %offset
> >   %PosOrZeroValid = icmp ule i8 %computed, %base
> > =>
> >   %negated_offset = sub i8 0, %offset
> >   %agg = usub_overflow i8 %base, %negated_offset
> >   %computed = extractvalue i8 %agg, 0
> >   %ov = extractvalue i1 %agg, 1
> >   %PosOrZeroValid = xor i1 %ov, -1
> > 
> > Name: add signed
> >   %computed = add i8 %base, %offset
> >   %PosOrZeroValid = icmp uge i8 %computed, %base
> >   %PosOrZeroOffset = icmp sge i8 %offset, 0
> >   %NegValid = icmp ult i8 %computed, %base
> >   %OKay = select i1 %PosOrZeroOffset, i1 %PosOrZeroValid, i1 %NegValid
> > =>
> >   %uadd.agg = uadd_overflow i8 %base, %offset
> >   %uadd.ov = extractvalue i1 %uadd.agg, 1
> >   %negated_offset = sub i8 0, %offset
> >   %usub.agg = usub_overflow i8 %base, %negated_offset
> >   %usub.ov = extractvalue i1 %usub.agg, 1
> >   %computed = add i8 %base, %offset
> >   %PosOrZeroOffset = icmp sge i8 %offset, 0
> >   %NotOKay = select i1 %PosOrZeroOffset, i1 %uadd.ov, i1 %usub.ov
> >   %OKay = xor i1 %NotOKay, -1
> > ```
> > 
> > That's not great, but i wonder what is more friendly to middle-end.
> > https://godbolt.org/z/ORSU_L
> I suggest splitting out changes to the existing pointer wraparound check in a follow-up patch. This may well be an improvement, but there are enough moving parts to this patch as-is.
I'll just precommit them, they are NFC.


================
Comment at: clang/test/CodeGen/catch-nullptr-and-nonzero-offset-blacklist.c:1
+// RUN: %clang_cc1 -fsanitize=pointer-overflow -fsanitize-recover=pointer-overflow -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call void @__ubsan_handle_pointer_overflow" --check-prefixes=CHECK,CHECK-NULL-IS-INVALID-PTR
+// RUN: %clang_cc1 -fno-delete-null-pointer-checks -fsanitize=pointer-overflow -fsanitize-recover=pointer-overflow -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call void @__ubsan_handle_pointer_overflow" --check-prefixes=CHECK,CHECK-NULL-IS-VALID-PTR
----------------
vsk wrote:
> lebedev.ri wrote:
> > vsk wrote:
> > > Should the test filename be "ignore-nullptr-and-nonzero-..."?
> > `-blacklist.c` is the filename i have i used for all similar testfiles in all previous sanitizers.
> > "black list" means here "list of cases where no sanitization will be emitted".
> This is minor. I only raised the nit as I felt that this test was about ignoring "foo", not about catching "foo", and that "blacklist" already means something else in the context of sanitizers. Happy to defer to your preference here.
Nitpicking is good, less chance to omit something that wasn't intentional and is thus bad :)


================
Comment at: clang/test/CodeGen/ubsan-pointer-overflow.m:48
 void pointer_array_unsigned_indices(int **arr, unsigned k) {
   // CHECK: icmp uge
   // CHECK-NOT: select
----------------
vsk wrote:
> lebedev.ri wrote:
> > vsk wrote:
> > > I'm curious about what happens here with the new null-status-change check. If array indices are unsigned, there is no need for a separate null-status-change check: if the result of the GEP is indeed null, that will be detected by the pointer wraparound check. We just need to check that the base pointer itself isn't null.
> > > 
> > > That generalizes to addition of unsigned offsets to base pointers, I think. So e.g. I wouldn't expect the 'preinc' function to contain a null-status-change check either.
> > Ugh, i still hate those codegen-time optimizations :)
> > Not only do they result in a maze that is 'hard' to comprehend and argue about,
> > the middle-end isn't "forced" to know these folds so the other code stays unoptimized,
> > worse yet, unlike miscompile that will be detected, if the logic error is made here,
> > it will likely "only" result in false-negatives :(
> > 
> > > We just need to check that the base pointer itself isn't null.
> > 
> > I don't believe that is sound for `unsigned add`: https://rise4fun.com/Alive/2J5 - will consider that `base=0 offset=0` is invalid.
> > Likewise, does not seem valid for `unsigned sub`: https://rise4fun.com/Alive/VEr - will not detect when pointer becomes null.
> > Likewise, does not seem valid for `signed add/sub`: https://rise4fun.com/Alive/97B - will not detect when pointer becomes null.
> > 
> > So, i'm not going to touch this. Unless i'm getting the gist of your comment wrong, or wrote those checks wrong?
> Totally see your point and understand the frustration :), we all want a tidy/simple frontend. And thanks for sharing these counter-examples. The context for my comment here is that -O0 -g and -Os -g compile-time and run-time performance (yes! -O0 run-time performance, really) matters a fair amount for Xcode users, and tend to regress with sanitizer changes. If there are simple (and correct :) ways to avoid regressing these, it's a plus.
I mean, yeah, `-O0` is unreplaceable for //best// debugging expirience, and it is pretty sluggish.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67122/new/

https://reviews.llvm.org/D67122





More information about the cfe-commits mailing list