[PATCH] D48958: [clang][ubsan] Implicit Cast Sanitizer - integer truncation - clang part

Fri Jul 27 08:28:45 PDT 2018

lebedev.ri added a comment.

Oops, forgot to submit the inline comments.
(It is inconvenient that they aren't submitted with the rest.)

================
Comment at: docs/ReleaseNotes.rst:292
+          store = store + 768; // before addition, 'store' was promoted to int.
+        (void)consume((unsigned int)val); // OK, the truncation is implicit.
+      }
----------------
rsmith wrote:
> implicit -> explicit
Whoops.

================
Comment at: docs/UndefinedBehaviorSanitizer.rst:95
+     of bigger bit width to smaller bit width, if that results in data loss.
+     That is, if the demoted value, after casting back to the original width,
+     is not equal to the original value before the downcast. This issue may
----------------
rsmith wrote:
> rsmith wrote:
> > erichkeane wrote:
> > > I think the last 2 commas in this sentence are unnecessary?  
> > I would parenthesize the "where the demoted value [...] would have a different value from the original" clause, since it's just explaining what we mean by "data loss".
> Is this really the right rule, though? Consider:
> 
> ```
> unsigned int x = 0x81234567;
> int y = x; // does the sanitizer catch this or not?
> ```
> 
> Here, the value of `x` is not the same as the value of `y` (assuming 32-bit int): `y` is negative. But this is not "data loss" according to the documented meaning of the sanitizer. I think we should produce a sanitizer trap on this case.
I've reverted this to my original text.
It should now convey the correct idea, but i'm not sure this is correct English.

> unsigned int x = 0x81234567;
> int y = x; // does the sanitizer catch this or not?
No, it does not. It indeed should. I do plan on following-up with that,
thus i've adding the group (``-fsanitize=implicit-conversion``), not just one check.

================
Comment at: include/clang/Basic/Sanitizers.def:134-136
 SANITIZER_GROUP("integer", Integer,
                 SignedIntegerOverflow | UnsignedIntegerOverflow | Shift |
                 IntegerDivideByZero)
----------------
rsmith wrote:
> `-fsanitize=integer` should include `-fsanitize=implicit-integer-truncation`.
Wow. Ok.

================
Comment at: lib/CodeGen/CGExprScalar.cpp:954-955
+  // We only care about int->int casts here, and ignore casts to/from pointer.
+  if (!(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy)))
+    return;
+  // We do not care about booleans.
----------------
rsmith wrote:
> Check the Clang types here, not the LLVM types. There is no guarantee that only integer types get converted to LLVM `IntegerType`s. (But if you like, you can assert that `SrcTy` and `DstTy` are `IntegerType`s after checking that the clang types are both integer types.)
Interesting, ok.

================
Comment at: lib/CodeGen/CGExprScalar.cpp:956-958
+  // We do not care about booleans.
+  if (SrcType->isBooleanType() || DstType->isBooleanType())
+    return;
----------------
rsmith wrote:
> I believe this is redundant: we don't get here for an integer to boolean conversion, and a boolean to integer conversion would always be caught by the bit width check below.
Uhm, i'll replace it with an assert then.

================
Comment at: lib/CodeGen/CGExprScalar.cpp:962-964
+  // This must be truncation. Else we do not care.
+  if (SrcBits <= DstBits)
+    return;
----------------
rsmith wrote:
> I think you should also catch casts that change signedness in the case if the sign bit is set on the value. (Though if you want to defer this to a follow-up change and maybe give the sanitizer a different name, that's fine with me.)
Yes, thank you for bringing this up.
That is certainly the plain, but i always planned to add that later on.

Repository:
  rC Clang

https://reviews.llvm.org/D48958