[PATCH] D48958: [clang][ubsan] Implicit Cast Sanitizer - integer truncation - clang part

Tue Jul 24 11:47:33 PDT 2018

vsk accepted this revision.
vsk added a comment.
This revision is now accepted and ready to land.

LGTM, although I think it'd be helpful to have another +1 just to be safe.

I did two small experiments with this using a Stage1 Rel+Asserts compiler:

Stage2 Rel+Asserts build of llvm-tblgen:
ninja llvm-tblgen  384.27s user 14.98s system 1467% cpu 27.203 total

Stage2 Rel+Asserts build of llvm-tblgen with implicit-cast checking:
ninja llvm-tblgen  385.15s user 15.02s system 1472% cpu 27.170 total

With caveats about having a small sample size here and testing with an asserts-enabled stage1 build, I don't see any red flags about the compile-time overhead of the new check. I would have liked to measure the check against more code, but I couldn't complete a stage2 build due to a diagnostic which might plausibly point to a real issue in tblgen:

  /Users/vsk/src/llvm.org-lldbsan/llvm/utils/TableGen/RegisterInfoEmitter.cpp:604:17: runtime error: implicit cast from type 'int' of value -1 (32-bit, signed) to type 'const unsigned short' changed the value to 65535 (16-bit, unsigned)

With -fno-sanitize-recover=all disabled, I found a few more reports:

  /Users/vsk/src/llvm.org-lldbsan/llvm/include/llvm/Object/Archive.h:278:38: runtime error: implicit cast from type 'int' of value -1 (32-bit, signed) to type 'uint16_t' (aka 'unsigned short') changed the value to 65535 (16-bit, unsigned)
  --> uint16_t FirstRegularStartOfFile = -1;

  /Users/vsk/src/llvm.org-lldbsan/llvm/lib/Analysis/MemorySSA.cpp:199:12: runtime error: implicit cast from type 'size_t' (aka 'unsigned long') of value 4969132974595412838 (64-bit, unsigned) to type 'unsigned int' changed the value to 3765474150 (32-bit, unsigned)
  --> hash_combine() result casted to unsigned

  /Users/vsk/src/llvm.org-lldbsan/llvm/lib/CodeGen/TargetLoweringBase.cpp:1212:30: runtime error: implicit cast from type 'unsigned int' of value 512 (32-bit, unsigned) to type 'unsigned char' changed the value to 0 (8-bit, unsigned)
  --> NumRegistersForVT[i] = getVectorTypeBreakdownMVT(...)

  /Users/vsk/src/llvm.org-lldbsan/llvm/lib/Transforms/Scalar/EarlyCSE.cpp:136:12: runtime error: implicit cast from type 'size_t' (aka 'unsigned long') of value 16583795711468875482 (64-bit, unsigned) to type 'unsigned int' changed the value to 3116347098 (32-bit, unsigned)
  --> hash_combine() result casted to unsigned
  ...

These four at least don't look like false positives:

- Maybe we should consider special-casing assignments of "-1" to unsigned values? This seems somewhat idiomatic.
- At least a few of these are due to not being explicit about dropping the high bits of hash_combine()'s result. Given that this check is opt-in, that that seems like a legitimate diagnostic (lost entropy).
- The TargetLoweringBase.cpp diagnostic looks a bit scary.

================
Comment at: lib/CodeGen/CGExprScalar.cpp:986
+  for (auto CastRef : llvm::reverse(CastExprStack)) {
+    const CastExpr *Cast = &(CastRef.get());
+    // Was there a previous cast?
----------------
nit, extra parens?

Repository:
  rC Clang

https://reviews.llvm.org/D48958