[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 10 03:12:17 PDT 2023


steakhal added a comment.

I haven't gone deep into the implementation, but by only looking at the quality of the code and the structure I assume it implements the right behavior.
I've checked the tests though, and they look good.
I only found irrelevant nits. Awesome work.

---

Have you considered checking the shift operands for taint?
I wonder if we could warn if we cannot prove the correct use of the shift operator when either of the operands is tainted.



================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:135
+                  "If set to true, the checker reports undefined behavior even "
+                  "if if is supported by most compilers. (This flag has no "
+                  "effect in C++20 where these constructs are legal.)",
----------------
Check for subsequent duplicated words. They are usually typos.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:9-11
+// This file defines BitwiseShiftChecker, which is a path-sensitive checker
+// that looks for undefined behavior caused by incorrect use of the bitwise
+// shift operators '<<' and '>>'.
----------------
Please elaborate what counts as "incorrect use".


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:29-30
+
+using namespace clang;
+using namespace ento;
+
----------------
This is used quite often and hinders the readability by indenting the args a lot. Making the spelling shorter, would somewhat mitigate this.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:35
+
+typedef std::unique_ptr<PathSensitiveBugReport> BugReportPtr;
+
----------------



================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:42
+
+constexpr NoteTagTemplate NoteTagTemplates[4] = {
+  {"", "right operand of bit shift is less than "},
----------------



================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:53
+class BitwiseShiftValidator {
+  // Primary mutable state:
+  CheckerContext &Ctx;
----------------
Where does this comment corresponds to?


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:57-59
+  // Secondary mutable state, used only for note tag creation:
+  enum { NonNegLeft = 1, NonNegRight = 2 };
+  unsigned NonNegFlags = 0;
----------------
How about using the same enum type for these two? It would signify that these are used together.
Also, by only looking at the possible values, it's not apparent that these are bitflag values. A suffix might help the reader.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:60-63
+  // If there is an upper bound, it is <= the size of a primitive type
+  // (expressed in bits), so this is a very safe sentinel value:
+  enum { NoUpperBound = 999 };
+  unsigned UpperBound = NoUpperBound;
----------------
Have you considered using `std::optional`?
Given that the dimension of this variable is "bits", I think it would be great to have that in its name.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:81
+
+  bool shouldPerformPedanticChecks() {
+    // The pedantic flag has no effect under C++20 because the affected issues
----------------
Check if your member functions can be declared as `const`.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:106-109
+  if (auto BRP = checkOvershift()) {
+    Ctx.emitReport(std::move(BRP));
+    return;
+  }
----------------
I'd prefer `BR` or `Report`. Since we know already that we are in the path-sensitive domain, `P` has less value there.
I'd apply this concept everywhere in the patch.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:124-127
+    // Report a bug if we are left shifting a concrete signed value and it
+    // would overflow (e.g. 2 << 31 with a 32-bit int) or reach the sign bit
+    // under C (e.g. 1 << 31 with a 32-bit int; under C++ this is a
+    // well-defined way to get INT_MIN):
----------------
To me it feels as if this comment should be at the definition of `checkLeftShiftOverflow`, and only keep here a short reminder. Similarly to how you did with the previous cases.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:148
+
+  const auto Val = Ctx.getSVal(operandExpr(Side));
+  const auto LimitVal = SVB.makeIntVal(Limit, Ctx.getASTContext().IntTy);
----------------



================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:190-191
+  std::string Msg =
+      llvm::formatv("The result of {0} shift is undefined because the right "
+                    "operand{1} is not smaller than {2}, the capacity of '{3}'",
+                    isLeftShift() ? "left" : "right", RightOpStr, LHSBitWidth,
----------------



================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:218
+                                       Side == OperandSide::Left ? "Left" : "Right",
+                                       isLeftShift() ? "left" : "right")
+                             .str();
----------------
The `isLeftShift() ? "left" : "right"` expression appears 4 times in total.
I'd also recommend to not distinguish the capitalized "left" and "right" strings, and just post-process the messages to ensure that the first character is capitalized inside the `createBugReport()`.

Also note that you can reuse the same placeholder, so you would only need to pass the `side()` only once.
```lang=C++
static void capitalizeBeginning(std::string &Str) {
  if (Str.empty())
    return;
  Str[0] = llvm::toUpper(Str[0]);
}

StringRef side() const {
  return isLeftShift() ? "left" : "right";
}
```


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:240
+  const QualType LHSTy = LHS->getType();
+  const unsigned LeftBitWidth = Ctx.getASTContext().getIntWidth(LHSTy);
+
----------------
This will ensure that the definition of `LeftAvailableBitWidth` does not wrap.
And, more importantly, `assert(LeftBitWidth >= UsedBitsInLeftOperand)` also holds, asserted later.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:252-254
+  // If the Left operand is negative, we report that and don't reach this
+  // function:
+  assert(Left->getValue().isNonNegative());
----------------
Okay, so this is an invariant.
How about rephrasing this to better express that:

```
We should have already reported if the left operand of the shift was negative, thus this cannot be negative here.
```


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:263
+
+  const SVal Right = Ctx.getSVal(Op->getRHS());
+
----------------
I'd move this closer to the first "use" place.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:315
+    default:
+      llvm_unreachable("this checker does not use other comparison operators");
+  }
----------------



================
Comment at: clang/test/Analysis/bitwise-shift-common.c:108
+  *p += 1 << a;
+  // expected-note at -1 {{Assuming right operand of bit shift is non-negative but less than 32}}
+  *p += 1 << (a + 32);
----------------
Hmm, does this "but" actually mean "and" here?
Anyway, both using "but" and "and" there looks awkward.
It might be only me though.


================
Comment at: clang/test/Analysis/bitwise-shift-common.c:129
+  *p = newval;
+  // expected-note at -1 {{The value 33 is assigned to 'right'}}
+}
----------------
It's an unrelated comment, but always bothers me to see 'right' mentioned here, where there is nothing called like that...


================
Comment at: clang/test/Analysis/bitwise-shift-common.c:148
+
+int allow_funny_code(void) {
+  // These are all legal under C++ 20 and many compilers accept them under
----------------



================
Comment at: clang/test/Analysis/bitwise-shift-sanity-checks.c:20-22
+// RUN:    -Wno-shift-count-negative -Wno-shift-negative-value \
+// RUN:    -Wno-shift-count-overflow -Wno-shift-overflow \
+// RUN:    -Wno-shift-sign-overflow
----------------
How about simplifying these lines to `-Wno-shift`? Have you considered it?


================
Comment at: clang/test/Analysis/bitwise-shift-sanity-checks.c:57-58
+__int128 large_int(void) {
+  // Ensure that we do not crash on values that do not fit in 64 bits.
+  return (__int128) 1 << 63 << 10 << 10; // no-crash
+}
----------------
Let's test if the shift RHS operand is bigger than 64.


================
Comment at: clang/test/Analysis/bitwise-shift-state-update.c:30-35
+  clang_analyzer_eval(left >= 0); // expected-warning {{TRUE}}
+  clang_analyzer_eval(left > 0); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(right >= 0); // expected-warning {{TRUE}}
+  clang_analyzer_eval(right > 0); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(right < 31); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(right < 32); // expected-warning {{TRUE}}
----------------
Have you considered using `clang_analyzer_value(left)` for inspecting the assumed range of the operands?
IMO that leads to more readable tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156312/new/

https://reviews.llvm.org/D156312



More information about the cfe-commits mailing list