[PATCH] D146712: [clang-tidy] Add portability-non-portable-integer-constant check
Whisperity via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jul 6 08:34:53 PDT 2023
whisperity requested changes to this revision.
whisperity added a comment.
Earlier (IIRC in March) we did an internal test of the check and the following results were obtained. The results are kinda //weird//, to say the least. Numbers are //after// CodeChecker has done a serverside uniqueing (most likely things such as //"if **`X`** in `H.h` is reported in the compilation/analysis of `A.cpp` and `B.cpp` then store **`X`** only once"//) of the bugs.
- memcache-d: 43
- tmux: 4
- curl: 283
- twin: 161
- vim: 2 206
- OpenSSL: **23 724**
- sqlite: 752
- ffmpeg: **87 121**
- postgres: **116 635**
- tinyxml2: 6
- libwebm: 34
- xerces: **48 732**
- bitcoin: 4 429
- protobuf: 1 405
- codechecker-ldlogger: 0
- ACID: 4 831
- Qt: **34 973**
- Contour v2: **14 246**
- LLVM: 6 018
No crashes (of the checker) were observed. Most of the large outlier numbers are attributable to heavy mathematics (ffmpeg, Qt), crypto (OpenSSL, PostgreSQL), or character encoding (Xerces, Postgres, Contour) operations in the project.
Unfortunately, this is a very significant amount of false positives to the point where the consumption of results is slowed down not on the mental load side, but also the technical side (such as triggering server hangs in CodeChecker and whatnot...). However, after putting some thoughts into it and trying to hand-clusterise some of the reports I've found a few significant opportunities where this check should be improved.
**Note:** These improvements might be worthwhile to implement as a //follow-up patch// which is merged together with this current review. **As it stands right now, this check is way too noisy (on maths-based projects) to be merged!**
1. Enums
--------
One is when the literal is used to initialise an enum. So far I don't know whether simply just ignoring enums is good, or would it be worth to somehow emit a "unified" warning for the entire enum that follows the pattern as shown below.
F28149214: image.png <https://reviews.llvm.org/F28149214>
2. `(u)?int([\d]+)_t`
---------------------
The other are cases when the user explicitly initialises a **fixed-width** (or *constrained-width*) variable, such as `int64_t` or `uint16_t` with a literal. Right now, the implementation hard matches only the `IntegerLiteral` instances, i.e., lines such as `static constexpr std::uint64_t X = 0x07F000000` where the width and characteristics of the literal might just as well *exactly* and *properly* match the variable it is assigned to (granted, we only match and silence `Decl`s for this, and lose expressivity on expressions), and do not report then.
3. Large arrays of magic literals
---------------------------------
While this is not the silencing of a "valid" use case where the check, as of the current implementation, it is too verbose for, but I believe we could reasonably cull the noisiness of the check by simply saying that if there are array initialisers or //brace-init-lists// or whatever that contain multiple non-portable literals, then **either** we simply suck it up and do not warn for **any** of the elements... or for the sake of collegiality give **one** warning for the user. The former option would be the most silencing of all, while the latter would allow the user to do a single `// NOLINT` for the array instead of having to deal with `/* NOLINTBEGIN */ ... /* NOLINTEND */`.
Several chunks of noisy warnings come out of places like SHA-256, FFT, and various other "very mathematical" contexts where a large constant array is created with very specific proven values, all of which expressed and reported as "non-portable literals". It is no use to do 500000 warnings for a single array. (This is perhaps after all the same(-ish) family as the "enum" problem.)
================
Comment at: clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp:87
+ } else {
+ return { StrippedLiteral, 0, 0, 10 };
+ }
----------------
(Unrealistic nit: This will consume hypothetical additional literals such as `0!ZZZbb=` as if they were decimal literals.)
================
Comment at: clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp:153-154
+ "non-portable integer literal: integer literal with leading zeroes");
+ // Matches only the most significant bit,
+ // eg. unsigned value 0x80000000.
+ } else if (IsMSBBitUsed) {
----------------
(Style nit: These lines could be joined and still fit 80-col, I think.)
================
Comment at: clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.h:16
+
+/// Finds integer literals that are being used in a non-portable manner.
+///
----------------
I am still not entirely convinced whether this one-line summary makes too much sense to a user who does not know the Standard out of heart... As we hunt for bit widths more or less perhaps we could weave this notion into the single sentence summary saying that integers with "non-portable representations because of bit width"... Will need to give this some thought.
================
Comment at: clang-tools-extra/test/clang-tidy/checkers/portability/non-portable-integer-constant.cpp:1
+// RUN: %check_clang_tidy %s -std=c++17-or-later portability-non-portable-integer-constant %t
+
----------------
PiotrZSL wrote:
> no need to limit this check to c++17 or later.
> when it's targeting all versions of C and C++
Binary literals require C++17 and some version of newer C standard (think C11?). However, technically it could work to use `c++11-or-later` and then use the `__cplusplus` macro to only compile the binary literal test when appropriately C++17.
================
Comment at: clang-tools-extra/test/clang-tidy/checkers/portability/non-portable-integer-constant.cpp:72-74
+
+
+
----------------
(Style nit: Did we lose a test here, or is this just an accidental contiguous area of newlines?)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D146712/new/
https://reviews.llvm.org/D146712
More information about the cfe-commits
mailing list