<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/57102>57102</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[clang][-Wformat] consider moving truncating format flags to -Wformat-pedantic
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang,
clang:diagnostics
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
nickdesaulniers
</td>
</tr>
</table>
<pre>
Vaguely reminiscent of #40812.
We've been having a hard time enabling -Wformat for clang within the Linux kernel.
https://github.com/ClangBuiltLinux/linux/issues/378
https://lore.kernel.org/llvm/CAHk-=wivP4zipYnwNWCLF5cd24GLs3m8=Sp7M-CmmPva_UC+3Q@mail.gmail.com/
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=21f9c8a13bb2a0c24d9c6b86bc0896542a28c197
https://lore.kernel.org/llvm/CAHk-=wgJA=e-CLcvU5LRKu0bMLeAewXtOM6as1hFVeQAVkMPbg@mail.gmail.com/
Consider the following hypothetical test case:
```c
#include <stdio.h>
void bar (int unused, int x) {
  // -Wformat: char format, arg is int
  printf("bottom word is %hhd\n", x);
}
```
When compiled with `-Wformat -O2 -mskip-rax-setup`, we get the following warning (`-mskip-rax-setup` is there just to make the output marginally clearer, though it _is_ used by the Linux kernel):
```
warning: format specifies type 'char' but the argument has type 'int' [-Wformat]
    printf("bottom word is %hhd\n", x);
                           ~~~~     ^
                           %d
```
The same build produces the following disassembly:
```asm
bar:                                    # @bar
        movl    $.L.str, %edi
        xorl    %eax, %eax
        jmp     printf                  # TAILCALL
.L.str:
        .asciz  "bottom word is %hhd\n"
```
Let's consider informing the compiler that this was intentional by adding an explicit cast in the caller to show we meant to truncate.
```diff
 #include <stdio.h>
 void bar (int unused, int x) {
   // -Wformat: char format, arg is int
-  printf("bottom word is %hhd\n", x);
+  printf("bottom word is %hhd\n", (signed char)x);
 }
```
This silences the warning, and shows intent. Maybe we add a comment for reviewers + maintainers to understand we intend to truncate the value here.
Oh, but look at the disassembly:
```diff
 bar:
+        movzbl  %sil, %esi
         movl    $.L.str, %edi
         xorl    %eax, %eax
         jmp     printf
```
This adds overhead to the caller, even though the callee (`printf`) is still going to look only at the low byte because the format string still uses the `%hhd` conversion. That's kind of a waste.
Here's a diff of the LLVM IR as well:
```diff
 define dso_local void @foo(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
-  %2 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([21 x i8], [21 x i8]* @.str, i64 0, i64 0), i32 noundef %1)
+  %3 = and i32 %1, 255
+  %4 = tail call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([21 x i8], [21 x i8]* @.str, i64 0, i64 0), i32 noundef %3)
   ret void
 }
```
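To make the redundancy concrete, here is a minimal sketch comparing the uncast and cast calls at runtime. The helper names `fmt_low_byte_plain` and `fmt_low_byte_cast` are hypothetical and not part of the test case above. It also assumes the usual implementation-defined behavior where converting an out-of-range `int` to `signed char` keeps the low 8 bits.

```c
#include <stdio.h>

/* Hypothetical helper: pass the int straight through to %hhd.
 * clang's -Wformat warns on this call (char format, int argument). */
int fmt_low_byte_plain(char *buf, size_t n, int x) {
    return snprintf(buf, n, "%hhd", x);
}

/* Hypothetical helper: truncate in the caller first.
 * This silences the warning but costs an extra instruction. */
int fmt_low_byte_cast(char *buf, size_t n, int x) {
    return snprintf(buf, n, "%hhd", (signed char)x);
}
```

Under those assumptions both helpers should write the same text for any `x`, since `%hhd` already converts its argument to `signed char` before printing. The cast only changes what the caller does, not what gets printed.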
One might argue "just declare `x` as `signed char` then." But I think we have quite a few use cases in the kernel (primarily debugging) where we might want to retain a full word's (or more) worth of storage but then print individual bytes.
Can we consider moving truncating format instances of `-Wformat` to either `-Wformat-pedantic` or perhaps a new sub-flag in the `-Wformat` group?
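As an illustration of that kind of debugging use, here is a hedged sketch of a helper that keeps a full word around and prints its bytes one at a time. `dump_word_bytes` is a hypothetical name, not kernel code, and the sketch assumes a 32-bit `unsigned int`.

```c
#include <stdio.h>

/* Hypothetical debug helper: print the four bytes of a 32-bit word,
 * most significant byte first. Every argument is a full unsigned int;
 * %hhx tells printf to keep only the low byte of each one.
 * clang's -Wformat flags all four arguments here. */
int dump_word_bytes(char *buf, size_t n, unsigned int word) {
    return snprintf(buf, n, "%02hhx %02hhx %02hhx %02hhx",
                    word >> 24, word >> 16, word >> 8, word);
}
```

Silencing the warning here would mean four explicit `(unsigned char)` casts, each of which only repeats what `%hhx` already asks `printf` to do.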
cc @zygoloid @AaronBallman
`CheckPrintfHandler::checkFormatExpr` is the method in clang that does the format flag warning logic. We might be able to consider moving some more cases to `-Wformat-pedantic`.
</pre>