<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/94769>94769</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[dfsan] sscanf function incorrectly ignores ordinary characters in the format string
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
thurstond
</td>
</tr>
</table>
<pre>
### Example
Illustrated in https://github.com/llvm/llvm-project/pull/94700:
```
char buf[256] = "10000000000-100000000000 rw-p 00000000 00:00 0";
long rss = 0;
// This test exposes a bug in DFSan's sscanf, that leads to flakiness
// in release_shadow_space.c (see
// https://github.com/llvm/llvm-project/issues/91287)
if (sscanf(buf, "Garbage text before, %ld, Garbage text after", &rss) == 1) {
  printf("Error: matched %ld\n", rss); // THIS ERROR HAPPENS WITH DFSAN
  return 1;
}
```
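For comparison, a conforming sscanf should report zero conversions for this input, because the first ordinary character in the format (`G`) does not match the first input character (`1`). A minimal sketch of that expectation, where `scan_with_literals` is a hypothetical helper name and not part of the DFSan tests:

```c
#include <stdio.h>

// Hypothetical helper (not from the DFSan test suite): returns the number of
// conversions sscanf performed for the format used in the example above.
static int scan_with_literals(const char *buf, long *out) {
  return sscanf(buf, "Garbage text before, %ld, Garbage text after", out);
}
```

With an uninstrumented libc, calling this on the `buf` from the example should return 0 and leave `*out` untouched; it should return 1 only when the input actually starts with the literal text.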
### Implications
* The bug caused a failure in the release_shadow_space.c test (https://github.com/llvm/llvm-project/issues/91287).
* The failure occurs because DFSan's sscanf ignores ordinary characters in the format string. DFSan's release_shadow_space.c test relies on sscanf to scrape the RSS from /proc/maps output, so it also scrapes numbers from irrelevant lines (e.g., base addresses), which makes the test flaky.
* The test can be fixed by filtering the sscanf matches, e.g., by using strstr to check for 'Rss: '.
* The bug also changes the semantics of instrumented programs that use sscanf, because input may erroneously match a format whose ordinary characters it does not contain.
* This probably needs a real fix in DFSan's sscanf.
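
The strstr filter mentioned above could look roughly like the following. This is a sketch under assumptions: `parse_rss_kb` is a hypothetical helper, and the `"Rss: %ld kB"` format is assumed for illustration rather than copied from the test.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: extracts the value from a line such as
// "Rss:        123 kB". Returns 1 on success and 0 otherwise.
static int parse_rss_kb(const char *line, long *rss_kb) {
  long value = 0;
  // Assumed format string; an sscanf that ignores ordinary characters
  // may report a match here even for unrelated lines.
  if (sscanf(line, "Rss: %ld kB", &value) != 1)
    return 0;
  // Filter: accept the match only if the literal prefix is really present.
  if (strstr(line, "Rss: ") == NULL)
    return 0;
  *rss_kb = value;
  return 1;
}
```

The extra strstr check is redundant with a conforming sscanf, so it only works around the bug in the test and does not fix the semantics of other instrumented programs.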
### Relevant code in DFSan's scan_buffer:
```
static int scan_buffer(char *str, size_t size, const char *fmt,
                       dfsan_label *va_labels, dfsan_label *ret_label,
                       dfsan_origin *str_origin, dfsan_origin *ret_origin,
                       va_list ap) {
  ...
    if (*formatter.fmt_cur != '%') {
      // Ordinary character. Consume all the characters until a '%' or the end
      // of the string.
      for (; *(formatter.fmt_cur + 1) && *(formatter.fmt_cur + 1) != '%';
           ++formatter.fmt_cur) {
        // EDITOR'S NOTE: SHOULD THIS CHECK AGAINST THE INPUT STRING?
      }
      retval = formatter.scan();
      dfsan_set_label(0, formatter.str_cur(),
                      formatter.num_written_bytes(retval));
```
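
One possible shape for the missing check is sketched below. It is not DFSan's code: `match_ordinary_chars` is a hypothetical standalone helper that works on plain pointers instead of the `formatter` object, and it assumes the usual C sscanf rules (format whitespace matches any amount of input whitespace, including none; every other ordinary character must match exactly). Handling of `%%` is left out.

```c
#include <ctype.h>
#include <stdbool.h>

// Hypothetical helper, not DFSan's actual implementation: consumes the run of
// ordinary characters at *fmt (up to a '%' or the end of the format) and
// requires the input at *str to match it. On success both pointers are
// advanced; on a mismatch they are left unchanged and false is returned.
static bool match_ordinary_chars(const char **fmt, const char **str) {
  const char *f = *fmt;
  const char *s = *str;
  while (*f != '\0' && *f != '%') {
    if (isspace((unsigned char)*f)) {
      // Whitespace in the format matches any amount of input whitespace,
      // including none.
      while (isspace((unsigned char)*f)) ++f;
      while (isspace((unsigned char)*s)) ++s;
    } else {
      // Any other ordinary character must match the input exactly.
      if (*s != *f) return false;
      ++f;
      ++s;
    }
  }
  *fmt = f;
  *str = s;
  return true;
}
```

A scanner built this way would stop and return the number of conversions completed so far as soon as the helper reports a mismatch, instead of continuing to the next conversion specifier.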
</pre>