[llvm-dev] Inexplicable ASAN report. Code generation bug?
Greg Stark via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 11 18:43:21 PST 2015
On Thu, Nov 12, 2015 at 2:18 AM, Greg Stark <stark at mit.edu> wrote:
> So it looks to me like -O2 is causing the optimizer to turn the 2-byte
> read into a 4-byte read and overrun the allocated object. But I
> haven't tried looking at the assembly yet.
Fwiw the assembly is obviously a 4-byte load but identifying the
mapping from the C code reading two of those bytes to the assembly is
a bit beyond my level of familiarity with x86 assembly:
#define NUMERIC_WEIGHT(n) (NUMERIC_HEADER_IS_SHORT((n)) ? \
(((n)->choice.n_short.n_header & NUMERIC_SHORT_WEIGHT_SIGN_MASK ? \
~NUMERIC_SHORT_WEIGHT_MASK : 0) \
| ((n)->choice.n_short.n_header & NUMERIC_SHORT_WEIGHT_MASK)) \
: ((n)->choice.n_long.n_weight))
...
0x000000000144a6ba <+1050>: mov %r15d,%ecx
0x000000000144a6bd <+1053>: and $0x7,%ecxn
0x000000000144a6c0 <+1056>: add $0x3,%ecx
0x000000000144a6c3 <+1059>: movsbl %al,%eax
0x000000000144a6c6 <+1062>: cmp %eax,%ecx
0x000000000144a6c8 <+1064>: jl 0x144a441 <numeric_out+417>
0x000000000144a6ce <+1070>: mov %r15,%rdi
0x000000000144a6d1 <+1073>: callq 0x526350 <__asan_report_load4>
==> 0x000000000144a6d6 <+1078>: mov $0x2e02680,%edi
0x000000000144a6db <+1083>: mov %r14,0x10(%rbx)
0x000000000144a6df <+1087>: mov %rsi,%r14
0x000000000144a6e2 <+1090>: callq 0x53b1b0 <__sanitizer_cov()>
0x000000000144a6e7 <+1095>: mov %r14,%rsi
0x000000000144a6ea <+1098>: mov 0x10(%rbx),%r14
0x000000000144a6ee <+1102>: jmpq 0x144a4a1 <numeric_out+513>
0x000000000144a6f3 <+1107>: mov %edi,%ecx
0x000000000144a6f5 <+1109>: and $0x7,%ecx
0x000000000144a6f8 <+1112>: add $0x3,%ecx
0x000000000144a6fb <+1115>: movsbl %al,%eax
...
So I'm guessing the logic is that the struct as that VLA so the
compiler sees that the struct size is at least two more bytes so it
assumes memory will always be allocated for the whole object and feels
free to reference those extra bytes? In practice this is a false
positive since the object will always be aligned so those two extra
bytes will always be on the same page. We already have another code
site with a similar false positive but it's a much narrower
circumstance. If this can happen anywhere there's a non 4-byte access
then that will remove a lot of the value in asan (fwiw msan doesn't
complain about this)
--
greg
More information about the llvm-dev
mailing list