<div dir="ltr">Hello everybody,<div><br></div><div>I've run into some strange behavior with MemorySanitizer that I can't explain, and I'm hoping somebody with more knowledge of the implementation can help me out or at least point me in the right direction. </div>
<div><br></div><div>For background, I'm using MemorySanitizer to check Julia (<a href="http://julialang.org">julialang.org</a>), which uses (or at least will, once I track down a few bugs) MCJIT for code compilation. So far I have rebuilt the runtime and all dependencies (including LLVM, libcxx, etc.) with MemorySanitizer enabled and added the instrumentation pass in the appropriate place in the Julia code generator. </div>
<div><br></div><div>I'm now going through the usual bootstrap, which basically loads the standard library and compiles it, does inference, etc. This works fine for several hours (the bootstrap is normally several hundred times faster; I suspect MCJIT has to process far more code and relocations and is inefficient at it, but I can't prove that). That's not the issue, however. Eventually, I get</div>
<div><br></div><div><div>==17150== WARNING: MemorySanitizer: use-of-uninitialized-value</div><div> #0 0x7f417cea3189 in bitvector_any1 /home/kfischer/julia-san/src/support/bitvector.c:177</div></div><div>[ snip ]</div>
<div><br></div><div><div> Uninitialized value was created by a heap allocation</div><div> #0 0x7f41815de543 in __interceptor_malloc /home/kfischer/julia-san/deps/llvm-svn/projects/compiler-rt/lib/msan/msan_interceptors.cc:854</div>
<div> #1 0x7f417cc7d7f1 in alloc_big /home/kfischer/julia-san/src/gc.c:355</div></div><div>[snip]</div><div><br></div><div>Now, by going through it in the debugger, I see</div><div><br></div><div><div>(gdb) f 3</div><div>
#3 0x00007f417cea318a in bitvector_any1 (b=0x60c000607240, b@entry=<optimized out>, offs=0, offs@entry=<optimized out>, nbits=256, nbits@entry=<optimized out>)</div><div> at bitvector.c:177</div><div>
177 if ((b[0] & mask) != 0) return 1;</div><div>(gdb) p __msan_print_shadow(&b,8)</div><div>ff ff ff ff ff ff ff ff</div><div> o: 3f0010a6 o: 80007666</div></div><div><br></div><div>which seems to indicate that the local variable `b` holds uninitialized data. I'm having a hard time believing that, though, since if I look at the calling frames, the place the value comes from is initialized:</div>
<div><br></div><div><div>#4 0x00007f41755208a8 in julia_isempty248 ()</div><div>#5 0x00007f417c163e3d in jl_apply (f=0x606000984d60, f@entry=<optimized out>, args=0x7fff9132da20, args@entry=<optimized out>, nargs=1,</div>
<div> nargs@entry=<optimized out>) at ./julia.h:1043</div></div><div><br></div><div>(here's the code of that Julia function for reference)</div><div><br></div><div><div>isempty(s::IntSet) =</div><div> !s.fill1s && ccall(:bitvector_any1, Uint32, (Ptr{Uint32}, Uint64, Uint64), s.bits, 0, s.limit)==0</div>
</div><div><br></div><div>Looking at where that value is coming from:</div><div><br></div><div><div>(gdb) f 5</div><div>#5 0x00007f417c163e3d in jl_apply (f=0x606000984d60, f@entry=<optimized out>, args=0x7fff9132da20, args@entry=<optimized out>, nargs=1,</div>
<div> nargs@entry=<optimized out>) at ./julia.h:1043</div><div>1043 return f->fptr((jl_value_t*)f, args, nargs);</div></div><div><div>(gdb) p ((jl_array_t*)((void**)args[0])[1])->data</div><div>$43 = (void *) 0x60c000607240</div>
<div>(gdb) p __msan_print_shadow(((jl_array_t*)((void**)args[0])[1]),0x30)</div><div>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00</div>
<div> o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496 o: d800496</div></div><div><br></div><div>There are no uninitialized values to be seen anywhere and the `b` value isn't touched before that line, so I'm a little stumped.</div>
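<div><br></div><div>For anyone who doesn't know the C side, here is a simplified stand-in for what bitvector_any1 computes. This is a sketch, not the actual code from src/support/bitvector.c: the name any_bit_set and the bit-by-bit loop are my own simplification (the real function appears to work with whole-word masks, as line 177 suggests). The only point is that the check reads the 32-bit words that `b` points to, given a bit offset and a bit count, which matches the (Ptr{Uint32}, Uint64, Uint64) signature in the ccall above.</div><div><br></div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for bitvector_any1 (not the real code).
 * Returns 1 if any of the nbits bits starting at bit offset offs is set in
 * the word array b, and 0 otherwise. */
static uint32_t any_bit_set(const uint32_t *b, uint64_t offs, uint64_t nbits)
{
    assert(b != NULL);
    for (uint64_t i = offs; i < offs + nbits; i++) {
        /* Word index is i / 32, bit index within the word is i % 32. */
        if (b[i >> 5] & (UINT32_C(1) << (i & 31)))
            return 1;
    }
    return 0;
}
```

<div><br></div><div>With offs=0 and nbits=256, as in the frame above, this would read eight 32-bit words starting at `b`.</div>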
<div><br></div><div>One thing I should note is that I had to implement TLS support in MCJIT myself for this to work (I'll upstream the patch soon), so I may have made a mistake there, but I haven't found anything wrong yet. If nothing above looks unusual, I'd also appreciate pointers on what to look for in the TLS variables.</div>
<div><br></div><div>Thank you for your help,</div><div>Keno</div><div><br></div></div>