<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 14, 2021 at 10:22 AM Mehdi AMINI <<a href="mailto:joker.eph@gmail.com">joker.eph@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi,<div><br></div><div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 14, 2021 at 10:02 AM David Blaikie via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">What was the original motivation for making the change - given the<br>
gains are unclear? Might help frame what the right path forward is.<br>
<br>
If the goal is to experiment to see if making this change is<br>
sufficiently valuable to either regress heap leak checkers or create a<br>
divergence where this is configurable/appears only in certain modes -<br>
then I'd say having a flag (maybe even an internal/cc1 only flag) is<br>
the right first step, enabling data to be gathered to help inform the<br>
design discussion.<br>
<br>
On Wed, Apr 14, 2021 at 9:39 AM Sterling Augustine via llvm-dev<br>
<<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
><br>
> [Continuing discussion from <a href="https://reviews.llvm.org/D69428" rel="noreferrer" target="_blank">https://reviews.llvm.org/D69428</a>]<br>
><br>
> Llvm is fairly conservative when eliminating global variables (or fields of such) that may point to dynamically allocated memory. This behavior is entirely to help leak checking tools such as Valgrind, Google's HeapLeakChecker, and LSAN, all of which treat memory that is reachable at exit as "not leaked", even though it will never be freed. Without these global variables to hold the pointer, the leak checkers can't determine that it is actually reachable, and will report a leak. Global variables that dynamically allocate memory but don't clean themselves up are fairly common in the wild, and various leak checkers have long not reported errors.<br>
><br>
> This behavior was added all the way back in 2012 in <a href="https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html" rel="noreferrer" target="_blank">https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html</a>.<br>
><br>
> <a href="https://reviews.llvm.org/D69428" rel="noreferrer" target="_blank">https://reviews.llvm.org/D69428</a> removed this behavior, and I subsequently reverted it when many internal Google tests started failing, but I believe many other users who use leak checking will encounter errors when this hits more mainstream releases.<br>
><br>
> So: What to do?<br>
><br>
> Preventing a valid transformation (the global variables are never read and can be eliminated) to help the leak checkers leaves some performance and code size on the table. Just how much is unclear.<br>
><br>
> On the other hand, having leak checkers suddenly start reporting failures where they didn't before also seems suboptimal. Cleaning this somewhat common scenario up is surprisingly difficult at the user level.<br>
><br>
> Some possibilities:<br>
><br>
> 1. Only do this at high optimization levels, say -O3. This would give aggressive users all the performance we can, but also make leak checkers report leaks sometimes, but not others.<br>
><br>
> 2. Hide it behind a flag or configurable option. Users who care can set it as they prefer. Creates more confusing options, different testing matrices and such, but everyone can get the behaviour that they want.<br>
><br>
> 3. Do it all the time, and users who encounter issues can clean up their code. Users get the most performance they possibly can, but have to clean up code or drop leak checking. Seems a little user hostile.<br></blockquote><div><br></div><div>It seems to me that to some extent this change may expose more leaks which aren't necessarily false positives, which is a net benefit. And in cases that are indeed false positives, shouldn't the users just add `volatile` on these pointers (or `__attribute__(__used__)`, but compared to volatile it does not guarantee that the stores won't be elided I think)?</div></div></div></div></div></div></div></blockquote><div><br></div><div>James points to me that the more universally accepted definition of leaks is not "memory allocated that isn't deallocated" as I naively thought, but that the memory isn't reachable at program termination.</div><div>Under this definition, it seems like we ought to be more conservative in the optimizer instead, so I retract what I wrote previously :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_quote"><div>So I'm curious to know how widely affected the users of the leak detection tools will be? Because this is still quite a problem of course if the amount of false positive leaks exposed is so important that most users would then just not use the tool instead of adding the volatile annotation to these "intentional leaks". I don't know if you have some data about this?</div><div><br></div><div>Best,</div><div><br></div><div>--</div><div>Mehdi</div><div><br></div></div></div></div></div></div></div>
</blockquote></div></div>