[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

Tue Apr 20 09:26:14 PDT 2021

+Chris Lattner <clattner at nondot.org> for the last two updates as well. I
really don't think this optimization is getting the overall win that it's
thought to and breaking roughly every kind of leak checker mechanism I've
seen.

-eric

On Tue, Apr 20, 2021 at 12:13 PM Sterling Augustine via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> In order to understand how much benefit this change gives to code size, I
> built clang and related files with and without the patch, both with
> CMAKE_BUILD_TYPE=Release.
>
> clang itself gets about 0.4% smaller (from 145217124 to 144631078)
> lld gets about 1.85% smaller (from 101129181 to 99243810 bytes)
>
> Spot checking a few other random targets, in general, it seems that the
> benefits depend heavily on the coding style, but roughly, bigger the
> binary, the less benefit this brings.
>
> I suspect that a more aggressive pass as described by Philip Reames could
> capture a significant part of the benefit without sacrificing
> functionality. But that would require a more detailed study to be sure.
>
>
> On Mon, Apr 19, 2021 at 4:52 PM Sterling Augustine <saugustine at google.com>
> wrote:
>
>> There may be other ways to disable leak checkers, but they put a burden
>> on users that was not there before. Further, users who try leak checking
>> with and without optimization will get different answers. The bug report
>> will read: "clang at -O2 makes my program leak". And, as James notes,
>> whether or not you need to suppress the leak depends on whether or not the
>> optimizer does away with the variable. Subtle changes to the code that have
>> nothing to do with memory allocation will appear to add or fix leaks. That
>> is not something I would care to explain or document.
>>
>> Although the code could definitely be improved, the comments that it is
>> brittle or fragile seems overstated. Google uses this code without issue
>> against against many millions of lines of code every day, in
>> tens-of-thousands of targets, and has since 2012. It may be ugly, but it
>> does work. We are open to improving it, if it is only the ugliness that is
>> of concern.
>>
>> I am working on quantifying the benefit the change gives, but I won't
>> have results until tomorrow.
>>
>> On Mon, Apr 19, 2021 at 5:24 AM James Y Knight <jyknight at google.com>
>> wrote:
>>
>>> There is no problem (no leaks) in the code that users wrote, so adding
>>> code annotations (sanitizer suppression file, or attributes) is not a good
>>> solution to this issue. The problem is that this optimization introduces a
>>> "leak" (from the point of view of the leak checker), which wasn't there
>>> before. And in practice, this seems to cause a large number of false
>>> positives.
>>>
>>> This can apparently happen simply by having the code which reads from a
>>> global variable happen to be unused or optimized away in the current
>>> compilation. Users would be effectively randomly annotating variables
>>> to work around the compiler, not for any reason apparent in the code they
>>> wrote.
>>>
>>> I don't know what the right answer should be -- I've not looked into it
>>> at all. But I don't think it's correct to just dismiss this as
>>> not-a-problem, even if the current solution is not a good solution.
>>>
>>>
>>> On Mon, Apr 19, 2021, 12:16 AM Chris Lattner via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi Sterling,
>>>>
>>>> I agree with the others: therIe are better (and more robust ways) to
>>>> disable leak checkers.  This only addresses one narrow case.  The code is
>>>> also super verbose and fragile looking.  This is also eliminating an
>>>> optimization.
>>>>
>>>> I’d recommend dropping the code.
>>>>
>>>> -Chris
>>>>
>>>> On Apr 14, 2021, at 9:38 AM, Sterling Augustine via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> [Continuing discussion from https://reviews.llvm.org/D69428]
>>>>
>>>> Llvm is fairly conservative when eliminating global variables (or
>>>> fields of such) that may point to dynamically allocated memory. This
>>>> behavior is entirely to help leak checking tools such as Valgrind, Google's
>>>> HeapLeakChecker, and LSAN, all of which treat memory that is reachable at
>>>> exit as "not leaked", even though it will never be freed. Without these
>>>> global variables to hold the pointer, the leak checkers can't determine
>>>> that it is actually reachable, and will report a leak. Global variables
>>>> that dynamically allocate memory but don't clean themselves up are fairly
>>>> common in the wild, and various leak checkers have long not reported errors.
>>>>
>>>> This behavior was added all the way back in 2012 in
>>>> https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20120625/145646.html
>>>> .
>>>>
>>>> https://reviews.llvm.org/D69428 removed this behavior, and I
>>>> subsequently reverted it when many internal Google tests started failing,
>>>> but I believe many other users who use leak checking will encounter errors
>>>> when this hits more mainstream releases.
>>>>
>>>> So: What to do?
>>>>
>>>> Preventing a valid transformation (the global variables are never read
>>>> and can be eliminated) to help the leak checkers leaves some performance
>>>> and code size on the table. Just how much is unclear.
>>>>
>>>> On the other hand, having leak checkers suddenly start reporting
>>>> failures where they didn't before also seems suboptimal. Cleaning this
>>>> somewhat common scenario up is surprisingly difficult at the user level.
>>>>
>>>> Some possibilities:
>>>>
>>>> 1. Only do this at high optimization levels, say -O3. This would give
>>>> aggressive users all the performance we can, but also make leak checkers
>>>> report leaks sometimes, but not others.
>>>>
>>>> 2. Hide it behind a flag or configurable option. Users who care can set
>>>> it as they prefer. Creates more confusing options, different testing
>>>> matrices and such, but everyone can get the behaviour that they want.
>>>>
>>>> 3. Do it all the time, and users who encounter issues can clean up
>>>> their code. Users get the most performance they possibly can, but have to
>>>> clean up code or drop leak checking. Seems a little user hostile.
>>>>
>>>> Other possibilities?:
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210420/bbb5bc60/attachment.html>