[llvm-dev] [EXTERNAL] Re: Eliminating global memory roots (or not) to help leak checkers

Fri Apr 23 03:49:50 PDT 2021

My 2 cents:

  1.  2% of code size improvement is alone a strong argument to revert r160529. In embedded world it sometimes determines whether image fits the ROM or not.

  1.  Code/data size reduction often correlates with performance improvements, due to smaller number of cache refills. I suspect data segment was reduced somewhat more than those 2%.

  1.  I don’t use LSAN and the whole idea that “what is reachable is not a leak” isn’t applicable to my use cases quite well. There are other ways to find leaking memory which do not prevent compiler optimizations. I suspect many people also don’t use LSAN, why should they suffer?

  1.  As it was already mentioned patch introduced by r160529 is of substandard quality and is not properly covered by test cases. This is yet another argument for removal.

  1.  I still can’t understand why LSAN instrumentation can’t prevent globals from being removed, delegating this to globalopt instead. Can someone explain this to me, please?

From: Sterling Augustine via llvm-dev<mailto:llvm-dev at lists.llvm.org>
Sent: 23 апреля 2021 г. 3:44
To: Chris Lattner<mailto:clattner at nondot.org>
Cc: llvm-dev<mailto:llvm-dev at lists.llvm.org>
Subject: [EXTERNAL] Re: [llvm-dev] Eliminating global memory roots (or not) to help leak checkers

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.  If you suspect potential phishing or spam email, report it to ReportSpam at accesssoftek.com
Have any of the proponents of the patch tried lsan against a reasonable large code base both with and with out it?

On Thu, Apr 22, 2021 at 4:28 PM Chris Lattner <clattner at nondot.org<mailto:clattner at nondot.org>> wrote:

> On Apr 19, 2021, at 5:24 AM, James Y Knight <jyknight at google.com<mailto:jyknight at google.com>> wrote:
>
> There is no problem (no leaks) in the code that users wrote, so adding code annotations (sanitizer suppression file, or attributes) is not a good solution to this issue. The problem is that this optimization introduces a "leak" (from the point of view of the leak checker), which wasn't there before. And in practice, this seems to cause a large number of false positives.

I think that “from the point of view of the leak checker” is the key thing there.

Code that this triggers *is* leaking memory, it was just silenced because the leak was spuriously reachable from a global.  Global variables aren’t a preferred way to silence leak detectors, they have other ways to do so :)

> On Apr 19, 2021, at 4:52 PM, Sterling Augustine <saugustine at google.com<mailto:saugustine at google.com>> wrote:
>
> There may be other ways to disable leak checkers, but they put a burden on users that was not there before. Further, users who try leak checking with and without optimization will get different answers. The bug report will read: "clang at -O2 makes my program leak". And, as James notes, whether or not you need to suppress the leak depends on whether or not the optimizer does away with the variable. Subtle changes to the code that have nothing to do with memory allocation will appear to add or fix leaks. That is not something I would care to explain or document.

I can see that concern, but this isn’t a battle that we can win: optimizations in general can expose leaks.

IMO, If someone doesn’t want the global to be removed, they should mark it volatile.  If they do want it removable, then they can use leak detector features to silence the warning.

> On Apr 20, 2021, at 9:12 AM, Sterling Augustine <saugustine at google.com<mailto:saugustine at google.com>> wrote:
>
> In order to understand how much benefit this change gives to code size, I built clang and related files with and without the patch, both with CMAKE_BUILD_TYPE=Release.
>
> clang itself gets about 0.4% smaller (from 145217124 to 144631078)
> lld gets about 1.85% smaller (from 101129181 to 99243810 bytes)
>
> Spot checking a few other random targets, in general, it seems that the benefits depend heavily on the coding style, but roughly, bigger the binary, the less benefit this brings.
>
> I suspect that a more aggressive pass as described by Philip Reames could capture a significant part of the benefit without sacrificing functionality. But that would require a more detailed study to be sure.

A ~2% reduction in code size is a huge win.  I agree with your comment about it being different with different coding styles.  I suspect that this is the sort of thing that will pay particularly for high abstraction code bases.

I don’t see why we would punish general code to make “code that is leaking where formerly not detected, and where users don’t want to mark the root as volatile”.  This seems really unprincipled to me, and a slippery slope we can’t go down.

-Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210423/f3f2501b/attachment.html>