[llvm-dev] Eliminating global memory roots (or not) to help leak checkers

Chris Lattner via llvm-dev llvm-dev at lists.llvm.org
Wed May 5 16:20:12 PDT 2021


Sorry for the delay, been preoccupied with other things.  Responding to a couple of points:

On Apr 22, 2021, at 4:40 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> Code that this triggers *is* leaking memory, it was just silenced because the leak was spuriously reachable from a global.  Global variables aren’t a preferred way to silence leak detectors, they have other ways to do so :)
> 
> "spuriously" reachable is quite questionable here. This way of writing
> code is to make the allocation quite intentionally reachable from a
> global.
> 
> For instance, one of the ways to disable global destruction: (
> https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables
> )
> 
> "If all else fails, you can create an object dynamically and never
> delete it by using a function-local static pointer or reference (e.g.,
> static const auto& impl = *new T(args...);)."
> 
> (all else fails in a large enough codebase rather often... )

Yes, I think that it is a pragmatic point that there is a lot of code out here doing this, and it isn’t practical to expect people to mark everything with leak checkers.  This was reinforced by other comments, e.g.:

> On Apr 23, 2021, at 7:18 AM, James Y Knight <jyknight at google.com> wrote:
> That's the definition of "leak" which is always used by leak checkers. (This is not anything new, it's been how leak checkers work for decades, and how they must work.)
> 
> Therefore, C++ code that allocates memory and assigns it to a global is not a leak, and it's _still_ not a leak even if it so happens in some instantiation of the program that all of the users of the global have been removed by the optimizer.

There are several other good comments that are making similar observations. I would paraphrase this as saying that “C/C++ in practice use the ‘assign pointer to a global to silence leak checkers’” as a sort of extended contract that makes these leak checkers work.  While use of volatile or attribute((used)) could theoretically be used to deal with this, I think we can all agree there is a lot of code in the wild and it isn’t reasonable to change it all, and I have no interest in breaking the world.


But here’s the problem: LLVM isn’t a C compiler.  It is used for a wide variety of purposes, including many languages that are not C.  LLVM contains a set of optimizations that can be used to do many different things, and the guiding principle is that it has an IR which defines which set of transformations are safe.  Deleting these “roots” is an entirely reasonable thing for GlobalOpt to do given the IR as is defined in LangRef.  Even C and C++ dump out global variable metadata (e.g. RTTI info or tables etc) which contain pointers to static data that is conceptually safe to squeeze according to the existing “as if” rule.


I see two paths forward here, the first of which is more principled:

1) Introduce a new “may contain a root" IR attribute that can be applied to a global variable that says “don’t delete pointer members that may root heap allocated memory”.  This captures the property in the IR, and then of course GlobalOpt would look at this to gate its transformation.  By defining this property carefully, we can allow global opt to delete pointers to static memory and other things that aren’t a problem.  Given this IR attribute, clang would generously slather it on IR global variables in the right place - we could easily add a clang command line flag to control this if needed.

2) We can change clang to tip toe around this, e.g. by adding a flag to the GlobalOpt pass itself, and having Clang run the GlobalOpt pass in a mode that disables this behavior.


Either of these would keep LLVM-the-mid-level-optimizer true to what it should be doing, without breaking the C community and associated tools.

-Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210505/edb44888/attachment.html>


More information about the llvm-dev mailing list