[PATCH] Inconsistency in Preprocessor::ReleaseMacroInfo(MacroInfo *MI)

Yaron Keren yaron.keren at gmail.com
Tue Aug 5 11:29:40 PDT 2014


Hi,

For clang, peak memory usage could be measured.
For cling you'd want to minimize the 'leakage' after running all the tests
or similar.

>From the clang allocations, I think cling iterate over Decls only. There
are lots of other auxiliary data types which are not freed until
destruction of the bump allocator.

I can't speak for the clang community...
I'd suppose that performance is considered much more important that
lowering peak memory usage. Memory is cheap while time is not so any change
must not slow the compilation.

Under this condition it would be nice to recycle memory while compiling to
reduce peak memory usage. This may also improve memory locallity and
actually gaining performance.

The memory manager must be very fast. You may wish to use the
RecyclingAllocator (fixed size, very fast) for various classes such as
*Decl. This would help with the memory 'leakage' in cling while not slowing
compilation.

Yaron




2014-08-04 23:46 GMT+03:00 Vassil Vassilev <vvasilev at cern.ch>:

>  On 04/08/14 13:35, Yaron Keren wrote:
>
>  Slab allocator (for example Support\RecyclingAllocator.h) will be fast
> enough and would work certainly for fixed size classes. However not only
> Decls are dynamically allocated but a large number of other classes, this
> startegy will end up with tens of fixed size allocators in addition to a
> variable-sized catch-all for the less popular sizes and dynamic-sized
> allocations. Some examples:
>
>  NamedDecl **NamedChain = new
> (SemaRef.Context)NamedDecl*[Chaining.size()];
> TemplateArgument *ArgumentPack = new (S.Context)
> TemplateArgument[Pack.New.size()];
>
> I see. For me the big unknown is how to measure memory footprints
> improvement/degradation when playing with the allocators. Is there anything
> that can be used as a golden reference?
>
>
>   Since the recycled memory is not shared between the different-sized
> fixed and the dynamic allocators there will be significant waste. Not all
> allocations go through ASTConext. Search for '.Allocate<' to find some that
> use the allocator directly.
>
>
>  Not all allocs are paired with a free. That could be found using the
> regular tools, if taught to watch for it.
>
> In my case, I iterate over the entities and I can call the dealloc. Maybe
> it can be generalized for mainline also.
>
>
>  Revising clang memory management is quite a task, while keeping top
> performance.
> It does have the potential to lower clang peak memory usage while
> compiling big programs.
>
> Would clang community be interested in work in this direction?
>
>
>  There is an excellent discussion of small object allocation in Chapter 4
> of "Modern C++ Design: Generic Programming and Design Patterns Applied" /
> Andrei Alexandrescu.
>
> Thanks for the pointer. Would this be the preferred approach in
> implementing the memory management?
> Vassil
>
>
>  Yaron
>
>
>
>
>
>
>
>
>  2014-08-04 13:42 GMT+03:00 Vassil Vassilev <
> vasil.georgiev.vasilev at cern.ch>:
>
>>  Hi Yaron,
>>   Thanks for raising those very interesting points!
>>
>> On 08/04/2014 12:12 PM, Yaron Keren wrote:
>>
>>  Hi,
>>
>>  You should not destruct SmallVector explicitly as its destructor will
>> be called from ~MacroInfo from ~MacroInfoChain from ~Preprocessor. If you
>> want to reclaim the Tokens memory use in MacroInfo use you need to do as
>> before, call ~MacroInfo and remove it from the chain.
>>
>>  Understood.
>>
>>
>>  The source of the memory usage comes much more from AST memory
>> "leakage" (leakage in the sense the memory is not freed until destruction
>> which does not happen in cling) and other allocations all around the code
>> rather than the bit of memory lost to the macros.
>>
>>  Yes, I was planning to replace the bump alloc with a slab alloc and
>> measure the performance penalty. This is on my todo list since some time.
>> What would be your guess?
>>
>>
>>  I have looked into this issue for Ceemple which has similar need as
>> cling for memory reclaim and gave it up for now.
>> It's actually quite hard to make clang reclaim memory before destruction,
>> since
>>
>>  I couldn't agree more.
>>
>>
>>  1) BumpPtrAllocator does not reuse free memory. Could be replaced by a
>> MallocAllocator or other custom allocation but this would reduce
>> compilation performance. It's hard to compete with BumpPtrAllocator
>> performance.
>>
>>  I think using a slab alloc may be not that bad for us.
>>
>>
>>  2) Freeing the memory slows down performance even more.
>> BumpPtrAllocator free is a no-op.
>>
>>  For our use-cases this is not an issue. This is on the error path, there
>> we can be slower. Memory is much much more important.
>>
>>
>>  3) Actually releasing the memory may cause use-after-free bugs which
>> are not seen now since the AST memory is never really released.
>>
>>  Another issue to tackle. I was thinking to borrow some ideas from llvm
>> like AssertingVH to track use-after-delete. Again I need to measure the
>> penalties. And codegen cleanup is a monster in this respect. I saw some
>> light in the tunnel with the recent changes (haven't updated clang since a
>> while).
>>
>>
>>  4)  BumpPtrAllocator is used everywhere, sometimes without calling the
>> no-op free, so even with a real allocator there would still be leaks
>> (meaning non-reused memory, in destruction all is freed to the system)
>> unless every allocation is matched with a free.
>>
>>  This should be fixable, but I agree not easy to debug. AFAIK for AST
>> nodes it happens only in the ASTContext.
>> Vassil
>>
>>
>>
>>  Yaron
>>
>>
>>
>>
>>  2014-08-04 12:26 GMT+03:00 Vassil Vassilev <
>> vasil.georgiev.vasilev at cern.ch>:
>>
>>>  Hi Yaron,
>>>
>>> On 08/04/2014 10:31 AM, Yaron Keren wrote:
>>>
>>>  Hi Vassil,
>>>
>>>  If you decide to keep the last code posted as a cling patch, it could
>>> do with 'I' only instead of 'I' and 'current', and when MI is the first
>>> node the code should set MIChainHead but not set its Next.
>>>
>>> Thanks for pointing out, will do.
>>>
>>>
>>>  To the point, ReleaseMacroInfo just releases the SmallVector Tokens
>>> memory if it wasn't small.
>>> It did not modify anything else. You could still removeMacro without
>>> ReleaseMacroInfo.
>>>
>>>  Thanks for explaining. My code looks like this:
>>>
>>> void Preprocessor::removeMacro(IdentifierInfo *II, const MacroDirective
>>> *MD) {
>>>   assert(II && MD);
>>>   assert(!MD->getPrevious() && "Already attached to a MacroDirective
>>> history.");
>>>
>>>   //Release the MacroInfo allocated space so it can be reused.
>>>   MacroInfo* MI = MD->getMacroInfo();
>>>   if (MI) {
>>>     ReleaseMacroInfo(MI);
>>>   }
>>>   Macros.erase(II);
>>> }
>>>
>>> IIUC I need to check if the small vector isSmall and if not then do a
>>> ReleaseMacro, or even this is redundant?
>>>
>>>
>>>  There's lots of places in clang where memory is allocated and not
>>> released until destruction for performance.
>>> The whole AST for starters...
>>>
>>>
>>>  It would be nice to early release the Tokens but In this context it
>>> would hardly move the needle.
>>>
>>>  I agree. So I need to somehow implement it.
>>>
>>>
>>>  cling memory use should going up every iteration due to this startegy,
>>> no?
>>>
>>>  Yes, it grows. The context I want things removed is support of 'code
>>> unloading'. Say:
>>> [cling] #include "MyFile.h"
>>> [cling] MyClass m; m.do();
>>> // Figure out that do is not what I want. I edit the file and do:
>>> [cling] #include "MyFile.h" // It would undo everything up to #include
>>> "MyFile.h" (inclusively). I want the memory to be reduced also. This is why
>>> I need to delete the macros and not only undef them. (The same holds for
>>> the AST)
>>> [cling] MyClass m; m.do(); // Here do and MyClass may have completely
>>> different implementation.
>>>
>>> Vassil
>>>
>>>
>>>  Yaron
>>>
>>>
>>>
>>>
>>>
>>>  2014-08-04 10:47 GMT+03:00 Vassil Vassilev <
>>> vasil.georgiev.vasilev at cern.ch>:
>>>
>>>>  Hi Richard,
>>>>   Thanks for the fix!
>>>>
>>>>   Unfortunately it doesn't help for cling case. I implement a
>>>> removeMacro routine using ReleaseMacroInfo. ReleaseMacroInfo allowed me to
>>>> implement efficiently the removal of a macro instead of dragging a long def
>>>> undef chains, for example.
>>>>   IIUC it allowed some memory reduction in some cases for clang, too.
>>>> Is there any chance to keep the ReleaseMacroInfo upstream?
>>>>  Vassil
>>>>
>>>> On 08/04/2014 01:50 AM, Richard Smith wrote:
>>>>
>>>> Fixed in a much more simple way in r214675. Thanks for reporting!
>>>>
>>>>
>>>> On Sun, Aug 3, 2014 at 11:52 AM, Vassil Vassilev <vvasilev at cern.ch>
>>>> wrote:
>>>>
>>>>>  I will try just one more time and then shut up :)
>>>>>
>>>>>
>>>>> diff --git a/lib/Lex/PPDirectives.cpp b/lib/Lex/PPDirectives.cpp
>>>>> index 5f38387..000ea7a 100644
>>>>> --- a/lib/Lex/PPDirectives.cpp
>>>>> +++ b/lib/Lex/PPDirectives.cpp
>>>>> @@ -94,6 +94,19 @@
>>>>> Preprocessor::AllocateVisibilityMacroDirective(SourceLocation Loc,
>>>>>  /// error in the macro definition.
>>>>>  void Preprocessor::ReleaseMacroInfo(MacroInfo *MI) {
>>>>>    // Don't try to reuse the storage; this only happens on error paths.
>>>>> +
>>>>> +  // If this is on the macro info chain, avoid double deletion on
>>>>> teardown.
>>>>> +  MacroInfoChain *current = MIChainHead;
>>>>> +  while (MacroInfoChain *I = current) {
>>>>> +    if (&(I->MI) == MI) {
>>>>> +      I->Next = (I->Next) ? I->Next->Next : 0;
>>>>> +      if (I == MIChainHead)
>>>>>  +        MIChainHead = I->Next;
>>>>>
>>>>> +      break;
>>>>> +    }
>>>>> +    current = I->Next;
>>>>> +  }
>>>>> +
>>>>>    MI->~MacroInfo();
>>>>>  }
>>>>>
>>>>>
>>>>>   On 03/08/14 20:47, Vassil Vassilev wrote:
>>>>>
>>>>> Hi,
>>>>>   Sorry overlooked, thanks for pointing it out!
>>>>>   I hope this is what we want.
>>>>> Vassil
>>>>>
>>>>> diff --git a/lib/Lex/PPDirectives.cpp b/lib/Lex/PPDirectives.cpp
>>>>> index 5f38387..000ea7a 100644
>>>>> --- a/lib/Lex/PPDirectives.cpp
>>>>> +++ b/lib/Lex/PPDirectives.cpp
>>>>> @@ -94,6 +94,19 @@
>>>>> Preprocessor::AllocateVisibilityMacroDirective(SourceLocation Loc,
>>>>>  /// error in the macro definition.
>>>>>  void Preprocessor::ReleaseMacroInfo(MacroInfo *MI) {
>>>>>    // Don't try to reuse the storage; this only happens on error paths.
>>>>> +
>>>>> +  // If this is on the macro info chain, avoid double deletion on
>>>>> teardown.
>>>>> +  MacroInfoChain *current = MIChainHead;
>>>>> +  while (MacroInfoChain *I = current) {
>>>>> +    if (&(I->MI) == MI) {
>>>>> +      I->Next = (I->Next) ? I->Next->Next : 0;
>>>>> +      if (I == MIChainHead)
>>>>> +        MIChainHead = I;
>>>>> +      break;
>>>>> +    }
>>>>> +    current = I->Next;
>>>>> +  }
>>>>> +
>>>>>    MI->~MacroInfo();
>>>>>  }
>>>>>
>>>>> On 03/08/14 20:28, Yaron Keren wrote:
>>>>>
>>>>>  Hi,
>>>>>
>>>>>  MIChainHead is a pointer to the head of a linked list
>>>>> of MacroInfoChain nodes, each containing a MacroInfo and MacroInfoChain*.
>>>>>
>>>>>  Why does the while loop modify MIChainHead on every iteration?
>>>>> MIChainHead should be modified only if it points to the node
>>>>> containing the removed MacroInfo MI. In all other cases it should not
>>>>> change.
>>>>>
>>>>>  As it is now, the loop will always terminate with MIChainHead ==
>>>>> nullptr.
>>>>>
>>>>>  Yaron
>>>>>
>>>>>
>>>>>
>>>>>  2014-08-03 21:10 GMT+03:00 Vassil Vassilev <vvasilev at cern.ch>:
>>>>>
>>>>>>  Hi Yaron,
>>>>>>   Yes I meant double destruction.
>>>>>> Vassil
>>>>>>
>>>>>> On 03/08/14 20:08, Yaron Keren wrote:
>>>>>>
>>>>>>  Hi Vassil,
>>>>>>
>>>>>>  Do you mean double destruction (not deletion) of MacroInfo first
>>>>>> time in ReleaseMacroInfo and the second time in ~Preprocessor via
>>>>>>  ~MacroInfoChain?
>>>>>>
>>>>>>    while (MacroInfoChain *I = MIChainHead) {
>>>>>>     MIChainHead = I->Next;
>>>>>>     I->~MacroInfoChain();
>>>>>>   }
>>>>>>
>>>>>>  or something else?
>>>>>>
>>>>>>  Yaron
>>>>>>
>>>>>>
>>>>>>
>>>>>>  2014-08-02 23:05 GMT+03:00 Vassil Vassilev <vvasilev at cern.ch>:
>>>>>>
>>>>>>> Hi,
>>>>>>>   In cases where ReleaseMacroInfo gets called and it doesn't cleanup
>>>>>>> the Preprocessor's MIChainHead can lead to double deletion. I am sending
>>>>>>> the patch that fixes the problem for me.
>>>>>>> Vassil
>>>>>>>
>>>>>>>
>>>>>>> diff --git a/lib/Lex/PPDirectives.cpp b/lib/Lex/PPDirectives.cpp
>>>>>>> index 5f38387..1a9b5eb 100644
>>>>>>> --- a/lib/Lex/PPDirectives.cpp
>>>>>>> +++ b/lib/Lex/PPDirectives.cpp
>>>>>>> @@ -94,6 +94,14 @@
>>>>>>> Preprocessor::AllocateVisibilityMacroDirective(SourceLocation Loc,
>>>>>>>  /// error in the macro definition.
>>>>>>>  void Preprocessor::ReleaseMacroInfo(MacroInfo *MI) {
>>>>>>>    // Don't try to reuse the storage; this only happens on error
>>>>>>> paths.
>>>>>>> +
>>>>>>> +  // If this is on the macro info chain, avoid double deletion on
>>>>>>> teardown.
>>>>>>> +  while (MacroInfoChain *I = MIChainHead) {
>>>>>>> +    if (&(I->MI) == MI)
>>>>>>> +      I->Next = (I->Next) ? I->Next->Next : 0;
>>>>>>> +    MIChainHead = I->Next;
>>>>>>> +  }
>>>>>>> +
>>>>>>>    MI->~MacroInfo();
>>>>>>>  }
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> cfe-commits mailing list
>>>>>>> cfe-commits at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-commits mailing listcfe-commits at cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-commits mailing list
>>>>> cfe-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140805/adcc24e7/attachment.html>


More information about the cfe-commits mailing list