[PATCH] D110784: Manually create unique_ptr in various pass adaptors

Mon Oct 4 16:45:29 PDT 2021

aeubanks added a comment.

In D110784#3041106 <https://reviews.llvm.org/D110784#3041106>, @dblaikie wrote:

> In D110784#3041091 <https://reviews.llvm.org/D110784#3041091>, @aeubanks wrote:
>
>> In D110784#3040968 <https://reviews.llvm.org/D110784#3040968>, @dblaikie wrote:
>>
>>> In D110784#3040848 <https://reviews.llvm.org/D110784#3040848>, @aeubanks wrote:
>>>
>>>> In D110784#3040750 <https://reviews.llvm.org/D110784#3040750>, @dblaikie wrote:
>>>>
>>>>> In D110784#3040749 <https://reviews.llvm.org/D110784#3040749>, @aeubanks wrote:
>>>>>
>>>>>> In D110784#3040708 <https://reviews.llvm.org/D110784#3040708>, @dblaikie wrote:
>>>>>>
>>>>>>> "83M -> 73M" is peak RAM usage during compilation, I guess?
>>>>>>>
>>>>>>> Is that worth the code changes? I think there's value in make_unique making it easier to read code/check for memory leaks, etc. (makes explicit new stand out more/get extra scrutiny when reading code)
>>>>>>
>>>>>> That's compile time, specifically the "instructions" stat in `perf stat` (I should have specified more clearly in the description).
>>>>>> PassBuilder.cpp is by far the longest file to compile (if you only build X86). With many cores/distributed build farms, it's the bottleneck. So yeah I think it's worth it.
>>>>>
>>>>> Ah, sorry, didn't catch the "instructions" on the line wrap there. I see.
>>>>>
>>>>> What about splitting up the file in some way(s)?
>>>>
>>>> We can't get around (at least with the current design) tons of instantiations of these because for each entry in PassRegistry.def we instantiate at least one of these adaptors, sometimes multiple ones. Splitting these up into multiple files doesn't make sense logically from a code organization point of view, since they're all about parsing a pass pipeline and a lot of the code is very similar between module/cgscc/function/loop passes.
>>>
>>> Yeah, I don't think it'd avoid any instantiations, but could improve build parallelism.
>>>
>>> If these explicit new+unique_ptr's are going to stay, they might warrant comments so someone doesn't come along and clean them up again? (I think we have clang-tidy checks that would probably identify this code and undo the work of this commit, for instance - and sometimes I go looking for uses of explicit new to see if they can be moved to more automatic memory management).
>>
>> Yeah they already have comments.
>
> Ah, sorry, looked at the last ones - seems the comment was missed in the last couple of ones in LoopPassManager.h

I added the comment to the remaining places.

>>>> I have tried splitting things out before with no real impact, although I can look again.
>>>
>>> No real impact on the compile time of each file separately/added together? (I think splitting a single long-compile, even if it adds some total time (eg: the compile time of the split files adds up to more than the compile time of the single file) can still be worthwhile for build parallelism)
>>
>> I split out some of the largest functions into PassBuilderPipelines.cpp and there was no measurable change in the compile time of PassBuilder.cpp. (I still submitted that because it was nice to separate it out)
>
> Ah, fair enough :/ happy to throw around ideas. It might be that it'd be more improvable by splitting into chunks of passes, rather than splitting functions (eg: rather than one file with the X operation for all passes and one file with the Y operation for all passes - instead one file with X and Y operations for the first half of passes, and another for the second - not that simple, I get, but a thought)

Some numbers since I was curious:
Removing everything but the includes: 15B instructions.
Removing parseModule/CGSCC/Function/LoopPass: 37B instructions.
Current state: 80B instructions.

I don't think it makes sense to split the parse*Pass() methods, they have a lot of similar code and it'd be unintuitive to have to update them if they were separate. And almost all the remaining code is to support those methods.
And that's how you'd probably split the passes, by splitting them per IR unit type. I don't think any other way would make sense.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110784/new/

https://reviews.llvm.org/D110784