[PATCH] D158675: [NFC][AsmPrinter] Refactor DbgVariable as a std::variant
Felipe de Azevedo Piovezan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 6 12:04:18 PDT 2023
fdeazeve added inline comments.
================
Comment at: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h:149
+struct MMILoc {
+ mutable SmallVector<FrameIndexExpr, 1> FrameIndexExprs;
+};
----------------
scott.linder wrote:
> fdeazeve wrote:
> > scott.linder wrote:
> > > fdeazeve wrote:
> > > > Not for this patch, but it really bothers me that this is mutable. The original rationale must have been to allow sorting when we call the "get" method, but still...
> > > It is almost spooky how you've pointed out the same things I toyed with while working on the patches originally!
> > >
> > > I had a patch in the middle somewhere which just exposed the need to sort in the interface, but scrapped it early on. I think your `set` idea makes a lot of sense and I'll follow up with a patch for that as well.
> > One thing to consider here: sorting (either in the current vector approach or through the internal impl of std::set) require calling `Expr->getFragmentInfo()->OffsetInBits` as many time as comparison operations are performed, and this is possibly done multiple times for the same expression without any caching.
> >
> > Something to keep in mind as a possible improvement as well (perhaps computing & storing the offset in advance)
> I tried to do a bit of (non-statistically-rigorous) benchmarking of a few approaches.
>
> The versions are:
>
> * main: unmodified main branch
> * patched+uncached: most straightforward switch to std::set, without any caching of the fragment info
> * patched+eager_cached: above, but assume the `OffsetInBits` will be needed, and cache it in the constructor of `FrameIndexExpr`
> * patched+lazy_cached: above, but re-introduce a `mutable` member so the comparison operator can cache the `OffsetInBits` on first use
>
> The input is a precodegen dump of llvm-tblgen from a FullLTO build of main. Not sure if this is really representative, but it at least contains multi-fragment MMI entries.
>
> If there is a better way to do the benchmark I am happy to redo it, this is just the best I could come up with that would take a reasonable amount of time.
>
> ```
> llc -O3 build-lto/bin/llvm-tblgen.0.5.precodegen.bc -o /dev/null -filetype=obj -time-passes
>
> ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
>
> main:
> 22.3635 ( 28.8%) 14.2489 ( 71.7%) 36.6124 ( 37.5%) 36.6142 ( 37.5%) X86 Assembly Printer
> 21.9819 ( 28.1%) 14.6388 ( 71.8%) 36.6207 ( 37.2%) 36.6238 ( 37.2%) X86 Assembly Printer
>
> patched+uncached:
> 21.3484 ( 28.0%) 14.4208 ( 72.4%) 35.7693 ( 37.2%) 35.7706 ( 37.2%) X86 Assembly Printer
> 21.4563 ( 28.2%) 14.4606 ( 72.2%) 35.9169 ( 37.4%) 35.9191 ( 37.4%) X86 Assembly Printer
>
> patched+eager_cached:
> 21.2332 ( 32.0%) 14.3204 ( 76.2%) 35.5536 ( 41.8%) 35.5555 ( 41.8%) X86 Assembly Printer
> 21.6691 ( 31.7%) 14.5448 ( 74.4%) 36.2139 ( 41.2%) 36.2149 ( 41.2%) X86 Assembly Printer
>
> patched+lazy_cached:
> 22.4542 ( 27.9%) 14.9983 ( 71.0%) 37.4525 ( 36.9%) 37.4537 ( 36.9%) X86 Assembly Printer
> 22.9115 ( 28.3%) 14.7169 ( 69.3%) 37.6284 ( 36.8%) 37.6299 ( 36.8%) X86 Assembly Printer
> ```
>
> I only ran 2 trials so there is a lot of room for variance, but it seems like everything other than patched+eager_cached is more-or-less comparable when comparing User Time as a percent of compilation time. This makes some sense to me, as I suspect the typical case is to not have multiple fragments anyway, so we would never need to compare the offsets, turning the eager caching into pure overhead.
>
> I think for the sake of simplicity in the code it is not worth having the `mutable` cache at all. If you agree I can post a patch! If this doesn't seem convincing I can do a more rigorous comparison.
Oh wow, thanks for doing all of this! Indeed, I agree with you that the complexity is not worth it based on these numbers. I'd say we can start with the simplest implementation and work from there if this becomes a bottleneck.
By the way, we don't need to do this as part of the existing patch series, it seems reasonable to merge the refactors first!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158675/new/
https://reviews.llvm.org/D158675
More information about the llvm-commits
mailing list