[PATCH] D69322: [hip][cuda] Enable extended lambda support on Windows.

John McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 1 00:30:19 PST 2021


rjmccall added a comment.

I agree that we should aim to avoid broad memory impact for uncommon features, and this seems to apply.



================
Comment at: clang/include/clang/AST/DeclCXX.h:395-400
     /// The number used to indicate this lambda expression for name
     /// mangling in the Itanium C++ ABI.
     unsigned ManglingNumber : 31;
 
+    /// The device side mangling number.
+    unsigned DeviceManglingNumber = 0;
----------------
rnk wrote:
> hliao wrote:
> > rnk wrote:
> > > It seems a shame to grow LambdaDefinitionData by a pointer for all users of C++ that do not use CUDA. Optimizing bitfields may be worth the time, but I'll leave it to @rjmccall or @rsmith to give guidance on whether that's worth it.
> > > 
> > > An alternative would be to store the device numbers in the mangling context and look them up when needed, since they are so rarely needed.
> > I like the alternative way by storing all numbering into the mangle/numbering context instead of AST itself. But, it needs additional numbering post-processing after AST importing. Sound to me a major refactoring work likely to be addressed later.
> Generally, I don't think we can count on contributors to come back later and optimize memory usage, so it seems reasonable to ask to avoid the regression in the first place. Above I see the bitfield usage optimizing memory usage of the number of captures, and then here we spent lots of memory storing device mangling numbers that are only used for CUDA. I think the memory usage concern still stands. I don't think it's unreasonable to maintain these numbers on the side in a DenseMap.
> 
> @rsmith or @rjmccall, do you agree with that reasoning?
I think that's the right idea in general.  I don't know how much it really matters for LambdaData, because there aren't a huge number of lambdas in typical translation units.  But if you want to optimize this, a hashtable in the `ASTContext` or `ManglingContext` seems reasonable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69322/new/

https://reviews.llvm.org/D69322



More information about the cfe-commits mailing list