[PATCH] D135463: [FuncSpec] Do not generate multiple copies for identical specializations.

Alexandros Lamprineas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 17 05:47:59 PST 2022


labrinea added inline comments.


================
Comment at: llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h:82
 
-using CallSpecBinding = std::pair<CallBase *, SpecializationInfo>;
-// We are using MapVector because it guarantees deterministic iteration
-// order across executions.
-using SpecializationMap = SmallMapVector<CallBase *, SpecializationInfo, 8>;
+using UniqueSpecSet = std::set<SpecInfo, std::not_equal_to<SpecInfo>>;
 
----------------
chill wrote:
> This has to be a `DenseSet`. `std::set` has logarithmic complexity of operations.
DenseSet is more expensive to iterate on. I have another patch which sorts the specializations globally (across module, not per function) where I keep the specializations sorted in a std::multiset and I use std::merge to unite the two sets. That's relatively cheap as it doesn't perform copies of objects but shuffles pointers around. Also that multiset is iterated a lot more than the set I am adding here. The compexity is logarithmic to the number of callsites, still not a huge number in my opinion.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:364
     // with constant arguments.
-    bool Added = false;
+    SpecInfo Spec{{}, 0 - Cost, /*Clone=*/nullptr};
     for (Argument *A : Args) {
----------------
chill wrote:
> Here evaluation of the gain occurs even if the specialization is a duplicate.
> I  would expect `getSpecializationBonus`  to take non-negligible time.
This is already happening, but good point there's room for improvement.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:441
 Function *
 FunctionSpecializer::createSpecialization(Function *F,
+                                          SpecInfo &Specialization) {
----------------
chill wrote:
> (nit) This function return `Close` twice, once as a return value and once as an out argument. A cleaner design would be to just return the value as the function does not depend on the `Clone` member variable and need not know anything about it.
I cached the Clone pointer to the SpecInfo for the parent revision D126455, so that updateCallSites() does not rely on two vectors (clones, specialziations) being in sync. So you are suggesting to not set the member variable here, but after returning from this function?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135463/new/

https://reviews.llvm.org/D135463



More information about the llvm-commits mailing list