[PATCH] D128750: [C++20] Implement P2113R0: Changes to the Partial Ordering of Constrained Functions

Mon Oct 17 12:12:18 PDT 2022

ychen added inline comments.

================
Comment at: clang/lib/Sema/SemaTemplateDeduction.cpp:5511-5539
+  template <typename T1, typename T2,
+            std::enable_if_t<std::is_same<T1, T2>::value, bool> = true>
+  bool operator()(T1 *PS1, T2 *PS2) {
+    return hasEqualTemplateArgumentList(
+        PS1->getTemplateArgsAsWritten()->arguments(),
+        PS2->getTemplateArgsAsWritten()->arguments());
+  }
----------------
mizvekov wrote:
> mizvekov wrote:
> > mizvekov wrote:
> > > mizvekov wrote:
> > > > ychen wrote:
> > > > > mizvekov wrote:
> > > > > > I think you are not supposed to use the `TemplateArgsAsWritten` here.
> > > > > > 
> > > > > > The injected arguments are 'Converted' arguments, and the transformation above, by unpacking the arguments, is reversing just a tiny part of the conversion process.
> > > > > > 
> > > > > > It's not very meaningful to canonicalize the arguments as written to perform a semantic comparison, as that works well just for some kinds of template arguments, like types and templates, but not for other kinds in which the conversion process is not trivial.
> > > > > > 
> > > > > > For example, I think this may fail to compare the same integers written in different ways, like `2` vs `1 + 1`.
> > > > > Indeed. It can happen only when comparing one partial specialization with another. I think the standard does not require an implementation to deal with this but we could use the best effort without much overhead. For `2` vs `1+1` or similar template arguments that are not dependent, we could assume the equivalence because they wouldn't be in the partial ordering stage if they're not equivalent. For more complicated cases like `J+2` vs `J+1+1` where J is NTTP, let's stop trying (match GCC) because the overhead is a little bit high. 
> > > > But I think the 'TemplateArgs', which are the specialization arguments and are available through `getTemplateArgs()`, are the converted arguments you want here, ie the AsWritten arguments converted against the template.
> > > > 
> > > > I don't see why you can't just use that.
> > > > 
> > > > How about we change:
> > > > ```
> > > >   if (!TemplateArgumentListAreEqual(S.getASTContext())(P1, P2))
> > > >     return nullptr;
> > > > ```
> > > > 
> > > > Into:
> > > > 
> > > > ```
> > > >   {
> > > >     ArrayRef<TemplateArgument> Args1 = P1->getTemplateArgs().asArray(), Args2;
> > > >     if constexpr (IsMoreSpecialThanPrimaryCheck)
> > > >       Args2 = P2->getInjectedTemplateArgs();
> > > >     else
> > > >       Args2 = P2->getTemplateArgs().asArray();
> > > > 
> > > >     if (Args1.size() != Args2.size())
> > > >       return nullptr;
> > > > 
> > > >     for (unsigned I = 0, E = Args1.size(); I < E; ++I) {
> > > >       TemplateArgument Arg2 = Args2[I];
> > > >       // Unlike the specialization arguments, the injected arguments are not
> > > >       // always canonical.
> > > >       if constexpr (IsMoreSpecialThanPrimaryCheck)
> > > >         Arg2 = S.Context.getCanonicalTemplateArgument(Arg2);
> > > > 
> > > >       // We use profile, instead of structural comparison of the arguments,
> > > >       // because canonicalization can't do the right thing for dependent
> > > >       // expressions.
> > > >       llvm::FoldingSetNodeID IDA, IDB;
> > > >       Args1[I].Profile(IDA, S.Context);
> > > >       Arg2.Profile(IDB, S.Context);
> > > >       if (IDA != IDB)
> > > >         return nullptr;
> > > >     }
> > > >   }
> > > > ```
> > > > 
> > > > That should work, right?
> > > Actually, you can even further simplify this.
> > > 
> > > You can't have two different specializations with the same specialization arguments. These arguments are used as the key to unique them anyway.
> > > 
> > > So simplify my above suggestion to:
> > > ```
> > >   if constexpr (IsMoreSpecialThanPrimaryCheck) {
> > >     ArrayRef<TemplateArgument> Args1 = P1->getTemplateArgs().asArray(),
> > >                                Args2 = P2->getInjectedTemplateArgs();
> > >     if (Args1.size() != Args2.size())
> > >       return nullptr;
> > > 
> > >     for (unsigned I = 0, E = Args1.size(); I < E; ++I) {
> > >       // We use profile, instead of structural comparison of the arguments,
> > >       // because canonicalization can't do the right thing for dependent
> > >       // expressions.
> > >       llvm::FoldingSetNodeID IDA, IDB;
> > >       Args1[I].Profile(IDA, S.Context);
> > >       // Unlike the specialization arguments, the injected arguments are not
> > >       // always canonical.
> > >       S.Context.getCanonicalTemplateArgument(Args2[I]).Profile(IDB, S.Context);
> > >       if (IDA != IDB)
> > >         return nullptr;
> > >     }
> > >   }
> > > ```
> > Yet another improvement here is that you can do this check once, when we create the specialization, and then simply store a `hasSameArgumentsAsPrimaryInjected` flag.
> > 
> > When we create the specialization, we already profile the specialization arguments anyway, so it's another computation you can avoid repeating.
> > 
> > Could even avoid repeating the profiling of the injected arguments if there was justified runtime overhead there, which I doubt, trading that off with increased memory use.
> It doesn't even need to be a flag to store in each SpecializationDecl, because there can be only one specialization with the same arguments as the primary injected arguments, for obvious reasons :D
> 
> So probably just a pointer in the TemplateDecl...
'TemplateArgs' is indeed what I needed, I don't know how I missed that. There could be multiple partial specializations with the same arguments as the primary injected arguments and they differ by constraints (https://godbolt.org/z/onME5ehEE). I'm unsure if adding a `hasSameArgumentsAsPrimaryInjected` flag would save much since the `IsMoreSpecialThanPrimaryCheck` is only performed once. We could avoid maybe one profiling by introducing the `hasSameArgumentsAsPrimaryInjected` flag though. On the other hand, it doesn't make the implementation much harder either. 

I did some compile-time measurements, and the results seem neutral https://llvm-compile-time-tracker.com/index.php?config=NewPM-O0-g&stat=instructions&remote=yuanfang-chen.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128750/new/

https://reviews.llvm.org/D128750