[PATCH] D123837: [C++20] [Modules] Judge isInCurrentModule currently

Iain Sandoe via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 19 02:10:38 PDT 2022


iains added inline comments.


================
Comment at: clang/include/clang/Basic/Module.h:537-543
+  static StringRef getPrimaryModuleInterfaceName(StringRef Name) {
+    return Name.split(':').first;
+  }
+
   /// Get the primary module interface name from a partition.
   StringRef getPrimaryModuleInterfaceName() const {
+    return getPrimaryModuleInterfaceName(getTopLevelModuleName());
----------------
ChuanqiXu wrote:
> iains wrote:
> > I do not understand the purpose of this change, we already iterated over making the function more efficient.
> > 
> > For C++20 modules there cannot be more than one parent - so that the nested call to getTopLevelModule(), via getTopLevelModuleName() is redundant.
> > 
> > Is there some other case that needs special handling?
> > (If so then I think we should guard this at a higher level)
> > 
> The primary purpose of the change is that we need to get the primary module name for getLangOpts().CurrentModule. So we need a static member function which takes a StringRef.
> 
> Another point might be the correctness. For example, for the private module fragment, the current implementation would return the right result while the original wouldn't. (Although we didn't use private module fragment much...)
> 
> For the most case, I admit the current implementation would call more function a little more. But I think it is negligible.
(Maybe I am worrying too much - but it does seem that we have quite some complexity in an operation that we expect to repeat many times per TU).

----

> The primary purpose of the change is that we need to get the primary module name for getLangOpts().CurrentModule. So we need a static member function which takes a StringRef.

That is, of course,  fine (and we are stuck in the case that we are building a partition, with the fact that the primary module is not available).  It seems that there's not a lot we can do there. 

When we have module,  we could cache the result of the split, tho - so have a "PrimaryModuleName" entry in each module and populate it on the first use.  We are still stuck with the string comparison - but at least we could avoid repeating the split?
As I noted, the reason we did not bother to do this in the first use was that that use was only once per TU).

When we are building a consumer of a module, on the other hand, we *do* have the complete module import graph.

> Another point might be the correctness. For example, for the private module fragment, the current implementation would return the right result while the original wouldn't. (Although we didn't use private module fragment much...)

Of course, we must fix correctness issues ..
Since there is only GMF and PMF, perhaps we could reasonably treat those specially `if (isGlobalModule())` .. and/or `if (isPrivateModule())`?

> For the most case, I admit the current implementation would call more function a little more. But I think it is negligible.

It is hard to judge at the moment - perhaps we should carry on with what we have and then instrument some reasonable body of code?


================
Comment at: clang/lib/Sema/SemaLookup.cpp:1567-1568
+
+  return M->getPrimaryModuleInterfaceName() ==
+         Module::getPrimaryModuleInterfaceName(LangOpts.CurrentModule);
 }
----------------
ChuanqiXu wrote:
> iains wrote:
> > ... of course, "correctness before optimisation", but this test function seems like it would benefit from some refactoring as suggested below.
> > 
> > it seems a bit unfortunate that we are placing a string comparison in the "acceptable decl" lookup path, which might be made many times (the previous use of the string compare was in making the module decl - which only happens once in a TU).
> > 
> > * Sema has visibility of the import graph of modules (via DirectModuleImports and Exports).
> > * We also know what the module parse state is (FirstDecl, GMF etc)
> > * If were are not parsing a module (i.e. LangOpts.CurrentModule.empty()) the we could return false early?
> > * if ModuleScopes is empty we know we are not (yet) parsing a module
> > * Once we are parsing a module then we can test ModuleScopes.back() to find the current module 
> > 
> > there are only two users of isInCurrentModule() - so maybe we refactor could make more use of the higher level state information - and / or cache information on the parents of partitions  to  avoid the complexity of parsing strings each time.
> > 
> > what do you think?
> Yeah, string comparison is relatively expensive and we should avoid that. I added a search cache here and I thought it should handle the most cases.
> 
> For the solution you described, I am not sure if I understood correctly. It looks like a quick return path if we found we are not parsing a module? If yes, I think the current check could do similar works. 
That looks good.

On the actual test itself:

1/
we have the check of  `(M->isGlobalModule() && !M->Parent)`

So that answers "this decl is part of the global module", since that fires when we are parsing the GMF.  What will we answer,
 if `M->isGlobalModule() && M->Parent` ?
 it seems that with the changes above we will now answer that the decl is part of the Parent - not part of the GM?

2/
When we are actually  parsing the PMF we have a similar case, no?
 (so that there is no Parent pointer set yet, and I think that means that we would need to examine ModuleScopes to find the Parent module?)




================
Comment at: clang/test/Modules/cxx20-10-1-ex2.cpp:62-69
+// According to [module.import]p7:
+//   Additionally, when a module-import-declaration in a module unit of some
+//   module M imports another module unit U of M, it also imports all 
+//   translation units imported by non-exported module-import-declarations in
+//   the module unit purview of U.
+//
+// So B is imported in B:X3 due to B:X2 imported B. So n is visible here.
----------------
ChuanqiXu wrote:
> iains wrote:
> > @rsmith 
> > 
> > so, I wonder if this is intended, it seems to defeat the mechanisms to avoid circular dependencies.
> > 
> > TU1:
> > module M : Impl; // this is said not to generate a circular dependency because M : Impl does not implicitly import M
> > 
> > import M;  // but now we've done it manually ... and if that module is implicitly exported to importers of the Impl. module interface....
> > ....
> > 
> > TU2:
> > 
> > export module M;
> > import : Impl; 
> > 
> > // ...  then boom! we have a circular dep.
> > 
> > 
> @rsmith should be the right one to answer this.
> 
> Personally, I feel the case you described should be avoided. And I found it is hard to right cyclic module dependent code in current fashion. Since when I compile TU1, the compiler suggests that it failed to found M. And when I compile TU2, the compiler suggests that it failed to found M:impl... So it looks like we could avoid worrying this now. Maybe the build system could give better diagnostic message in the future. I think this might not be a big problem.
Right, one cannot actually build the faulty module (actually, in the same way as one cannot build example 3 from section 10.3 of the std).

So probably my concern is unfounded - it's just something the user will have to figure out - we are not required to diagnose the situation (and that would be hard to do for the compiler anyway).



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123837/new/

https://reviews.llvm.org/D123837



More information about the cfe-commits mailing list