[PATCH] D118586: [C++20][Modules][3/8] Initial handling for module partitions.
Chuanqi Xu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Feb 17 22:52:28 PST 2022
ChuanqiXu added inline comments.
================
Comment at: clang/lib/Sema/SemaModule.cpp:177
+ if (IsPartition) {
+ ModuleName += ":";
+ ModuleName += stringFromPath(Partition);
----------------
iains wrote:
> urnathan wrote:
> > iains wrote:
> > > ChuanqiXu wrote:
> > > > I chose '-' in my implementation since here ModuleName refers to the name in filesystem, right? And I remember '-' is the choice of GCC. So let's keep consistency.
> > > This is not my understanding.
> > >
> > > ModuleName is the "user-facing" name of the module that is described in the source, and we use this in diagnostics etc.
> > >
> > > The translation of that name to an on-disk name can be arbitrary and there are several mechanisms in clang already (e.g. -fmodule-file=A:B=foo.pcm) the module loader uses these to find the module on the basis of its user-facing name where required.
> > >
> > > When we have P1184, it is even more important that the interface is supplied with the name that the user will put into the source code.
> > >
> > >
> > I agree with Iain, we should use ':' is module names here. When mapping a named module to the file system some translation is needed because ':' is not permitted in file names on some filesystems (you know the ones!)
> (just to expand a little more)
>
> the on-disk name needs to be chosen suitably for the platform and by the user and/or the build system.
>
> When the FE chooses a default filename (which is done in building jobs, not in the Parser of course) it chooses one based on the source filename. It would be most unfortunate if the Parser/Sema needed to understand platform filename rules.
>
> When you do 'clang -module-file-info' (with the existing or updated version) you should see the module name as per the source code, the translation would only apply to the filename itself.
>
> - to answer a later comment:
>
> when we do -fmodule-file=A_B.pcm and A_B.pcm contains a module named A:B the loader notices this pairing when it pre-loads the module, so that when we ask for "A:B" the loader already knows which on-disk file contains. it.
>
> if we use the HeaderSearch mechanisms (when we want to figure out a module-name<=> filename pairing without loading the module) we can use -fmodule-name=A:B=A_B.pcm,
>
> These mechanisms work today, but P1184 is a more flexible mechanism and avoids having massive command lines with many -fmodule-file= cases.
>
But what if we need to import multiple modules? In our current workflow, we would like to compile importing module in following style:
```
clang++ -std=c++20 -fprebuilt-module-path=path1 -fprebuilt-module-path=path2 ... unit.cpp(m) ...
```
And unit.cpp(m) may look like:
```
export module M;
import :parta;
import :partb;
import :partc;
```
And our compiler would lookup `M-parta.pcm`, `M-partb.pcm` and `M-partc.pcm` in the path given in the command line. However, in current implementation, it would lookup `M:parta.pcm`, `M:partb.pcm` and `M:partc.pcm` in the path but it might fail. So my point here is that the current implementation doesn't work well with `fprebuilt-module-path`. And I don't think we should give up `fprebuilt-module-path` to support multiple `fmodule-file`. The `fprebuilt-module-path` is easier to understand and use for real projects. Even if we support multiple `-fmodule-file`, people might be frustrating to add many `fmodule-file` if they need to import many units.
So I really insist on this. Or at least let's add one option, something like `-fmodule-partition-separator` here.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D118586/new/
https://reviews.llvm.org/D118586
More information about the cfe-commits
mailing list