[cfe-dev] Make command line support for C++20 module uniform with GCC

chuanqi.xcq via cfe-dev cfe-dev at lists.llvm.org
Thu Oct 28 20:45:54 PDT 2021


    The discussion is really helpful!

   I think it may be feasible to support two strategies in Clang:
(1) Produce .pcm and .o in a single compilation.
(2) Produce .pcm and .o separately.
   At least the first choice would be friendly to beginners. From my point of view, no matter what the conclusion of SG15 is, it would be better to be able to compile a hello world example in one line.

So the needs for C++20 modules that I have summarized from the thread now include:
(a) Offer a legitimate driver option to users instead of the `-Xclang` option.
(b) Offer a strategy to produce .pcm and .o in a single compilation.
(c) Make the compilation results independent of the order of the input files on the command line.
(d) Reduce the size of the .pcm so that it contains only what an importer needs.

(a) should be the easiest, so we don't need to discuss it further. (b), (c), and (d) need further discussion about whether and how we should do them.
If everyone agrees on this list, I would like to open 4 issues in Bugzilla. I think that would be a better place to track them.

Thanks,
Chuanqi


------------------------------------------------------------------
From:David Blaikie <dblaikie at gmail.com>
Send Time: October 29, 2021 (Friday) 09:06
To:Richard Smith <richard at metafoo.co.uk>
Cc:chuanqi.xcq <yedeng.yd at linux.alibaba.com>; cfe-dev <cfe-dev at lists.llvm.org>
Subject:Re: [cfe-dev] Make command line support for C++20 module uniform with GCC

On Thu, Oct 28, 2021 at 5:51 PM Richard Smith <richard at metafoo.co.uk> wrote:
On Thu, 28 Oct 2021 at 17:06, David Blaikie <dblaikie at gmail.com> wrote:
On Thu, Oct 28, 2021 at 4:58 PM Richard Smith via cfe-dev <cfe-dev at lists.llvm.org> wrote:
On Mon, 25 Oct 2021 at 01:57, chuanqi.xcq via cfe-dev <cfe-dev at lists.llvm.org> wrote:
Hi all,

   Recently I have been playing with C++20 modules, and I found that the command line support of GCC
is much better than Clang's. Here is an example:

```C++
// say_hello.cpp
module;
#include <iostream>
#include <string_view>
export module Hello;
export void SayHello
  (std::string_view const &name)
{
  std::cout << "Hello " << name << "!\n";
}

// main.cpp
#include <string_view>
import Hello;
int main() {
  SayHello("world");
  return 0;
}
```

To compile the example, in gcc we need:
```
g++ -std=c++20 -fmodules-ts say_hello.cpp main.cpp 
```

And in clang, we need:
```
clang++ -std=c++20 -fmodules-ts -Xclang -emit-module-interface -c say_hello.cpp -o Hello.pcm
clang++ -std=c++20 -fmodules-ts -fprebuilt-module-path=. main.cpp say_hello.cpp
```

Your point is well-taken. However, some part of the extra work required here is that you're not doing things in the expected way.

The above is not a correct way to enable C++20 modules in Clang: -fmodules-ts enables the old Modules TS mode, not C++20 modules. -std=c++20 is enough to enable C++20 modules.

For the '-Xclang -emit-module-interface' portion, what Clang expects is that files that define module interfaces are either named .cppm or are specified with -x c++-module. With that file type, you can use --precompile to produce a .pcm file (just like you'd use -E or -c to produce other kinds of output). For example:

clang++ -std=c++20 say_hello.cppm --precompile -o Hello.pcm

The above commands are also parsing say_hello.cpp twice. You can avoid that by using the precompiled form, Hello.pcm, as a compilation input instead:

clang++ -std=c++20 -fprebuilt-module-path=. Hello.pcm main.cpp

However, this is all based on a model where the PCM file contains a complete description of the input .cppm file, which is not a great model for us to use moving forward due to all the extra stuff ending up in the .pcm file. Currently, Clang lacks two important features here:

1) Produce a .pcm file and a .o file from a single compilation action.
2) Produce a .pcm file that contains only the information needed for an importer, not a complete description of the input.

Ah, that's good to know - didn't know you were inclined/supportive of this direction (as the only way to build a module - or some mode that'd do it as two-step too?) - one of the previous counterarguments was that producing the .pcm without the .o unblocked consumers sooner/let the .o generation be done in parallel with those consumers. Is that generally known/considered to be too small of a benefit to be worth the build/support complexity compared to the minimal-pcm+.o in-one-go mode & its benefits (smaller .pcms)?

I think it's likely there'll be reasonable build strategies that want to build a minimal PCM and a .o file with two separate actions (to maximize throughput in highly parallel builds), and there'll be reasonable build strategies that want to build them as part of the same action (to minimize total time in a build with less parallelism). I expect people will want both options to be available. The option that we currently provide -- producing a PCM file that can be used as an input to both .o generation and for import -- is probably not well aligned with what most build strategies will want.

Reckon there just aren't enough savings in reusing the PCM for .o generation compared to parsing from scratch? Not enough to justify adding an extra intermediate file (a full pcm that gets consumed for .o generation and a slim pcm that gets consumed by uses). That we'll move away from complete pcms entirely to only minimal pcms? Fair enough. Good to know/think about.

We will of course need some command-line support for those features, and being compatible with GCC (which already provides these features) would likely make sense.

As for building and using modules in a single clang command, I agree that'd be nice to have, both for convenience and for GCC compatibility. But ideally this shouldn't depend on what order the files are specified in on the command line, which would require some kind of pre-scanning to find which modules are defined in which files so they can be processed in topological order. (Otherwise, specifying the files in the wrong order would presumably result in stale .pcm files getting used, which would seem quite user-hostile. I don't know if that's what you get from GCC or if it does better somehow.) That kind of prescan might be more complexity than we'd want in the compiler driver, though we can discuss that and figure out where we want to draw that line.
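
For illustration only, the prescan and ordering described above might look roughly like the C++ sketch below. This is not how Clang or GCC implement it. The names `ScannedUnit`, `scanSource` and `compileOrder` are hypothetical, and the line-based regex matching is an assumption that skips preprocessing, comments, partitions, header units and module implementation units.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <queue>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what a prescan learned about one input file.
struct ScannedUnit {
  std::string file;                  // source file name
  std::string provides;              // module it defines ("" if none)
  std::vector<std::string> imports;  // named modules it imports
};

// Naive prescan: recognizes `export module X;` and `import X;` at the start
// of a line. A real scanner would have to preprocess the file first.
ScannedUnit scanSource(const std::string &file, const std::string &text) {
  static const std::regex exportRe(
      R"(^\s*export\s+module\s+([A-Za-z_][\w.]*)\s*;)");
  static const std::regex importRe(
      R"(^\s*(?:export\s+)?import\s+([A-Za-z_][\w.]*)\s*;)");
  ScannedUnit unit{file, "", {}};
  std::istringstream in(text);
  std::string line;
  std::smatch m;
  while (std::getline(in, line)) {
    if (std::regex_search(line, m, exportRe))
      unit.provides = m[1].str();
    else if (std::regex_search(line, m, importRe))
      unit.imports.push_back(m[1].str());
  }
  return unit;
}

// Kahn's algorithm: returns the files ordered so that every module interface
// precedes its importers, or std::nullopt if the imports form a cycle.
std::optional<std::vector<std::string>>
compileOrder(const std::vector<ScannedUnit> &units) {
  std::map<std::string, std::size_t> provider;  // module name -> unit index
  for (std::size_t i = 0; i < units.size(); ++i)
    if (!units[i].provides.empty())
      provider[units[i].provides] = i;

  std::vector<std::vector<std::size_t>> dependents(units.size());
  std::vector<std::size_t> pending(units.size(), 0);
  for (std::size_t i = 0; i < units.size(); ++i) {
    for (const std::string &name : units[i].imports) {
      auto it = provider.find(name);
      if (it == provider.end())
        continue;  // not on this command line; assumed to be prebuilt
      dependents[it->second].push_back(i);
      ++pending[i];
    }
  }

  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < units.size(); ++i)
    if (pending[i] == 0)
      ready.push(i);

  std::vector<std::string> order;
  while (!ready.empty()) {
    std::size_t i = ready.front();
    ready.pop();
    order.push_back(units[i].file);
    for (std::size_t d : dependents[i])
      if (--pending[d] == 0)
        ready.push(d);
  }
  assert(order.size() <= units.size());
  if (order.size() != units.size())
    return std::nullopt;  // some units never became ready: import cycle
  return order;
}
```

With the two files from the example, passing main.cpp before say_hello.cpp would still yield say_hello.cpp first, because main.cpp imports the module that say_hello.cpp defines. Whether this belongs in the driver or in the build system is exactly the open question.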

In any case, I'm hoping we get some clear guidance from SG15 that we can follow.

Yeah, in clang we need another command to emit the module interface explicitly and another option
to specify the prebuilt module path. In GCC, this happens by default: when GCC finds it is compiling
a C++20 module, it generates the module interface automatically at the path:
```
gcm.cache/filename.gcm
```
It creates `gcm.cache` if it doesn't exist.

GCC also searches for prebuilt module interfaces in `gcm.cache` automatically.
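
To make the behavior concrete, here is a rough C++ sketch of a default cache directory that is created on demand and searched automatically. It only mimics the observable behavior described above and is not taken from GCC or Clang. The helpers `emitInterface` and `findPrebuilt` and the parameter `stem` (whatever name the compiler uses for the cached file) are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical: write an interface artifact into the cache directory,
// creating the directory first if it does not exist yet.
fs::path emitInterface(const fs::path &cacheDir, const std::string &stem,
                       const std::string &contents) {
  assert(!stem.empty());
  fs::create_directories(cacheDir);  // no-op when the directory exists
  fs::path out = cacheDir / (stem + ".gcm");
  std::ofstream file(out, std::ios::binary);
  file << contents;
  return out;
}

// Hypothetical: look for a prebuilt interface in the default cache
// directory, so that no explicit search path option is needed.
std::optional<fs::path> findPrebuilt(const fs::path &cacheDir,
                                     const std::string &stem) {
  fs::path candidate = cacheDir / (stem + ".gcm");
  if (fs::is_regular_file(candidate))
    return candidate;
  return std::nullopt;
}
```

The point of the sketch is that both the producer and the consumer agree on one default location, which is what removes the extra option from the command line.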

It looks much more friendly to me. The intention of this mail is to ask whether you think it is the right direction
to make Clang's command line support for C++20 modules more like GCC's. The differences I see now include:
- Generate the prebuilt module interface automatically (and generate it into a specific directory automatically).
- Have a default value for the prebuilt module path.

I am wondering if anyone more familiar with Clang's command line and file system handling would like to
support this (I am not so familiar with it). Although it may take me more time, I would be glad to work on it if others are busy.

Thanks,
Chuanqi
_______________________________________________
 cfe-dev mailing list
cfe-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
