[PATCH] D131388: [docs] Add "C++20 Modules"

Thu Aug 11 14:00:46 PDT 2022

dblaikie added inline comments.

================
Comment at: clang/docs/CPlusPlus20Modules.rst:395-396
+
+Roughly, this theory is correct. But the problem is that it is too rough. Let's see what actually happens.
+For example, the behavior also depends on the optimization level, as we will illustrate below.
+
----------------
ChuanqiXu wrote:
> dblaikie wrote:
> > ChuanqiXu wrote:
> > > dblaikie wrote:
> > > > I'm not sure I'm able to follow the example and how it justifies the rough theory as inadequate to explain the motivation for modules - could you clarify more directly (in comments, and then we can discuss how to word it) what the motivation for this section is/what you're trying to convey?
> > > Let me answer the motivation first. The motivation comes from my personal experience. When most people hear about modules, they ask "how much speedup can we get?", along with questions like "why do modules speed up compilation?". So I guess readers of the document may have similar questions, and I try to answer them here.
> > > 
> > > The complexity theory is correct, but it may be too abstract for our users, since complexity theory is about scaling. For a given user, the size of their code is fixed for the time being. So when they try modules and find that the speedup at O2 doesn't meet their expectations, they may feel frustrated. It doesn't help if I say, "hey, you'll get a much better speedup if your code gets 10x longer." I guess they won't buy it. So what I try to do here is manage the user's expectations to avoid any misunderstanding.
> > > 
> > > What follows is the explanation. For example, suppose there are `1` module interface and `10` users. There is a function `F` in the module interface, and it is used by every user. Let's say it takes time `T` to compile the function `F`, and time `T` to compile each user without the function `F`.
> > > At O0, the function `F` is compiled completely once and goes through the Sema part 10 times. The Sema part is relatively fast; let's say it takes `0.1T` per user. Given that we compile them serially, we need `T + 10T + 10 * 0.1T = 12T` to compile the project.
> > > 
> > > But with optimizations enabled, the function `F` takes part in optimizations and IPO in every user, and these optimizations are the most time-consuming part. Let's say they cost `0.8T` per user. Then the time required is `T + 10T + 10 * 0.8T = 19T`. It is easy to see that we need `20T` to compile the project if we're using headers. So the speedup with optimization is much smaller.
> > > 
> > > BTW, if we write the required time with variables, it is `nT + mT + T*m*additional_compilation_part`, where `n` is the number of module interfaces and `m` is the number of users. The `additional_compilation_part` here corresponds to the fraction of time spent in `Sema` or `Optimizations`. Since `T` and `additional_compilation_part` are both constants, writing this in `O()` form gives `O(n+m)`.
> > > So the theory is still correct.
> > > 
> > > 
> > I think the message is getting a bit lost in the text (both in the proposed text, and the comment here).
> > 
> > "At -O0 implementations of non-inline functions defined in a module will not impact module users, but at higher optimization levels the definitions of such functions are provided to user compilations for the purposes of optimization (but definitions of these functions are still not included in the user's object file) - this means build speed at higher optimization levels may be lower than expected given -O0 experience, but does provide more optimization opportunities"
> > 
> Yes, it is hard to explain this clearly and briefly. Your suggested wording mentions `non-inline` functions; that is accurate but brings new information into this document. I'm worried that readers who don't know C++ well might not understand it.
> 
> I put the suggested wording in as the concluding paragraph of the section, and I hope it helps the reader focus on the intent of the section.
Maybe "non-inline" could be replaced by "module implementation details" (but "function bodies" sounds OK too)

I think the issue for me is that the current description seems to go into more detail about compiler implementation details than might be helpful for a document at this level. I was/am hoping maybe a one paragraph summary might be simpler/more approachable/sufficiently accurate for the audience.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131388/new/

https://reviews.llvm.org/D131388