[clang] [C++20] [Modules] Embed all source files for C++20 Modules (PR #102444)

Aaron Ballman via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 30 05:31:02 PDT 2024


AaronBallman wrote:

> > > > Was this discussed/reviewed/motivated? There are drawbacks to this approach outlined in #72383
> > > > @iains @jyknight @AaronBallman @Bigcheese
> > > 
> > > 
> > > The motivation is in #72383 and I comment in [#72383 (comment)](https://github.com/llvm/llvm-project/issues/72383#issuecomment-2275135890)
> > > This is not reviewed. I wait for several weeks but got no response. And I think it is good. So I choose to land it.
> > 
> > 
> > Thank you for the link! (FWIW, there's no problems with these changes having been landed; @ChuanqiXu9 is the code owner for modules and the PR was up for three weeks without discussion. This is just typical post-commit review feedback.)
> > There's a fair amount of discussion on that thread in opposition to this approach, and a comment thread on an issue is not really visible to many people. I think this warrants an RFC for a broader discussion, so I'd appreciate temporarily reverting this patch.
> 
> Thank you, Aaron! But I like to not revert this until someone give some complaints on that thread. Since the problem in the high level looks clear to me and feel like the discussion in that thread looks like people there have a pretty good understanding to the issue self.

Again, I don't think a comment thread on an issue has a wide enough audience. FWIW, I had *no idea* about that thread until I got pinged here.

> > Some thoughts for the RFC discussion:
> > 
> > * Given that this is not the default behavior for MSVC, should `clang-cl` behave differently than `clang`?
> 
> We can discuss this topic in the two aspects, a module specific one and a broader one not limited to modules (the general policy). For the module specific topic, since the BMI generated by clang and MSVC are not compatible, this change won't have any particular impact to the users of MSVC or `clang-cl`. Beyond the topic of BMI compatibility, this change will make the clang's behavior to match GCC and MSVC's behavior more. Like the OP said in the last paragraph of the 1st floor: #72383:
> 
> > AFAIK, neither GCC nor MSVC have this restriction for their BMIs.
> 
> The internal reason is that, in clang, the source locations are super fundamental and sensitive. So that, if we don't embed the source files, the BMI became invalid after we change the corresponding source file.

I don't see why that's an issue; if the source code changes, *not* changing the BMI is a slight quality-of-implementation improvement (e.g., when user changes comments, but the actual semantics remain the same), but that hardly seems worth the concerns I have below.

> And for the broader one, should users expect `clang-cl` to be a drop-in replacement for MSVC? There is a good post about this (https://www.reddit.com/r/cpp/comments/1e36n0w/c20_modules_with_clangcl_which_way_forward/). I feel this is pretty complex. Not only from the high level expectations (otherwise we'll always say yes) but also the implementations. From the perspective of modules, it looks too hard to implement the MSVC's BMI in clang unless MicroSoft would love to put more people to implement that to clang. But after all, I think this is a good topic to discuss to get more involved.

Yeah, that broader discussion is one we should have; and we should have the same conversation about GCC compatibility as we're 1) not compatible, 2) drifting farther apart, 3) still claim we're GCC 4.2, and 4) have never explicitly determined just how much compatibility we need/want as a community (it was an organic thing that grew out of the needs of the early developers of Clang)

> > * What are the impacts on other tooling (debugger, 3rd party static analysis, etc)?
> 
> There shouldn't be any bad impact on other tools. Since all the tools have to consume the BMI via the compiler (in the same version!). Since this patch simply loose the restriction, I don't expect this have any impact on 3rd party tools.

Good to know; but this is a small part of why I'd like an RFC -- tools vendors should be made aware so they can raise concerns if they have them.

> > * Is this the correct default when considering intellectual property security (will people expect their full, original source to be something that can be pulled from these artifacts)?
> 
> The consensus of SG15 for BMI is, the BMI is not good for distributed. Since the BMI format are not compatible across compilers (even with different versions!). So we would think BMI as a by-product of **a build process** only. Although there are thoughts about, we can distribute the BMI in a homogeneous environment where we have stable compiler versions. But such environments are closed ended environment for some specific projects or organizations. So I don't think it may affect any thing for the security as the design doesn't want us to ship the BMIs.

That's SG15. This change has serious implications for IP and I don't feel comfortable making that decision as quietly as you have. This is the primary reason why I'm insisting on an RFC. As a concrete example, large organizations often have contractors who utilize libraries written by the parent company. If the company produces a BMI for their framework (because they control the environment used by the contractor anyway, so this is safe and reasonable to do), they may accidentally ship their sensitive IP along with it.

So I continue to ask for this to be reverted and an RFC posted.

https://github.com/llvm/llvm-project/pull/102444


More information about the cfe-commits mailing list