[PATCH] D45438: [CodeView] Enable debugging of captured variables within C++ lambdas
David Blaikie via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Jan 11 17:51:45 PST 2022
dblaikie added inline comments.
================
Comment at: lib/CodeGen/CGDebugInfo.cpp:812-814
+ // CodeView types with C++ mangling need a type identifier.
+ if (CGM.getCodeGenOpts().EmitCodeView)
+ return true;
----------------
rnk wrote:
> rnk wrote:
> > dblaikie wrote:
> > > Just came across this code today - it's /probably/ a problem for LTO. LLVM IR linking depends on the identifier to determine if two structs are the same for linkage purposes - so if an identifier is added for a non-linkage (local/not externally visible) type, LLVM will consider them to be the same even though they're unrelated:
> > > ```
> > > namespace { struct t1 { int i; int j; }; }
> > > t1 v1;
> > > void *v3 = &v1;
> > > ```
> > > ```
> > > namespace { struct t1 { int i; }; }
> > > t1 v1;
> > > void *v2 = &v1;
> > > ```
> > > ```
> > > $ clang++-tot -emit-llvm -S a.cpp b.cpp -g -gcodeview && ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\""
> > > !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", scope: !9, file: !3, line: 1, size: 64, flags: DIFlagTypePassByValue, elements: !10, identifier: "_ZTSN12_GLOBAL__N_12t1E")
> > > $ clang++-tot -emit-llvm -S a.cpp b.cpp -g && ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\""
> > > !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", scope: !9, file: !3, line: 1, size: 64, flags: DIFlagTypePassByValue, elements: !10)
> > > !21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", scope: !9, file: !17, line: 1, size: 32, flags: DIFlagTypePassByValue, elements: !22)
> > > ```
> > >
> > > So in linking, now both `a.cpp` and `b.cpp` refer to a single `t1` type (in this case, it looks like the first one - the 64 bit wide one).
> > >
> > > If CodeView actually can't represent these two distinct types with the same name in the same object file, so be it? But this looks like it's likely to cause problems for consumers/users.
> > If you use the MSVC ABI, we will assign unique identifiers to these two structs:
> > ```
> > !9 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t1", scope: !10, file: !7, line: 1, size: 64, flags: DIFlagTypePassByValue, elements: !11, identifier: ".?AUt1@?A0xF964240C@@")
> > ```
> >
> > The `A0xF964240C` is set up here, and it is based on the source file path (the hash will only be unique when source file paths are unique across the build):
> > https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/MicrosoftMangle.cpp#L464
> > ```
> > // It's important to make the mangled names unique because, when CodeView
> > // debug info is in use, the debugger uses mangled type names to distinguish
> > // between otherwise identically named types in anonymous namespaces.
> > ```
> >
> > Maybe this should use a "is MSVC ABI" check instead, since that is what controls whether the identifier will be unique, and the unique-ness is what matters for LTO and PDBs.
> After thinking about it some more, see the comment I added here: rG59320fc9d78237a5f9fdd6a5030d6554bb6976ce
>
> ```
> // If the type is not externally visible, it should be unique to the current TU,
> // and should not need an identifier to participate in type deduplication.
> // However, when emitting CodeView, the format internally uses these
> // unique type name identifers for references between debug info. For example,
> // the method of a class in an anonymous namespace uses the identifer to refer
> // to its parent class. The Microsoft C++ ABI attempts to provide unique names
> // for such types, so when emitting CodeView, always use identifiers for C++
> // types. This may create problems when attempting to emit CodeView when the MS
> // C++ ABI is not in use.
> ```
>
> I think type identifiers are pretty crucial for CodeView functionality. The debugger won't be able to link a method to its class without them. Therefore, I think it's better to leave this as it is. The bad experience of duplicate type identifiers is better than the lack of functionality from not emitting identifiers at all for non-externally visible types.
Yeah, thanks for explaining/adding the comment - that about covers it.
Hmm, do these identifiers just need to only be unique within a single object file to provide the name referencing mechanics? Perhaps we could generate them later to ensure they're unique - rather than risk collapsing them while linking?
(at least here's an obscure case where they seem to collide: Compiling the same file with different `-D` values, I guess the hash is only for the filename, not for other properties that might impact the output - so this ends up with two types with the same mangled name even on MSVC ABI:
```
namespace { struct t1 { int i;
#ifdef SECOND
int j;
#endif
}; }
t1 v1;
void*
#ifdef SECOND
v2
#else
v3
#endif
= &v1;
```
```
clang++-tot -emit-llvm -S a.cpp -g -gcodeview -target x86_64-windows && clang++-tot -emit-llvm -S a.cpp -DSECOND -g -gcodeview -o b.ll -target x86_64-windows && ~/dev/llvm/build/default/bin/llvm-link -o ab.bc a.ll b.ll && ~/dev/llvm/build/default/bin/llvm-dis ab.bc && cat ab.ll | grep "\"t1\""
```
)
Repository:
rC Clang
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D45438/new/
https://reviews.llvm.org/D45438
More information about the cfe-commits
mailing list