<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 23.10.2020 00:22, David Blaikie
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAENS6EsusqiJoSY9WqLZJKLUQfFe2s6vYzrZHNDse98Dib78UA@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Sep 4, 2020 at 3:42
AM Alexey <<a href="mailto:avl.lapshin@gmail.com"
moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 03.09.2020 20:56, David Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Sep 3,
2020 at 5:15 AM Alexey <<a
href="mailto:avl.lapshin@gmail.com"
target="_blank" moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 03.09.2020 01:36, David Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed,
Sep 2, 2020 at 3:26 PM Alexey <<a
href="mailto:avl.lapshin@gmail.com"
target="_blank" moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 02.09.2020 21:44, David
Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Wed, Sep
2, 2020 at 9:56 AM Alexey <<a
href="mailto:avl.lapshin@gmail.com" target="_blank"
moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 01.09.2020 20:07,
David Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Fair enough
- thanks for clarifying
the differences! (I'd
still lean a bit towards
this being dwz-esque, as
you say "an extension of
classic dwz"</div>
</blockquote>
I have some doubts about "llvm-dwz", since it might confuse people who expect exactly the same behavior as dwz.<br>
But if we think of it as "an extension of classic dwz" and the possible confusion is not a big deal, then I would be fine with "llvm-dwz".<br>
<blockquote type="cite">
<div dir="ltr"> using a
bit more domain
knowledge (of
terminators and C++ odr
- though I'm not sure
dsymutil does rely on
the ODR, does it? It
relies on it to know
that two names represent
the same type, I
suppose, but doesn't
assume they're already
identical, instead it
merges their members))<br>
</div>
</blockquote>
<p>If dsymutil is able to find a full definition, it removes all other definitions that match it by name and redirects all references to the definition it found. If it cannot find a full definition, it does nothing: for example, two incomplete definitions (DW_AT_declaration (true)) with the same name are not merged. Teaching dsymutil to merge incomplete types is a possible improvement.<br>
</p>
</div>
</blockquote>
<div>Huh, what does it do with
extra member function
definitions found in later
definitions? (eg: struct x {
template<typename T>
void f(); }; - in one
translation unit
x::f<int> is
instantiated, in another
x::f<float> is
instantiated - how are the two
represented with dsymutil?) <br>
</div>
</div>
</div>
</blockquote>
<p>They would be treated as two types that do not match. dsymutil would not merge them, so it would not use a single type description. There would be two separate types named "x" whose members mostly match but which differ in x::f<int> and x::f<float>. There is no de-duplication in that case.</p>
</div>
</blockquote>
<div>Oh, that's unfortunate. It'd be nice
for C++ at least, to implement a
potentially faster dsymutil mode that
could get this right and not have to
actually check for type equivalence,
instead relying on the name of the type
to determine that it must be identical.<br>
</div>
</div>
</div>
</blockquote>
<p>Right. That would result in even more size
reduction.<br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div><br>
The first instance of the type that's
encountered has its fully qualified name
or mangled name recorded in a map
pointing to the DIE. Any future instance
gets downgraded to a declaration, and
/certain/ members get dropped, but other
members get stuck on the declaration
(same sort of DWARF you see with "struct
foo { virtual void f1();
template<typename T> void f2() { }
}; void test(foo& f) {
f.f2<int>(); }"). Recording all
the member functions of the type/static
member variable types might be needed in
cases where some member functions are
defined in one translation unit and some
defined in another - though I guess that
infrastructure is already in place/that
just works today.<br>
</div>
</div>
</div>
</blockquote>
My understanding is that there is no such infrastructure currently. The current infrastructure allows other units to reference a single existing (canonical) type declaration. It does not allow referencing different parts of an incomplete type that are spread across different units.<br>
</div>
</blockquote>
<div><br>
Huh, so what does the DWARF look like when you
define one member function in one file, and
another member function (common with inline
functions) in another file?<br>
</div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>I think implementing this kind of type merging would require changing the order in which compilation units are processed. </div>
</blockquote>
<div><br>
Oh, I wasn't suggesting merging them - or didn't
mean to suggest that. I meant doing something like
what we do in LLVM for type homed (no-standalone)
DWARF, where we attach function declarations to
type declarations, eg:<br>
<br>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">struct x {</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures"><span> </span>void
f1();</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures"><span> </span>void
f2();</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures"><span> </span>template<typename
T></span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures"><span> </span>static
void f3();</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">};</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">#ifdef HOME</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">void x::f1() {</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">}</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">#endif</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">#ifdef AWAY</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">void x::f2() {</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">}</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">#endif</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">#ifdef TEMPL</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">template<typename
T></span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">void x::f3() {</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">}</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">template void
x::f3<int>();</span></p>
<p
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
style="font-variant-ligatures:no-common-ligatures">#endif<br>
<br>
Building "HOME" would show the DWARF I'd
expect to see the first time a type definition
is encountered during dsym.<br>
Building "AWAY" raises the question of - what
does dsymutil do with this DWARF? Does it
deduplicate the type, and make the definition
of 'f2' point to the 'f2' declaration in the
original type described in the prior CU
defined in "HOME"? If it doesn't do that, it
could/that would be good to reduce the DWARF
size.<br>
Building "TEMPL" would show the DWARF I'd
expect to see if a future use of that type
definition was encountered but the
original/home definition had no declaration of
this function: we should then emit maybe an
"extension" to the type (could be a straight
declaration, or maybe some newer/weirder
hybrid that points to the definition with some
attribute) & then inject the declaration
of the template/other new member into this
extension definition, etc.<br>
</span></p>
</div>
</div>
</div>
</blockquote>
Please check the reduced DWARF generated by the current dsymutil for the example above:<br>
<br>
0x0000000b: DW_TAG_compile_unit<br>
DW_AT_language (DW_LANG_C_plus_plus)<br>
DW_AT_name ("home.cpp")<br>
DW_AT_stmt_list (0x00000000)<br>
DW_AT_low_pc (0x0000000100000f80)<br>
DW_AT_high_pc (0x0000000100000f8b)<br>
<br>
0x0000002a: DW_TAG_structure_type<br>
DW_AT_name ("x")<br>
DW_AT_byte_size (0x01)<br>
<br>
0x00000033: DW_TAG_subprogram<br>
DW_AT_linkage_name ("_ZN1x2f1Ev")<br>
DW_AT_name ("f1")<br>
DW_AT_type (0x000000000000005e "int")<br>
DW_AT_declaration (true)<br>
DW_AT_external (true)<br>
DW_AT_APPLE_optimized (true)<br>
<br>
0x00000047: NULL<br>
<br>
0x00000048: DW_TAG_subprogram<br>
DW_AT_linkage_name ("_ZN1x2f2Ev")<br>
DW_AT_name ("f2")<br>
DW_AT_type (0x000000000000005e "int")<br>
DW_AT_declaration (true)<br>
DW_AT_external (true)<br>
DW_AT_APPLE_optimized (true)<br>
<br>
0x0000005c: NULL<br>
0x0000005d: NULL<br>
<br>
0x0000006a: DW_TAG_subprogram<br>
DW_AT_low_pc (0x0000000100000f80)<br>
DW_AT_high_pc (0x0000000100000f8b)<br>
DW_AT_specification
(0x0000000000000033 "_ZN1x2f1Ev") <br>
<br>
<br>
0x000000a0: DW_TAG_compile_unit<br>
DW_AT_language (DW_LANG_C_plus_plus)<br>
DW_AT_name ("away.cpp")<br>
DW_AT_stmt_list (0x00000048)<br>
DW_AT_low_pc (0x0000000100000f90)<br>
DW_AT_high_pc (0x0000000100000f9b)<br>
<br>
0x000000c6: DW_TAG_subprogram<br>
DW_AT_low_pc (0x0000000100000f90)<br>
DW_AT_high_pc (0x0000000100000f9b)<br>
DW_AT_specification
(0x0000000000000048 "_ZN1x2f2Ev") <br>
<br>
0x000000fc: DW_TAG_compile_unit<br>
DW_AT_language (DW_LANG_C_plus_plus)<br>
DW_AT_name ("templ.cpp")<br>
DW_AT_stmt_list (0x00000090)<br>
DW_AT_low_pc (0x0000000100000fa0)<br>
DW_AT_high_pc (0x0000000100000fab)<br>
<br>
0x0000011b: DW_TAG_structure_type<br>
DW_AT_name ("x")<br>
DW_AT_byte_size (0x01)<br>
<br>
0x00000124: DW_TAG_subprogram<br>
DW_AT_linkage_name ("_ZN1x2f1Ev")<br>
DW_AT_name ("f1")<br>
DW_AT_type (0x0000000000000168 "int")<br>
DW_AT_declaration (true)<br>
DW_AT_external (true)<br>
DW_AT_APPLE_optimized (true)<br>
0x00000138: NULL<br>
<br>
0x00000139: DW_TAG_subprogram<br>
DW_AT_linkage_name ("_ZN1x2f2Ev")<br>
DW_AT_name ("f2")<br>
DW_AT_type (0x0000000000000168 "int")<br>
DW_AT_declaration (true)<br>
DW_AT_external (true)<br>
DW_AT_APPLE_optimized (true)<br>
0x0000014d: NULL<br>
<br>
0x0000014e: DW_TAG_subprogram<br>
DW_AT_linkage_name ("_ZN1x2f3IiEEiv")<br>
DW_AT_name ("f3<int>")<br>
DW_AT_type (0x0000000000000168 "int")<br>
DW_AT_declaration (true)<br>
DW_AT_external (true)<br>
DW_AT_APPLE_optimized (true)<br>
0x00000166: NULL<br>
0x00000167: NULL<br>
<br>
0x00000174: DW_TAG_subprogram<br>
DW_AT_low_pc (0x0000000100000fa0)<br>
DW_AT_high_pc (0x0000000100000fab)<br>
DW_AT_specification
(0x000000000000014e "_ZN1x2f3IiEEiv")<br>
0x00000190: NULL<br>
<br>
<br>
>Building "HOME" would show the DWARF I'd expect to see
the first time a type definition is encountered during
dsym.<br>
<br>
The compile unit "home.cpp" contains the type definition (0x0000002a) and a reference to its member (DW_AT_specification (0x0000000000000033 "_ZN1x2f1Ev")).<br>
<br>
>Building "AWAY" raises the question of - what does
dsymutil do with this DWARF? Does it deduplicate the type,
and make the definition of 'f2' point to the 'f2'
declaration in the original type described in the prior CU
defined in "HOME"? If it doesn't do that, it could/that
would be good to reduce the DWARF size.<br>
<br>
The compile unit "away.cpp" does not contain the type definition; instead it references the type definition in compile unit "home.cpp" (DW_AT_specification (0x0000000000000048 "_ZN1x2f2Ev")).<br>
That is, dsymutil deduplicates the type and makes the definition of 'f2' point to the 'f2' declaration in the original type described in the prior CU "home.cpp".<br>
<br>
>Building "TEMPL" would show the DWARF I'd expect to
see if a future use of that type definition was
encountered but the original/home definition had no
declaration of this function: we should then emit maybe an
"extension" to the type (could be a straight declaration,
or maybe some newer/weirder hybrid that points to the
definition with some attribute) & then inject the
declaration of the template/other new member into this
extension definition, etc.<br>
<br>
The compile unit "templ.cpp" contains the type definition (0x0000011b), which matches (0x0000002a) but additionally declares the new member 0x0000014e.<br>
It also references this new member via DW_AT_specification (0x000000000000014e "_ZN1x2f3IiEEiv"). In this case the type description is not de-duplicated.<br>
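<p>As a rough illustration of the three outcomes above, here is a toy C++ model. It is only a sketch of the observed result, not dsymutil's actual matching logic: the names TypeDesc and ToyDeduplicator are hypothetical, and treating "same name and same set of member linkage names" as the match criterion is an assumption made for this example.</p>
<pre>
```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, heavily simplified stand-in for a type DIE: a name plus the
// linkage names of its member declarations. Real DIEs carry far more.
struct TypeDesc {
    std::string name;
    std::set<std::string> members;
};

// Toy linker state: remembers the first definition seen for each type name.
class ToyDeduplicator {
public:
    // Returns true if `type` could be dropped from the current unit and
    // replaced by references to an earlier definition.
    // ASSUMPTION: "matches" means same name and same member set.
    bool tryDeduplicate(const TypeDesc &type) {
        assert(!type.name.empty() && "anonymous types are not modeled");
        auto result = canonical_.emplace(type.name, type);
        if (result.second)
            return false; // first definition of this name: keep it
        // Reuse the earlier definition only if the member sets are equal.
        return result.first->second.members == type.members;
    }

private:
    std::map<std::string, TypeDesc> canonical_;
};
```
</pre>
<p>Fed the three units in order, this model keeps "x" from home.cpp, reuses it for away.cpp, and keeps a second copy for templ.cpp because of the extra f3<int> declaration.</p>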
</div>
</blockquote>
<div><br>
</div>
<div>Ah, yeah - that seems like a missed opportunity -
duplicating the whole type DIE. LTO does this by making
monolithic types - merging all the members from different
definitions of the same type into one, but that's maybe too
expensive for dsymutil (might still be interesting to know
how much more expensive, etc). But I think the other way to
go would be to produce a declaration of the type, with the
relevant members - and let the DWARF consumer identify this
declaration as matching up with the earlier definition.
That's the sort of DWARF you get from the non-MachO default
-fno-standalone-debug anyway, so it's already pretty well
tested/supported (support in lldb's a bit younger/more
work-in-progress, admittedly). I wonder how much dsym size
there is that could be reduced by such an implementation.</div>
</div>
</div>
</blockquote>
<p>I see. Yes, that could be done, and I think it would result in a noticeable size reduction (I do not have exact numbers at the moment).</p>
<p>I am working on a multi-threaded DWARFLinker now, and its first version will process types exactly the same way as the current dsymutil.</p>
<p>The scheme above could be implemented as a next step, and it would give a better size reduction than the current state.</p>
<p>But I think an even better scheme is possible, one that would result in an even bigger size reduction and faster execution. It is similar to what you described above: "LTO does this by making monolithic types - merging all the members from different definitions of the same type into one".</p>
<p>DWARFLinker could create an additional artificial compile unit and put all merged types there, then patch all type references to point into this additional compile unit. Nothing would be duplicated in that case. The performance improvement would come from copying less DWARF and from the fact that type references could be updated while the DWARF is cloned (no additional pass is needed for that).<br>
</p>
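<p>To make the idea a bit more concrete, here is a minimal C++ sketch of such a merged-type table. It is not DWARFLinker code: MergedType, MergedTypeUnit and addOrMerge are hypothetical names, types are keyed by name only (an assumption), and a type is reduced to the set of its member linkage names.</p>
<pre>
```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical simplified type description: name plus member linkage names.
struct MergedType {
    std::string name;
    std::set<std::string> members;
};

// Toy model of the artificial compile unit: one merged entry per type name.
class MergedTypeUnit {
public:
    // Merges `type` into the unit and returns the index of the merged entry.
    // A caller would patch every reference to `type` to this index while
    // cloning, so no separate fix-up pass is needed in this model.
    std::size_t addOrMerge(const MergedType &type) {
        assert(!type.name.empty() && "anonymous types are not modeled");
        auto found = indexByName_.find(type.name);
        if (found == indexByName_.end()) {
            // First time this name is seen: create the single shared entry.
            types_.push_back(type);
            found = indexByName_.emplace(type.name, types_.size() - 1).first;
        } else {
            // Seen before: add only the members that are still missing.
            types_[found->second].members.insert(type.members.begin(),
                                                 type.members.end());
        }
        return found->second;
    }

    const MergedType &at(std::size_t index) const { return types_.at(index); }
    std::size_t size() const { return types_.size(); }

private:
    std::vector<MergedType> types_;
    std::map<std::string, std::size_t> indexByName_;
};
```
</pre>
<p>With the example from this thread, adding "x" from home.cpp and then "x" from templ.cpp would yield one entry holding f1, f2 and f3<int>, and both units would reference that same entry.</p>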
<p>Anyway, that might be the next step once the multi-threaded DWARFLinker is ready.<br>
</p>
<blockquote type="cite"
cite="mid:CAENS6EsusqiJoSY9WqLZJKLUQfFe2s6vYzrZHNDse98Dib78UA@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> <br>
Do you suggest that 0x0000011b should be transformed into something like this:<br>
<br>
0x000000fc: DW_TAG_compile_unit<br>
DW_AT_language (DW_LANG_C_plus_plus)<br>
DW_AT_name ("templ.cpp")<br>
DW_AT_stmt_list (0x00000090)<br>
DW_AT_low_pc (0x0000000100000fa0)<br>
DW_AT_high_pc (0x0000000100000fab)<br>
<br>
0x0000011b: DW_TAG_structure_type<br>
DW_AT_specification (0x0000002a "x")<br>
<br>
0x00000124: DW_TAG_subprogram<br>
DW_AT_linkage_name ("_ZN1x2f3IiEEiv")<br>
DW_AT_name ("f3<int>")<br>
DW_AT_type (0x000000000000005e "int")<br>
DW_AT_declaration (true)<br>
DW_AT_external (true)<br>
DW_AT_APPLE_optimized (true)<br>
0x00000138: NULL<br>
0x00000139: NULL<br>
<br>
0x00000140: DW_TAG_subprogram<br>
DW_AT_low_pc (0x0000000100000fa0)<br>
DW_AT_high_pc (0x0000000100000fab)<br>
DW_AT_specification
(0x0000000000000124 "_ZN1x2f3IiEEiv")<br>
0x00000155: NULL<br>
<br>
Did I get the idea correctly?<br>
</div>
</blockquote>
<div><br>
</div>
<div>Yep, more or less. It'd be "safer" if 11b didn't use DW_AT_specification to refer to 2a, but instead was only a completely independent declaration of "x" - that path is already well supported/tested (well, it's the work-in-progress stuff for lldb to support -fno-standalone-debug, but gdb's been consuming DWARF like this for years, and Clang and GCC have both produced DWARF like this for years: if the type is "homed" in another file, then Clang/GCC emit a declaration with just the members needed to define any member functions defined/inlined/referenced in this CU).<br>
<br>
But using DW_AT_specification, or maybe some other extension attribute, might make the consumer's task of finding that they're all the same type/pieces of the same type a bit easier (could do both - use an extension attribute to tie them up, and leave DW_AT_declaration/DW_AT_name here for consumers that don't understand the extension attribute).</div>
<div> </div>
</div>
</div>
</blockquote>
<p>Yes, I will try this solution.</p>
<p>Thank you, Alexey.<br>
</p>
<br>
</body>
</html>