<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 28.10.2020 20:38, David Blaikie
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAENS6EuOz-rVnp5ry=PCf9RUO3Rj-ZarU-jwrqGmBqj53Kjy=g@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Oct 28, 2020 at 6:01
AM Alexey Lapshin <<a href="mailto:avl.lapshin@gmail.com"
moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 28.10.2020 01:49, David Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Oct 27,
2020 at 12:34 PM Alexey Lapshin <<a
href="mailto:avl.lapshin@gmail.com"
target="_blank" moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 27.10.2020 20:32, David Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue,
Oct 27, 2020 at 1:23 AM Alexey Lapshin
<<a
href="mailto:avl.lapshin@gmail.com"
target="_blank" moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 26.10.2020 22:38, David
Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Sun, Oct
25, 2020 at 9:31 AM Alexey
Lapshin <<a
href="mailto:avl.lapshin@gmail.com"
target="_blank"
moz-do-not-send="true">avl.lapshin@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 23.10.2020 19:43,
David Blaikie wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
<blockquote
type="cite">
<div dir="ltr">
<div
class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div><br>
<br>
</div>
</blockquote>
<div><br>
</div>
<div>Ah, yeah
- that seems
like a missed
opportunity -
duplicating
the whole type
DIE. LTO does
this by making
monolithic
types -
merging all
the members
from different
definitions of
the same type
into one, but
that's maybe
too expensive
for dsymutil
(might still
be interesting
to know how
much more
expensive,
etc). But I
think the
other way to
go would be to
produce a
declaration of
the type, with
the relevant
members - and
let the DWARF
consumer
identify this
declaration as
matching up
with the
earlier
definition.
That's the
sort of DWARF
you get from
the non-MachO
default
-fno-standalone-debug
anyway, so
it's already
pretty well
tested/supported
(support in
lldb's a bit
younger/more
work-in-progress,
admittedly). I
wonder how
much dsym size
there is that
could be
reduced by
such an
implementation.</div>
</div>
</div>
</blockquote>
<p>I see. Yes, that
could be done and
I think it would
result in
noticeable size
reduction(I do not
know exact numbers
at the moment).</p>
<p>I work on
multi-thread
DWARFLinker now
and it`s first
version will do
exactly the same
type processing
like current
dsymutil.</p>
</blockquote>
<div>Yeah, best to
keep the behavior
the same through
that</div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Above scheme
could be
implemented as a
next step and it
would result in
better size
reduction(better
than current
state).</p>
<p>But I think the
better scheme
could be done
also and it
would result in
even bigger size
reduction and in
faster
execution. This
scheme is
something
similar to what
you`ve described
above: "LTO does
- making
monolithic types
- merging all
the members from
different
definitions of
the same type
into one".</p>
</div>
</blockquote>
<div>I believe the
reason that's
probably not been
done is that it
can't be streamed -
it'd lead to
buffering more of
the output </div>
</div>
</div>
</blockquote>
<p>yes. The fact that DWARF
should be streamed into
AsmPrinter complicates
parallel dwarf generation.
In my prototype, I
generate <br>
several resulting
files(each for one source
compilation unit) and then
sequentially glue them
into the final resulting
file.<br>
</p>
</div>
</blockquote>
<div>How does that help? Do you
use relocations in those
intermediate object files so
the DWARF in them can refer
across files? <br>
</div>
</div>
</div>
</blockquote>
<p>It does not help with referring
across the file. It helps to
parallel the generation of CU
bodies. <br>
It is not possible to write two CUs
in parallel into AsmPrinter. To make
possible parallel generation I
stream them into different
AsmPrinters(this comment is for "I
believe the reason that's probably
not been done is that it can't be
streamed". which initially was about
referring across the file, but it
seems I added another direction).<br>
</p>
</div>
</blockquote>
<div>Oh, I see - thanks for explaining,
essentially buffering on-disk. <br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> </p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> </p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div>(if two of these
expandable types
were in one CU - the
start of the second
type couldn't be
known until the end
because it might
keep getting pushed
later due to
expansion of the
first type) and/or
having to revisit
all the type
references (the
offset to the second
type wouldn't be
known until the end
- so writing the
offsets to refer to
the type would have
to be deferred until
then).<br>
</div>
</div>
</div>
</blockquote>
<p>That is the second
problem: offsets are not
known until the end of
file.<br>
dsymutil already has that
situation for inter-CU
references, so it has
extra pass to<br>
fixup offsets. </p>
</div>
</blockquote>
<div>Oh, it does? I figured it
was one-pass, and that it only
ever refers back to types in
previous CUs? So it doesn't
have to go back and do a
second pass. But I guess if
sees a declaration of T1 in
CU1, then later on sees a
definition of T1 in CU2, does
it somehow go back to CU1 and
remove the declaration/make
references refer to the
definition in CU2? I figured
it'd just leave the
declaration and references to
it as-is, then add the
definition and use that from
CU2 onwards? <br>
</div>
</div>
</div>
</blockquote>
<p>For the processing of the types, it
do not go back. <br>
This "I figured it was one-pass, and
that it only ever refers back to
types in previous CUs" <br>
and this "I figured it'd just leave
the declaration and references to it
as-is, then add the definition and
use that from CU2 onwards" are
correct. <br>
</p>
</div>
</blockquote>
<div>Great - thanks for
explaining/confirming! </div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> <br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>With multi-thread
implementation such
situation would arise more
often <br>
for type references and so
more offsets should be
fixed during additional
pass.<br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>DWARFLinker
could create
additional
artificial
compile unit and
put all merged
types there.
Later patch all
type references
to point into
this additional
compilation
unit. No any
bits would be
duplicated in
that case. The
performance
improvement
could be
achieved due to
less amount of
the copied DWARF
and due to the
fact that type
references could
be updated when
DWARF is
cloned(no need
in additional
pass for that).<br>
</p>
</div>
</blockquote>
<div>"later patch all
type references to
point into this
additional
compilation unit" -
that's the
additional pass that
people are probably
talking/concerned
about. Rewalking all
the DWARF. The
current dsymutil
approach, as far as
I know, is single
pass - it knows the
final, absolute
offset to the type
from the moment it
emits that
type/needs to refer
to it. <br>
</div>
</div>
</div>
</blockquote>
<p>Right. Current dsymutil
approach is single pass.
And from that point of
view, solution <br>
which you`ve described(to
produce a declaration of
the type, with the
relevant members) <br>
allows to keep that single
pass implementation.<br>
<br>
But there is a restriction
for current dsymutil
approach: To process
inter-CU references <br>
it needs to load all DWARF
into the memory(While it
analyzes which part of
DWARF is live, <br>
it needs to have all CUs
loaded into the memory).</p>
</div>
</blockquote>
<div>All DWARF for a single file
(which for dsymutil is mostly
a single CU, except with LTO I
guess?), not all DWARF for all
inputs in memory at once,
yeah? <br>
</div>
</div>
</div>
</blockquote>
<p>right. In dsymutil case - all DWARF
for a single file(not all DWARF for
all inputs in memory at once).<br>
But in llvm-dwarfutil case single
file contains DWARF for all original
input object files and it all
becomes<br>
loaded into memory.<br>
</p>
</div>
</blockquote>
<div>Yeha, would be great to try to go
CU-by-CU. </div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>That leads to huge memory
usage. <br>
It is less important when
source is a set of object
files(like in dsymutil
case) and this <br>
become a real problem for
llvm-dwarfutil utility
when source is a single
file(With current <br>
implementation it needs
30G of memory for
compiling clang binary).<br>
</p>
</div>
</blockquote>
<div>Yeah, that's where I think
you'd need a fixup pass one
way or another - because
cross-CU references can mean
that when you figure out a new
layout for CU5 (because it has
a duplicate type definition of
something in CU1) then you
might have to touch CU4 that
had an absolute/cross-CU
forward reference to CU5. Once
you've got such a fixup pass
(if dsymutil already has one?
Which, like I said, I'm
confused why it would have
one/that doesn't match my very
vague understanding) then I
think you could make dsymutil
work on a per-CU basis
streaming things out, then
fixing up a few offsets.<br>
</div>
</div>
</div>
</blockquote>
<p>When dsymutil deduplicates types it
changes local CU reference into
inter-CU reference(so that CU2(next)
could reference type definition from
CU1(prev)). To do this change it
does not need to do any fixups
currently.<br>
<br>
When dsymutil meets already
existed(located in the input object
file) inter-CU reference pointing
into the CU which has not been
processed yet(and then its offset is
unknown) it marks it as "forward
reference" and patches later during
additional pass "fixup forward
references" at a time when offsets
are known. <br>
</p>
</div>
</blockquote>
<div>OK, so limited 2 pass system. (does
it do that second pass once at the end
of the whole dsymutil run, or at the end
of each input file? (so if an input file
has two CUs and the first CU references
a type in the second CU - it could write
the first CU with a "forward reference",
then write the second CU, then fixup the
forward reference - and then go on to
the next file and its CUs - this could
improve performance by touching recently
used memory/disk pages only, rather than
going all the way back to the start
later on when those pages have become
cold)</div>
</div>
</div>
</blockquote>
<p>yes, It does it in the end of each input
file.</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> <br>
If CUs would be processed in
parallel their offsets would not be
known at the moment when local type
reference would be changed into
inter-CU reference. So we would need
to do the same fix-up processing for
all references to the types like we
already do for other inter-CU
references.<br>
</p>
</div>
</blockquote>
<div>Yeah - though the existence of this
second "fixup forward references" system
- yeah, could just use it much more
generally as you say. Not an extra pass,
just the existing second pass but having
way more fixups to fixup in that pass.</div>
</div>
</div>
</blockquote>
If we would be able to change the algorithm in
such way : <br>
<br>
1. analyse all CUs.<br>
2. clone all CUs.<br>
<br>
Then we could create a merged type
table(artificial CU containing types) during
step1. <br>
If that type table would be written first, then
all following CUs could use known offsets <br>
to the types and we would not need additional
fix-up processing for type references. <br>
It would still be necessary to fix-up other
inter-CU references. But it would not be
necessary <br>
to fix-up type references (which constitute the
vast majority).<br>
</div>
</blockquote>
<div><br>
</div>
<div>To me, that sounds more expensive than the
fixup forward references pass.</div>
</div>
</div>
</blockquote>
<p>If we would speak about direct comparison then yes
loading DWARF one more time looks more expensive than
fixup forward references pass. But if we would speak
about the general picture then it could probably be
beneficial:<br>
<br>
1. merging types would lead to a smaller size of
resulting DWARF. This would speed up the process.<br>
f.e. If we would switch "odr types deduplication" off
in current implementation then it would increase
execution time two times. That is because more DWARF
should be cloned and written in the result.
Implementation of "merging types" would probably have a
similar effect <br>
- It would speed-up the overall process. So from one
side additional step for loading DWARF would <br>
decrease performance but a smaller amount of
resulting data would increase performance.<br>
<br>
2. When types would be put in the first CU then we would
have a simple strategy for our liveness analysis
algorithm: just always keep the first CU in memory. This
allows us to speed up our liveness analysis step.<br>
<br>
Anyway, all the above is just an idea for future work.
Currently, I am going to implement multithread
processing for CUs loaded into memory and having the
same type of processing as it currently is(Which assumes
that "fixup forward references pass" started to do more
work by fixing types references).<br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> <br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> <br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Without loading all CU
into the memory it would
require two passes
solution. First to analyze
<br>
which part of DWARF
relates to live code and
then second pass to
generate the result. <br>
</p>
</div>
</blockquote>
<div>Not sure it'd require any
more second pass than a
"fixup" pass, which it sounds
like you're saying it already
has? <br>
</div>
</div>
</div>
</blockquote>
<p>It looks like it would need an
additional pass to process inter-CU
references(existed in incoming file)
if we do not want to load all CUs
into memory.<br>
</p>
</div>
</blockquote>
<div>Usually inter-CU references aren't
used, except in LTO - and in LTO all the
DWARF deduplication and function
discarding is already done by the IR
linker anyway. (ThinLTO is a bit
different, but really we'd be better off
teaching it the extra tricks anyway
(some can't be fixed in ThinLTO - like
emitting a "Home" definition of an
inline function, only to find out other
ThinLTO backend/shards managed to
optimize away all uses of the
function... so some cleanup may be
useful there)). It might be possible to
do a more dynamic/rolling cache - keep
only the CUs with unresolved cross-CU
references alive and only keep them
alive until their cross-CU references
are found/marked alive. This should make
things no worse than the traditional
dsymutil case - since cross-CU
references are only effective/generally
used within a single object file (it's
possible to create relocations for them
into other files - but I know LLVM
doesn't currently do this and I don't
think GCC does it) with multiple CUs
anyway - so at most you'd keep all the
CUs from a single original input file
alive together.<br>
</div>
</div>
</div>
</blockquote>
But, since it is a DWARF documented case the
tool should be ready for such case(when inter-CU
<br>
references are heavily used).</div>
</blockquote>
<div><br>
Sure - but by implementing a CU liveness window
like that (keeping CUs live only so long as they
need to be rather than an all-or-nothing approach)
only especially quirky inputs would hit the worst
case while the more normal inputs could perform
better.<br>
</div>
</div>
</div>
</blockquote>
<p>It is not clear what should be put in such CU liveness
window. If CU100 references CU1 - how could we know that
we need to put CU1 into CU liveness window before we
processed CU100?<br>
</p>
</div>
</blockquote>
<div>Fair point, not just forward references to worry about
but backward references too. I wonder how much savings there
is in the liveness analysis compared to "keep one copy of
everything, no matter whether it's live or not", then it can
be a pure forward progress situation. (with the quirk that
you might emit a declaration for an entity once, then a
definition for it later - alternatively if a declaration is
seen it could be skipped under the assumption that a
definition will follow (& use a forward ref fixup) - and
if none is found, splat some stub declarations into a
trailing CU at the end) <br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> </p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> Moreover, llvm-dwarfutil would be the tool
producing <br>
exactly such situation. The resulting
file(produced by llvm-dwarfutil) would contain a
lot of <br>
inter-CU references. Probably, there is no
practical reasons to apply llvm-dwarfutil to the
same <br>
file twice but it would be a good test for the
tool.<br>
</div>
</blockquote>
<div><br>
It'd be a good stress test, but not necessarily
something that would need to perform the best
because it wouldn't be a common use case.<br>
</div>
</div>
</div>
</blockquote>
<p>I agree that we should not slow down the DWARFLinker in
common cases only because we need to support the worst
cases.<br>
But we also need to implement a solution which works in
some acceptable manner for the worst case. </p>
</div>
</blockquote>
<div>I think that depends on "acceptable" - correct, yes.
Practical to run in reasonable time/memory? Not necessarily,
in my opinion. <br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>The current solution - loading everything in memory -
makes it hard to use in a non-dsymutil
scenario(llvm-dwarfutil).<br>
</p>
</div>
</blockquote>
<div>I agree it's worth exploring the non-dsymutil scenario,
as you are - I'm just saying we don't necessarily need to
support high usability (fast/low memory usage/etc)
llvm-dwarfutil on an already dwarfutil'd binary (but as
you've pointed out, the "window" is unknowable because of
backward references, so this whole subthread is perhaps
irrelevant).<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> </p>
<p>There could be several things which could be used to
decide whether we need to go on a light or heavy path:<br>
<br>
1. If the input contains only a single CU we do not need
to unload it from memory. Thus - we would not need to do
an extra DWARF loading pass.<br>
2. If abbreviations from the whole input file do not
contain inter-CU references then while doing liveness
analysis, we do not need to wait until other CUs are
processed.<br>
</p>
</div>
</blockquote>
<div>(2) Yeah, that /may/ be a good idea, cheap to test, etc.
Though I'd still wonder if a more general implementation
strategy could be found that would make it easier to get a
sliding scale of efficiency depending on how much inter-CU
references where were, not a "if there are none it's good,
if there are any it's bad or otherwise very different to
implement". <br>
</div>
</div>
</div>
</blockquote>
<p>I think, there is a scenario which would make it possible to
process CU once for not referenced CUs and handle inter-CU
references in a scalable way(even for dwarfutil`d binary):<br>
<br>
1. Implement a global type's table and types merging. This allows
us to have all types in the memory. <br>
Then, all inter-CU type references would point into that memory
type table. <br>
(we do not know which CU should be put into CU liveness window,
we also could not put all CUs into the memory, but we could put
all types into the memory).<br>
<br>
2. If there are not other inter-CU references then all CUs would
be handled by one pass.<br>
<br>
3. If there are other inter-CU references, then after all CU
processed by the first pass we would have a list of referenced
CUs. Then, we could delete already cloned data(for referenced CU)
and start the process again: <br>
load CU, mark liveness, clone data. This second pass would be
done for only referenced CUs. <br>
For not-complex, not closely coupled cases it would work
relatively fast.<br>
<br>
4. put memory type table into artificial CU. Update all type`s
references.<br>
</p>
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAENS6EuOz-rVnp5ry=PCf9RUO3Rj-ZarU-jwrqGmBqj53Kjy=g@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p> <br>
Then that scheme would be used for worst cases:<br>
<br>
1. for (CU : CU1...CU100) {<br>
load CU.<br>
analyse CU.<br>
unload CU.<br>
} <br>
2. for (CU : CU1...CU100) {<br>
load CU.<br>
clone CU.<br>
unload CU.<br>
} <br>
3. fixup forward references.<br>
<br>
and that scheme for light cases:<br>
<br>
1. for (CU : CU1...CU100) {<br>
load CU.<br>
analyse CU.<br>
clone CU.<br>
unload CU.<br>
}<br>
2. fixup forward references.<br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>Generally, I think we should not assume that
inter-CU references would be used in a limited
way.<br>
<br>
Anyway, if this scheme: <br>
<br>
1. analyse all CUs.<br>
2. clone all CUs.<br>
<p>would work slow then we would need to
continue with one-pass solution and not
support complex closely coupled inputs.<br>
</p>
</div>
</blockquote>
<div><br>
</div>
<div>yeah, certainly seeing the data/experiments
will be interesting, if you end up implementing
some different strategies, etc.<br>
<br>
I guess one possibility for parallel generation
could be something more like Microsoft's approach
with a central debug info server that compilers
communicate with - not that exact model, I mean,
but if you've got parallel threads generating
reduced DWARF into separate object files - they
could communicate with a single thread responsible
for type emission - the type emitter would be
given types from the separate threads and compute
their size, queue them up to be streamed out to
the type CU (& keep the source CU alive until
that work was done) - such a central type emitter
could quickly determine the size of the type to be
emitted and compute future type offsets (eg: if 5
types were in the queue, it could've figured out
the offset of those types already) to answer type
offset queries quickly and unblock the parallel
threads to continue emitting their CUs containing
type references.<br>
</div>
</div>
</div>
</blockquote>
<p>yes. Thank you. Would think about it.<br>
</p>
<p>Alexey.<br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div><br>
- Dave </div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>