[llvm-dev] [RFC] Making .eh_frame more linker-friendly

Wed Oct 25 18:42:10 PDT 2017

Hi,

Many linkers, including lld, can eliminate unused sections from the output
to make it smaller (this is essentially a mark-sweep gc where sections are
vertices and relocations are edges). lld and GNU gold have yet another
feature, ICF (identical code folding), which merges functions with identical
contents to save more space.
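
To make the gc model concrete, here is a minimal sketch in Python of the
mark phase over a toy section graph. The section names and the `relocs`
mapping are made up for illustration; this is not how lld represents
sections internally.

```python
from collections import deque

def mark_live(relocs, roots):
    """Return the set of sections reachable from the root sections.

    relocs maps a section name to the names of the sections that its
    relocations point to (a toy model, assumed for this example).
    """
    live = set(roots)
    work = deque(roots)
    while work:
        sec = work.popleft()
        for target in relocs.get(sec, ()):
            if target not in live:
                live.add(target)
                work.append(target)
    return live

# Hypothetical input: main calls "used"; nothing refers to "unused".
relocs = {
    ".text.main": [".text.used", ".rodata.str"],
    ".text.used": [],
    ".text.unused": [".rodata.str"],
}
live = mark_live(relocs, [".text.main"])
# The sweep phase drops every section that was never marked.
dead = sorted(set(relocs) - live)
```

In this example only `.text.unused` ends up in `dead`.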

When we remove or merge a function, we want to eliminate its exception
handling information as well. But that isn't easy to do because of the
format of .eh_frame. Here are the reasons:

1. Linkers have to parse .eh_frame, split it, eliminate the exception
handling information for dead functions, and then reconstruct an .eh_frame
section. It is tedious, and it doesn't feel like a task that linkers should
have to do (linkers usually handle sections as opaque blobs and are agnostic
of section contents). That is contrary to other data, where the section is
the atomic unit of inclusion/elimination.

2. From the viewpoint of gc, .eh_frame has reverse edges to sections.
Usually, if section A depends on section B, there's a relocation in A
pointing to B. For .eh_frame it is the opposite: if section A has exception
handling information in .eh_frame section B, B has a relocation against A.
This makes implementing a gc tricky, and combined with (1), it is even
trickier.

3. Comparing .eh_frame contents for equivalence is hard. In order to merge
functions by contents, we need to verify that their exception handling
information is also the same, but doing it isn't easy given the current
.eh_frame format.
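
As a rough illustration of points (1) and (2), the Python sketch below
models a single .eh_frame section as a list of per-function records. The
record layout (`covers`, `unwind`) is an assumption for the example and
is far simpler than the real encoding. It shows why following .eh_frame's
relocations like ordinary edges keeps every function alive, and why the
linker ends up filtering individual records instead.

```python
def filter_eh_records(records, live_text):
    """Keep only the records whose covered text section is live."""
    return [r for r in records if r["covers"] in live_text]

# Hypothetical contents of one monolithic .eh_frame section.
eh_frame_records = [
    {"covers": ".text.main", "unwind": b"\x01"},
    {"covers": ".text.unused", "unwind": b"\x02"},
]

# Naive approach: treat .eh_frame as live and follow its relocations.
# Every function it describes gets marked, so nothing is collected.
naive_live = {".text.main"} | {r["covers"] for r in eh_frame_records}

# What a linker has to do instead: decide liveness of the text sections
# first, then split .eh_frame and keep only the matching records.
kept = filter_eh_records(eh_frame_records, {".text.main"})
```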

So, I don't think .eh_frame needed to be designed that way. Maybe we can
improve it. Here is my rough idea:

1. We can emit an .eh_frame section for each .text section. So, if you pass
-ffunction-sections, the resulting object file would have multiple
.eh_frame sections. This makes .eh_frame a unit of garbage collection and
eliminates the need to parse .eh_frame contents. It also makes it very easy
to compare .eh_frame sections for function merging.

2. Make each .eh_frame section have a link to its .text section. We could
store the section index of a .text section in its corresponding .eh_frame's
sh_link field. This would make gc much easier. (If text section A is
pointed to by an .eh_frame section B via sh_link, then A being alive means
B is alive. The edge is still reversed, but this is much more manageable.)
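
A minimal sketch in Python of how gc could work under this proposal,
assuming one .eh_frame section per .text section and an `sh_link` entry
naming the text section. The dictionary layout and the distinct names
such as `.eh_frame.main` are invented for readability; in a real object
file the sections would all be named .eh_frame and sh_link would hold a
section index.

```python
def live_sections(sections, roots):
    """Mark live sections, treating sh_link as a reversed edge."""
    # Reverse index: text section -> sections whose sh_link names it.
    linked_by = {}
    for name, sec in sections.items():
        if sec["sh_link"] is not None:
            linked_by.setdefault(sec["sh_link"], []).append(name)

    live = set(roots)
    work = list(roots)
    while work:
        name = work.pop()
        # Follow ordinary relocations, plus any section that is linked
        # to this one via sh_link (A alive implies B alive).
        successors = list(sections[name]["relocs"]) + linked_by.get(name, [])
        for nxt in successors:
            if nxt not in live:
                live.add(nxt)
                work.append(nxt)
    return live

sections = {
    ".text.main": {"relocs": [".text.used"], "sh_link": None},
    ".text.used": {"relocs": [], "sh_link": None},
    ".text.unused": {"relocs": [], "sh_link": None},
    ".eh_frame.main": {"relocs": [".text.main"], "sh_link": ".text.main"},
    ".eh_frame.used": {"relocs": [".text.used"], "sh_link": ".text.used"},
    ".eh_frame.unused": {"relocs": [".text.unused"], "sh_link": ".text.unused"},
}
live = live_sections(sections, [".text.main"])
```

Here the .eh_frame sections are never roots, so `.text.unused` and its
.eh_frame section are both dropped without parsing any section contents.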

I think doing the above doesn't break compatibility with existing linkers,
and new linkers can take advantage of a format that is more linker-friendly.
I can't think of any obvious disadvantage, except that we would have more
sections, but I may be wrong as I'm no expert on .eh_frame.

What do you guys think?