[LLVMdev] Proposal: function prefix data

Thu Jul 18 18:06:36 PDT 2013

On Thu, Jul 18, 2013 at 02:59:42PM -0700, Sean Silva wrote:
> So far you only seem to have presented the GHC ABI use case (and see just
> below about UBSan). Do you know of any other use cases?

Not concretely.  I was referring to other language implementations which
might use runtime function metadata.

> > we might as well use it in UBSan as
> > opposed to something which is going to be strictly slower.
> >
> 
> Below you use UBSan's use case as a motivating example, which seems
> incongruous to this "might as well use it in UBSan" attitude. Without
> having evaluated alternatives as being "too slow" for UBSan, I don't think
> that UBSan's use case should be used to drive this proposal.

OK.  So let's approach this from the standpoint of GHC and other
languages that rely on runtime function metadata.  I would
argue that the client still ought to have some control over where the
data appears relative to the function.  This might be for the sake
of conformance with an existing ABI for that language.  For example,
in the existing GHC tables-next-to-code ABI, the data appears right
before the function.
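
To make the "data right before the function" layout concrete, here is a
rough model of it in Python.  It is only a sketch: the section is a plain
byte buffer, the 4-byte little-endian word size is an assumption, and the
names emit_function and read_prefix are hypothetical, not part of LLVM
or GHC.

```python
import struct

# Assumption: the prefix data is a single 4-byte little-endian word.
WORD = struct.Struct("<I")

def emit_function(section: bytearray, prefix_word: int, body: bytes) -> int:
    """Append the prefix word, then the body; return the entry offset.

    The returned offset plays the role of the function symbol: it points
    at the first byte of the body, just past the prefix data.
    """
    section.extend(WORD.pack(prefix_word))
    entry = len(section)
    section.extend(body)
    return entry

def read_prefix(section: bytearray, entry: int) -> int:
    """Load the word stored immediately before the entry point."""
    return WORD.unpack_from(section, entry - WORD.size)[0]

section = bytearray()
# The body bytes are placeholders standing in for machine code.
f = emit_function(section, 0x1234, b"\xc3")
g = emit_function(section, 0xBEEF, b"\x90\xc3")
```

Given only an entry offset, read_prefix(section, f) recovers the word
emitted for f, which is the property a runtime relies on when it looks
up metadata through a function pointer.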

Given the ability to do this, is it too much of a stretch for the
client to be able to specify where the symbol should be located?
This is something that will be needed anyway, as explained in my
symbol offset proposal.  And provided that the client behaves, it
shouldn't impose any additional burden on LLVM itself.

> I don't have an issue with target-dependent things per se; I just think
> that they should be given a bit more thought and not added unless existing
> mechanisms are insufficient. For example, could this be implemented as a
> late IR pass that adds a piece of inline-asm to the beginning of the
> function?

I don't like it, for four reasons:

1) Inline asm is just as target dependent as prefix data (perhaps even
   more so, if you consider that different targets may have different
   flavours of asm).

2) It takes control of the specific encoding of the instructions out of
   your hands, which can be important if you use it as a signature.

3) It inhibits optimisation, as it becomes more difficult to
   optimise away loads of the data through a known function pointer.

4) The backend will probably need to be taught to treat this particular
   piece of asm specially, i.e. by not emitting the function prologue
   until the asm has been emitted.  By contrast, the backend can be
   taught to emit prefix data trivially with two lines of code.

Thanks,
-- 
Peter