<div dir="ltr">On Thu, Jul 18, 2013 at 12:13 PM, Peter Collingbourne <span dir="ltr"><<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">On Wed, Jul 17, 2013 at 07:50:58PM -0700, Sean Silva wrote:<br>


> On Wed, Jul 17, 2013 at 6:06 PM, Peter Collingbourne <<a href="mailto:peter@pcc.me.uk">peter@pcc.me.uk</a>> wrote:<br>

><br>

> > Hi,<br>

> ><br>

> > I would like to propose that we introduce a mechanism in IR to allow<br>

> > arbitrary data to be stashed before a function body.  The purpose of<br>

> > this would be to allow additional data about a function to be looked<br>

> > up via a function pointer.  Two use cases come to mind:<br>

> ><br>

> > 1) We'd like to be able to use UBSan to check that the type of the<br>

> >    function pointer of an indirect function call matches the type of<br>

> >    the function being called.  This can't really be done efficiently<br>

> >    without storing type information near the function.<br>

> ><br>

><br>

> How efficient does it have to be? Have some alternatives already proven to<br>

> be "too slow"? (e.g. a binary search into a sorted table)<br>
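For concreteness, the sorted-table alternative could look roughly like the C++ sketch below.<br>
The names FnTypeEntry and lookupFnType and the integer type IDs are hypothetical (not part of UBSan or LLVM),<br>
and the table is assumed to be sorted by function address.<br>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table entry: a function's entry address and an opaque type ID.
struct FnTypeEntry {
  std::uintptr_t Addr;
  std::uint32_t TypeId;
};

// Returns the entry whose Addr equals FnAddr, or nullptr if there is none.
// Assumes Table[0..N) is sorted by Addr in ascending order; O(log N) per lookup.
inline const FnTypeEntry *lookupFnType(const FnTypeEntry *Table, std::size_t N,
                                       std::uintptr_t FnAddr) {
  const FnTypeEntry *End = Table + N;
  const FnTypeEntry *It = std::lower_bound(
      Table, End, FnAddr,
      [](const FnTypeEntry &E, std::uintptr_t A) { return E.Addr < A; });
  return (It != End && It->Addr == FnAddr) ? It : nullptr;
}
```

Each checked indirect call would then pay for a logarithmic search through the table,<br>
whereas data stored next to the function is a single load at a fixed offset from the callee address.<br>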

<br>

</div>This has admittedly not been measured.  It depends on the rate at<br>

which the program performs indirect function calls.  But given the<br>

other use cases for this feature</blockquote><div><br></div><div>So far you only seem to have presented the GHC ABI use case (and see just below about UBSan). Do you know of any other use cases?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

 we might as well use it in UBSan as<br>

opposed to something which is going to be strictly slower.<br></blockquote><div><br></div><div>Below you use UBSan's use case as a motivating example, which seems incongruous with this "might as well use it in UBSan" attitude. Until alternatives have been evaluated and shown to be "too slow" for UBSan, I don't think that UBSan's use case should be used to drive this proposal.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div><div class="h5"><br>

> > 2) Allowing GHC's tables-next-to-code ABI [1] to be implemented.<br>

> >    In general, I imagine this feature could be useful for the<br>

> >    implementation of languages which require runtime metadata for<br>

> >    each function.<br>

> ><br>

> > The proposal is that an IR function definition acquires a constant<br>

> > operand which contains the data to be emitted immediately before<br>

> > the function body (known as the prefix data).  To access the data<br>

> > for a given function, a program may bitcast the function pointer to<br>

> > a pointer to the constant's type.  This implies that the IR symbol<br>

> > points to the start of the prefix data.<br>

> ><br>

> > To maintain the semantics of ordinary function calls, the prefix data<br>

> > must have a particular format.  Specifically, it must begin with a<br>

> > sequence of bytes which decode to a sequence of machine instructions,<br>

> > valid for the module's target, which transfer control to the point<br>

> > immediately succeeding the prefix data, without performing any other<br>

> > visible action.  This allows the inliner and other passes to reason<br>

> > about the semantics of the function definition without needing to<br>

> > reason about the prefix data.  Obviously this makes the format of the<br>

> > prefix data highly target dependent.<br>

> ><br>

><br>

> I'm not sure that something this target dependent is the right choice. Your<br>

> example below suggests that the frontend would then need to know magic to<br>

> put "raw" in the instruction stream. Have you considered having the feature<br>

> expose just the intent "store this data attached to the function, to be<br>

> accessed very quickly", and then have an intrinsic<br>

> ("llvm.getfuncdata.i{8,16,32,64}"?) which extracts the data in a<br>

> target-dependent way?<br>

<br>

</div></div>The problem is that things like UBSan need to be able to understand<br>

the instruction stream anyway (to a certain extent).  In UBSan's case,<br>

determining at runtime whether a function has prefix data depends on<br>

a specific signature of instructions at the start of the function.<br>

There are a wide variety of signatures that can be used here and<br>

I believe we shouldn't try to constrain the frontend author with a<br>

signature (at least partly) of our own design.<br>

<br>

I think that if someone wants a target-independent way of<br>

embedding prefix data it should be done as a library on top of the<br>

target-dependent facilities provided in IR.  One could imagine a set<br>

of routines like this:<br>

<br>

/// Given some constant data, attach valid prefix data.<br>

void attachPrefixData(Function *F, Constant *Data);<br>

<br>

/// Returns an i1 indicating whether prefix data is present for FP.<br>

Value *hasPrefixData(Value *FP);<br>

<br>

/// Returns a pointer to the prefix data for FP.<br>

Value *getPrefixDataPointer(Value *FP, Type *DataType);<br>
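<br>
As a rough illustration of what the last two routines would compute, here is a self-contained C++ sketch<br>
that performs the equivalent check on raw bytes instead of emitting IR.  It assumes the x86_64 UBSan layout<br>
from the example in the original proposal (the 4-byte signature EB 0A 46 54 followed by an 8-byte pointer).<br>
hasPrefixSignature and readPrefixPayload are hypothetical names, and a byte buffer stands in for real code.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Signature bytes from the proposal's x86_64 example: jmp .+0x0c, then 'F', 'T'.
constexpr unsigned char kSignature[4] = {0xEB, 0x0A, 0x46, 0x54};
// Signature plus an 8-byte pointer payload: 12 bytes in total.
constexpr std::size_t kPrefixSize = sizeof(kSignature) + sizeof(std::uint64_t);

// True if the bytes at Code begin with the signature.
inline bool hasPrefixSignature(const unsigned char *Code) {
  return std::memcmp(Code, kSignature, sizeof(kSignature)) == 0;
}

// Reads the 8-byte payload that follows the signature.
// memcpy is used because the payload sits at offset 4 and may be unaligned.
inline std::uint64_t readPrefixPayload(const unsigned char *Code) {
  assert(hasPrefixSignature(Code) && "no prefix data at this address");
  std::uint64_t Payload;
  std::memcpy(&Payload, Code + sizeof(kSignature), sizeof(Payload));
  return Payload;
}
```

In this sketch the check is a 4-byte compare at the callee address and the lookup is one 8-byte load,<br>
which is the property the target-dependent format is meant to provide.<br>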

<div class="im"><br>

> Forcing clients to embed deep<br>

> target-specific-machine-code knowledge in their frontends seems like a step<br>

> in the wrong direction for LLVM.<br>

<br>

</div>Given a set of routines such as the ones described above, I think we<br>

can give frontends a choice of whether to do this or not.  Besides,<br>

LLVM already contains plenty of target-specific information in its IR:<br>

varargs, inline asm, calling conventions, etc.  I don't think making<br>

all aspects of the IR target-independent is a worthwhile goal<br>

<div class="im">for LLVM.<br></div></blockquote><div><br></div><div>I don't have an issue with target-dependent things per se; I just think that they should be given a bit more thought and not added unless existing mechanisms are insufficient. For example, could this be implemented as a late IR pass that adds a piece of inline-asm to the beginning of the function?</div>

<div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">


<br>

> > This requirement could be relaxed when combined with my earlier symbol<br>

> > offset proposal [2] as applied to functions.  However, this is outside<br>

> > the scope of the current proposal.<br>

> ><br>

> > Example:<br>

> ><br>

> > %0 = type <{ i32, i8* }><br>

> ><br>

> > define void @f() prefix %0 <{ i32 1413876459, i8* bitcast ({ i8*, i8* }*<br>

> > @_ZTIFvvE to i8*) }> {<br>

> >   ret void<br>

> > }<br>

> ><br>

> > This is an example of something that UBSan might generate on an<br>

> > x86_64 machine.  It consists of a signature of 4 bytes followed by a<br>

> > pointer to the RTTI data for the type 'void ()'.  The signature when<br>

> > laid out as a little endian 32-bit integer decodes to the instruction<br>

> > 'jmp .+0x0c' (which jumps to the instruction immediately succeeding<br>

> > the 12-byte prefix) followed by the bytes 'F' and 'T' which identify<br>

> > the prefix as a UBSan function type prefix.<br>
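<br>
The arithmetic behind the constant 1413876459 can be checked with a small C++ sketch.<br>
The helper names are hypothetical; the only facts assumed are that the x86 short jump is encoded as<br>
opcode 0xEB followed by an 8-bit displacement measured from the end of the 2-byte instruction,<br>
and that character literals are ASCII.<br>

```cpp
#include <cstdint>

// x86 short jump: opcode 0xEB plus an 8-bit displacement that is relative
// to the end of the 2-byte jump instruction.
constexpr std::uint8_t kJmpRel8 = 0xEB;

// Packs "jmp past the prefix; Tag0; Tag1" into the i32 that, when stored
// little-endian, yields those four bytes in order.
constexpr std::uint32_t makeSignature(std::uint8_t PrefixSize, char Tag0,
                                      char Tag1) {
  return std::uint32_t(kJmpRel8) |
         std::uint32_t(std::uint8_t(PrefixSize - 2)) << 8 |
         std::uint32_t(std::uint8_t(Tag0)) << 16 |
         std::uint32_t(std::uint8_t(Tag1)) << 24;
}

// Offset, from the start of the prefix, at which the jump lands.
// Only valid for non-negative displacements (the byte is read as unsigned).
constexpr unsigned jumpTarget(std::uint32_t Sig) {
  return 2 + ((Sig >> 8) & 0xFF);
}

static_assert(makeSignature(12, 'F', 'T') == 1413876459u,
              "matches the constant in the example");
static_assert(jumpTarget(1413876459u) == 12,
              "lands immediately after the 12-byte prefix");
```

The constant is 0x54460AEB, i.e. the bytes EB 0A 46 54 in memory: a jump with displacement 0x0a<br>
from the end of the 2-byte instruction (offset 12 from the start), then 'F' and 'T'.<br>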

> ><br>

><br>

> Do you know whether OoO CPUs will still attempt to decode the "garbage" in<br>

> the instruction stream, even if there is a jump over it? (IIRC they will<br>

> decode ahead of the PC and hiccup (but not fault) on garbage). Maybe it<br>

> would be better to steganographically encode the value inside the<br>

> instruction stream? On x86 you could use 48b8<imm64> which only has 2 bytes<br>

> overhead for an i64 (putting a move like that, which moves into a<br>

> caller-saved register on entry, would effectively be a noop).<br>
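<br>
The 48b8<imm64> idea can be made concrete with a short C++ sketch.  On x86-64 the bytes 48 B8<br>
followed by eight immediate bytes encode "movabs $imm64, %rax", so the payload rides inside a valid<br>
instruction.  The helper names below are hypothetical.<br>

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Builds the 10-byte encoding of `movabs $Payload, %rax`: REX.W (0x48),
// opcode 0xB8, then the immediate in little-endian byte order.
inline std::array<std::uint8_t, 10> encodeMovabsRax(std::uint64_t Payload) {
  std::array<std::uint8_t, 10> Bytes{};
  Bytes[0] = 0x48;
  Bytes[1] = 0xB8;
  for (int I = 0; I < 8; ++I)
    Bytes[2 + I] = std::uint8_t(Payload >> (8 * I));
  return Bytes;
}

// Recovers the payload into Payload and returns true, or returns false
// if Code does not start with the 48 B8 opcode bytes.
inline bool decodeMovabsRax(const std::uint8_t *Code, std::uint64_t &Payload) {
  if (Code[0] != 0x48 || Code[1] != 0xB8)
    return false;
  Payload = 0;
  for (int I = 0; I < 8; ++I)
    Payload |= std::uint64_t(Code[2 + I]) << (8 * I);
  return true;
}
```

One caveat with this particular register choice: under the System V x86-64 ABI, %al carries the<br>
vector-register count into variadic callees, so overwriting %rax on entry would not be a no-op there.<br>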

<br>

</div>On the contrary, I think this is a good argument for allowing<br>

(not forcing) frontends to encode the prefix data as they please,<br>

thus enabling this kind of creativity.<br>

<div class="im"><br>

> This is some<br>

> pretty gnarly target-dependent stuff which seems like it would best be<br>

> hidden in the backend (e.g. architectures that have "constant island"-like<br>

> passes might want to stash the data in there instead).<br>

<br>

</div>I think that support for things like constant islands is<br>

something that can be added incrementally at a later stage.  One could<br>

consider for example an additional llvm::Function field which specifies<br>

the number of bytes that the backend may use at the beginning of the<br>

function such that the prefix data may be of any format.  (Once this<br>

is in place the aforementioned library routines could become relatively<br>

trivial.)  The backend could use this space to, say, insert a relative<br>

branch that skips the prefix data and a first constant island.<br>

<br>

Thanks,<br>

<span class=""><font color="#888888">--<br>

Peter<br>

</font></span></blockquote></div><br></div></div>