[llvm-dev] Extracting values from tokens

Thu Aug 27 15:24:19 PDT 2015


On 08/26/2015 11:38 AM, Joseph Tremoulet wrote:
>
> Hi,
>
> Now that we have the token type (http://reviews.llvm.org/rL245029), I 
> need an operation that will "extract" a non-token value from a token.  
> I know people have several use cases in mind for tokens, so I wanted 
> to solicit feedback on how general the solution should be (so I've 
> cc'ed the people from the review of the token change). I'm also 
> interested in getting consensus so that as "extraction"s get added for 
> each use case they have similar look-and-feel.
>
> My particular need here is very narrow:  I need the 'catchpad' 
> operation to define a value which is a pointer to the on-heap 
> exception object it catches (which my target's personality routine 
> will supply to the handler code).  Since the 'catchpad' operation is 
> defined as producing a token, in order to get at the exception pointer 
> I need some operation that can take that token as input and produce 
> the exception pointer as output.
>
> Going fully general, I could imagine having an operator with a name 
> like 'tokenextract' that is parameterized by the type it produces and 
> accepts one argument of type token plus zero or more arguments of 
> arbitrary type which indicate what is being extracted.  If we're ever 
> going to want to support orthogonal kinds of extractions operating on 
> the same token value, I think that approach would break down because 
> it doesn't give a good way to specify which kind of extraction is 
> being performed.  On the other hand, I think it's entirely plausible 
> that each token-producing operator will only ever have a fixed set of 
> extractions that make sense for it, so this could be a workable 
> solution under the assumption that the way to interpret '%x = 
> tokenextract %tok, ty1 %arg1, ty2 %arg2' (for the sake of e.g. 
> lowering out some construct that is represented using token linkage) 
> is to first look at the operator defining %tok, and then interpret the 
> selector args in the context of that operation.  This in turn implies 
> that each token-producing operator's definition (in the Lang Ref) 
> should spell out what can be extracted from it and what its convention 
> for selector args is.  To my mind, that's a bit too convoluted, and 
> the informal description of an operator's selector arg convention 
> really seems like something that one ought to be able to specify as 
> typing rules.
>
> So I find myself arguing against a fully general solution here.  I 
> think instead it makes sense for each kind of extraction to specify an 
> intrinsic that represents it, with the argument/return types specified 
> in the usual way as the signature of the intrinsic.  And on a 
> case-by-case basis any intrinsic could be replaced with an 
> instruction, following the same process that any other operation 
> follows as it finds its way into the IR.
>
After reading your description, I find myself with no strong opinion in 
either direction.  Your discussion of the pros and cons of each approach 
covers the topic well.  Since I don't see a compelling argument for one 
over the other, I'd be perfectly willing to go either way.  I'd probably 
lean towards the generic version myself, but I'm happy to defer to the 
people actually working with the mechanism at the moment.
>
> Ironically, the intrinsic approach that I'm advocating is awkward for 
> my actual use case of extracting an exception pointer from a catchpad 
> -- the argument and return types should really be dictated by the 
> personality routine, and so can vary from function to function, but 
> intrinsics only support a limited form of overloading.  But I think it 
> would be ok to start with an intrinsic (called @llvm.eh.get_pad_param 
> or something) that can be overloaded to return anyptr (or maybe anyptr 
> + anyint) and not worry about more overloading until/unless we have 
> more use cases.
>
Seems reasonable to me.

We could also go with a generic mechanism based on a variadic intrinsic 
if we wanted.  We have all of the building blocks for this between 
gc.result (an overloaded result type) and gc.statepoint (a variadic 
argument list).  If we combined a variadic argument list with an "any" 
result type, we'd get an intrinsic with close to the semantics of the 
instruction you were considering.  We could potentially use this to 
prototype both approaches and see which one appears less ugly.
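
To make the comparison concrete, here is a rough sketch of how the two 
shapes might look in IR.  Everything in it is an assumption made for 
illustration: neither intrinsic exists, "@llvm.token.extract" is a 
made-up name, the ".p0i8" suffix is a guessed overload mangling, and the 
"i32 0" selector argument has no defined meaning.

```llvm
; Sketch only: neither intrinsic below is an existing LLVM API.

; (a) Per-use-case intrinsic, overloaded on the returned pointer type.
;     The name follows the @llvm.eh.get_pad_param suggestion quoted above;
;     the ".p0i8" suffix is an assumed mangling for an i8* result.
declare i8* @llvm.eh.get_pad_param.p0i8(token)

; (b) Generic form: one token operand, variadic selector arguments, and an
;     overloaded result type.  "@llvm.token.extract" is hypothetical.
declare i8* @llvm.token.extract.p0i8(token, ...)

; Possible uses, assuming %pad is the token defined by a catchpad:
;   %exn  = call i8* @llvm.eh.get_pad_param.p0i8(token %pad)
;   %exn2 = call i8* (token, ...) @llvm.token.extract.p0i8(token %pad, i32 0)
```

In (a) the signature itself says what is being extracted.  In (b) the 
meaning of the trailing arguments would have to be defined per 
token-producing operation, which is the convention problem you describe.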
>
> Thoughts?
>
> Thanks
>
> -Joseph
>
