[llvm-dev] Extracting values from tokens

Wed Aug 26 11:38:15 PDT 2015

Hi,

Now that we have the token type (http://reviews.llvm.org/rL245029), I need an operation that will "extract" a non-token value from a token.  I know people have several use cases in mind for tokens, so I wanted to solicit feedback on how general the solution should be (which is why I've cc'ed the people from the review of the token change).  I'm also interested in getting consensus so that, as extractions are added for each use case, they share a similar look and feel.

My particular need here is very narrow:  I need the 'catchpad' operation to define a value which is a pointer to the on-heap exception object it catches (which my target's personality routine will supply to the handler code).  Since the 'catchpad' operation is defined as producing a token, in order to get at the exception pointer I need some operation that can take that token as input and produce the exception pointer as output.

Going fully general, I could imagine having an operator with a name like 'tokenextract' that is parameterized by the type it produces and accepts one argument of type token plus zero or more arguments of arbitrary type which indicate what is being extracted.  If we ever want to support orthogonal kinds of extraction operating on the same token value, I think that approach would break down, because it gives no good way to specify which kind of extraction is being performed.  On the other hand, I think it's entirely plausible that each token-producing operator will only ever have a fixed set of extractions that make sense for it.  In that case this could be a workable solution, under the assumption that the way to interpret '%x = tokenextract %tok, ty1 %arg1, ty2 %arg2' (for the sake of e.g. lowering out some construct that is represented using token linkage) is to first look at the operator defining %tok, and then interpret the selector args in the context of that operation.  This in turn implies that each token-producing operator's definition (in the Lang Ref) should spell out what can be extracted from it and what its convention for selector args is.  To my mind, that's a bit too convoluted, and the informal description of an operator's selector arg convention really seems like something that one ought to be able to specify as typing rules.
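To make the objection concrete, here is a toy Python model (not LLVM code) of how a fully general 'tokenextract' would have to be interpreted.  Everything in it is an assumption made for illustration: the TokenDef class, the "example_op" opcode, and both selector conventions are invented, and the real catchpad carries more than this.  The point is only that the selector args cannot be checked until you have looked up the operator that defined the token.

```python
from dataclasses import dataclass, field


@dataclass
class TokenDef:
    """Toy stand-in for a token-producing operation (hypothetical)."""
    opcode: str
    payload: dict = field(default_factory=dict)


def catchpad_convention(tok, selectors):
    # Assumed convention: a catchpad exposes one value and takes no selectors.
    if selectors:
        raise TypeError("catchpad: expected no selector args")
    return tok.payload["exception_ptr"]


def example_op_convention(tok, selectors):
    # Assumed convention for an invented operator: one integer index.
    if len(selectors) != 1 or not isinstance(selectors[0], int):
        raise TypeError("example_op: expected a single integer selector")
    return tok.payload["values"][selectors[0]]


# Each token-producing opcode documents its own selector convention,
# much as each operator's Lang Ref entry would have to.
CONVENTIONS = {
    "catchpad": catchpad_convention,
    "example_op": example_op_convention,
}


def tokenextract(tok, *selectors):
    """Interpret selectors in the context of the operator that defined tok."""
    convention = CONVENTIONS.get(tok.opcode)
    if convention is None:
        raise TypeError(f"{tok.opcode}: nothing can be extracted")
    return convention(tok, selectors)


pad = TokenDef("catchpad", {"exception_ptr": "%exn"})
other = TokenDef("example_op", {"values": ["%a", "%b"]})
print(tokenextract(pad))       # selectors valid only for catchpad
print(tokenextract(other, 1))  # same operator name, different convention
```

Note that 'tokenextract(pad, 1)' is rejected only after dispatching on the defining opcode; nothing in the operator's own signature rules it out, which is the part that feels like it ought to be a typing rule.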

So I find myself arguing against a fully general solution here.  I think instead it makes sense for each kind of extraction to have its own intrinsic, with the argument/return types specified in the usual way as the intrinsic's signature.  On a case-by-case basis, any such intrinsic could later be replaced with an instruction, following the same process that any other operation follows as it finds its way into the IR.
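For contrast, the per-extraction approach can be sketched in the same toy Python style (again not LLVM code).  The Intrinsic class and check_call helper are invented for illustration, and the signature shown for the exception-pointer intrinsic (one token argument, an i8* result) is an assumption rather than a settled design.  What it shows is that the check is an ordinary signature match, with no need to inspect the operator that produced the token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsic:
    """Toy model of an intrinsic declaration (hypothetical)."""
    name: str
    param_types: tuple
    return_type: str


def check_call(intrinsic, arg_types):
    """Return the result type if the argument types match the signature."""
    if tuple(arg_types) != intrinsic.param_types:
        raise TypeError(
            f"{intrinsic.name}: expected {intrinsic.param_types}, "
            f"got {tuple(arg_types)}"
        )
    return intrinsic.return_type


# Assumed signature: takes the pad's token, returns a pointer.
GET_PAD_PARAM = Intrinsic(
    name="llvm.eh.get_pad_param",
    param_types=("token",),
    return_type="i8*",
)

print(check_call(GET_PAD_PARAM, ["token"]))
```

Each new kind of extraction would add another declaration of this shape, so the "what can be extracted and how" question is answered by the signature itself.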

Ironically, the intrinsic approach that I'm advocating is awkward for my actual use case of extracting an exception pointer from a catchpad -- the argument and return types should really be dictated by the personality routine, and so can vary from function to function, but intrinsics only support a limited form of overloading.  But I think it would be ok to start with an intrinsic (called @llvm.eh.get_pad_param or something) that can be overloaded to return anyptr (or maybe anyptr + anyint) and not worry about more overloading until/unless we have more use cases.

Thoughts?

Thanks
-Joseph