[llvm-commits] [PATCH] Add va_argpack intrinsics

Eli Friedman eli.friedman at gmail.com
Tue May 15 14:01:13 PDT 2012


On Tue, May 15, 2012 at 1:53 PM, Benjamin Kramer
<benny.kra at googlemail.com> wrote:
>
> On 15.05.2012, at 22:31, Eli Friedman wrote:
>
>> On Tue, May 15, 2012 at 12:10 PM, Benjamin Kramer
>> <benny.kra at googlemail.com> wrote:
>>>
>>> On 15.05.2012, at 20:47, Eli Friedman wrote:
>>>
>>>> On Tue, May 15, 2012 at 11:03 AM, Chandler Carruth <chandlerc at google.com> wrote:
>>>>> On Tue, May 15, 2012 at 11:36 AM, Eli Friedman <eli.friedman at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, May 15, 2012 at 10:03 AM, Benjamin Kramer
>>>>>> <benny.kra at googlemail.com> wrote:
>>>>>>>
>>>>>>> On 15.05.2012, at 18:16, Chandler Carruth wrote:
>>>>>>>
>>>>>>>> On Tue, May 15, 2012 at 9:08 AM, Benjamin Kramer
>>>>>>>> <benny.kra at googlemail.com> wrote:
>>>>>>>> This will be used to implement __builtin_va_arg_pack[len] in clang
>>>>>>>> (PR7219).
>>>>>>>>
>>>>>>>> Awesome, thanks for working on this...
>>>>>>>>
>>>>>>>> Since this is used to model a pretty crazy gcc extension (used by
>>>>>>>> glibc) we have
>>>>>>>> to expand the intrinsics in the inliner. When it replaces a vararg call
>>>>>>>> site with
>>>>>>>> a copy of the function another pass over all instructions is performed
>>>>>>>> which
>>>>>>>> replaces any pack intrinsics with the values from the call site. There
>>>>>>>> are some
>>>>>>>> rather narrow requirements where the intrinsics can occur, documented
>>>>>>>> in LangRef
>>>>>>>> and enforced by the verifier.
>>>>>>>>
>>>>>>>> This patch has two other side effects:
>>>>>>>> - DAE is not allowed to remove varargs if a va_pack intrinsic is used.
>>>>>>>> - Inliner can now inline vararg functions iff they don't use va_start.
>>>>>>>> This can
>>>>>>>>  happen if DAE doesn't remove the varargs because there is a va_pack
>>>>>>>> intrinsic
>>>>>>>>  or if the AlwaysInliner tries to inline the function.
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm pretty OK wit this strategy. It actually has some other interesting
>>>>>>>> artifacts I'd like to explore later. Specifically, I think we can generalize
>>>>>>>> this to allow inlining of varargs functions when the va_start is in dead
>>>>>>>> code...
>>>>>>
>>>>>> I'm actually mildly concerned about this strategy: the LLVM IR inliner
>>>>>> can't actually reason correctly about the ABI rules for a varargs
>>>>>> call, and will therefore screw up if you try to pass anything
>>>>>> complicated (_Complex, structs) through the varargs list.  This might
>>>>>> not matter in practice if it isn't used for anything other than the
>>>>>> printf and scanf families of functions, but it's still worth thinking
>>>>>> about.
>>>>>
>>>>>
>>>>> I share your concerns here regarding the pack and pack_len builtins...
>>>>> However, I think in the extremely limited context they are used, this is
>>>>> safe.
>>>>>
>>>>> The idea is that these builtins can only be used inside of some other vararg
>>>>> function, and the only arguments we move from the outer call to the inner
>>>>> are ones that would already have gone through Clang's vararg lowering. The
>>>>> goal seems specifically to re-use the way that the outer call is lowered
>>>>> when setting up the inlined inner call.
>>>>>
>>>>> I've been trying to think of a construct that breaks this, but so far I
>>>>> can't come up with any...
>>>>
>>>> Consider the following on x86-64:
>>>> void f(int x, int y, ...);
>>>> struct A { long long x,y; };
>>>> void g(int a, int b, int c, int d, ...) {
>>>>  //f(a,b,__builtin_va_pack());
>>>>  f(a,b,(struct A){},(struct A){});
>>>> }
>>>> void h() {
>>>>  struct A x;
>>>>  g(0,0,0,0,x,x);
>>>> }
>>>
>>> This is pretty insane and I'm afraid there is no solution without putting an inliner into clang :(
>>>
>>> Are you okay with "Passing aggregate types to a function using llvm.va_pack is undefined behavior." in the LangRef?
>>
>> It's "anything which doesn't map to a single LLVM scalar type".  I
>
> ok
>
>> really have two concerns here: one, we don't want to silently
>> miscompile code, and two, we need to make sure this limitation doesn't
>> actually bite in the cases we care about.
>
> We can't rule out miscompiling code as the information what's passed to
> the function is lost at the IR level. I'm only aware of one (sadly high-profile)
> user of __builtin_va_arg_pack, glibc. It uses it to wrap certain functions
> like printf, open or syslog, which should be safe.

If someone generates the IR intrinsic, fine, they should know what
they're doing.  I'm more worried about the clang side.

-Eli




More information about the llvm-commits mailing list