[cfe-dev] Help with register allocation for undef inputs to inline asm

Mon Jul 12 08:07:22 PDT 2021

On 7/12/21 10:55 AM, David Spickett wrote:
>> forgive me, but isn't that asm just specifying inputs?  There are no outputs, so the allocator doesn't know anything's being clobbered?
> 
> Yes. I wasn't aware that "+" was a thing but that looks like the
> proper way to do this.
> 
> "Operands using the ‘+’ constraint modifier count as two operands
> (that is, both as input and output) towards the total maximum of 30
> operands per asm statement."
> 
> And you're right: without the clobber info you're going to have issues
> if this asm is anywhere other than at the end of a void-returning
> function (which the original report was). Using + solves both issues,
> thanks!

Great! A couple of other points:

a) Be aware of the '&' constraint modifier. It marks an output as an
'early clobber': the output may be written *before* all the inputs have
been read, so it cannot reside in the same register as an input.

b) In your example, 'volatile' could inhibit optimization; there's
nothing volatile about the actual asm.  Of course, the code you
derived this from may have hidden side effects.  Generally, "set
magic_reg, %[val]" kinds of asms need it.
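
The two points above can be sketched together.  This is only an
illustration, not code from the bug report: the function names add_early
and twice are made up, and the instruction sequences are guarded by
architecture macros with a plain C fallback so it should build elsewhere.
It assumes GCC or Clang with the default (AT&T) syntax on x86-64.

```c
/* Hypothetical helper: returns a + b using a two-instruction sequence.
 * 'out' is written by the first instruction, before 'b' has been read,
 * so it is marked early-clobber ("=&r").  Without the '&' the compiler
 * would be allowed to place 'out' and 'b' in the same register. */
static unsigned long long add_early(unsigned long long a,
                                    unsigned long long b)
{
    unsigned long long out;
#if defined(__x86_64__)
    __asm__("mov %[a], %[out]\n\t"      /* out = a (b not yet read) */
            "add %[b], %[out]"          /* out += b */
            : [out] "=&r"(out)
            : [a] "r"(a), [b] "r"(b)
            : "cc");
#elif defined(__aarch64__)
    __asm__("mov %[out], %[a]\n\t"      /* out = a (b not yet read) */
            "add %[out], %[out], %[b]"  /* out += b */
            : [out] "=&r"(out)
            : [a] "r"(a), [b] "r"(b));
#else
    out = a + b;                        /* fallback: no inline asm */
#endif
    return out;
}

/* Hypothetical helper: doubles x in place through a "+r" operand.
 * No 'volatile' here: the asm has an output and no hidden side effects,
 * so the compiler is free to drop it when the result is unused. */
static unsigned long long twice(unsigned long long x)
{
#if defined(__x86_64__)
    __asm__("add %[x], %[x]" : [x] "+r"(x) : : "cc");
#elif defined(__aarch64__)
    __asm__("add %[x], %[x], %[x]" : [x] "+r"(x));
#else
    x += x;                             /* fallback: no inline asm */
#endif
    return x;
}
```

With these, add_early(2, 3) gives 5 and twice(21) gives 42.  As far as I
know, GCC documents an asm with no output operands as implicitly
volatile, so the keyword only starts to matter once the operands become
outputs, as in the "+r" version.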

> 
> On Mon, 12 Jul 2021 at 14:26, Nathan Sidwell <nathan at acm.org> wrote:
>>
>> On 7/8/21 11:35 AM, David Spickett via cfe-dev wrote:
>>> (since this is inline asm I'm sending this to cfe-dev, though it
>>> includes llvm's logic)
>>>
>>> I've been looking at this bug report:
>>> https://bugs.llvm.org/show_bug.cgi?id=50647
>>> (the initial report is for ARM but I've found it applies to any architecture)
>>>
>>> Where an undef input to an inline asm statement isn't assigned its own
>>> register and overlaps a second input value. Here's a minimal version
>>> of it:
>>> void func(unsigned long long n)
>>> {
>>>       unsigned long long b;
>>>       n = 99;
>>>
>>>       __asm__ volatile (
>>>           "add %[_b], %[_b], %[_b] \n\t" // Assigned register X
>>>           "add %[_n], %[_n], %[_n] \n\t" // Also assigned register X
>>>           :
>>>           : [_n] "r" (n), [_b] "r" (b)
>>>       );
>>> }
>>>
>>> Godbolt: https://godbolt.org/z/bro9hde46
>>>
>>> This produces an inline asm statement in IR where the input for "b" is undef.
>>> tail call void asm sideeffect "add $1, $1, $1 \0A\09add $0, $0, $0 \0A\09", "r,r"(i64 99, i64 undef) #2, !dbg !22, !srcloc !23
>>
>> forgive me, but isn't that asm just specifying inputs?  There are no
>> outputs, so the allocator doesn't know anything's being clobbered?
>>
>> Indeed, if I change it to:
>>      __asm__ volatile (
>>           "add %[_b], %[_b], %[_b] \n\t"
>>           "add %[_n], %[_n], %[_n] \n\t"
>>           : [_n] "+r" (n), [_b] "+r" (b)
>>           :
>>       );
>>
>> I get different registers for _n and _b.
>>
>>>
>>> This makes sense and I can see intuitively why you wouldn't assign a
>>> unique register to an undef value. It has no value after all, it could
>>> be anything including the same value as the other input. I tracked
>>> this decision down to somewhere in the VirtRegRewriter pass but I
>>> haven't been able to pin down the exact place yet.
>>>
>>> My question is:
>>> Would making an exception here for the inline asm case make sense? Or
>>> is this an instance of "undef values give you undef results", in a way
>>> that we would be happy to keep? (FWIW gcc does assign unique registers
>>> in this case.)
>>>
>>> Thanks,
>>> David Spickett.
>>>
>>
>>
>> --
>> Nathan Sidwell

-- 
Nathan Sidwell