[llvm-dev] [cfe-dev] [RFC] ASM Goto With Output Constraints

Fri Jun 28 14:53:13 PDT 2019

On Fri, Jun 28, 2019 at 1:48 PM James Y Knight <jyknight at google.com> wrote:

> On Fri, Jun 28, 2019 at 3:00 PM Bill Wendling <isanbard at gmail.com> wrote:
>
>> On Thu, Jun 27, 2019 at 1:44 PM Bill Wendling <isanbard at gmail.com> wrote:
>>
>>> On Thu, Jun 27, 2019 at 1:29 PM James Y Knight <jyknight at google.com>
>>> wrote:
>>>
>>>> I think this is fine, except that it stops at the point where things
>>>> actually start to get interesting and tricky.
>>>>
>>>> How will you actually handle the flow of values from the callbr into
>>>> the error blocks? A callbr can specify requirements on where its outputs
>>>> live. So, what if two callbr, in different branches of code, specify
>>>> _different_ constraints for the same output, and list the same block as a
>>>> possible error successor? How can the resulting phi be codegened?
>>>>
>>>> This is where I fall back on the statement about how "the programmer
>>> knows what they're doing". Perhaps I'm being too cavalier here? My concern,
>>> if you want to call it that, is that we don't be too restrictive on the new
>>> behavior. For example, the "asm goto" may set a register to an error value
>>> (made up on the spot; may not be a common use). But, if there's no real
>>> reason to have the value be valid on the abnormal path, then sure we can
>>> declare that it's not valid on the abnormal path.
>>>
>>> I think I should explain my "programmer knows what they're doing"
>> statement a bit better. I'm specifically referring to inline asm here. The
>> more general "callbr" case may still need to be considered (see Reid's
>> reply).
>>
>> When a programmer uses inline asm, they're implicitly telling the
>> compiler that they *do* know what they're doing  (I know this is common
>> knowledge, but I wanted to reiterate it.). In particular, either they need
>> to reference an instruction not readily available from the compiler (e.g.
>> "cpuid") or the compiler isn't able to give them the needed performance in
>> a critical section. I'm extending this sentiment to callbr with output
>> constraints. Let's take your example below and write it as "normal" asm
>> statements one on each branch of an if-then-else (please ignore any syntax
>> errors):
>>
>> if:
>>   br i1 %cmp, label %true, label %false
>>
>> true:
>>   %0 = call { i32, i32 } asm sideeffect "poetry $0, $1", "={r8},={r9}" ()
>>   br label %end
>>
>> false:
>>   %1 = call { i32, i32 } asm sideeffect "poetry2 $0, $1", "={r10},={r11}"
>> ()
>>   br label %end
>>
>> end:
>>   %vals = phi { i32, i32 } [ %0, %true ], [ %1, %false ]
>>
>> How is this handled in codegen? Is it an error or does the back-end
>> handle it? Whatever's done today for "normal" inline asm is what I *think*
>> should be the behavior for the inline asm callbr variant. If this doesn't
>> seem sensible (and I realize that I may be thinking of an "in a perfect
>> world" scenario), then we'll need to come up with a more sensible solution
>> which may be to disallow the values on the error block until we can think
>> of a better way to handle them.
>>
>
> This example is no problem, because instructions can be emitted between
> what's emitted by "call asm" and the end of the block (be it a fallthrough,
> or a jump instruction. What gets emitted there is a move of the output
> register to another location -- either a register or to the stack. And
> therefore at the beginning of the "end" block, "%vals" is always in a
> consistent location, no matter how you got to that block.
>
> But in the callbr case, there is not a location at which those moves can
> be emitted, after the callbr, before the jump to "error".
>

I see what you mean. Let's say we create a pseudo-instruction (similar to
landingpad, et al) that needs to be lowered by the backend in a reasonable
manner. The EH stuff has an external process/library that performs the
actual unwinding and which sets the values accordingly. We won't have this.
What we could do instead is split the edges and insert the copy-to-<where
ever> statements there. So something like:

>>>

bb1:

  callbr ... [label %asm.goto.dest]

bb2:

  callbr ... [label %asm.goto.dest]

asm.goto.dest:

  ...

<<<

converted to something like:

>>>

bb1:

  callbr ... [label %asm.goto.dest.bb1]

bb2:

  callbr ... [label %asm.goto.dest.bb2]

asm.goto.dest.bb1:

  %v.bb1 = extractvalue ...

  br label %asm.goto.dest

asm.goto.dest.bb2:

  %v.bb2 = extractvalue ...

  br label %asm.goto.dest

asm.goto.dest:

  %v = phi [%v.bb1, label %asm.goto.dest.bb1], [%v.bb2, label %asm.goto.bb2]

  ...

  ...

<<<

It's not 100% not barfy, but it's what the compiler does in similar
situations.

-bw
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190628/5aa00255/attachment.html>