[LLVMdev] Two labels around one instruction in Codegen

Tue Nov 6 09:18:31 PST 2007

Duncan Sands wrote:
> Hi Nicolas,
>
>   
>> In order to have exceptions for non-call instructions (such as sdiv,
>> load or stores), I'm modifying codegen so that it generates a BeginLabel
>> and an EndLabel between the "may throwing" instruction. This is what the
>> codegen of an InvokeInst does.
>>     
>
> the rule is that all instructions between eh begin labelN and eh end labelN
> must unwind to the same landing pad.  This is why invokes are bracketed by
> such labels.  There are also two other cases to consider: (1) potentially
> throwing instructions which are not allowed to throw (nounwind), 

What do you mean by "not allowed"? Is this decided by the front-end, or by
an optimization pass (div may throw, but if we have a = b / 5 we know it
won't throw)?

> (2) throwing
> instructions for which any thrown exception will not be processed in this
> function. 

I'm not sure I understand here.

>  In case (1) the instruction should have no entry in the final
> dwarf exception table, while in case (2) it should have an entry.  We don't
> handle (1) right now, however the plan is that nounwind calls will also be
> bracketed by labels but will have no associated landing pad. 

Why would they be bracketed by labels if codegen knows they don't throw?

>  As for (2),
> the dwarf writer scans all instructions in the function and if it sees a
> call that is not bracketed by labels then it generates an appropriate entry
> in the exception table 

Do you mean "that _is_ bracketed by labels"?

> (this will of course need to be modified to consider
> all throwing instructions - note that this means that "maythrow" markings will
> have to exist right to the end of code generation!); it is done this way
> because labels inhibit optimizations (we used to bracket all calls with
> labels, but stopped doing that because of the optimization problem).  I'm
> mentioning this because the begin and end labels are not *between* maythrow
> instructions, they bracket them.
>
>   

Sure, that would be the goal. That means the labels would not bracket a
single instruction, but a run of instructions within a basic block.
I'll see if this works. My first implementation bracketed a single
instruction because it was very simple to copy the invoke case for
non-calls.
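
To check my reading of the table rule quoted above, here is a minimal,
self-contained sketch in C++. It is a toy model and not the LLVM dwarf
writer: the types Kind, Inst and TableEntry and the function buildTable are
hypothetical names made up for illustration. It assumes label pairs do not
nest, and it follows the scheme as described in the quoted text: a
may-throw instruction bracketed by labels with a landing pad gets an entry
pointing at that pad, one bracketed by labels without a landing pad (the
planned nounwind case) gets no entry, and an unbracketed one gets an entry
with no landing pad.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy types; none of these are LLVM classes.
enum class Kind { BeginLabel, EndLabel, MayThrow, Other };

struct Inst {
    Kind kind;
    std::string name;
    // Only meaningful on BeginLabel: the landing pad for the bracketed range.
    std::optional<int> landingPad = std::nullopt;
};

struct TableEntry {
    std::string inst;
    // Empty means the exception is not handled in this function.
    std::optional<int> landingPad;
};

// Scan the instructions in order and build a simplified exception table.
std::vector<TableEntry> buildTable(const std::vector<Inst>& insts) {
    std::vector<TableEntry> table;
    bool bracketed = false;      // currently between a Begin/End label pair?
    std::optional<int> pad;      // landing pad of the current pair, if any

    for (const Inst& i : insts) {
        switch (i.kind) {
        case Kind::BeginLabel:
            assert(!bracketed && "label pairs must not nest in this model");
            bracketed = true;
            pad = i.landingPad;
            break;
        case Kind::EndLabel:
            assert(bracketed && "EndLabel without a matching BeginLabel");
            bracketed = false;
            pad.reset();
            break;
        case Kind::MayThrow:
            if (!bracketed) {
                // Case (2): may throw, but not handled in this function.
                table.push_back({i.name, std::nullopt});
            } else if (pad) {
                // Invoke-like case: unwinds to the pair's landing pad.
                table.push_back({i.name, pad});
            }
            // Bracketed with no landing pad: case (1), no table entry.
            break;
        case Kind::Other:
            break;
        }
    }
    return table;
}
```

With a bracketed sdiv (landing pad 7), an unbracketed call, and a call
bracketed by labels without a landing pad, this model yields two entries:
one for the sdiv with pad 7 and one for the unbracketed call with no pad.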

>> However, when generating native code, only BeginLabel is generated, and
>> it is generated after the instruction. I'm not familiar with DAGs in the
>> codegen library, so here are my 2-cents thoughts why:
>>
>> 1) BeginLabel and EndLabel are generated with:
>>   DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
>>                             DAG.getConstant({Begin|End}Label, MVT::i32)));
>>
>> This seems to work with InvokeInst instructions, because the root of the
>> DAG is modified by the instruction. With instructions such as sdiv, the
>> root is not modified: the instruction only lowers itself to:
>> DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)
>>     
>
> I think that not creating a new root means that the instruction is allowed
> to be re-ordered with respect to other instructions, as long as it occurs
> before its uses.  Re-ordering is rather dubious for instructions that may
> throw, though it's not clear what is acceptable.  I think you probably need
> a new selection DAG "throw" node which you wrap throwing instructions in, a
> bit like a TokenFactor.  This throw node would be setup in such a way as to
> be bracketable by labels.
>
>   

I need to do some LLVM code reading ;-)
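
In the meantime, here is a small self-contained C++ sketch of the ordering
argument, to see whether I understand it. It is a toy DAG and not the
SelectionDAG API: Node, ToyDAG, emitLabel, dependsOn and the two bracket*
functions are hypothetical names. The model assumes that the only ordering
constraints are operand edges, and that a label takes the current root as
an operand and becomes the new root. The "chained" variant is only my guess
at how a wrapped throwing node might be tied to the root.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy DAG node: operands hold data and chain edges alike.
struct Node {
    std::string name;
    std::vector<const Node*> operands;
};

class ToyDAG {
public:
    ToyDAG() : root_(make("EntryToken", {})) {}

    const Node* getRoot() const { return root_; }
    void setRoot(const Node* n) { root_ = n; }

    const Node* make(std::string name, std::vector<const Node*> ops) {
        nodes_.push_back(
            std::make_unique<Node>(Node{std::move(name), std::move(ops)}));
        return nodes_.back().get();
    }

    // A label is chained to the current root and becomes the new root.
    const Node* emitLabel(const std::string& name) {
        const Node* n = make(name, {getRoot()});
        setRoot(n);
        return n;
    }

private:
    std::vector<std::unique_ptr<Node>> nodes_;  // owns every node
    const Node* root_;                          // current end of the chain
};

// True if `from` reaches `to` through operand edges, i.e. `to` has to be
// scheduled before `from`.
bool dependsOn(const Node* from, const Node* to) {
    assert(from && to);
    if (from == to) return true;
    for (const Node* op : from->operands)
        if (dependsOn(op, to)) return true;
    return false;
}

struct Example { const Node* begin; const Node* sdiv; const Node* end; };

// What my patch does: the sdiv node never touches the root.
Example bracketUnchained(ToyDAG& dag) {
    const Node* begin = dag.emitLabel("BeginLabel");
    const Node* sdiv =
        dag.make("sdiv", {dag.make("2", {}), dag.make("%argc", {})});
    const Node* end = dag.emitLabel("EndLabel");
    return {begin, sdiv, end};
}

// Assumed alternative: the throwing node is chained through the root.
Example bracketChained(ToyDAG& dag) {
    const Node* begin = dag.emitLabel("BeginLabel");
    const Node* sdiv = dag.make(
        "sdiv", {dag.getRoot(), dag.make("2", {}), dag.make("%argc", {})});
    dag.setRoot(sdiv);
    const Node* end = dag.emitLabel("EndLabel");
    return {begin, sdiv, end};
}
```

In the unchained variant EndLabel depends only on BeginLabel, so nothing in
this model forces the sdiv to sit between the two labels. In the chained
variant the path EndLabel -> sdiv -> BeginLabel pins the order.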

>> Which probably makes the codegen think EndLabel and BeginLabel are in
>> the same place
>>     
>
> In that case I would expect them both to be deleted...
>   

Only one was deleted. Consider the code:

define i32 @test(i32 %argc) {
entry:
        %tmp2 = sdiv i32 2, %argc       to label %continue unwind to label %unwindblock ; <i32> [#uses=1]

continue:
        ret i32 %tmp2

unwindblock:
        unwind
}

And here is the resulting x86 code (Llabel1 was supposed to come before
the {cltd, idivl} pair, and Llabel2, which should come after it, is not
generated):

test:
.Leh_func_begin1:

.Llabel4:
        movl    $2, %eax
        movl    4(%esp), %ecx
        cltd
        idivl   %ecx

.Llabel1:
.LBB1_1:        # continue
        ret
.LBB1_2:        # unwindblock

Thanks Duncan,
Nicolas