[LLVMdev] [RFC] Add compiler scheduling barriers

Yi Kong kongy.dev at gmail.com
Fri Jun 27 06:19:25 PDT 2014


On 24 June 2014 01:55, Philip Reames <listmail at philipreames.com> wrote:
>
> On 06/19/2014 09:35 AM, Yi Kong wrote:
>>
>> Hi all,
>>
>> I'm currently working on implementing ACLE extensions for ARM. There
>> are some memory barrier intrinsics, i.e. __dsb and __isb, that require
>> the compiler not to reorder instructions around their corresponding
>> built-in intrinsics (__builtin_arm_dsb, __builtin_arm_isb), including
>> non-memory-access instructions.[1] This is currently not possible.
>>
>> It is sometimes useful to prevent the compiler from reordering
>> memory-access instructions as well. The only way to do that in both
>> GCC and LLVM is to use an inline assembly hack:
>>    asm volatile("" ::: "memory")
>>
>> I propose adding two compiler scheduling barrier intrinsics to LLVM:
>> __schedule_barrier_memory and __schedule_barrier_full. The former only
>> prevents memory-access instructions from being reordered around it,
>> and the latter prevents all instructions from being reordered around
>> it. __isb, for example, could then be implemented something like:
>>    inline void __isb() {
>>      __schedule_barrier_full();
>>      __builtin_arm_isb();
>>      __schedule_barrier_full();
>>    }
>
> Given your examples are in C, I want to ask a clarification question.  Are
> you proposing adding such intrinsics to the LLVM IR? Or to some runtime
> library?  If the latter, *specifically* which one? Or at the MachineInst
> layer?
>
> I'm going to run under the assumption you're using C pseudo code for IR.  If
> this is not the case, the rest of this will be off base.

Yes, IR.

> I'm not familiar with the exact semantics of an "isb" barrier, but I think
> you should look at the existing fence IR instructions.  These restrict
> memory reorderings in the IR.  Depending on the platform, they may imply
> hardware barriers, but they always imply compiler barriers.
>
> If all you want is a compiler barrier with the existing fence semantics
> w.r.t. reordering, we could consider extending fence with a "compiler only"
> (bikeshed needed!) attribute.

AFAIK, no existing fence is strong enough for the memory barrier
intrinsics. The strongest current fence still allows register-register
data-processing instructions to be reordered across it. For DSB and
ISB, no instruction should be allowed to move across the barrier.
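
To make the gap concrete, here is a minimal sketch in C of the two
compiler-only barriers available today. The helper names
(compiler_barrier_memory, compiler_fence_c11, publish) and the two
globals are made up for illustration and are not part of ACLE or LLVM.
Both helpers constrain memory accesses only. Neither stops
register-only computation from being moved across them, which is the
part DSB and ISB need.

```c
#include <stdatomic.h>

/* The inline assembly hack quoted earlier in this thread: an empty asm
   statement with a "memory" clobber. It emits no instruction, but the
   compiler must assume memory was read and written at this point. */
static inline void compiler_barrier_memory(void) {
    __asm__ volatile("" ::: "memory");
}

/* Standard C11 alternative: a fence that orders memory accesses with
   respect to a signal handler running on the same thread. It acts as a
   compiler barrier and needs no hardware barrier instruction. */
static inline void compiler_fence_c11(void) {
    atomic_signal_fence(memory_order_seq_cst);
}

/* Hypothetical globals, used only to show where a barrier would sit. */
static int data;
static int ready;

/* The compiler may not move the store to `data` below the barrier or
   the store to `ready` above it. Arithmetic that lives purely in
   registers is not constrained. */
void publish(int value) {
    data = value;
    compiler_barrier_memory();
    ready = 1;
}
```

Calling publish(42) leaves data == 42 and ready == 1. The barrier only
fixes the order in which the compiler emits the two stores.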

> If you're describing a new memory ordering for existing fences, that would
> seem like a reasonable extension.
>
> I'm not familiar with how we currently handle intrinsics for architecture
> specific memory barriers.  Can anyone else comment on that?  Is there a way
> to tag a particular intrinsic function as *also* being a full fence?

I'm interested in this as well.

>> To implement these intrinsics, I think the best method is to add
>> target-independent pseudo-instructions with appropriate
>> properties (hasSideEffects for the memory barrier and isTerminator
>> for the full barrier) and a pseudo-instruction elimination pass
>> after the scheduling pass.
>
> Why would your barrier need to be a basic block terminator?  That doesn't
> parse for me.  Could you explain?

The compiler shouldn't reorder instructions across basic block
boundaries. Implementing the barrier as a basic block terminator would
therefore stop any instruction from being reordered across it.

I'm not very familiar with LLVM. Can you suggest the correct way to
implement it?

>> What do people think of this idea?
>
> I'm honestly unclear on what your problem is and what you're trying to
> propose.  It may take a few rounds of conversation to clarify.
>
> Philip


