[LLVMdev] [RFC] Add compiler scheduling barriers
Yi Kong
kongy.dev at gmail.com
Fri Jun 27 06:19:25 PDT 2014
On 24 June 2014 01:55, Philip Reames <listmail at philipreames.com> wrote:
>
> On 06/19/2014 09:35 AM, Yi Kong wrote:
>>
>> Hi all,
>>
>> I'm currently working on implementing ACLE extensions for ARM. There
>> are some memory barrier intrinsics, i.e. __dsb and __isb, that require
>> the compiler not to reorder instructions around their corresponding
>> built-in intrinsics (__builtin_arm_dsb, __builtin_arm_isb), including
>> non-memory-access instructions.[1] This is currently not possible.
>>
>> It is sometimes useful to prevent the compiler from reordering
>> memory-access instructions as well. The only way to do that in both
>> GCC and LLVM is to use an inline assembly hack:
>> asm volatile("" ::: "memory")
>>
>> I propose adding two compiler scheduling barrier intrinsics to LLVM:
>> __schedule_barrier_memory and __schedule_barrier_full. The former only
>> prevents memory-access instructions from being reordered across it,
>> and the latter stops all reordering. __isb, for example, could then be
>> implemented something like:
>> inline void __isb() {
>>     __schedule_barrier_full();
>>     __builtin_arm_isb();
>>     __schedule_barrier_full();
>> }
>
> Given your examples are in C, I want to ask a clarifying question. Are
> you proposing adding such intrinsics to the LLVM IR? Or to some runtime
> library? If the latter, *specifically* which one? Or at the MachineInst
> layer?
>
> I'm going to run under the assumption you're using C pseudo code for IR. If
> this is not the case, the rest of this will be off base.
Yes, IR.
> I'm not familiar with the exact semantics of an "isb" barrier, but I think
> you should look at the existing fence IR instructions. These restrict
> memory reorderings in the IR. Depending on the platform, they may imply
> hardware barriers, but they always imply compiler barriers.
>
> If all you want is a compiler barrier with the existing fence semantics
> w.r.t. reordering, we could consider extending fence with a "compiler only"
> (bikeshed needed!) attribute.
AFAIK, there isn't an existing fence strong enough for the memory
barrier intrinsics. The current strongest fence still allows
register-register data-processing instructions to be reordered across
it. For DSB and ISB, no instruction should be allowed to cross.
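To illustrate the distinction, here is a rough sketch only, assuming
GCC/Clang inline asm syntax and C11 <stdatomic.h>. The function and helper
names are made up for illustration and are not part of the proposal. Both
barriers below constrain how the compiler orders memory accesses, but
neither pins down where the register-only arithmetic is computed:

```c
#include <stdatomic.h>

/* Compiler-only memory barrier: the inline assembly hack (GCC/Clang syntax). */
static inline void compiler_memory_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

/* C11 counterpart: restricts compiler reordering of memory accesses
 * without requesting a hardware barrier instruction. */
static inline void compiler_fence(void) {
    atomic_signal_fence(memory_order_seq_cst);
}

/* Hypothetical example function, for illustration only. */
int publish(int *data, int *flag, int value) {
    /* Register-register arithmetic: neither barrier below fixes where
     * the compiler schedules this computation. */
    int scaled = value * 3 + 1;

    *data = scaled;
    compiler_memory_barrier(); /* store to *data is emitted before store to *flag */
    *flag = 1;
    compiler_fence();          /* loads below are not hoisted above this point */
    return *data + *flag;
}
```

A barrier suitable for DSB and ISB would additionally have to keep
computations like `scaled` from moving across it.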
> If you're describing a new memory ordering for existing fences, that would
> seem like a reasonable extension.
>
> I'm not familiar with how we currently handle intrinsics for architecture
> specific memory barriers. Can anyone else comment on that? Is there a way
> to tag a particular intrinsic function as *also* being a full fence?
I'm interested in this as well.
>> To implement these intrinsics, I think the best method is to add
>> target-independent pseudo-instructions with appropriate
>> properties (hasSideEffects for memory barrier and isTerminator for full
>> barrier) and a pseudo-instruction elimination pass after the
>> scheduling pass.
>
> Why would your barrier need to be a basic block terminator? That doesn't
> parse for me. Could you explain?
The compiler shouldn't reorder instructions across basic block
boundaries, so implementing the barrier as a basic block terminator
would stop any instruction from being reordered across it.
I'm not very familiar with LLVM; can you suggest the correct way to
implement it?
>> What do people think of this idea?
>
> I'm honestly unclear on what your problem is and what you're trying to
> propose. It may take a few rounds of conversation to clarify.
>
> Philip