[LLVMdev] parallel loop metadata simplification

Redmond, Paul paul.redmond at intel.com
Fri Mar 1 13:05:39 PST 2013


On 2013-03-01, at 12:23 PM, Hal Finkel wrote:

> ----- Original Message -----
>> From: "Hal Finkel" <hfinkel at anl.gov>
>> To: "Paul Redmond" <paul.redmond at intel.com>
>> Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
>> Sent: Friday, March 1, 2013 11:13:06 AM
>> Subject: Re: [LLVMdev] parallel loop metadata simplification
>> 
>> ----- Original Message -----
>>> From: "Paul Redmond" <paul.redmond at intel.com>
>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>> Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
>>> Sent: Friday, March 1, 2013 10:49:24 AM
>>> Subject: Re: [LLVMdev] parallel loop metadata simplification
>>> 
>>> 
>>> On 2013-03-01, at 11:35 AM, Hal Finkel wrote:
>>> 
>>>> ----- Original Message -----
>>>>> From: "Paul Redmond" <paul.redmond at intel.com>
>>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>>> Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
>>>>> Sent: Friday, March 1, 2013 10:06:51 AM
>>>>> Subject: Re: [LLVMdev] parallel loop metadata simplification
>>>>> 
>>>>> Hi Hal,
>>>>> 
>>>>> On 2013-02-28, at 9:33 PM, Hal Finkel wrote:
>>>>> 
>>>>>> ----- Original Message -----
>>>>>>> From: "Paul Redmond" <paul.redmond at intel.com>
>>>>>>> To: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
>>>>>>> Sent: Thursday, February 28, 2013 1:30:57 PM
>>>>>>> Subject: [LLVMdev] parallel loop metadata simplification
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I've been working on clang codegen for #pragma ivdep and
>>>>>>> creating
>>>>>>> the
>>>>>>> llvm.mem.parallel_loop_access metadata seems quite difficult.
>>>>>>> The
>>>>>>> main problem is that there are so many places where loads and
>>>>>>> stores
>>>>>>> are created and all of them need to be changed when emitting a
>>>>>>> parallel loop. Note that creating llvm.loop.parallel is not a
>>>>>>> problem.
>>>>>>> 
>>>>>>> One option is to modify IRBuilder to enable attaching the
>>>>>>> metadata
>>>>>>> for loads and stores but that seems like a huge hack.
>>>>>> 
>>>>>> Can you please explain why this is a bad option? To be honest,
>>>>>> this
>>>>>> is what I expected you to do. The IRBuilder sits on top of all
>>>>>> of
>>>>>> the loads and stores, and seems like the perfect place to keep
>>>>>> state for something that affects "all" loads and stores in some
>>>>>> code region. In addition, putting this in IRBuilder should also
>>>>>> make it easier to use this feature from other frontends.
>>>>> 
>>>>> One concern I have is that the llvm.mem.parallel_loop_access can
>>>>> refer to nested loops. Is it possible that instructions in the
>>>>> same
>>>>> loop require different metadatas (one is parallel in both inner
>>>>> and
>>>>> outer loop and another is only parallel in the inner loop?) This
>>>>> is
>>>>> a more general problem I think but the IRBuilder would only be
>>>>> useful for brute force adding the same metadata to all
>>>>> instructions
>>>>> with mayReadOrWriteMemory().
>>>> 
>>>> Well, it would be more complicated than that anyway because you
>>>> specifically don't want to tag the loads and stores to the local
>>>> alloca'd variable storage locations, only the "explicit" memory
>>>> references. I think an extra boolean parameter to CreateLoad,
>>>> etc.
>>>> could take care of those? Nevertheless, IRBuilder could keep a
>>>> stack of parallel loops to handle nested cases, right?
>>>> 
>>> 
>>> Hmm, I guess I'm missing something. It is my understanding that all
>>> loads and stores require the metadata (and based on the
>>> implementation of isAnnotatedParallel). What you describe suggests
>>> that a loop goes from non-parallel to parallel after SROA?
>> 
>> Or after mem2reg; something like that, yes. I think that's right.
> 
> On second thought, to be more specific, it might depend on whether the alloca is inside or outside the loop.
> 

I have discovered that you can provide a custom inserter to IRBuilder (who knew!). This has basically solved all my problems and allowed me to generate the proper metadata with minimal changes to clang codegen. Currently it adds the metadata to all loads and stores but I don't think this is a problem and can be refined later if necessary.

paul

> -Hal
> 
>> 
>> -Hal
>> 
>>> 
>>> paul
>>> 
>>>> -Hal
>>>> 
>>>>> 
>>>>> Thoughts?
>>>>> 
>>>>> paul
>>>>> 
>>>>>> 
>>>>>> -Hal
>>>>>> 
>>>>>>> 
>>>>>>> I'd like to reopen the discussion on requiring the
>>>>>>> llvm.mem.parallel_loop_access metadata. I understand the
>>>>>>> reason
>>>>>>> for
>>>>>>> the metadata is to protect against transformations that may
>>>>>>> introduce unsafe parallel memory accesses (the reg2mem
>>>>>>> example.)
>>>>>>> I'm
>>>>>>> wondering if perhaps we can make the metadata more
>>>>>>> user-friendly
>>>>>>> by
>>>>>>> providing a single loop metadata which can be expanded into
>>>>>>> "safer"
>>>>>>> metadata as required. Specifically a loop pass could be added
>>>>>>> that
>>>>>>> expands llvm.loop.parallel into llvm.loop.parallel_protected +
>>>>>>> llvm.mem.parallel_loop_access.
>>>>>>> 
>>>>>>> Perhaps there are simpler options I've overlooked?
>>>>>>> 
>>>>>>> paul
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> 





More information about the llvm-dev mailing list