[LLVMdev] [cfe-dev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

Tue Oct 2 19:52:37 PDT 2012

Hi,

On 02/10/2012 19:29, Hal Finkel wrote:
> On Mon, 01 Oct 2012 22:56:50 -0700
> Chris Lattner <clattner at apple.com> wrote:
>> On Oct 1, 2012, at 10:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> On Mon, 01 Oct 2012 21:26:54 -0700
>>> Chris Lattner <clattner at apple.com> wrote:
>>>> On Oct 1, 2012, at 6:16 PM, greened at obbligato.org wrote:
>>>>> Sanjoy Das <sanjoy at playingwithpointers.com> writes:
>>>>>
>>>>>> In short, I propose an intrinsic-based approach which hinges on
>>>>>> the concept of a "parallel map".  The immediate effect of using
>>>>>> intrinsics is that we no longer have to worry about missing
>>>>>> metadata.  Moreover, we are still free to lower the intrinsics in
>>>>>> a variety of ways -- including vectorizing them or lowering them
>>>>>> to calls to an actual OpenMP backend.
>>>>>
>>>>> I'll re-ask here since this is in its own thread.
>>>>>
>>>>> Why can't we just make ordinary function calls to runtime
>>>>> routines?
>>>>
>>>> I agree.  I can't imagine any practical way that a metadata-based
>>>> approach could be preserved by optimizers.
>>>
>>> Regarding the metadata approach, it depends on what you mean by
>>> preserved. The trick is to make sure that transformations that don't
>>> understand the metadata can't cause miscompiles. The specific scheme
>>> that I proposed used a combination of procedurization and
>>> cross-referencing metadata such that invalidated parallel metadata
>>> can be detected and the entire enclosing parallel region can be
>>> dropped.
>>>
>>> The proposal from Intel, which more-heavily uses intrinsics, has
>>> other advantages, but will require more modifications to existing
>>> passes to realize its potential optimization benefits.
>>
>> My comment was mostly in response to the Intel proposal, which
>> effectively translates OpenMP pragmas directly into llvm intrinsics +
>> metadata.  I can't imagine a way to make this work *correctly*
>> without massive changes to the optimizer.
>
> Also, I should mention that Sanjoy's recommendation, which is to move
> the parallelization state into an analysis pass, might make sense here.
> If not all intermediate passes preserve the analysis, then the state
> will be lost, and no parallelization will occur. In the context of
> OpenMP, where parallelization is essentially optional, I think this
> should be fine.

What do you mean by "parallelization is essentially optional"?
David already answered this today on llvmdev@:

"Actually, it is perfectly possible to have a program with OpenMP
directives that is NOT valid when those directives are ignored.  In
other words, it's possible to write a legal OMP program that relies on
parallelism to function correctly."

Just think of task-based producer/consumer code, for example.
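
To make that concrete, here is a minimal sketch in plain C of one way such a program can break. It is only a model, not real OpenMP code: the mailbox type and the functions produce, consume and run_serialized are hypothetical names invented for this illustration, and "directives ignored" is modelled as running each task body to completion, in program order, on one thread. The consumer's poll count is bounded only so that the sketch terminates; a real consumer would wait forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox shared by the two task bodies. */
typedef struct {
    bool full;
    int value;
} mailbox;

/* Producer task body: publish one value into the empty slot. */
void produce(mailbox *m, int v) {
    assert(!m->full);
    m->value = v;
    m->full = true;
}

/* Consumer task body: poll the slot at most max_polls times.
 * A real consumer would wait without bound; the bound exists only so
 * that this sketch terminates. Returns true if a value was received. */
bool consume(mailbox *m, int *out, size_t max_polls) {
    for (size_t i = 0; i < max_polls; ++i) {
        if (m->full) {
            *out = m->value;
            m->full = false;
            return true;
        }
    }
    return false;
}

/* Model of ignored directives (an assumption of this sketch): each task
 * body runs to completion, in program order, on a single thread.
 * Returns true if the consumer received the value. */
bool run_serialized(bool consumer_first, int v, int *out, size_t max_polls) {
    mailbox m = { false, 0 };
    if (consumer_first) {
        bool got = consume(&m, out, max_polls); /* nothing produced yet */
        produce(&m, v);                         /* too late for the consumer */
        return got;
    }
    produce(&m, v);
    return consume(&m, out, max_polls);
}
```

Calling run_serialized(false, 42, &out, 1000) hands the value over, while run_serialized(true, 42, &out, 1000) returns false: when the consumer comes first in program order, it can only make progress if the producer runs concurrently.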

-- 
Mehdi