[polly] r248861 - Identify and hoist definitively invariant loads

Tobias Grosser via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 2 01:03:52 PDT 2015


On 10/01/2015 03:35 PM, Johannes Doerfert wrote:
> On 10/01, Tobias Grosser wrote:
>> On 09/30/2015 10:51 PM, Johannes Doerfert wrote:
>>> On 09/30, Tobias Grosser wrote:
>>>> On 09/30/2015 01:47 AM, Johannes Doerfert via llvm-commits wrote:
>>>>> Author: jdoerfert
>>>>> Date: Tue Sep 29 18:47:21 2015
>>>>> New Revision: 248861
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=248861&view=rev
>>>>> Log:
>>>>> Identify and hoist definitively invariant loads
>>>>>
>>>>>    As a first step in the direction of assumed invariant loads (loads
>>>>>    that are not written in some context) we now detect and hoist
>>>>>    definitively invariant loads. These invariant loads will be preloaded
>>>>>    in the code generation and used in the optimized version of the SCoP.
>>>>>    If the load is only conditionally executed the preloaded version will
>>>>>    also only be executed under the same condition, hence we will never
>>>>>    access memory that wouldn't have been accessed otherwise. This is also
>>>>>    the most distinguishing feature to licm.
>>>>>
>>>>>    As hoisting can make statements empty we will simplify the SCoP and
>>>>>    remove empty statements that would otherwise cause artifacts in the
>>>>>    code generation
>>>>
>>>> Nice, thank you!
>>> Wait till the series is in. Then it will impact SPEC benchmarks and (at
>>> least) give us an opportunity to optimize a lot more regions.
>>
>> Nice.
>>>> I just observed one issue. This commit breaks our OpenMP bot. I spent some
>>>> time investigating and the reason is that you just dropped the memory
>>>> accesses of the invariant loads entirely from the ScopStmt and then pass
>>>> the relevant values along by using GlobalMaps. Currently the OpenMP codegen
>>>> assumes that all memory accesses are visible in the ScopStmts. Hence,
>>>> "secrectly" passing scalar values via the GlobalMaps is a little against
>>>> the current design. We could probably document this (ab?)use of GlobalMap
>>>> and teach the OpenMP codegen about the invariant values in GlobalMaps.
>>>> An alternative to this approach would be to actual add new MemoryAccesses
>>>> to the ScopStmt that reference new scalar one-element ScopArrayInfo objects.
>>>> My feeling is that this would reduce the BlockGenerator changes needed here,
>>>> which probably also helps to keep our BlockGenerator more understandable.
>>
>>> I am not 100% following but let me state 3 things:
>>>    1) I am pretty sure that removing the accesses from the statements is
>>>       what we want and should do. Keeping them around has bad
>>>       consequences in many later passes, mainly artifacts in the AST
>>>       generation and additional "filter" code in the code generation.
>>
>> Could you give an example. I currently do not see how any memory references
>> would even influence the generated AST.
> A memory reference doesn't explicitly but a statement that would be
> empty without it and is only present to hold that reference we will
> never execute at that position does.

Could we not just drop read-only statements similar to how we today drop 
read-write-none statements?

>>>    2) The global map was mainly used during OpenMP code generation for
>>>       values that were "generated somewhere before" the current block and
>>>       that is what I am using it for. The actual documentation in the
>>>       block generator states:
>>>
>>> @param GlobalMap   A mapping from llvm::Values used in the original scop
>>>                     region to a new set of llvm::Values. Each reference to
>>>                     an original value appearing in this mapping is replaced
>>>                     with the new value it is mapped to.c
>>>
>>>       I think preloaded invariant accesses fit this description
>>>       perfectly.
>>
>> This map is specifically for code generation, but all analysis (e.g. the
>> set of accessed values that need to be passed to the subfunction) is done
>> on the polyhedral representation. The problem with GlobalMaps is that you
>> have no idea why a value is in GlobalMaps, so you either pass always the
>> entire content of GlobalMaps or you need to rescan each ScopStmt to determine
>> which elements of GlobalMaps are relevant for a loop-subtree and which
>> of them needs to be passed. This can currently avoided as we know that the
>> ScopStmts memory accesses are the only ones that need to be considered.
> We can hand down all preloaded values to the subfunction, that should
> solve the problem (see r249010).

It unbreaks the build, but is not really optimal. In a large kernel, we 
now always pass down _all_ invariant loads, instead of just passing down 
the ones that are actually used in the parallel loop body. For a large 
scop (the largest scop I currently play with has 266 statements), this 
is rather inefficient.

Best,
Tobias


More information about the llvm-commits mailing list