[LLVMdev] Pointer Context Metadata (was: Parallel Loop Metadata)

Vladimir Guzma vladimir.guzma at tut.fi
Tue Feb 19 20:37:38 PST 2013


>>> 
>>> - Update the language reference
>>> - Update the loop vectorizer (to update the metadata when it
>>> unrolls)
>>> - Update the regular unroller
>>> - Update the alias analysis (maybe this is sufficient for basic
>>> support in BBVectorize) - is your current code close enough for
>>> this?
>> 
>> The current AA implementation uses work-item metadata as well as
>> 'region' metadata identifiers (regions being separated by barriers).
>> So, to provide similar functionality, the parallel loop metadata
>> would need both a 'loop ID' and a 'loop iteration ID'.
>> 
>> This is perhaps not an immediate concern for the current
>> discussion, but it can become a problem when there are two
>> consecutive loops, both marked with parallel loop metadata and both
>> fully unrolled (or partially unrolled, followed by some loop
>> fusion/combining pass). In that case a 'loop ID' is needed to
>> distinguish the origin, since the 'loop iteration ID' alone is not
>> enough.
> 
> Agreed, but I think that the current loop.parallel metadata scheme already has that. The memory access metadata refers to the metadata marking the loop back edges.

Good.
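
For illustration, the check discussed above (same 'loop ID', different
'loop iteration ID') could be sketched roughly as follows. This is only a
minimal sketch and not the pocl code: AccessTag, AliasHint and
parallelLoopAlias are hypothetical names, and the two integer fields stand
in for whatever the metadata nodes would actually carry.

```cpp
#include <cstdint>

// Hypothetical stand-in for the metadata attached to one memory access.
struct AccessTag {
    std::uint32_t loopId;       // which parallel loop the access came from
    std::uint32_t iterationId;  // which (unrolled) iteration it came from
};

enum class AliasHint { NoAlias, MayAlias };

// Returns NoAlias only when both accesses carry tags from the same
// parallel loop but from different iterations. Everything else is left
// to the other alias analyses (MayAlias).
AliasHint parallelLoopAlias(const AccessTag* a, const AccessTag* b) {
    if (a == nullptr || b == nullptr)
        return AliasHint::MayAlias;   // untagged access: no information
    if (a->loopId != b->loopId)
        return AliasHint::MayAlias;   // different loops: iteration IDs are not comparable
    if (a->iterationId != b->iterationId)
        return AliasHint::NoAlias;    // same parallel loop, different iterations
    return AliasHint::MayAlias;       // same iteration: ordinary rules apply
}
```

The loopId comparison is what keeps two fully unrolled consecutive loops
apart: iteration 0 of the first loop and iteration 1 of the second loop
have different iteration IDs but say nothing about each other.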
> 
>> 
>>> - Update the BB vectorizer to prefer pairings from different
>>> iterations
>> 
>> Updating the BB vectorizer to use work-item metadata was a rather
>> trivial addition of a test for a difference in identifiers, very
>> similar to the one in AA (though we also record the position of the
>> instruction in the originating block), and it should be trivial to
>> add to BBVectorize as well, using the parallel metadata.
>> It became a major mess once complaints about speed started: e.g. the
>> pocl version does not have any search limits or a maximum number of
>> instructions per group, so finding all candidate pairs became rather
>> time consuming.
>> So the candidate selection is not compatible with the BB vectorizer,
>> and a whole bunch of code was removed…
> 
> Interesting.
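
To make the pairing test concrete, here is a rough sketch of the kind of
check meant above. It is an assumption-laden illustration, not the code
from pocl or BBVectorize: InstrTag and isCandidatePair are hypothetical
names, and the opcode comparison is an extra sanity check added for the
example.

```cpp
#include <cstdint>

// Hypothetical per-instruction record; the real information would come
// from metadata on the instruction.
struct InstrTag {
    std::uint32_t loopId;       // originating parallel loop (or region)
    std::uint32_t iterationId;  // iteration / work item the copy belongs to
    std::uint32_t position;     // index of the instruction in the originating block
    std::uint32_t opcode;       // operation performed by the instruction
};

// Two instructions are a candidate pair when they are copies of the same
// original instruction (same loop, same recorded position, same opcode)
// that belong to different iterations.
bool isCandidatePair(const InstrTag& a, const InstrTag& b) {
    return a.loopId == b.loopId &&
           a.position == b.position &&
           a.opcode == b.opcode &&
           a.iterationId != b.iterationId;
}
```

Comparing every instruction against every other one with such a test is
quadratic in the block size, which is where a search limit or a maximum
group size would come in.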
> 
>> 
>> Anyway, parts that may be interesting to integrate into BBVectorize
>> are the (crude) caching during replaceOutputs, to be used when
>> vectorizing phi nodes. There is also some vectorization of
>> getelementptr instructions, creation of vectors of allocas to get
>> better vector memory accesses, some magic for computing addresses
>> of strided memory accesses using vectors, some tweaks to eliminate
>> unneeded shuffle instructions in replacement inputs, etc. There are
>> a lot of assumptions that the instructions to be vectorized are
>> really identical copies from different work items (due to the
>> recorded position in the originating code), which may not be the
>> case for the BB vectorizer in general.
> 
> To clarify, are these features that you've implemented in your version?

Yes, these are there, as well as some stuff to clean up after the vectorizer...
From a performance point of view, the addition of vectors of phi nodes was the most beneficial for our main target (TTA architecture).

regards
Vlado
> 
>> 
>> Anyway, if the loop metadata gets updated, I can have a look at
>> updating AA and moving it from pocl to LLVM, but not likely this
>> week (maybe Pekka can provide it sooner if there is a rush).
>> I cannot make any promises about BBVectorize at the moment,
>> unfortunately.
> 
> Great, thanks!
> 
> -Hal
> 
>> 
>> regards
>> Vlado
>>> 
>>> Thanks again,
>>> Hal
>>> 
>>>> 
>>>> BR,
>>>> --
>>>> --Pekka
>>>> 
>>>> 
>>> 