[LLVMdev] loop pragmas
Bjorn De Sutter
bjorn.desutter at elis.ugent.be
Tue Nov 27 04:49:39 PST 2012
I am thinking about another use of annotations that fits into a longer-term vision, one which centers around feeding compilers with information from higher-level tools such as precompilers.
Deciding how to map a portable piece of software onto a heterogeneous multicore processor, and how to get the best performance across widely varying architectures, requires much higher-level code analysis than is possible in "standard" compilers on their relatively low-level IRs. To get portable performance and high programmer productivity, an application will need to be written in a higher-level language, like Julia, MATLAB, ..., that can deliver much of the needed information to the compilers, for example by using parallel programming patterns (see Berkeley Parlab), and that allows a compiler to choose among different implementations of algorithms and data structures (see Petabricks). Building compiler front-ends, middle-ends and back-ends from scratch that can use all the information available in such programs and that produce high-quality code for a range of architectures is infeasible for most, if not all, research labs.
Nor should it be necessary: many excellent lower-level-language compilers already exist, some of which support a wide range of architectures, such as LLVM (OoO CPUs, VLIWs, GPUs, CGRAs in my own backend and in Samsung's proprietary SRP backend, etc.). So for research purposes, and hopefully later also for real-world development if the research results are good, ideally we would only have to develop the precompiler tools that take in very high-level code and produce tuned lower-level code (C or bitcode or LLVM IR or ...) after high-level analysis and target-dependent optimisations and selections, after which LLVM takes care of the further low-level optimizations and the specific code generation for the different targets in the heterogeneous multicore.
To facilitate research in this direction and to use LLVM in such a tool flow, it is absolutely necessary that both the precompiler and a researcher doing manual experiments can steer the low-level LLVM compiler with code annotations such as loop pragmas or attributes: once the code is in a lower-level form such as C or bitcode or LLVM IR, there simply is not enough information left in the code itself to select the best low-level transformations.
It is interesting to note that in such an approach, the precompiler would probably be trained with machine learning. If so, it might also learn which annotations actually influence the compiler and which do not, for example because they are destroyed before some optimization pass executes. The precompiler could thus learn to generate code tuned for specific targets while taking into account all the limitations of the compiler that does the actual code generation, including its incomplete or otherwise imperfect support for pragmas and other such things...
On 21 Nov 2012, at 18:56, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> On 11/21/2012 11:32 AM, Tobias Grosser wrote:
>> On 11/21/2012 03:45 PM, Krzysztof Parzyszek wrote:
>>> I'm thinking of this in terms of parallelization directives. The
>>> optimizations that rely on such annotations would need to be done as
>>> early as possible, before any optimization that could invalidate them.
>>> If the annotation can become false, you are right---it's probably not a
>>> good idea to have it as the medium.
>> If we use metadata to model annotations, we need to ensure that it is
>> either correct or in case a transformation can not guarantee the
>> correctness of the meta data, that it is removed.
> Yes, that is not hard to accomplish.
>>> Other types of annotations that are
>>> "harmless" are probably good to have, for example "unroll-by" (assuming
>>> that this is a suggestion to the compiler, not an order).
>> To my knowledge, we avoid letting the user 'tune' the
>> compiler. Manual tuning may be good for a certain piece of hardware, but
>> will have negative effects on other platforms.
> A lot of ISV code is meant to run on a particular platform, or on a small set of target platforms. Such code is often hand-tuned, and the tuning directives will be different for different targets. I see no reason why we shouldn't allow that. As a matter of fact, not allowing it will make us less competitive.
>> Instead of providing facilities to tune the hardware, we should
>> understand why LLVM does not choose the right unrolling factor.
> Because in general it's impossible. User's hints are always welcome.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu