[LLVMdev] [RFC] OpenMP Representation in LLVM IR
hfinkel at anl.gov
Tue Oct 9 22:19:32 PDT 2012
----- Original Message -----
> From: "Andrey Bokhanko" <andreybokhanko at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Wednesday, October 3, 2012 3:15:54 AM
> Subject: Re: [LLVMdev] [RFC] OpenMP Representation in LLVM IR
> > While I think that it will be relatively easy to have the intrinsics
> > serve as code-motion barriers for other code that might be
> > thread-sensitive (like other external function calls), we would need
> > to think through exactly how this would work. The easiest thing would
> > be to make the intrinsics have unmodeled side effects, although we
> > might want to do something more intelligent.
> Yes, that's exactly the idea.
Right. You should verify that using the 'unmodeled side effects' tag does not inhibit the optimizations you seek to preserve. If we need to work out some other less-restrictive semantics, then we should discuss that.
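As one concrete case to check, here is a minimal sketch in C rather than IR. The marker functions `region_begin` and `region_end` are hypothetical stand-ins for the "opening" and "closing" intrinsics; the names are mine, not from the proposal. If a generic pass has to assume that each marker may read or write any memory, it cannot fuse the two loops below, even though fusion would be legal without the markers.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the proposed "opening" and "closing"
   intrinsics. The names are illustrative only. The volatile counter
   gives each call a side effect the compiler must preserve. */
static volatile int region_events = 0;

void region_begin(void) { region_events = region_events + 1; }
void region_end(void)   { region_events = region_events + 1; }

/* Two adjacent loops over the same array, each bracketed by markers.
   Without the markers the loops could be fused into a single pass
   over a[]. With markers treated as "may modify everything", the
   region_end()/region_begin() pair between the loops looks like
   arbitrary reads and writes of a[], so fusion is blocked. */
void scale_then_offset(double *a, size_t n, double scale, double offset) {
    region_begin();
    for (size_t i = 0; i < n; ++i)
        a[i] *= scale;
    region_end();

    region_begin();
    for (size_t i = 0; i < n; ++i)
        a[i] += offset;
    region_end();
}
```

Whether a less restrictive marking would allow the fusion while still preventing unsafe motion of thread-sensitive code is exactly the kind of question that needs to be worked out.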
> > Where do you propose placing the parallel loop intrinsics calls
> > relative to the loop code?
> In preloop ("opening" intrinsic) and postloop ("closing" one).
> > Will this inhibit restructuring (like loop
> > interchange), fusion, etc. if necessary?
> I guess so... Loops usually deal with reading/writing memory, and if
> an intrinsic is marked as "modifies everything", this hardly leaves
> any possibility for [at least] the optimizations you mentioned.
> But this is different from what I have in mind. Basically, the plan is
> to perform analysis and some optimizations before procedurization, and
> do the rest (including loop restructuring) after it. This is not
> mentioned in the proposal (we tried to be succinct -- only 20 pages
> long! :-)), but explained in detail in [Tian05] (sorry, the link in
> the proposal doesn't lead you directly to pdf file; use this one
With regard to what you're proposing, the paper actually leaves a lot unexplained. The optimizations that it discusses prior to OpenMP lowering seem to be "classical peephole optimizations within basic-blocks", inlining, and "OpenMP construct-aware constant propagation" (in addition to some alias analysis). If this is what you plan to do in LLVM as well, are you planning on implementing special-purpose passes for these transformations, or reusing existing ones? If you're reusing existing ones, which ones? And how would they need to be modified to become 'OpenMP aware'?
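To illustrate why the ordering relative to procedurization matters for something like constant propagation, here is a rough sketch in C. It is my own approximation, not code from the paper or the proposal: `struct captured`, `outlined_body`, and `run_region` are hypothetical names, and `run_region` simply calls the body serially in place of a real runtime fork call.

```c
#include <stddef.h>

/* Source-level form: n is visibly a constant at the parallel loop,
   so the loop bound and the multiplier can both be folded. */
void fill_source_form(int *out) {
    const int n = 8;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        out[i] = i * n;
}

/* Hand-written approximation of the same loop after procedurization.
   The captured values travel through a struct passed as void *, so
   inside outlined_body the value of n is just a load from memory. */
struct captured { int *out; int n; };

static void outlined_body(void *arg, int lo, int hi) {
    struct captured *c = (struct captured *)arg;
    for (int i = lo; i < hi; ++i)
        c->out[i] = i * c->n;
}

/* Hypothetical stand-in for a runtime fork call. It runs the body
   serially; a real runtime would split [lo, hi) across threads. */
static void run_region(void (*body)(void *, int, int),
                       void *arg, int lo, int hi) {
    body(arg, lo, hi);
}

void fill_outlined_form(int *out) {
    struct captured c = { out, 8 };
    run_region(outlined_body, &c, 0, c.n);
}
```

In the outlined form, recovering the constant requires seeing through the indirect call and the struct, which an existing intraprocedural pass will not do when the fork call is opaque. In the source-level form, an existing pass might handle it unchanged, provided the region markers do not get in the way.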
Can you please comment on the loop-analysis use case that I outline here:
Would this kind of simplification fall under the 'constant propagation' transformation, or will something else be required?
What might be most useful is to develop a set of optimization tests, so that we have concrete use cases on which to base a design. Do you already have a set of such tests? I'd be happy to work with you on this.
Leadership Computing Facility
Argonne National Laboratory