[LLVMdev] sinking address computing in CodeGenPrepare

Thu Nov 21 19:25:47 PST 2013

----- Original Message -----
> From: "Evan Cheng" <evan.cheng at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM" <llvmdev at cs.uiuc.edu>, "Junbum Lim" <junbums at gmail.com>, "Andrew Trick" <atrick at apple.com>
> Sent: Thursday, November 21, 2013 6:47:40 PM
> Subject: Re: [LLVMdev] sinking  address computing in CodeGenPrepare
> 
> 
> On Nov 20, 2013, at 10:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Evan Cheng" <evan.cheng at apple.com>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: "LLVM" <llvmdev at cs.uiuc.edu>, "Junbum Lim" <junbums at gmail.com>
> >> Sent: Wednesday, November 20, 2013 7:48:13 PM
> >> Subject: Re: [LLVMdev] sinking  address computing in
> >> CodeGenPrepare
> >> 
> >> 
> >> On Nov 20, 2013, at 5:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >> 
> >>> ----- Original Message -----
> >>>> From: "Evan Cheng" <evan.cheng at apple.com>
> >>>> To: "Junbum Lim" <junbums at gmail.com>
> >>>> Cc: llvmdev at cs.uiuc.edu
> >>>> Sent: Wednesday, November 20, 2013 7:01:49 PM
> >>>> Subject: Re: [LLVMdev] sinking  address computing in
> >>>> CodeGenPrepare
> >>>> 
> >>>> 
> >>>> On Nov 20, 2013, at 3:10 PM, Junbum Lim <junbums at gmail.com>
> >>>> wrote:
> >>>> 
> >>>>> 
> >>>>> 
> >>>>> When multiple GEPs or other operations are used for the address
> >>>>> calculation, OptimizeMemoryInst() performs address matching,
> >>>>> reduces the final addressing expression to a simple form (e.g.,
> >>>>> ptrtoint/add/inttoptr), and sinks it into the user's block so
> >>>>> that ISel has a better chance of folding the address computation
> >>>>> into LDRs and STRs. However, OptimizeMemoryInst() seems to do
> >>>>> this transformation even when the address calculation is derived
> >>>>> from a single GEP, resulting in poor alias analysis because the
> >>>>> GEP is no longer used.
> >>>> 
> >>>> I don't follow your last statement. How does this impact AA?
> >>>> CodeGenPrep is run late, after AA is done.
> >>> 
> >>> I don't know if this is relevant for Lim or not, but some targets
> >>> use AA during CodeGen (instruction scheduling mostly, but SDAG
> >>> too).
> >> 
> >> MachineSched uses AA to determine if something is loop invariant,
> >> which basically boils down to looking at the machine operand to see
> >> whether it's pointing to constant memory. I don't see how that's
> >> impacted by GEP vs. ADDS + MUL.
> > 
> > MachineSched can use AA for a lot more than that. I use AA during
> > scheduling because, in addition to picking up loads from constant
> > memory, it lets me do a kind of modulo scheduling for unrolled
> > loops. AA can tell that loads and stores to different arrays don't
> > alias, and loads and stores to different offsets of the same array
> > don't alias.
> 
> I still don't understand what this has to do with whether GEP is
> lowered in codegenprep though.

As I recall, BasicAA does not look through int <-> ptr conversions.
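
To make the two address forms concrete, here is a rough C++ analogy (not LLVM IR, and not anything CodeGenPrepare emits). The struct layout is made up: I only chose it so that the field lands at byte offset 272, matching the constant in Jun's example; the real %struct.SS is not shown in this thread. The names SS, field_via_member and field_via_integer are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for %struct.SS. 34 doubles precede zzz, so zzz sits
// at byte offset 34 * 8 = 272 (assuming an 8-byte double).
struct SS {
    double before[34];
    double zzz;
};

static_assert(offsetof(SS, zzz) == 272, "layout assumption for this sketch");

// GEP-like form: the result is visibly derived from the base pointer, so an
// analysis can reason about "base object plus constant offset".
double *field_via_member(SS *s) {
    return &s->zzz;
}

// sunkaddr-like form: ptrtoint, add, inttoptr. The same address comes out,
// but the link to the base pointer now runs through integer arithmetic.
double *field_via_integer(SS *s) {
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(s);  // ptrtoint
    std::uintptr_t addr = base + 272;                           // add
    return reinterpret_cast<double *>(addr);                    // inttoptr
}
```

Both functions compute the same address; the difference is only in how much of the derivation a pointer-based analysis can see.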

> 
> > 
> >> Also, the analysis should have already been done
> >> and cached.
> > 
> > BasicAA has a cache internally, but as far as I can tell, only to
> > guard against recursion (and it is emptied after each query). Am I
> > missing something?
> 
> It's not clear to me how AA is used in codegen. I understand some
> information is transferred to memoperands during the LLVM IR to
> SDISel conversion. Is AA actually being recomputed using LLVM IR
> during codegen?

It depends on what the (sub)target requests. By default, no. But if the target overrides TargetSubtargetInfo::useAA to return true, then yes.
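
As a sketch of the shape of that hook (self-contained C++ with simplified stand-in classes, not the real LLVM headers; TargetSubtargetInfoSketch, MySubtarget and codegenMayQueryAA are hypothetical names, and the real TargetSubtargetInfo has many more members):

```cpp
#include <cassert>

// Simplified stand-in for TargetSubtargetInfo. Only the useAA hook is
// modelled here; the default answer is "no AA during codegen".
struct TargetSubtargetInfoSketch {
    virtual ~TargetSubtargetInfoSketch() = default;
    virtual bool useAA() const { return false; }
};

// A hypothetical subtarget that opts in to AA queries during codegen.
struct MySubtarget : TargetSubtargetInfoSketch {
    bool useAA() const override { return true; }
};

// Hypothetical helper standing in for the places in codegen that consult
// the subtarget before issuing AA queries.
bool codegenMayQueryAA(const TargetSubtargetInfoSketch &st) {
    return st.useAA();
}
```

With the default subtarget the helper returns false; with MySubtarget it returns true.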

 -Hal

> 
> Evan
> 
> > 
> > -Hal
> > 
> >> 
> >> Evan
> >> 
> >>> 
> >>> -Hal
> >>> 
> >>>> 
> >>>> Evan
> >>>> 
> >>>>> 
> >>>>> So, do you think it is a possible workaround to sink a GEP
> >>>>> without converting it into a set of integer operations
> >>>>> (ptrtoint/add/inttoptr) if the address mode is derived only
> >>>>> from a single GEP?
> >>>>> 
> >>>>> Thanks,
> >>>>> Jun
> >>>>> 
> >>>>> 
> >>>>> On Nov 12, 2013, at 7:14 PM, Evan Cheng <evan.cheng at apple.com>
> >>>>> wrote:
> >>>>> 
> >>>>>> 
> >>>>>> On Nov 12, 2013, at 11:24 AM, Junbum Lim <junbums at gmail.com>
> >>>>>> wrote:
> >>>>>> 
> >>>>>>> 
> >>>>>>> I wonder why CodeGenPrepare breaks the GEP into integer
> >>>>>>> calculations (ptrtoint/add/inttoptr) instead of directly
> >>>>>>> sinking the address calculation using the GEP into the user's
> >>>>>>> block.
> >>>>>> 
> >>>>>> I believe it's primarily for address mode matching where only
> >>>>>> part of the GEP can be folded (depending on the instruction
> >>>>>> set).
> >>>>>> 
> >>>>>> Evan
> >>>>>> 
> >>>>>>> 
> >>>>>>> Thanks,
> >>>>>>> Jun
> >>>>>>> 
> >>>>>>> 
> >>>>>>> On Nov 12, 2013, at 12:07 PM, Evan Cheng
> >>>>>>> <evan.cheng at apple.com>
> >>>>>>> wrote:
> >>>>>>> 
> >>>>>>>> The reason for this is to allow folding of address
> >>>>>>>> computation into loads and stores. A lot of modern
> >>>>>>>> architectures, e.g. X86 and ARM, have complex addressing
> >>>>>>>> modes.
> >>>>>>>> 
> >>>>>>>> Evan
> >>>>>>>> 
> >>>>>>>> Sent from my iPad
> >>>>>>>> 
> >>>>>>>>> On Nov 12, 2013, at 8:39 AM, Junbum Lim <junbums at gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>> 
> >>>>>>>>> Hi All,
> >>>>>>>>> 
> >>>>>>>>> In the CodeGenPrepare pass, OptimizeMemoryInst() tries to
> >>>>>>>>> sink address computation into the users' block by converting
> >>>>>>>>> the GEP to integer operations. It appears that this affects
> >>>>>>>>> ISel's result, but I'm not clear about the main purpose of
> >>>>>>>>> the transformation.
> >>>>>>>>> 
> >>>>>>>>> FROM :
> >>>>>>>>> for.body.lr.ph:
> >>>>>>>>>          %zzz = getelementptr inbounds %struct.SS* %a2, i32 0, i32 35
> >>>>>>>>> 
> >>>>>>>>> for.body:
> >>>>>>>>>          %4 = load double* %zzz, align 8, !tbaa !0
> >>>>>>>>> 
> >>>>>>>>> TO :
> >>>>>>>>> for.body:
> >>>>>>>>> %sunkaddr27 = ptrtoint %struct.SS* %a2 to i32       <----- sink address computing into user's block
> >>>>>>>>> %sunkaddr28 = add i32 %sunkaddr27, 272
> >>>>>>>>> %sunkaddr29 = inttoptr i32 %sunkaddr28 to double*
> >>>>>>>>> %4 = load double* %sunkaddr29, align 8, !tbaa !8
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> From what I observed, this transformation can cause poor
> >>>>>>>>> alias analysis results because the GEP is no longer used.
> >>>>>>>>> So, I want to see if there is any way to avoid this
> >>>>>>>>> conversion.
> >>>>>>>>> 
> >>>>>>>>> My questions are:
> >>>>>>>>> 1. Why do we need to sink address computing into the users'
> >>>>>>>>> block? What is the benefit of this conversion?
> >>>>>>>>> 2. Can we directly use the GEP instead of breaking it into
> >>>>>>>>> integer calculations?
> >>>>>>>>> 
> >>>>>>>>> Thanks,
> >>>>>>>>> Jun
> >>>>>>>>> _______________________________________________
> >>>>>>>>> LLVM Developers mailing list
> >>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>>>>>> 
> >>>>>> 
> >>>>> 
> >>>> 
> >>>> 
> >>> 
> >>> --
> >>> Hal Finkel
> >>> Assistant Computational Scientist
> >>> Leadership Computing Facility
> >>> Argonne National Laboratory
> >> 
> >> 
> > 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory