[LLVMdev] sinking address computing in CodeGenPrepare

Tue Nov 26 16:04:05 PST 2013

----- Original Message -----
> From: "Junbum Lim" <junbums at gmail.com>
> To: "Andrew Trick" <atrick at apple.com>
> Cc: "Evan Cheng" <evan.cheng at apple.com>, "Hal Finkel" <hfinkel at anl.gov>, "LLVM" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, November 26, 2013 5:04:43 PM
> Subject: Re: [LLVMdev] sinking address computing in CodeGenPrepare
> 
> 
> 
> 
> On Nov 21, 2013, at 10:37 PM, Andrew Trick < atrick at apple.com >
> wrote:
> 
> 
> 
> 
> 
> 
> On Nov 21, 2013, at 4:47 PM, Evan Cheng < evan.cheng at apple.com >
> wrote:
> 
> 
> 
> 
> On Nov 20, 2013, at 10:38 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> 
> ----- Original Message -----
> 
> 
> From: "Evan Cheng" < evan.cheng at apple.com >
> To: "Hal Finkel" < hfinkel at anl.gov >
> Cc: "LLVM" < llvmdev at cs.uiuc.edu >, "Junbum Lim" < junbums at gmail.com >
> Sent: Wednesday, November 20, 2013 7:48:13 PM
> Subject: Re: [LLVMdev] sinking address computing in CodeGenPrepare
> 
> 
> On Nov 20, 2013, at 5:38 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> 
> ----- Original Message -----
> 
> 
> From: "Evan Cheng" < evan.cheng at apple.com >
> To: "Junbum Lim" < junbums at gmail.com >
> Cc: llvmdev at cs.uiuc.edu
> Sent: Wednesday, November 20, 2013 7:01:49 PM
> Subject: Re: [LLVMdev] sinking address computing in CodeGenPrepare
> 
> 
> On Nov 20, 2013, at 3:10 PM, Junbum Lim < junbums at gmail.com > wrote:
> 
> 
> 
> 
> 
> When multiple GEPs or other operations are used for the address
> calculation, OptimizeMemoryInst() performs address matching,
> rewrites the final addressing expression in a simple form (e.g.,
> ptrtoint/add/inttoptr), and sinks it into the user's block so that
> ISel has a better chance of folding the address computation into
> LDRs and STRs. However, OptimizeMemoryInst() seems to do this
> transformation even when the address calculation is derived from a
> single GEP, resulting in poor alias analysis because the GEP is no
> longer used.
> 
> I don't follow your last statement. How does this impact AA?
> CodeGenPrep is run late, after AA is done.
> 
> I don't know if this is relevant for Lim or not, but some targets
> use AA during CodeGen (instruction scheduling mostly, but SDAG
> too).
> 
> MachineSched uses AA to determine if something is loop invariant,
> which basically boils down to looking at the machine operand to see
> whether it's pointing to constant memory. I don't see how that's
> impacted by GEP vs. ADDS + MUL.
> 
> MachineSched can use AA for a lot more than that. I use AA during
> scheduling because, in addition to picking up loads from constant
> memory, it lets me do a kind of modulo scheduling for unrolled
> loops. AA can tell that loads and stores to different arrays don't
> alias, and loads and stores to different offsets of the same array
> don't alias.
> 
> I still don't understand what this has to do with whether GEP is
> lowered in codegenprep though.
> 
> 
> 
> 
> 
> 
> Also, the analysis should have already been done
> and cached.
> 
> BasicAA has a cache internally, but as far as I can tell, only to
> guard against recursion (and it is emptied after each query). Am I
> missing something?
> 
> It's not clear to me how AA is used in codegen. I understand some
> information is transferred to memoperands during the LLVM IR to
> SDISel conversion. Is AA actually being recomputed using LLVM IR
> during codegen?
> 
> 
> 
> In general, when AA is used during codegen, it grabs the IR value
> from the machine memoperands, then runs normal IR-level alias
> analysis. The IR needs to stay around and be immutable. That’s why
> anything that changes aliasing of IR-level memory ops should be run
> before CodeGen. For example, stack coloring needs to conservatively
> mutilate the machine memoperands to work around this problem.
> 
> 
> We need to sink address computation to expose addressing modes to
> ISEL, but I’m not sure why we need to lower to ptrtoint. That
> doesn’t seem good for AA at all.
> 
> -Andy
> 
> 
> 
> 
> 
> 
> 
> 
> When multiple GEPs or other operations are used for the
> address calculation, codegenprep performs address matching,
> rewrites the final addressing expression in a simple form (e.g.,
> ptrtoint/add/inttoptr), and sinks it into the user's block, so that
> address computations can be folded into LDRs and STRs.
> 
> 
> However, codegenprep performs this conversion even for an address
> expression from a single GEP, resulting in poor AA in the scheduler
> because basicaa doesn't handle IntToPtr. As a simple workaround, I
> think the GEP could be sunk directly, without breaking it into
> integer operations, if the address is simply derived from a single
> GEP.

I think that this makes sense. It will be interesting to see if this results in any non-AA-related CodeGen changes. Can you prepare a patch?
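
To make the AA concern concrete, here is a toy model in Python. It is not LLVM code, and every class and function name in it is hypothetical. It assumes, as described above, that a BasicAA-style analysis can decompose a pointer into base plus constant offset by looking through GEPs but treats an inttoptr result as opaque. The 272 offset comes from the example in the thread. The neighbouring access at offset 264 is made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Arg:
    """An incoming pointer such as %a2."""
    name: str


@dataclass(frozen=True)
class GEP:
    """Pointer formed as base + constant byte offset (a constant-index GEP)."""
    base: "Ptr"
    offset: int


@dataclass(frozen=True)
class IntToPtr:
    """Pointer rebuilt from integers: inttoptr(ptrtoint(base) + offset)."""
    base: "Ptr"
    offset: int


Ptr = Union[Arg, GEP, IntToPtr]


def decompose(p: Ptr) -> Optional[Tuple[Arg, int]]:
    """Return (base, byte offset), looking through GEPs only.

    Assumption of this toy model: anything that is not a GEP chain ending
    in an argument (for example an IntToPtr) cannot be decomposed.
    """
    offset = 0
    while isinstance(p, GEP):
        offset += p.offset
        p = p.base
    if isinstance(p, Arg):
        return p, offset
    return None


def alias(p: Ptr, p_size: int, q: Ptr, q_size: int) -> str:
    """Very conservative alias query over two (pointer, access size) pairs."""
    a, b = decompose(p), decompose(q)
    if a is None or b is None or a[0] != b[0]:
        return "MayAlias"  # nothing can be proven
    p_off, q_off = a[1], b[1]
    if p_off + p_size <= q_off or q_off + q_size <= p_off:
        return "NoAlias"  # same base, disjoint byte ranges
    return "MayAlias"  # overlapping ranges; the toy model stops here


if __name__ == "__main__":
    a2 = Arg("a2")
    neighbour = GEP(a2, 264)       # hypothetical other access, 8 bytes
    kept_gep = GEP(a2, 272)        # address left as a GEP
    sunk_int = IntToPtr(a2, 272)   # address lowered to ptrtoint/add/inttoptr
    print(alias(kept_gep, 8, neighbour, 8))
    print(alias(sunk_int, 8, neighbour, 8))
```

In this model the GEP form lets the two 8-byte accesses be proven disjoint, while the inttoptr form falls back to the conservative answer. The real BasicAA logic is considerably more involved, so this only illustrates the shape of the problem.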

 -Hal

> 
> 
> -Jun
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -Hal
> 
> 
> 
> 
> Evan
> 
> 
> 
> 
> -Hal
> 
> 
> 
> 
> Evan
> 
> 
> 
> 
> So, do you think it is a possible workaround to sink a GEP
> without converting it into a set of integer operations
> (ptrtoint/add/inttoptr) if the address mode is derived only from
> a single GEP?
> 
> Thanks,
> Jun
> 
> 
> On Nov 12, 2013, at 7:14 PM, Evan Cheng < evan.cheng at apple.com >
> wrote:
> 
> 
> 
> 
> On Nov 12, 2013, at 11:24 AM, Junbum Lim < junbums at gmail.com >
> wrote:
> 
> 
> 
> 
> I wonder why CodeGenPrepare breaks GEP into integer
> calculations (ptrtoint/add/inttoptr) instead of directly sinking
> the address calculation using GEP into the user's block.
> 
> I believe it's primarily for address mode matching where only part
> of the GEP can be folded (depending on the instruction set).
> 
> Evan
> 
> 
> 
> 
> Thanks,
> Jun
> 
> 
> On Nov 12, 2013, at 12:07 PM, Evan Cheng < evan.cheng at apple.com >
> wrote:
> 
> 
> 
> The reason for this is to allow folding of address computation
> into loads and stores. A lot of modern architectures, e.g. X86 and
> ARM, have complex addressing modes.
> 
> Evan
> 
> 
> 
> 
> On Nov 12, 2013, at 8:39 AM, Junbum Lim < junbums at gmail.com >
> wrote:
> 
> Hi All,
> 
> In the CodeGenPrepare pass, OptimizeMemoryInst() tries to sink
> address computing into the users' block by converting GEP to
> integers. It appears that it has an impact on ISel's result,
> but I'm not clear about the main purpose of the transformation.
> 
> FROM :
> for.body.lr.ph:
> %zzz = getelementptr inbounds %struct.SS* %a2, i32 0, i32 35
> 
> for.body:
> %4 = load double* %zzz, align 8, !tbaa !0
> 
> TO :
> for.body:
> %sunkaddr27 = ptrtoint %struct.SS* %a2 to i32 ; <----- sink address computing into user's block
> %sunkaddr28 = add i32 %sunkaddr27, 272
> %sunkaddr29 = inttoptr i32 %sunkaddr28 to double*
> %4 = load double* %sunkaddr29, align 8, !tbaa !8
> 
> 
> From what I observed, this transformation can cause poor
> alias analysis results because the GEP is no longer used. So, I
> want to see if there is any way to avoid this conversion.
> 
> My questions are:
> 1. Why do we need to sink address computing into the users' block?
> What is the benefit of this conversion?
> 2. Can we directly use GEP instead of breaking it into integer
> calculations?
> 
> Thanks,
> Jun
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> 
> 
> 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory