[LLVMdev] Interesting post increment situation in DAG combiner
Sergei Larin
slarin at codeaurora.org
Fri Mar 1 15:16:04 PST 2013
Hal,
Here is my patch for the post inc case. I think it is symmetrically applicable to the pre-inc, but I have not tested it for that.
I think you can clearly see my intent here - I simply select the "latest" candidate when multiple are available.
Who else might be interested in this?
Sergei
---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: Friday, March 01, 2013 4:53 PM
> To: Sergei Larin
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: Interesting post increment situation in DAG combiner
>
> ----- Original Message -----
> > From: "Sergei Larin" <slarin at codeaurora.org>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: llvmdev at cs.uiuc.edu
> > Sent: Friday, March 1, 2013 10:24:39 AM
> > Subject: Interesting post increment situation in DAG combiner
> >
> > Hal, (and everyone who might care about post increment generation)...
>
> Sergei,
>
> Perhaps this is a problem that I wish that I had ;) -- PPC does not
> have post-increment loads and stores, only pre-increment ones.
> Nevertheless, I think that the situation is similar...
>
> For one thing, I recently committed an enhancement to pre-increment
> generation that causes later constant offsets to use the new
> incremented address instead of the old address. I thought that this
> would not be an issue for post-increment generation, but it seems that
> I was wrong: you'd have the same problem here: if you post-increment
> the load and not the store, then you might need an extra register to
> hold the original base address for the store. In this case, you'd not
> have a problem if you just chose to post-increment the store instead,
> but the general problem still exists.
>
> Regarding the selection of which load/store to pre/post increment, I
> think that this is also a general issue. At least for pre-increment
> generation I've seen it make some odd choices.
>
> In short, I do recognize the issue, and I'm curious to see your patch.
> If it can improve pre-increment selection as well, that would help me
> too.
>
> Thanks again,
> Hal
>
> >
> > I have an interesting question/observation. Consider this vector
> loop.
> >
> > void vec_add_const(unsigned N, short __attribute__ ((aligned (16)))
> > *A,
> > short __attribute__ ((aligned
> > (16))) val) { unsigned i,j; for
> > (i=0; i<N; i++) {
> > for (j=0; j<N; j++) {
> > A[i*N+j] += val;
> > }
> > }
> > }
> >
> > The innermost loop looks like this right before the DAG selection
> > begins.
> >
> > p.loop_body.us65: ; preds =
> > %p.loop_body.lr.ph.us78, %p.loop_body.us65
> > %p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep,
> > %p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65
> ]
> > %p.loopiv48.us66 = phi i32 [ 0, %p.loop_body.lr.ph.us78 ], [
> > %p.next_loopiv.us67, %p.loop_body.us65 ]
> > %vector_ptr.us70 = bitcast i16* %p_arrayidx.us69.phi to <4 x i16>*
> > %p.next_loopiv.us67 = add nsw i32 %p.loopiv48.us66, 4
> > <<<<<<<<<<<<<<<<<<
> > IV
> > %_p_vec_full.us71 = load <4 x i16>* %vector_ptr.us70, align 16
> > <<<<<<<<<<<<<<<<<<<Load
> > %add5p_vec.us72 = add <4 x i16> %_p_vec_full.us71, %5
> > store <4 x i16> %add5p_vec.us72, <4 x i16>* %vector_ptr.us70, align
> > 16
> > <<<<<<<<<<<<<<<Store
> > %p_arrayidx.us69.inc = getelementptr i16* %p_arrayidx.us69.phi, i32
> > 4
> > <<<<<<<<<<<<<<< Common Ptr
> > %11 = icmp slt i32 %p.next_loopiv.us67, %leftover_lb
> > br i1 %11, label %p.loop_body.us65, label
> > %p.loop_header38.preheader.us84
> >
> > When it gets to the DAG Combiner, in CombineToPostIndexedLoadStore()
> > two opportunities for post inc are recognized - the load and the
> > store.
> > Now, you can easily see that in this case you would want the store to
> > get the post inc, not the load, but since the DAG combiner simply
> > scans top-down, the opposite happens.
> >
> > So here is the question - do you recognize this as a deficiency,
> > and can
> > you see the same in PPC? The fix is code trivial, but it would
> > introduce a general concept of a primitive cost function to the
> > PostInc candidacy selection in DAG combine. If you recognize the
> > issue, I will post a patch with more details, but if I am missing the
> > big picture here, please advise.
> >
> > Sergei
> >
> >
> > ---
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > hosted by The Linux Foundation
> >
> >
> >
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DagCombiner_PostInc.patch
Type: application/octet-stream
Size: 3444 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130301/97fc5162/attachment.obj>
More information about the llvm-dev
mailing list