[PATCH] Added more insertps optimizations
filcab+llvm.phabricator at gmail.com
Fri May 16 12:48:45 PDT 2014
Thanks Andrea. I will check that out and resubmit the patch.
On Fri, May 16, 2014 at 12:47 PM, Andrea Di Biagio <
Andrea_DiBiagio at sn.scee.net> wrote:
> > Hi Andrea,
> > Wouldn't the tablegen patterns be problematic when we use the load result
> > several times?
> > If we use it several times, then we shouldn't generate additional loads.
> > Do you think it doesn't matter since they should be close together, since
> > we're generating for one BB only?
> The ISel pattern matcher will only reduce a dag sequence if the chain is
> "profitable to fold". A chain is never profitable, for example, if it
> contains nodes with more than one use.
> In the case of your new test function 'insertps_from_vector_load', if I
> add another use for variable %1, then the following rule will no longer match:
> def : Pat<(v4f32 (X86insertps (v4f32 VR128:$src1), (loadv4f32
> (INSERTPSrm VR128:$src1, addr:$src2, imm:$src3)>;
> That is because eventually, method 'X86DAGToDAGISel::IsProfitableToFold'
> will return false if the 'loadv4f32' has more than one use (see
> X86ISelDAGToDAG.cpp - around line 303). Basically, the matcher will try to
> verify that all the intermediate nodes between the X86insertps (excluded)
> and the loadv4f32 (included) have only a single use.
> Given how the ISel Matcher works, in case of a load result used several
> times, we will not generate additional loads.
> > That was the reasoning behind doing it with code and guarding them with
> > MayFoldLoad, which includes hasOneUse().
> That reasoning is correct. However, the pattern Matcher will do the same
> and will never try to fold the loadv4f32 if it has more than one use.
> (P.S.: you can see it for yourself if you run some experiments adding more uses
> for the load instruction and then debug ISel by passing the -debug flag to
> I hope this helps.
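
The single-use rule Andrea describes can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyNode, addOperand, isFoldableChain and the two demo functions are hypothetical names invented for illustration, and none of this is the actual LLVM implementation of IsProfitableToFold. It only shows the idea that every node between the root (excluded) and the load (included) must have exactly one use.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a SelectionDAG node. This is NOT llvm::SDNode.
struct ToyNode {
    std::string name;
    std::vector<ToyNode*> operands;
    int numUses;  // how many nodes consume this node's value

    explicit ToyNode(std::string n) : name(std::move(n)), numUses(0) {}
};

// Record that `user` consumes the value produced by `def`.
void addOperand(ToyNode& user, ToyNode& def) {
    user.operands.push_back(&def);
    ++def.numUses;
}

// Returns true if `target` is reachable from `root` through operand edges
// and every node on that path (root excluded, target included) has exactly
// one use. The root's own use count is deliberately not checked.
bool isFoldableChain(const ToyNode& root, const ToyNode& target) {
    for (const ToyNode* op : root.operands) {
        if (op->numUses != 1) continue;        // multi-use node: cannot fold through it
        if (op == &target) return true;        // reached the load with a single use
        if (isFoldableChain(*op, target)) return true;
    }
    return false;
}

// The load feeds only the insertps, so the chain is foldable.
bool demoLoadUsedOnce() {
    ToyNode load("loadv4f32"), src("src1"), insert("insertps");
    addOperand(insert, src);
    addOperand(insert, load);
    return isFoldableChain(insert, load);
}

// The load has a second user, so the chain is not foldable.
bool demoLoadUsedTwice() {
    ToyNode load("loadv4f32"), src("src1"), insert("insertps"), other("other_user");
    addOperand(insert, src);
    addOperand(insert, load);
    addOperand(other, load);
    return isFoldableChain(insert, load);
}
```

In this model, demoLoadUsedOnce() corresponds to the original test function, where the load can be folded into the instruction, and demoLoadUsedTwice() corresponds to adding another use for %1, where the load is kept separate and not duplicated.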