[llvm-commits] patch: partial DSE

Peter Cooper peter_cooper at apple.com
Tue Nov 1 16:39:55 PDT 2011


I agree with you Evan if you consider the magic 16 constant which really does need TargetData, but i could remove that for now.  The result would be the following 2 conditions allowing the optimization to take place:

llvm::isPowerOf2_64(InstWriteOffset)

which means that the earlier store is going to be trimmed to a power of 2 in which case any instructions over the power of 2 boundary were likely not vector instructions anyway, or if they were then we're replacing all those vector instructions with probably more vector instructions which is good, and

((DepWriteAlign != 0) && InstWriteOffset % DepWriteAlign == 0)

which says that the later store is at an offset at the same alignment as the earlier store.  If the alignment was < 16 in this case then we'd probably not generate vector instructions for the earlier stores anyway so trimming the earlier store should be ok.

Pete


On Nov 1, 2011, at 4:22 PM, Evan Cheng wrote:

> This might be something we have to defer until we add "what's native types" to TargetData. I'm worried when / if this does something bad, the performance impact can be very significant.
> 
> Evan
> 
> On Nov 1, 2011, at 3:12 PM, Eli Friedman wrote:
> 
>> On Tue, Nov 1, 2011 at 2:47 PM, Eric Christopher <echristo at apple.com> wrote:
>>> 
>>> On Oct 31, 2011, at 5:38 PM, Peter Cooper wrote:
>>> 
>>>> Hi
>>>> 
>>>> Please review this patch to allow DSE to trim stores as opposed to deleting them.
>>>> 
>>>> The logic here is that if the end of the earlier store is dead because of a later store then the length of the earlier store will be trimmed in size to avoid writing dead memory.  The only time i won't do this is if the original store was likely to use vector writes which if shortened would end up as multiple scalar writes and so is less efficient.
>>>> 
>>> 
>>> Not a huge fan of this style:
>>> 
>>> +          (OR = isOverwrite(Loc, DepLoc, *AA,
>>> +                            DepWriteOffset,
>>> +                            InstWriteOffset)) != OverwriteUnknown &&
>>> 
>>> in large conditionals. Things that return booleans, or set something for a block, e.g.:
>>> 
>>> if (ConstantInt *CI = dyn_cast<ConstantInt>(Inst)) {
>>> }
>>> 
>>>> Any help removing the magic vector size (16) constant would be good too :)
>>> 
>>> Something like this maybe?
>>> 
>>> bool isLegalVector = false;
>>> if (VectorType *VecTy = dyn_cast<VectorType>(Store->getType()) {
>>> EVT VT = TLI.getValueType(VecTy);
>>> isLegalVector = TLI.isTypeLegal(VT);
>>> }
>>> 
>>> does require target info though and I'm not sure how kosher that is in AA.
>> 
>> There aren't actually any vector types involved here; the issue is
>> that, for example, a 32 byte memset is cheaper than a 31-byte memset
>> under the default settings on x86-64.  I'm not sure what the right
>> approach is here.
>> 
>> -Eli
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111101/c0983861/attachment.html>


More information about the llvm-commits mailing list