[PATCH 2/3] ARM cost model: Address computation in vector mem ops not free
Nadav Rotem
nrotem at apple.com
Thu Feb 7 11:42:57 PST 2013
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3045,7 +3045,8 @@ LoopVectorizationCostModel::getInstructionCost(Instruction *I, unsigned VF) {
// We mark this instruction as zero-cost because scalar GEPs are usually
// lowered to the instruction addressing mode. At the moment we don't
// generate vector geps.
- return 0;
+ return TTI.getAddressComputationCost(VectorTy);
+
We include the cost of GEPs when we calculate the load/store costs. Are you worried about cases where a GEP is not consumed by a load or store?
Thanks,
Nadav
On Feb 7, 2013, at 11:32 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 7 February 2013 14:31, Arnold <aschwaighofer at apple.com> wrote:
> I agree with you, it is unfortunate. However, I am trying to model an idiosyncrasy of the processor that has a big implication on performance. It is very expensive on Swift if you happen to load into an S register or a D sub-lane. Two such instructions are not pipelined but serialized.
>
> In that case, the cost will be much more than 2 or 3, no?
>
>
> Stride holds the value returned by the isConsecutivePtr method:
>
> Ok, in the original code you had:
>
> if (Stride < 0)
> return parent::cost();
> return Cost;
>
> In this you have:
>
> if (Stride > 0)
> return Cost;
> return parent::cost();
>
> It seems you're missing the case where it's == 0, but I can't tell which way it should go.
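
To make the Stride == 0 difference concrete, here is a minimal sketch of the two forms quoted above. The function names and the two cost constants are hypothetical stand-ins (they are not from the patch); only the comparisons mirror the quoted code.

```cpp
// Hypothetical placeholder costs, chosen only so the two paths are
// distinguishable. They are not values from the patch.
constexpr unsigned kTargetCost = 3; // stands in for "Cost"
constexpr unsigned kParentCost = 1; // stands in for "parent::cost()"

// First quoted form: only a negative stride falls back to the parent cost.
unsigned costFirstForm(int Stride) {
  if (Stride < 0)
    return kParentCost;
  return kTargetCost; // taken for Stride == 0 and Stride > 0
}

// Second quoted form: only a positive stride gets the target cost.
unsigned costSecondForm(int Stride) {
  if (Stride > 0)
    return kTargetCost;
  return kParentCost; // taken for Stride == 0 and Stride < 0
}
```

Under these assumptions the two forms agree for positive and negative strides and differ only at Stride == 0, where the first returns the target cost and the second returns the parent cost.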
>
>
> I don't think we need a function call for the value 3 here. It is a value just like any other that is returned by TTI.
>
> What I'm trying to say is that this value seems to come out of the blue. I could be wrong, obviously, but it seems to me that you're experimenting with a micro-benchmark and fine-tuning to your particular example, which is dangerous from a wider perspective.
>
> I understand that this might be a big hit on a set of examples, but we should get some constants out, just to make it clear that we're not talking about "idealized cycle count", but something else entirely.
>
> Like:
>
> const int AVOID_AT_ALL_COSTS = 100;
> const int DANGEROUS_IN_MOST_CASES = 10;
> const int NOT_GOOD_BUT_COULD_BE_OK = 5;
>
> etc...
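
As a rough sketch of how such named penalties might read at a use site: the constant names come from the suggestion above, while the helper function, its parameters, and the choice of which penalty to apply are purely hypothetical and not part of any patch in this thread.

```cpp
// Penalty levels named as in the suggestion above. They express
// "how strongly to avoid this", not an idealized cycle count.
constexpr unsigned AVOID_AT_ALL_COSTS = 100;
constexpr unsigned DANGEROUS_IN_MOST_CASES = 10;
constexpr unsigned NOT_GOOD_BUT_COULD_BE_OK = 5;

// Hypothetical helper: adds a named penalty to a base cost when a load
// targets a sub-register lane. Which penalty fits is an assumption here.
unsigned subLaneLoadCost(bool LoadsIntoSubLane, unsigned BaseCost) {
  if (LoadsIntoSubLane)
    return BaseCost + DANGEROUS_IN_MOST_CASES;
  return BaseCost;
}
```

The point of the names is that a reader of the call site sees the intent of the penalty instead of a bare number.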
>
> cheers,
> --renato