[PATCH] D42981: [COST] Fix cost model of load instructions on X86

Thu Feb 25 06:18:45 PST 2021

ABataev added a comment.

In D42981#2587030 <https://reviews.llvm.org/D42981#2587030>, @lebedev.ri wrote:

> Please can you explain why in some cases the cost increases,
> and we no longer vectorize some test cases/vectorize with smaller vector width?

It does not increase the cost of vector code; it lowers the cost of scalar code with memory operands. Only the resulting cost difference between the vector version and the scalar version increases. As I explained already, this comes from the cost model. For example, the throughput cost of `add r,r` on X86 is 0.25, and for `add m,r` it is 0.5. But the cost model assigns a cost of 1 to `add r,r` and 2 to `add m,r` (load + add: 1 for the load and 1 for the add LLVM IR instruction).
At the same time, the actual throughput cost of `VMOV+VADD` is ~1.5, and the cost model estimates it at 2 (again, 1 for vmov and 1 for vadd). As you can see, the cost model gives `add m,r` and `VMOV+VADD` the same cost, though in reality the cost of `add m,r` should be lower.
So, the comparison between the two scalar instructions `add r,r` and `add m,r` is still not correct, but that is not important: we do not need to compare the costs of scalar instructions with each other, only the cost of the scalar version against the cost of the vector version.
So, the patch lowers the cost of scalar instructions with memory access relative to the vector versions of those instructions.
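
To make the arithmetic concrete, here is a minimal sketch that only restates the figures quoted in this comment. The table names and the helper function are hypothetical and are not part of LLVM or of the patch. It compares one scalar instruction against one vector sequence, as the comment does, and does not account for the number of lanes.

```python
# Figures quoted in the comment; nothing here is measured or taken from LLVM.
ACTUAL = {"add r,r": 0.25, "add m,r": 0.5, "VMOV+VADD": 1.5}  # throughput cost
MODEL = {"add r,r": 1, "add m,r": 2, "VMOV+VADD": 2}          # cost model units


def scalar_to_vector_ratio(costs, scalar="add m,r", vector="VMOV+VADD"):
    """Return the scalar cost divided by the vector cost for one table."""
    return costs[scalar] / costs[vector]


if __name__ == "__main__":
    # The model rates the scalar memory op as expensive as the vector pair,
    # while the quoted throughput figures put it at about one third.
    print("model ratio: ", scalar_to_vector_ratio(MODEL))
    print("actual ratio:", round(scalar_to_vector_ratio(ACTUAL), 3))
```

With these numbers the modeled ratio is 1.0 while the ratio of the quoted throughput figures is about 0.33, which is the sense in which the scalar memory operation is overpriced relative to the vector code.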

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D42981/new/

https://reviews.llvm.org/D42981