[PATCH] D44523: Change calculation of MaxVectorSize

Thu Mar 15 12:12:16 PDT 2018

dcaballe added a comment.

Hi Krzysztof,

I'm afraid this is not a simple typo :). The current code is conservatively correct and just replacing `WidestType` with `SmallestType` would be problematic.
`MaxVectorSize` is computing the maximum number of elements of any type that you can put in a physical vector (`unsigned WidestRegister = TTI.getRegisterBitWidth(true);`). For example, imagine that `WidestRegister` is 128-bit and we have double (64-bit) and char (8-bit) data types in the loop:

- With the current code, `MaxVectorSize = 128 / 64 = 2` elements/physical vector. This number of elements is OK for our loop because 2 doubles and 2 chars fit into a 128-bit vector.
- With your proposed change, `MaxVectorSize = 128 / 8 = 16` elements/physical vector. This number may be problematic for our loop because 16 chars fit into a 128-bit vector but 16 doubles don't. We would need to use 8 physical vector registers to pack 16 doubles!

We use the term double/triple/... pumping vectorization when we have to use multiple (2/3/...) physical registers to pack some data types. Unfortunately, this approach is not always beneficial, since it may increase register pressure too much and lead to register spilling. If we wanted to enable something like this, we would need to add proper cost model support to evaluate whether double/triple/... pumping vectorization scenarios have a better cost than the standard approach.

I hope this is helpful.
Please let me know if you have any questions.

Thanks,
Diego

Repository:
  rL LLVM

https://reviews.llvm.org/D44523