Simon Pilgrim via llvm-dev
llvm-dev at lists.llvm.org
Sun Jan 22 08:07:27 PST 2017
> On 20 Jan 2017, at 14:53, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On 01/20/2017 08:30 AM, Jonas Paulsson wrote:
>> On 2017-01-20 14:31, Hal Finkel wrote:
>>> On 01/20/2017 06:11 AM, Jonas Paulsson via llvm-dev wrote:
>>>> I wonder why getScalarizationOverhead() does not take into account the number of operands of the instruction? This should influence the number of extracts needed, so instead of
>>>> Scalarization cost = NumEls * (insert + extract)
>>>> it would be better to do
>>>> Scalarization cost = NumEls * (insert + (extract * numOperands))
>>> I suspect this is an oversight (although we need to be a bit careful here because if two operands are the same, which is not uncommon, we don't want to double the cost).
>> Do you in those cases of an identical operand want to count just a cost of "1" for a register move, instead of the "extraction cost"?
> There should be no cost to reusing the operand. (mul a, a) should only extract a once; the fact that it is used twice should not increase the cost.
There appears to be a similar issue within the x86 AVX1 cost tables for cases where we have to split the 256-bit integer operations. Some binops add 1*extract_subvector + 1*insert_subvector to the cost of the two 128-bit binops, whilst others don’t add anything at all. We need to try harder to determine whether we should add 1 extract (duplicate input or constant-folded extract) or 2 extracts to the final cost.
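
To make the counting concrete, here is a rough sketch of both cases discussed above. It is not the actual TargetTransformInfo or X86 cost-table code: the Operand struct and the functions countExtractedOperands, scalarizationCost and splitBinOpCost are hypothetical names made up for illustration, and all costs are plain unsigned parameters. Treating constant operands as needing no extract in the scalarization case is an assumption carried over from the constant-folded case mentioned for AVX1.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for an instruction operand; not an LLVM type.
struct Operand {
  int Id;          // two operands with the same Id are the same value
  bool IsConstant; // assumption: extracts from constants fold away
};

// Number of operands that really need an extract: duplicates are
// counted once and constants are skipped.
unsigned countExtractedOperands(const std::vector<Operand> &Ops) {
  std::set<int> Seen;
  for (const Operand &Op : Ops)
    if (!Op.IsConstant)
      Seen.insert(Op.Id);
  return static_cast<unsigned>(Seen.size());
}

// Scalarization cost = NumEls * (insert + extract * uniqueOperands),
// i.e. the proposed formula with repeated operands counted only once.
unsigned scalarizationCost(unsigned NumEls, unsigned InsertCost,
                           unsigned ExtractCost,
                           const std::vector<Operand> &Ops) {
  return NumEls * (InsertCost + ExtractCost * countExtractedOperands(Ops));
}

// Splitting a 256-bit integer binop on AVX1: two 128-bit binops, one
// extract_subvector per distinct non-constant input, and one
// insert_subvector to rebuild the 256-bit result.
unsigned splitBinOpCost(unsigned BinOp128Cost, unsigned ExtractSubvecCost,
                        unsigned InsertSubvecCost,
                        const std::vector<Operand> &Ops) {
  return 2 * BinOp128Cost +
         ExtractSubvecCost * countExtractedOperands(Ops) +
         InsertSubvecCost;
}
```

With unit costs, a 4-element (mul a, a) scalarizes to 4 * (1 + 1) = 8 rather than 12, and a split 256-bit binop with a duplicated input costs 2 + 1 + 1 = 4 rather than 5.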