[llvm-dev] [RFC] Goals for VPlan-based cost modelling

Mon Nov 2 01:53:01 PST 2020

Hi Renato,

> The cost model structure needs to be target independent but with heavy 
> target-specific knowledge. The kind of specialisation we created for 
> the previous vectoriser is too naive, and assumes the concepts are the 
> same, just not necessarily the costs. That's not true at all for 
> different hardware.
Yes, that's a big issue to be solved. I do like the idea from Florians 
reply, basically encoding an instruction's environment in the 
VPInstruction that is generated and make it such that target-specific 
cost components would overwrite the cost calculations for the general 
class if they can be treated in a special way by the architecture. 
Though currently the gather that is extended and the scatter whose input 
is truncated are my only examples for this kind of thing, which is why I 
am hoping that the community will have encountered more examples like 
these and is willing to share. Maybe there are cases that are more 
complex, that e.g. merge or depend on far more then two instructions, in 
which case the generation of the right VPInstructions might become 
tricky - especially if there are multiple possible 'merge patterns' (for 
one or different architectures) that are overlapping, which would 
require to generate alternative VPlans, whose number for long loops with 
many special cases might grow exponentially to cover all possible 
combinations.

> There are probably many other ways to do this, I'm just thinking out 
> loud to give an idea of the problems we faced earlier. I'm sure you'll 
> find better ways to do this yourself once you start looking at it more.
>
> Regardless of how we do it, having the ability to say "this shuffle is 
> free on this instruction" but "not with that other one" will make the 
> cost model a lot more precise and need a lot less fudged heuristics.
Well thank you for thinking out loud, that is exactly what I was hoping 
to get out of this RFC!
Yes, there needs to be a good way to say things like that, and 
preferrably 'good' should not only mean efficient, but also easy to 
understand, maintain and extend - but that's just me dreaming :)
Are you thinking of a concrete example when mentioning those shuffles? 
Because concrete examples would really help me a lot when thinking 
further about this - if I decide to do this as a thesis, I might have to 
constrain the actual implementation to (a subset of instructions for) a 
single architecture, but for thinking about the issue itself any 
concrete examples for any architecture would be a great help!

> 2. How to use the cost model
>
> Once we have costs of a VPlan, we need to traverse the problem space 
> efficiently. It would be awesome if we could use some smart traversing 
> (like RLO) but that also brings highly uncertain execution times and 
> the need for training and genralisation, so not generally feasible. 
> But we can do something simpler while still having the same idea. What 
> we really don't want is to be as greedy as the original vectoriser. We 
> must have a way to occasionally increase costs without giving up, but 
> for that, we need a policy that tells us that it's ok to go that way 
> and not some other (less optimal) way.
>
> This policy has to be target-specific. Given it's more deterministic 
> nature (more than RLO at least), this will probably be the place where 
> we fudge most of our heuristics. Things like "keep adding 
> shuffles that it's likely they'll disappear", even if they don't all 
> the time, it's worth pushing a bit more, in case they do. So we'll 
> also need to have hard limits, possibly per target, and possibly 
> benchmark-driven heuristics.
>
> Once we have the policy and the cost, we can traverse one or many 
> paths (depending on the budget we give the compiler, which could be a 
> command line option), and then push as hard as we can through all 
> paths within the budget and when we stop, we take the lowest cost and 
> apply the series. How we traverse this will depend on the 
> implementation of the cost model and the policy, but some simple 
> heuristics search would be fine as a first approach.
>
> Folks working on VPlan are creating a really nice structure to work 
> with the different concepts in vectorisation strategies (predication, 
> scalable vectors, use-def chain costs), so I'm sure you'll have at 
> least a few good ways to proceed, and hopefully help define the 
> interfaces between the different parts of the vectoriser.
Hm, I hadn't yet gotten as far as how to use it, but you're right of 
course - that will be another challenge. I know that there are many 
things being planned and on the way for VPlan, but I'll willingly admit 
that I am currently still having some trouble gaining an overview of 
what is there, what is not, and what might be on its way. It's simply 
hard and takes a lot of time to find documentation for all the new 
things, but I'm sure I'll get there.

Thanks for your detailed reply and all the ideas, they do help a lot.

Best,
Anna