[PATCH] D91984: [CostModel] Add basic implementation of getGatherScatterOpCost.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 24 10:43:45 PST 2020
fhahn added a comment.
In D91984#2413579 <https://reviews.llvm.org/D91984#2413579>, @dmgreen wrote:
>> I think assigning a more accurate cost will go a long way towards avoiding generating gathers in practice, unless they are beneficial even with the bad lowering.
> In the past, I believe the only thing to create gather's would be the loop vectorizer, and it would only do so if isLegalMaskedGather returned true, which guards us against a lot of bad costs. More accurate cost modelling now that it is being used more generally is certainly a good thing.
Once the cost-model is more reasonable, we might want to also just let loop vectorize rely on the cost. Using the costs more widely should flush out any potential issues much quicker.
>> Yes it probably should better account for the fact that the mem ops need to be executed conditionally. Not really sure how to best cost that. I updated the cost to include extracting the individual conditions, doing a branch and a PHI to combine the result. With that, the cost is not really that much higher. Perhaps something else is missing, or it might be better to just return a sufficiently high number in that case?
> The costs of branches and phi's is often 0 unfortunately (due to branch predictors and phis being folded away and whatnot) - they make a good default for the normal type of branch/phi in a loop or if. With so many branches together here that might not be true and the real cost perhaps should be higher. Hmm. Not sure what do suggest. Do we think these scores are high enough? (at least for the moment, we can always adjust them later). As far as I understand the SLP vectorizer will use constant masks which makes it the more immediate concern.
Yeah, the estimate is very rough. I think they costs should be sufficiently high so that it won't get picked in practice, but if there's a better way to model the fact that we need to introduce multiple blocks & branches, I'm happy to adjust the code. Alternatively we could also use a very high number for the VariableMask case. In any case, this should be a substantial improvement to returning `1` as cost, as we do now.
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
More information about the llvm-commits