[PATCH] D53865: [LoopVectorizer] Improve computation of scalarization overhead.

Tue Nov 27 10:32:25 PST 2018

hsaito added a comment.

Sorry, I must have missed this review.

VPlan-based cost modeling (plus VPlan-based code motion) should naturally capture this kind of situation, but only to the extent that the producer and consumer can reside in the same BB. It's taking a lot longer than I wanted to stabilize it (that is, to make it compute exactly the same values as the existing cost model in LV). If you are interested in developing that area of the VPlan-based cost model, I can clean up my workspace and upload what I have to Phabricator as a WIP patch.

When the producer and consumer have to reside in separate BBs for some reason, the current recipe-based modeling (a recipe resides in a single BB) won't help much. As such, from a generalized-implementation perspective, some kind of U-D based mapping (like the one used here) may be inevitable. So, on the technical side, we should discuss which plausible scenarios force the producer and consumer into separate BBs, and whether those situations are rare enough to ignore.
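
For illustration only, here is a minimal source-level sketch of one such scenario, assuming a conditionally executed use. The function name scale_if_positive and its parameters are hypothetical and not taken from the patch, and whether the two instructions still sit in separate BBs by the time LV runs depends on earlier passes (e.g. sinking or if-conversion):

```cpp
#include <cstddef>

// Hypothetical example, not from D53865.
// The multiply (producer) is written in the unconditional part of the loop
// body, while its only user (consumer) is in the conditionally executed block.
void scale_if_positive(const int *a, const int *b, int *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    int v = a[i] * 3;      // producer: unconditional part of the loop body
    if (b[i] > 0) {        // the branch introduces a separate basic block
      out[i] = v + b[i];   // consumer: uses v inside the conditional block
    }
  }
}
```

A cost model that only looks within one BB would not see that v and its user form a producer/consumer pair here, which is the case where a U-D based mapping would be needed.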

> typically with the only benefit being vectorized memory operations

I have mixed feelings here. ICC vectorizes relatively aggressively, and it has good enough reasons to do so. Having said that, aggressive vectorization comes with code size and compile time costs (on top of vectorization not always being profitable in terms of execution time). So, if we are doing this, we should make it easy to tune (by vectorizer developers as well as by programmers using the compiler). This comment is not specific to this patch, though.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53865/new/

https://reviews.llvm.org/D53865