[llvm-dev] MatchLoadCombine(): handling for vectorized loop.
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 3 15:37:29 PST 2018
On 12/3/2018 8:20 AM, Jonas Paulsson wrote:
> I have noticed some loops that build a wider element by loading small
> elements, zero-extending them, shifting them (by different amounts),
> and then 'or'-ing them all together. The result is equivalent either to
> a wider load or to a byte-swapped one.
> DAGCombiner::MatchLoadCombine() will combine this to a single wide
> load, but only in the scalar cases of i16, i32 and i64. The result is
> that these loops (I have seen a dozen or so on SPEC) get vectorized
> with a lot of ugly code.
> I have begun to experiment with handling the vectorized loop also, and
> would like to know if people think this would be a good idea? Also, am
> I right to assume that it probably should be run before type
You mean, trying to merge some combination of vector loads and shuffles
into a single vector load in DAGCombine? That seems sort of late, given
the cost modeling involved in vectorization.
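For readers following along, the source pattern under discussion can be sketched roughly as below. This is only an illustration: the function names (load_le32, load_be32, decode_le32) are hypothetical and do not come from LLVM or from the SPEC sources, and the endianness remarks assume a little-endian target.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: build a 32-bit value from four bytes, least
// significant byte first. Each byte is zero-extended, shifted by a
// different amount, and or'ed in. On a little-endian target this is
// equivalent to a single 32-bit load.
inline uint32_t load_le32(const uint8_t *p) {
  return static_cast<uint32_t>(p[0]) |
         (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}

// Hypothetical helper: same idea, most significant byte first. On a
// little-endian target this is equivalent to a 32-bit load followed by
// a byte swap.
inline uint32_t load_be32(const uint8_t *p) {
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
}

// Hypothetical loop of the kind described in the quoted message: the
// scalar pattern repeated over an array, which makes it a candidate
// for the loop vectorizer.
void decode_le32(const uint8_t *src, uint32_t *dst, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = load_le32(src + 4 * i);
}
```

In the scalar helpers the whole or-tree is visible in one expression, which is the case the quoted message says is already handled. In the loop, the same computation may be widened by the vectorizer before the DAG combine runs, which is where the phase-ordering question above comes from.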
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project