[PATCH] D22867: [LV] Mark scalarized GEPs uniform
Matthew Simpson via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 27 12:23:04 PDT 2016
mssimpso added inline comments.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1518-1519
@@ -1506,4 +1517,4 @@
/// Collect the variables that need to stay uniform after vectorization.
void collectLoopUniforms();
----------------
anemet wrote:
> I was looking at this code recently and was surprised to see that we never actually describe what uniformity is (especially because we use it for other things than just loop-invariant addresses). As a first step, would you mind filling this gap?
>
> My take is that we currently call something uniform if we don't need to generate values for each horizontal value in a vector loop iteration; more precisely, we only need to generate the first one. This is true for a few things: the induction variable, loop-invariant addresses, and pointers for consecutive accesses (because the vector access instruction implicitly generates the horizontal addresses).
I completely agree and don't mind doing this at all. Thanks for putting that into words!
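As a rough illustration of that definition, here is a toy model (hypothetical names, not LoopVectorize code) of which element indices need to be generated for one vector iteration. It assumes a vectorization factor `vf` and a vector iteration that starts at scalar iteration `first`:

```cpp
#include <cassert>
#include <vector>

// Toy model with hypothetical names; this is not LoopVectorize code.
enum class AccessKind { LoopInvariant, Consecutive, NonConsecutive };

// Returns the element indices whose addresses must be computed for one
// vector iteration covering scalar iterations [first, first + vf).
// `stride` is the index step between scalar iterations; it is only used
// for the NonConsecutive case.
std::vector<long> indicesToGenerate(AccessKind kind, long first, long stride,
                                    unsigned vf) {
  assert(vf > 0 && "vectorization factor must be positive");
  switch (kind) {
  case AccessKind::LoopInvariant:
    // The address does not depend on the iteration, so one value serves
    // all lanes. The 0 here stands in for that fixed index.
    return {0};
  case AccessKind::Consecutive:
    // A wide load/store implicitly covers first .. first + vf - 1,
    // so only the lane-0 index is needed.
    return {first};
  case AccessKind::NonConsecutive: {
    // Every lane needs its own index (e.g. for a scalarized access).
    std::vector<long> out;
    for (unsigned lane = 0; lane < vf; ++lane)
      out.push_back((first + static_cast<long>(lane)) * stride);
    return out;
  }
  }
  return {};
}
```

In this model the first two cases are "uniform" in the sense above: only one value is generated per vector iteration, regardless of `vf`.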
================
Comment at: test/Transforms/LoopVectorize/induction.ll:295
@@ -251,1 +294,3 @@
+}
+
; Make sure that the loop exit count computation does not overflow for i8 and
----------------
wmi wrote:
> Hi Matthew,
>
> One problem I see with marking a getelementptr as uniform when it is non-consecutive is:
>
> For the testcase here, if we don't enable interleaved memory accesses,
> we will generate a vectorized version of "%0 = shl nsw i64 %i, 2". However, with your patch "%0 = shl nsw i64 %i, 2" will also be marked as uniform because "%1 = getelementptr inbounds i32, i32* %a, i64 %0" is marked as uniform. These results contradict each other.
>
> Even if we generate a scalarized version of "%0 = shl nsw i64 %i, 2", the instruction cost for "%0 = shl nsw i64 %i, 2" should be VF. Marking it as uniform will lower its cost estimate to only 1.
>
> Thanks,
> Wei.
>
>
You're right - the GEP will only be uniform here if the loads/stores are in interleaved groups. This makes some sense to me because when interleaving we treat the pointer as if it were consecutive. Thanks! I will update the patch.
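To spell out the cost concern, here is a minimal sketch of the rule as discussed in this thread. It is a toy model with hypothetical names and a flat per-instruction unit cost, not the actual LoopVectorize cost model:

```cpp
#include <cassert>

// Toy model with hypothetical names; not the LoopVectorize cost model.

// Per the discussion above: a GEP feeding a memory access is treated as
// uniform when the access is consecutive, or when it belongs to an
// interleaved group (handled as if the pointer were consecutive).
bool gepIsUniform(bool isConsecutive, bool inInterleavedGroup) {
  return isConsecutive || inInterleavedGroup;
}

// Cost of a scalar instruction in one vector iteration: a uniform value
// is computed once, a scalarized non-uniform value once per lane.
unsigned scalarCost(bool isUniform, unsigned vf, unsigned unitCost) {
  assert(vf > 0 && "vectorization factor must be positive");
  return isUniform ? unitCost : vf * unitCost;
}
```

With VF = 4 and a unit cost of 1, the shl feeding a non-consecutive GEP outside an interleaved group costs 4 in this model; treating it as uniform would report 1, which is the underestimate Wei points out.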
https://reviews.llvm.org/D22867