D26905: [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.

Shahid, Asghar-ahmad via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 1 02:23:50 PST 2016


Yes, I used update_test_checks and this is an additional shuffle.

-----Original Message-----
From: Simon Pilgrim via Phabricator [mailto:reviews at reviews.llvm.org] 
Sent: Thursday, December 1, 2016 3:13 PM
To: Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com>; hfinkel at anl.gov; mssimpso at codeaurora.org; mkuper at google.com
Cc: llvm-dev at redking.me.uk; sanjoy at playingwithpointers.com; mzolotukhin at apple.com; llvm-commits at lists.llvm.org
Subject: [PATCH] D26905: [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.

RKSimon added inline comments.


================
Comment at: test/Transforms/SLPVectorizer/X86/reduction_loads.ll:20
 ; CHECK-NEXT:    [[TMP1:%.*]] = load <8 x i32>, <8 x i32>* [[TMP0]], align 4
-; CHECK-NEXT:    [[TMP2:%.*]] = mul <8 x i32> <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>, [[TMP1]]
+; CHECK-NEXT:    [[TMP2:%.*]] = shufflevector <8 x i32> [[TMP1]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
+; CHECK-NEXT:    [[TMP3:%.*]] = mul <8 x i32> <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>, [[TMP2]]
----------------
mkuper wrote:
> RKSimon wrote:
> > What can be done to avoid this regression?
> Ohh, right, wanted to ask about this as well.
> My guess is that this wasn't actually a regression, but that we moved the shuffle from the store side to the load side.  Is that right?
If the update_test_checks script has done its job and generated checks for all the IR, then this is an additional shuffle: I can't see an equivalent shuffle or set of extracts in the codegen on the left.
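
To make the diff above concrete, here is a minimal Python sketch of what the added shufflevector computes. It is a model of the two CHECK sequences, not LLVM code: the helper names (shufflevector, mul_splat) and the sample lane values are hypothetical, and an undef second operand is modelled as None.

```python
# Hypothetical model of the IR in the diff above; this is not LLVM code.

def shufflevector(v1, v2, mask):
    """Select lanes from the concatenation of v1 and v2 by index.

    v2 may be None to model an undef second operand; selecting a lane
    from it is treated as an error in this sketch.
    """
    n = len(v1)
    out = []
    for idx in mask:
        if idx < n:
            out.append(v1[idx])
        elif v2 is not None:
            out.append(v2[idx - n])
        else:
            raise ValueError("mask selects a lane from an undef operand")
    return out


def mul_splat(constant, vec):
    """Multiply every lane by the same constant (a splat operand)."""
    return [constant * x for x in vec]


# Mask from the new CHECK line: reverses the eight lanes.
REVERSE_MASK = [7, 6, 5, 4, 3, 2, 1, 0]

# Made-up stand-in for the <8 x i32> value loaded into [[TMP1]].
loaded = [10, 11, 12, 13, 14, 15, 16, 17]

# Old sequence: load, then mul.
before = mul_splat(42, loaded)

# New sequence: load, reverse shuffle, then mul.
after = mul_splat(42, shufflevector(loaded, None, REVERSE_MASK))
```

In this model the new sequence yields the same products as the old one with the lanes reversed, at the cost of one extra operation between the load and the mul. Whether later IR depends on that lane order is not visible in the quoted checks.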


https://reviews.llvm.org/D26905




