[PATCH] D73023: [DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 23 10:21:05 PST 2020
spatel added inline comments.
================
Comment at: llvm/test/CodeGen/X86/pr31956.ll:14
; CHECK-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0,1],mem[2,3,4,5,6,7]
-; CHECK-NEXT: vextractf128 $1, %ymm0, %xmm1
+; CHECK-NEXT: vmovaps G2+{{.*}}(%rip), %xmm1
; CHECK-NEXT: vshufps {{.*#+}} xmm0 = xmm1[0,2],xmm0[2,0]
----------------
I looked at this a bit, and I think this is ok. We are intentionally being aggressive about duplicating multi-use loads because eliminating the dependency on the preceding blend and reducing register pressure (assuming load-folding) is probably better for perf if this code is in a loop.
In this particular case, there seems to be an opportunity to commute the shufps masks in lowerShuffleWithSHUFPS() in the case where we create 2 shufps ops. I'm guessing that's a very rare occurrence, so I'm not sure if it's worth a TODO comment/bug report.
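To make the commute idea concrete, here is a rough standalone model, not the actual lowerShuffleWithSHUFPS() code. It assumes the 2-op pattern is a binary shufps followed by a unary shufps of that result; the helper names (shufps, twoShufps, twoShufpsCommuted) are made up for illustration. Swapping the operands of the first shufps swaps the two halves of the intermediate, so the second mask can compensate by flipping bit 1 of each index. My reading is that this would matter because only the last source operand of shufps can be a memory operand, but treat that motivation as an assumption.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<float, 4>;
using Mask = std::array<int, 4>;

// Scalar model of SHUFPS: the low two result lanes select from `a`,
// the high two result lanes select from `b`.
V4 shufps(const V4 &a, const V4 &b, const Mask &m) {
  return {a[m[0]], a[m[1]], b[m[2]], b[m[3]]};
}

// Assumed 2-op pattern: a binary shufps, then a unary permute of its result.
V4 twoShufps(const V4 &x, const V4 &y, const Mask &m1, const Mask &m2) {
  V4 t = shufps(x, y, m1);
  return shufps(t, t, m2);
}

// Same result with the first shufps commuted. The intermediate has its
// halves swapped, so each index of the second mask is XORed with 2.
V4 twoShufpsCommuted(const V4 &x, const V4 &y, const Mask &m1, const Mask &m2) {
  V4 t = shufps(y, x, Mask{m1[2], m1[3], m1[0], m1[1]});
  return shufps(t, t, Mask{m2[0] ^ 2, m2[1] ^ 2, m2[2] ^ 2, m2[3] ^ 2});
}

// Small self-check with illustrative masks (the second mask is hypothetical).
void selfCheck() {
  V4 x{0.f, 1.f, 2.f, 3.f}, y{4.f, 5.f, 6.f, 7.f};
  Mask m1{0, 2, 2, 0}, m2{2, 0, 3, 1};
  assert(twoShufps(x, y, m1, m2) == twoShufpsCommuted(x, y, m1, m2));
}
```

Calling selfCheck() from any main() exercises the equivalence for one mask pair. Whether the real lowering can always pick the operand order this way is a separate question that this sketch doesn't answer.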
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73023/new/
https://reviews.llvm.org/D73023