[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 17 09:30:03 PDT 2021
lebedev.ri added inline comments.
================
Comment at: llvm/test/CodeGen/X86/insertelement-ones.ll:311
define <16 x i8> @insert_v16i8_x123456789ABCDEx(<16 x i8> %a) {
; SSE2-LABEL: insert_v16i8_x123456789ABCDEx:
----------------
Here we have:
```
Optimized legalized selection DAG: %bb.0 'insert_v16i8_x123456789ABCDEx:'
SelectionDAG has 20 nodes:
t0: ch = EntryToken
t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
t19: v16i8 = and t2, t36
t20: v16i8 = X86ISD::ANDNP t36, t27
t21: v16i8 = or t19, t20
t33: v16i8 = X86ISD::VSHLDQ t27, TargetConstant:i8<15>
t45: v16i8 = or t21, t33
t12: ch,glue = CopyToReg t0, Register:v16i8 $xmm0, t45
t26: v4i32 = scalar_to_vector Constant:i32<255>
t27: v16i8 = bitcast t26
t38: i64 = X86ISD::Wrapper TargetConstantPool:i64<<16 x i8> <i8 0, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>> 0
t36: v16i8,ch = load<(load (s128) from constant-pool)> t0, t38, undef:i64
t13: ch = X86ISD::RET_FLAG t12, TargetConstant:i32<0>, Register:v16i8 $xmm0, t12:1
```
... so `matchBinaryShuffle()` again fails to omit the masking,
even though it is obviously redundant here for the reasons seen in D109726.
I suspect that is because around `scalar_to_vector` we operate on the i32 element type,
so we don't see all-ones elements until after the `bitcast` to v16i8.
Short of changing `computeKnownBits` to operate on a specified element width,
I'm not sure it can help us further, and that does not sound like the right fix.
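
To make the redundancy concrete, here is a rough byte-level model of the DAG above. It is a standalone sketch, not LLVM code: the helper names are made up for illustration, the upper lanes of `scalar_to_vector` (undefined in the DAG) are assumed to be zero, and the `bitcast` is assumed little-endian as on x86.
```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<std::uint8_t, 16>;
using V4i32 = std::array<std::uint32_t, 4>;

// t26: scalar_to_vector. Assumption: lanes 1..3 are modelled as zero,
// although the DAG leaves them undefined.
V4i32 scalarToVector(std::uint32_t scalar) {
  V4i32 v{};
  v[0] = scalar;
  return v;
}

// t27: bitcast v4i32 -> v16i8, little-endian byte order.
V16i8 bitcastToBytes(const V4i32 &v) {
  V16i8 out{};
  for (std::size_t lane = 0; lane < 4; ++lane)
    for (std::size_t b = 0; b < 4; ++b)
      out[4 * lane + b] = static_cast<std::uint8_t>(v[lane] >> (8 * b));
  return out;
}

// t33: X86ISD::VSHLDQ, a whole-register byte shift towards higher indices.
V16i8 vshldq(const V16i8 &src, std::size_t n) {
  V16i8 out{};
  for (std::size_t i = n; i < 16; ++i)
    out[i] = src[i - n];
  return out;
}

V16i8 vand(const V16i8 &x, const V16i8 &y) {
  V16i8 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = x[i] & y[i];
  return out;
}

// X86ISD::ANDNP x, y computes (~x) & y.
V16i8 vandnp(const V16i8 &x, const V16i8 &y) {
  V16i8 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = static_cast<std::uint8_t>(~x[i]) & y[i];
  return out;
}

V16i8 vor(const V16i8 &x, const V16i8 &y) {
  V16i8 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = x[i] | y[i];
  return out;
}

// t36: the constant-pool mask <0, -1, -1, ..., -1>.
V16i8 constantPoolMask() {
  V16i8 m{};
  m.fill(0xFF);
  m[0] = 0;
  return m;
}

// The DAG as printed: t45 = or (or (and t2, t36), (andnp t36, t27)), t33.
V16i8 dagWithMask(const V16i8 &a) {
  const V16i8 t27 = bitcastToBytes(scalarToVector(255));
  const V16i8 t36 = constantPoolMask();
  const V16i8 t21 = vor(vand(a, t36), vandnp(t36, t27));
  return vor(t21, vshldq(t27, 15));
}

// Same computation with the `and t2, t36` dropped.
V16i8 dagWithoutMask(const V16i8 &a) {
  const V16i8 t27 = bitcastToBytes(scalarToVector(255));
  const V16i8 t36 = constantPoolMask();
  return vor(vor(a, vandnp(t36, t27)), vshldq(t27, 15));
}
```
In this model both functions agree for any input, because byte 0 is OR'ed with 0xFF anyway. It also shows the element-width problem: as v4i32, lane 0 is 0x000000FF, which is not all-ones; only after the `bitcast` does byte 0 become an all-ones v16i8 element.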
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D109065/new/
https://reviews.llvm.org/D109065