[PATCH] D141776: [X86] `X86TargetLowering`: override `allowsMemoryAccess()`
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jan 14 14:11:50 PST 2023
lebedev.ri added a comment.
In D141776#4054185 <https://reviews.llvm.org/D141776#4054185>, @RKSimon wrote:
> can you improve the summary please? you about nt memops, but all the test changes don't have anything to do with them
I'm open to suggestions. As you can check yourself, with the follow-up patch (and without this patch),
in `llvm-project/llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll`
we endlessly try to combine non-temporal stores, only to split them back again:
Combining: t4164: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Creating constant: t4166: i64 = Constant<16>
Creating new node: t4167: i64 = add t2, Constant:i64<16>
Creating new node: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Creating new node: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Creating new node: t4170: ch = TokenFactor t4168:1, t4169:1
Creating new node: t4171: v8f32 = concat_vectors t4168, t4169
Replacing.1 t4164: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
With: t4171: v8f32 = concat_vectors t4168, t4169
and 1 other values
Legalizing: t4170: ch = TokenFactor t4168:1, t4169:1
Legal node: nothing to do
Combining: t4170: ch = TokenFactor t4168:1, t4169:1
Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation
Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing: t4167: i64 = add t2, Constant:i64<16>
Legal node: nothing to do
Combining: t4167: i64 = add t2, Constant:i64<16>
Legalizing: t4166: i64 = Constant<16>
Legal node: nothing to do
Combining: t4166: i64 = Constant<16>
Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing non-extending load operation
Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing: t4171: v8f32 = concat_vectors t4168, t4169
Trying custom legalization
Creating new node: t4172: v8f32 = undef
Creating constant: t4173: i64 = Constant<0>
Creating new node: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
Creating constant: t4175: i64 = Constant<4>
Creating new node: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Successfully custom legalized node
... replacing: t4171: v8f32 = concat_vectors t4168, t4169
with: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing non-extending load operation
Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation
Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legal node: nothing to do
Combining: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legalizing: t4175: i64 = Constant<4>
Legal node: nothing to do
Combining: t4175: i64 = Constant<4>
Legalizing: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
Legal node: nothing to do
Combining: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
Creating new node: t4177: v4f32 = undef
Legalizing: t4173: i64 = Constant<0>
Legal node: nothing to do
Combining: t4173: i64 = Constant<0>
Legalizing: t4172: v8f32 = undef
Legal node: nothing to do
Combining: t4172: v8f32 = undef
Legalizing: t4165: ch = store<(non-temporal store (s256) into %ir.a1)> t4163, t4176, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Trying custom lowering
Creating new node: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
Creating new node: t4179: i64 = add t4, Constant:i64<16>
Creating new node: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
Creating new node: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
Creating new node: t4182: ch = TokenFactor t4180, t4181
... replacing: t4165: ch = store<(non-temporal store (s256) into %ir.a1)> t4163, t4176, t4, undef:i64
with: t4182: ch = TokenFactor t4180, t4181
Legalizing: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legal node: nothing to do
Combining: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legalizing: t4182: ch = TokenFactor t4180, t4181
Legal node: nothing to do
Combining: t4182: ch = TokenFactor t4180, t4181
Legalizing: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store
Combining: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
Creating new node: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Creating new node: t4184: ch = TokenFactor t4163, t4183
Replacing.1 t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
With: t4184: ch = TokenFactor t4163, t4183
and 0 other values
Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation
Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing: t4184: ch = TokenFactor t4163, t4183
Legal node: nothing to do
Combining: t4184: ch = TokenFactor t4163, t4183
Legalizing: t4182: ch = TokenFactor t4180, t4184
Legal node: nothing to do
Combining: t4182: ch = TokenFactor t4180, t4184
Creating new node: t4185: ch = TokenFactor t4180, t4183
... into: t4185: ch = TokenFactor t4180, t4183
Legalizing: t4185: ch = TokenFactor t4180, t4183
Legal node: nothing to do
Combining: t4185: ch = TokenFactor t4180, t4183
Legalizing: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store
Combining: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Legalizing: t4179: i64 = add t4, Constant:i64<16>
Legal node: nothing to do
Combining: t4179: i64 = add t4, Constant:i64<16>
Legalizing: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store
Combining: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
Creating new node: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
Creating new node: t4187: ch = TokenFactor t4163, t4186
Replacing.1 t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
With: t4187: ch = TokenFactor t4163, t4186
and 0 other values
Legalizing: t4187: ch = TokenFactor t4163, t4186
Legal node: nothing to do
Combining: t4187: ch = TokenFactor t4163, t4186
Creating new node: t4188: ch = TokenFactor t4186, t4170
... into: t4188: ch = TokenFactor t4186, t4170
Legalizing: t4170: ch = TokenFactor t4168:1, t4169:1
Legal node: nothing to do
Combining: t4170: ch = TokenFactor t4168:1, t4169:1
Legalizing: t4188: ch = TokenFactor t4186, t4170
Legal node: nothing to do
Combining: t4188: ch = TokenFactor t4186, t4170
Creating new node: t4189: ch = TokenFactor t4186, t4169:1
... into: t4189: ch = TokenFactor t4186, t4169:1
Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing non-extending load operation
Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation
Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing: t4189: ch = TokenFactor t4186, t4169:1
Legal node: nothing to do
Combining: t4189: ch = TokenFactor t4186, t4169:1
Legalizing: t4185: ch = TokenFactor t4189, t4183
Legal node: nothing to do
Combining: t4185: ch = TokenFactor t4189, t4183
Creating new node: t4190: ch = TokenFactor t4183, t4186
... into: t4190: ch = TokenFactor t4183, t4186
Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation
Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store
Combining: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Legalizing: t4190: ch = TokenFactor t4183, t4186
Legal node: nothing to do
Combining: t4190: ch = TokenFactor t4183, t4186
Legalizing: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store
Combining: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
Legalizing: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
Legal node: nothing to do
Combining: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
... into: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation
Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4168, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store
Combining: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4168, t4, undef:i64
Creating new node: t4191: ch = TokenFactor t4168:1, t4169:1
Creating new node: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Creating new node: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
Replacing.1 t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4192:1, t4168, t4, undef:i64
With: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
and 0 other values
Replacing.1 t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4192:1, t4169, t4179, undef:i64
With: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
and 0 other values
Legalizing: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Legalizing non-extending load operation
Combining: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Creating constant: t4194: i64 = Constant<16>
Creating new node: t4195: i64 = add t2, Constant:i64<16>
Creating new node: t4196: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Creating new node: t4197: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4195, undef:i64
Creating new node: t4198: ch = TokenFactor t4196:1, t4197:1
Creating new node: t4199: v8f32 = concat_vectors t4196, t4197
Replacing.1 t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141776/new/
https://reviews.llvm.org/D141776
More information about the llvm-commits
mailing list