[PATCH] D141776: [X86] `X86TargetLowering`: override `allowsMemoryAccess()`

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jan 14 14:11:50 PST 2023


lebedev.ri added a comment.

In D141776#4054185 <https://reviews.llvm.org/D141776#4054185>, @RKSimon wrote:

> can you improve the summary please? you talk about nt memops, but none of the test changes have anything to do with them

I'm open to suggestions. As you can check for yourself, with the follow-up patch applied (and without this patch),
`llvm-project/llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll` never finishes compiling:
we endlessly combine the non-temporal loads and stores into 256-bit ones, only to split them back into 128-bit halves again:

  Combining: t4164: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
  Creating constant: t4166: i64 = Constant<16>
  Creating new node: t4167: i64 = add t2, Constant:i64<16>
  Creating new node: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  Creating new node: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Creating new node: t4170: ch = TokenFactor t4168:1, t4169:1
  Creating new node: t4171: v8f32 = concat_vectors t4168, t4169
  
  Replacing.1 t4164: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
  
  With: t4171: v8f32 = concat_vectors t4168, t4169
   and 1 other values
  
  Legalizing: t4170: ch = TokenFactor t4168:1, t4169:1
  Legal node: nothing to do
  
  Combining: t4170: ch = TokenFactor t4168:1, t4169:1
  
  Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  
  Legalizing: t4167: i64 = add t2, Constant:i64<16>
  Legal node: nothing to do
  
  Combining: t4167: i64 = add t2, Constant:i64<16>
  
  Legalizing: t4166: i64 = Constant<16>
  Legal node: nothing to do
  
  Combining: t4166: i64 = Constant<16>
  
  Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  
  Legalizing: t4171: v8f32 = concat_vectors t4168, t4169
  Trying custom legalization
  Creating new node: t4172: v8f32 = undef
  Creating constant: t4173: i64 = Constant<0>
  Creating new node: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
  Creating constant: t4175: i64 = Constant<4>
  Creating new node: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
  Successfully custom legalized node
   ... replacing: t4171: v8f32 = concat_vectors t4168, t4169
       with:      t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
  
  Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  
  Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  
  Legalizing: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
  Legal node: nothing to do
  
  Combining: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
  
  Legalizing: t4175: i64 = Constant<4>
  Legal node: nothing to do
  
  Combining: t4175: i64 = Constant<4>
  
  Legalizing: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
  Legal node: nothing to do
  
  Combining: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
  Creating new node: t4177: v4f32 = undef
  
  Legalizing: t4173: i64 = Constant<0>
  Legal node: nothing to do
  
  Combining: t4173: i64 = Constant<0>
  
  Legalizing: t4172: v8f32 = undef
  Legal node: nothing to do
  
  Combining: t4172: v8f32 = undef
  
  Legalizing: t4165: ch = store<(non-temporal store (s256) into %ir.a1)> t4163, t4176, t4, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Trying custom lowering
  Creating new node: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
  Creating new node: t4179: i64 = add t4, Constant:i64<16>
  Creating new node: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
  Creating new node: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
  Creating new node: t4182: ch = TokenFactor t4180, t4181
   ... replacing: t4165: ch = store<(non-temporal store (s256) into %ir.a1)> t4163, t4176, t4, undef:i64
       with:      t4182: ch = TokenFactor t4180, t4181
  
  Legalizing: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
  Legal node: nothing to do
  
  Combining: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
  
  Legalizing: t4182: ch = TokenFactor t4180, t4181
  Legal node: nothing to do
  
  Combining: t4182: ch = TokenFactor t4180, t4181
  
  Legalizing: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Legal store
  
  Combining: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
  Creating new node: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
  Creating new node: t4184: ch = TokenFactor t4163, t4183
  
  Replacing.1 t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
  
  With: t4184: ch = TokenFactor t4163, t4183
   and 0 other values
  
  Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  
  Legalizing: t4184: ch = TokenFactor t4163, t4183
  Legal node: nothing to do
  
  Combining: t4184: ch = TokenFactor t4163, t4183
  
  Legalizing: t4182: ch = TokenFactor t4180, t4184
  Legal node: nothing to do
  
  Combining: t4182: ch = TokenFactor t4180, t4184
  Creating new node: t4185: ch = TokenFactor t4180, t4183
   ... into: t4185: ch = TokenFactor t4180, t4183
  
  Legalizing: t4185: ch = TokenFactor t4180, t4183
  Legal node: nothing to do
  
  Combining: t4185: ch = TokenFactor t4180, t4183
  
  Legalizing: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Legal store
  
  Combining: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
  
  Legalizing: t4179: i64 = add t4, Constant:i64<16>
  Legal node: nothing to do
  
  Combining: t4179: i64 = add t4, Constant:i64<16>
  
  Legalizing: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Legal store
  
  Combining: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
  Creating new node: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
  Creating new node: t4187: ch = TokenFactor t4163, t4186
  
  Replacing.1 t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
  
  With: t4187: ch = TokenFactor t4163, t4186
   and 0 other values
  
  Legalizing: t4187: ch = TokenFactor t4163, t4186
  Legal node: nothing to do
  
  Combining: t4187: ch = TokenFactor t4163, t4186
  Creating new node: t4188: ch = TokenFactor t4186, t4170
   ... into: t4188: ch = TokenFactor t4186, t4170
  
  Legalizing: t4170: ch = TokenFactor t4168:1, t4169:1
  Legal node: nothing to do
  
  Combining: t4170: ch = TokenFactor t4168:1, t4169:1
  
  Legalizing: t4188: ch = TokenFactor t4186, t4170
  Legal node: nothing to do
  
  Combining: t4188: ch = TokenFactor t4186, t4170
  Creating new node: t4189: ch = TokenFactor t4186, t4169:1
   ... into: t4189: ch = TokenFactor t4186, t4169:1
  
  Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  
  Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  
  Legalizing: t4189: ch = TokenFactor t4186, t4169:1
  Legal node: nothing to do
  
  Combining: t4189: ch = TokenFactor t4186, t4169:1
  
  Legalizing: t4185: ch = TokenFactor t4189, t4183
  Legal node: nothing to do
  
  Combining: t4185: ch = TokenFactor t4189, t4183
  Creating new node: t4190: ch = TokenFactor t4183, t4186
   ... into: t4190: ch = TokenFactor t4183, t4186
  
  Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  
  Legalizing: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Legal store
  
  Combining: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
  
  Legalizing: t4190: ch = TokenFactor t4183, t4186
  Legal node: nothing to do
  
  Combining: t4190: ch = TokenFactor t4183, t4186
  
  Legalizing: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Legal store
  
  Combining: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
  
  Legalizing: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
  Legal node: nothing to do
  
  Combining: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
   ... into: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  
  Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
  
  Legalizing: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4168, t4, undef:i64
  Legalizing store operation
  Optimizing float store operations
  Legal store
  
  Combining: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4168, t4, undef:i64
  Creating new node: t4191: ch = TokenFactor t4168:1, t4169:1
  Creating new node: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
  Creating new node: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
  
  Replacing.1 t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4192:1, t4168, t4, undef:i64
  
  With: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
   and 0 other values
  
  Replacing.1 t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4192:1, t4169, t4179, undef:i64
  
  With: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
   and 0 other values
  
  Legalizing: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
  Legalizing non-extending load operation
  
  Combining: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
  Creating constant: t4194: i64 = Constant<16>
  Creating new node: t4195: i64 = add t2, Constant:i64<16>
  Creating new node: t4196: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
  Creating new node: t4197: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4195, undef:i64
  Creating new node: t4198: ch = TokenFactor t4196:1, t4197:1
  Creating new node: t4199: v8f32 = concat_vectors t4196, t4197
  
  Replacing.1 t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
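
To make the shape of that cycle easier to see, here is a minimal toy model of it. This is only a sketch and not LLVM code: an "access" is reduced to its width in bits, and `targetAllows`, `genericAllows`, `mergeStep`, `splitStep` and `runToFixpoint` are all hypothetical names invented for the illustration. It assumes, as the log suggests, that the merging side accepts a 256-bit non-temporal access while the lowering side only accepts 128-bit ones.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: each element is the width in bits of one non-temporal access.
using Accesses = std::vector<unsigned>;

// Hypothetical target rule: only accesses of up to 128 bits are legal.
bool targetAllows(unsigned Bits) { return Bits <= 128; }

// Hypothetical generic rule that never asks the target.
bool genericAllows(unsigned Bits) { return Bits <= 256; }

// Merge the first pair of adjacent equal-width accesses, provided the
// given predicate accepts the doubled width. Returns true on a rewrite.
bool mergeStep(Accesses &A, bool (*Allows)(unsigned)) {
  for (std::size_t I = 0; I + 1 < A.size(); ++I) {
    if (A[I] == A[I + 1] && Allows(A[I] * 2)) {
      A[I] *= 2;
      A.erase(A.begin() + I + 1);
      return true;
    }
  }
  return false;
}

// Split the first access that the target rejects into two halves.
// Returns true on a rewrite.
bool splitStep(Accesses &A) {
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (!targetAllows(A[I])) {
      unsigned Half = A[I] / 2;
      A[I] = Half;
      A.insert(A.begin() + I, Half);
      return true;
    }
  }
  return false;
}

// Apply merge and split until neither fires. Returns the number of
// rewrites performed, or -1 if the MaxSteps budget is exhausted, which
// in this model means the two rules keep undoing each other.
int runToFixpoint(Accesses A, bool (*MergeAllows)(unsigned), int MaxSteps) {
  assert(MaxSteps > 0 && "need a positive step budget");
  int Steps = 0;
  while (mergeStep(A, MergeAllows) || splitStep(A)) {
    if (++Steps >= MaxSteps)
      return -1;
  }
  return Steps;
}
```

In this model, `runToFixpoint({128, 128}, genericAllows, 1000)` exhausts its budget because the merge and the split disagree about 256-bit accesses, whereas passing `targetAllows` to the merge step reaches a fixpoint immediately. The real combiner has no such step budget, which is why the log above never ends.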


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141776/new/

https://reviews.llvm.org/D141776


