[PATCH] D16137: AVX512: VMOVDQU8/16/32/64 (load) intrinsic implementation.
Igor Breger via llvm-commits
llvm-commits at lists.llvm.org
Sun Jan 17 04:44:17 PST 2016
igorb added a comment.
In http://reviews.llvm.org/D16137#327385, @mbodart wrote:
> I'm not sure I understand why this is just now becoming an issue.
>
> Is the need for an X86-specific override of LowerOperationWrapper driven by an existing problem
> with SINT_TO_FP lowering, or by a problem that is only exposed when adding the new masked load intrinsics?
The problem is exposed only when adding the new masked load intrinsics. In the previous implementation only one result value was taken (the chain was dropped).
> Is the chain being dropped for both SINT_TO_FP and masked loads, or just one of them.
Only for SINT_TO_FP (or any other similar nodes).
> What are the safety consequences of dropping the chain, wrt losing an ordering dependence?
In the SINT_TO_FP case the store->load chain is preserved; as I understand it, the chain result of the LOAD node can be dropped.
> And why is I64 write mask legalization fine for most masked intrinsics, but not the masked loads?
> Is it because none of the other existing I64-masked intrinsics produce an additional chain result?
Yes, operand type legalization of the masked load intrinsic (packed bytes) is the first one that produces an additional chain result.
> A concrete example or two, showing the DAG snippets during legalization , would be helpful.
SINT_TO_FP DAG snippet
t6: f64 = sint_to_fp t5
  t5: i64 = build_pair t2, t4
    t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
      t0: ch = EntryToken
    t4: i32,ch = CopyFromReg t0, Register:i32 %vreg1
Transformed to
t13: f64,ch = X86ISD::FILD<LD8[FixedStack0]> t11, FrameIndex:i32<0>, ValueType:ch:i64
  t11: ch = store<ST8[FixedStack0](align=4)> t0, t5, FrameIndex:i32<0>, undef:i32
    t0: ch = EntryToken
    t5: i64 = build_pair t2, t4
      t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
      t4: i32,ch = CopyFromReg t0, Register:i32 %vreg1
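To make the result-count mismatch concrete, here is a minimal, self-contained C++ sketch. It does not use LLVM's API: ToyNode, takeFirstResultOnly and takeMatchingResults are hypothetical names, and a node is modelled only as the list of its result types. It contrasts keeping just the first result of a lowered node with keeping one lowered result per result of the original node.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a DAG node: just the list of its result types.
// (Hypothetical type, not LLVM's SDNode.)
struct ToyNode {
  std::vector<std::string> results;
};

// Models a wrapper that keeps only result 0 of the lowered node,
// so any further result (such as a chain) is lost.
std::vector<std::string> takeFirstResultOnly(const ToyNode &lowered) {
  assert(!lowered.results.empty() && "lowered node must produce a value");
  return {lowered.results.front()};
}

// Models a wrapper that keeps one lowered result per original result.
// Extra trailing results of the lowered node (e.g. a new chain) are dropped.
std::vector<std::string> takeMatchingResults(const ToyNode &original,
                                             const ToyNode &lowered) {
  assert(original.results.size() <= lowered.results.size() &&
         "lowering returned too few results");
  return std::vector<std::string>(
      lowered.results.begin(),
      lowered.results.begin() +
          static_cast<std::ptrdiff_t>(original.results.size()));
}
```

For sint_to_fp (one result, f64) lowered to X86ISD::FILD (f64,ch), both helpers return only f64. For a node with results v64i8,ch lowered to another node with results v64i8,ch, the first helper returns only v64i8, while the second returns v64i8 and ch.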
--------------------------------------------------------
masked loads snippet
t12: v64i8,ch = llvm.x86.avx512.mask.loadu.b.512<LD64[%x0](align=1)> t0, TargetConstant:i32<4681>, t3, t5, t17
  t0: ch = EntryToken
  t3: i32,ch = load<LD4[FixedStack-1](align=16)> t0, FrameIndex:i32<-1>, undef:i32
  t5: v64i8,ch = CopyFromReg t0, Register:v64i8 %vreg0
  t17: i64,ch = load<LD8[FixedStack-2](align=4)> t0, FrameIndex:i32<-2>, undef:i32
Transformed to
t30: v64i8,ch = masked_load<LD64[%x0](align=1)> t0, t3, t29, t5
  t0: ch = EntryToken
  t3: i32,ch = load<LD4[FixedStack-1](align=16)> t0, FrameIndex:i32<-1>, undef:i32
  t29: v64i1 = concat_vectors t27, t28
    t27: v32i1 = bitcast t24
      t24: i32 = extract_element t17, Constant:i32<0>
        t17: i64,ch = load<LD8[FixedStack-2](align=4)> t0, FrameIndex:i32<-2>, undef:i32
    t28: v32i1 = bitcast t26
      t26: i32 = extract_element t17, Constant:i32<1>
  t5: v64i8,ch = CopyFromReg t0, Register:v64i8 %vreg0
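In the transformed DAG the i64 mask (t17) is split into two i32 elements, each is bitcast to v32i1, and the two halves are concatenated into a v64i1 mask. Below is a small C++ sketch of the same bit-level idea on plain integers. The helper names splitMask, concatMask and legalizeMask are hypothetical, and treating element 0 as the low 32 bits is an assumption (little-endian element order), not something the dump states.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <utility>

// Split a 64-bit mask into two 32-bit halves.
// Assumption: element 0 is the low half, element 1 the high half.
std::pair<std::uint32_t, std::uint32_t> splitMask(std::uint64_t mask) {
  auto lo = static_cast<std::uint32_t>(mask & 0xFFFFFFFFu);
  auto hi = static_cast<std::uint32_t>(mask >> 32);
  return {lo, hi};
}

// Rebuild a 64-lane mask: lanes 0..31 come from lo, lanes 32..63 from hi.
std::bitset<64> concatMask(std::uint32_t lo, std::uint32_t hi) {
  std::bitset<64> lanes;
  for (int i = 0; i < 32; ++i) {
    lanes[i] = (lo >> i) & 1u;
    lanes[32 + i] = (hi >> i) & 1u;
  }
  return lanes;
}

// Split then concatenate; the per-lane mask must equal the original bits.
std::bitset<64> legalizeMask(std::uint64_t mask) {
  auto halves = splitMask(mask);
  std::bitset<64> lanes = concatMask(halves.first, halves.second);
  assert(lanes.to_ullong() == mask && "split/concat must preserve every lane");
  return lanes;
}
```

The point of the sketch is only that the split and concatenation are lossless: every byte lane of the v64i8 load still sees the same mask bit it had in the original i64 operand.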
================
Comment at: lib/Target/X86/X86InstrAVX512.td:2753
@@ -2752,11 +2752,3 @@
HasAVX512>, XS, VEX_W, EVEX_CD8<64, CD8VF>;
def: Pat<(int_x86_avx512_mask_storeu_d_512 addr:$ptr, (v16i32 VR512:$src),
----------------
mbodart wrote:
> Can you please explain why these patterns are being deleted?
These intrinsics are handled by the DAG legalization pass (X86ISelLowering.cpp, lowerINTRINSIC_W_CHAIN() function).
Repository:
rL LLVM
http://reviews.llvm.org/D16137