<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Ah, gotcha. I'd missed that "extload" here explicitly means
"aextload" (i.e. any-extend). I agree that an any-extend
variant of this pattern seems a lot less likely. The only case
where I can see it happening is if the vector op had been widened
beyond the interesting data type and we were going to ignore
the high bits in the end.<br>
</p>
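<p>A rough scalar model of the distinction, purely as a sketch: this
is not LLVM code, the type and helper names are hypothetical, and it
assumes a little-endian layout with a 32-bit unsigned int. It shows
why a wide load is only a valid stand-in for the any-extend case when
nothing downstream reads the upper bits.</p>
<pre>
```cpp
// Scalar model only; the type and function names below are hypothetical.
// Assumes unsigned int is 32 bits wide.
typedef unsigned char u8;
typedef unsigned int u32;

// Four naturally aligned bytes; bytes[0] is the i8 we actually care about.
struct Word { u8 bytes[4]; };

// zextload i8 to i32: the upper 24 bits are guaranteed to be zero.
u32 zext_load_i8(Word w) { return w.bytes[0]; }

// One legal lowering of an any-extend load when the address is 4-byte
// aligned: read all 32 bits (little-endian) and keep the upper bits as found.
u32 anyext_load_i8_wide(Word w) {
  return (u32)w.bytes[0] + (u32)w.bytes[1] * 0x100u +
         (u32)w.bytes[2] * 0x10000u + (u32)w.bytes[3] * 0x1000000u;
}

// A consumer that only looks at the low 8 bits of each broadcast lane.
u32 low8(u32 lane) { return lane % 0x100u; }
```
</pre>
<p>In this model the two loads disagree whenever the neighbouring bytes
are nonzero, but they agree once the result is passed through low8,
which is the "ignoring the high bits" situation described above.</p>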
<p>Philip<br>
</p>
<div class="moz-cite-prefix">On 3/2/20 12:36 PM, Craig Topper wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAF7ks-Ocoh4xfOjezbcP3GDKiNOa_RwVostvSLUhCLo7p_Fjsg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>This was specifically looking for extload, not
zextload/sextload. So the SelectionDAG said: do a 16-bit or
8-bit load, extend it however you like to i32, and broadcast
those 32 bits. The patterns I removed recognized that the load
was aligned and that the upper bits of the i32 elements were
allowed to be garbage, so they just loaded 32 bits and
broadcast them.</div>
<div><br>
</div>
<div>In your example, the upper bits of the i64 elements are
expected to be 0, right?</div>
<div><br clear="all">
<div>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">~Craig</div>
</div>
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Mar 2, 2020 at 12:27
PM Philip Reames &lt;<a
href="mailto:listmail@philipreames.com"
moz-do-not-send="true">listmail@philipreames.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Craig,<br>
<br>
I might not be understanding you correctly, but on the
surface, this <br>
seems like a fairly common case. Wouldn't something like the
following <br>
trigger this?<br>
<br>
struct T {<br>
uint64_t j;<br>
uint8_t k;<br>
};<br>
<br>
void foo(uint64_t *a, struct T& t) {<br>
for (int i = 0; i < N; i++) {<br>
a[i] += (uint64_t)t.k;<br>
}<br>
}<br>
<br>
Given an 8-byte alignment of objects and a packed layout, the
8-bit <br>
field would have an 8-byte starting alignment. After
vectorization, I'd <br>
expect to see a load of the field outside the loop followed by
an extend <br>
and broadcast to VF x i64. Wouldn't that create exactly the
pattern you <br>
removed?<br>
<br>
Philip<br>
<br>
On 2/28/20 4:39 PM, Craig Topper via llvm-commits wrote:<br>
> Author: Craig Topper<br>
> Date: 2020-02-28T16:39:27-08:00<br>
> New Revision: 9fcd212e2f678fdbdf304399a1e58ca490dc54d1<br>
><br>
> URL: <a
href="https://github.com/llvm/llvm-project/commit/9fcd212e2f678fdbdf304399a1e58ca490dc54d1"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://github.com/llvm/llvm-project/commit/9fcd212e2f678fdbdf304399a1e58ca490dc54d1</a><br>
> DIFF: <a
href="https://github.com/llvm/llvm-project/commit/9fcd212e2f678fdbdf304399a1e58ca490dc54d1.diff"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://github.com/llvm/llvm-project/commit/9fcd212e2f678fdbdf304399a1e58ca490dc54d1.diff</a><br>
><br>
> LOG: [X86] Remove isel patterns from broadcast of
loadi32.<br>
><br>
> We already combine non extending loads with broadcasts in
DAG<br>
> combine. All these patterns are picking up is the aligned
extload<br>
> special case. But the only lit test we have that
exercises it is<br>
> using v8i1 load that datalayout is reporting align 8 for.
That<br>
> seems generous. So without a realistic test case I don't
think<br>
> there is much value in these patterns.<br>
><br>
> Added:<br>
> <br>
><br>
> Modified:<br>
> llvm/lib/Target/X86/X86InstrAVX512.td<br>
> llvm/lib/Target/X86/X86InstrSSE.td<br>
> llvm/test/CodeGen/X86/vector-sext.ll<br>
><br>
> Removed:<br>
> <br>
><br>
><br>
>
################################################################################<br>
> diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td
b/llvm/lib/Target/X86/X86InstrAVX512.td<br>
> index a2bd6a2853a0..1d3ef67c9d3d 100644<br>
> --- a/llvm/lib/Target/X86/X86InstrAVX512.td<br>
> +++ b/llvm/lib/Target/X86/X86InstrAVX512.td<br>
> @@ -1427,10 +1427,6 @@ let Predicates = [HasAVX512] in {<br>
> // 32-bit targets will fail to load a i64 directly
but can use ZEXT_LOAD.<br>
> def : Pat<(v8i64 (X86VBroadcast (v2i64
(X86vzload64 addr:$src)))),<br>
> (VPBROADCASTQZrm addr:$src)>;<br>
> -<br>
> - // FIXME this is to handle aligned extloads from i8.<br>
> - def : Pat<(v16i32 (X86VBroadcast (loadi32
addr:$src))),<br>
> - (VPBROADCASTDZrm addr:$src)>;<br>
> }<br>
> <br>
> let Predicates = [HasVLX] in {<br>
> @@ -1439,12 +1435,6 @@ let Predicates = [HasVLX] in {<br>
> (VPBROADCASTQZ128rm addr:$src)>;<br>
> def : Pat<(v4i64 (X86VBroadcast (v2i64
(X86vzload64 addr:$src)))),<br>
> (VPBROADCASTQZ256rm addr:$src)>;<br>
> -<br>
> - // FIXME this is to handle aligned extloads from i8.<br>
> - def : Pat<(v4i32 (X86VBroadcast (loadi32
addr:$src))),<br>
> - (VPBROADCASTDZ128rm addr:$src)>;<br>
> - def : Pat<(v8i32 (X86VBroadcast (loadi32
addr:$src))),<br>
> - (VPBROADCASTDZ256rm addr:$src)>;<br>
> }<br>
> let Predicates = [HasVLX, HasBWI] in {<br>
> // loadi16 is tricky to fold, because
!isTypeDesirableForOp, justifiably.<br>
><br>
> diff --git a/llvm/lib/Target/X86/X86InstrSSE.td
b/llvm/lib/Target/X86/X86InstrSSE.td<br>
> index e66f15747787..73bba723ab96 100644<br>
> --- a/llvm/lib/Target/X86/X86InstrSSE.td<br>
> +++ b/llvm/lib/Target/X86/X86InstrSSE.td<br>
> @@ -7529,12 +7529,6 @@ let Predicates = [HasAVX2, NoVLX]
in {<br>
> (VPBROADCASTQrm addr:$src)>;<br>
> def : Pat<(v4i64 (X86VBroadcast (v2i64
(X86vzload64 addr:$src)))),<br>
> (VPBROADCASTQYrm addr:$src)>;<br>
> -<br>
> - // FIXME this is to handle aligned extloads from
i8/i16.<br>
> - def : Pat<(v4i32 (X86VBroadcast (loadi32
addr:$src))),<br>
> - (VPBROADCASTDrm addr:$src)>;<br>
> - def : Pat<(v8i32 (X86VBroadcast (loadi32
addr:$src))),<br>
> - (VPBROADCASTDYrm addr:$src)>;<br>
> }<br>
> let Predicates = [HasAVX2, NoVLX_Or_NoBWI] in {<br>
> // loadi16 is tricky to fold, because
!isTypeDesirableForOp, justifiably.<br>
><br>
> diff --git a/llvm/test/CodeGen/X86/vector-sext.ll
b/llvm/test/CodeGen/X86/vector-sext.ll<br>
> index 44ba29d978e2..0b35db5cadb2 100644<br>
> --- a/llvm/test/CodeGen/X86/vector-sext.ll<br>
> +++ b/llvm/test/CodeGen/X86/vector-sext.ll<br>
> @@ -2259,7 +2259,8 @@ define <8 x i32>
@load_sext_8i1_to_8i32(<8 x i1> *%ptr) {<br>
> ;<br>
> ; AVX2-LABEL: load_sext_8i1_to_8i32:<br>
> ; AVX2: # %bb.0: # %entry<br>
> -; AVX2-NEXT: vpbroadcastd (%rdi), %ymm0<br>
> +; AVX2-NEXT: vmovd {{.*#+}} xmm0 =
mem[0],zero,zero,zero<br>
> +; AVX2-NEXT: vpbroadcastd %xmm0, %ymm0<br>
> ; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 =
[1,2,4,8,16,32,64,128]<br>
> ; AVX2-NEXT: vpand %ymm1, %ymm0, %ymm0<br>
> ; AVX2-NEXT: vpcmpeqd %ymm1, %ymm0, %ymm0<br>
><br>
><br>
> <br>
</blockquote>
</div>
</blockquote>
</body>
</html>