[llvm] r296260 - [ExecutionDepsFix] Don't make copies of LiveReg objects when collecting operands for soft instructions

Craig Topper via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 27 11:11:21 PST 2017


Well the k1 value should correspond to the <16 x i1> undef input to the
select instruction. Maybe something changed about undef propagation after
4.0?

> +define void @f_f___un_3C_unf_3E_un_3C_unf_3E_() {
> +; CHECK-LABEL: f_f___un_3C_unf_3E_un_3C_unf_3E_:
> +; CHECK:       # BB#0:
> +; CHECK-NEXT:    vmovapd 0, %zmm0
> +; CHECK-NEXT:    vmovapd 64, %zmm1
> +; CHECK-NEXT:    vmovapd {{.*#+}} zmm2 = [0,16,0,16,0,16,0,16,0,16,0,
16,0,16,0,16]
> +; CHECK-NEXT:    vorpd %zmm2, %zmm0, %zmm0 {%k1}
> +; CHECK-NEXT:    vorpd %zmm2, %zmm1, %zmm1 {%k1}
> +; CHECK-NEXT:    vmovapd %zmm1, 64
> +; CHECK-NEXT:    vmovapd %zmm0, 0
> +; CHECK-NEXT:    retl
> +  %a_load22 = load <16 x i64>, <16 x i64>* null, align 1
> +  %bitop = or <16 x i64> %a_load22, <i64 68719476736, i64 68719476736,
i64 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
68719476736, i64 68719476736>
> +  %v.i = load <16 x i64>, <16 x i64>* null
> +  %v1.i41 = select <16 x i1> undef, <16 x i64> %bitop, <16 x i64> %v.i
> +  store <16 x i64> %v1.i41, <16 x i64>* null
> +  ret void
> +}

~Craig

On Mon, Feb 27, 2017 at 9:48 AM, Hans Wennborg <hans at chromium.org> wrote:

> Hi Craig,
>
> When I try to merge this to 4.0, I get an extra "kshiftrw" in the asm
> for the test case:
>
>         vmovapd 0, %zmm0
>         vmovapd 64, %zmm1
>         vmovapd .LCPI0_0, %zmm2         # zmm2 =
> [0,16,0,16,0,16,0,16,0,16,0,16,0,16,0,16]
>         kshiftrw        $8, %k0, %k1   <--------------------------
>         vorpd   %zmm2, %zmm1, %zmm1 {%k1}
>         vorpd   %zmm2, %zmm0, %zmm0 {%k1}
>         vmovapd %zmm0, 0
>         vmovapd %zmm1, 64
>         retl
>
> I'm not familiar with AVX-512, but it seems like that is setting up
> the %k1 mask used by the or-instructions below.
>
> Is it benign? How does the trunk version work - is %k1 not initialized
> there?
>
> Thanks,
> Hans
>
>
> On Sat, Feb 25, 2017 at 10:12 AM, Craig Topper via llvm-commits
> <llvm-commits at lists.llvm.org> wrote:
> > Author: ctopper
> > Date: Sat Feb 25 12:12:25 2017
> > New Revision: 296260
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=296260&view=rev
> > Log:
> > [ExecutionDepsFix] Don't make copies of LiveReg objects when collecting
> operands for soft instructions
> >
> > Summary:
> > While collecting operands we make copies of the LiveReg objects which
> are stored in the LiveRegs array. If the instruction uses the same register
> multiple times we end up with multiple copies. Later we iterate through the
> collected list of LiveReg objects and merge DomainValues. In the process of
> doing this the merge function can change the contents of the original
> LiveReg object in the LiveRegs array, but not the copies that have been
> made. So when we get to the second usage of the register we end up seeing a
> stale copy of the LiveReg object.
> >
> > To fix this I've stopped copying and now just store a pointer to the
> original LiveReg object. Another option might be to avoid adding the same
> register to the Regs array twice, but this approach seemed simpler.
> >
> > The included test case exposes this bug due to an AVX-512 masked OR
> instruction using the same register for the passthru operand and one of the
> inputs to the OR operation.
> >
> > Fixes PR30284.
> >
> > Reviewers: RKSimon, stoklund, MatzeB, spatel, myatsina
> >
> > Reviewed By: RKSimon
> >
> > Subscribers: llvm-commits
> >
> > Differential Revision: https://reviews.llvm.org/D30242
> >
> > Added:
> >     llvm/trunk/test/CodeGen/X86/pr30284.ll
> > Modified:
> >     llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
> >
> > Modified: llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/
> CodeGen/ExecutionDepsFix.cpp?rev=296260&r1=296259&r2=296260&view=diff
> > ============================================================
> ==================
> > --- llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp (original)
> > +++ llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp Sat Feb 25 12:12:25 2017
> > @@ -721,7 +721,7 @@ void ExeDepsFix::visitSoftInstr(MachineI
> >
> >    // Kill off any remaining uses that don't match available, and build
> a list of
> >    // incoming DomainValues that we want to merge.
> > -  SmallVector<LiveReg, 4> Regs;
> > +  SmallVector<const LiveReg *, 4> Regs;
> >    for (int rx : used) {
> >      assert(LiveRegs && "no space allocated for live registers");
> >      const LiveReg &LR = LiveRegs[rx];
> > @@ -731,16 +731,11 @@ void ExeDepsFix::visitSoftInstr(MachineI
> >        continue;
> >      }
> >      // Sorted insertion.
> > -    bool Inserted = false;
> > -    for (SmallVectorImpl<LiveReg>::iterator i = Regs.begin(), e =
> Regs.end();
> > -           i != e && !Inserted; ++i) {
> > -      if (LR.Def < i->Def) {
> > -        Inserted = true;
> > -        Regs.insert(i, LR);
> > -      }
> > -    }
> > -    if (!Inserted)
> > -      Regs.push_back(LR);
> > +    auto I = std::upper_bound(Regs.begin(), Regs.end(), &LR,
> > +                              [](const LiveReg *LHS, const LiveReg
> *RHS) {
> > +                                return LHS->Def < RHS->Def;
> > +                              });
> > +    Regs.insert(I, &LR);
> >    }
> >
> >    // doms are now sorted in order of appearance. Try to merge them all,
> giving
> > @@ -748,14 +743,14 @@ void ExeDepsFix::visitSoftInstr(MachineI
> >    DomainValue *dv = nullptr;
> >    while (!Regs.empty()) {
> >      if (!dv) {
> > -      dv = Regs.pop_back_val().Value;
> > +      dv = Regs.pop_back_val()->Value;
> >        // Force the first dv to match the current instruction.
> >        dv->AvailableDomains = dv->getCommonDomains(available);
> >        assert(dv->AvailableDomains && "Domain should have been
> filtered");
> >        continue;
> >      }
> >
> > -    DomainValue *Latest = Regs.pop_back_val().Value;
> > +    DomainValue *Latest = Regs.pop_back_val()->Value;
> >      // Skip already merged values.
> >      if (Latest == dv || Latest->Next)
> >        continue;
> >
> > Added: llvm/trunk/test/CodeGen/X86/pr30284.ll
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
> CodeGen/X86/pr30284.ll?rev=296260&view=auto
> > ============================================================
> ==================
> > --- llvm/trunk/test/CodeGen/X86/pr30284.ll (added)
> > +++ llvm/trunk/test/CodeGen/X86/pr30284.ll Sat Feb 25 12:12:25 2017
> > @@ -0,0 +1,21 @@
> > +; NOTE: Assertions have been autogenerated by
> utils/update_llc_test_checks.py
> > +; RUN: llc < %s -mtriple=i386-unknown-linux-gnu -mattr=avx512dq |
> FileCheck %s
> > +
> > +define void @f_f___un_3C_unf_3E_un_3C_unf_3E_() {
> > +; CHECK-LABEL: f_f___un_3C_unf_3E_un_3C_unf_3E_:
> > +; CHECK:       # BB#0:
> > +; CHECK-NEXT:    vmovapd 0, %zmm0
> > +; CHECK-NEXT:    vmovapd 64, %zmm1
> > +; CHECK-NEXT:    vmovapd {{.*#+}} zmm2 = [0,16,0,16,0,16,0,16,0,16,0,
> 16,0,16,0,16]
> > +; CHECK-NEXT:    vorpd %zmm2, %zmm0, %zmm0 {%k1}
> > +; CHECK-NEXT:    vorpd %zmm2, %zmm1, %zmm1 {%k1}
> > +; CHECK-NEXT:    vmovapd %zmm1, 64
> > +; CHECK-NEXT:    vmovapd %zmm0, 0
> > +; CHECK-NEXT:    retl
> > +  %a_load22 = load <16 x i64>, <16 x i64>* null, align 1
> > +  %bitop = or <16 x i64> %a_load22, <i64 68719476736, i64 68719476736,
> i64 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
> 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
> 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
> 68719476736, i64 68719476736>
> > +  %v.i = load <16 x i64>, <16 x i64>* null
> > +  %v1.i41 = select <16 x i1> undef, <16 x i64> %bitop, <16 x i64> %v.i
> > +  store <16 x i64> %v1.i41, <16 x i64>* null
> > +  ret void
> > +}
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170227/45a2a539/attachment.html>


More information about the llvm-commits mailing list