[llvm] r296260 - [ExecutionDepsFix] Don't make copies of LiveReg objects when collecting operands for soft instructions

Hans Wennborg via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 27 13:06:01 PST 2017


Ah, I didn't see that it was selecting on undef.

Sounds like the extra instruction is benign, then.

I've merged in r296380.

Thanks for fixing this!

 - Hans

On Mon, Feb 27, 2017 at 11:11 AM, Craig Topper <craig.topper at gmail.com> wrote:
> Well the k1 value should correspond to the <16 x i1> undef input to the
> select instruction. Maybe something changed about undef propagation after
> 4.0?
>
>> +define void @f_f___un_3C_unf_3E_un_3C_unf_3E_() {
>> +; CHECK-LABEL: f_f___un_3C_unf_3E_un_3C_unf_3E_:
>> +; CHECK:       # BB#0:
>> +; CHECK-NEXT:    vmovapd 0, %zmm0
>> +; CHECK-NEXT:    vmovapd 64, %zmm1
>> +; CHECK-NEXT:    vmovapd {{.*#+}} zmm2 =
>> [0,16,0,16,0,16,0,16,0,16,0,16,0,16,0,16]
>> +; CHECK-NEXT:    vorpd %zmm2, %zmm0, %zmm0 {%k1}
>> +; CHECK-NEXT:    vorpd %zmm2, %zmm1, %zmm1 {%k1}
>> +; CHECK-NEXT:    vmovapd %zmm1, 64
>> +; CHECK-NEXT:    vmovapd %zmm0, 0
>> +; CHECK-NEXT:    retl
>> +  %a_load22 = load <16 x i64>, <16 x i64>* null, align 1
>> +  %bitop = or <16 x i64> %a_load22, <i64 68719476736, i64 68719476736,
>> i64 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
>> 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
>> 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
>> 68719476736, i64 68719476736>
>> +  %v.i = load <16 x i64>, <16 x i64>* null
>> +  %v1.i41 = select <16 x i1> undef, <16 x i64> %bitop, <16 x i64> %v.i
>> +  store <16 x i64> %v1.i41, <16 x i64>* null
>> +  ret void
>> +}
>
> ~Craig
>
> On Mon, Feb 27, 2017 at 9:48 AM, Hans Wennborg <hans at chromium.org> wrote:
>>
>> Hi Craig,
>>
>> When I try to merge this to 4.0, I get an extra "kshiftrw" in the asm
>> for the test case:
>>
>>         vmovapd 0, %zmm0
>>         vmovapd 64, %zmm1
>>         vmovapd .LCPI0_0, %zmm2         # zmm2 =
>> [0,16,0,16,0,16,0,16,0,16,0,16,0,16,0,16]
>>         kshiftrw        $8, %k0, %k1   <--------------------------
>>         vorpd   %zmm2, %zmm1, %zmm1 {%k1}
>>         vorpd   %zmm2, %zmm0, %zmm0 {%k1}
>>         vmovapd %zmm0, 0
>>         vmovapd %zmm1, 64
>>         retl
>>
>> I'm not familiar with AVX-512, but it seems like that is setting up
>> the %k1 mask used by the or-instructions below.
>>
>> Is it benign? How does the trunk version work - is %k1 not initialized
>> there?
>>
>> Thanks,
>> Hans
>>
>>
>> On Sat, Feb 25, 2017 at 10:12 AM, Craig Topper via llvm-commits
>> <llvm-commits at lists.llvm.org> wrote:
>> > Author: ctopper
>> > Date: Sat Feb 25 12:12:25 2017
>> > New Revision: 296260
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=296260&view=rev
>> > Log:
>> > [ExecutionDepsFix] Don't make copies of LiveReg objects when collecting
>> > operands for soft instructions
>> >
>> > Summary:
>> > While collecting operands we make copies of the LiveReg objects which
>> > are stored in the LiveRegs array. If the instruction uses the same register
>> > multiple times we end up with multiple copies. Later we iterate through the
>> > collected list of LiveReg objects and merge DomainValues. In the process of
>> > doing this the merge function can change the contents of the original
>> > LiveReg object in the LiveRegs array, but not the copies that have been
>> > made. So when we get to the second usage of the register we end up seeing a
>> > stale copy of the LiveReg object.
>> >
>> > To fix this I've stopped copying and now just store a pointer to the
>> > original LiveReg object. Another option might be to avoid adding the same
>> > register to the Regs array twice, but this approach seemed simpler.
>> >
>> > The included test case exposes this bug due to an AVX-512 masked OR
>> > instruction using the same register for the passthru operand and one of the
>> > inputs to the OR operation.
>> >
>> > Fixes PR30284.
>> >
>> > Reviewers: RKSimon, stoklund, MatzeB, spatel, myatsina
>> >
>> > Reviewed By: RKSimon
>> >
>> > Subscribers: llvm-commits
>> >
>> > Differential Revision: https://reviews.llvm.org/D30242
>> >
>> > Added:
>> >     llvm/trunk/test/CodeGen/X86/pr30284.ll
>> > Modified:
>> >     llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
>> >
>> > Modified: llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
>> > URL:
>> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp?rev=296260&r1=296259&r2=296260&view=diff
>> >
>> > ==============================================================================
>> > --- llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp (original)
>> > +++ llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp Sat Feb 25 12:12:25 2017
>> > @@ -721,7 +721,7 @@ void ExeDepsFix::visitSoftInstr(MachineI
>> >
>> >    // Kill off any remaining uses that don't match available, and build
>> > a list of
>> >    // incoming DomainValues that we want to merge.
>> > -  SmallVector<LiveReg, 4> Regs;
>> > +  SmallVector<const LiveReg *, 4> Regs;
>> >    for (int rx : used) {
>> >      assert(LiveRegs && "no space allocated for live registers");
>> >      const LiveReg &LR = LiveRegs[rx];
>> > @@ -731,16 +731,11 @@ void ExeDepsFix::visitSoftInstr(MachineI
>> >        continue;
>> >      }
>> >      // Sorted insertion.
>> > -    bool Inserted = false;
>> > -    for (SmallVectorImpl<LiveReg>::iterator i = Regs.begin(), e =
>> > Regs.end();
>> > -           i != e && !Inserted; ++i) {
>> > -      if (LR.Def < i->Def) {
>> > -        Inserted = true;
>> > -        Regs.insert(i, LR);
>> > -      }
>> > -    }
>> > -    if (!Inserted)
>> > -      Regs.push_back(LR);
>> > +    auto I = std::upper_bound(Regs.begin(), Regs.end(), &LR,
>> > +                              [](const LiveReg *LHS, const LiveReg
>> > *RHS) {
>> > +                                return LHS->Def < RHS->Def;
>> > +                              });
>> > +    Regs.insert(I, &LR);
>> >    }
>> >
>> >    // doms are now sorted in order of appearance. Try to merge them all,
>> > giving
>> > @@ -748,14 +743,14 @@ void ExeDepsFix::visitSoftInstr(MachineI
>> >    DomainValue *dv = nullptr;
>> >    while (!Regs.empty()) {
>> >      if (!dv) {
>> > -      dv = Regs.pop_back_val().Value;
>> > +      dv = Regs.pop_back_val()->Value;
>> >        // Force the first dv to match the current instruction.
>> >        dv->AvailableDomains = dv->getCommonDomains(available);
>> >        assert(dv->AvailableDomains && "Domain should have been
>> > filtered");
>> >        continue;
>> >      }
>> >
>> > -    DomainValue *Latest = Regs.pop_back_val().Value;
>> > +    DomainValue *Latest = Regs.pop_back_val()->Value;
>> >      // Skip already merged values.
>> >      if (Latest == dv || Latest->Next)
>> >        continue;
>> >
>> > Added: llvm/trunk/test/CodeGen/X86/pr30284.ll
>> > URL:
>> > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pr30284.ll?rev=296260&view=auto
>> >
>> > ==============================================================================
>> > --- llvm/trunk/test/CodeGen/X86/pr30284.ll (added)
>> > +++ llvm/trunk/test/CodeGen/X86/pr30284.ll Sat Feb 25 12:12:25 2017
>> > @@ -0,0 +1,21 @@
>> > +; NOTE: Assertions have been autogenerated by
>> > utils/update_llc_test_checks.py
>> > +; RUN: llc < %s -mtriple=i386-unknown-linux-gnu -mattr=avx512dq |
>> > FileCheck %s
>> > +
>> > +define void @f_f___un_3C_unf_3E_un_3C_unf_3E_() {
>> > +; CHECK-LABEL: f_f___un_3C_unf_3E_un_3C_unf_3E_:
>> > +; CHECK:       # BB#0:
>> > +; CHECK-NEXT:    vmovapd 0, %zmm0
>> > +; CHECK-NEXT:    vmovapd 64, %zmm1
>> > +; CHECK-NEXT:    vmovapd {{.*#+}} zmm2 =
>> > [0,16,0,16,0,16,0,16,0,16,0,16,0,16,0,16]
>> > +; CHECK-NEXT:    vorpd %zmm2, %zmm0, %zmm0 {%k1}
>> > +; CHECK-NEXT:    vorpd %zmm2, %zmm1, %zmm1 {%k1}
>> > +; CHECK-NEXT:    vmovapd %zmm1, 64
>> > +; CHECK-NEXT:    vmovapd %zmm0, 0
>> > +; CHECK-NEXT:    retl
>> > +  %a_load22 = load <16 x i64>, <16 x i64>* null, align 1
>> > +  %bitop = or <16 x i64> %a_load22, <i64 68719476736, i64 68719476736,
>> > i64 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
>> > 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
>> > 68719476736, i64 68719476736, i64 68719476736, i64 68719476736, i64
>> > 68719476736, i64 68719476736>
>> > +  %v.i = load <16 x i64>, <16 x i64>* null
>> > +  %v1.i41 = select <16 x i1> undef, <16 x i64> %bitop, <16 x i64> %v.i
>> > +  store <16 x i64> %v1.i41, <16 x i64>* null
>> > +  ret void
>> > +}
>> >
>> >
>> > _______________________________________________
>> > llvm-commits mailing list
>> > llvm-commits at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>


More information about the llvm-commits mailing list