[PATCH] D16693: Make sure all subranges are define during re-materialization of constants in RegCoalescer

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 28 14:24:24 PST 2016


On 01/28/2016 01:03 PM, Marcello Maggioni via llvm-commits wrote:
> kariddi created this revision.
> kariddi added a reviewer: MatzeB.
> kariddi added a subscriber: llvm-commits.
> kariddi set the repository for this revision to rL LLVM.
> Herald added subscribers: qcolombet, MatzeB.
>
> Hello,
>
> In our target we sometimes load multiple constants with the same instruction into registers that have disjoint subregisters and then use the subregisters separately.
>
> In certain cases during constant re-materialization in the register coalescer the one of these multiple constants load instructions can be selected to rematerialize only one of the constants.
>
> The situation is something like this:
>
>   vreg1 = LOAD CONSTANTS 5, 8 ; Loading both 5 and 8 in different subregs
>   ; Copying only part of the register here, but the rest is undef.
>   vreg2:sub_16bit<def, read-undef> = COPY vreg1:sub_16bit
>   ==>
>   ; Materialize all the constants but only using one eventually.
>   vreg2 = LOAD_CONSTANTS 5, 8
>
> In the above case we had an undefined registers that gets partially defined by the copy of one of the two constants in VREG1.
> The re-materialization substitutes the copy with the load instructions defining the whole register now.
> This is fine, because the register was undefined, but if this happens we need to update the sub-register liveness to keep track that there is a dead definition of the lanes that weren't loaded previously.
>
> Repository:
>    rL LLVM
>
> http://reviews.llvm.org/D16693
>
> Files:
>    lib/CodeGen/MachineVerifier.cpp
>    lib/CodeGen/RegisterCoalescer.cpp
>
> Index: lib/CodeGen/RegisterCoalescer.cpp
> ===================================================================
> --- lib/CodeGen/RegisterCoalescer.cpp
> +++ lib/CodeGen/RegisterCoalescer.cpp
> @@ -999,6 +999,38 @@
>   
>       updateRegDefsUses(DstReg, DstReg, DstIdx);
>       NewMI->getOperand(0).setSubReg(NewIdx);
> +    // Add dead subregister definitions if we are defining the whole register
> +    // but only part of it is live.
> +    // This could happen if the rematerialization instruction is rematerializing
> +    // more than actually is used in the register.
> +    // An example would be:
> +    // vreg1 = LOAD CONSTANTS 5, 8 ; Loading both 5 and 8 in different subregs
> +    // ; Copying only part of the register here, but the rest is undef.
> +    // vreg2:sub_16bit<def, read-undef> = COPY vreg1:sub_16bit
> +    // ==>
> +    // ; Materialize all the constants but only using one
> +    // vreg2 = LOAD_CONSTANTS 5, 8
> +    //
> +    // at this point for the part that wasn't defined before we could have
> +    // subranges missing the definition.
> +    if (MRI->shouldTrackSubRegLiveness(DstReg)) {
> +      LiveInterval &DstInt = LIS->getInterval(DstReg);
> +      if (NewIdx == 0 && DstInt.hasSubRanges()) {
> +        SlotIndex CurrIdx = LIS->getInstructionIndex(NewMI);
> +        unsigned MaxMask = MRI->getMaxLaneMaskForVReg(DstReg);
> +        for (auto &SR : DstInt.subranges()) {
> +          if (!SR.liveAt(CurrIdx.getRegSlot())) {
> +            SR.createDeadDef(CurrIdx.getRegSlot(), LIS->getVNInfoAllocator());
> +          }
> +          MaxMask &= ~SR.LaneMask;
> +        }
> +        if (MaxMask != 0) {
> +          LiveInterval::SubRange *SR =
> +              DstInt.createSubRange(LIS->getVNInfoAllocator(), MaxMask);
> +          SR->createDeadDef(CurrIdx.getRegSlot(), LIS->getVNInfoAllocator());
> +        }
> +      }
> +    }
>     } else if (NewMI->getOperand(0).getReg() != CopyDstReg) {
>       // The New instruction may be defining a sub-register of what's actually
>       // been asked for. If so it must implicitly define the whole thing.
> Index: lib/CodeGen/MachineVerifier.cpp
> ===================================================================
> --- lib/CodeGen/MachineVerifier.cpp
> +++ lib/CodeGen/MachineVerifier.cpp
> @@ -1713,6 +1713,32 @@
>   
>     LaneBitmask Mask = 0;
>     LaneBitmask MaxMask = MRI->getMaxLaneMaskForVReg(Reg);
> +  // If the LiveInterval has subregister information we want to check that
> +  // for each definition that defines the whole register every subrange has at
> +  // least
> +  // a dead definition for it.
> +  if (LI.hasSubRanges()) {
> +    for (auto DI = MRI->def_begin(Reg), DE = MachineRegisterInfo::def_end();
> +         DI != DE; ++DI) {
> +      MachineOperand &MO = *DI;
> +      // We are defining the whole register, so all the subranges should have
> +      // at least a dead definition.
> +      if (MO.getSubReg() == 0) {
> +        unsigned LiveMask = 0;
> +        for (auto &SR : LI.subranges()) {
> +          SlotIndex DefSlot =
> +              LiveInts->getInstructionIndex(MO.getParent()).getRegSlot();
> +          if (SR.liveAt(DefSlot))
> +            LiveMask |= SR.LaneMask;
> +        }
> +        // If any lane didn't have a proper definition report.
> +        if (LiveMask != MaxMask)
> +          report("Lane mask defined but no sub range reports its definition",
> +                 MF);
> +      }
> +    }
> +  }
> +
>     for (const LiveInterval::SubRange &SR : LI.subranges()) {
>       if ((Mask & SR.LaneMask) != 0) {
>         report("Lane masks of sub ranges overlap in live interval", MF);
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

This problem looks similar to one I was recently trying to debug on this 
testcase. The problem seems to show up when a constant is rematerialized 
into a subregister, but there is a remaining undef component to the full 
register, causing an error with -verify-machineinstrs and a scheduler 
crash without. However, this patch seems to not solve it. Do you think 
this is a similar case you may also have a problem with?

-Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160128/64c42e4b/attachment.html>
-------------- next part --------------
; RUN: llc -march=amdgcn -verify-machineinstrs < %s
; ModuleID = 'bugpoint-reduced-simplified.ll'
target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-p24:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
target triple = "amdgcn--"

; Function Attrs: norecurse nounwind
define void @foo(<2 x i32> addrspace(1)* nocapture readonly %arg, i32 %arg1, i32 %arg2)  {
bb:
  br i1 undef, label %bb3, label %bb4

bb3:                                              ; preds = %bb
  %tmp = ashr exact i32 undef, 8
  br label %bb6

bb4:                                              ; preds = %bb6, %bb
  %tmp5 = phi <2 x i32> [ zeroinitializer, %bb ], [ %tmp13, %bb6 ]
  br i1 undef, label %bb15, label %bb16

bb6:                                              ; preds = %bb6, %bb3
  %tmp7 = phi <2 x i32> [ zeroinitializer, %bb3 ], [ %tmp13, %bb6 ]
  %tmp8 = add nsw i32 0, %arg1
  %tmp9 = add nsw i32 %tmp8, 0
  %tmp10 = sext i32 %tmp9 to i64
  %tmp11 = getelementptr inbounds <2 x i32>, <2 x i32> addrspace(1)* %arg, i64 %tmp10
  %tmp12 = load <2 x i32>, <2 x i32> addrspace(1)* %tmp11, align 8
  %tmp13 = add <2 x i32> %tmp12, %tmp7
  %tmp14 = icmp slt i32 undef, %arg2
  br i1 %tmp14, label %bb6, label %bb4

bb15:                                             ; preds = %bb4
  store <2 x i32> %tmp5, <2 x i32> addrspace(3)* undef, align 8
  br label %bb16

bb16:                                             ; preds = %bb15, %bb4
  unreachable
}


More information about the llvm-commits mailing list