[PATCH] D16693: Make sure all subranges are define during re-materialization of constants in RegCoalescer
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 28 14:24:24 PST 2016
On 01/28/2016 01:03 PM, Marcello Maggioni via llvm-commits wrote:
> kariddi created this revision.
> kariddi added a reviewer: MatzeB.
> kariddi added a subscriber: llvm-commits.
> kariddi set the repository for this revision to rL LLVM.
> Herald added subscribers: qcolombet, MatzeB.
>
> Hello,
>
> In our target we sometimes load multiple constants with the same instruction into registers that have disjoint subregisters and then use the subregisters separately.
>
> In certain cases during constant re-materialization in the register coalescer the one of these multiple constants load instructions can be selected to rematerialize only one of the constants.
>
> The situation is something like this:
>
> vreg1 = LOAD CONSTANTS 5, 8 ; Loading both 5 and 8 in different subregs
> ; Copying only part of the register here, but the rest is undef.
> vreg2:sub_16bit<def, read-undef> = COPY vreg1:sub_16bit
> ==>
> ; Materialize all the constants but only using one eventually.
> vreg2 = LOAD_CONSTANTS 5, 8
>
> In the above case we had an undefined registers that gets partially defined by the copy of one of the two constants in VREG1.
> The re-materialization substitutes the copy with the load instructions defining the whole register now.
> This is fine, because the register was undefined, but if this happens we need to update the sub-register liveness to keep track that there is a dead definition of the lanes that weren't loaded previously.
>
> Repository:
> rL LLVM
>
> http://reviews.llvm.org/D16693
>
> Files:
> lib/CodeGen/MachineVerifier.cpp
> lib/CodeGen/RegisterCoalescer.cpp
>
> Index: lib/CodeGen/RegisterCoalescer.cpp
> ===================================================================
> --- lib/CodeGen/RegisterCoalescer.cpp
> +++ lib/CodeGen/RegisterCoalescer.cpp
> @@ -999,6 +999,38 @@
>
> updateRegDefsUses(DstReg, DstReg, DstIdx);
> NewMI->getOperand(0).setSubReg(NewIdx);
> + // Add dead subregister definitions if we are defining the whole register
> + // but only part of it is live.
> + // This could happen if the rematerialization instruction is rematerializing
> + // more than actually is used in the register.
> + // An example would be:
> + // vreg1 = LOAD CONSTANTS 5, 8 ; Loading both 5 and 8 in different subregs
> + // ; Copying only part of the register here, but the rest is undef.
> + // vreg2:sub_16bit<def, read-undef> = COPY vreg1:sub_16bit
> + // ==>
> + // ; Materialize all the constants but only using one
> + // vreg2 = LOAD_CONSTANTS 5, 8
> + //
> + // at this point for the part that wasn't defined before we could have
> + // subranges missing the definition.
> + if (MRI->shouldTrackSubRegLiveness(DstReg)) {
> + LiveInterval &DstInt = LIS->getInterval(DstReg);
> + if (NewIdx == 0 && DstInt.hasSubRanges()) {
> + SlotIndex CurrIdx = LIS->getInstructionIndex(NewMI);
> + unsigned MaxMask = MRI->getMaxLaneMaskForVReg(DstReg);
> + for (auto &SR : DstInt.subranges()) {
> + if (!SR.liveAt(CurrIdx.getRegSlot())) {
> + SR.createDeadDef(CurrIdx.getRegSlot(), LIS->getVNInfoAllocator());
> + }
> + MaxMask &= ~SR.LaneMask;
> + }
> + if (MaxMask != 0) {
> + LiveInterval::SubRange *SR =
> + DstInt.createSubRange(LIS->getVNInfoAllocator(), MaxMask);
> + SR->createDeadDef(CurrIdx.getRegSlot(), LIS->getVNInfoAllocator());
> + }
> + }
> + }
> } else if (NewMI->getOperand(0).getReg() != CopyDstReg) {
> // The New instruction may be defining a sub-register of what's actually
> // been asked for. If so it must implicitly define the whole thing.
> Index: lib/CodeGen/MachineVerifier.cpp
> ===================================================================
> --- lib/CodeGen/MachineVerifier.cpp
> +++ lib/CodeGen/MachineVerifier.cpp
> @@ -1713,6 +1713,32 @@
>
> LaneBitmask Mask = 0;
> LaneBitmask MaxMask = MRI->getMaxLaneMaskForVReg(Reg);
> + // If the LiveInterval has subregister information we want to check that
> + // for each definition that defines the whole register every subrange has at
> + // least
> + // a dead definition for it.
> + if (LI.hasSubRanges()) {
> + for (auto DI = MRI->def_begin(Reg), DE = MachineRegisterInfo::def_end();
> + DI != DE; ++DI) {
> + MachineOperand &MO = *DI;
> + // We are defining the whole register, so all the subranges should have
> + // at least a dead definition.
> + if (MO.getSubReg() == 0) {
> + unsigned LiveMask = 0;
> + for (auto &SR : LI.subranges()) {
> + SlotIndex DefSlot =
> + LiveInts->getInstructionIndex(MO.getParent()).getRegSlot();
> + if (SR.liveAt(DefSlot))
> + LiveMask |= SR.LaneMask;
> + }
> + // If any lane didn't have a proper definition report.
> + if (LiveMask != MaxMask)
> + report("Lane mask defined but no sub range reports its definition",
> + MF);
> + }
> + }
> + }
> +
> for (const LiveInterval::SubRange &SR : LI.subranges()) {
> if ((Mask & SR.LaneMask) != 0) {
> report("Lane masks of sub ranges overlap in live interval", MF);
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
This problem looks similar to one I was recently trying to debug on this
testcase. The problem seems to show up when a constant is rematerialized
into a subregister, but there is a remaining undef component to the full
register, causing an error with -verify-machineinstrs and a scheduler
crash without. However, this patch seems to not solve it. Do you think
this is a similar case you may also have a problem with?
-Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160128/64c42e4b/attachment.html>
-------------- next part --------------
; RUN: llc -march=amdgcn -verify-machineinstrs < %s
; ModuleID = 'bugpoint-reduced-simplified.ll'
target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-p24:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
target triple = "amdgcn--"
; Function Attrs: norecurse nounwind
define void @foo(<2 x i32> addrspace(1)* nocapture readonly %arg, i32 %arg1, i32 %arg2) {
bb:
br i1 undef, label %bb3, label %bb4
bb3: ; preds = %bb
%tmp = ashr exact i32 undef, 8
br label %bb6
bb4: ; preds = %bb6, %bb
%tmp5 = phi <2 x i32> [ zeroinitializer, %bb ], [ %tmp13, %bb6 ]
br i1 undef, label %bb15, label %bb16
bb6: ; preds = %bb6, %bb3
%tmp7 = phi <2 x i32> [ zeroinitializer, %bb3 ], [ %tmp13, %bb6 ]
%tmp8 = add nsw i32 0, %arg1
%tmp9 = add nsw i32 %tmp8, 0
%tmp10 = sext i32 %tmp9 to i64
%tmp11 = getelementptr inbounds <2 x i32>, <2 x i32> addrspace(1)* %arg, i64 %tmp10
%tmp12 = load <2 x i32>, <2 x i32> addrspace(1)* %tmp11, align 8
%tmp13 = add <2 x i32> %tmp12, %tmp7
%tmp14 = icmp slt i32 undef, %arg2
br i1 %tmp14, label %bb6, label %bb4
bb15: ; preds = %bb4
store <2 x i32> %tmp5, <2 x i32> addrspace(3)* undef, align 8
br label %bb16
bb16: ; preds = %bb15, %bb4
unreachable
}
More information about the llvm-commits
mailing list