[PATCH] D112472: [Scalarizer] Do not insert instructions between PHI nodes.
Bjorn Pettersson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 1 10:01:44 PDT 2021
bjope added a comment.
Hi @vettoreldaniele
I noticed that we have two downstream fixes (that for some reason haven't been upstreamed yet, I figure we did not have any reproducers identifying this as an upstream problem) that is a bit related to your patch.
Our patch looks like this:
--- a/llvm/lib/Transforms/Scalar/Scalarizer.cpp
+++ b/llvm/lib/Transforms/Scalar/Scalarizer.cpp
@@ -66,6 +66,15 @@ static cl::opt<bool>
namespace {
+static BasicBlock::iterator skipPastPhiNodesAndDbg(BasicBlock::iterator Itr) {
+ BasicBlock *BB = Itr->getParent();
+ if (isa<PHINode>(Itr))
+ Itr = BB->getFirstInsertionPt();
+ if (Itr != BB->end())
+ Itr = skipDebugIntrinsics(Itr);
+ return Itr;
+}
+
+}
@@ -373,7 +407,7 @@ Scatterer ScalarizerVisitor::scatter(Instruction *Point, Value *V) {
// Put the scattered form of an instruction directly after the
// instruction.
BasicBlock *BB = VOp->getParent();
- return Scatterer(BB, std::next(BasicBlock::iterator(VOp)),
+ return Scatterer(BB, skipPastPhiNodesAndDbg(std::next(VOp->getIterator())),
V, &Scattered[V]);
}
// In the fallback case, just put the scattered before Point and
So there are actually two fixes involved. One is to skip past PHI nodes (same as your problem), and the other is to solve a debug-invariance problem by also skipping past dbg instrinsics.
Our test case for the latter ( test/Transforms/Scalarizer/dbg-invariant.ll ) looks like this:
; RUN: opt -strip-debug -passes=scalarizer -S < %s | FileCheck %s
; RUN: opt -passes=scalarizer -S < %s | FileCheck %s
; CHECK: %0 = load <8 x i16>
; CHECK: %.i0 = extractelement <8 x i16> %0, i32 0
; CHECK-NEXT: %.i01 = add i16 %.i0, 28690
; CHECK: %.i1 = extractelement <8 x i16> %0, i32 1
; CHECK-NEXT: %.i12 = add i16 %.i1, 28690
; CHECK: %.i2 = extractelement <8 x i16> %0, i32 2
; CHECK-NEXT: %.i23 = add i16 %.i2, 28690
; CHECK: %.i3 = extractelement <8 x i16> %0, i32 3
; CHECK-NEXT: %.i34 = add i16 %.i3, 28690
; CHECK: %.i4 = extractelement <8 x i16> %0, i32 4
; CHECK-NEXT: %.i45 = add i16 %.i4, 28690
; CHECK: %.i5 = extractelement <8 x i16> %0, i32 5
; CHECK-NEXT: %.i56 = add i16 %.i5, 28690
; CHECK: %.i6 = extractelement <8 x i16> %0, i32 6
; CHECK-NEXT: %.i67 = add i16 %.i6, 28690
; CHECK: %.i7 = extractelement <8 x i16> %0, i32 7
; CHECK-NEXT: = add i16 %.i7, 28690
@d = external global [8 x i16], align 1
@e = external global i16, align 1
; Function Attrs: nofree norecurse nounwind
define dso_local void @foo() local_unnamed_addr #0 !dbg !7 {
entry:
%0 = load <8 x i16>, <8 x i16>* bitcast ([8 x i16]* @d to <8 x i16>*), align 1
call void @llvm.dbg.value(metadata i16 0, metadata !11, metadata !DIExpression()), !dbg !13
%1 = add <8 x i16> %0, <i16 28690, i16 28690, i16 28690, i16 28690, i16 28690, i16 28690, i16 28690, i16 28690>, !dbg !13
store <8 x i16> %1, <8 x i16>* bitcast ([8 x i16]* @d to <8 x i16>*), align 1, !dbg !13
%2 = extractelement <8 x i16> %1, i32 7, !dbg !13
store i16 %2, i16* @e, align 1, !dbg !13
ret void
}
; Function Attrs: nounwind readnone speculatable willreturn
declare void @llvm.dbg.value(metadata, metadata, metadata) #1
attributes #0 = { nofree norecurse nounwind }
attributes #1 = { nounwind readnone speculatable willreturn }
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5}
!llvm.ident = !{!6}
!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 11.0.0 ", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "foo.c", directory: "/")
!2 = !{}
!3 = !{i32 7, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 1}
!6 = !{!"clang version 11.0.0 "}
!7 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 4, type: !8, scopeLine: 4, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !10)
!8 = !DISubroutineType(types: !9)
!9 = !{null}
!10 = !{!11}
!11 = !DILocalVariable(name: "i", scope: !7, file: !1, line: 5, type: !12)
!12 = !DIBasicType(name: "int", size: 16, encoding: DW_ATE_signed)
!13 = !DILocation(line: 0, scope: !7)
And the motivation in the commit message was:
[Scalarizer] Fix a debug invariance problem
This fixes a debug info invariance problem for IR of the following form:
%0 = load <8 x i16>, [...]
<potentially llvm.dbg.value intrinsics here>
%1 = add <8 x i16> %0, <[...]>
When scalarizing that, the extractvalue instructions were inserted
before the instruction succeeding the load instruction, and the
scalarized add instructions were inserted before the vectorized add
instruction. Without debug intrinsics, those two insertion points were
the same, resulting in IR of the following form:
%0 = load <8 x i16>, [...]
%.i0 = extractelement <8 x i16> %0, i32 0
%.i01 = add i16 %.i0, [...]
%.i1 = extractelement <8 x i16> %0, i32 1
%.i12 = add i16 %.i1, [...]
[...]
but of the following form with debug intrinsics present:
%0 = load <8 x i16>, [...]
%.i0 = extractelement <8 x i16> %0, i32 0
%.i1 = extractelement <8 x i16> %0, i32 1
[...]
<llvm.dbg.value intrinsics>
%.i01 = add i16 %.i0, [...]
%.i12 = add i16 %.i1, [...]
[...]
This patch fixes this debug info variance by skipping past any debug
intrinsics when determining the insertion point for the extractvalue
instructions.
So I wouldn't mind if we try to solve both problems here (by using some kind of helper such as `skipPastPhiNodesAndDbg `from the downstream patch I refer to above).
Let me know if you for example want me to pre-commit the test case for the debug-invariance problem.
Or if you want to focus on the phi node problem at first, then I can follow up with a patch that also skips past the dbg intrinsics.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D112472/new/
https://reviews.llvm.org/D112472
More information about the llvm-commits
mailing list