[PATCH] D115965: [DSE] Fix a case of invalid dead store elimination
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 21 12:59:02 PST 2021
fhahn added a comment.
Thanks for the patch!
In D115965#3202681 <https://reviews.llvm.org/D115965#3202681>, @mamai wrote:
> In D115965#3200873 <https://reviews.llvm.org/D115965#3200873>, @nikic wrote:
>
>> From a cursory look, isn't the actual problem here that the malloc() is treated as loop invariant, even though it isn't?
>
> Maybe. What I had been told was that this code was not really looking at invariance but at cross iteration dependencies.
I originally made the comment about cross-iteration-dependencies (in the same loop), but that was before I had a look at the full test case and the actual problem.
It looks like our handling of `alloca`/alloc-like instructions is not correct (as @mamai mentioned in Discord as well). Instead of the current check, we should probably just check whether the object is defined in the entry block for now:
/// Returns true if \p Ptr is guaranteed to be loop invariant for any possible
/// loop. In particular, this guarantees that it only references a single
/// MemoryLocation during execution of the containing function.
bool isGuaranteedLoopInvariant(const Value *Ptr) {
- auto IsGuaranteedLoopInvariantBase = [this](const Value *Ptr) {
+ auto IsGuaranteedLoopInvariantBase = [](const Value *Ptr) {
Ptr = Ptr->stripPointerCasts();
- if (auto *I = dyn_cast<Instruction>(Ptr)) {
- if (isa<AllocaInst>(Ptr))
- return true;
-
- if (isAllocLikeFn(I, &TLI))
- return true;
-
- return false;
- }
+ if (auto *I = dyn_cast<Instruction>(Ptr))
+ return I->getParent() == &I->getFunction()->getEntryBlock();
return true;
};
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:1
+; RUN: opt -dse -S %s | FileCheck %s
+
----------------
please use the new pass manager syntax (`-passes=dse`)
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:22
+; Function Attrs: nofree nounwind optsize
+define dso_local void @test() local_unnamed_addr #0 {
+ br label %loop1
----------------
nit: no need for `dse_local` or the attributes.
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:26
+1:
+ %2 = bitcast i8* %7 to %struct.ilist*
+ %3 = getelementptr inbounds %struct.ilist, %struct.ilist* %2, i32 0, i32 1
----------------
imo it would make reading the test easier if that was between `%loop1` and `%loop2`.
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:33
+loop1:
+ %5 = phi i32 [ 0, %0 ], [ %13, %loop1 ]
+ %6 = phi %struct.ilist* [ null, %0 ], [ %8, %loop1 ]
----------------
nit: imo it would improve readability if the values would be explicitly named with descriptive names, .e.g `%iv` this one and `iv.next` for `%13` and so on.
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:38
+ %9 = getelementptr inbounds [10 x i32], [10 x i32]* @array, i32 0, i32 %5
+ %10 = load i32, i32* %9, align 4
+ %11 = getelementptr inbounds %struct.ilist, %struct.ilist* %8, i32 0, i32 0
----------------
This doesn't seem necessary for the test. Could we just set the field to a constant?
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:48
+loop2:
+ %15 = phi %struct.ilist* [ %16, %loop2 ], [ %2, %1 ]
+ %16 = phi %struct.ilist* [ %18, %loop2 ], [ %6, %1 ]
----------------
Can this loop be simplified so it doesn't need 2 pointer phis? The problem should also be exposed with an integer induction, as long as we use a pointer based on `%6`?
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:58
+exit: ; preds = %loop2, %1
+ ret void
+}
----------------
This function should probably return a pointer to the list, otherwise the memory is not used and the stores are practically dead?
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:62
+; Function Attrs: inaccessiblememonly mustprogress nofree nounwind optsize willreturn
+declare dso_local noalias noundef align 8 i8* @malloc(i32 noundef) local_unnamed_addr #1
+
----------------
nit: no need for `dso_local`.
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:65
+attributes #0 = { nofree nounwind optsize "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+attributes #1 = { inaccessiblememonly mustprogress nofree nounwind optsize willreturn "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+attributes #2 = { optsize }
----------------
nit: only retain the attributes needed for DSE to treat `malloc` as 'alloc-like'
================
Comment at: llvm/test/Transforms/DeadStoreElimination/store-after-loop.ll:66
+attributes #1 = { inaccessiblememonly mustprogress nofree nounwind optsize willreturn "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+attributes #2 = { optsize }
----------------
not used?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115965/new/
https://reviews.llvm.org/D115965
More information about the llvm-commits
mailing list