<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Figure out what to do about more complex memory optimization in NewGVN"
href="https://llvm.org/bugs/show_bug.cgi?id=31792">31792</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Figure out what to do about more complex memory optimization in NewGVN
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Scalar Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>dberlin@dberlin.org
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>After predicate info, we will still miss one optimization in cond_br2.ll
We get very close, but get blocked by a single store, actually.
In cond_br2.ll we have this (with memoryssa annotation):
define void @bar() #0 personality i8* bitcast (i32 (...)* @spam to i8*) {
bb:
%tmp = alloca %struct.baz, align 16
%tmp1 = bitcast %struct.baz* %tmp to i8*
; 1 = MemoryDef(liveOnEntry)
call void @llvm.lifetime.start(i64 64, i8* %tmp1) #4
%tmp2 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0
%tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0, i32 3
%tmp4 = bitcast %struct.hoge* %tmp3 to i8*
; 2 = MemoryDef(1)
store i8* %tmp4, i8** %tmp2, align 16, !tbaa !0
%tmp5 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0, i32 1
; 3 = MemoryDef(2)
store i8* %tmp4, i8** %tmp5, align 8, !tbaa !0
%tmp6 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0, i32 2
%tmp7 = getelementptr inbounds %struct.hoge, %struct.hoge* %tmp3, i64 2
%tmp8 = bitcast %struct.hoge* %tmp7 to i8*
; 4 = MemoryDef(3)
store i8* %tmp8, i8** %tmp6, align 16, !tbaa !0
%tmp9 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0, i32 1
; MemoryUse(3)
%tmp10 = load i8*, i8** %tmp9, align 8, !tbaa !0
%tmp11 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0, i32 2
%tmp12 = icmp ult i8* %tmp10, %tmp8
br i1 %tmp12, label %bb13, label %bb18
bb13: ; preds = %bb20, %bb
; 15 = MemoryPhi({bb,4},{bb20,6})
%tmp14 = phi i8* [ %tmp10, %bb ], [ %tmp21, %bb20 ]
%tmp15 = icmp eq i8* %tmp14, null
br i1 %tmp15, label %bb22, label %bb16
bb16: ; preds = %bb13
%tmp17 = bitcast i8* %tmp14 to i32*
; 5 = MemoryDef(15)
store i32 1, i32* %tmp17, align 4, !tbaa !4
br label %bb22
bb18: ; preds = %bb
%tmp19 = getelementptr inbounds %struct.baz, %struct.baz* %tmp, i64 0, i32 0, i32 0, i32 0, i32 0
; 6 = MemoryDef(4)
invoke void @ham.1(%struct.quux* %tmp19, i64 0, i64 4)
to label %bb20 unwind label %bb42
bb20: ; preds = %bb18
; MemoryUse(6)
%tmp21 = load i8*, i8** %tmp9, align 8, !tbaa !0
br label %bb13
bb22: ; preds = %bb16, %bb13
; 16 = MemoryPhi({bb13,15},{bb16,5})
%tmp23 = getelementptr inbounds i8, i8* %tmp14, i64 4
; 7 = MemoryDef(16)
store i8* %tmp23, i8** %tmp9, align 8, !tbaa !0
; MemoryUse(16)
%tmp24 = load i8*, i8** %tmp11, align 16, !tbaa !0
%tmp25 = icmp ult i8* %tmp23, %tmp24
br i1 %tmp25, label %bb29, label %bb32
The important load here is %tmp24.
When we start, it is a use of the MemoryPhi at 16.

During propagation, we discover that bb18 and bb20 are dead:
  - The MemoryPhi at 15 is thus equivalent to 4 = MemoryDef(3).
  - The MemoryPhi at 16 is thus equivalent to MemoryPhi(4, 6).

The 6 is what gets us here.
The problem is that the only reason the use optimizer originally decided this
load was killed by the phi was because of bb18 and bb20, *not* because of the
store at 6.

Now that those two blocks are dead, if we removed them and asked MemorySSA for
the clobbering access, it would say "4 = MemoryDef(3)", and we'd eliminate the
load. This also means that if you run NewGVN twice, it deletes the load :)
So there are a couple of options:

1. Do nothing; decide that what we do is already good enough.
2. Write a GVN walker on top of the existing walker, and use it on loads where
we've heuristically determined we have a good chance of optimizing them
better. The GVN walker would simply ignore unreachable blocks.
3. Modify the existing walker to take a list of unreachable blocks.
4. Remove defs we believe are unreachable from MemorySSA, and reinsert them
when we discover they are reachable.

I think that's the world of possibilities.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>