[PATCH] D143129: [GVN] Restrict equality propagation for pointers

Juneyoung Lee via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 27 09:56:12 PST 2023


aqjune added a comment.

Thanks! Yes, indeed there are still many files having diffs. But the number shrunk to ~1/5, which seems nice!

I looked at the diffs, and I could see a few replacements in the past that seem buggy...
In 82.ll:

    %call103 = tail call ptr @encodingLine(ptr noundef %messageIn) #17
    %22 = load ptr, ptr %t_next, align 8, !tbaa !30
    %cmp105 = icmp eq ptr %call103, %22
    br i1 %cmp105, label %if.then107, label %do.cond
  
  if.then107:
    %23 = load ptr, ptr %22, align 8, !tbaa !28 // Replacing %22 with %call103 is illegal

`encodingLine` does not have a function body as well as meaningful attributes, so it can return any pointer. This patch fixes the bug, which is good.

In 73.ll: I think we can make an improvement.

    %call.i = tail call noalias dereferenceable_or_null(64) ptr @malloc(i64 noundef 64) #35
    %tobool.not = icmp eq ptr %call.i, null
    ...
  
  cond.end.i.i: // At this point %call.i isn't null.
    %5 = load ptr, ptr %arrayidx6.i.i, align 8, !tbaa !5
    %cmp.i.i = icmp eq ptr %5, %call.i
    br i1 %cmp.i.i, label %_Z15yy_flush_bufferP15yy_buffer_state.exit.thread.i,....
  
  _Z15yy_flush_bufferP15yy_buffer_state.exit.thread.i: ; preds = %cond.end.i.i
    %yy_n_chars.i.i.i = getelementptr inbounds %struct.yy_buffer_state, ptr %5, i64 0, i32 4 // <- This can be replaced with 'gep inbounds %call.i, 0, 4'.
    %6 = load i32, ptr %yy_n_chars.i.i.i, align 4, !tbaa !13

Replacing `p` at `load (gep inbounds p, ofs)` with `q` at is fine if (1) `q` points to the beginning address of a live object (e.g. unfreed malloc), and (2) `ofs` is non-negative.
To guarantee liveness (1), you will probably need to check whether `q` hasn't been escaped, because an unknown function call may deallocate it.
My reasoning is like this:

- If `p` was a freed (dead) pointer, `load` would have already been UB, so `q` can be anything.
- If `p` was an unfreed (live) pointer, `q` must also be a live object because otherwise the replacement will make `load` introduce UB. If `p` and `q` are both live:
  - If `p` and `q` pointed to the same object: replacement is trivially correct
  - If `p` and `q` pointed to different objects: since `p == q` was true, the two different objects must be adjacent in their memory layout, and one of `q` or `p` must point to the beginning address of a live object. 73.ll corresponds to `q` pointing to 'the beginning address of a live object' case.

I hope I didn't miss something obvious in 73.ll & this replacement helps.



================
Comment at: llvm/lib/Analysis/Loads.cpp:699
+  if (isa<PHINode>(U) && all_of(U->users(), [&](const User *User) {
+        return isa<PHINode>(User) ||
+               canReplacePointersRecursive(U, To, User, MaxLookup - 1);
----------------
Could you elaborate a bit about `isa<PHINode>(User)` condition?
I think that in the following case replacing %p with %q should fail, but this condition seems to allow it:
```
BB1:
  %phi1 = phi [%p, <pred>], ..
  br BB2
BB2:
  %phi2 = phi [%phi1, <BB1>], ..
  store i32 0, ptr %phi2

```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143129/new/

https://reviews.llvm.org/D143129



More information about the llvm-commits mailing list