[PATCH] D120153: [instsimplify] Support discharging pointer checks involving two distinct GC objects

Philip Reames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 18 12:11:51 PST 2022


reames created this revision.
reames added reviewers: anna, apilipenko, nikic, fhahn, eli.friedman.
Herald added subscribers: bollu, hiraditya, mcrosier.
reames requested review of this revision.
Herald added a project: LLVM.

This patch is a bit subtle.

Let's start with the transform we'd like to do, but can't.  If we have two malloc'd objects, we'd like to be able to say that their addresses are not equal (unless null).  However, we can't: while the two memory objects don't *alias*, one of them might have been freed, and the allocator is allowed to reuse its storage for the other.
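To make the reuse hazard concrete, here is a minimal sketch using a toy allocator.  ToyAllocator, reuseObserved and liveObjectsDistinct are names made up for this illustration; this is not malloc and not anything in LLVM.  The toy hands back the most recently released slot first, so the reuse is deterministic here, whereas a real allocator is merely permitted to do it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy allocator for illustration only (not malloc, not LLVM):
// a small pool of fixed-size slots with LIFO reuse of released slots.
class ToyAllocator {
public:
  void *allocate() {
    if (freeTop > 0)
      return freeList[--freeTop]; // reuse the most recently released slot
    assert(next < kSlots && "toy pool exhausted");
    return slots[next++];         // otherwise hand out a fresh slot
  }

  void release(void *p) {
    assert(freeTop < kSlots && "too many releases");
    freeList[freeTop++] = p;
  }

private:
  static constexpr std::size_t kSlots = 4;
  static constexpr std::size_t kSlotSize = 8;
  alignas(8) unsigned char slots[kSlots][kSlotSize] = {};
  void *freeList[kSlots] = {};
  std::size_t freeTop = 0;
  std::size_t next = 0;
};

// Allocate, release, allocate again: the second object lands on the address
// of the first, even though the two objects never exist at the same time.
bool reuseObserved() {
  ToyAllocator alloc;
  void *a = alloc.allocate();
  std::uintptr_t aAddr = reinterpret_cast<std::uintptr_t>(a);
  alloc.release(a);
  void *b = alloc.allocate();
  return aAddr == reinterpret_cast<std::uintptr_t>(b);
}

// Two objects that are live at the same time get distinct addresses.
bool liveObjectsDistinct() {
  ToyAllocator alloc;
  void *a = alloc.allocate();
  void *b = alloc.allocate();
  return a != b;
}
```

With this toy, reuseObserved() returns true and liveObjectsDistinct() returns true, which is exactly why "noalias" alone can't be used to fold the address comparison.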

However, for a language with a garbage collector, we know that this storage reuse is (by definition) invisible in the source program.  During lowering there's a point where reuse becomes possible, but while the program is targeting the abstract machine model and using nonintegral pointers, that reuse isn't visible.
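As a rough model of the condition this patch checks, the sketch below summarizes each pointer by three facts and folds the comparison only when both sides qualify.  PtrFacts and provablyDistinct are hypothetical names used for illustration; the real check is written against LLVM values in InstructionSimplify.cpp and is not reproduced here.

```cpp
#include <cassert>

// Hypothetical summary of what is known about one pointer value.
// These fields are illustrative and do not correspond to an LLVM type.
struct PtrFacts {
  bool fromNoAliasCall; // result of a call with a noalias return
  bool knownNonNull;    // the call is known to return non-null
  bool canBeFreed;      // storage may be released and later reused
};

// Two pointers are treated as provably distinct only if both come from
// noalias calls, both are non-null, and neither can be freed.
bool provablyDistinct(const PtrFacts &a, const PtrFacts &b) {
  auto qualifies = [](const PtrFacts &p) {
    return p.fromNoAliasCall && p.knownNonNull && !p.canBeFreed;
  };
  return qualifies(a) && qualifies(b);
}
```

Under this model a malloc-like pair (canBeFreed set) is left alone, a non-null GC pair folds, and a GC pair that might be null is left alone because both pointers could be null and thus equal.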

(Side note: The current wording on nonintegral pointers makes it unclear whether comparing two nonintegral pointers is merely indeterminate, or something stronger like returning poison.  This patch doesn't care, but it does look like we have some wording to refine.)

I *think* such a GC allocation must also not overlap with byval arguments, allocas, and globals, but I'm leaving that to a follow-up patch.  (In particular, the dynamic alloca + gc_alloc + dead code case has me a bit hesitant.)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120153

Files:
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/test/Transforms/InstSimplify/compare.ll


Index: llvm/test/Transforms/InstSimplify/compare.ll
===================================================================
--- llvm/test/Transforms/InstSimplify/compare.ll
+++ llvm/test/Transforms/InstSimplify/compare.ll
@@ -2831,4 +2831,57 @@
 ; TODO: Add coverage for global aliases, link once, etc..
 
 
+declare noalias i8* @my_alloc(i64) allocsize(0)
+declare void @my_free(i8*)
+
+; %a and %b might have the same address despite being noalias as
+; the allocator is allowed to reuse the storage after %a was freed
+define i1 @neg_overlapping_malloc() {
+; CHECK-LABEL: @neg_overlapping_malloc(
+; CHECK-NEXT:    [[A:%.*]] = call i8* @my_alloc(i64 8)
+; CHECK-NEXT:    call void @my_free(i8* [[A]])
+; CHECK-NEXT:    [[B:%.*]] = call i8* @my_alloc(i64 8)
+; CHECK-NEXT:    [[RES:%.*]] = icmp ne i8* [[A]], [[B]]
+; CHECK-NEXT:    ret i1 [[RES]]
+;
+  %a = call i8* @my_alloc(i64 8)
+  call void @my_free(i8* %a)
+  %b = call i8* @my_alloc(i64 8)
+  %res = icmp ne i8* %a, %b
+  ret i1 %res
+}
+
+declare noalias i8 addrspace(1)* @gc_alloc(i64) allocsize(0)
+
+; For an allocation which isn't freed (e.g. an allocation in a GC'd language
+; before lowering out of the abstract machine model), we *can* assume the
+; pointers are non equal.  Essentially, the abstract model tells us that
+; a comparison which would reveal reuse after lowering is UB.
+define i1 @test_gc_alloc() gc "statepoint-example" {
+; CHECK-LABEL: @test_gc_alloc(
+; CHECK-NEXT:    [[A:%.*]] = call nonnull i8 addrspace(1)* @gc_alloc(i64 8)
+; CHECK-NEXT:    [[B:%.*]] = call nonnull i8 addrspace(1)* @gc_alloc(i64 8)
+; CHECK-NEXT:    ret i1 true
+;
+  %a = call nonnull i8 addrspace(1)* @gc_alloc(i64 8)
+  %b = call nonnull i8 addrspace(1)* @gc_alloc(i64 8)
+  %res = icmp ne i8 addrspace(1)* %a, %b
+  ret i1 %res
+}
+
+; Both could be null and thus equal
+define i1 @neg_gc_alloc_null() gc "statepoint-example" {
+; CHECK-LABEL: @neg_gc_alloc_null(
+; CHECK-NEXT:    [[A:%.*]] = call i8 addrspace(1)* @gc_alloc(i64 8)
+; CHECK-NEXT:    [[B:%.*]] = call i8 addrspace(1)* @gc_alloc(i64 8)
+; CHECK-NEXT:    [[RES:%.*]] = icmp ne i8 addrspace(1)* [[A]], [[B]]
+; CHECK-NEXT:    ret i1 [[RES]]
+;
+  %a = call i8 addrspace(1)* @gc_alloc(i64 8)
+  %b = call i8 addrspace(1)* @gc_alloc(i64 8)
+  %res = icmp ne i8 addrspace(1)* %a, %b
+  ret i1 %res
+}
+
+
 attributes #0 = { null_pointer_is_valid }
Index: llvm/lib/Analysis/InstructionSimplify.cpp
===================================================================
--- llvm/lib/Analysis/InstructionSimplify.cpp
+++ llvm/lib/Analysis/InstructionSimplify.cpp
@@ -2556,6 +2556,17 @@
   // So, we'll assume that two non-empty allocas have different addresses
   // for now.
 
+  if (isNoAliasCall(V1) && isNoAliasCall(V2) &&
+      cast<CallBase>(V1)->isReturnNonNull() && !V1->canBeFreed() &&
+      cast<CallBase>(V2)->isReturnNonNull() && !V2->canBeFreed())
+    // In general, the lifetime of two noalias (e.g. malloc'd) allocations
+    // can be disjoint.  This means that the storage regions can overlap.
+    // However, if we know that the allocation can't be freed, then they
+    // must exist at the same time.  The most common case of an allocation
+    // which can't be freed is an object in a GC'd language before lowering
+    // out of the abstract machine model.
+    return true;
+
   auto isByValArgOrGlobalVarOrAlloca = [](const Value *V) {
     if (const Argument *A = dyn_cast<Argument>(V))
       return A->hasByValAttr();



