[PATCH] D60056: Hoist/sink malloc/free's in LICM.

Mon Apr 1 19:02:23 PDT 2019

nicholas marked an inline comment as done.
nicholas added a comment.

> I've only skimmed through the high level comments so far, so if if any of this is addressed inline, just say so.
> 
> I was wondering why you phrased this as a combination of hoisting and sinking as opposed to promotion. We have the scalar promotion path, and the basic transform you're doing feels a lot like promotion. I suspect that many of the legality aspects will be common.

I didn't frame it as promotion simply because I view promotion as a transform from one thing to another (such as heap to stack, stack to scalar) whereas this optimization was just hoisting out the allocation and sinking the matching deallocations.

The safety checks we need are "is there a path from this malloc back to this malloc without passing through one of the frees we identified?" and "for each exit block, can we see that all paths either pass through one-or-more frees or that all paths do not pass through any frees?". We have no need of MemorySSA or AliasSets like the memory-to-scalar promotion does.

Comparing this code with LICM's promotion does reveal one bug. We might try to insert into catchswitch blocks.

Should I pull it out of hoisting and sinking and into a post-pass like LICM's scalar promotion? Initially I had imagined that it would participate in hoist and sink's queries of whether "are all arguments to this instruction loop invariant (outside the loop)" as hoists instructions iterately, it would hoist malloc like any other instruction. In practice that couldn't happen because the hoist is deferred until sink time. We do need the malloc hoist to occur after hoistRegion in order to make the malloc size argument invariant. So, maybe there's no reason not to place it after hoist and sink? We currently piggy-back on hoist and sink's linear scans, but doing them again is merely a constant factor. It would eliminate the need to expose LoopAllocationInfo as a public interface, which is a benefit.

================
Comment at: llvm/include/llvm/IR/Instruction.h:539

+  /// Return true if this instruction may allocate heap memory.
+  bool mayAllocateMemory() const;
----------------
reames wrote:
> These feel like they need a bit more specification.  In particular, are there any expectations around how new memory is returned?  Is a deallocation routine allowed to free a random pointer read from a global?
> 
> (i.e. which set of locations are we talking about freeing and allocating?)
Acknowledged.

These exist to help us maintain a C++-language invariant that we may not increase the total amount of heap memory in use, so, malloc+free+malloc may not be transformed into malloc+malloc+free. We simply need to avoid moving memory allocations around each other (though we can sort a continuous block of memory allocations if there's no frees in between). As such, there's no consideration of how the allocated memory would be returned or which memory gets deallocated (similar to how `mayReadOrWriteMemory` doesn't). We also don't really talk about what happens when an intrinsic allocates and deallocates internally, I think ultimately whether the intrinsic wants to be treated as something we can move other allocation/deallocation operations around should be left to the intrinsic author. It may not matter to the Objective-C runtime for instance. (But what about Objective-C++?)

I think this will end up getting resolved when we address your comments on llvm/lib/IR/Instruction.cpp:562.

================
Comment at: llvm/lib/IR/Instruction.cpp:562
+    switch (II->getIntrinsicID()) {
+    default:
+      return false;
----------------
reames wrote:
> Err, using a blacklist here feels *really* dangerous.  
Uhh, you aren't suggesting that I actually list them all, right?

The reason I put it here in Instruction instead of hidden away in LICM/LoopAllocationInfo is to make it somewhat clearer that this is a global property that you'll need to update when adding an instruction or intrinsic.

How about if we update Intrinsics.td to include a Mallocs and Frees (similar to Throws) IntrinsicProperty? That makes it opt-in when declaring the intrinsic? (Speaking of which, why Throws and not IntrThrows? Should mine be IntrMallocs/IntrFrees or Mallocs/Frees?)

Also, can a readnone function malloc? Is there any way we could make this stricter?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60056/new/

https://reviews.llvm.org/D60056