[LLVMdev] lifetime.start/end clarification

Wed Nov 5 08:47:34 PST 2014

Here are some comments.

It seems to me there are 2 (mostly separate) aspects:

1. Teaching clang how to do domination / post-domination analysis, so that the lifetime information (alloca/dealloca, or lifetime markers) can be set in the right locations in the IR. Such an analysis should be able to understand the language scopes, as well as labels / gotos and EH. The results of this analysis would be used at IR codegen.

Caveat: I have no expertise in this area of Clang, so take everything with a grain of salt and please feel free to correct me.

Where should this analysis be run? Presumably at the beginning of each function’s codegen.

This analysis looks a bit special (at least to me), as it will work over the AST, but it also requires a pretty good understanding of LLVM’s IR, which sets it apart from other clang analyses.

Maybe another option (some may call it a poor man’s option) would be to enforce the dominators / post-dominators on the multiple paths (normal & EH control flows) at IR codegen time, by inserting basic blocks around each statement as it is codegened. Those blocks would be fall-through most of the time, so the LLVM optimizers could remove them easily. The obvious drawback is that it would insert lots of small or fall-through BBs.

2. How liveranges are represented in LLVM’s IR.

I like the idea of pairing alloca / dealloca (and removing the lifetime markers, at least for stack coloring) and I think it could even ease / improve some analysis.

Currently, allocas have to be located in the entry BB in order to get a chance to be promoted to registers by the Mem2Reg pass. Allocas in other BBs are considered to be dynamic. I have no idea how difficult it would be to teach Mem2Reg to consider alloca/dealloca in other basic blocks. 

With the alloca / dealloca solution, in order to do stack colouring, the alloca must _not_ be in the entry block, because all allocas defined there are alive at the same time and cannot be merged. All LLVM passes would need to be taught that those alloca / dealloca pairs correspond to stack slots, just like the allocas in the entry block. The pairing would also have to be preserved across transformations (same as lifetime.start/end).
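To make the merging argument concrete, here is a minimal sketch of slot sharing as interval coloring. It is a toy model, not LLVM's stack coloring pass: `LiveRange`, `assignSlots` and `slotsNeeded` are hypothetical names, live ranges are assumed to be simple intervals over a linear instruction numbering, and object sizes and alignment are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a stack object is live over the half-open range
// [start, end) of some linear numbering of the function's instructions.
struct LiveRange {
  int start;
  int end;
};

// Greedy interval coloring. Returns, for each object, the index of the stack
// slot it was given. Two objects share a slot only when their live ranges do
// not overlap. Object sizes and alignment are ignored in this sketch.
std::vector<int> assignSlots(const std::vector<LiveRange> &ranges) {
  std::vector<std::size_t> order(ranges.size());
  for (std::size_t i = 0; i < order.size(); ++i)
    order[i] = i;
  // Visit objects by increasing start point.
  std::stable_sort(order.begin(), order.end(),
                   [&ranges](std::size_t a, std::size_t b) {
                     return ranges[a].start < ranges[b].start;
                   });

  std::vector<int> slotEnd; // end of the last range placed in each slot
  std::vector<int> slotOf(ranges.size(), -1);
  for (std::size_t idx : order) {
    assert(ranges[idx].start <= ranges[idx].end);
    std::size_t slot = 0;
    // Reuse the first slot whose previous occupant is already dead.
    while (slot < slotEnd.size() && slotEnd[slot] > ranges[idx].start)
      ++slot;
    if (slot == slotEnd.size())
      slotEnd.push_back(0);
    slotEnd[slot] = ranges[idx].end;
    slotOf[idx] = static_cast<int>(slot);
  }
  return slotOf;
}

// Number of distinct stack slots needed for the given live ranges.
std::size_t slotsNeeded(const std::vector<LiveRange> &ranges) {
  const std::vector<int> slots = assignSlots(ranges);
  if (slots.empty())
    return 0;
  return static_cast<std::size_t>(
             *std::max_element(slots.begin(), slots.end())) + 1;
}
```

In this model, two objects live over {0, 10} and {10, 20} fit in one slot, whereas two entry-block allocas without any lifetime information would both have to be modelled as live over the whole function, e.g. {0, 20} each, and so need two slots.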

Cheers,

Arnaud

From: Reid Kleckner [mailto:rnk at google.com] 
Sent: 04 November 2014 19:35
To: Arnaud De Grandmaison; Nick Lewycky; Rafael Ávila de Espíndola
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] lifetime.start/end clarification

Short version: I think Clang needs some analysis capabilities to widen the lifetime.

---

I think we need a new approach. These intrinsics were designed to be general between heap and stack, and we don't need that extra generality for the simple problem of stack coloring that we're trying to solve here. See for example the size parameter, which I bet stack coloring doesn't need. If we have a bitcast alloca, we already know the size.

Rafael had an idea at the dev meeting that maybe the IR needs a stack deallocation instruction to pair with alloca. Then we could teach LLVM to consider allocas that are post-dominated by a deallocation instruction to be static, and fold them into the entry block. He pointed out that the Swift IL actually has such a construct.

It would be the responsibility of the frontend to ensure that each alloca is post-dominated by its "dealloca", so in this example with labels, we'd just have to hoist the allocation to the nearest dominating block, or just give up and go to the function entry block. Similarly the deallocation has to be moved to post-dominate the allocation, to handle cases like:

void foo(int x) {
  if (x > 10) {
    // alloca y
    goto lbl;
    while (x) {
      int y;
lbl:
      y = bar();
      x -= y;
    }
    // dealloca y
  }
}

This representation would support more aggressive stack coloring. Furthermore, it supports a much more efficient lowering for inalloca, which is why I'm somewhat interested in it.

If we don't want to do this, we can do something less drastic and either add new intrinsics or modify the current ones with the same rules proposed above. We'd have the alloca in the entry block, the lifetime start at the first block that dominates all uses, and the deallocation at the first block that post-dominates all that stuff.
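As an illustration of "the first block that dominates all uses", here is a minimal sketch of the textbook iterative dominator computation followed by a nearest-common-dominator query. It is not how Clang or LLVM implement this: the CFG encoding (`Preds`), the function names and the block numbering are assumptions made for the example, block 0 is assumed to be the entry, and every block is assumed reachable. The sample CFG is the one of the test() function quoted later in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG model: blocks are indices, preds[b] lists b's predecessors.
// Block 0 is assumed to be the entry and every block is assumed reachable.
using Preds = std::vector<std::vector<int>>;
using DomSets = std::vector<std::vector<bool>>;

// Iterative dominator sets:
//   dom(entry) = {entry}
//   dom(b)     = {b} plus the intersection of dom(p) over all predecessors p
// dom[b][d] is true when block d dominates block b.
DomSets computeDominators(const Preds &preds) {
  const std::size_t n = preds.size();
  DomSets dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true;
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// Nearest block that dominates every block in `uses`.
int nearestCommonDominator(const DomSets &dom, const std::vector<int> &uses) {
  assert(!uses.empty());
  const std::size_t n = dom.size();
  std::vector<bool> common(n, true);
  for (int u : uses)
    for (std::size_t d = 0; d < n; ++d)
      common[d] = common[d] && dom[u][d];
  // The common dominators lie on one path of the dominator tree, so the
  // nearest one is the one that itself has the most dominators.
  int best = 0;
  std::size_t bestDepth = 0;
  for (std::size_t c = 0; c < n; ++c) {
    if (!common[c])
      continue;
    std::size_t depth = 0;
    for (std::size_t d = 0; d < n; ++d)
      if (dom[c][d])
        ++depth;
    if (depth > bestDepth) {
      best = static_cast<int>(c);
      bestDepth = depth;
    }
  }
  return best;
}

// CFG of the test() function quoted later in this thread:
// 0 entry, 1 if.then, 2 label, 3 if.else, 4 if.else.label_crit_edge, 5 if.end3
Preds exampleTestPreds() {
  return {{}, {0}, {1, 4}, {0}, {3}, {2, 3}};
}
```

With the uses of x in if.then (1) and label (2), the query returns the entry block (0): because of the goto, no block other than the entry dominates both uses. Running the same routine on the reversed CFG, with a single exit block numbered 0, would give post-dominators instead.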

One other thing to think about is EH. We can often get into a situation where uses of y are statically reachable from cleanup code while being dynamically unreachable. This can happen when cleanups are not simply ordered in a stack-like manner. I think if we can teach clang to do this kind of domination analysis, then we can probably detect this case and give up on it by allocating in the entry block.

On Tue, Nov 4, 2014 at 3:59 AM, Arnaud A. de Grandmaison <arnaud.degrandmaison at arm.com> wrote:

The LRM (http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic) essentially states that:

- ptr is dead before a call to “lifetime.start size, ptr”

- ptr is dead after a call to “lifetime.end size, ptr”

This is all good and fine, and the expected use case is that all “lifetime.end size, ptr” markers are matched with a preceding “lifetime.start size, ptr” in the CFG.

What is supposed to happen when a “lifetime.end size, ptr” is not matched with a “lifetime.start size, ptr”? I think there are a few possible answers:

- the memory area pointed to by ptr is assumed to be alive since function entry

- the memory area pointed to by ptr is assumed to be dead since function entry, as it has not been marked alive

- this is an unexpected situation

I think this ambiguity should be cleared up in the LRM, because today’s implicit assumption may be broken at any time.

This is not a theoretical question: clang can generate such cases. For example, the following testcase:

struct X {
  void doSomething();
  char b[33];
};

void bar(X &);
void baz();

void test(int i) {
  if (i==9) {
    X x;
    x.doSomething();
label:
    bar(x);
  } else {
    baz();
    if (i==0)
      goto label;
  }
}

Produces:

%struct.X = type { [33 x i8] }

define void @_Z4testi(i32 %i) {
entry:
  %x = alloca %struct.X, align 1
  %cmp = icmp eq i32 %i, 9
  br i1 %cmp, label %if.then, label %if.else

if.then:                                          ; preds = %entry
  %0 = getelementptr inbounds %struct.X* %x, i64 0, i32 0, i64 0
  call void @llvm.lifetime.start(i64 33, i8* %0)
  call void @_ZN1X11doSomethingEv(%struct.X* %x)
  br label %label

label:                                            ; preds = %if.else.label_crit_edge, %if.then
  %.pre-phi = phi i8* [ %.pre, %if.else.label_crit_edge ], [ %0, %if.then ]
  call void @_Z3barR1X(%struct.X* dereferenceable(33) %x)
  call void @llvm.lifetime.end(i64 33, i8* %.pre-phi)
  br label %if.end3

if.else:                                          ; preds = %entry
  tail call void @_Z3bazv()
  %cmp1 = icmp eq i32 %i, 0
  br i1 %cmp1, label %if.else.label_crit_edge, label %if.end3

if.else.label_crit_edge:                          ; preds = %if.else
  %.pre = getelementptr inbounds %struct.X* %x, i64 0, i32 0, i64 0
  br label %label

if.end3:                                          ; preds = %if.else, %label
  ret void
}

Note that the path thru if.else.label_crit_edge has no lifetime start.
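This kind of unmatched lifetime.end can be detected mechanically. The sketch below is a simplified reachability check, not an existing LLVM analysis: `Block`, `CFG`, `unmatchedEnds` and `exampleCFG` are hypothetical names, only one alloca is tracked, and within a block the lifetime.start is assumed to come before any lifetime.end. `exampleCFG` encodes the IR above by hand.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified CFG model: each block records whether it contains
// a lifetime.start / lifetime.end for one given alloca, plus its successors.
struct Block {
  bool hasStart = false;
  bool hasEnd = false;
  std::vector<std::string> succs;
};

using CFG = std::map<std::string, Block>;

// Returns the blocks whose lifetime.end can be reached along at least one
// path from `entry` that never executed a lifetime.start.
std::vector<std::string> unmatchedEnds(const CFG &cfg,
                                       const std::string &entry) {
  // reachedUnstarted[b]: some path reaches the top of b without a start.
  std::map<std::string, bool> reachedUnstarted;
  std::vector<std::string> worklist{entry};
  reachedUnstarted[entry] = true;
  std::vector<std::string> result;
  while (!worklist.empty()) {
    const std::string name = worklist.back();
    worklist.pop_back();
    const Block &b = cfg.at(name);
    if (b.hasStart)
      continue; // the lifetime begins here, so stop following this path
    if (b.hasEnd)
      result.push_back(name);
    for (const std::string &s : b.succs) {
      if (!reachedUnstarted[s]) {
        reachedUnstarted[s] = true;
        worklist.push_back(s);
      }
    }
  }
  return result;
}

// Hand-written model of the IR produced for test() above.
CFG exampleCFG() {
  CFG cfg;
  cfg["entry"].succs = {"if.then", "if.else"};
  cfg["if.then"].hasStart = true;
  cfg["if.then"].succs = {"label"};
  cfg["label"].hasEnd = true;
  cfg["label"].succs = {"if.end3"};
  cfg["if.else"].succs = {"if.else.label_crit_edge", "if.end3"};
  cfg["if.else.label_crit_edge"].succs = {"label"};
  cfg["if.end3"] = Block{};
  return cfg;
}
```

Calling `unmatchedEnds(exampleCFG(), "entry")` reports the block `label`, because it is reachable via if.else and if.else.label_crit_edge without passing through the lifetime.start in if.then.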

Cheers,

--

Arnaud
