[LLVMdev] Alias Analysis: zero terminated strings
Nick Lewycky
nicholas at mxc.ca
Mon Sep 12 09:31:24 PDT 2011
Carl-Philip Hänsch wrote:
> Hello,
>
> I'm developing a programming language that is optimized for strings. A
> first hello world program shows me that llvm needs a lot more work on
> zero terminated strings.
> In the following example, I have an auto generated hello world example
> optimized with -O3. The problem is, that the constant string is copied
> into a malloced mem area, then puts is called and then the memory is
> freed. There is also some leftover from the reference counters. These
> are found by the dead code elimination after the puts call. But before
> the puts call, the constant folded number is put into the memory and is
> never used. I was told that llvm assumes that a function also can read
> below the pointer, so dead code elimination does not work here. The
> second thing I would like to have there is to tell LLVM that the
> interesting memory ends after the zero termination. I think these two
> flags: dont_read_below and dont_read_above_zero should be enough to make
> LLVM optimize that example.
LLVM could figure out that there is no "below the pointer" by noticing
that the object came from malloc. I think the missing optimization here
is a heap->stack transform.
I note that in your example the exit is not post-dominated by free().
The transform could still fire by noticing that the pointer returned by
malloc never escaped the function. (A more expensive check would be to
see that it never escaped the function along the path that didn't call
free. This is related to http://llvm.org/PR8908#c1 .)
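To make the heap->stack idea concrete, here is a hand-written C sketch of the before and after shapes. It is not the output of any LLVM pass, and the names hello, hello_heap and hello_stack are made up for illustration. It also assumes the simple case where free() is reached on every path after a successful malloc, which is not true of the original example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant string standing in for the front end's literal. */
static const char hello[] = "Hello World";

/* Before: the constant is copied into a malloc'ed buffer, printed, freed. */
int hello_heap(void) {
    char *buf = malloc(sizeof hello);
    if (buf == NULL)
        return EOF;
    memcpy(buf, hello, sizeof hello);   /* copy includes the trailing NUL */
    int rc = puts(buf);
    free(buf);
    return rc;
}

/* After: the buffer never escapes this function, so a stack array of the
 * same size can replace the malloc/free pair. */
int hello_stack(void) {
    char buf[sizeof hello];
    memcpy(buf, hello, sizeof hello);
    return puts(buf);
}
```

Both functions print the same line. Once the buffer is a local array, the usual stack-based optimizations have a chance to clean up the remaining stores.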
One other thing we may want is a flag for "does not care about the
pointer itself, only what the pointer points to". Currently, nothing
tells LLVM that puts() doesn't check whether the pointer argument ==
&string_00000001, so we can't actually remove the copy. We could
special-case that optimization into SimplifyLibCalls.
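A small C sketch of why the pointer value matters. The callees below are hypothetical (checks_identity, checks_contents, and the global string_1 are illustrative names, not anything from the original IR). A callee that compares its argument against the address of the constant can tell the copy from the original, while one that only reads the bytes cannot.

```c
#include <string.h>

/* Hypothetical global standing in for &string_00000001. */
static const char string_1[] = "Hello World";

/* Depends on the pointer value itself. */
int checks_identity(const char *p) { return p == string_1; }

/* Depends only on the bytes the pointer refers to. */
int checks_contents(const char *p) { return strcmp(p, string_1) == 0; }

/* Pass a private copy to each callee and report what it observes. */
int identity_sees_copy(void) {
    char copy[sizeof string_1];
    memcpy(copy, string_1, sizeof string_1);
    return checks_identity(copy);   /* distinct object, so 0 */
}

int contents_see_copy(void) {
    char copy[sizeof string_1];
    memcpy(copy, string_1, sizeof string_1);
    return checks_contents(copy);   /* same bytes, so 1 */
}
```

Replacing the copy with the original pointer would change the result of checks_identity but not of checks_contents. puts() behaves like the second kind of callee, but the optimizer needs to be told that.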
Nick