[LLVMdev] Alias Analysis: zero terminated strings

Duncan Sands baldrick at free.fr
Mon Sep 12 10:02:56 PDT 2011


Hi Nick,

>> I'm developing a programming language that is optimized for strings. A
>> first hello world program shows me that llvm needs a lot more work on
>> zero terminated strings.
>> In the following example, I have an auto generated hello world example
>> optimized with -O3. The problem is that the constant string is copied
>> into a malloced mem area, then puts is called and then the memory is
>> freed. There is also some leftover from the reference counters. These
>> are found by the dead code elimination after the puts call. But before
>> the puts call, the constant folded number is put into the memory and is
>> never used. I was told that llvm assumes that a function also can read
>> below the pointer, so dead code elimination does not work here. The
>> second thing I would like to have there is to tell LLVM that the
>> interesting memory ends after the zero termination. I think these two
>> flags: dont_read_below and dont_read_above_zero should be enough to make
>> LLVM optimize that example.
>
> LLVM could figure out that there is no "below the pointer" by noticing
> that the object came from malloc.

I think it's more complicated: there is code that writes something before
the string, so the malloc'd memory can't be considered to be all "undef"
except for the string part.
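
To make that concrete, here is a minimal sketch in C of the kind of layout
I have in mind. It assumes the runtime keeps a reference count directly in
front of the characters, in the same malloc'd block. The names rc_string,
rc_string_new, rc_string_header and rc_string_release are made up for
illustration and are not taken from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a reference count sits directly before the
   character data inside one malloc'd block. */
typedef struct {
    int32_t refcount;   /* lives "below" the pointer handed to callers */
    char    data[];     /* zero-terminated characters start here */
} rc_string;

/* Allocates header plus characters; returns a pointer to the characters. */
static char *rc_string_new(const char *src) {
    size_t len = strlen(src);
    rc_string *s = malloc(sizeof *s + len + 1);
    if (s == NULL) return NULL;
    s->refcount = 1;                 /* a store in front of the string */
    memcpy(s->data, src, len + 1);   /* characters plus the terminator */
    return s->data;
}

/* Steps backwards from the character pointer to reach the header. */
static rc_string *rc_string_header(char *chars) {
    assert(chars != NULL);
    return (rc_string *)(chars - offsetof(rc_string, data));
}

/* Drops one reference and frees the whole block when none remain. */
static void rc_string_release(char *chars) {
    rc_string *s = rc_string_header(chars);
    if (--s->refcount == 0) free(s);
}
```

With a layout like this, the bytes in front of the pointer passed to puts
hold a defined value, so the optimizer cannot simply treat everything in
the block outside the string as undef.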

Ciao, Duncan.

> I think the missing optimization here
> is a heap->stack transform.
>
> I note that in your example the exit is not post-dominated by free().
> The transform could still fire by noticing that the pointer returned by
> malloc never escaped the function. (A more expensive check would be to
> see that it never escaped the function along the path that didn't call
> free. This is related to http://llvm.org/PR8908#c1 .)
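
As a rough illustration of the heap->stack idea, the sketch below shows the
same routine written by hand in both forms, in C rather than LLVM IR. The
function names, the message and the STACK_CAP bound are assumptions made for
the example; this is not the output of any existing LLVM pass.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char msg[] = "hello world";

/* Before: the buffer comes from malloc. The pointer is only passed to
   puts(); it is never stored, returned, or otherwise escaped. */
static int greet_heap(void) {
    char *buf = malloc(sizeof msg);
    if (buf == NULL) return EOF;
    memcpy(buf, msg, sizeof msg);
    int rc = puts(buf);
    free(buf);
    return rc;
}

enum { STACK_CAP = 64 };  /* assumed bound for what may live on the stack */

/* After: the allocation has a small known size and does not outlive the
   call, so a stack buffer behaves the same and needs no free(). */
static int greet_stack(void) {
    char buf[STACK_CAP];
    assert(sizeof msg <= sizeof buf);
    memcpy(buf, msg, sizeof msg);
    return puts(buf);
}
```

The rewrite is only valid because the pointer does not escape and the size
is a small constant. If either condition failed, the heap version would
have to stay.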
>
> One other thing we may want is a flag for "does not care about the
> pointer itself, only what the pointer points to". Currently, nothing
> tells LLVM that puts() doesn't check whether the pointer argument ==
> &string_00000001, so we can't actually remove the copy. We could
> special-case that optimization into SimplifyLibCalls.
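
The difference between a callee that only looks at the pointed-to bytes and
one that also inspects the pointer value can be sketched in C as follows.
The names greeting, content_len, is_original and copy_is_invisible_to are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char greeting[] = "hello";

/* Depends only on the bytes the pointer refers to. */
static long content_len(const char *s) {
    return (long)strlen(s);
}

/* Depends on the pointer value itself, not just on the bytes. */
static long is_original(const char *s) {
    return s == greeting;
}

/* Returns 1 if f gives the same answer for a private copy of the string
   as for the original constant, and 0 otherwise. */
static int copy_is_invisible_to(long (*f)(const char *)) {
    char copy[sizeof greeting];
    assert(f != NULL);
    memcpy(copy, greeting, sizeof greeting);
    return f(copy) == f(greeting);
}
```

Replacing the copy with the original constant is safe for a callee like
content_len and observable for one like is_original. Without some way to
say which kind puts() is, the optimizer has to assume the second.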
>
> Nick



