[llvm-commits] GC patches yet again

Gordon Henriksen gordonhenriksen at mac.com
Sun Sep 30 07:03:48 PDT 2007


On 2007-09-28, at 17:59, Chris Lattner wrote:

> Okay, I don't think we'll be able to inline code effectively that  
> uses two different collectors.  My brain hurts thinking about what  
> to do if you get to a stop point valid in one collector but not the  
> other etc.

Mine, too.

> Another option is to mark each function that requires GC info with  
> an attribute.  The attribute could encode the GC model to use as  
> well. This is basically the front-end telling the codegen what to  
> do, which is better than relying on a magic command line option.

Agreed. More patches!

— Gordon


//===-- gc-4-stringpool.patch (+180) --------------------------===//

   include/llvm/Support/StringPool.h (+124)
   lib/Support/StringPool.cpp (+56)

A reference-counted string interning facility. Will be used by the
next patch.

This patch is independent.
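
For anyone who has not opened the patch, here is a rough sketch of what "reference-counted interning" means. It is not the code in StringPool.h; the names InternPool and Handle are invented for this illustration, and the std::map storage is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical illustration only; not the interface from StringPool.h.
class InternPool {
  struct Entry { std::size_t Refs = 0; };
  std::map<std::string, Entry> Table;  // one node per distinct string

public:
  class Handle {
    InternPool *Pool = nullptr;
    std::map<std::string, Entry>::iterator It;

    void retain() { if (Pool) ++It->second.Refs; }
    void release() {
      if (Pool && --It->second.Refs == 0)
        Pool->Table.erase(It);         // last user gone: drop the string
      Pool = nullptr;
    }

  public:
    Handle() = default;
    Handle(InternPool *P, std::map<std::string, Entry>::iterator I)
        : Pool(P), It(I) { retain(); }
    Handle(const Handle &O) : Pool(O.Pool), It(O.It) { retain(); }
    Handle &operator=(const Handle &O) {
      if (this != &O) { release(); Pool = O.Pool; It = O.It; retain(); }
      return *this;
    }
    ~Handle() { release(); }

    const std::string &str() const { assert(Pool); return It->first; }

    // Interned strings compare by identity, not by contents.
    bool operator==(const Handle &O) const {
      return Pool == O.Pool && (!Pool || It == O.It);
    }
  };

  // Returns a handle to the single shared copy of S.
  Handle intern(const std::string &S) {
    auto It = Table.emplace(S, Entry()).first;  // reuses an existing node
    return Handle(this, It);
  }

  std::size_t size() const { return Table.size(); }
};
```

Interning "shadow-stack" twice yields two handles to one table entry, and the entry disappears once the last handle is destroyed. The pool must outlive its handles.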


//===-- gc-5-funattr.docs.patch (+77 -30) ---------------------===//

   docs/LangRef.html (+22 -3)
   docs/GarbageCollection.html (+55 -27)

Updates the documentation to sync with gc-5-funattr.


//===-- gc-5-funattr.patch (+127 -55) -------------------------===//

   test/CodeGen/Generic/GC/alloc_loop.ll (+38 -40)
   test/Assembler/2007-09-29-GC.ll (+12)
   include/llvm/Bitcode/LLVMBitCodes.h (+3 -1)
   include/llvm/Function.h (+9)
   lib/Bitcode/Reader/BitcodeReader.cpp (+14 -1)
   lib/Bitcode/Writer/BitcodeWriter.cpp (+23 -10)
   lib/VMCore/AsmWriter.cpp (+2)
   lib/VMCore/Function.cpp (+10)
   lib/AsmParser/llvmAsmParser.y (+13 -3)
   lib/AsmParser/Lexer.l (+1)
   lib/Transforms/Utils/CloneModule.cpp (+2)

Adds these methods to Function and makes corresponding changes to
assembly and bitcode:

   bool hasCollector() const;
   const std::string &getCollector() const;
   void setCollector(const std::string &);
   void clearCollector();

The assembly representation looks like this:

   define void @f() gc "shadow-stack" { ...

Uses StringPool to unique collector names, since a given process is
extremely likely to use just one.

This patch depends on gc-4-stringpool.
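
The four accessors in gc-5-funattr can be pictured with a small stand-in class. FunctionStub and printHeader are hypothetical names made up for this sketch. The real Function keeps the name in the StringPool, whereas this sketch assumes a plain std::string and a flag.

```cpp
#include <cassert>
#include <string>

// Stand-in for llvm::Function, reduced to the accessors listed above.
// Storing the name as a std::string plus a flag is an assumption of
// this sketch, not what the patch does.
class FunctionStub {
  std::string Name;
  std::string Collector;
  bool HasCollector = false;

public:
  explicit FunctionStub(const std::string &N) : Name(N) {}

  const std::string &getName() const { return Name; }

  bool hasCollector() const { return HasCollector; }
  const std::string &getCollector() const {
    assert(HasCollector && "function has no collector");
    return Collector;
  }
  void setCollector(const std::string &C) {
    Collector = C;
    HasCollector = true;
  }
  void clearCollector() {
    Collector.clear();
    HasCollector = false;
  }
};

// Builds the function header roughly the way an assembly writer might:
// the gc clause appears only when a collector has been set.
std::string printHeader(const FunctionStub &F) {
  std::string Out = "define void @" + F.getName() + "()";
  if (F.hasCollector())
    Out += " gc \"" + F.getCollector() + "\"";
  return Out;
}
```

A front end would call setCollector("shadow-stack") on each function it emits that needs GC info, which replaces the command line option.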


//===-- gc-5-funattr.regen.patch (+1879 -1844) ----------------===//

   lib/AsmParser/Lexer.l.cvs (+1)
   lib/AsmParser/llvmAsmParser.cpp.cvs (+1109 -1091)
   lib/AsmParser/llvmAsmParser.h.cvs (+4 -3)
   lib/AsmParser/llvmAsmParser.y.cvs (+13 -3)
   lib/AsmParser/Lexer.cpp.cvs (+752 -747)

Regenerate for gc-5-funattr.


//===-- gc-6-redux.patch (+288 -193) --------------------------===//

   include/llvm/CodeGen/Passes.h (+18)
   include/llvm/CodeGen/Collector.h (+43 -43)
   include/llvm/CodeGen/CollectorMetadata.h (+30 -31)
   include/llvm/CodeGen/Collectors.h (+3)
   lib/CodeGen/Collector.cpp (+125 -80)
   lib/CodeGen/CollectorMetadata.cpp (+69 -39)

CollectorMetadata and Collector are altered to work with the
per-function collector model. The hierarchy is now:

   CollectorModuleMetadata (per module)
     CollectorMetadata (by Function*)
     Collectors (by string)
       CollectorMetadata (sequentially)
         Roots (sequentially)
         GCPoints (sequentially)
           Live Roots (sequentially)

Collector is now the factory for CollectorMetadata, so
CollectorMetadata can be subclassed.

This patch depends on gc-5-funattr.
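
The ownership hierarchy in gc-6-redux might be sketched as below. Only the class names come from the patch. Every field, the get() and createMetadata() member names, and the Function stand-in are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for llvm::Function (hypothetical fields).
struct Function { std::string Name; std::string CollectorName; };

struct Root { int StackOffset; };
struct GCPoint {
  unsigned Label;
  std::vector<unsigned> LiveRoots;  // indices into the function's Roots
};

// Per-function data: roots, then GC points, each with its live roots.
struct CollectorMetadata {
  const Function *Fn;
  std::vector<Root> Roots;
  std::vector<GCPoint> Points;
  explicit CollectorMetadata(const Function *F) : Fn(F) {}
  virtual ~CollectorMetadata() = default;
};

// A collector owns the metadata of its functions, in order, and acts
// as the factory so a subclass can hand back a richer metadata type.
struct Collector {
  std::string Name;
  std::vector<std::unique_ptr<CollectorMetadata>> Functions;
  explicit Collector(std::string N) : Name(std::move(N)) {}
  virtual ~Collector() = default;
  virtual std::unique_ptr<CollectorMetadata> createMetadata(const Function *F) {
    return std::unique_ptr<CollectorMetadata>(new CollectorMetadata(F));
  }
};

// Per-module root: collectors by name, metadata by Function*.
struct CollectorModuleMetadata {
  std::map<std::string, std::unique_ptr<Collector>> Collectors;
  std::map<const Function *, CollectorMetadata *> ByFunction;

  CollectorMetadata &get(const Function &F) {
    auto Found = ByFunction.find(&F);
    if (Found != ByFunction.end())
      return *Found->second;
    std::unique_ptr<Collector> &C = Collectors[F.CollectorName];
    if (!C)
      C.reset(new Collector(F.CollectorName));
    C->Functions.push_back(C->createMetadata(&F));
    CollectorMetadata *MD = C->Functions.back().get();
    ByFunction[&F] = MD;
    return *MD;
  }
};
```

Two functions naming the same collector share one Collector object, and each gets its own CollectorMetadata.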


//===-- gc-7-integration.patch (+100 -16) ---------------------===//

   include/llvm/CodeGen/LinkAllCodegenComponents.h (+4)
   include/llvm/CodeGen/SelectionDAGISel.h (+3 -1)
   include/llvm/CodeGen/AsmPrinter.h (+4)
   lib/CodeGen/LLVMTargetMachine.cpp (+24 -5)
   lib/CodeGen/AsmPrinter.cpp (+19)
   lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+35 -8)
   lib/CodeGen/README.txt (+5)
   lib/Target/CBackend/CBackend.cpp (+3 -1)
   lib/Target/MSIL/MSILWriter.cpp (+3 -1)

Collector tries to integrate with the compiler again.

In general, the function attribute design (collector per function)
was beneficial to the implementation. However, this version has
slightly more overhead than the previous patch.

Collector and TargetMachine still have no way of verifying their
mutual compatibility.

This patch depends on gc-6-redux.


//===-- gc-8-shadowstack.patch (+488 -371) --------------------===//

   test/CodeGen/Generic/GC/lower_gcroot.ll (+1 -1)
   test/CodeGen/Generic/GC/redundant_init.ll (+17)
   runtime/GC/SemiSpace/semispace.c (+16 -14)
   include/llvm/LinkAllPasses.h (-1)
   include/llvm/Transforms/Scalar.h (-7)
   lib/CodeGen/ShadowStackCollector.cpp (+454)
   lib/Transforms/Scalar/LowerGC.cpp (-348)

With this patch, the LowerGC transformation becomes the
ShadowStackCollector, which additionally has reduced overhead with
no sacrifice in portability.

This patch depends on gc-7-integration.


Considering a function @fun with 8 loop-local roots,
ShadowStackCollector introduces the following overhead
(x86):

; shadowstack prologue
         movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
         movl    (%eax), %ecx
         movl    $___gc_fun, 20(%esp)
         movl    $0, 24(%esp)
         movl    $0, 28(%esp)
         movl    $0, 32(%esp)
         movl    $0, 36(%esp)
         movl    $0, 40(%esp)
         movl    $0, 44(%esp)
         movl    $0, 48(%esp)
         movl    $0, 52(%esp)
         movl    %ecx, 16(%esp)
         leal    16(%esp), %ecx
         movl    %ecx, (%eax)

; shadowstack loop overhead
         (none)

; shadowstack epilogue
         movl    48(%esp), %edx
         movl    %edx, (%ecx)

; shadowstack metadata
         .align  3
___gc_fun:                              # __gc_fun
         .long   8
         .space  4

In comparison to LowerGC:

; lowergc prologue
         movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
         movl    (%eax), %ecx
         movl    %ecx, 48(%esp)
         movl    $8, 52(%esp)
         movl    $0, 60(%esp)
         movl    $0, 56(%esp)
         movl    $0, 68(%esp)
         movl    $0, 64(%esp)
         movl    $0, 76(%esp)
         movl    $0, 72(%esp)
         movl    $0, 84(%esp)
         movl    $0, 80(%esp)
         movl    $0, 92(%esp)
         movl    $0, 88(%esp)
         movl    $0, 100(%esp)
         movl    $0, 96(%esp)
         movl    $0, 108(%esp)
         movl    $0, 104(%esp)
         movl    $0, 116(%esp)
         movl    $0, 112(%esp)

; lowergc loop overhead
         leal    44(%esp), %eax
         movl    %eax, 56(%esp)
         leal    40(%esp), %eax
         movl    %eax, 64(%esp)
         leal    36(%esp), %eax
         movl    %eax, 72(%esp)
         leal    32(%esp), %eax
         movl    %eax, 80(%esp)
         leal    28(%esp), %eax
         movl    %eax, 88(%esp)
         leal    24(%esp), %eax
         movl    %eax, 96(%esp)
         leal    20(%esp), %eax
         movl    %eax, 104(%esp)
         leal    16(%esp), %eax
         movl    %eax, 112(%esp)

; lowergc epilogue
         movl    48(%esp), %edx
         movl    %edx, (%ecx)

; lowergc metadata
         (none)
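
Read back into C++, the ShadowStackCollector prologue, epilogue and metadata above correspond roughly to the sketch below. The struct and function names are invented here, the meaning of the second metadata word (the ".space 4") is an assumption, and RootChain stands in for llvm_gc_root_chain.

```cpp
#include <cassert>
#include <cstdint>

// Static per-function metadata, like ___gc_fun above.
struct FrameMap {
  std::int32_t NumRoots;  // the ".long 8"
  std::int32_t Reserved;  // the ".space 4"; its purpose is assumed here
};

// Header of the per-call frame; the root slots follow it directly.
struct StackEntry {
  StackEntry *Next;       // previous chain head, 16(%esp)
  const FrameMap *Map;    // pointer to the metadata, 20(%esp)
};

template <int N> struct Frame {
  StackEntry Head;
  void *Roots[N];         // 24(%esp) onward
};

StackEntry *RootChain = nullptr;  // stand-in for llvm_gc_root_chain

// Prologue: record the map, null the roots, link into the chain.
template <int N> void pushFrame(Frame<N> &F, const FrameMap &M) {
  F.Head.Map = &M;
  for (void *&R : F.Roots)
    R = nullptr;
  F.Head.Next = RootChain;
  RootChain = &F.Head;
}

// Epilogue: restore the previous chain head.
inline void popFrame(StackEntry &E) { RootChain = E.Next; }

// What a collector does at a GC point: walk the chain and visit each
// root. This relies on the roots sitting immediately after the header,
// as they do in the generated code.
template <class Visitor> void visitRoots(Visitor Visit) {
  for (StackEntry *E = RootChain; E; E = E->Next) {
    void **Roots = reinterpret_cast<void **>(E + 1);
    for (std::int32_t I = 0; I < E->Map->NumRoots; ++I)
      Visit(Roots[I]);
  }
}
```

Because the root count lives in shared metadata and the slots are the roots themselves, nothing has to be rewritten inside the loop, which is where LowerGC spends its leal/movl pairs.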


//===-- gc-9-ocaml-collector.patch (+218) ---------------------===//

   test/CodeGen/Generic/GC/simple_ocaml.ll (+42)
   test/CodeGen/Generic/GC/frame_size.ll (+13)
   lib/CodeGen/OcamlCollector.cpp (+163)

The new OcamlCollector emits the OCaml frametable data structure and
related symbols.

This patch depends on gc-7-integration.


$ llvm-as -o simple_ocaml.bc simple_ocaml.ll
$ llc -gc=ocaml -asm-verbose -o - simple_ocaml.bc
         .text
_camlSimple_ocaml__code_begin:
         .data
_camlSimple_ocaml__data_begin:


         .text
         .align  4,0x90
         .globl  _fun
_fun:
         subl    $12, %esp
         movl    $0, 4(%esp)
         movl    $0, 8(%esp)
         movl    16(%esp), %eax
         movl    %eax, 4(%esp)
LBB1_1: # bb.loop
         movl    4(%esp), %eax
         movl    4(%eax), %eax
         testl   %eax, %eax
         je      LBB1_1  # bb.loop
LBB1_2: # bb.end
         movl    $8, (%esp)
         call    L_malloc$stub

Llabel1:
         movl    %eax, 8(%esp)
         movl    4(%esp), %ecx
         movl    %ecx, 4(%ecx)
         addl    $12, %esp
         ret
.section __IMPORT,__jump_table,symbol_stubs,self_modifying_code+pure_instructions,5
L_malloc$stub:
         .indirect_symbol _malloc
         hlt ; hlt ; hlt ; hlt ; hlt

         .subsections_via_symbols

         .text
_camlSimple_ocaml__code_end:
         .data
_camlSimple_ocaml__data_end:
         .long   0
_camlSimple_ocaml__frametable:
         # live roots for fun
         .long   Llabel1 # call return address
         .short  0xc     # stack frame size
         .short  0x2     # live root count
         .word   4       # stack offset
         .word   8       # stack offset
         .align  2
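
The frametable entry above can be read as one fixed-layout record per safe point. The sketch below encodes such a record into bytes. SafePoint and encodeDescriptor are made-up names, and it assumes little-endian x86 where .long is 4 bytes, .short and .word are 2 bytes, and ".align 2" pads to a 4-byte boundary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One safe point, mirroring the fields printed in the frametable.
struct SafePoint {
  std::uint32_t ReturnAddress;             // .long  Llabel1
  std::uint16_t FrameSize;                 // .short 0xc
  std::vector<std::uint16_t> LiveOffsets;  // .word  4, .word 8
};

// Append a 16-bit value in little-endian byte order.
static void put16(std::vector<std::uint8_t> &Out, std::uint16_t V) {
  Out.push_back(static_cast<std::uint8_t>(V & 0xff));
  Out.push_back(static_cast<std::uint8_t>(V >> 8));
}

// Append a 32-bit value in little-endian byte order.
static void put32(std::vector<std::uint8_t> &Out, std::uint32_t V) {
  put16(Out, static_cast<std::uint16_t>(V & 0xffff));
  put16(Out, static_cast<std::uint16_t>(V >> 16));
}

// Encode: return address, frame size, live root count, one stack
// offset per live root, then padding to a 4-byte boundary.
std::vector<std::uint8_t> encodeDescriptor(const SafePoint &SP) {
  std::vector<std::uint8_t> Out;
  put32(Out, SP.ReturnAddress);
  put16(Out, SP.FrameSize);
  put16(Out, static_cast<std::uint16_t>(SP.LiveOffsets.size()));
  for (std::uint16_t Off : SP.LiveOffsets)
    put16(Out, Off);
  while (Out.size() % 4 != 0)
    Out.push_back(0);
  return Out;
}
```

For the @fun example, with frame size 12 and live roots at offsets 4 and 8, the record comes to 12 bytes and needs no padding.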

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gc.6.zip
Type: application/zip
Size: 84573 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070930/98caafa7/attachment.zip>

