[llvm-commits] GC patches again

Gordon Henriksen gordonhenriksen at mac.com
Wed Sep 26 22:22:21 PDT 2007


Chris,

For your review.

Thanks!

— Gordon


//===-- gc-0-docs.patch (+1170 -241) --------------------------===//

   runtime/GC/SemiSpace/README.txt (+5)
   docs/llvm.css (+1 -1)
   docs/Lexicon.html (+76 -3)
   docs/GarbageCollection.html (+1088 -237)

GarbageCollection.html is expanded to encompass the new
capabilities. This is a major rewrite and is easier to read in toto
rather than patchwise:

http://homepage.mac.com/malichus/GarbageCollection.html

This patch is independent.


//===-- gc-1-registry.patch (+300) ----------------------------===//

   include/llvm/Support/Registry.h (+243)
   include/llvm/CodeGen/Collectors.h (+36)
   lib/CodeGen/Collectors.cpp (+21)

This patch contains my previous Registry.h header, as well as
Collectors.h, which is the registry for dynamically-loaded garbage
collectors.

This patch is independent.
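
For readers unfamiliar with the pattern, here is a minimal sketch of a
self-registering plugin registry. It is not the code in Registry.h; the
names ToyRegistry, ToyCollector and ToyShadowStack are hypothetical and
exist only for this illustration. The idea shown is that a global
object's constructor links an entry into a list, so loading a library is
enough to make its collector discoverable by name.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a collector; not the interface in Collector.h.
struct ToyCollector {
  virtual ~ToyCollector() = default;
  virtual std::string name() const = 0;
};

// Minimal registry: an intrusive linked list of entries that is built up
// by the constructors of global objects.
class ToyRegistry {
public:
  using Ctor = std::unique_ptr<ToyCollector> (*)();

  struct Entry {
    const char *Name;
    const char *Desc;
    Ctor Create;
    Entry *Next;
  };

  // Defining an Add<T> at namespace scope links T into the registry.
  template <typename T> struct Add {
    Entry E;
    Add(const char *Name, const char *Desc)
        : E{Name, Desc, &make, head()} {
      head() = &E;
    }
    static std::unique_ptr<ToyCollector> make() {
      return std::unique_ptr<ToyCollector>(new T());
    }
  };

  // Look up an entry by name; returns nullptr if nothing matches.
  static const Entry *find(const std::string &Name) {
    for (const Entry *E = head(); E; E = E->Next)
      if (Name == E->Name)
        return E;
    return nullptr;
  }

private:
  // A function-local static sidesteps static initialization order issues.
  static Entry *&head() {
    static Entry *Head = nullptr;
    return Head;
  }
};

struct ToyShadowStack : ToyCollector {
  std::string name() const override { return "shadow-stack"; }
};

// Registration happens as a side effect of constructing this global.
static ToyRegistry::Add<ToyShadowStack>
    RegisterShadowStack("shadow-stack", "toy shadow stack collector");
```

A driver could then call ToyRegistry::find() with the value of a
command-line option and invoke Create on the returned entry.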


//===-- gc-2-metadata.patch (+383) ----------------------------===//

   include/llvm/CodeGen/CollectorMetadata.h (+198)
   lib/CodeGen/CollectorMetadata.cpp (+185)

CollectorMetadata is the data structure populated by back-ends
during code-generation.

This patch is independent.


//===-- gc-3-collector.patch (+531) ---------------------------===//

   include/llvm/CodeGen/Collector.h (+134)
   lib/CodeGen/Collector.cpp (+359)
   lib/CodeGen/README.txt (+38)

Collector is the base class for garbage collector code generators.
This version enhances the previous patch to add root initialization
as discussed here:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070910/053455.html

Collector gives its subclasses control over generic algorithms:

   unsigned NeededSafePoints; //< Bitmask of required safe points.
   bool CustomReadBarriers;   //< Default is to insert loads.
   bool CustomWriteBarriers;  //< Default is to insert stores.
   bool CustomRoots;          //< Default is to pass through to backend.
   bool InitRoots;            //< If set, roots are nulled during lowering.

It also has callbacks which collectors can hook:

   /// If any of the actions are set to Custom, this is expected to
   /// be overridden to create a transform to lower those actions to
   /// LLVM IR.
   virtual Pass *createCustomLoweringPass() const;

   /// beginAssembly/finishAssembly - Emit module metadata as
   /// assembly code.
   virtual void beginAssembly(Module &M, std::ostream &OS,
                              AsmPrinter &AP,
                              const TargetAsmInfo &TAI) const;
   virtual void finishAssembly(Module &M,
                               CollectorModuleMetadata &CMM,
                               std::ostream &OS, AsmPrinter &AP,
                               const TargetAsmInfo &TAI) const;

Various other independent algorithms could be implemented, but were
not necessary for the initial two collectors. Some examples are
listed here:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-August/010500.html

This patch depends on gc-2-metadata.
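
To illustrate how a subclass might use the configuration fields listed
above, here is a simplified, standalone model. It does not include any
LLVM header. Only the five field names come from the patch; the class
names, the default values, the ToySafePoint enum and the
needsCustomLowering() helper are assumptions made for this sketch.

```cpp
#include <cassert>

// Simplified stand-in for the Collector base class. The default values
// below are placeholders, not necessarily those used by the patch.
class ToyCollectorBase {
public:
  virtual ~ToyCollectorBase() = default;

  unsigned NeededSafePoints = 0;    // Bitmask of required safe points.
  bool CustomReadBarriers = false;  // Default is to insert loads.
  bool CustomWriteBarriers = false; // Default is to insert stores.
  bool CustomRoots = false;         // Default is to pass through to backend.
  bool InitRoots = false;           // If set, roots are nulled during lowering.

  // Hypothetical helper: true if the subclass asked to lower any action
  // itself, in which case it would also need to supply a lowering pass.
  bool needsCustomLowering() const {
    return CustomReadBarriers || CustomWriteBarriers || CustomRoots;
  }
};

// Hypothetical safe point kinds, used only to give the bitmask a meaning.
enum ToySafePoint : unsigned { PostCall = 1u << 0, Loop = 1u << 1 };

// A collector that wants stack maps at call sites and nulled roots, but
// is happy with the default lowering of everything else.
struct ToyStackMapCollector : ToyCollectorBase {
  ToyStackMapCollector() {
    NeededSafePoints = PostCall;
    InitRoots = true;
  }
};

// A collector that takes over root handling entirely.
struct ToyCustomRootCollector : ToyCollectorBase {
  ToyCustomRootCollector() { CustomRoots = true; }
};
```

The point of the design is that a subclass states what it needs in its
constructor and the generic algorithms act on those flags, so a simple
collector needs to override very little.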


//===-- gc-4-integration.patch (+116 -17) ---------------------===//

   include/llvm/CodeGen/LinkAllCodegenComponents.h (+4)
   include/llvm/CodeGen/SelectionDAGISel.h (+3 -1)
   include/llvm/CodeGen/AsmPrinter.h (+4)
   tools/llc/llc.cpp (+13)
   lib/CodeGen/LLVMTargetMachine.cpp (+28 -4)
   lib/CodeGen/AsmPrinter.cpp (+15)
   lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+41 -9)
   lib/Target/ARM/ARMAsmPrinter.cpp (+1)
   lib/Target/X86/X86AsmPrinter.h (+1 -1)
   lib/Target/CBackend/CBackend.cpp (+3 -1)
   lib/Target/MSIL/MSILWriter.cpp (+3 -1)

In this patch, Collector winds its tendrils throughout the compiler.
Overhead should be minimal when disabled.

I would particularly appreciate any feedback on this interface.
The primary item of concern to me is that I exposed the desired
collector to the compiler using a global. I have not decided on a
better approach. In the meantime, it works and is simple.

Less concretely, I am not convinced that Collector is well-factored
in its interaction with TargetMachine et al. Collector and
TargetMachine have no way of sniffing their respective capabilities
and requirements. So llc -march=c -gc=ocaml will silently do the
wrong thing; likewise the JIT, object writers, and MSIL backend.

Finally, the -gc option is mandatory if collector intrinsics are
used. This is somewhat sensible since there is no default collector
runtime (excluding the MSIL target), but I would rather stuff any
necessary information inside the LLVM representation somehow so that
the .bc/.ll remain self-contained.

This patch depends on gc-1-registry and gc-3-collector.


//===-- gc-5-shadowstack.patch (+493 -371) --------------------===//

   test/CodeGen/Generic/GC/alloc_loop.ll (+1 -1)
   test/CodeGen/Generic/GC/lower_gcroot.ll (+1 -1)
   test/CodeGen/Generic/GC/redundant_init.ll (+17)
   include/llvm/LinkAllPasses.h (-1)
   include/llvm/Transforms/Scalar.h (-7)
   runtime/GC/SemiSpace/semispace.c (+15 -13)
   lib/CodeGen/ShadowStackCollector.cpp (+459)
   lib/Transforms/Scalar/LowerGC.cpp (-348)

With this patch, the LowerGC transformation becomes the
ShadowStackCollector, which additionally has reduced overhead with
no sacrifice in portability.

This patch depends on gc-4-integration.


Considering a function @fun with 8 loop-local roots,
ShadowStackCollector introduces the following overhead
(x86):

; shadowstack prologue
         movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
         movl    (%eax), %ecx
         movl    $___gc_fun, 20(%esp)
         movl    $0, 24(%esp)
         movl    $0, 28(%esp)
         movl    $0, 32(%esp)
         movl    $0, 36(%esp)
         movl    $0, 40(%esp)
         movl    $0, 44(%esp)
         movl    $0, 48(%esp)
         movl    $0, 52(%esp)
         movl    %ecx, 16(%esp)
         leal    16(%esp), %ecx
         movl    %ecx, (%eax)

; shadowstack loop overhead
         (none)

; shadowstack epilogue
         movl    48(%esp), %edx
         movl    %edx, (%ecx)

; shadowstack metadata
         .align  3
___gc_fun:                              # __gc_fun
         .long   8
         .space  4

In comparison to LowerGC:

; lowergc prologue
         movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
         movl    (%eax), %ecx
         movl    %ecx, 48(%esp)
         movl    $8, 52(%esp)
         movl    $0, 60(%esp)
         movl    $0, 56(%esp)
         movl    $0, 68(%esp)
         movl    $0, 64(%esp)
         movl    $0, 76(%esp)
         movl    $0, 72(%esp)
         movl    $0, 84(%esp)
         movl    $0, 80(%esp)
         movl    $0, 92(%esp)
         movl    $0, 88(%esp)
         movl    $0, 100(%esp)
         movl    $0, 96(%esp)
         movl    $0, 108(%esp)
         movl    $0, 104(%esp)
         movl    $0, 116(%esp)
         movl    $0, 112(%esp)

; lowergc loop overhead
         leal    44(%esp), %eax
         movl    %eax, 56(%esp)
         leal    40(%esp), %eax
         movl    %eax, 64(%esp)
         leal    36(%esp), %eax
         movl    %eax, 72(%esp)
         leal    32(%esp), %eax
         movl    %eax, 80(%esp)
         leal    28(%esp), %eax
         movl    %eax, 88(%esp)
         leal    24(%esp), %eax
         movl    %eax, 96(%esp)
         leal    20(%esp), %eax
         movl    %eax, 104(%esp)
         leal    16(%esp), %eax
         movl    %eax, 112(%esp)

; lowergc epilogue
         movl    48(%esp), %edx
         movl    %edx, (%ecx)

; lowergc metadata
         (none)
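
For reference, the shadowstack prologue above appears to build a frame
of the form {next, map, roots...} at 16(%esp) and push it onto the root
chain. The following is a rough C++ model of that layout and of how a
runtime might walk it. The struct and function names are assumptions,
as is the reading of the second metadata word (the .space 4) as a zero
metadata count; the real runtime lives in semispace.c.

```cpp
#include <cassert>
#include <cstdint>

// Per-function metadata, assumed to correspond to ___gc_fun above.
struct FrameMap {
  int32_t NumRoots; // the ".long 8" in the listing
  int32_t NumMeta;  // assumed meaning of the zeroed ".space 4"
};

// Header of one shadow-stack frame; the root slots follow it directly.
struct StackEntryHeader {
  StackEntryHeader *Next; // caller's frame
  const FrameMap *Map;    // static metadata for this function
};

// Stands in for the llvm_gc_root_chain global in this model.
StackEntryHeader *RootChain = nullptr;

// RAII model of the prologue and epilogue for a function with N roots.
template <int N> class ShadowFrame {
public:
  explicit ShadowFrame(const FrameMap *Map) {
    H.Next = RootChain;
    H.Map = Map;
    for (void *&R : Roots)
      R = nullptr; // roots are nulled once, in the prologue
    RootChain = &H; // push
  }
  ~ShadowFrame() { RootChain = H.Next; } // pop
  ShadowFrame(const ShadowFrame &) = delete;
  ShadowFrame &operator=(const ShadowFrame &) = delete;

  void *&root(int I) { return Roots[I]; }

private:
  StackEntryHeader H;
  void *Roots[N];
};

// The walk below relies on the roots sitting immediately after the header.
static_assert(sizeof(ShadowFrame<1>) ==
                  sizeof(StackEntryHeader) + sizeof(void *),
              "roots must follow the header without padding");

// Walk every frame on the chain and count the non-null roots.
int countLiveRoots() {
  int Live = 0;
  for (StackEntryHeader *E = RootChain; E; E = E->Next) {
    void **Roots = reinterpret_cast<void **>(E + 1);
    for (int32_t I = 0; I < E->Map->NumRoots; ++I)
      if (Roots[I])
        ++Live;
  }
  return Live;
}
```

Because the root count lives in static metadata and the roots are stored
inline, nothing has to be rewritten inside the loop, which matches the
"(none)" loop overhead shown for the shadow stack.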


//===-- gc-6-ocaml-collector.patch (+219) ---------------------===//

   test/CodeGen/Generic/GC/simple_ocaml.ll (+42)
   test/CodeGen/Generic/GC/frame_size.ll (+13)
   lib/CodeGen/OcamlCollector.cpp (+164)

The new OcamlCollector emits the Ocaml frametable data structure and
related symbols.

This patch depends on gc-4-integration.


$ llvm-as -o simple_ocaml.bc simple_ocaml.ll
$ llc -gc=ocaml -asm-verbose -o - simple_ocaml.bc
         .text
_camlSimple_ocaml__code_begin:
         .data
_camlSimple_ocaml__data_begin:


         .text
         .align  4,0x90
         .globl  _fun
_fun:
         subl    $12, %esp
         movl    $0, 4(%esp)
         movl    $0, 8(%esp)
         movl    16(%esp), %eax
         movl    %eax, 4(%esp)
LBB1_1: # bb.loop
         movl    4(%esp), %eax
         movl    4(%eax), %eax
         testl   %eax, %eax
         je      LBB1_1  # bb.loop
LBB1_2: # bb.end
         movl    $8, (%esp)
         call    L_malloc$stub

Llabel1:
         movl    %eax, 8(%esp)
         movl    4(%esp), %ecx
         movl    %ecx, 4(%ecx)
         addl    $12, %esp
         ret
.section __IMPORT,__jump_table,symbol_stubs,self_modifying_code+pure_instructions,5
L_malloc$stub:
         .indirect_symbol _malloc
         hlt ; hlt ; hlt ; hlt ; hlt

         .subsections_via_symbols

         .text
_camlSimple_ocaml__code_end:
         .data
_camlSimple_ocaml__data_end:
         .long   0
_camlSimple_ocaml__frametable:
         # live roots for fun
         .long   Llabel1 # call return address
         .short  0xc     # stack frame size
         .short  0x2     # live root count
         .word   4       # stack offset
         .word   8       # stack offset
         .align  2
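
The frametable entry above can be read as a small binary record: a
32-bit return address, a 16-bit frame size, a 16-bit live root count,
one 16-bit stack offset per root, then padding to a 4-byte boundary.
Here is a sketch of an encoder for that record. It assumes a
little-endian 32-bit target, that .word is 16 bits wide, and that
.align 2 means 4-byte alignment; ToyFrameDescr and encodeDescr are
hypothetical names, not part of OcamlCollector.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One frametable entry, mirroring the fields in the listing above.
struct ToyFrameDescr {
  uint32_t ReturnAddress;            // .long  Llabel1
  uint16_t FrameSize;                // .short 0xc
  std::vector<uint16_t> RootOffsets; // one .word per live root
};

// Append a 16-bit value in little-endian byte order.
static void put16(std::vector<uint8_t> &Out, uint16_t V) {
  Out.push_back(static_cast<uint8_t>(V & 0xff));
  Out.push_back(static_cast<uint8_t>(V >> 8));
}

// Append a 32-bit value in little-endian byte order.
static void put32(std::vector<uint8_t> &Out, uint32_t V) {
  put16(Out, static_cast<uint16_t>(V & 0xffff));
  put16(Out, static_cast<uint16_t>(V >> 16));
}

// Serialize one descriptor and pad it to a 4-byte boundary.
std::vector<uint8_t> encodeDescr(const ToyFrameDescr &D) {
  std::vector<uint8_t> Out;
  put32(Out, D.ReturnAddress);
  put16(Out, D.FrameSize);
  put16(Out, static_cast<uint16_t>(D.RootOffsets.size()));
  for (uint16_t Off : D.RootOffsets)
    put16(Out, Off);
  while (Out.size() % 4 != 0)
    Out.push_back(0);
  return Out;
}
```

For the entry shown (frame size 0xc, roots at offsets 4 and 8) this
yields a 12-byte record; the return address would be filled in by the
assembler and linker, so any value used here is a placeholder.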


//===-- housekeeping.patch (+16) ------------------------------===//

   Xcode/LLVM.xcodeproj/project.pbxproj (+16)

Just updating the Xcode project.



-------------- next part --------------
A non-text attachment was scrubbed...
Name: gc.5.zip
Type: application/zip
Size: 50996 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070927/654e4d49/attachment.zip>