[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information

Hal Finkel hfinkel at anl.gov
Fri Aug 24 22:50:21 PDT 2012


On Wed, 22 Aug 2012 13:15:30 -0700
Dan Gohman <gohman at apple.com> wrote:

> Hello,
> 
> Currently LLVM expects front-ends to lower struct assignments into
> either individual scalar loads and stores, or calls to @llvm.memcpy.
> For structs with lots of fields, it can take a lot of scalar loads
> and stores, so @llvm.memcpy is used instead. Unfortunately, using
> @llvm.memcpy does not permit full TBAA information to be preserved.
> Also, it unnecessarily copies any padding bytes between fields, which
> can lead to unnecessary copying when the optimizer or codegen decides
> to split the memcpy back up into individual loads and stores.
> 
> Chris wrote up some ideas about the struct padding part of this
> problem [0]; this proposal extends that proposal and adds the
> capability to represent TBAA information for the fields in a struct
> assignment.
> 
> Here's an example showing the basic problem:
> 
> struct bar {
>   char x;
>   float y;
>   double z;
> };
> void copy_bar(struct bar *a, struct bar *b) {
>   *a = *b;
> }
> 
> We get this IR:
> 
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8,
>       i1 false)
> 
> This works, but it doesn't retain the information that the bytes
> between fields x and y don't really need to be copied, and it doesn't
> inform the optimizer that there are three fields with TBAA-relevant
> types being copied.
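
To make the padding concrete, here is a minimal C sketch that reports
where each field of the example struct lives and how many bytes of the
struct belong to no field. The helper names bar_padding_bytes and
print_bar_layout are hypothetical, and the layout is whatever the host
ABI chooses; nothing here is part of the proposal itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same struct as in the quoted example. */
struct bar {
  char x;
  float y;
  double z;
};

/* Hypothetical helper: bytes of struct bar not occupied by any field. */
size_t bar_padding_bytes(void) {
  size_t used = sizeof(char) + sizeof(float) + sizeof(double);
  assert(sizeof(struct bar) >= used);
  return sizeof(struct bar) - used;
}

/* Hypothetical helper: print the offset and size of each field. */
void print_bar_layout(void) {
  printf("x: offset %zu, size %zu\n", offsetof(struct bar, x), sizeof(char));
  printf("y: offset %zu, size %zu\n", offsetof(struct bar, y), sizeof(float));
  printf("z: offset %zu, size %zu\n", offsetof(struct bar, z), sizeof(double));
  printf("sizeof(struct bar) = %zu, padding = %zu\n",
         sizeof(struct bar), bar_padding_bytes());
}
```

On a typical x86-64 ABI I would expect x at offset 0, y at 4 and z at
8, for a total of 16 bytes with three bytes of padding after x, which
matches the i64 16 in the memcpy above. The exact numbers are
ABI-dependent.
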
> 
> The solution I propose here is to have front-ends describe the copy
> using metadata. For example:
> 
>   %struct.foo = type { i8, float, double }
>   […]
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8,
>       i1 false), !struct.assignment !4 […]

I think that it would make more sense to name this !struct.tbaa -- it
seems logically similar to existing TBAA metadata (in that it is
attached to the relevant load/store instruction).

 -Hal

>   !0 = metadata !{metadata !"Simple C/C++ TBAA"}
>   !1 = metadata !{metadata !"omnipotent char", metadata !0}
>   !2 = metadata !{metadata !"float", metadata !1}
>   !3 = metadata !{metadata !"double", metadata !1}
>   !4 = metadata !{ %struct.foo* null, metadata !5 }
>   !5 = metadata !{ metadata !1, metadata !2, metadata !3 }
> 
> Metadata nodes !0 through !3 are regular TBAA nodes of the kind
> already in use.
> 
> Metadata node !4 here is a top-level description of the memcpy. Its
> first operand is a null pointer, which is there just for its type. It
> specifies a (pointer to an) IR-level struct type the memcpy can be
> thought of as copying. The second operand is an MDNode which
> describes the TBAA values for the fields. The indices of the operands
> in that MDNode directly correspond to the indices of the members in
> the IR-level struct type.
> 
> With this information, the optimizer and codegen can more aggressively
> optimize the memcpy. In particular, it would be possible for the
> optimizer to expand the memcpy into a series of loads and stores with
> complete TBAA information. Also, the optimizer could determine where
> the padding is by examining the struct layout of the IR-level struct
> definition.
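
At the C level, the expansion described above corresponds roughly to
replacing a whole-object memcpy with one typed copy per field. The
sketch below is only a source-level analogy, not what the optimizer
would actually emit; the function names copy_bar_memcpy and
copy_bar_fields are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Same struct as in the quoted example. */
struct bar {
  char x;
  float y;
  double z;
};

/* Whole-object copy: moves every byte, padding included, and carries
   no per-field type information. */
void copy_bar_memcpy(struct bar *a, const struct bar *b) {
  memcpy(a, b, sizeof *a);
}

/* Field-wise copy: one typed load and store per member, so each access
   has a scalar type (char, float, double) and padding is never read. */
void copy_bar_fields(struct bar *a, const struct bar *b) {
  a->x = b->x;
  a->y = b->y;
  a->z = b->z;
}
```

Both functions leave the three members of *a equal to those of *b; they
differ only in whether the padding bytes are copied and in how much type
information each access carries.
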
> 
> Note that this is not a proposal for struct-access-path aware TBAA, or
> even full struct value TBAA. This is just a way to preserve basic
> scalar TBAA for individual members of structs in a struct assignment.
> 
> Comments and questions are welcome.
> 
> Dan
> 
> [0]
> http://nondot.org/sabre/LLVMNotes/BetterStructureCopyOptimization.txt
> 
> 



-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory



