[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information

Alex Rosenberg alexr at leftfield.org
Fri Aug 24 17:56:29 PDT 2012


If we can also describe the alignment padding inserted at the end of a struct when it is placed in an array, then we can improve the current LoopIdiom pass to build more memcpys. I would think that would be attached to the struct definition. 
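
As a rough illustration of the tail padding in question, here is a minimal C sketch. The struct `tail` and the helper `tail_padding` are hypothetical names made up for this example, not anything from LLVM or from the proposal. The bytes counted here exist only so that each element of an array of `struct tail` stays aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for illustration only. */
struct tail {
    double d; /* typically 8-byte aligned */
    char c;   /* last member; padding usually follows it */
};

/* Number of bytes after the last member that hold no field data.
   In an array of struct tail, these bytes sit between consecutive
   elements, so a field-by-field copy loop skips over them. */
size_t tail_padding(void) {
    size_t used = offsetof(struct tail, c) + sizeof(char);
    assert(used <= sizeof(struct tail));
    return sizeof(struct tail) - used;
}
```

The exact value is ABI-dependent. The point is only that a loop copying `d` and `c` for every element touches a non-contiguous set of bytes unless something tells the optimizer that the gap is padding.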

Alex

On Aug 22, 2012, at 1:15 PM, Dan Gohman wrote:

> Hello,
> 
> Currently LLVM expects front-ends to lower struct assignments into either
> individual scalar loads and stores, or calls to @llvm.memcpy. For structs
> with lots of fields, it can take a lot of scalar loads and stores, so
> @llvm.memcpy is used instead. Unfortunately, using @llvm.memcpy does not
> permit full TBAA information to be preserved. Also, it unnecessarily copies
> any padding bytes between fields, which can lead to unnecessary copying in
> the case where the optimizer or codegen decide to split it back up into
> individual loads and stores.
> 
> Chris wrote up some ideas about the struct padding part of this problem [0];
> this proposal extends that proposal and adds the capability to represent
> TBAA information for the members of the fields in a struct assignment.
> 
> Here's an example showing the basic problem:
> 
> struct bar {
>   char x;
>   float y;
>   double z;
> };
> void copy_bar(struct bar *a, struct bar *b) {
>   *a = *b;
> }
> 
> We get this IR:
> 
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false)
> 
> This works, but it doesn't retain the information that the bytes between
> fields x and y don't really need to be copied, and it doesn't inform the
> optimizer that there are three fields with TBAA-relevant types being copied.
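
To make the padding between x and y concrete, here is a small C sketch using the struct from the example above. The helper names `padding_after_x` and `payload_bytes` are invented for illustration. The layout is ABI-dependent, so the code computes the values rather than assuming them.

```c
#include <assert.h>
#include <stddef.h>

struct bar {
    char x;
    float y;
    double z;
};

/* Bytes between the end of x and the start of y that carry no data. */
size_t padding_after_x(void) {
    size_t end_of_x = offsetof(struct bar, x) + sizeof(char);
    assert(offsetof(struct bar, y) >= end_of_x);
    return offsetof(struct bar, y) - end_of_x;
}

/* Bytes of actual field data, as opposed to sizeof(struct bar). */
size_t payload_bytes(void) {
    return sizeof(char) + sizeof(float) + sizeof(double);
}
```

On an ABI where float is 4-byte aligned and the struct occupies 16 bytes, as the memcpy length above suggests, only 13 of those bytes are field data.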
> 
> The solution I propose here is to have front-ends describe the copy using
> metadata. For example:
> 
>   %struct.foo = type { i8, float, double }
>   […]
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false), !struct.assignment !4
>   […]
>   !0 = metadata !{metadata !"Simple C/C++ TBAA"}
>   !1 = metadata !{metadata !"omnipotent char", metadata !0}
>   !2 = metadata !{metadata !"float", metadata !1}
>   !3 = metadata !{metadata !"double", metadata !1}
>   !4 = metadata !{ %struct.foo* null, metadata !5 }
>   !5 = metadata !{ metadata !1, metadata !2, metadata !3 }
> 
> Metadata nodes !0 through !3 are regular TBAA nodes as are already in use.
> 
> Metadata node !4 here is a top-level description of the memcpy. Its first
> operand is a null pointer, which is there just for its type. It specifies
> a (pointer to an) IR-level struct type the memcpy can be thought of as
> copying. The second operand is an MDNode which describes the TBAA values
> for the fields. The indices of the operands in that MDNode directly
> correspond to the indices of the members in the IR-level struct type.
> 
> With this information, optimizer and codegen can more aggressively optimize
> the memcpy. In particular, it would be possible for the optimizer to expand
> the memcpy into a series of loads and stores with complete TBAA information.
> Also, the optimizer could determine where the padding is by examining the
> struct layout of the IR-level struct definition.
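
At the C level, the expansion described above corresponds roughly to a member-wise copy. This is only a source-level sketch of the idea, not what the optimizer would literally emit. `copy_bar_fieldwise` and `copy_bar_demo` are hypothetical names.

```c
#include <assert.h>

struct bar {
    char x;
    float y;
    double z;
};

/* Member-wise copy: three typed loads and stores, one per field.
   Each access has a scalar type that TBAA can reason about, and no
   store is issued for the padding bytes between x and y. */
void copy_bar_fieldwise(struct bar *a, const struct bar *b) {
    a->x = b->x;
    a->y = b->y;
    a->z = b->z;
}

/* Usage example: returns 1 once all three members have been copied. */
int copy_bar_demo(void) {
    struct bar src = { 'q', 1.5f, 2.25 };
    struct bar dst = { 0, 0.0f, 0.0 };
    copy_bar_fieldwise(&dst, &src);
    assert(dst.x == 'q' && dst.y == 1.5f && dst.z == 2.25);
    return 1;
}
```

With the proposed metadata, the optimizer would have enough information to pick this form or keep the single memcpy, whichever is cheaper for the target.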
> 
> Note that this is not a proposal for struct-access-path aware TBAA, or
> even full struct value TBAA. This is just a way to preserve basic scalar
> TBAA for individual members of structs in a struct assignment.
> 
> Comments and questions are welcome.
> 
> Dan
> 
> [0] http://nondot.org/sabre/LLVMNotes/BetterStructureCopyOptimization.txt
> 

+------------------------------------------------------------+
| Alexander M. Rosenberg        <mailto:alexr at leftfield.org> |
| Nobody cares what I say, so no disclaimer appears here.    |