[cfe-dev] How to tell if a class contains tail padding?

Han Zhu via cfe-dev cfe-dev at lists.llvm.org
Fri Jul 23 11:54:03 PDT 2021


Eli, Richard,

Thank you both for the advice. Handling this case does indeed seem
non-trivial. I'll re-evaluate whether it's worth implementing.

Thanks,
Han

On Wed, Jul 21, 2021 at 2:27 PM Richard Smith <richard at metafoo.co.uk> wrote:

> On Wed, 21 Jul 2021 at 13:43, Han Zhu via cfe-dev <cfe-dev at lists.llvm.org>
> wrote:
>
>> After clang codegen, is the class layout final? Does the optimizer modify
>> the class layout?
>>
>
> LLVM IR does not contain enough information to determine whether the
> optimization is valid. The IR that we generate for:
>
> struct S {
>   ~S() {}
>   int a;
>   int b;
>   char c;
>   // 3 bytes padding
> };
>
> unsigned copy_noalias(S* __restrict__ a, S* b, int n) {
>   for (int i = 0; i < n; i++) {
>     a[i] = b[i];
>   }
>   return sizeof(a[0]);
> }
>
> ... would also be correct IR to generate for:
>
> struct T : S {
>   char x, y, z; // stored in S's tail padding
> };
>
> unsigned copy_noalias2(T* __restrict__ a, T* b, int n) {
>   for (int i = 0; i < n; i++) {
>     (S&)a[i] = (S&)b[i];
>   }
>   return sizeof((S&)a[0]);
> }
>
> You'll need to generate some additional information from the frontend if
> you want to be able to do this. You could, in at least some cases, analyze
> the types and expressions involved and locally prove that you know the last
> few bytes are guaranteed to be padding, then generate !tbaa.struct metadata
> and attach it to the @llvm.memcpy call that the frontend emits.
>
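To make the S/T example above concrete, here is a minimal C++ sketch that
checks at run time whether the derived class's members land inside the base
class's tail padding. The struct definitions are copied from the example;
`byte_offset` and `derived_uses_base_tail_padding` are hypothetical helper
names introduced only for illustration. On Itanium C++ ABI targets the result
is expected to be true, because S is not POD for the purpose of layout. Other
ABIs may lay T out differently, so the sketch reports what the compiler did
rather than assuming fixed sizes.

```cpp
#include <cassert>
#include <cstddef>

struct S {
  ~S() {}  // user-provided destructor: S is not POD for the purpose of layout
  int a;
  int b;
  char c;
  // tail padding follows c (3 bytes on typical targets)
};

struct T : S {
  char x, y, z;  // may be placed in S's tail padding, depending on the ABI
};

// Hypothetical helper: byte offset of a member from the start of its object.
template <typename Obj, typename Member>
std::size_t byte_offset(const Obj& obj, const Member& member) {
  return static_cast<std::size_t>(reinterpret_cast<const char*>(&member) -
                                  reinterpret_cast<const char*>(&obj));
}

// True if T::x lives inside the first sizeof(S) bytes of a T object, i.e. a
// copy of sizeof(S) bytes over the S subobject would also overwrite T::x.
bool derived_uses_base_tail_padding() {
  T t;
  std::size_t off_c = byte_offset(t, t.c);
  std::size_t off_x = byte_offset(t, t.x);
  assert(off_x > off_c);  // x is placed after S's own members
  return off_x < sizeof(S);
}
```

If `derived_uses_base_tail_padding()` returns true, widening the copy of an S
subobject from its data bytes to `sizeof(S)` bytes would clobber `x`, `y`, and
`z`, which is why the frontend has to prove that no such derived object is
involved.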
>> On Wed, Jul 21, 2021 at 12:31 PM Eli Friedman <efriedma at quicinc.com>
>> wrote:
>>
>>> I suspect the transform you’re trying to do is more complicated than
>>> you’re making it out to be.
>>>
>>> In general, if you have a class that isn’t “POD for the purpose of
>>> layout” (https://itanium-cxx-abi.github.io/cxx-abi/abi.html), derived
>>> classes can store data in the tail padding.  So the “padding” might contain
>>> data the program cares about.  If you want to overwrite that space, you
>>> need to prove there isn’t a derived class storing data there.
>>>
>>> Possible proof approaches:
>>>
>>>    1. If the class is marked “final”, there aren’t any derived classes.
>>>    2. Array indexing with the wrong pointer type might be illegal.
>>>
>>> -Eli
>>>
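The first proof approach can be illustrated at the source level with the
standard `std::is_final` trait (C++14). This is only a sketch of the idea:
inside Clang the same question would be asked of the AST rather than through a
type trait, and `Open`, `Sealed`, and `no_derived_class_can_reuse_tail_padding`
are hypothetical names, not anything from the thread.

```cpp
#include <type_traits>

struct Open {
  ~Open() {}
  int a;
  int b;
  char c;
};

struct Sealed final {
  ~Sealed() {}
  int a;
  int b;
  char c;
};

// If a class is final, nothing can derive from it, so no derived class can
// place its members in the class's tail padding.
template <typename C>
constexpr bool no_derived_class_can_reuse_tail_padding() {
  return std::is_final<C>::value;
}

static_assert(!no_derived_class_can_reuse_tail_padding<Open>(),
              "Open can still be derived from");
static_assert(no_derived_class_can_reuse_tail_padding<Sealed>(),
              "Sealed is final");
```

The check is deliberately conservative: a false result does not mean a derived
class exists, only that this particular argument cannot rule one out.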
>>> From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Han Zhu via cfe-dev
>>> Sent: Wednesday, July 21, 2021 11:46 AM
>>> To: cfe-dev at lists.llvm.org; llvm-dev at lists.llvm.org
>>> Subject: [EXT] [cfe-dev] How to tell if a class contains tail padding?
>>>
>>> Hi,
>>>
>>> I'm working on an optimization to improve LoopIdiomRecognize pass. For a
>>> trivial loop like this:
>>>
>>> ```
>>> struct S {
>>>   int a;
>>>   int b;
>>>   char c;
>>>   // 3 bytes padding
>>> };
>>>
>>> unsigned copy_noalias(S* __restrict__ a, S* b, int n) {
>>>   for (int i = 0; i < n; i++) {
>>>     a[i] = b[i];
>>>   }
>>>   return sizeof(a[0]);
>>> }
>>> ```
>>>
>>> Clang generates the below loop (some parts of IR omitted):
>>> ```
>>> %struct.S = type { i32, i32, i8 }
>>>
>>> for.body:                                         ; preds = %for.cond
>>>   %2 = load %struct.S*, %struct.S** %b.addr, align 8
>>>   %3 = load i32, i32* %i, align 4
>>>   %idxprom = sext i32 %3 to i64
>>>   %arrayidx = getelementptr inbounds %struct.S, %struct.S* %2, i64 %idxprom
>>>   %4 = load %struct.S*, %struct.S** %a.addr, align 8
>>>   %5 = load i32, i32* %i, align 4
>>>   %idxprom1 = sext i32 %5 to i64
>>>   %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %4, i64 %idxprom1
>>>   %6 = bitcast %struct.S* %arrayidx2 to i8*
>>>   %7 = bitcast %struct.S* %arrayidx to i8*
>>>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %6, i8* align 4 %7, i64 12, i1 false)
>>>   br label %for.inc
>>> ```
>>>
>>> It can be transformed into a single memcpy:
>>>
>>> ```
>>> for.body.preheader:                               ; preds = %entry
>>>   %b10 = bitcast %struct.S* %b to i8*
>>>   %a9 = bitcast %struct.S* %a to i8*
>>>   %0 = zext i32 %n to i64
>>>   %1 = mul nuw nsw i64 %0, 12
>>>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %a9, i8* align 4 %b10, i64 %1, i1 false)
>>>   br label %for.cond.cleanup
>>> ```
>>>
>>> The problem is, if the copied elements are a class, this doesn't work.
>>> For a class with the same members:
>>> ```
>>> %class.C = type <{ i32, i32, i8, [3 x i8] }>
>>> ```
>>>
>>> Clang does some optimization to generate a memcpy of nine bytes,
>>> omitting the tail padding:
>>>
>>> ```
>>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %6, i8* align 4 %7, i64 9, i1 false)
>>> ```
>>>
>>> Then in LLVM, we find the memcpy is not touching every byte of the
>>> array, so we abort the transformation.
>>>
>>> If we could tell the untouched three bytes are padding, we should be
>>> able to still do the optimization, but LLVM doesn't seem to have this
>>> information. I tried using `DataLayout::getTypeStoreSize()`, and it
>>> returned 12 bytes. I also tried `StructLayout`, and it treats the tail
>>> padding as a regular class member.
>>>
>>> Is there an API in LLVM to tell if a class has tail padding? If not,
>>> would it be useful to add this feature?
>>>
>>> Thanks,
>>> Han
>>>
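The reasoning in the original message can be modelled in a few lines of C++.
This is a simplified sketch, not the LoopIdiomRecognize implementation:
`ElementCopy` and `can_merge_into_single_memcpy` are hypothetical names, and
`known_tail_padding` stands for whatever extra information the frontend would
have to supply (for example via metadata on the memcpy), which LLVM does not
have today according to the thread.

```cpp
#include <cstdint>

// Hypothetical description of the memcpy emitted for one loop iteration.
struct ElementCopy {
  std::uint64_t stride;              // bytes between consecutive array elements
  std::uint64_t bytes_copied;        // length of the per-iteration memcpy
  std::uint64_t known_tail_padding;  // trailing bytes proven to be padding
};

// The loop can be replaced by one memcpy of n * stride bytes only if every
// byte of each element is either copied or known to be padding.
constexpr bool can_merge_into_single_memcpy(const ElementCopy& c) {
  return c.bytes_copied + c.known_tail_padding == c.stride;
}

// Numbers from the message above: the stride is 12 bytes in both cases.
static_assert(can_merge_into_single_memcpy({12, 12, 0}),
              "struct S: the whole element is copied");
static_assert(!can_merge_into_single_memcpy({12, 9, 0}),
              "class C: three bytes are unaccounted for");
static_assert(can_merge_into_single_memcpy({12, 9, 3}),
              "class C, if the three bytes were proven to be padding");
```

The third case is the one the question asks about: the merge becomes possible
only once something outside the IR type establishes that the three untouched
bytes hold no data.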
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>