[cfe-dev] How to tell if a class contains tail padding?

Han Zhu via cfe-dev cfe-dev at lists.llvm.org
Wed Jul 21 11:45:44 PDT 2021


Hi,

I'm working on an optimization to improve the LoopIdiomRecognize pass. For a
trivial loop like this:

```
struct S {
  int a;
  int b;
  char c;
  // 3 bytes padding
};

unsigned copy_noalias(S* __restrict__ a, S* b, int n) {
  for (int i = 0; i < n; i++) {
    a[i] = b[i];
  }
  return sizeof(a[0]);
}
```

Clang generates the loop below (some parts of the IR omitted):
```
%struct.S = type { i32, i32, i8 }

for.body:                                         ; preds = %for.cond
  %2 = load %struct.S*, %struct.S** %b.addr, align 8
  %3 = load i32, i32* %i, align 4
  %idxprom = sext i32 %3 to i64
  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %2, i64 %idxprom
  %4 = load %struct.S*, %struct.S** %a.addr, align 8
  %5 = load i32, i32* %i, align 4
  %idxprom1 = sext i32 %5 to i64
  %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %4, i64 %idxprom1
  %6 = bitcast %struct.S* %arrayidx2 to i8*
  %7 = bitcast %struct.S* %arrayidx to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %6, i8* align 4 %7, i64 12, i1 false)
  br label %for.inc
```

It can be transformed into a single memcpy:

```
for.body.preheader:                               ; preds = %entry
  %b10 = bitcast %struct.S* %b to i8*
  %a9 = bitcast %struct.S* %a to i8*
  %0 = zext i32 %n to i64
  %1 = mul nuw nsw i64 %0, 12
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %a9, i8* align 4 %b10, i64 %1, i1 false)
  br label %for.cond.cleanup
```
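As a source-level sketch of what this transformation amounts to, the loop and a single bulk `memcpy` can be compared directly. The helper names `copy_loop`, `copy_bulk`, `same_fields`, and `copies_agree` are made up for illustration, and the comparison looks only at the named fields because the values of padding bytes are unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct S {
  int a;
  int b;
  char c;
  // tail padding (3 bytes on typical targets with a 4-byte int)
};

// Element-wise copy, as in the original loop.
void copy_loop(S *dst, const S *src, int n) {
  for (int i = 0; i < n; i++)
    dst[i] = src[i];
}

// What the transformed IR does: one memcpy of n * sizeof(S) bytes.
void copy_bulk(S *dst, const S *src, int n) {
  assert(n >= 0);
  if (n > 0)
    std::memcpy(dst, src, static_cast<std::size_t>(n) * sizeof(S));
}

// Hypothetical helper: compare only the named fields, not the padding.
bool same_fields(const S &x, const S &y) {
  return x.a == y.a && x.b == y.b && x.c == y.c;
}

// Hypothetical self-check: both copies yield the same field values.
bool copies_agree(int n) {
  std::vector<S> src(n), viaLoop(n), viaBulk(n);
  for (int i = 0; i < n; i++)
    src[i] = S{i, 2 * i, static_cast<char>('a' + i % 26)};
  copy_loop(viaLoop.data(), src.data(), n);
  copy_bulk(viaBulk.data(), src.data(), n);
  for (int i = 0; i < n; i++)
    if (!same_fields(viaLoop[i], viaBulk[i]))
      return false;
  return true;
}
```

The bulk copy also writes the padding bytes of every destination element, which is harmless here because nothing else lives in them.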

The problem is that this doesn't work when the copied elements are of a class
type. For a class with the same members:
```
%class.C = type <{ i32, i32, i8, [3 x i8] }>
```

Clang does some optimization to generate a memcpy of nine bytes, omitting the
tail padding:

```
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %6, i8* align 4 %7, i64 9, i1 false)
```
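One likely reason for the shorter copy, sketched here as an assumption rather than a statement about Clang's internals: under the Itanium C++ ABI, the tail padding of a class that is not POD for layout purposes may be reused by a derived class, so a 12-byte copy of the base could overwrite a derived member. The types `C`, `DerivedFromC`, and `DerivedFromS` below are illustrative only.

```cpp
#include <cassert>
#include <cstddef>

// Private data members make C non-POD for layout purposes (C++03 rules).
class C {
public:
  int sum() const { return a + b + c; }

private:
  int a = 0;
  int b = 0;
  char c = 0;
};

// Plain aggregate with the same members.
struct S {
  int a;
  int b;
  char c;
};

// On Itanium-ABI targets, d may be placed inside C's tail padding.
struct DerivedFromC : C {
  char d;
};

// S is POD, so d is placed after the full sizeof(S).
struct DerivedFromS : S {
  char d;
};

std::size_t size_derived_from_class() { return sizeof(DerivedFromC); }
std::size_t size_derived_from_struct() { return sizeof(DerivedFromS); }
```

On an Itanium-ABI target such as x86-64 Linux I would expect these to report 12 and 16 respectively; other ABIs may not reuse the padding at all.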

Then in LLVM, we find the memcpy is not touching every byte of the array, so
we abort the transformation.
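The check that fails can be modeled roughly as follows. This is a simplified stand-in, not the actual LoopIdiomRecognize code, and `touchesEveryByte` and `gapBytes` are hypothetical names: the per-iteration copy size has to equal the distance the pointers advance per iteration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a copy of storeSize bytes per iteration, with the
// address advancing by stride bytes, covers a contiguous range only if
// the two values are equal.
bool touchesEveryByte(std::uint64_t storeSize, std::uint64_t stride) {
  return storeSize == stride;
}

// Bytes left untouched between consecutive iterations (0 if contiguous).
std::uint64_t gapBytes(std::uint64_t storeSize, std::uint64_t stride) {
  return stride > storeSize ? stride - storeSize : 0;
}
```

With the numbers from this mail, the struct case is `touchesEveryByte(12, 12)`, while the class case is `touchesEveryByte(9, 12)` with a gap of 3 bytes per element.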

If we could tell that the three untouched bytes are padding, we should still be
able to do the optimization, but LLVM doesn't seem to have this information. I
tried using `DataLayout::getTypeStoreSize()`, and it returned 12 bytes. I also
tried `StructLayout`, and it treats the tail padding as a regular class member.

Is there an API in LLVM to tell if a class has tail padding? If not, would it
be useful to add this feature?
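To make the question concrete, here is a standalone sketch of the quantity such a query would return. It does not use any LLVM API; `Field` and `tailPaddingBytes` are hypothetical, and the offsets and sizes are taken from the two IR types above, assuming a 4-byte `i32` and a 12-byte allocation size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one member: byte offset and byte size.
struct Field {
  std::size_t offset;
  std::size_t size;
};

// Bytes between the end of the last member and the allocation size.
std::size_t tailPaddingBytes(const std::vector<Field> &fields,
                             std::size_t allocSize) {
  std::size_t dataEnd = 0;
  for (const Field &f : fields)
    dataEnd = std::max(dataEnd, f.offset + f.size);
  return allocSize > dataEnd ? allocSize - dataEnd : 0;
}

// %struct.S = type { i32, i32, i8 }: the padding is implicit.
std::size_t tailPaddingOfStructS() {
  return tailPaddingBytes({{0, 4}, {4, 4}, {8, 1}}, 12);
}

// %class.C = type <{ i32, i32, i8, [3 x i8] }>: the padding shows up as
// an ordinary [3 x i8] member, so a layout-based computation sees none.
std::size_t tailPaddingOfClassC() {
  return tailPaddingBytes({{0, 4}, {4, 4}, {8, 1}, {9, 3}}, 12);
}
```

The struct case yields 3 and the class case yields 0, which is the gap in information described above: once the padding is an explicit array member, the layout alone cannot distinguish it from real data.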

Thanks,
Han