[llvm] [SROA] Allow rewriting memcpy depending on tbaa.struct (PR #77597)

Bruno De Fraine via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 19 08:01:15 PDT 2024


brunodf-snps wrote:

Hi Björn and Nikita,

I recently learned about commit 54067c5fbe9fc13ab195cdddb8f17e18d72b5fe4 in LLVM-18. Some time ago, we also discovered issue #64081 downstream in the context of an architecture with 20-bit pointers, stored as 4 bytes in memory. The libcxx std::string uses a small string optimization that overlays a pointer with string data, and in some scenarios this would trigger SROA to copy small string data as a pointer load/store, which does not copy the complete 4 bytes.
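
A minimal sketch of this failure mode in plain C++, not taken from the downstream target: it assumes a hypothetical pointer with 20 value bits stored in 4 bytes, and models a pointer-typed load/store as a copy that only carries those 20 bits. The names `Slot`, `copyAsBytes`, `copyAsPointer` and `kPtrValueMask` are illustrative, and zeroing the remaining bits is an assumption made only to keep the model deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical target model: 20 value bits per pointer, 4 bytes of storage.
constexpr std::uint32_t kPtrValueMask = (1u << 20) - 1;

// A 4-byte slot that may hold either a pointer or inline character data,
// loosely modelled on a small-string layout (illustrative only).
struct Slot {
    unsigned char bytes[4];
};

// Byte-wise copy, which is what memcpy guarantees: all 32 bits survive.
inline Slot copyAsBytes(const Slot &src) {
    Slot dst;
    std::memcpy(dst.bytes, src.bytes, sizeof dst.bytes);
    return dst;
}

// Copy modelled as a pointer-typed load followed by a pointer-typed store:
// only the 20 value bits are carried over. The other bits are zeroed here
// as a modelling assumption.
inline Slot copyAsPointer(const Slot &src) {
    std::uint32_t raw;
    std::memcpy(&raw, src.bytes, sizeof raw);
    raw &= kPtrValueMask;
    Slot dst;
    std::memcpy(dst.bytes, &raw, sizeof raw);
    return dst;
}

// Helper to compare the full storage of two slots.
inline bool sameBytes(const Slot &a, const Slot &b) {
    return std::memcmp(a.bytes, b.bytes, sizeof a.bytes) == 0;
}
```

With the slot holding four characters of string data, `copyAsBytes` preserves them, while `copyAsPointer` drops part of the data.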

We tried the same fix as commit 54067c5fbe9fc13ab195cdddb8f17e18d72b5fe4, but like Björn we found severe quality regressions. SROA would fail to clean up the stack accesses for C++ structs/classes containing pointer values, even if the code was benign. I realized then that replacing a memcpy by a load/store pair (together with splitting) is critical for SROA's mem2reg promotion: when an alloca is only used by direct loads and stores, the alloca can be removed and the values connected through. In the worst case, the memcpy copies between two alloca regions, and if you don't replace it by a load/store pair, both allocas remain. This is hard to solve in SROA, because you are treating one alloca slice and its uses, but whether it is safe to replace the memcpy by a narrower load/store may depend on the accesses in the other region that you are copying from/to.
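
As an illustration of the kind of benign code meant here, the following hypothetical C++ snippet copies a pointer-containing struct between two local objects. A frontend may lower such an aggregate copy to a memcpy between two stack slots; whether it does so depends on the compiler and options, so treat this as a sketch rather than a reproducer. `Range` and `sumRange` are made-up names.

```cpp
#include <cassert>

// Hypothetical aggregate containing pointer members.
struct Range {
    const int *begin;
    const int *end;
};

inline int sumRange(Range r) {
    // Aggregate copy between two stack objects. If this stays a memcpy
    // after SROA, both stack objects may remain in memory.
    Range local = r;
    int total = 0;
    for (const int *p = local.begin; p != local.end; ++p)
        total += *p;
    return total;
}
```

Ideally, after optimization neither `r` nor `local` needs a stack slot, and the two pointers are passed along as plain values.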

At that point, I had the idea that in a situation where type size does not equal store size, it would be better to let SROA proceed with replacing the memcpy with a load/store pair, but convert the alloca slice to a promoted type that represents the entire storage region, and use (non-memory) operations to access the narrow type inside it. I did not continue working on it at that point, and it is not clear what form of promoted type to use (a wider integer? some special struct type with the narrow type as its member?). To illustrate the idea anyway, I created an example where `i28` is promoted to `i32`, and the conversions are done using `zext` and `trunc` operations: https://godbolt.org/z/E3xh5cerE It's not clear if this is viable.
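
The idea can also be sketched in C++, independently of the godbolt example: the promoted value holds all 32 storage bits, the memcpy becomes a full-width value copy, and the narrow 28-bit accesses are expressed with mask operations standing in for `trunc` and `zext`. All names (`WideSlot`, `readNarrow`, `writeNarrow`, `copySlot`) are hypothetical, and this is only a model of the approach under those assumptions, not SROA code.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kNarrowBits = 28;
constexpr std::uint32_t kNarrowMask = (1u << kNarrowBits) - 1;

// Stand-in for the promoted slice: the whole 32-bit storage region as a value.
struct WideSlot {
    std::uint32_t storage;
};

// Models "trunc i32 to i28": a narrow read only sees the low 28 bits.
inline std::uint32_t readNarrow(WideSlot s) {
    return s.storage & kNarrowMask;
}

// Models "zext i28 to i32": a narrow write leaves the upper 4 bits zero.
inline WideSlot writeNarrow(std::uint32_t value) {
    return WideSlot{value & kNarrowMask};
}

// Models the rewritten memcpy: a full-width copy that keeps all 32 bits,
// including bits that are not part of the narrow type.
inline WideSlot copySlot(WideSlot src) {
    return src;
}
```

The point of the design is that `copySlot` never goes through the narrow type, so bytes that belong to overlaid data survive the copy, while narrow users still only observe 28 bits.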

In any case, if a solution for the regressions (based on TBAA metadata or otherwise) materializes, I would be interested to know.

Regards,
Bruno

https://github.com/llvm/llvm-project/pull/77597

