[llvm-dev] Which pass should be propagating memory copies
Davide Italiano via llvm-dev
llvm-dev at lists.llvm.org
Tue May 16 11:16:42 PDT 2017
On Tue, May 16, 2017 at 10:37 AM, Keno Fischer via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Consider the following IR example:
>
> define void @simple([4 x double] *%ptr, i64 %idx) {
> %stack = alloca [4 x double]
> %ptri8 = bitcast [4 x double] *%ptr to i8*
> %stacki8 = bitcast [4 x double] *%stack to i8*
> call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%stacki8, i8 *%ptri8, i32 32, i32 0, i1 0)
> %dataptr = getelementptr inbounds [4 x double], [4 x double] *%stack, i32 0, i64 %idx
> store double 0.0, double *%dataptr
> call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%ptri8, i8 *%stacki8, i32 32, i32 0, i1 0)
> ret void
> }
>
>
> I would like to see this optimized to just a single store (into %ptr). Right
> now, even at -O3 that doesn't happen. My frontend guarantees that idx is
> always inbounds for the allocation, but I do think the transformation should
> be valid regardless because accessing beyond the bounds of the alloca should
> be undefined behavior. Now, my question is which pass should be responsible
> for doing this? SROA? DSE? GVN? A new pass just to do this kind of thing?
> Maybe there already is some pass that does this, just not in the default
> pipeline? Any hints would be much appreciated.
>
> Thanks,
> Keno
>
This seems like a GVN job to me.
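
For concreteness, the single store into %ptr that the question asks for
would presumably look something like the sketch below. It is hand-written
from the description above, not the output of any existing pass, and it
assumes %idx is in bounds, as the frontend is said to guarantee.

```llvm
; Hand-written sketch of the desired result, not produced by any pass.
; Assumes %idx is in bounds for the [4 x double] object behind %ptr.
define void @simple([4 x double]* %ptr, i64 %idx) {
  ; The alloca and both memcpy calls are gone; the element is updated in place.
  %dataptr = getelementptr inbounds [4 x double], [4 x double]* %ptr, i32 0, i64 %idx
  store double 0.0, double* %dataptr
  ret void
}
```

Getting there means forwarding the first copy through the temporary and then
recognizing that the copy back only changes the one element that was stored.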
--
Davide
"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare