[llvm-dev] Which pass should be propagating memory copies

Davide Italiano via llvm-dev llvm-dev at lists.llvm.org
Tue May 16 11:16:42 PDT 2017


On Tue, May 16, 2017 at 10:37 AM, Keno Fischer via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Consider the following IR example:
>
> define void @simple([4 x double] *%ptr, i64 %idx) {
>     %stack = alloca [4 x double]
>     %ptri8 = bitcast [4 x double] *%ptr to i8*
>     %stacki8 = bitcast [4 x double] *%stack to i8*
>     call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%stacki8, i8 *%ptri8, i32 32, i32 0, i1 0)
>     %dataptr = getelementptr inbounds [4 x double], [4 x double] *%ptr, i32 0, i64 %idx
>     store double 0.0, double *%dataptr
>     call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%ptri8, i8 *%stacki8, i32 32, i32 0, i1 0)
>     ret void
> }
>
>
> I would like to see this optimized to just a single store (into %ptr). Right
> now, even at -O3 that doesn't happen. My frontend guarantees that idx is
> always inbounds for the allocation, but I do think the transformation should
> be valid regardless because accessing beyond the bounds of the alloca should
> be undefined behavior. Now, my question is which pass should be responsible
> for doing this? SROA? DSE? GVN? A new pass just to do this kind of thing?
> Maybe there already is some pass that does this, just not in the default
> pipeline? Any hints would be much appreciated.
>
> Thanks,
> Keno
>

This seems like a GVN job to me.
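
For concreteness, here is a sketch of the result the question seems to be
asking for. It assumes the store in the quoted IR is meant to go through a GEP
on %stack rather than on %ptr. As literally written, the store to %ptr is
overwritten by the second memcpy, so the whole function would be a no-op. Under
that assumption, both copies and the alloca disappear and only the store
remains. This illustrates the desired output and is not the output of any
existing pass:

```llvm
; Hypothetical optimized form, assuming the original store targeted %stack.
define void @simple([4 x double] *%ptr, i64 %idx) {
    ; The copy to the stack and the copy back cancel out, so the store can be
    ; performed directly on the original memory.
    %dataptr = getelementptr inbounds [4 x double], [4 x double] *%ptr, i32 0, i64 %idx
    store double 0.0, double *%dataptr
    ret void
}
```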

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

