[llvm-dev] Loop vectorization with the loop containing bitcast
Michael Kuperstein via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 17 13:06:03 PDT 2016
Oh, sorry, I didn't realize you already have a patch for this.
Can you submit it for review using the regular process? See
http://llvm.org/docs/Phabricator.html for details.
Note that you'll need a test included in the patch.
Thanks,
Michael
On Wed, Aug 17, 2016 at 12:05 PM, Lin, Jin <jin.lin at intel.com> wrote:
> Hi Michael,
>
>
>
> Many thanks for your quick response. PR 29021 has been filed to
> address this issue.
>
>
>
> Jin
>
>
>
> *From:* Michael Kuperstein [mailto:mkuper at google.com]
> *Sent:* Wednesday, August 17, 2016 11:33 AM
> *To:* Lin, Jin <jin.lin at intel.com>
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] Loop vectorization with the loop containing
> bitcast
>
>
>
> Hi Jin,
>
>
>
> I agree, this looks wrong. The bitcasts are fallout from r226781 - and we
> should be able to look through them if the size is the same.
>
> Can you please file a PR?
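
To illustrate why looking through a same-size bitcast is harmless, here is a minimal C sketch of what the i64 load/store pair in the IR does to a double. The helper names copy_via_i64 and same_bits are made up for this illustration, and the sketch assumes double and uint64_t are both 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy one double through a 64-bit integer,
   mirroring the "load i64" / "store i64" pair in the IR. */
static void copy_via_i64(double *dst, const double *src) {
    uint64_t bits;
    /* The same-size condition: the integer view covers exactly one double. */
    assert(sizeof bits == sizeof *src);
    memcpy(&bits, src, sizeof bits);  /* load through the integer view  */
    memcpy(dst, &bits, sizeof bits);  /* store through the integer view */
}

/* Hypothetical helper: returns 1 if both doubles have identical bit patterns. */
static int same_bits(const double *x, const double *y) {
    return memcmp(x, y, sizeof *x) == 0;
}
```

Because the copy moves exactly the bytes of one double, the destination ends up bit-identical to the source, so the integer type only changes how the access is labelled, not which memory it touches.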
>
>
>
> Thanks,
>
> Michael
>
>
>
> On Wed, Aug 17, 2016 at 10:39 AM, Lin, Jin via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
>
>
> The following loop fails to vectorize because the load of c[i] is cast to
> i64 while the store to c[i] is a double. The loop access analysis gives up
> because the two accesses have different types.
>
>
>
> Since these two memory operations have the same size, I believe the loop
> access analysis should return a forward dependence, and thus the loop can be
> vectorized.
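
The reasoning above can be sketched as a toy model in C. This is not LLVM code: the enum, struct, and function names are hypothetical, and it only models the zero-distance case, contrasting a check that requires identical types with one that only requires equal sizes.

```c
#include <assert.h>

/* Hypothetical dependence kinds for this toy model. */
enum dep_kind { DEP_FORWARD, DEP_UNKNOWN };

/* Hypothetical description of an access type: an identity plus a size. */
struct access_type {
    int type_id;            /* distinct per type, e.g. i64 vs double */
    unsigned size_in_bits;  /* store size of the type */
};

/* Strict rule: two accesses at distance 0 are only accepted
   when their types are identical. */
static enum dep_kind classify_strict(struct access_type a, struct access_type b) {
    assert(a.size_in_bits > 0 && b.size_in_bits > 0);
    return a.type_id == b.type_id ? DEP_FORWARD : DEP_UNKNOWN;
}

/* Relaxed rule: two accesses at distance 0 are accepted
   whenever they cover the same number of bits. */
static enum dep_kind classify_by_size(struct access_type a, struct access_type b) {
    assert(a.size_in_bits > 0 && b.size_in_bits > 0);
    return a.size_in_bits == b.size_in_bits ? DEP_FORWARD : DEP_UNKNOWN;
}
```

With an i64 load and a double store at the same address, the strict rule reports an unknown dependence while the size-based rule reports a forward one; accesses of different sizes are still rejected by both.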
>
>
>
> Any comments?
>
>
>
> Thanks,
>
>
>
> Jin
>
>
>
> #define N 1000
>
> double a[N], b[N], c[N];
>
> void foo() {
>   for (int i = 0; i < N; i++) {
>     b[i] = c[i];
>     c[i] = 0.0;
>   }
> }
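
As a sanity check of the claim that this loop is safe to vectorize, the following C sketch compares the scalar loop with a hand-blocked 4-wide version that loads c through 64-bit integers. The width of 4 is taken from the <4 x i64> accesses in the LAA log; the function names are made up, and b and c are assumed not to overlap, as with the distinct globals in the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N 1000
#define VF 4  /* assumed vector width, matching <4 x i64> in the log */

/* The loop from the message, on caller-provided arrays. */
static void foo_scalar(double *b, double *c) {
    for (int i = 0; i < N; i++) {
        b[i] = c[i];
        c[i] = 0.0;
    }
}

/* Hand-blocked version: each step reads VF elements of c as integers,
   writes them to b, and only then zeroes the same VF elements of c. */
static void foo_blocked(double *b, double *c) {
    assert(N % VF == 0);  /* no remainder loop in this sketch */
    for (int i = 0; i < N; i += VF) {
        uint64_t lanes[VF];
        memcpy(lanes, &c[i], sizeof lanes);  /* wide integer load of c  */
        memcpy(&b[i], lanes, sizeof lanes);  /* wide integer store to b */
        for (int k = 0; k < VF; k++)         /* wide double store to c  */
            c[i + k] = 0.0;
    }
}

/* Runs both versions on identical input; returns 1 if b and c match. */
static int versions_agree(void) {
    static double b1[N], c1[N], b2[N], c2[N];
    for (int i = 0; i < N; i++)
        c1[i] = c2[i] = i * 0.5 + 1.0;
    foo_scalar(b1, c1);
    foo_blocked(b2, c2);
    return memcmp(b1, b2, sizeof b1) == 0 && memcmp(c1, c2, sizeof c1) == 0;
}
```

Within each block the load of c happens before the store to the same elements, which is the forward dependence at distance zero described in the message.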
>
>
>
> for.body:                                         ; preds = %for.body, %entry
>   %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
>   %arrayidx = getelementptr inbounds [1000 x double], [1000 x double]* @c, i64 0, i64 %indvars.iv
>   %0 = bitcast double* %arrayidx to i64*
>   %1 = load i64, i64* %0, align 8, !tbaa !1
>   %arrayidx2 = getelementptr inbounds [1000 x double], [1000 x double]* @b, i64 0, i64 %indvars.iv
>   %2 = bitcast double* %arrayidx2 to i64*
>   store i64 %1, i64* %2, align 8, !tbaa !1
>   store double 0.000000e+00, double* %arrayidx, align 8, !tbaa !1
>   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
>   %exitcond = icmp eq i64 %indvars.iv.next, 1000
>   br i1 %exitcond, label %for.cond.cleanup, label %for.body
>
>
>
> LAA: Found a loop in foo: loop.17
>
> LAA: Processing memory accesses...
>
> AST: Alias Set Tracker: 2 alias sets for 3 pointer values.
>
> AliasSet[0x9508b80, 1] must alias, No access Pointers: (<4 x i64>* %1,
> 18446744073709551615)
>
> AliasSet[0x95f8a70, 2] must alias, No access Pointers: (<4 x double>*
> %2, 18446744073709551615), (<4 x i64>* %0, 18446744073709551615)
>
>
>
> LAA: Accesses(3):
>
> %1 = bitcast double* %arrayIdx11 to <4 x i64>* (write)
>
> %2 = bitcast double* %arrayIdx to <4 x double>* (write)
>
> %0 = bitcast double* %arrayIdx to <4 x i64>* (read-only)
>
> Underlying objects for pointer %1 = bitcast double* %arrayIdx11 to <4 x
> i64>*
>
> @b = common local_unnamed_addr global [1000 x double] zeroinitializer,
> align 16
>
> Underlying objects for pointer %2 = bitcast double* %arrayIdx to <4 x
> double>*
>
> @c = common local_unnamed_addr global [1000 x double] zeroinitializer,
> align 16
>
> Underlying objects for pointer %0 = bitcast double* %arrayIdx to <4 x
> i64>*
>
> @c = common local_unnamed_addr global [1000 x double] zeroinitializer,
> align 16
>
> LAA: Found a runtime check ptr: %1 = bitcast double* %arrayIdx11 to <4 x
> i64>*
>
> LAA: Found a runtime check ptr: %2 = bitcast double* %arrayIdx to <4 x
> double>*
>
> LAA: Found a runtime check ptr: %0 = bitcast double* %arrayIdx to <4 x
> i64>*
>
> LAA: We need to do 0 pointer comparisons.
>
> LAA: We can perform a memory runtime check if needed.
>
> LAA: Checking memory dependencies
>
> LAA: Src Scev: {@c,+,32}<nsw><%loop.17>Sink Scev: {@c,+,32}<nsw><%loop.17>(Induction step: 1)
>
> LAA: Distance for %gepload = load <4 x i64>, <4 x i64>* %0, align 16,
> !tbaa !1 to store <4 x double> zeroinitializer, <4 x double>* %2, align
> 16, !tbaa !1: 0
>
> LAA: Zero dependence difference but different types
>
> Total Dependences: 1
>
> LAA: unsafe dependent memory operations in loop
>
>