[PATCH] D107262: [CodeGenPrepare] The instruction to be sunk should be inserted before its user in a block

Tiehu Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 9 01:14:15 PDT 2021


TiehuZhang added a comment.

In D107262#2930517 <https://reviews.llvm.org/D107262#2930517>, @dmgreen wrote:

> Thanks, looking good. But I do still worry about the order of instructions sunk.
>
> I was trying it out, seeing if it would go wrong when we were sinking a lot of operands. I noticed that the add/sub sinking wasn't really working properly though! There is https://reviews.llvm.org/D107623 to improve that and get the shuffles to sink.
>
> With that in, can you add these two tests to show partially sinking two values at the same time:
>
>   define <4 x i32> @sinkadd_partial(<8 x i16> %a1, <8 x i16> %a2, i8 %f) {
>   for.cond4.preheader.lr.ph:
>     %cmp = icmp slt i8 %f, 0
>     %s2 = shufflevector <8 x i16> %a2, <8 x i16> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
>     %s1 = shufflevector <8 x i16> %a1, <8 x i16> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
>     br i1 %cmp, label %for.cond4.preheader.us.preheader, label %for.cond4.preheader.preheader
>   
>   for.cond4.preheader.us.preheader:                 ; preds = %for.cond4.preheader.lr.ph
>     %e1 = sext <4 x i16> %s1 to <4 x i32>
>     %e2 = sext <4 x i16> %s2 to <4 x i32>
>     %0 = add <4 x i32> %e1, %e2
>     ret <4 x i32> %0
>   
>   for.cond4.preheader.preheader:                    ; preds = %for.cond4.preheader.lr.ph
>     ret <4 x i32> zeroinitializer
>   }
>   
>   define <4 x i32> @sinkadd_partial_rev(<8 x i16> %a1, <8 x i16> %a2, i8 %f) {
>   for.cond4.preheader.lr.ph:
>     %cmp = icmp slt i8 %f, 0
>     %s2 = shufflevector <8 x i16> %a2, <8 x i16> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
>     %s1 = shufflevector <8 x i16> %a1, <8 x i16> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
>     br i1 %cmp, label %for.cond4.preheader.us.preheader, label %for.cond4.preheader.preheader
>   
>   for.cond4.preheader.us.preheader:                 ; preds = %for.cond4.preheader.lr.ph
>     %e2 = sext <4 x i16> %s2 to <4 x i32>
>     %e1 = sext <4 x i16> %s1 to <4 x i32>
>     %0 = add <4 x i32> %e1, %e2
>     ret <4 x i32> %0
>   
>   for.cond4.preheader.preheader:                    ; preds = %for.cond4.preheader.lr.ph
>     ret <4 x i32> zeroinitializer
>   }
>
> The order of extends in the target block becomes important.

OK, I will, thanks!
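
As an illustration of why the order matters, here is a rough sketch of how @sinkadd_partial_rev might look after sinking, assuming each shuffle is placed immediately before the sext that uses it. The function name @sinkadd_partial_rev_sketch is made up for this example, the original value names are kept for readability (the pass may emit clones with different names), and this is not actual opt output:

  define <4 x i32> @sinkadd_partial_rev_sketch(<8 x i16> %a1, <8 x i16> %a2, i8 %f) {
  for.cond4.preheader.lr.ph:
    %cmp = icmp slt i8 %f, 0
    br i1 %cmp, label %for.cond4.preheader.us.preheader, label %for.cond4.preheader.preheader

  for.cond4.preheader.us.preheader:                 ; preds = %for.cond4.preheader.lr.ph
    ; assumed placement: each sunk shuffle sits directly before its user
    %s2 = shufflevector <8 x i16> %a2, <8 x i16> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
    %e2 = sext <4 x i16> %s2 to <4 x i32>
    %s1 = shufflevector <8 x i16> %a1, <8 x i16> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
    %e1 = sext <4 x i16> %s1 to <4 x i32>
    %0 = add <4 x i32> %e1, %e2
    ret <4 x i32> %0

  for.cond4.preheader.preheader:                    ; preds = %for.cond4.preheader.lr.ph
    ret <4 x i32> zeroinitializer
  }

If both shuffles were instead inserted at a single fixed point, for example directly before the add, they would land after the sexts that use them, and the definitions would no longer dominate their uses.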


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107262/new/

https://reviews.llvm.org/D107262


