[PATCH] D106053: [CodeGen] Remove pending AssertZext AssertSext in promoting FP_TO_INT

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jul 17 16:37:16 PDT 2021


xiangzhangllvm added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/arm64-convert-v4f64.ll:26-28
-; CHECK-DAG:  uzp1 v[[UZP0:[0-9]+]].4h, v[[XTN1]].4h, v[[XTN0]].4h
-; CHECK-DAG:  uzp1 v[[UZP1:[0-9]+]].4h, v[[XTN3]].4h, v[[XTN2]].4h
-; CHECK:      uzp1  v0.8b, v[[UZP1:[0-9]+]].8b, v[[UZP0:[0-9]+]].8b
----------------
jrtc27 wrote:
> Did the instruction count actually go down or did you just delete the instructions that were no longer present but not add new ones? You've certainly lost the CHECK line for the final instruction that puts all the temporary values into the return register.
Sorry for I didn't familiar with aarch64 instructions, I didn't clear about what uzp1 mean, and the result didn't have these uzp1 instructions, So I dele them. 
Can I update it with  utils/update_llc_test_checks.py ?


================
Comment at: llvm/test/CodeGen/RISCV/rv64d-double-convert.ll:25
 ; RV64ID-NEXT:    fcvt.l.d a0, ft0, rtz
+; RV64ID-NEXT:    sext.w a0, a0
 ; RV64ID-NEXT:    ret
----------------
jrtc27 wrote:
> All the RISC-V changes look like regressions to me unless I'm missing something...
Yes, if it base on the PromoteIntRes_FP_TO_XINT


================
Comment at: llvm/test/CodeGen/X86/fptoui-may-overflow.ll:5
+; We can't only generate "cvttps2dq %xmm0, %xmm0" for this function, because
+; there may be overflow case in fptoui. The old DAG optimization will optimize
+; "%f = fptoui <4 x float> %arg to <4 x i8>" to
----------------
aqjune wrote:
> xiangzhangllvm wrote:
> > aqjune wrote:
> > > xiangzhangllvm wrote:
> > > > efriedma wrote:
> > > > > aqjune wrote:
> > > > > > According to LangRef (https://llvm.org/docs/LangRef.html#fptoui-to-instruction), the overflow returns `poison` which means that using the value is invalid.
> > > > > > 
> > > > > > This is analogous to using an uninitialized variable in C/C++; there is no guarantee that the compiled program will have a reasonable behavior.
> > > > > > ```
> > > > > > int x; // not initialized
> > > > > > printf("%d", x); // Assume that this printed 0xDEADBEEF
> > > > > > ... (don't update x)
> > > > > > printf("%d", x); // There is no guarantee that this will also print 0xDEADBEEF
> > > > > > ```
> > > > > The comment is wrong.  But the CHECK lines are correct.  LangRef and alive2 say the following transform is invalid:
> > > > > 
> > > > > ```
> > > > > define <16 x i8> @src(<4 x float> %arg1) {
> > > > >   %ss = shufflevector <16 x i8> poison, <16 x i8> zeroinitializer, <16 x i32> <i32 0, i32 17, i32 18, i32 19, i32 1, i32 21, i32 22, i32 23, i32 2, i32 25, i32 26, i32 27, i32 3, i32 29, i32 30, i32 31>
> > > > >   ret <16 x i8> %ss
> > > > > }
> > > > > =>
> > > > > define <16 x i8> @tgt(<4 x float> %arg1) {
> > > > >   ret <16 x i8> poison
> > > > > }
> > > > > ```
> > > > > 
> > > > > (On a side-note, alive2 gives some confusing results for fptoui; apparently it thinks `fptoui float 31.5 to i32` is poison.)
> > > > Sorry, not much understand, Here the test didn't use poison elements, and we don't know the fptoui will be overflow or not, it is runtime.
> > > > 
> > > > 
> > > @efriedma Thank you for the report, I made a pull request at Alive2 that fixes the bugs. @nlopes will be back in a few weeks and have a look at it.
> > > 
> > > @xiangzhangllvm poison is kind of a conceptual value that appears in LLVM IR's abstract machine. 
> > > It is used to carry guarantees from C/C++ that e.g., casting big floats to signed integer is not legal.
> > > Of course, one cannot statically determine if a C program will do such cast or not.
> > > As double free raises segmentation fault, the execution can fail or print a bogus value; integer overflow is just less visible to users.
> > Hi @aqjune ,let's talk on the test. The current problem is 
> > 
> > "%f = fptoui <4 x float> %arg to <4 x i8>" will be convert to "fptosi <4 x float> %1 to <4 x i16>  +  AssertZext"
> > 
> > Do you mean in overflow case, %f should be poison value and report error ? (I agree)
> > But this convert self is not correct.
> > And even we mark some elements of %f is poison, but the following shuffle didn't use (read) this poison element. It just write 0 into these elements.
> > And this action of "write 0" will be remove by AssertZext.  So this patch is try to removing this AssertZext.
> > 
> > "%f = fptoui <4 x float> %arg to <4 x i8>" will be convert to "fptosi <4 x float> %1 to <4 x i16> + AssertZext"
> 
> To me the transformation seems correct; let me explain why I think so.
> I'll use <1 x float> instead of <4 x float> to avoid confusion; it seems the number of elements doesn't matter here.
> 
> Let's assume that %arg[0] fit in [0,256); then AssertZext is fine, because 8 MSBs of i16 is anyway zero.
> The problematic case is when %arg[0] doesn't fit in [0, 256).
> Then, %f[0] is poison; using %f[0] will raise undefined behavior.
> Then, the optimized code can do anything including using the register which isn't filled with zero bits...
> So in either case, having AssertZext is fine.
> 
> To me, efriedma's shufflevector transformation (https://reviews.llvm.org/D106053#2882183) looks fishier. 
> If 2a419a0b9957 is the root cause of the shufflevector transformation, and reverting it (or fixing it using freeze) solves the problem, then what do you think about the alternative solution?
> 2a419a0b9957 has small diffs in the tests which implies that it will have a small impact on performance, maybe.
**I think the key of our discussion should base on the correctness of the transformation.**

The logic of [[ https://reviews.llvm.org/rG2a419a0b9957ebac9e11e4b43bc9fbe42a9207df | 2a419a0b9957]] self is no problem, it base on the how the computeKnownBits handle the AssertZext.
Now the IR already be
```
 "fptosi <4 x float> %1 to <4 x i16> + AssertZext  i8"
```
If you think the transformation is correct, how can we know there is poison bits ?
AssertZext self do not contain the "poison" meaning that the **bits 8-15** is poison.
 it should just tell a truth that the high N bits is zero.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106053/new/

https://reviews.llvm.org/D106053



More information about the llvm-commits mailing list