[cfe-dev] Converting float to int with FJCVTZS
Johannes Hoff via cfe-dev
cfe-dev at lists.llvm.org
Thu Mar 11 07:10:05 PST 2021
Thanks for your reply. I think exposing those would be great indeed.
It would also be great to be able to pick a UB-free implementation via compiler flags, whether saturating or anything else.
To be sure, neither of these is possible today?
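For reference, a saturating conversion can be written by hand in portable C++. This is only a minimal sketch: `to_i32_saturating` is a hypothetical helper name, not something clang or LLVM provides, and the choice of mapping NaN to 0 is an assumption made to mirror what a saturating AArch64 conversion does.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: float -> int32_t without undefined behaviour.
// NaN maps to 0; values outside the int32_t range clamp to the nearest bound.
inline std::int32_t to_i32_saturating(float x) {
    if (std::isnan(x)) return 0;
    // 2^31 is exactly representable as a float, INT32_MAX is not,
    // so the bounds are compared against +/-2^31.
    if (x >= 2147483648.0f) return std::numeric_limits<std::int32_t>::max();
    if (x <= -2147483648.0f) return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(x);  // in range: truncates toward zero
}
```

Every float-to-int cast in the code base would have to go through such a helper, which is exactly the discipline problem a compiler flag would avoid.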
> On 10 Mar 2021, at 18:40, Roman Lebedev <lebedev.ri at gmail.com> wrote:
> LLVM already has support for UB-free float2int conversions:
> Rather than trying to herd each backend to conditionally do the same thing,
> I think a much more straight-forward solution would be
> to expose those intrinsics as clang builtins.
> On Wed, Mar 10, 2021 at 8:37 PM Craig Topper via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>> Oh I think I understand now. For float to unsigned int, x86-64 uses a 64-bit cvttss2si instruction and drops the upper 32 bits because there's no 32-bit unsigned conversion instruction without avx512.
>> So are you asking for AArch64 to also do a 64-bit conversion and truncate the result? Replacing a 32-bit fcvtzu with a 32-bit fjcvtzs wouldn't work would it?
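The lowering described above can be modelled in portable C++. This is a sketch under the assumption that the x86-64 sequence is a 64-bit signed truncating conversion whose low 32 bits are kept; `to_u32_via_i64_truncate` is a hypothetical name, and the explicit range check only models the value the instruction returns when the invalid exception is masked.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: float -> uint32_t modelled as "convert to a signed
// 64-bit integer, then keep the low 32 bits".
inline std::uint32_t to_u32_via_i64_truncate(float x) {
    // Outside the int64_t range (or for NaN) the 64-bit conversion returns
    // 0x8000000000000000, whose low 32 bits are all zero.
    if (std::isnan(x) || x >= 9223372036854775808.0f ||
        x < -9223372036854775808.0f) {
        return 0u;
    }
    // The int64_t -> uint32_t conversion is well defined and reduces the
    // value modulo 2^32, which is where the apparent wrap-around comes from.
    return static_cast<std::uint32_t>(static_cast<std::int64_t>(x));
}
```

With this model, -1.0f becomes 0xFFFFFFFF and 4294967808.0f (2^32 + 512) becomes 512, whereas a saturating conversion would give 0 and 0xFFFFFFFF respectively.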
>> On Wed, Mar 10, 2021 at 9:26 AM Craig Topper <craig.topper at gmail.com> wrote:
>>> Hi Johannes,
>>> I don't think cvttss2si wraps around. Instead it returns 0x80000000 for large values. "If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised. If this exception is masked, the indefinite integer value (80000000H or 80000000_00000000H if operand size is 64 bits) is returned."
>>> Also isn't fcvtzu an unsigned conversion while cvttss2si and FJCVTZS are signed conversions? Am I missing something?
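The quoted rule for the 32-bit signed case can be sketched in portable C++ as follows. `to_i32_x86_style` is a hypothetical name, and the sketch assumes the floating-point invalid exception is masked, so only the returned value is modelled.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: models the result of a 32-bit cvttss2si with the
// invalid exception masked. NaN and out-of-range inputs all produce the
// "indefinite integer" 0x80000000, which is INT32_MIN.
inline std::int32_t to_i32_x86_style(float x) {
    constexpr std::int32_t indefinite =
        std::numeric_limits<std::int32_t>::min();
    if (std::isnan(x) || x >= 2147483648.0f || x < -2147483648.0f) {
        return indefinite;
    }
    return static_cast<std::int32_t>(x);  // in range: truncates toward zero
}
```

Under this model 3.0e9f yields INT32_MIN rather than a wrapped value, while a saturating conversion of the same input yields INT32_MAX.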
>>> On Wed, Mar 10, 2021 at 9:03 AM Johannes Hoff via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>>> I'm working on a code base where a simulation needs to produce the exact same result on AArch64 and x86_64 architectures.
>>>> This is indeed the case for the whole codebase, with one exception: Rounding floats to integers. Specifically, when we're in undefined behavior territory. In that case, you notice the difference between the emitted fcvtzu instruction on AArch64 (saturating cast) and cvttss2si on x86 (wrap-around).
>>>> Now, I know undefined behavior is not the main business of LLVM, but I wonder if it would be possible to ask it to emit FJCVTZS instead, which behaves like x86 outside of the integer range. Of course, this would be an opt-in flag.
>>>> What do you think? If it's not something that would be valuable for clang, do you have any pointers on how to patch it myself?
>>>> Of course, I can just use the compiler intrinsic __builtin_arm_jcvt to trigger this behavior, but then I need to be sure to catch all the places, and be sure that everyone on the team remembers to do the same in the future.
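One way to contain the "catch all the places" problem is to funnel every conversion through a single wrapper. The following is a sketch only: `to_i32_js` is a hypothetical name, the guard assumes that clang defines the ACLE macro `__ARM_FEATURE_JCVT` when the instruction is available, and the fallback assumes FJCVTZS follows the JavaScript ToInt32 rule (truncate, reduce modulo 2^32, NaN and infinities give 0). Note that the builtin takes a double, so float inputs are widened first.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical wrapper: double -> int32_t with JavaScript ToInt32 semantics.
inline std::int32_t to_i32_js(double x) {
#if defined(__clang__) && defined(__ARM_FEATURE_JCVT)
    // Uses the FJCVTZS instruction where the target supports it.
    return __builtin_arm_jcvt(x);
#else
    // Portable fallback intended to produce the same values.
    if (!std::isfinite(x)) return 0;
    double m = std::fmod(std::trunc(x), 4294967296.0);  // in (-2^32, 2^32)
    if (m < 0.0) m += 4294967296.0;                     // now in [0, 2^32)
    if (m >= 2147483648.0) m -= 4294967296.0;           // now in [-2^31, 2^31)
    return static_cast<std::int32_t>(m);
#endif
}
```

These semantics wrap around, so they match neither the saturating result nor the 0x80000000 result discussed elsewhere in the thread; the wrapper only guarantees that both architectures agree with each other.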
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org